3. Case study: create a plot block
plot-block.Rmd
In this vignette, we show how to create a plot block layer by layer with the ggplot2 package.
Introduction
Anatomy of a {ggplot2}
plot
With ggplot2, plots are built layer
by layer. It all starts with ggplot()
which initialises the ggplot object passing optional data and mappings.
Then, we add geoms, like geom_point()
.
Geoms also accept custom mappings (overwrite mappings
passed to the first ggplot()
call), data, as well as other
parameters. As a result, there are many ways to build a ggplot:
# 1
ggplot(data = <DATA>, mapping = <MAPPING>) +
geom_point()
# 2
ggplot(data = <DATA>) +
geom_point(mapping = <MAPPING>) +
geom_point(mapping = <MAPPING>)
# 3
ggplot() +
geom_point(data = <DATA>, mapping = <MAPPING>) +
geom_point(data = <DATA>, mapping = <MAPPING>)
In this tutorial, we go for the first option.
To do list
What do we need to create?
It seems obvious to add new ggplot_block()
and geom
constructors. On the Shiny side, we have to handle the
plot output element. Remember that the server_output()
generic is defined in the blockr
server.R
script. It supports tables with
server_output.block()
and plots with
server_output.plot_block()
. Therefore, we don’t need to
create another S3 method and have to make sure our new plot block
inherits from the plot_block
class to
dispatch to the correct method. Most of our work will
be on the block side since we’ll have to generate the plot layer by
layer, each layer being a block. Before writing this vignette,
blockr did not support layer by layer plots. This new
concept required to expose a new way of combining plot expressions to
generate the entire stack code. With the previous infrastructure, we
could only do:
However, handling ggplot2 grammar requires to be able
to combine with +
such as:
data %>% plot_block + layer_block
All the process is described in the following, which should highlight the flexibility provided by blockr.
New block helpers
New evaluate_block()
method
We define below a new way to pass data to a block layer, through the
evaluate_block.plot_layer_block()
method. As a consequence,
these block layer must have the plot_layer_block
class:
evaluate_block.plot_layer_block <- function(x, data, ...) {
stopifnot(...length() == 0L, inherits(data, "ggplot"))
eval(
substitute(data + expr, list(expr = generate_code(x))),
list(data = data)
)
}
Compared to evaluate_block.transform_block()
, we simply
replaced %>%
by +
. Nothing more! As a side
note, if you had to develop your own method in a test script, you could
register it and test it with:
Generate a valid stack code
Before adding the layer by layer plot blocks,
generate_code.stack()
was:
generate_code.stack <- function(x) {
binary_substitute <- function(x, y) {
substitute(x %>% y, list(x = x, y = y))
}
Reduce(binary_substitute, lapply(x, generate_code))
}
This method leverages Reduce()
which applies a
binary function to combine elements of a vector (from
left to right by default). For instance, we could sum the 3 first
integers with:
Reduce(`+`, 1:3)
generate_code.stack()
successively combines all the
block expressions given by lapply(x, generate_code)
. If
there are 3 blocks in the stack:
my_stack <- new_stack(new_dataset_block, new_filter_block, new_select_block)
binary_substitute
is called twice:
- Once to combine x being
<dataset_block_EXPR>
(the data block expression) and y being<filter_block_EXPR>
, to give<dataset_block_EXPR> %>% <filter_block_EXPR>
. - Then to combine the previous result, namely x, equal to
<dataset_block_EXPR> %>% <filter_block_EXPR>
with y equal to<select_block_EXPR>
. - This ultimately yields:
<dataset_block_EXPR> %>% <filter_block_EXPR> %>% <select_block_EXPR>
substitute()
ensures we don’t evaluate the generated
expression and replace x
and y
by their
respective values.
Can you spot the current point? How do we combine plot layer
expressions linked by +
with this setup? In order to
overcome this limitation, we introduced the
block_combiner()
, a new generic which aims at linking
multiple block expressions depending on their class. Under the hood, we
only check the class of the right block to determine
the expression link (the S3 dispatch occurs on the second block), that
is %>%
or +
(and maybe more in the
future).
block_combiner <- function(left, right, ...) UseMethod("block_combiner", right)
Another important modification to bring is about
generate_code.stack()
. With the above logic,
Reduce()
only consumes the block expressions, thereby
preventing us from being able to dispatch. We pass a vector of blocks
instead, such that we get:
generate_code.stack <- function(x) {
if (length(x) == 0) return(quote(identity()))
# Handles monoblock stacks
if (length(x) > 1) {
aggregate_code <- function(x, y) {
block_combiner(x, y)
}
Reduce(aggregate_code, lapply(x, \(b) b))
} else {
generate_code(x[[1]])
}
}
For empty stacks, the returned code is arbitrary. For stacks with one
block, we don’t need Reduce()
and only call
generate_code()
for the corresponding block. We create a
new internal function, that is aggregate_code
, subsequently
calling block_combiner()
.
There is still one last issue to solve. Since we now pass blocks, we
have to find a way to let block_combiner()
evaluate each
block expression. We apply generate_code()
within the
substitute()
call so we can inject the block expression and
not the block itself. This eventually leads us to:
block_combiner.transform_block <- function(left, right, ...) {
substitute(
left %>% right,
list(left = generate_code(left), right = generate_code(right))
)
}
block_combiner.plot_block <- block_combiner.transform_block
block_combiner.plot_layer_block <- function(left, right, ...) {
substitute(
left + right,
list(left = generate_code(left), right = generate_code(right))
)
}
Importantly, plot constructor like ggplot()
should have
the plot_block
class so that the link is
%>%
(the same as for other transform blocks).
Create the ggplot block
To create a new block, we call the new_block()
constructor. It expects:
- fields: a list of field, which are translated into shiny inputs.
- expr: the expression returned by the block, necessary to produce an output and export the code.
- name: a name (randomly choosen, you don’t need to worry about this).
- class: a class to dispatch to the relevant S3 methods.
-
layout: an optional layout (default to
default_layout_fields()
).
Our block will have c("ggplot_block", "plot_block")
as
classes, some fields to pass mappings and we leave the layout to the
default choice. Note that the mapping field depends on the provided
data. To stay simple, we assume to only handle x
and
y
aesthetics.
new_ggplot_block <- function(...) {
new_block(
fields = list(
# TO DO
),
expr = quote(),
class = c("ggplot_block", "plot_block"),
...
)
}
We finally end with the new_ggplot_block
constructor.
The next step is to create the mappings fields with two
new_select_field()
. Those fields are converted into shiny
inputs, specifically selectInput()
. To set the field
choices which depend on the data, we define the data_cols
helper:
data_cols <- function(data) colnames(data)
We modify new_ggplot_block
such that:
new_ggplot_block <- function(col_x = character(), col_y = character(), ...) {
data_cols <- function(data) colnames(data)
new_block(
fields = list(
x = new_select_field(col_x, data_cols, type = "name"),
y = new_select_field(col_y, data_cols, type = "name")
),
expr = quote(),
class = c("ggplot_block", "plot_block"),
...
)
}
Passing type = "name"
allows to inject functions inside
the select field choices. This is useful to dynamically update the
choices whenever data change.
Producing the expression, which is certainly the most “technical”
part, as it involves a bit of metaprogramming. The
expression must not be evaluated in the block, that’s why it is wrapped
in a quote()
. blockr is then able to
generate the expression with generate_code()
and evaluate
it with evaluate_block()
.
Our ggplot expression can be written as:
In the above expression, you may notice .()
, which is
actually required by generate_code()
. Under the hoods, this
is needed by bquote()
which only evaluates arguments
wrapped by .()
with variables from the environment. It
makes it easier to use than substitute()
, since we
explicitly mark what we want to evaluate.
All combined together, this eventually yields:
new_ggplot_block <- function(col_x = character(), col_y = character(), ...) {
data_cols <- function(data) colnames(data)
new_block(
fields = list(
x = new_select_field(col_x, data_cols, type = "name"),
y = new_select_field(col_y, data_cols, type = "name")
),
expr = quote(
ggplot(mapping = aes(x = .(x), y = .(y)))
),
class = c("ggplot_block", "plot_block"),
...
)
}
As you can, see the code base is reasonable in terms of complexity.
Create a geom block
Now that we have a valid new_ggplot_block
, we want to
add it a geometry. The easiest one is geom_point()
. To keep
the vignette as simple as possible, we only handle two options, namely
the point color and shape. We host this information in a
new_select_field()
, for instance:
new_select_field(default, choices)
The expression is straightforward to get:
geom_point(color = .(color), shape = .(shape)) # Don't forget to wrap it with quote(...).
Importantly, since we use .(color)
, the field name must
be color and conversely for the shape. Also note that,
since the previous data contains the ggplot
object, we must
extract its data located in data$data
, such that the column
names are obtained with
data_cols <- function(data) colnames(data$data)
.
This finally gives us:
new_geompoint_block <- function(color = character(), shape = character(), ...) {
data_cols <- function(data) colnames(data$data)
new_block(
fields = list(
color = new_select_field(color, data_cols, type = "name"),
shape = new_select_field(shape, data_cols, type = "name")
),
expr = quote(
geom_point(aes(color = .(color), shape = .(shape)), size = 2)
),
class = c("plot_layer_block", "plot_block"),
...
)
}
Note the class plot_layer_block
. This is necessary to
invoke the corresponding evaluate_block()
method (to use
+
instead of %>%
).
Try it
We can try it on the following stack:
stack <- new_stack(
data_block = new_dataset_block("penguins", "palmerpenguins"),
plot_block = new_ggplot_block("flipper_length_mm", "body_mass_g"),
layer_block = new_geompoint_block("species", "species")
)
serve_stack(stack)
Going further
Below is a possible implementation of an interactive ggiraph plot.
library(blockr)
library(ggplot2)
library(ggiraph)
custom_data_block <- function(...) {
new_dataset_block(
...,
selected = "mtcars"
)
}
new_ggplot_block <- function(...) {
data_cols <- function(data) colnames(data)
new_block(
fields = list(
x = new_select_field("wt", data_cols, type = "name"),
y = new_select_field("qsec", data_cols, type = "name"),
color = new_select_field("disp", data_cols, type = "name")
),
expr = quote(
ggplot(mapping = aes(x = .(x), y = .(y), color = .(color)))
),
class = c("ggplot_block", "plot_block"),
...
)
}
# We could have use a mutate_block instead of
# changing the data from inside the block ...
# {blockr} is flexible here.
new_geompoint_interactive_block <- function(...) {
build_expr <- function(data) {
# Get data from the previous ggplot layer
# data is the ggplot so we need to use data$data
# to get the initial data
dat <- data$data
dat$carname <- row.names(dat)
substitute(
geom_point_interactive(
# Pass in new data
data = new_data,
aes(
tooltip = carname,
data_id = carname
)
),
list(new_data = dat)
)
}
new_block(
fields = list(
expression = new_hidden_field(build_expr)
),
expr = quote(.(expression)),
class = c("plot_layer_block", "plot_block"),
...
)
}
new_theme_block <- function(...) {
new_block(
fields = list(
theme = new_select_field(
"theme_minimal",
grep("^theme_.*$", ls("package:ggplot2"), perl = TRUE, value = TRUE),
type = "name"
)
),
expr = quote(
.(theme)()
),
class = c("plot_layer_block", "plot_block"),
...
)
}
new_ggiraph_block <- function(...) {
new_block(
fields = list(
pointsize = new_numeric_field(12, min = 1, max = 20)
),
expr = quote(girafe(ggobj = data, pointsize = .(pointsize))),
class = c("ggiraph_block", "plot_block"),
...
)
}
# Necessary to get ggiraph plot rendering
uiOutputBlock.ggiraph_block <- function(x, ns) {
ggiraph::girafeOutput(ns("plot"))
}
server_output.ggiraph_block <- function(x, result, output) {
ggiraph::renderGirafe(result())
}
stack <- new_stack(
custom_data_block,
new_ggplot_block,
new_geompoint_interactive_block,
new_theme_block,
new_ggiraph_block
)
serve_stack(stack)