Every time I encounter a question like #3497, I wonder why ggplot2 doesn't have a Stat that simply applies a function by group. In terms of computational efficiency, it is generally better to summarise the data before passing it to ggplot2, but it would be handy to be able to summarise inside ggplot2, especially when generating a series of plots with different groupings.
I believe StatSummary could have been implemented to summarise data by groupings other than c("group", "x"), because the following code (ggplot2/R/stat-summary.r, lines 163 to 169 in b842024) looks quite general:
summarise_by_x <- function(data, summary, ...) {
  summary <- dapply(data, c("group", "x"), summary, ...)
  unique <- dapply(data, c("group", "x"), uniquecols)
  unique$y <- NULL

  merge(summary, unique, by = c("x", "group"), sort = FALSE)
}
However, the current make_summary_fun() expects a function that takes a vector, not a data.frame, so it would be difficult to extend StatSummary to accept a function that summarises both x and y. To fill that gap, I think it would be nice to have a simple Stat like the stat_summary_by_group() in the reprex below.
I don't see any reason why we shouldn't implement such a Stat. Am I missing something?
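To make the vector versus data.frame distinction concrete, here is a minimal base R sketch. It does not use ggplot2 internals; `summarise_vec()` and `summarise_df()` are hypothetical names chosen for illustration, and the toy data only mimics the shape of the data in this issue.

```r
# Toy data: two groups with overlapping x ranges.
d <- data.frame(x = c(1:5, 3:7), y = 1:10, g = rep(c("a", "b"), each = 5))

# Vector-style summary: it only ever sees the y values, so it cannot use x.
summarise_vec <- function(y) {
  data.frame(y = mean(y), ymin = min(y), ymax = max(y))
}

# data.frame-style summary: it sees every column of the group,
# so it can summarise x and y together.
summarise_df <- function(df) {
  data.frame(x = min(df$x), xend = max(df$x), y = mean(df$y), yend = mean(df$y))
}

groups <- split(d, d$g)

# What a vector-based interface can compute per group ...
vec_result <- do.call(rbind, lapply(groups, function(df) summarise_vec(df$y)))

# ... versus what a data.frame-based interface can compute.
df_result <- do.call(rbind, lapply(groups, summarise_df))
```

Only the second form can return columns such as `xend` that depend on `x`, which is what a segment spanning each group's x-range needs.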
library(ggplot2)

# Layer constructor: fun.data is called once per group with that group's data.frame
stat_summary_by_group <- function(mapping = NULL, data = NULL,
                                  geom = "pointrange", position = "identity",
                                  ...,
                                  fun.data = NULL,
                                  na.rm = FALSE,
                                  show.legend = NA,
                                  inherit.aes = TRUE) {
  layer(
    data = data,
    mapping = mapping,
    stat = StatSummaryByGroup,
    geom = geom,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      fun.data = fun.data,
      na.rm = na.rm,
      ...
    )
  )
}

StatSummaryByGroup <- ggproto("StatSummaryByGroup", Stat,
  compute_group = function(data, scales, fun.data = NULL, na.rm = FALSE) {
    # Apply the user-supplied function to the whole group
    summary <- fun.data(data)
    # Keep the columns that are constant within the group (uses ggplot2 internals)
    unique <- ggplot2:::dapply(data, c("group"), ggplot2:::uniquecols)
    # Add the summary columns, overwriting any that already exist
    unique[names(summary)] <- summary
    unique
  }
)

d <- data.frame(x = c(1:5, 3:7), y = 1:10, g = rep(c("a", "b"), each = 5), stringsAsFactors = FALSE)

# Summarises both x and y: a horizontal segment at mean(y) spanning the range of x
f <- function(d) {
  data.frame(x = min(d$x), xend = max(d$x), y = mean(d$y), yend = mean(d$y))
}

ggplot(d) +
  geom_point(aes(x, y, colour = g)) +
  stat_summary_by_group(fun.data = f, aes(x, y, xend = stat(xend), yend = stat(yend)), geom = "segment") +
  facet_grid(cols = vars(g))

Created on 2019-08-24 by the reprex package (v0.3.0)
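For comparison, the "summarise before entering ggplot2" route mentioned at the top could look roughly like the sketch below. It uses base R only; the `segments` name is an assumption made for this example, and the data is the same toy data as in the reprex.

```r
d <- data.frame(x = c(1:5, 3:7), y = 1:10, g = rep(c("a", "b"), each = 5))

# One row per level of g: the x-range and the mean of y for that group.
segments <- do.call(rbind, lapply(split(d, d$g), function(df) {
  data.frame(g = df$g[1],
             x = min(df$x), xend = max(df$x),
             y = mean(df$y), yend = mean(df$y))
}))
rownames(segments) <- NULL
```

The result could then be drawn with `geom_segment(data = segments, aes(x, y, xend = xend, yend = yend))`. The drawback is that `segments` has to be recomputed by hand whenever the grouping or faceting changes, which is the inconvenience a by-group Stat would remove.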