R - Combining different data sources into one ggplot or lattice diagram
In R, both the ggplot2 and lattice packages provide possibilities to visualize data not only by x and y position but also by considering an additional factor, either by changing the color, size or shape of the observation's representation (point, smooth line, etc.) or by splitting the visualization into separate diagrams along that factor.
Example with ggplot2:

    library(ggplot2)
    # clarity is the third factor, mapped to point color
    ggplot(diamonds, aes(x = carat, y = price, col = clarity)) +
      geom_point(alpha = .3)

Example with lattice:

    library(lattice)
    library(mlmRev)
    data(Chem97, package = "mlmRev")
    # one panel per score, one density curve per gender
    densityplot(~ gcsescore | factor(score), Chem97, groups = gender,
                plot.points = FALSE, auto.key = TRUE)

Obviously, these easy ways of differentiating data by a factor were created for use with one single data frame containing all the observations to be shown. However, more often I have separate data inputs, in the form of separate data frames, containing different columns to be represented as x and y. The third factor for separation in the plot is then the data frame, i.e. the data source, itself. The only solution for this that I have been able to find so far is merging the data into one data frame and adding a column for each source data frame that contains the third factor, i.e. the data source (so every cell of that column holds the same string for a given source). ggplot2 and lattice are then able to separate the data again by this third factor and visualize them separately, as wished.
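The merging workflow described above can be sketched roughly as follows. This is only a minimal illustration: `sensor_a`, `sensor_b` and all of their column names are made-up example data, not something taken from the question. The plotting calls are guarded so the data preparation also runs where a plotting package is not installed.

```r
# Hypothetical example data: two sources whose x/y columns have different names
sensor_a <- data.frame(time = 1:5, temp = c(20.1, 20.4, 20.9, 21.3, 21.0))
sensor_b <- data.frame(t_sec = 1:5, reading = c(19.5, 19.9, 20.2, 20.8, 21.1))

# Bring both sources to common column names and tag every row with its source
a <- data.frame(x = sensor_a$time, y = sensor_a$temp, src = "sensor_a")
b <- data.frame(x = sensor_b$t_sec, y = sensor_b$reading, src = "sensor_b")
combined <- rbind(a, b)

# ggplot2: the source column becomes the color aesthetic
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  print(ggplot(combined, aes(x = x, y = y, colour = src)) + geom_line())
}

# lattice: the source column becomes the grouping variable
if (requireNamespace("lattice", quietly = TRUE)) {
  library(lattice)
  print(xyplot(y ~ x, data = combined, groups = src, type = "l", auto.key = TRUE))
}
```

The manual step that makes this tedious is the renaming to common column names; once that is done, the same `src` column serves both packages.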
Now the final problem: this seems to be a poor workflow and not efficient for bigger amounts of data. Is there perhaps an alternative way to achieve the same result, or at least a way to efficiently automate the workflow just described?
It's a good idea to merge multiple data sources into one when working with ggplot. There are of course exceptions to this, and ggplot gives you the tools to deal with those cases.
That said, it is possible to pass a data argument to each geom_*.
A general rule I use: if different data sources are used in the same geom_*, they have to be combined; if they are used in different geom_*s, they can (and maybe should) stay separate.
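Bind data sources to use in the same geom_*

    # Two sources with identical columns
    df1 <- data.frame(group = letters[1:3], obs = runif(3))
    df2 <- data.frame(group = letters[1:3], obs = runif(3))

    library(purrr)
    # Row-bind the named list; .id = "src" stores each list name in a src column
    # (map_df() needs dplyr to be installed)
    dft <- list(df1 = df1, df2 = df2) %>%
      map_df(~ rbind(.x), .id = "src")

    library(ggplot2)
    # One line per source; on ggplot2 >= 3.4.0 prefer linewidth = 1 over size = 1
    ggplot(dft, aes(x = group, y = obs)) +
      geom_line(aes(group = src, color = src), size = 1)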
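If you would rather not depend on purrr, the same tagging and binding step can be automated in base R for any number of data frames. This is a sketch under the assumption that all sources already share the same column names; the fixed `obs` values are just example numbers.

```r
# Example sources with identical columns (values are arbitrary)
df1 <- data.frame(group = letters[1:3], obs = c(0.2, 0.5, 0.9))
df2 <- data.frame(group = letters[1:3], obs = c(0.7, 0.4, 0.1))
sources <- list(df1 = df1, df2 = df2)

# Prepend a src column holding the list name to every data frame ...
tagged <- Map(function(d, name) cbind(src = name, d), sources, names(sources))

# ... then stack them all into one long data frame
dft <- do.call(rbind, tagged)
rownames(dft) <- NULL

print(dft)
```

Adding another source then only means adding another named element to `sources`; the plotting code stays unchanged.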
Use different data sources

    # Two sources with different columns and different numbers of rows
    df1 <- data.frame(group = letters[1:3], hvalue = runif(3))
    df2 <- data.frame(group = rep(letters[1:3], each = 3), pvalue = runif(9))

    library(ggplot2)
    # Each geom gets its own data and aes(); nothing has to be merged
    ggplot() +
      geom_line(data = df1, aes(x = group, y = hvalue, group = 1), size = 1) +
      geom_point(data = df2, aes(x = group, y = pvalue, color = group))
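A variation on the same idea, as a sketch: one source can be set as the default in ggplot() and only the layer that needs other data overrides it. Layers inherit the default aesthetics unless told otherwise, so the shared x mapping is written once. The data mirrors the example above, with set.seed() added only for reproducibility.

```r
set.seed(1)  # reproducible random example values
df1 <- data.frame(group = letters[1:3], hvalue = runif(3))
df2 <- data.frame(group = rep(letters[1:3], each = 3), pvalue = runif(9))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # df1 and the x mapping are the defaults for every layer
  p <- ggplot(df1, aes(x = group)) +
    geom_line(aes(y = hvalue, group = 1)) +
    # this layer swaps in df2 but still inherits x = group
    geom_point(data = df2, aes(y = pvalue, colour = group))
  print(p)
}
```

Whether defaults in ggplot() or fully explicit layers read better is a matter of taste; the explicit form above makes it more obvious which data each geom uses.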