A boxplot summarizes the distribution of a continuous variable. It displays the median, the first and third quartiles, and any outliers.
The ggplot2 library lets you build a boxplot with geom_boxplot(). You have to specify a quantitative variable for the Y axis and a qualitative variable for the X axis.
# Libraries
library(ggplot2)
# The mtcars dataset is natively available
# head(mtcars)
# A really basic boxplot
ggplot(mtcars, aes(x=as.factor(cyl), y=mpg)) +
  geom_boxplot(fill="slateblue", alpha=0.2) +
  xlab("cyl") +
  theme_bw()
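To make the elements of a box concrete, here is a minimal sketch in base R that computes the same quantities by hand for the 4-cylinder cars of mtcars. It assumes the default behaviour of geom_boxplot(), in which points lying more than 1.5 times the interquartile range beyond the quartiles are drawn as outliers. The variable names (mpg4, lower_fence, upper_fence) are illustrative only.

```r
# mpg values for the 4-cylinder cars (one box of the plot above)
mpg4 <- mtcars$mpg[mtcars$cyl == 4]

# First quartile, median and third quartile: the three horizontal lines of the box
q <- quantile(mpg4, probs = c(0.25, 0.5, 0.75))
iqr <- q[["75%"]] - q[["25%"]]

# Assumed default rule: whiskers stop at the most extreme points inside these fences
lower_fence <- q[["25%"]] - 1.5 * iqr
upper_fence <- q[["75%"]] + 1.5 * iqr

# Points outside the fences would be drawn individually as outliers
outliers <- mpg4[mpg4 < lower_fence | mpg4 > upper_fence]

print(q)
print(c(lower = lower_fence, upper = upper_fence))
print(outliers)
```

Repeating the computation with cyl == 6 or cyl == 8 gives the values for the other two boxes.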
Different types of boxplots can be created, for example one box per category, or boxes split by a second grouping variable:
# Libraries
library(viridis)
library(ggplot2)
library(hrbrthemes)
library(tidyverse)
# Creation of dataset
set.seed(123)
data <- data.frame(
  name=c( rep("A",500), rep("B",500), rep("C",20), rep("D",100) ),
  time=c( rep(c("M1", "M3"), each=250), rep(c("M1", "M3"), each=250), rep(c("M1", "M3"), each=10), rep(c("M1", "M3"), each=50) ),
  value=c( rnorm(500, 10, 5), rnorm(500, 13, 1), rnorm(20, 25, 4), rnorm(100, 12, 1) )
)
# Basic boxplot
ggplot(data, aes(x=name, y=value, fill=name)) +
  geom_boxplot() +
  scale_fill_viridis(discrete = TRUE, alpha=0.6, option="A") +
  theme_ipsum() +
  theme(legend.position="none",
        plot.title = element_text(size=11)) +
  ggtitle("Basic boxplot") +
  xlab("")
# Boxplot by groups
ggplot(data, aes(x=name, y=value, fill=time)) +
  geom_boxplot() +
  scale_fill_viridis(discrete = TRUE, alpha=0.6, option="A") +
  theme_ipsum() +
  theme(legend.position="right",
        plot.title = element_text(size=11)) +
  labs(title = "Boxplot by groups", fill = "Time") +
  xlab("") +
  ylab("")
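Because a box hides how many observations it summarizes, it can help to check the group sizes and medians numerically alongside the grouped plot. The sketch below recreates the simulated dataset from above so that it runs on its own, then uses base R's aggregate(). The object names group_sizes and group_medians are illustrative.

```r
# Recreate the simulated dataset used in the examples above
set.seed(123)
data <- data.frame(
  name  = c(rep("A", 500), rep("B", 500), rep("C", 20), rep("D", 100)),
  time  = c(rep(c("M1", "M3"), each = 250), rep(c("M1", "M3"), each = 250),
            rep(c("M1", "M3"), each = 10), rep(c("M1", "M3"), each = 50)),
  value = c(rnorm(500, 10, 5), rnorm(500, 13, 1), rnorm(20, 25, 4), rnorm(100, 12, 1))
)

# Number of observations behind each box (one row per name x time combination)
group_sizes <- aggregate(value ~ name + time, data = data, FUN = length)
names(group_sizes)[3] <- "n"

# Median of each group, i.e. the middle line of each box
group_medians <- aggregate(value ~ name + time, data = data, FUN = median)
names(group_medians)[3] <- "median"

print(group_sizes)
print(group_medians)
```

In this dataset, group C has only 10 observations per time point, so its boxes rest on far fewer values than those of group A or B.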
This document is a work of the statistics team in the Biostatistics and Medical Information Department at Saint-Louis Hospital in Paris (SBIM).
Developed and updated by Noémie Bigot and Anouk Walter-Petrich
noemie.bigot@aphp.fr; anouk.walter-petrich@u-paris.fr
Based on The R Graph Gallery by Yan Holtz.