Boxplot


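A boxplot summarizes the distribution of a continuous variable. The box spans the first to the third quartile, the line inside it marks the median, the whiskers extend to the most extreme values that are not outliers, and outliers are drawn as individual points.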

The ggplot2 package draws a boxplot with geom_boxplot(). Map a quantitative variable to the Y axis and a qualitative (categorical) variable to the X axis.

# Libraries
library(ggplot2)
 
# The mtcars dataset ships with R, so nothing needs to be loaded
# head(mtcars)
 
# A basic boxplot: cyl is stored as a number, so as.factor() turns it into a categorical X variable
ggplot(mtcars, aes(x=as.factor(cyl), y=mpg)) + 
  geom_boxplot(fill="slateblue", alpha=0.2) + 
  xlab("cyl") +
  theme_bw()
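
To see the numbers behind the boxes above, the quartiles, whisker ends and outliers can be recomputed in base R. This is only a sketch: it assumes the 1.5 × IQR outlier rule, which is the default for geom_boxplot() (coef = 1.5), and the helper name box_summary is hypothetical, not part of ggplot2.

```r
# Hypothetical helper: the statistics a boxplot displays for one group of values
box_summary <- function(x) {
  q <- quantile(x, c(0.25, 0.5, 0.75), names = FALSE)
  iqr <- q[3] - q[1]

  # Assumed rule: points further than 1.5 * IQR from the box count as outliers
  lower_fence <- q[1] - 1.5 * iqr
  upper_fence <- q[3] + 1.5 * iqr
  is_outlier <- x < lower_fence | x > upper_fence

  c(
    q1 = q[1],
    median = q[2],
    q3 = q[3],
    whisker_low = min(x[!is_outlier]),   # whiskers stop at the last non-outlier
    whisker_high = max(x[!is_outlier]),
    n_outliers = sum(is_outlier)
  )
}

# One column per number of cylinders (4, 6, 8), one row per statistic
stats_by_cyl <- sapply(split(mtcars$mpg, mtcars$cyl), box_summary)
print(round(stats_by_cyl, 2))
```

Each column of the result should correspond to one box in the plot, which makes it easy to check what a given line or point represents.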


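Several variants of the boxplot can be created, for example one box per category, or several boxes per category split by a second grouping variable: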

# Libraries
library(ggplot2)     # plotting
library(viridis)     # scale_fill_viridis()
library(hrbrthemes)  # theme_ipsum()

# Create a simulated dataset: four groups of unequal size, each split evenly between two time points
set.seed(123)
data <- data.frame(
  name  = c(rep("A", 500), rep("B", 500), rep("C", 20), rep("D", 100)),
  time  = c(rep(c("M1", "M3"), each = 250), rep(c("M1", "M3"), each = 250),
            rep(c("M1", "M3"), each = 10), rep(c("M1", "M3"), each = 50)),
  value = c(rnorm(500, 10, 5), rnorm(500, 13, 1), rnorm(20, 25, 4), rnorm(100, 12, 1))
)

# Basic boxplot: one box per group, filled by group (the legend is hidden because it would repeat the X axis)
ggplot(data, aes(x=name, y=value, fill=name)) +
  geom_boxplot() +
  scale_fill_viridis(discrete = TRUE, alpha=0.6, option="A") +
  theme_ipsum() +
  theme(legend.position="none",
        plot.title = element_text(size=11)) +
  ggtitle("Basic boxplot") +
  xlab("")

# Boxplot by groups: within each name, one box per time point
ggplot(data, aes(x=name, y=value, fill=time)) +
  geom_boxplot() +
  scale_fill_viridis(discrete = TRUE, alpha=0.6, option="A") +
  theme_ipsum() +
  theme(legend.position="right",
        plot.title = element_text(size=11)) +
  labs(title = "Boxplot by groups", fill = "Time") +
  xlab("") +
  ylab("")
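
A boxplot does not show how many observations each box summarizes, and the groups simulated above differ a lot in size. A quick way to check this is to cross-tabulate the two grouping variables. The sketch below rebuilds only the name and time columns so that it runs on its own; the object names groups and group_sizes are illustrative.

```r
# Rebuild only the grouping columns of the simulated dataset
# (the values are not needed to count rows)
groups <- data.frame(
  name = c(rep("A", 500), rep("B", 500), rep("C", 20), rep("D", 100)),
  time = c(rep(c("M1", "M3"), each = 250), rep(c("M1", "M3"), each = 250),
           rep(c("M1", "M3"), each = 10), rep(c("M1", "M3"), each = 50))
)

# Cross-tabulate: rows are groups, columns are time points
group_sizes <- table(groups$name, groups$time)
print(group_sizes)
```

Boxes built from small cells, such as group C here, rest on far fewer points than the others and should be read with more caution.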




Contact

This document is a work of the statistics team in the Biostatistics and Medical Information Department at Saint-Louis Hospital in Paris (SBIM).
Developed and updated by Noémie Bigot and Anouk Walter-Petrich.
noemie.bigot@aphp.fr; anouk.walter-petrich@u-paris.fr

Based on The R Graph Gallery by Yan Holtz.
