Basic ggplot2 Boxplot


A boxplot summarizes the distribution of a continuous variable. It displays the median, the first and third quartiles, and any outliers. This page explains how to build a basic boxplot with ggplot2.
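To make these quantities concrete, they can be computed by hand. The sketch below is a minimal illustration in base R on mtcars$mpg. It assumes the common 1.5 × IQR convention for flagging outliers, which the geom_boxplot() documentation also describes as its default. The variable names (lower_fence, whisker_low, and so on) are illustrative only.

```r
# Summary statistics that a boxplot draws, computed by hand for mtcars$mpg
x <- mtcars$mpg

# First quartile, median, third quartile (the box and the line inside it)
q <- quantile(x, probs = c(0.25, 0.5, 0.75))
iqr <- q[["75%"]] - q[["25%"]]  # interquartile range, i.e. the height of the box

# Fences: observations further than 1.5 * IQR from the box count as outliers
lower_fence <- q[["25%"]] - 1.5 * iqr
upper_fence <- q[["75%"]] + 1.5 * iqr

# Whiskers stop at the most extreme observations still inside the fences
whisker_low  <- min(x[x >= lower_fence])
whisker_high <- max(x[x <= upper_fence])

# Observations outside the fences are drawn as individual points
outliers <- x[x < lower_fence | x > upper_fence]

print(q)
print(c(whisker_low = whisker_low, whisker_high = whisker_high))
print(outliers)
```

Note that base R's boxplot.stats() computes the box edges slightly differently (it uses hinges from fivenum()), so its outliers can differ from this quantile-based calculation.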

The ggplot2 package builds boxplots with geom_boxplot(). Map a quantitative variable to the Y axis and a qualitative variable to the X axis.

# Load ggplot2
library(ggplot2)
 
# The mtcars dataset is natively available
# head(mtcars)
 
# A really basic boxplot
ggplot(mtcars, aes(x=as.factor(cyl), y=mpg)) + 
    geom_boxplot(fill="slateblue", alpha=0.2) + 
    xlab("cyl")
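The call to as.factor() in the example above is needed because cyl is stored as a number, while the X axis expects a qualitative variable. The short base R check below is only a sketch of that conversion; the object name cyl_factor is illustrative.

```r
# cyl is stored as a number in mtcars, not as a category
class(mtcars$cyl)

# as.factor() turns it into a categorical variable with one level per distinct value
cyl_factor <- as.factor(mtcars$cyl)
levels(cyl_factor)

# Number of cars behind each box of the plot
table(cyl_factor)
```

Each level of the factor becomes one box on the X axis, so the plot shows three boxes.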


The examples below build two variants on a simulated dataset: one box per category, and boxes grouped by a second variable.

# Libraries
library(tidyverse)   # attaches ggplot2 and the %>% pipe
library(viridis)     # scale_fill_viridis()
library(hrbrthemes)  # theme_ipsum()

# Create dataset: four groups of unequal size, each measured at two time points
# (values are random draws, so the plots vary slightly between runs)
data <- data.frame(
  name=c( rep("A",500), rep("B",500), rep("C",20), rep("D",100) ),
  time=c( rep(c("M1", "M3"), each=250), rep(c("M1", "M3"), each=250), rep(c("M1", "M3"), each=10), rep(c("M1", "M3"), each=50) ),
  value=c( rnorm(500, 10, 5), rnorm(500, 13, 1), rnorm(20, 25, 4), rnorm(100, 12, 1) )
)

# Basic boxplot
data %>%
  ggplot( aes(x=name, y=value, fill=name)) +
  geom_boxplot() +
  scale_fill_viridis(discrete = TRUE, alpha=0.6, option="A") +
  theme_ipsum() +
  theme(
    legend.position="none",
    plot.title = element_text(size=11)
  ) +
  ggtitle("Basic boxplot") +
  xlab("")

# Boxplot by groups
data %>%
  ggplot( aes(x=name, y=value, fill=time)) +
  geom_boxplot() +
  scale_fill_viridis(discrete = TRUE, alpha=0.6, option="A") +
  theme_ipsum() +
  theme(
    legend.position="right",
    plot.title = element_text(size=11)
  ) +
  labs(title = "Boxplot by groups", fill = "Time") +
  xlab("") +
  ylab("")
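To check what the grouped boxplot draws, the number of observations and the median behind each box can be tabulated. This is a minimal base R sketch: it rebuilds the simulated dataset under the hypothetical name sim and fixes a seed (42, an arbitrary choice not present in the original) so the numbers are reproducible.

```r
set.seed(42)  # assumed seed, used only to make this sketch reproducible

# Same structure as the simulated dataset above, stored under an illustrative name
sim <- data.frame(
  name  = c(rep("A", 500), rep("B", 500), rep("C", 20), rep("D", 100)),
  time  = c(rep(c("M1", "M3"), each = 250), rep(c("M1", "M3"), each = 250),
            rep(c("M1", "M3"), each = 10),  rep(c("M1", "M3"), each = 50)),
  value = c(rnorm(500, 10, 5), rnorm(500, 13, 1), rnorm(20, 25, 4), rnorm(100, 12, 1))
)

# Observations behind each box of the grouped plot (name by time)
counts <- table(sim$name, sim$time)
print(counts)

# Median of each name/time combination, i.e. the line inside each box
medians <- aggregate(value ~ name + time, data = sim, FUN = median)
print(medians)
```

The counts make visible something the boxplot hides: group C rests on only 10 observations per time point, while groups A and B have 250 each.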




Contact

This document is a work of the statistics team in the Biostatistics and Medical Information Department at Saint-Louis Hospital in Paris (SBIM).
Based on The R Graph Gallery by Yan Holtz.
