Spaghetti Plot


Spaghetti plots are visualization tools for following individual trajectories over time or across different conditions. We use the ggplot2 package to draw these plots.

Data


We use the sleepstudy dataset from the lme4 package.
We add a fourth column, Group, that splits the subjects into 3 groups (7, 6 and 5 subjects, respectively).

# Library
library(lme4)

# Load dataset
data("sleepstudy")

# 4th column for the last plot: rows are ordered by subject (10 rows each),
# so 70, 60 and 50 rows put 7, 6 and 5 subjects in groups 1, 2 and 3
sleepstudy$Group <- factor(rep(c("1", "2", "3"), times = c(70, 60, 50)))


sleepstudy format:

A data frame with 180 observations on the following 3 variables, plus the Group column added above:
Reaction   Average reaction time (ms)
Days       Number of days of sleep deprivation
Subject    Subject number on which the observation was made
Group      Group of the subject (1, 2 or 3); added above, not part of the original dataset




# Head of dataset
knitr::kable(head(sleepstudy, 8), align = "l")

Reaction   Days   Subject   Group
249.5600   0      308       1
258.7047   1      308       1
250.8006   2      308       1
321.4398   3      308       1
356.8519   4      308       1
414.6901   5      308       1
382.2038   6      308       1
290.1486   7      308       1
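
As a quick sanity check on the Group column, the sketch below counts rows and distinct subjects per group and verifies that no subject is split across two groups. It assumes the dataset is ordered by subject with 10 rows per subject, as in the version of sleepstudy shipped with lme4; the variable names (rows_per_group and so on) are ours, chosen for illustration.

```r
library(lme4)

data("sleepstudy")
sleepstudy$Group <- factor(rep(c("1", "2", "3"), times = c(70, 60, 50)))

# Number of rows in each group
rows_per_group <- table(sleepstudy$Group)

# Number of distinct subjects in each group
subjects_per_group <- tapply(sleepstudy$Subject, sleepstudy$Group,
                             function(s) length(unique(s)))

# Number of distinct groups each subject belongs to (should be 1 everywhere)
groups_per_subject <- tapply(sleepstudy$Group, sleepstudy$Subject,
                             function(g) length(unique(g)))

print(rows_per_group)
print(subjects_per_group)
stopifnot(all(groups_per_subject == 1))
```

With 10 observations per subject, the three groups should contain 7, 6 and 5 subjects.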



Spaghetti plot


geom_line and geom_point functions


geom_line(
  mapping = NULL,          # Set of aesthetic mappings created by aes()
  data = NULL,             # The data to display in this layer. If NULL (the default), the data is inherited from the ggplot() call
  stat = "identity",       # The statistical transformation to use on the data for this layer, as a string
  position = "identity",   # Position adjustment, either as a string, or the result of a call to a position adjustment function
  na.rm = FALSE,           # If FALSE (the default), missing values are removed with a warning. If TRUE, missing values are silently removed
  orientation = NA,        # The orientation of the layer. The default (NA) automatically determines the orientation from the aesthetic mapping
  show.legend = NA,        # Logical. Should this layer be included in the legends?
  inherit.aes = TRUE,      # If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn’t inherit behaviour from the default plot specification, e.g. borders()
  ...                      # Other arguments passed on to the layer
)

geom_point(
  mapping = NULL,          # Set of aesthetic mappings created by aes()
  data = NULL,             # The data to display in this layer. If NULL (the default), the data is inherited from the ggplot() call
  stat = "identity",       # The statistical transformation to use on the data for this layer, as a string
  position = "identity",   # Position adjustment, either as a string, or the result of a call to a position adjustment function
  na.rm = FALSE,           # If FALSE (the default), missing values are removed with a warning. If TRUE, missing values are silently removed
  show.legend = NA,        # Logical. Should this layer be included in the legends?
  inherit.aes = TRUE,      # If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn’t inherit behaviour from the default plot specification, e.g. borders()
  ...                      # Other arguments passed on to the layer
)

Basic spaghetti plot


# Library
library(ggplot2)

# Basic spaghetti plot
ggplot(sleepstudy, aes(y=Reaction, x=Days))+
  geom_point()+
  geom_line(aes(group=Subject)) +
  theme_bw()
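
The group = Subject mapping is what turns this into a spaghetti plot. The sketch below is only an illustration: it builds the same plot with and without the group aesthetic and uses layer_data() from ggplot2 to count how many separate lines would be drawn. The object names (p_ungrouped, p_grouped, n_lines_*) are ours.

```r
library(lme4)
library(ggplot2)

data("sleepstudy")

# Without a group aesthetic: all observations are joined as a single path
p_ungrouped <- ggplot(sleepstudy, aes(x = Days, y = Reaction)) +
  geom_line() +
  theme_bw()

# With group = Subject: one line per subject
p_grouped <- ggplot(sleepstudy, aes(x = Days, y = Reaction)) +
  geom_line(aes(group = Subject)) +
  theme_bw()

# Count the distinct groups (i.e. separate lines) in the first layer of each plot
n_lines_ungrouped <- length(unique(layer_data(p_ungrouped, 1)$group))
n_lines_grouped   <- length(unique(layer_data(p_grouped, 1)$group))

print(p_ungrouped)
print(p_grouped)
```

Without the grouping, ggplot2 treats the whole dataset as one series and draws a single zigzag line, which hides the individual trajectories.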

Add color


We can then use a different color per Group to better distinguish the trajectories, by adding color=Group in the aes function.

# Colorful spaghetti plot
ggplot(sleepstudy, aes(y=Reaction, x=Days, color=Group))+
  geom_point()+
  geom_line(aes(group=Subject))+
  scale_x_continuous(breaks=0:9) +
  theme_bw()
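
One possible extension, not part of the original plots, is to overlay the mean trajectory of each group on the individual lines. The sketch below assumes ggplot2 3.3.0 or later (for the fun argument of stat_summary()); the alpha value, the point size and the object name p_mean are arbitrary choices made for illustration.

```r
library(lme4)
library(ggplot2)

data("sleepstudy")
sleepstudy$Group <- factor(rep(c("1", "2", "3"), times = c(70, 60, 50)))

p_mean <- ggplot(sleepstudy, aes(x = Days, y = Reaction, color = Group)) +
  # Individual trajectories, faded so that the group means stand out
  geom_line(aes(group = Subject), alpha = 0.4) +
  # Mean reaction time per group and per day, drawn as a line and as points
  stat_summary(aes(group = Group), fun = mean, geom = "line") +
  stat_summary(aes(group = Group), fun = mean, geom = "point", size = 3) +
  scale_x_continuous(breaks = 0:9) +
  theme_bw()

print(p_mean)
```

Here group = Group in stat_summary() overrides the per-subject grouping, so the mean is computed over all subjects of a group at each day.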






Contact

This document was produced by the statistics team of the Biostatistics and Medical Information Department (SBIM) at Saint-Louis Hospital in Paris.
Based on The R Graph Gallery by Yan Holtz.