Data & Functions
Interaction forest plot


Data


We load the packages we will use:

# Libraries
library(rms) # or library(survival)
library(cmprsk)
library(forestplot) # version 3.1.1


We build the example database from the R dataset lung:

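set.seed(143)
# Simulated treatment arm (A/B) and simulated three-level status for the competing-risks models
lung$bras <- sample(c('A','B'), nrow(lung), replace=TRUE)
lung$status2 <- sample(0:2, nrow(lung), replace=TRUE, prob=c(0.6, 0.25, 0.15))

# Recode status from 1/2 to 0/1 (0 = censored, 1 = dead)
lung$status <- lung$status-1
# Group the ECOG performance score into two classes
lung$ecog[lung$ph.ecog %in% c(0,1)] <- '01'
lung$ecog[lung$ph.ecog %in% c(2,3)] <- '23'

# Keep only the variables used in the examples
lung <- lung[,c("bras","time","status","status2","age","sex","ecog","meal.cal")]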



#Head of dataset
knitr::kable(head(lung,8), align = "l")
bras  time  status  status2  age  sex  ecog  meal.cal
B     306   1       0        74   1    01    1175
B     455   1       1        68   1    01    1225
B     1010  0       0        56   1    01    NA
B     210   1       0        57   1    01    1150
A     883   1       0        60   1    01    NA
B     1022  0       1        74   1    01    513
B     310   1       1        68   2    23    384
A     361   1       0        71   2    23    538



Download functions


Last update: 30/07/2024

Download Forest_autom.R

create_table_forestplot function


This function builds a table, from any database, that summarizes the interaction analysis between a variable of interest and a selection of covariates. The table can be passed to the forestplot::forestplot() function, or you can call dessin_forest_plot() directly, which runs this function internally.

The function also returns three vectors, Mean, Lower and Upper, which contain respectively the estimate (OR for the logistic model, HR for the Cox model, cHR for the cause-specific Cox model and SHR for the Fine & Gray model) and the lower and upper bounds of its confidence interval. These vectors are used to draw the forest plot.
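
To make the meaning of the estimate and its confidence bounds concrete, here is a minimal sketch that computes a hazard ratio with a Wald 95% interval in each ECOG subgroup, using only the survival package. The helper hr_wald() and the local copy d are names introduced for this illustration; this is not taken from Forest_autom.R and its internal computation may differ.

```r
library(survival)

# Local copy of survival::lung with the derived variables used in this tutorial
d <- survival::lung
set.seed(143)
d$bras   <- factor(sample(c("A", "B"), nrow(d), replace = TRUE))
d$status <- d$status - 1                       # 0 = censored, 1 = dead
d$ecog   <- ifelse(d$ph.ecog %in% c(0, 1), "01",
                   ifelse(d$ph.ecog %in% c(2, 3), "23", NA))

# Hypothetical helper: hazard ratio of arm B vs A with a Wald interval
hr_wald <- function(data, level = 0.95) {
  fit <- coxph(Surv(time, status) ~ bras, data = data)
  est <- unname(coef(fit))                     # log hazard ratio
  se  <- sqrt(vcov(fit)[1, 1])                 # its standard error
  z   <- qnorm(1 - (1 - level) / 2)
  c(Mean = exp(est), Lower = exp(est - z * se), Upper = exp(est + z * se))
}

# One row per subgroup (patients with missing ECOG are dropped by split())
res <- t(sapply(split(d, d$ecog), hr_wald))
round(res, 2)
```

Each row of res gives the three values that a forest plot needs for one subgroup: the point estimate and the two ends of the interval.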


create_table_forestplot(
data_frame, study database.
covariables, vector of the covariates in the database whose interactions with the treatment arm we want to study.
nom_var, vector of the covariate names to display in the table (optional).
levels_var, vector of the level names for qualitative variables (optional).
var_quali_ou_quanti, vector of the same length as covariables specifying whether each covariate is qualitative (1) or quantitative (0).
modele_souhaite, model to use (for example a logistic model for a binary endpoint or a Cox model for survival).
         - “logistique”, “log”, “logit” for a logistic regression
         - “cox”, “coxph”, “survie”, “surv” for a Cox model (survival)
         - “cscox”, “cscoxph” for a cause-specific Cox model (cHR instead of HR, to be used with competing risks)
         - “finegray”, “crr” for a Fine & Gray model (competing risks)
bras, name of the treatment arm variable in the database.
nom_bras, vector of the names of the arm levels (optional).
var_dinteret, endpoint (usually status); binary for the logistic regression and the Cox model (failure/success, no/yes, …).
failcode = 1, value coding the event of interest for a Fine & Gray model.
delai_dinteret, time to the endpoint (usually time); not needed for a logistic model.
digit = 2, number of decimals used to round the p-value, the confidence interval and the threshold value of quantitative variables.
fixed_digit = FALSE, whether to keep a fixed number of digits after the decimal point for the estimate and the confidence interval.
cut_off, vector of threshold values to apply to the quantitative variables (the median is used by default).
Wald = 'Yes', 'Yes' for the Wald confidence interval, 'No' for the interval computed with the confint() function.
percent = FALSE, whether to display percentages in addition to the counts.
digit_percent = 0, number of decimals used to round the percentages.
test = "gailsimon", interaction test to use.
         - “gailsimon”: Gail and Simon test
         - “vrais”: p-value of the interaction term (if 2 categories) or likelihood ratio test (if more than 2 categories)
test_qualit = TRUE, type of interaction tested when the Gail and Simon test is used.
         - TRUE: qualitative test
         - FALSE: quantitative test
print_pval = FALSE, whether to display a p-value for each subgroup.
)
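
The options of the test argument can be illustrated with a short sketch on the ECOG subgroups, using only the survival package. It follows the published Gail and Simon (1985) statistics and a standard likelihood ratio comparison; it is an assumption that Forest_autom.R proceeds in the same way, and all object names below are introduced for this example.

```r
library(survival)

# Local copy of survival::lung with the derived variables used in this tutorial
d <- survival::lung
set.seed(143)
d$bras   <- factor(sample(c("A", "B"), nrow(d), replace = TRUE))
d$status <- d$status - 1                       # 0 = censored, 1 = dead
d$ecog   <- ifelse(d$ph.ecog %in% c(0, 1), "01",
                   ifelse(d$ph.ecog %in% c(2, 3), "23", NA))
d <- d[!is.na(d$ecog), ]

# Log hazard ratio (B vs A) and its standard error in each subgroup
sub <- sapply(split(d, d$ecog), function(x) {
  fit <- coxph(Surv(time, status) ~ bras, data = x)
  c(est = unname(coef(fit)), se = sqrt(vcov(fit)[1, 1]))
})
D <- sub["est", ]
s <- sub["se", ]
k <- length(D)

# Gail and Simon test of qualitative interaction (effects of opposite sign)
Q_pos <- sum((D / s)[D > 0]^2)
Q_neg <- sum((D / s)[D < 0]^2)
Q     <- min(Q_pos, Q_neg)
h     <- seq_len(k - 1)
p_qualitative <- sum(dbinom(h, k - 1, 0.5) *
                     pchisq(Q, df = h, lower.tail = FALSE))

# Test of quantitative interaction (heterogeneity of the effects)
w <- 1 / s^2
H <- sum(w * (D - sum(w * D) / sum(w))^2)
p_quantitative <- pchisq(H, df = k - 1, lower.tail = FALSE)

# Likelihood ratio test of the arm-by-subgroup interaction term
fit0  <- coxph(Surv(time, status) ~ bras + ecog, data = d)
fit1  <- coxph(Surv(time, status) ~ bras * ecog, data = d)
lr    <- 2 * (fit1$loglik[2] - fit0$loglik[2])
p_lrt <- pchisq(lr, df = length(coef(fit1)) - length(coef(fit0)),
                lower.tail = FALSE)

round(c(qualitative = p_qualitative, quantitative = p_quantitative,
        lrt = p_lrt), 3)
```

The qualitative test asks whether the treatment effect changes direction between subgroups, while the quantitative test and the likelihood ratio test ask whether its size differs.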

dessin_forest_plot function


This function draws a forest plot. It takes additional arguments to make a “nicer” plot and to simplify its construction for the user. It calls create_table_forestplot() internally to build the table used to draw the forest plot. Its arguments are the same as those of create_table_forestplot(), plus additional optional arguments.


dessin_forest_plot(
data_frame, study database.
covariables, vector of the covariates in the database whose interactions with the treatment arm we want to study.
nom_var, vector of the covariate names to display in the table (optional).
levels_var, vector of the level names for qualitative variables (optional).
var_quali_ou_quanti, vector of the same length as covariables specifying whether each covariate is qualitative (1) or quantitative (0).
modele_souhaite, model to use (for example a logistic model for a binary endpoint or a Cox model for survival).
         - “logistique”, “log”, “logit” for a logistic regression
         - “cox”, “coxph”, “survie”, “surv” for a Cox model (survival)
         - “cscox”, “cscoxph” for a cause-specific Cox model (cHR instead of HR, to be used with competing risks)
         - “finegray”, “crr” for a Fine & Gray model (competing risks)
bras, name of the treatment arm variable in the database.
nom_bras, vector of the names of the arm levels (optional).
var_dinteret, endpoint (usually status); binary for the logistic regression and the Cox model (failure/success, no/yes, …).
failcode = 1, value coding the event of interest for a Fine & Gray model.
delai_dinteret, time to the endpoint (usually time); not needed for a logistic model.
digit = 2, number of decimals used to round the p-value, the confidence interval and the threshold value of quantitative variables.
fixed_digit = FALSE, whether to keep a fixed number of digits after the decimal point for the estimate and the confidence interval.
cut_off, vector of threshold values to apply to the quantitative variables (the median is used by default).
Wald = 'Yes', 'Yes' for the Wald confidence interval, 'No' for the interval computed with the confint() function.
percent = FALSE, whether to display percentages in addition to the counts.
digit_percent = 0, number of decimals used to round the percentages.
test = "gailsimon", interaction test to use.
         - “gailsimon”: Gail and Simon test
         - “vrais”: p-value of the interaction term (if 2 categories) or likelihood ratio test (if more than 2 categories)
         - NULL: no p-value
test_qualit = TRUE, type of interaction tested when the Gail and Simon test is used.
         - TRUE: qualitative test
         - FALSE: quantitative test
label_bras = 'Events/Pts', text displayed under the arm levels (for example events for a logistic model, deaths for survival).
title, title of the forest plot (optional).
xlab, label of the x-axis of the forest plot (optional).
xlog = FALSE, whether to use a log scale on the x-axis.
zero = 1, x-axis position of the zero (reference) line of the forest plot.
pos_graphique = 0, position of the graph; can be 0, 1 or 2.
         - 0: graph is located to the left of the HR/OR
         - 1: graph is located to the left of pvalue_interaction
         - 2: graph is located to the right of pvalue_interaction
col = fpColors(box="black", lines="gray", summary="black"), colors of the plot.
lwd.ci = 1, line width of the confidence intervals.
txt_size = fpTxtGp(label=gpar(cex=1), summary=gpar(cex=1), title=gpar(cex=1.3), xlab=gpar(cex=0.8), ticks=gpar(cex=0.8)), font sizes.
forme = fpDrawDiamondCI, shape of the estimate marker (for example fpDrawDiamondCI for diamonds, fpDrawNormalCI for squares).
boxsize, size of the estimate marker (based on precision by default).
clip, lower and upper limits of the x-axis.
         - by default: from minimum HR/OR to maximum HR/OR
         - user-specified limits: vector of length 2 with the lower bound and the upper bound
         - “clip_tot”: the whole graph is displayed
by = 0.5, spacing between the ticks of the x-axis.
favors, vector of labels to display below the axis to indicate the direction of the difference (optional).
nom_fichier, path and name of the file (pdf, png or tiff) in which to save the forest plot (optional).
dim_fichier, vector of length 2 giving the width and height of the saved graphic (optional).
)
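
Several of the graphical arguments above carry the names or default values of forestplot package options (zero, xlog, clip, col, lwd.ci, boxsize, fpTxtGp, fpDrawDiamondCI). As a rough sketch of what these options do, here is a direct call to forestplot::forestplot() on a small table. The numeric values are made up for illustration only, and how dessin_forest_plot() forwards its arguments to forestplot is an assumption.

```r
# Made-up estimates for two subgroups, for illustration only
tab <- data.frame(
  label = c("ECOG 0-1", "ECOG 2-3"),
  mean  = c(0.90, 1.20),
  lower = c(0.60, 0.70),
  upper = c(1.35, 2.05)
)

# Text columns of the plot: a header row, then one row per subgroup
labeltext <- cbind(
  c("Subgroup", tab$label),
  c("HR (95% CI)",
    sprintf("%.2f (%.2f-%.2f)", tab$mean, tab$lower, tab$upper))
)

# Draw only if the forestplot package is installed
if (requireNamespace("forestplot", quietly = TRUE)) {
  p <- forestplot::forestplot(
    labeltext  = labeltext,
    mean       = c(NA, tab$mean),               # NA on the header row
    lower      = c(NA, tab$lower),
    upper      = c(NA, tab$upper),
    is.summary = c(TRUE, FALSE, FALSE),         # header row in bold
    zero       = 1,                             # reference line at HR = 1
    xlog       = FALSE,
    clip       = c(0.5, 2.5),                   # x-axis limits
    xticks     = seq(0.5, 2.5, by = 0.5),       # tick spacing
    graph.pos  = 2,                             # graph drawn as 2nd column
    fn.ci_norm = forestplot::fpDrawDiamondCI,   # diamond-shaped estimates
    lwd.ci     = 1,
    col        = forestplot::fpColors(box = "black", lines = "gray",
                                      summary = "black"),
    txt_gp     = forestplot::fpTxtGp(label = grid::gpar(cex = 1),
                                     xlab  = grid::gpar(cex = 0.8),
                                     ticks = grid::gpar(cex = 0.8)),
    xlab       = "Hazard ratio"
  )
  print(p)
}
```

Changing fn.ci_norm to forestplot::fpDrawNormalCI draws squares instead of diamonds, and setting xlog = TRUE with zero = 1 puts the axis on a log scale.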

dessin_forest_simple function


This function draws a forest plot from a data frame that simply contains an estimate and its confidence interval.


dessin_forest_simple(
data_frame, data frame with variable, mean, lower and upper columns.
name_col_var = "Variable", header of the column containing the variables.
name_col_mean = "HR (95%CI)", header of the column containing the estimates.
IC = TRUE, if TRUE, the CI is displayed in the same column as the estimate.
digit = 2, number of decimals used to round the estimate and its confidence interval.
lwd.ci = 1, line width of the confidence intervals.
zero = 1, x-axis position of the zero (reference) line of the forest plot.
clip, lower and upper limits of the x-axis.
         - by default: from minimum HR/OR to maximum HR/OR
         - user-specified limits: vector of length 2 with the lower bound and the upper bound
         - “clip_tot”: the whole graph is displayed
by = 0.5, spacing between the ticks of the x-axis.
)
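
A data frame of the kind this function expects can be built from any fitted model. The sketch below extracts hazard ratios and their 95% limits from a multivariable Cox model fitted with the survival package. The lower-case column names follow the description of data_frame above, but their exact spelling in Forest_autom.R is an assumption, as is the label format.

```r
library(survival)

d <- survival::lung
d$status <- d$status - 1                  # 0 = censored, 1 = dead

# Multivariable Cox model on three covariates of the lung dataset
fit <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = d)
ci  <- exp(confint(fit))                  # 95% limits on the hazard-ratio scale

# One row per coefficient: name, estimate, lower and upper bounds
forest_df <- data.frame(
  variable = names(coef(fit)),
  mean     = unname(exp(coef(fit))),
  lower    = unname(ci[, 1]),
  upper    = unname(ci[, 2])
)

# Text label combining the estimate and its interval, rounded to `digit` decimals
digit <- 2
fmt   <- paste0("%.", digit, "f")
forest_df$label <- sprintf(paste0(fmt, " (", fmt, "-", fmt, ")"),
                           forest_df$mean, forest_df$lower, forest_df$upper)
forest_df
```

The first four columns hold what the plot needs numerically, and the label column shows the kind of text that an "HR (95%CI)" column would display.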




Contact

This document was produced by the statistics team of the Biostatistics and Medical Information Department (SBIM) at Saint-Louis Hospital in Paris.
Based on The R Graph Gallery by Yan Holtz.