create_table_forestplot function
This function builds a table from any database to represent the interaction analysis between a variable of interest and a selection of covariates. The table can be passed to the forestplot::forestplot() function in R, or we can call dessin_forest_plot() directly, which calls this function internally.
The function also returns three vectors, Mean, Lower and Upper, which hold respectively the estimate (OR for the logistic model, HR for the Cox model, cHR for the cause-specific Cox model and SHR for the Fine & Gray model) and the lower and upper bounds of the confidence interval for this estimate. These vectors are used to draw the forest plot.
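To illustrate what these three vectors hold, here is a minimal base-R sketch. It is not the package's internal code: the toy data set and the names toy, arm, subgroup and event are hypothetical, and it assumes one logistic model per subgroup with a Wald confidence interval.

```r
# Toy data: binary outcome, two-level arm, two-level subgroup (all hypothetical)
set.seed(1)
n <- 400
toy <- data.frame(
  arm      = factor(sample(c("A", "B"), n, replace = TRUE)),
  subgroup = factor(sample(c("low", "high"), n, replace = TRUE))
)
toy$event <- rbinom(n, 1, plogis(-0.5 + 0.6 * (toy$arm == "B")))

# One logistic model per subgroup; keep the odds ratio of arm B versus arm A
est <- sapply(levels(toy$subgroup), function(lev) {
  fit  <- glm(event ~ arm, family = binomial, data = toy[toy$subgroup == lev, ])
  beta <- unname(coef(fit)["armB"])
  se   <- unname(sqrt(diag(vcov(fit)))["armB"])
  z    <- qnorm(0.975)  # 95% Wald interval
  c(Mean = exp(beta), Lower = exp(beta - z * se), Upper = exp(beta + z * se))
})

# One value per subgroup, in the same order for the three vectors
Mean  <- est["Mean", ]
Lower <- est["Lower", ]
Upper <- est["Upper", ]
```

Each vector has one element per plotted row, so the three can be handed together to a forest plot routine.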
Arguments
create_table_forestplot(
data_frame,
study database.
covariables,
vector containing the covariates of the database whose interactions with the treatment arm we want to study.
nom_var,
vector with the names of the
covariates we want to display in the table (optional).
levels_var,
vector with the names of the levels for
qualitative variables (optional).
var_quali_ou_quanti,
vector of the same length as covariables, specifying whether each covariate is qualitative (1) or quantitative (0).
modele_souhaite,
model to use (for example, a logistic model for a binary endpoint or a Cox model to study survival).
- "logistique", "log", "logit" for a logistic regression
- "cox", "coxph", "survie", "surv" for a Cox model (survival)
- "cscox", "cscoxph" for a cause-specific Cox model (cHR instead of HR, to be used in case of competing risks)
- "finegray", "crr" for a Fine & Gray model (competing risks)
bras,
name of the treatment arm in the database.
nom_bras,
vector with the names of the arm levels
(optional).
var_dinteret,
endpoint (usually status)
(binary for logistic regression and Cox model: failure/success,
no/yes,…).
failcode = 1,
event of interest value in
the case of a Fine & Gray model.
delai_dinteret,
time to occurrence of the endpoint (usually time); not needed for a logistic model.
digit = 2,
number of decimal places used to round the p-value, the confidence interval, and the threshold value of quantitative variables.
fixed_digit = FALSE,
whether to fix the number of digits displayed after the decimal point for the estimate and the confidence interval.
cut_off,
vector containing the threshold values that
we want to give to the quantitative variables (the median is used by
default).
Wald = 'Yes',
'Yes' to obtain the Wald confidence interval, 'No' to obtain the one computed with the confint() function.
percent = FALSE,
to indicate if we also want to display the percentages in addition to
the numbers.
digit_percent = 0,
number of decimal places used to round the percentage.
median = FALSE,
to indicate if we also want to display the medians (in the case of a Cox model).
digit_median = 1,
number of decimal places used to round the median.
test = "gailsimon",
test to use for the interaction.
- "gailsimon": Gail and Simon test
- "vrais": p-value of the interaction term (if 2 categories) or likelihood ratio test (if more than 2 categories)
test_qualit = TRUE,
type of Gail and Simon test to use (only applies when test = "gailsimon").
- TRUE: test of qualitative interaction
- FALSE: test of quantitative interaction
print_pval = FALSE,
whether to display the p-value for each subgroup.
)
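Two of the arguments above, cut_off and test = "vrais", can be illustrated with a short base-R sketch. It is only an approximation of the idea under stated assumptions, not the function's actual implementation: the data set toy and its columns are hypothetical, the covariate is split with "less than or equal to the median" (the source does not say which side the threshold falls on), and the interaction is tested with a likelihood ratio test on a logistic model.

```r
# Toy data (hypothetical): binary outcome, two arms, one quantitative covariate
set.seed(2)
n <- 300
toy <- data.frame(
  arm = factor(sample(c("A", "B"), n, replace = TRUE)),
  age = round(rnorm(n, mean = 60, sd = 10))
)
toy$event <- rbinom(n, 1, 0.4)

# Default threshold for a quantitative covariate: its median
cut_off <- median(toy$age)
# Assumption: values equal to the threshold go to the lower group
toy$age_grp <- factor(ifelse(toy$age <= cut_off, "low", "high"))

# Likelihood ratio test of the arm-by-covariate interaction:
# compare the model without and with the interaction term
fit_main <- glm(event ~ arm + age_grp, family = binomial, data = toy)
fit_int  <- glm(event ~ arm * age_grp, family = binomial, data = toy)
lrt <- anova(fit_main, fit_int, test = "Chisq")
p_interaction <- lrt[["Pr(>Chi)"]][2]
```

With a two-level covariate the interaction has a single degree of freedom, so this test and the Wald test of the interaction coefficient usually give close p-values.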
dessin_forest_plot function
This function draws a forest plot. It takes a number of arguments that make the plot “nicer” and simpler for the user to produce. The create_table_forestplot() function, which is called inside this function, creates the table used to draw the forest plot. The arguments are the same as for create_table_forestplot(), with additional optional arguments.
Arguments
dessin_forest_plot(
data_frame,
study
database.
covariables,
vector containing the covariates of the database whose interactions with the treatment arm we want to study.
nom_var,
vector with the names of the
covariates we want to display in the table (optional).
levels_var,
vector with the names of the levels for
qualitative variables (optional).
var_quali_ou_quanti,
vector of the same length as covariables, specifying whether each covariate is qualitative (1) or quantitative (0).
modele_souhaite,
model to use (for example, a logistic model for a binary endpoint or a Cox model to study survival).
- "logistique", "log", "logit" for a logistic regression
- "cox", "coxph", "survie", "surv" for a Cox model (survival)
- "cscox", "cscoxph" for a cause-specific Cox model (cHR instead of HR, to be used in case of competing risks)
- "finegray", "crr" for a Fine & Gray model (competing risks)
bras,
name of the treatment arm in the database.
nom_bras,
vector with the names of the arm levels
(optional).
var_dinteret,
endpoint (usually status)
(binary for logistic regression and Cox model: failure/success,
no/yes,…).
failcode = 1,
event of interest value in
the case of a Fine & Gray model.
delai_dinteret,
time to occurrence of the endpoint (usually time); not needed for a logistic model.
digit = 2,
number of decimal places used to round the p-value, the confidence interval, and the threshold value of quantitative variables.
fixed_digit = FALSE,
whether to fix the number of digits displayed after the decimal point for the estimate and the confidence interval.
cut_off,
vector containing the threshold values that
we want to give to the quantitative variables (the median is used by
default).
Wald = 'Yes',
'Yes' to obtain the Wald confidence interval, 'No' to obtain the one computed with the confint() function.
percent = FALSE,
to indicate if we also want to display the percentages in addition to
the numbers.
digit_percent = 0,
number of decimal places used to round the percentage.
median = FALSE,
to indicate if we also want to display the medians (in the case of a Cox model).
digit_median = 1,
number of decimal places used to round the median.
test = "gailsimon",
test to use for the interaction.
- "gailsimon": Gail and Simon test
- "vrais": p-value of the interaction term (if 2 categories) or likelihood ratio test (if more than 2 categories)
- NULL: no p-value
test_qualit = TRUE,
type of Gail and Simon test to use (only applies when test = "gailsimon").
- TRUE: test of qualitative interaction
- FALSE: test of quantitative interaction
label_bras = 'Events/Pts',
label to display under the levels of the arm (for example, events for a logistic model, deaths for survival).
label_median = 'Median (95%CI)',
label to display for the median.
title,
title we want to give to the forest
plot (optional).
xlab,
label for the x-axis of the forest plot (optional).
xlog = FALSE,
whether to use a log scale for the x-axis.
zero = 1,
x-axis value at which the reference (“zero”) line of the forest plot is drawn.
pos_graphique = 0,
position of the graph; can be equal to 0, 1 or 2.
- 0: graph is located to the left of the HR/OR
- 1: graph is located to the left of pvalue_interaction
- 2: graph is located to the right of pvalue_interaction
col = fpColors(box="black", lines="gray", summary="black"),
colors of the graphic.
lwd.ci = 1,
confidence interval thickness.
txt_size = fpTxtGp(label=gpar(cex=1), summary=gpar(cex=1), title=gpar(cex=1.3), xlab=gpar(cex=0.8), ticks=gpar(cex=0.8)),
font sizes.
forme = fpDrawDiamondCI,
shape of the estimate (for example, fpDrawDiamondCI for diamonds, fpDrawNormalCI for squares).
boxsize,
size of the estimate marker (based on precision by default).
clip,
lower and upper limits of the x-axis.
- by default: from the minimum HR/OR to the maximum HR/OR
- to specify the limits: vector of size 2 with the lower bound and the upper bound
- "clip_tot": the whole graph is displayed
by = 0.5,
distance between the ticks of the x-axis.
favors,
vector with the names to display below the axis for directional difference (optional).
favors_fontsize = 15,
font size of the text displayed below the axis for directional difference.
y_favors_arrow,
coordinate for favors arrows (optional).
y_favors_text,
coordinate for favors labels (optional).
nom_fichier,
path and name of the file (pdf, png or
tiff) in which we want to save the forest plot (optional).
dim_fichier,
size 2 vector to specify the width and height
of the graphic when we want to save it (optional).
)
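To make the clip and by arguments more concrete, the following base-R sketch shows how default limits spanning the smallest to the largest estimate, combined with a tick spacing of by = 0.5, could translate into x-axis ticks. The estimates are made up, and the rounding of the limits to multiples of by is an assumption for illustration, not necessarily what dessin_forest_plot() does internally.

```r
# Hypothetical estimates (HR or OR), one per row of the forest plot
Mean <- c(0.62, 0.95, 1.40, 2.10)
by   <- 0.5

# Default behaviour described above: limits from the minimum to the maximum estimate
clip <- range(Mean)

# Assumption: ticks are placed on multiples of `by` that cover the clip range
ticks <- seq(floor(clip[1] / by) * by, ceiling(clip[2] / by) * by, by = by)
print(ticks)
```

A vector such as c(0.5, 2) passed as clip would replace the default range in the same computation.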
dessin_forest_simple function
This function draws a forest plot from a data frame that simply contains an estimate and its confidence interval.
Arguments
dessin_forest_simple(
data_frame,
database with variable, mean, lower, upper columns.
name_col_var = "Variable",
header to display for the column of variable names.
name_col_mean = "HR (95%CI)",
header to display for the column of estimates.
zero = 1,
x-axis value at which the reference (“zero”) line of the forest plot is drawn.
pos_graphique = 3,
position of the graph; can be equal to 1, 2, 3 or 4.
print_variable = TRUE,
FALSE: remove the column with the variable name.
print_mean = TRUE,
FALSE: remove the column with the estimate.
var_color = NULL,
name of the grouping variable used to colour variable names by group. The default colors are 1, 2, 3, etc.; a vector of colors can also be given.
title = NULL,
title of the forest plot.
IC = TRUE,
TRUE: displays the CI in the column with the estimate.
boxsize = NULL,
Override the default box size based on
precision
digit = 2,
to round off the estimate and its
confidence interval.
lwd.ci = 1,
confidence interval
thickness.
clip,
lower and upper limits of the x-axis.
- by default: from the minimum HR/OR to the maximum HR/OR
- to specify the limits: vector of size 2 with the lower bound and the upper bound
- "clip_tot": the whole graph is displayed
by = 0.5,
distance between the ticks of the x-axis.
print_pval = FALSE,
TRUE: print the p-value in the last column.
new_page = TRUE,
TRUE: draw the plot on a new blank page.
favors,
vector with the names to display below the axis for directional difference (optional)
favors_fontsize = 15,
font size of the text displayed below the axis for directional difference
y_favors_arrow,
coordinate for favors arrows (optional)
y_favors_text,
coordinate for favors labels (optional)
)
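As a rough guide to the input expected by data_frame, here is a minimal sketch of a data frame with variable, mean, lower and upper columns, together with one possible way of building the text shown in an "HR (95%CI)" column when IC = TRUE and digit = 2. The values are invented, the exact column-name capitalisation expected by the function is an assumption, and the label format is only an illustration.

```r
# Hypothetical estimates with their confidence bounds
df <- data.frame(
  variable = c("Age > 60", "Male sex", "ECOG 2-3"),
  mean     = c(1.25, 0.80, 1.90),
  lower    = c(0.90, 0.55, 1.20),
  upper    = c(1.74, 1.16, 3.01)
)

# Fixed number of decimals, as controlled by `digit`
digit <- 2
fmt <- function(x) formatC(x, format = "f", digits = digit)

# Illustrative label: estimate followed by its confidence interval
df$label <- paste0(fmt(df$mean), " (", fmt(df$lower), "-", fmt(df$upper), ")")
print(df)
```

The numeric columns drive the graphical part of the plot, while a text column of this kind is what a reader sees next to each row.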
dessin_forest_models function
This function draws a forest plot from several models.
Arguments
dessin_forest_models(
vect_models,
vector of the names of the models to display (example: c('mod1', 'mod2'))
- for a multivariable model, only the estimate of the first variable is displayed
- accepted models: coxph, svycoxph, glm, svyglm
vecnoms = NULL,
vector of the
names you want to give to the different models.
name_col_var = "Variable",
header to display for the column of variable names.
name_col_mean = "HR (95%CI)",
header to display for the column of estimates.
zero = 1,
x-axis value at which the reference (“zero”) line of the forest plot is drawn.
pos_graphique = 3,
position of the graph; can be equal to 1, 2, 3 or 4.
IC = TRUE,
TRUE: displays the CI in the column with the
estimate.
digit = 2,
to round off the estimate and its
confidence interval.
lwd.ci = 1,
confidence interval
thickness.
clip,
lower and upper limits of the x-axis.
- by default: from the minimum HR/OR to the maximum HR/OR
- to specify the limits: vector of size 2 with the lower bound and the upper bound
- "clip_tot": the whole graph is displayed
by = 0.5,
distance between the ticks of the x-axis.
print_pval = FALSE
TRUE: print the p-value in the last column.
)
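Because vect_models is a vector of model names rather than of model objects, the estimates have to be recovered by name. The sketch below shows one way this could work for glm fits, keeping only the first covariate of each model as described above. It is an illustration under assumptions, not the function's real code: the data, the models mod1 and mod2 and the helper first_term() are hypothetical, and coxph or survey models are not handled.

```r
# Toy data and two logistic models (hypothetical)
set.seed(3)
n <- 250
toy <- data.frame(x1 = rbinom(n, 1, 0.5), x2 = rnorm(n))
toy$y <- rbinom(n, 1, plogis(-0.3 + 0.5 * toy$x1))
mod1 <- glm(y ~ x1,      family = binomial, data = toy)
mod2 <- glm(y ~ x1 + x2, family = binomial, data = toy)

vect_models <- c("mod1", "mod2")

# Hypothetical helper: odds ratio and Wald CI of the first covariate of a glm
first_term <- function(model_name) {
  fit <- get(model_name)            # look the model up by its name
  idx <- 2                          # glm: position 1 is the intercept
  ci  <- confint.default(fit)[idx, ]  # Wald interval on the coefficient scale
  c(Mean  = unname(exp(coef(fit)[idx])),
    Lower = unname(exp(ci[1])),
    Upper = unname(exp(ci[2])))
}

# One row per model, with columns Mean, Lower and Upper
res <- t(sapply(vect_models, first_term))
print(round(res, 2))
```

For a Cox model there is no intercept, so the first covariate would sit at position 1 instead of 2.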
We load the packages we will use:
We create a database from the R dataset lung:
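# Reproducible random assignment of a treatment arm (A or B)
set.seed(143)
lung$bras <- sample(c('A','B'), nrow(lung), replace=TRUE)
# Simulated three-level status (0, 1, 2), e.g. for the competing risks models
lung$status2 <- sample(0:2, nrow(lung), replace=TRUE, prob=c(0.6, 0.25, 0.15))
# Recode status from 1/2 (censored/dead) to 0/1
lung$status <- lung$status-1
# Group the ECOG score into two classes
lung$ecog[lung$ph.ecog %in% c(0,1)] <- '01'
lung$ecog[lung$ph.ecog %in% c(2,3)] <- '23'
# Keep only the variables used afterwards
lung <- lung[,c("bras","time","status","status2","age","sex","ecog","meal.cal")]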
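To give an idea of the quantities a forest plot built on this database would display, the sketch below fits a Cox model of the arm effect within each ECOG class and collects the hazard ratio with its Wald confidence interval. It is a self-contained illustration that repeats the relevant preparation steps on a copy named dat (a hypothetical name); it assumes the survival package is installed and does not reproduce the package functions documented above.

```r
library(survival)

# Work on a copy of the lung data and repeat the preparation steps used above
dat <- survival::lung
set.seed(143)
dat$bras   <- sample(c("A", "B"), nrow(dat), replace = TRUE)
dat$status <- dat$status - 1
dat$ecog   <- NA_character_
dat$ecog[dat$ph.ecog %in% c(0, 1)] <- "01"
dat$ecog[dat$ph.ecog %in% c(2, 3)] <- "23"

# Hazard ratio of arm B versus arm A within each ECOG class
hr_by_ecog <- t(sapply(c("01", "23"), function(lev) {
  sub_dat <- dat[which(dat$ecog == lev), ]
  fit <- coxph(Surv(time, status) ~ bras, data = sub_dat)
  ci  <- confint(fit)  # Wald interval on the log hazard ratio scale
  c(Mean = unname(exp(coef(fit))), Lower = exp(ci[1, 1]), Upper = exp(ci[1, 2]))
}))
print(round(hr_by_ecog, 2))
```

Since the arm is assigned at random here, the hazard ratios are expected to be close to 1 and carry no clinical meaning.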
This document is a work of the statistics team in the Biostatistics and Medical Information Department at Saint-Louis Hospital in Paris (SBIM).
Developed and updated by Noémie Bigot and Anouk Walter-Petrich
noemie.bigot@aphp.fr; anouk.walter-petrich@u-paris.fr
Based on The R Graph Gallery by Yan Holtz.