We load the packages we will use:
We create a database from the R dataset lung:
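# the lung dataset comes from the survival package
set.seed(143)
# simulated treatment arm and simulated three-level status (e.g. for competing-risks models)
lung$bras <- sample(c('A','B'), nrow(lung), replace=TRUE)
lung$status2 <- sample(0:2, nrow(lung), replace=TRUE, prob=c(0.6, 0.25, 0.15))
# recode status from 1/2 to 0 (censored) / 1 (dead)
lung$status <- lung$status-1
# group the ECOG score into two classes
lung$ecog[lung$ph.ecog %in% c(0,1)] <- '01'
lung$ecog[lung$ph.ecog %in% c(2,3)] <- '23'
# keep only the variables used in the examples
lung <- lung[,c("bras","time","status","status2","age","sex","ecog","meal.cal")]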
create_table_forestplot function
This function creates a table, from any database, that summarises the interaction analysis between a variable of interest and a selection of covariates. The table can be passed to the forestplot::forestplot() function, or we can call dessin_forest_plot() directly, which calls this function internally.
The function also returns three vectors, Mean, Lower and Upper, which contain respectively the estimate (OR for the logistic model, HR for the Cox model, cHR for the cause-specific Cox model and SHR for the Fine & Gray model), the lower bound and the upper bound of the confidence interval for this estimate. These vectors are used to draw the forest plot.
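To make the meaning of these three vectors concrete, here is a minimal sketch of how a hazard ratio and its Wald confidence interval could be computed within each subgroup with survival::coxph(). It is not the package's implementation: the helper hr_sous_groupe is a hypothetical name, and the data preparation simply repeats the lung example so that the block runs on its own.

```r
library(survival)

# Rebuild the example data so the sketch is self-contained
set.seed(143)
d <- survival::lung
d$bras <- sample(c("A", "B"), nrow(d), replace = TRUE)
d$status <- d$status - 1                       # 0 = censored, 1 = dead
d$ecog <- NA_character_
d$ecog[d$ph.ecog %in% c(0, 1)] <- "01"
d$ecog[d$ph.ecog %in% c(2, 3)] <- "23"

# Hypothetical helper: HR of arm B vs arm A with its Wald 95% CI in one subgroup
hr_sous_groupe <- function(sous_groupe) {
  fit <- coxph(Surv(time, status) ~ bras, data = sous_groupe)
  est <- unname(coef(fit))                     # log hazard ratio
  se  <- sqrt(vcov(fit)[1, 1])                 # its standard error
  c(Mean  = exp(est),
    Lower = exp(est - qnorm(0.975) * se),
    Upper = exp(est + qnorm(0.975) * se))
}

# One row per ECOG class, one column per vector (rows with missing ecog are dropped)
res <- t(sapply(split(d, d$ecog), hr_sous_groupe))
res
```

Each column of res plays the role of one of the vectors Mean, Lower and Upper, with one entry per subgroup.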
create_table_forestplot(
data_frame,
study database.
covariables,
vector of the covariates in the database whose interactions with the treatment arm we want to study.
nom_var,
vector of the covariate names to display in the table (optional).
levels_var,
vector of the level names for the qualitative variables (optional).
var_quali_ou_quanti,
vector of the same length as covariables specifying whether each covariate is qualitative (1) or quantitative (0).
modele_souhaite,
model to use (for example, a logistic model for a binary endpoint or a Cox model for survival).
- "logistique", "log", "logit" for a logistic regression
- "cox", "coxph", "survie", "surv" for a Cox model (survival)
- "cscox", "cscoxph" for a cause-specific Cox model (cHR instead of HR, to be used with competing risks)
- "finegray", "crr" for a Fine & Gray model (competing risks)
bras,
name of the treatment arm variable in the database.
nom_bras,
vector of the names of the arm levels (optional).
var_dinteret,
endpoint (usually status); binary for the logistic regression and the Cox model (failure/success, no/yes, ...).
failcode = 1,
value of the event of interest for a Fine & Gray model.
delai_dinteret,
time to occurrence of the endpoint (usually time); not needed for a logistic model.
digit = 2,
number of decimal places used to round the p-value, the confidence interval and the threshold value of quantitative variables.
fixed_digit = FALSE,
whether to fix the number of digits after the decimal point for the estimate and the confidence interval.
cut_off,
vector of the threshold values to apply to the quantitative variables (the median is used by default).
Wald = 'Yes',
'Yes' for the Wald confidence interval, 'No' for the one computed with the confint() function.
percent = FALSE,
whether to display percentages in addition to the counts.
digit_percent = 0,
number of decimal places used to round the percentages.
test = "gailsimon",
test to use.
- "gailsimon": Gail and Simon test
- "vrais": p-value of the interaction term (if 2 categories) or likelihood ratio test (if more than 2 categories)
test_qualit = TRUE,
type of Gail and Simon test to use (only when test = "gailsimon").
- TRUE: qualitative interaction test
- FALSE: quantitative interaction test
print_pval = FALSE,
whether to display the p-value for each subgroup.
)
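As an illustration of two of the options above, the following sketch splits a quantitative covariate at its median (the documented default for cut_off) and tests the arm-by-covariate interaction with a likelihood ratio test, in the spirit of test = "vrais". It is written directly with the survival package and is only an assumption about what the function does internally. The names seuil, age_cat, fit_add and fit_int are illustrative, and placing values equal to the threshold in the lower class is also an assumption.

```r
library(survival)

# Rebuild the example data so the sketch is self-contained
set.seed(143)
d <- survival::lung
d$bras <- sample(c("A", "B"), nrow(d), replace = TRUE)
d$status <- d$status - 1                       # 0 = censored, 1 = dead

# Split age at its median (assumed: values equal to the median go to the lower class)
seuil <- median(d$age, na.rm = TRUE)
d$age_cat <- factor(ifelse(d$age <= seuil, "<= median", "> median"))

# Cox models without and with the arm-by-covariate interaction
fit_add <- coxph(Surv(time, status) ~ bras + age_cat, data = d)
fit_int <- coxph(Surv(time, status) ~ bras * age_cat, data = d)

# Likelihood ratio test of the interaction, computed from the final log-likelihoods
stat <- 2 * (fit_int$loglik[2] - fit_add$loglik[2])
ddl  <- length(coef(fit_int)) - length(coef(fit_add))
p_interaction <- pchisq(stat, df = ddl, lower.tail = FALSE)
round(p_interaction, 2)
```

With only two categories the interaction has a single degree of freedom, which is why the documentation can use the p-value of the interaction term directly in that case.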
dessin_forest_plot function
This function draws a forest plot. It offers a range of arguments to make a "nicer" plot and to simplify the drawing of the forest plot for the user. The create_table_forestplot() function, which is called inside this function, creates the table used to draw the forest plot. The arguments are the same as those of create_table_forestplot(), plus additional optional arguments.
dessin_forest_plot(
data_frame,
study database.
covariables,
vector of the covariates in the database whose interactions with the treatment arm we want to study.
nom_var,
vector of the covariate names to display in the table (optional).
levels_var,
vector of the level names for the qualitative variables (optional).
var_quali_ou_quanti,
vector of the same length as covariables specifying whether each covariate is qualitative (1) or quantitative (0).
modele_souhaite,
model to use (for example, a logistic model for a binary endpoint or a Cox model for survival).
- "logistique", "log", "logit" for a logistic regression
- "cox", "coxph", "survie", "surv" for a Cox model (survival)
- "cscox", "cscoxph" for a cause-specific Cox model (cHR instead of HR, to be used with competing risks)
- "finegray", "crr" for a Fine & Gray model (competing risks)
bras,
name of the treatment arm variable in the database.
nom_bras,
vector of the names of the arm levels (optional).
var_dinteret,
endpoint (usually status); binary for the logistic regression and the Cox model (failure/success, no/yes, ...).
failcode = 1,
value of the event of interest for a Fine & Gray model.
delai_dinteret,
time to occurrence of the endpoint (usually time); not needed for a logistic model.
digit = 2,
number of decimal places used to round the p-value, the confidence interval and the threshold value of quantitative variables.
fixed_digit = FALSE,
whether to fix the number of digits after the decimal point for the estimate and the confidence interval.
cut_off,
vector of the threshold values to apply to the quantitative variables (the median is used by default).
Wald = 'Yes',
'Yes' for the Wald confidence interval, 'No' for the one computed with the confint() function.
percent = FALSE,
whether to display percentages in addition to the counts.
digit_percent = 0,
number of decimal places used to round the percentages.
test = "gailsimon",
test to use.
- "gailsimon": Gail and Simon test
- "vrais": p-value of the interaction term (if 2 categories) or likelihood ratio test (if more than 2 categories)
- NULL: no p-value
test_qualit = TRUE,
type of Gail and Simon test to use (only when test = "gailsimon").
- TRUE: qualitative interaction test
- FALSE: quantitative interaction test
label_bras = 'Events/Pts',
label displayed under the levels of the arm (for example, events for a logistic model, deaths for survival).
title,
title of the forest plot (optional).
xlab,
label of the x-axis of the forest plot (optional).
xlog = FALSE,
whether to use a log scale for the x-axis.
zero = 1,
x-axis position of the reference line of the forest plot.
pos_graphique = 0,
position of the graph; can be 0, 1 or 2.
- 0: the graph is to the left of the HR/OR
- 1: the graph is to the left of pvalue_interaction
- 2: the graph is to the right of pvalue_interaction
col = fpColors(box="black", lines="gray", summary="black"),
colors of the plot.
lwd.ci = 1,
line width of the confidence intervals.
txt_size = fpTxtGp(label=gpar(cex=1), summary=gpar(cex=1), title=gpar(cex=1.3), xlab=gpar(cex=0.8), ticks=gpar(cex=0.8)),
font sizes.
forme = fpDrawDiamondCI,
shape of the estimate (for example, fpDrawDiamondCI for diamonds, fpDrawNormalCI for squares).
boxsize,
size of the estimate marker (based on precision by default).
clip,
lower and upper limits of the x-axis.
- by default: from the minimum HR/OR to the maximum HR/OR
- to set the limits: vector of length 2 with the lower bound and the upper bound
- "clip_tot": the whole graph is displayed
by = 0.5,
distance between the ticks of the x-axis.
favors,
vector of the labels to display below the axis to indicate the direction of the effect (optional).
nom_fichier,
path and name of the file (pdf, png or tiff) in which to save the forest plot (optional).
dim_fichier,
vector of length 2 giving the width and height of the plot when it is saved (optional).
)
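Several of the graphical arguments above mirror arguments of forestplot::forestplot(). The sketch below calls forestplot::forestplot() directly on subgroup estimates computed from the lung data, to show what these settings control. The correspondence used here (forme to fn.ci_norm, txt_size to txt_gp, by to the spacing of xticks) is an assumption about how dessin_forest_plot() passes its arguments on, not something taken from its source, and the plot is only drawn if the forestplot package is installed.

```r
library(survival)

# Rebuild the example data so the sketch is self-contained
set.seed(143)
d <- survival::lung
d$bras <- sample(c("A", "B"), nrow(d), replace = TRUE)
d$status <- d$status - 1                       # 0 = censored, 1 = dead
d$sexe <- factor(d$sex, labels = c("Male", "Female"))

# HR (arm B vs arm A) and Wald 95% CI within each sex subgroup
est <- t(sapply(split(d, d$sexe), function(s) {
  summary(coxph(Surv(time, status) ~ bras, data = s))$conf.int[1, c(1, 3, 4)]
}))
colnames(est) <- c("Mean", "Lower", "Upper")

# Tick positions spaced by `by` and covering the confidence intervals
by <- 0.5
ticks <- seq(floor(min(est[, "Lower"]) / by) * by,
             ceiling(max(est[, "Upper"]) / by) * by, by = by)

if (requireNamespace("forestplot", quietly = TRUE)) {
  forestplot::forestplot(
    labeltext = matrix(rownames(est), ncol = 1),
    mean  = unname(est[, "Mean"]),
    lower = unname(est[, "Lower"]),
    upper = unname(est[, "Upper"]),
    zero = 1, xlog = FALSE,                    # reference line at HR = 1, linear scale
    xticks = ticks, clip = range(ticks),
    lwd.ci = 1,
    fn.ci_norm = forestplot::fpDrawDiamondCI,  # diamonds instead of squares
    col = forestplot::fpColors(box = "black", lines = "gray", summary = "black"),
    txt_gp = forestplot::fpTxtGp(label = grid::gpar(cex = 1),
                                 xlab  = grid::gpar(cex = 0.8),
                                 ticks = grid::gpar(cex = 0.8)),
    xlab = "HR (B vs A)"
  )
}
```

Setting zero = 1 is the natural choice for ratios such as HR or OR, since 1 corresponds to no difference between the arms.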
dessin_forest_simple function
This function draws a forest plot from a data frame that simply contains an estimate and its confidence interval.
dessin_forest_simple(
data_frame,
database with variable, mean, lower and upper columns.
name_col_var = "Variable",
header of the column containing the variables.
name_col_mean = "HR (95%CI)",
header of the column containing the estimates.
IC = TRUE,
TRUE: the CI is displayed in the column with the estimate.
digit = 2,
number of decimal places used to round the estimate and its confidence interval.
lwd.ci = 1,
line width of the confidence intervals.
zero = 1,
x-axis position of the reference line of the forest plot.
clip,
lower and upper limits of the x-axis.
- by default: from the minimum HR/OR to the maximum HR/OR
- to set the limits: vector of length 2 with the lower bound and the upper bound
- "clip_tot": the whole graph is displayed
by = 0.5,
distance between the ticks of the x-axis.
)
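A data frame of the kind expected by data_frame could be built as in the sketch below, which extracts hazard ratios and their 95% confidence intervals from a multivariable Cox model fitted on the lung data. The exact column names and their case, as well as the label format shown for the "HR (95%CI)" column, are assumptions made for illustration; the text above only states that the data frame holds variable, mean, lower and upper columns.

```r
library(survival)

d <- survival::lung
d$status <- d$status - 1                       # 0 = censored, 1 = dead

# Multivariable Cox model; conf.int holds exp(coef) and the 95% CI bounds
fit <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = d)
ci  <- summary(fit)$conf.int

# Data frame with the four columns described above (names assumed)
tab <- data.frame(variable = rownames(ci),
                  mean     = ci[, "exp(coef)"],
                  lower    = ci[, "lower .95"],
                  upper    = ci[, "upper .95"],
                  row.names = NULL)

# Assumed label format when IC = TRUE and digit = 2
digit <- 2
fmt <- function(x) formatC(x, format = "f", digits = digit)
tab$label <- paste0(fmt(tab$mean), " (", fmt(tab$lower), "-", fmt(tab$upper), ")")
tab
```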
This document is a work of the statistics team in the Biostatistics and Medical Information Department at Saint-Louis Hospital in Paris (SBIM).
Based on The R Graph Gallery by Yan Holtz.