Plot fitted survival and Kaplan-Meier curves from a multi-state model
Source:R/survplot.R
survplot.RdPlot fitted survival probabilities from an msm::msm() model and compare
them with Kaplan-Meier estimates. The function can also return the data used
to build each curve.
Usage
survplot(
x,
from = 1,
to = NULL,
range = NULL,
covariates = "mean",
exacttimes = TRUE,
times,
grid = 100L,
km = FALSE,
ci = c("none", "normal", "bootstrap"),
interp = c("start", "midpoint"),
B = 100L,
ci_km = c("none", "plain", "log", "log-log", "logit", "arcsin"),
print_plot = TRUE,
verbosity = getOption("msmtools.verbosity", "quiet"),
...
)Arguments
- x
A fitted msm model object.
- from
State from which to compute the estimated survival. Defaults to state 1.
- to
The absorbing state to which compute the estimated survival. Defaults to the highest state found by
msm::absorbing.msm().- range
A numeric vector of two elements giving the time range of the plot.
- covariates
Covariate values for which to evaluate the expected probabilities. These can be
"mean", denoting the means of the covariates in the data (default); the number 0, indicating that all covariates should be set to zero; or a list of values, with optional names. For example:list(75, 1)The unnamed list must follow the order of the covariates in the original model formula. A named list is also accepted:
list(age = 75, gender = "M").- exacttimes
If
TRUE(default), transition times are known and exact. This should match the value used when fitting the model with msm.- times
An optional numeric vector giving the times at which to compute the fitted survival.
- grid
An integer specifying the grid points at which to compute the fitted survival curve (see Details). If
timesis passed,gridis ignored. Defaults to 100 points.- km
If
TRUE, the Kaplan-Meier curve is plotted. Default isFALSE.- ci
A character vector with the type of confidence intervals to compute for the fitted survival curve. Specify either
"none"(default), for no confidence intervals,"normal"or"bootstrap", for confidence intervals computed with the respective method inmsm::pmatrix.msm(). This is computationally intensive, since intervals must be computed at a series of times.- interp
If
"start"(default), then the entry time into the absorbing state is assumed to be the time it is first observed in the data. If"midpoint", then the entry time into the absorbing state is assumed to be halfway between the time it is first observed and the previous observation time. This is generally more reasonable for "progressive" models with observations at arbitrary times.- B
Number of bootstrap or normal replicates for the confidence interval. The default is 100 rather than the usual 1000, since these plots are for rough diagnostic purposes.
- ci_km
A character vector with the type of confidence intervals to compute for the Kaplan-Meier curve. Specify either
"none","plain","log","log-log","logit", or"arcsin", as coded insurvival::survfit().- print_plot
If
TRUE(default), the plot is printed before being returned. IfFALSE, the plot is returned without printing.- verbosity
Controls informational output. Use
"quiet"to suppress status messages and"summary"or"progress"for high-level messages.- ...
Reserved for the migration trampoline. Passing the legacy
outargument here raises an informative error pointing to the new$fitted/$kmaccess pattern. The trampoline will be removed in v2.3.0.
Value
A gg/ggplot object. The fitted and (when km = TRUE)
Kaplan-Meier data tables are attached to the returned plot as named
fields:
$fitted— adata.tablewith columnstime,surv, and (whenciis not"none")lwr/upr. Always present.$km— adata.tablewith the Kaplan-Meier curve, exposed only whenkm = TRUE.
Access the data through the standard $ operator:
p <- survplot(model, km = TRUE)
p # prints the plot
p$fitted # fitted survival data
p$km # Kaplan-Meier dataprint_plot only controls whether the plot is printed as a side effect.
Returned objects are unchanged: use print_plot = FALSE to create the plot
or returned data silently.
Details
The function wraps msm::plot.survfit.msm() and adds support for
exact-time plots by resetting the time scale to follow-up time. It returns
a gg/ggplot object so the plot composes directly with ggplot2::ggsave(),
ggplot2::theme(), and other ggplot operations.
You can pass custom evaluation times through times, or let survplot()
define them from grid. Larger grid values produce a finer grid and
increase computation time.
References
Titman, A. and Sharples, L.D. (2010). Model diagnostics for multi-state models, Statistical Methods in Medical Research, 19, 621-651.
Titman, A. and Sharples, L.D. (2008). A general goodness-of-fit test for Markov and hidden Markov models, Statistics in Medicine, 27, 2177-2195.
Jackson, C.H. (2011). Multi-State Models for Panel Data: The msm Package for R. Journal of Statistical Software, 38(8), 1-29. https://www.jstatsoft.org/v38/i08/.
Author
Francesco Grossetti francesco.grossetti@unibocconi.it.
Examples
if (FALSE) { # interactive()
data(hosp)
# augmenting the data
hosp_augmented = augment(data = hosp, data_key = subj, n_events = adm_number,
pattern = label_3, t_start = dateIN, t_end = dateOUT,
t_cens = dateCENS)
# let's define the initial transition matrix for our model
Qmat = matrix(data = 0, nrow = 3, ncol = 3, byrow = TRUE)
Qmat[1, 1:3] = 1
Qmat[2, 1:3] = 1
colnames(Qmat) = c('IN', 'OUT', 'DEAD')
rownames(Qmat) = c('IN', 'OUT', 'DEAD')
# fitting the model using
# gender and age as covariates
library(msm)
msm_model = msm(status_num ~ augmented_int, subject = subj,
data = hosp_augmented, covariates = ~ gender + age,
exacttimes = TRUE, gen.inits = TRUE, qmatrix = Qmat,
method = 'BFGS', control = list(fnscale = 6e+05, trace = 0,
REPORT = 1, maxit = 10000))
# plotting the fitted and empirical survival from state = 1
theplot = survplot(x = msm_model, km = TRUE)
# the fitted and Kaplan-Meier data tables are attached to the plot
head(theplot$fitted)
head(theplot$km)
}