Skip to contents

Plot fitted survival probabilities from an msm::msm() model and compare them with Kaplan-Meier estimates. The function can also return the data used to build each curve.

Usage

survplot(
  x,
  from = 1,
  to = NULL,
  range = NULL,
  covariates = "mean",
  exacttimes = TRUE,
  times,
  grid = 100L,
  km = FALSE,
  ci = c("none", "normal", "bootstrap"),
  interp = c("start", "midpoint"),
  B = 100L,
  ci_km = c("none", "plain", "log", "log-log", "logit", "arcsin"),
  print_plot = TRUE,
  verbosity = getOption("msmtools.verbosity", "quiet"),
  ...
)

Arguments

x

A fitted msm model object.

from

State from which to compute the estimated survival. Defaults to state 1.

to

The absorbing state to which compute the estimated survival. Defaults to the highest state found by msm::absorbing.msm().

range

A numeric vector of two elements giving the time range of the plot.

covariates

Covariate values for which to evaluate the expected probabilities. These can be "mean", denoting the means of the covariates in the data (default); the number 0, indicating that all covariates should be set to zero; or a list of values, with optional names. For example:

list(75, 1)

The unnamed list must follow the order of the covariates in the original model formula. A named list is also accepted:

list(age = 75, gender = "M").

exacttimes

If TRUE (default), transition times are known and exact. This should match the value used when fitting the model with msm.

times

An optional numeric vector giving the times at which to compute the fitted survival.

grid

An integer specifying the grid points at which to compute the fitted survival curve (see Details). If times is passed, grid is ignored. Defaults to 100 points.

km

If TRUE, the Kaplan-Meier curve is plotted. Default is FALSE.

ci

A character vector with the type of confidence intervals to compute for the fitted survival curve. Specify either "none" (default), for no confidence intervals, "normal" or "bootstrap", for confidence intervals computed with the respective method in msm::pmatrix.msm(). This is computationally intensive, since intervals must be computed at a series of times.

interp

If "start" (default), then the entry time into the absorbing state is assumed to be the time it is first observed in the data. If "midpoint", then the entry time into the absorbing state is assumed to be halfway between the time it is first observed and the previous observation time. This is generally more reasonable for "progressive" models with observations at arbitrary times.

B

Number of bootstrap or normal replicates for the confidence interval. The default is 100 rather than the usual 1000, since these plots are for rough diagnostic purposes.

ci_km

A character vector with the type of confidence intervals to compute for the Kaplan-Meier curve. Specify either "none", "plain", "log", "log-log", "logit", or "arcsin", as coded in survival::survfit().

print_plot

If TRUE (default), the plot is printed before being returned. If FALSE, the plot is returned without printing.

verbosity

Controls informational output. Use "quiet" to suppress status messages and "summary" or "progress" for high-level messages.

...

Reserved for the migration trampoline. Passing the legacy out argument here raises an informative error pointing to the new $fitted / $km access pattern. The trampoline will be removed in v2.3.0.

Value

A gg/ggplot object. The fitted and (when km = TRUE) Kaplan-Meier data tables are attached to the returned plot as named fields:

  • $fitted — a data.table with columns time, surv, and (when ci is not "none") lwr / upr. Always present.

  • $km — a data.table with the Kaplan-Meier curve, exposed only when km = TRUE.

Access the data through the standard $ operator:

p <- survplot(model, km = TRUE)
p           # prints the plot
p$fitted    # fitted survival data
p$km        # Kaplan-Meier data

print_plot only controls whether the plot is printed as a side effect. Returned objects are unchanged: use print_plot = FALSE to create the plot or returned data silently.

Details

The function wraps msm::plot.survfit.msm() and adds support for exact-time plots by resetting the time scale to follow-up time. It returns a gg/ggplot object so the plot composes directly with ggplot2::ggsave(), ggplot2::theme(), and other ggplot operations.

You can pass custom evaluation times through times, or let survplot() define them from grid. Larger grid values produce a finer grid and increase computation time.

References

Titman, A. and Sharples, L.D. (2010). Model diagnostics for multi-state models, Statistical Methods in Medical Research, 19, 621-651.

Titman, A. and Sharples, L.D. (2008). A general goodness-of-fit test for Markov and hidden Markov models, Statistics in Medicine, 27, 2177-2195.

Jackson, C.H. (2011). Multi-State Models for Panel Data: The msm Package for R. Journal of Statistical Software, 38(8), 1-29. https://www.jstatsoft.org/v38/i08/.

Author

Francesco Grossetti francesco.grossetti@unibocconi.it.

Examples

if (FALSE) { # interactive()
data(hosp)

# augmenting the data
hosp_augmented = augment(data = hosp, data_key = subj, n_events = adm_number,
                          pattern = label_3, t_start = dateIN, t_end = dateOUT,
                          t_cens = dateCENS)

# let's define the initial transition matrix for our model
Qmat = matrix(data = 0, nrow = 3, ncol = 3, byrow = TRUE)
Qmat[1, 1:3] = 1
Qmat[2, 1:3] = 1
colnames(Qmat) = c('IN', 'OUT', 'DEAD')
rownames(Qmat) = c('IN', 'OUT', 'DEAD')

# fitting the model using
# gender and age as covariates
library(msm)
msm_model = msm(status_num ~ augmented_int, subject = subj,
                 data = hosp_augmented, covariates = ~ gender + age,
                 exacttimes = TRUE, gen.inits = TRUE, qmatrix = Qmat,
                 method = 'BFGS', control = list(fnscale = 6e+05, trace = 0,
                 REPORT = 1, maxit = 10000))

# plotting the fitted and empirical survival from state = 1
theplot = survplot(x = msm_model, km = TRUE)

# the fitted and Kaplan-Meier data tables are attached to the plot
head(theplot$fitted)
head(theplot$km)
}