as_nlp_topic_fit() converts supported fitted topic-model objects into the current nlp_topic_fit class used by fit_topic_model(). It can adopt raw fits from supported backends and saved outputs from the removed warp_lda() wrapper without refitting models.

Usage

as_nlp_topic_fit(x, ...)

# S3 method for class 'nlp_topic_fit'
as_nlp_topic_fit(x, ...)

# S3 method for class 'list'
as_nlp_topic_fit(
  x,
  k = NULL,
  doc_ids = NULL,
  vocab = NULL,
  docvars = NULL,
  doc_data = NULL,
  control = list(),
  warn_partial = TRUE,
  ...
)

# S3 method for class 'TopicModel'
as_nlp_topic_fit(x, docvars = NULL, doc_data = NULL, ...)

# S3 method for class 'LDA_Gibbs'
as_nlp_topic_fit(x, docvars = NULL, doc_data = NULL, ...)

# S3 method for class 'LDA_VEM'
as_nlp_topic_fit(x, docvars = NULL, doc_data = NULL, ...)

# S3 method for class 'CTM_VEM'
as_nlp_topic_fit(x, docvars = NULL, doc_data = NULL, ...)

# S3 method for class 'textmodel'
as_nlp_topic_fit(x, model = NULL, docvars = NULL, doc_data = NULL, ...)

# S3 method for class 'textmodel_lda'
as_nlp_topic_fit(x, model = NULL, docvars = NULL, doc_data = NULL, ...)

# S3 method for class 'WarpLDA'
as_nlp_topic_fit(
  x,
  theta = NULL,
  doc_ids = NULL,
  vocab = NULL,
  docvars = NULL,
  doc_data = NULL,
  control = list(),
  warn_partial = TRUE,
  ...
)

# S3 method for class 'STM'
as_nlp_topic_fit(x, doc_ids = NULL, docvars = NULL, doc_data = NULL, ...)

# Default S3 method
as_nlp_topic_fit(x, ...)

Arguments

x

Object to convert.

...

Additional arguments forwarded to methods.

k

Optional topic count. Usually inferred from theta, phi, or the stored backend object.

doc_ids

Optional document IDs used when legacy theta does not already contain document identifiers.

vocab

Optional vocabulary used when legacy phi does not already contain term names.

docvars

Optional document metadata to store on the converted object.

doc_data

Optional document metadata or text sidecar to store on the converted object.

control

Optional backend controls to store as migration metadata. Use control$model$doc_topic_prior and control$model$topic_word_prior when the old model used non-default WarpLDA priors.

warn_partial

Logical. Warn when theta or phi cannot be recovered. Defaults to TRUE.

model

Optional model family override for raw seededlda objects. Use "seqlda" for sequential LDA fits, which are not reliably distinguishable from ordinary seededlda LDA after fitting.

theta

Optional document-topic matrix for raw text2vec WarpLDA objects. Raw WarpLDA objects do not retain the return value of fit_transform(), so pass that matrix here when available.

Value

An object of class c("nlp_topic_fit", "list").

Details

Supported input families are:

  • topicmodels S4 fits from topicmodels::LDA() and topicmodels::CTM() (LDA_Gibbs, LDA_VEM, and CTM_VEM);

  • seededlda textmodel fits from textmodel_lda() and textmodel_seededlda();

  • raw text2vec WarpLDA/LDA R6 objects, optionally paired with the theta matrix returned by fit_transform();

  • raw stm STM objects without content covariates;

  • saved list outputs from the removed NLPstudio warp_lda() wrapper.
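
As an illustration of the first family above, the following sketch fits a small Gibbs LDA with topicmodels and adopts it without refitting. It assumes both topicmodels and NLPstudio are installed; the subset size, k, seed, and iteration count are arbitrary illustrative choices, not recommendations.

```r
# Sketch only: runs when both packages are available, otherwise does nothing.
if (requireNamespace("topicmodels", quietly = TRUE) &&
    requireNamespace("NLPstudio", quietly = TRUE)) {
  # Small document-term matrix shipped with topicmodels.
  data("AssociatedPress", package = "topicmodels")
  dtm <- AssociatedPress[1:50, ]

  # Raw backend fit; with method = "Gibbs" this is an LDA_Gibbs object.
  raw_fit <- topicmodels::LDA(
    dtm,
    k = 5,
    method = "Gibbs",
    control = list(seed = 1, iter = 200)
  )

  # Adopt the existing fit; no model is re-estimated here.
  fit <- NLPstudio::as_nlp_topic_fit(raw_fit)
  print(class(fit))
}
```

The same call pattern should apply to LDA_VEM and CTM_VEM fits, since their methods share the signature shown under Usage.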

The conversion is non-refitting. It standardizes the cached document-topic weight (DTW) and topic-word weight (TWW) matrices, topic IDs, document IDs, vocabulary, and metadata wherever those components are already present on the input object. Raw text2vec objects do not retain document-topic weights internally, so pass theta when downstream DTW access is needed.
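
The point about raw text2vec objects can be sketched as follows: the matrix returned by fit_transform() is kept and handed to the converter through theta. This assumes text2vec and NLPstudio are installed; the corpus subset, pruning threshold, topic count, and iteration count are illustrative only.

```r
# Sketch only: runs when both packages are available, otherwise does nothing.
if (requireNamespace("text2vec", quietly = TRUE) &&
    requireNamespace("NLPstudio", quietly = TRUE)) {
  data("movie_review", package = "text2vec")
  reviews <- movie_review[1:200, ]

  # Tokenize and build a document-term matrix.
  it <- text2vec::itoken(
    reviews$review,
    preprocessor = tolower,
    tokenizer = text2vec::word_tokenizer,
    ids = reviews$id,
    progressbar = FALSE
  )
  vocabulary <- text2vec::prune_vocabulary(
    text2vec::create_vocabulary(it),
    term_count_min = 5
  )
  dtm <- text2vec::create_dtm(it, text2vec::vocab_vectorizer(vocabulary))

  # Fit WarpLDA and keep the return value: the R6 object itself
  # does not store these document-topic weights.
  lda <- text2vec::LDA$new(n_topics = 5)
  theta <- lda$fit_transform(dtm, n_iter = 100, progressbar = FALSE)

  # Pair the raw object with its theta matrix during conversion.
  fit <- NLPstudio::as_nlp_topic_fit(lda, theta = theta)
  print(class(fit))
}
```

If theta was not saved at fitting time, the conversion can still be attempted without it, in which case warn_partial = TRUE reports what could not be recovered.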

Raw stm content-covariate models are not converted because they imply covariate-specific topic-word distributions, while NLPstudio currently standardizes one TWW matrix per fit.
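
A pre-check for this restriction might look like the sketch below. has_content_covariate() is a hypothetical helper, not part of NLPstudio or stm, and it rests on the assumption that an STM object stores one log topic-word matrix per content-covariate level in x$beta$logbeta, so that more than one entry indicates a content covariate. The mock lists only imitate that layout and are not real stm fits.

```r
# Hypothetical helper (not an NLPstudio or stm function).
# Assumption: x$beta$logbeta is a list with one matrix per
# content-covariate level; a single entry means no content covariate.
has_content_covariate <- function(x) {
  length(x$beta$logbeta) > 1L
}

# Mock objects that imitate the assumed layout.
plain_fit <- list(beta = list(logbeta = list(matrix(0, 2, 3))))
content_fit <- list(
  beta = list(logbeta = list(matrix(0, 2, 3), matrix(0, 2, 3)))
)

has_content_covariate(plain_fit)
has_content_covariate(content_fit)
```

Under that assumption, only fits for which the helper returns FALSE would be candidates for as_nlp_topic_fit().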

Examples

if (FALSE) { # interactive()
old <- readRDS("legacy-warp-lda-output.rds")
fit <- as_nlp_topic_fit(old)
get_top_terms(fit)
}
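
A variant of the example above records the priors of the old model as migration metadata through control, as described under Arguments. This is a sketch: the file name is the placeholder from the example above, the prior values are illustrative stand-ins for whatever the old model actually used, and the code does nothing unless the file exists and NLPstudio is installed.

```r
# Placeholder path taken from the example above.
path <- "legacy-warp-lda-output.rds"

if (file.exists(path) && requireNamespace("NLPstudio", quietly = TRUE)) {
  old <- readRDS(path)

  fit <- NLPstudio::as_nlp_topic_fit(
    old,
    # Illustrative values; replace with the priors the old model used.
    control = list(
      model = list(doc_topic_prior = 0.1, topic_word_prior = 0.01)
    ),
    # Keep the warning if theta or phi cannot be recovered.
    warn_partial = TRUE
  )
  print(class(fit))
}
```

When the saved theta or phi lacks document identifiers or term names, doc_ids and vocab can be supplied in the same call.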