Extract an ordered list of raw topicmodels::LDA(method = "VEM") fits and a
vocabulary-aligned weighted DFM for OpTop::optimal_topic().
Arguments
- x
An nlp_k_selection object created with return_fits = TRUE, a list of nlp_topic_fit objects, or a list of raw LDA_VEM objects.
- weighted_dfm
A weighted quanteda::dfm, usually created with as_optop_weighted_dfm() from the same fitting input used for the LDA models.
Value
A list of class c("nlp_optop_input", "list") with:
- lda_models
Raw LDA_VEM objects ordered by topic count.
- weighted_dfm
Weighted DFM aligned to the LDA vocabulary.
- k
Integer topic counts in ascending order.
Details
OpTop currently expects LDA_VEM objects from topicmodels. This adapter
intentionally rejects Gibbs LDA, CTM, text2vec, seededlda, ETM, and
partial fits so that users do not pass objects outside OpTop's current
assumptions.
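The kind of check described above can be sketched in base R. This is only an illustration, not the adapter's actual implementation: check_lda_vem() is a hypothetical helper name, and the sketch assumes that testing each element's class with inherits() is enough to separate LDA_VEM fits from everything else.

```r
# Hypothetical helper, not part of NLPstudio: stop unless every element
# of `fits` carries the class "LDA_VEM".
check_lda_vem <- function(fits) {
  ok <- vapply(fits, function(fit) inherits(fit, "LDA_VEM"), logical(1))
  if (!all(ok)) {
    stop(
      "Unsupported fit(s) at position(s): ",
      paste(which(!ok), collapse = ", "),
      call. = FALSE
    )
  }
  # Return the input invisibly so the check can be used in a pipeline.
  invisible(fits)
}
```

Failing early with the offending positions makes it easier to see which element of a mixed list was, for example, a Gibbs fit rather than a VEM fit.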
NLPstudio does not import or call OpTop. After preparing the input, call
OpTop::optimal_topic(lda_models = input$lda_models, weighted_dfm = input$weighted_dfm, ...) when OpTop is installed.
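Because OpTop is optional, the call can be guarded with requireNamespace(). The wrapper below is a minimal sketch: run_optop() is a hypothetical name, and it assumes only the lda_models and weighted_dfm arguments shown above, passing anything else through unchanged.

```r
# Hypothetical wrapper, not part of NLPstudio: call OpTop only when it
# is installed, otherwise return NULL with a message.
run_optop <- function(input, ...) {
  if (!requireNamespace("OpTop", quietly = TRUE)) {
    message("OpTop is not installed; skipping optimal_topic().")
    return(invisible(NULL))
  }
  OpTop::optimal_topic(
    lda_models = input$lda_models,
    weighted_dfm = input$weighted_dfm,
    ...
  )
}
```

With the object built in the Examples section, this would be called as run_optop(optop_input).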
Examples
dtm <- methods::as(
  Matrix::Matrix(
    matrix(c(2, 1, 0, 0, 1, 1, 1, 0, 0, 1, 2, 1,
             0, 0, 1, 2, 1, 0, 1, 1, 1, 2, 0, 1),
           nrow = 6, byrow = TRUE),
    sparse = TRUE
  ),
  "dgCMatrix"
)
rownames(dtm) <- paste0("doc", 1:6)
colnames(dtm) <- paste0("term", 1:4)
dfmat <- quanteda::as.dfm(dtm)
selection <- select_k_topics(
  dfmat,
  engine = "topicmodels",
  model = "lda",
  method = "VEM",
  k_grid = 2:3,
  metrics = c("diversity", "exclusivity"),
  holdout = 0,
  return_fits = TRUE,
  control = list(
    fit = list(seed = 1, em = list(iter.max = 5), var = list(iter.max = 5))
  )
)
optop_input <- as_optop_input(selection, as_optop_weighted_dfm(dfmat))
# OpTop::optimal_topic(optop_input$lda_models, optop_input$weighted_dfm)
