Create a faceted bar chart, built with ggplot2, of the highest-probability
terms for each topic, using the long-format output of get_top_terms(). Each
facet corresponds to one topic, and bars represent the estimated topic–word
probabilities (\(\phi\)).
Usage
plot_top_terms(top_terms, facet_args = list(scales = "free_y"), ...)
Arguments
- top_terms
A data.table returned by get_top_terms() with format = "long". Must contain the columns rank, topic, term, and probability.
- facet_args
A named list of additional arguments passed to facet_wrap(). Defaults to list(scales = "free_y"), which allows each facet to have its own y-axis scale.
- ...
Additional arguments passed to geom_col().
Value
A ggplot object: a faceted horizontal bar chart with one facet per topic. Each bar shows the probability of a term within that topic, as estimated by the topic–word distribution matrix (\(\phi\)).
Details
The function visualizes topic–word probabilities in a tidy, per-topic
format. Terms are ranked within each topic by descending probability
and reordered internally using tidytext::reorder_within to ensure
correct sorting within facets. Typically, this function is used in
combination with get_top_terms() (with format = "long") to prepare
the input data.
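The within-facet ordering described above can be illustrated with a minimal sketch. This is not the package's implementation: the input table is made up by hand (it only mimics the rank, topic, term, and probability columns), and the plotting code is an assumption about how such a chart could be assembled from ggplot2 and tidytext.

```r
library(ggplot2)
library(tidytext)

# Hypothetical long-format input; values are invented for illustration.
top_terms <- data.frame(
  topic = rep(c(1L, 2L), each = 3),
  rank = rep(1:3, times = 2),
  term = c("alpha", "beta", "gamma", "beta", "delta", "alpha"),
  probability = c(0.50, 0.30, 0.20, 0.45, 0.35, 0.20)
)

p <- ggplot(
  top_terms,
  aes(
    x = probability,
    # Order terms by probability separately inside each topic.
    y = reorder_within(term, probability, topic)
  )
) +
  geom_col() +
  # Strip the suffix that reorder_within() appends to the labels.
  scale_y_reordered() +
  # Free y scales so each facet lists only its own terms.
  facet_wrap(~ topic, scales = "free_y") +
  labs(x = "Probability", y = NULL)

print(p)
```

Without reorder_within(), a term that appears in several topics (here "alpha" and "beta") would receive a single global position, so bars could not be sorted correctly in every facet at once.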
Examples
if (FALSE) { # interactive() && requireNamespace("text2vec", quietly = TRUE)
  # Requires the optional tidytext package.
  dtm <- methods::as(
    Matrix::Matrix(
      matrix(
        c(1, 0, 1,
          1, 1, 0,
          0, 1, 1,
          1, 1, 1),
        nrow = 4,
        byrow = TRUE
      ),
      sparse = TRUE
    ),
    "dgCMatrix"
  )
  colnames(dtm) <- paste0("term", 1:3)
  rownames(dtm) <- paste0("doc", 1:4)
  model <- fit_topic_model(
    dtm,
    engine = "text2vec",
    model = "lda",
    k = 2,
    control = list(fit = list(n_iter = 25, progressbar = FALSE))
  )
  top_terms <- get_top_terms(model, n = 3, format = "long")
  plot_top_terms(top_terms)
}
