Corpus Preparation

from_json_to_df()
Convert JSON to data.table
define_corpus()
Generate a Quanteda Corpus from a Data Table
tokenize_corpus()
Fast Corpus Tokenization
singularize_tokens()
Fast Tokens Singularization
reshape_corpus()
Fast Corpus Reshape
lookup_tokens()
Fast Tokens Lookup

Document-Level Analysis

summarize_corpus()
Fast Corpus Summarization
calculate_readability()
Fast Calculation of Readability Measures
calculate_similarity() calculate_distance()
Fast Calculation of Similarity and Distance Measures
parse_corpus()
Fast Corpus Parsing via spaCy

Built-In Dictionaries

data_dictionary_BozanicRoulstoneVanBuskirk_FLS
Bozanic Roulstone VanBuskirk Forward Looking Statement Dictionary
data_dictionary_Cannon_Ling_Wang_Watanabe
Cannon, Ling, Wang, and Watanabe CSR Dictionary
data_dictionary_Li_FLS
Feng Li Forward Looking Statement Dictionary
data_dictionary_LoughranMcDonald_Complexity
Loughran and McDonald Firm Complexity Dictionary
data_dictionary_SDG
U.N. Sustainable Development Goals (SDG) Mapping Dictionary

Topic Model API

fit_topic_model()
Fit a Topic Model Via a Unified API
as_nlp_topic_fit()
Convert Existing Topic-Model Objects to nlp_topic_fit
get_dtw()
Extract Standardized Document Topic Weights
get_tww()
Extract Standardized Topic Word Weights
get_top_terms()
Extract Top Terms from Standardized TWW
predict_topic_model()
Predict Document Topic Weights for New Data
evaluate_topic_model()
Evaluate a Fitted Topic Model
select_k_topics()
Select the Number of Topics by Grid Search
summarize_k_selection()
Summarize Topic-Count Selection Results
assess_topic_stability()
Assess Topic Stability Across Repeated Fits
summarize_topics()
Summarize Topics for Interpretation
get_representative_candidates()
Extract Representative Topic Candidates

Topic Model Plots and Backends

plot_top_terms()
Visualize Topic-Word Probabilities
plot_dtw()
Plot the Distribution of Document Topic Weights
get_stm_topic_labels()
Extract STM Topic Labels
summarize_stm_topics()
Summarize STM Topics
estimate_stm_topic_effects()
Estimate STM Topic Effects
get_topic_hyperparameters()
Extract Topic-Model Hyperparameters
get_topic_embeddings()
Extract ETM Topic Embeddings
get_term_embeddings()
Extract ETM Term Embeddings
plot_topic_embeddings()
Plot ETM Topic Embeddings

OpTop Interoperability

as_optop_weighted_dfm()
Prepare Weighted DFM Input for OpTop
as_optop_input()
Prepare NLPstudio Topic Models for OpTop

S3 Methods

plot(<nlp_k_selection>)
Plot Topic-Count Selection Results
print(<nlp_k_selection>)
Print a Compact Summary of Topic-Count Selection Results
print(<nlp_k_selection_summary>)
Print K-selection summary
print(<nlp_optop_input>)
Print OpTop input summary
print(<nlp_topic_fit>)
Print a Compact Summary of a Topic-Model Fit
print(<nlp_topic_stability>)
Print Topic Stability Results