Selecting the Number and Labels of Topics in Topic Modeling: A Tutorial

Publisher: SAGE
Copyright: © The Author(s) 2023
ISSN: 2515-2459
eISSN: 2515-2467
DOI: 10.1177/25152459231160105

Abstract

Topic modeling is a type of text analysis that identifies clusters of co-occurring words, or latent topics. A challenging step of topic modeling is determining the number of topics to extract. This tutorial describes tools researchers can use to identify the number and labels of topics in topic modeling. First, we outline the procedure for narrowing down a large range of models to a small number of candidate models. This procedure involves comparing the models in the large set on fit metrics, including exclusivity, residuals, variational lower bound, and semantic coherence. Next, we describe the comparison of a small number of models using project goals as a guide and information about topic representativeness and solution congruence. Finally, we describe tools for labeling topics, including frequent and exclusive words, key examples, and correlations among topics.
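To make one of the fit metrics named above concrete, the following is a minimal sketch of semantic coherence, assuming the commonly used definition from Mimno et al. (2011): for a topic's top words, sum log((D(v_m, v_l) + 1) / D(v_l)) over word pairs, where D counts the documents containing the given words. The abstract does not state which formula the tutorial uses, so the definition, the function name `semantic_coherence`, and the toy corpus are all assumptions made for illustration, not the authors' implementation.

```python
import math


def semantic_coherence(top_words, documents):
    """Semantic coherence of one topic (assumed Mimno et al., 2011 form).

    top_words: a topic's most probable words, ordered from most to least probable.
    documents: an iterable of tokenized documents (lists of words).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            denom = doc_freq(top_words[l])
            if denom == 0:
                raise ValueError(f"{top_words[l]!r} does not occur in the corpus")
            # The +1 keeps the logarithm finite when two words never co-occur.
            score += math.log((doc_freq(top_words[m], top_words[l]) + 1) / denom)
    return score


# Hypothetical toy corpus, invented purely for illustration.
docs = [
    ["infant", "sleep", "feeding"],
    ["infant", "sleep"],
    ["school", "reading"],
    ["infant", "feeding", "school"],
]

coherent = semantic_coherence(["infant", "sleep"], docs)
mixed = semantic_coherence(["infant", "reading"], docs)
```

Values closer to zero indicate top words that tend to appear in the same documents. In this toy corpus, "infant" and "sleep" co-occur in two documents while "infant" and "reading" never do, so the first word pair scores higher than the second.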

Journal: Advances in Methods and Practices in Psychological Science (SAGE)

Published: May 1, 2023

Keywords: child; development; health; infant; natural language processing; structural topic modeling; topic modeling
