BERTopic

BERTopic is an unsupervised, embedding-based topic modeling algorithm. It encodes documents with BERT-style embeddings, reduces their dimensionality with UMAP, and clusters the reduced vectors with HDBSCAN. It generates interpretable topics and performs dynamic clustering, which makes it suitable for large unstructured datasets. It excels at semantic coherence, needs minimal preprocessing, detects the number of topics automatically, and is effective for short text.

In keyword analysis, BERTopic is used for topic modeling and for grouping related keywords into clusters. It is noted for producing interpretable topics and for dynamic clustering. This makes it a good fit for exploratory keyword analysis on large unstructured datasets where no predefined topics are available, and it offers an alternative to supervised methods such as Sentence-BERT or to rule-based fuzzy string matching.
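The embed, reduce, and cluster steps described above can be sketched with the `bertopic` Python package. This is a minimal illustration rather than a recommended configuration: the embedding model name, every parameter value, and the helper name `fit_topics` are assumptions chosen for the example, and it presumes that `bertopic`, `sentence-transformers`, `umap-learn`, and `hdbscan` are installed.

```python
"""Minimal BERTopic pipeline sketch: embed -> UMAP -> HDBSCAN."""

# Illustrative settings only (assumptions, not values taken from the text above).
EMBEDDING_MODEL_NAME = "all-MiniLM-L6-v2"
UMAP_PARAMS = {
    "n_neighbors": 15,
    "n_components": 5,
    "min_dist": 0.0,
    "metric": "cosine",
    "random_state": 42,  # fixed seed so repeated runs give the same reduction
}
HDBSCAN_PARAMS = {
    "min_cluster_size": 10,
    "metric": "euclidean",
    "cluster_selection_method": "eom",
    "prediction_data": True,
}


def fit_topics(docs):
    """Fit BERTopic on a list of strings; return (model, topic id per document)."""
    # Heavy third-party imports stay local so they are only needed when fitting.
    from bertopic import BERTopic
    from hdbscan import HDBSCAN
    from sentence_transformers import SentenceTransformer
    from umap import UMAP

    topic_model = BERTopic(
        embedding_model=SentenceTransformer(EMBEDDING_MODEL_NAME),
        umap_model=UMAP(**UMAP_PARAMS),
        hdbscan_model=HDBSCAN(**HDBSCAN_PARAMS),
    )
    # The number of topics is not specified; it follows from the HDBSCAN clusters.
    topics, _probabilities = topic_model.fit_transform(docs)
    return topic_model, topics
```

Calling `fit_topics(docs)` on a reasonably large list of keywords or short texts returns the fitted model and one topic id per document. `topic_model.get_topic_info()` then lists the discovered topics, where topic `-1` collects documents that HDBSCAN treated as outliers, and `topic_model.get_topic(0)` returns the top words and their scores for topic 0.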
