Groups data points based on density and proximity. Does not require the number of clusters to be defined in advance, and is well suited to finding arbitrarily shaped clusters and outliers.
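Density-based clustering algorithms treat a cluster as a dense region of the data space that is separated from other clusters by sparser regions. In topic modeling, each dense region corresponds to a topic, and points that fall in sparse regions are labeled as noise (outliers) rather than forced into a cluster. Examples of algorithms in this category include DBSCAN and HDBSCAN.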
This method suits both numeric data and text, provided the text is first converted to numeric embeddings. In practice, BERTopic uses this clustering type, employing HDBSCAN by default to identify dense clusters within a low-dimensional embedding space. In SEO, density-based clustering is used for tasks such as backlink profile analysis and local search clustering.
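To make the idea of "dense regions plus outliers" concrete, here is a minimal pure-Python sketch of DBSCAN. It is an illustration under stated assumptions, not the implementation used by BERTopic or any library: the function names (`region_query`, `dbscan`), the toy `points`, and the parameter values `eps=0.5` and `min_samples=3` are all made up for this example.

```python
from math import dist

NOISE = -1  # label for points that belong to no dense region


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            # Not a core point; may be relabeled later as a border point.
            labels[i] = NOISE
            continue
        # points[i] is a core point: start a new cluster and grow it.
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is also core: keep expanding
    return labels


# Hypothetical toy data: two tight groups and one isolated point.
points = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),
    (5.0, 5.0), (5.1, 5.2), (4.9, 5.1), (5.2, 4.9),
    (9.0, 1.0),
]
labels = dbscan(points, eps=0.5, min_samples=3)
print(labels)
```

Note that the number of clusters is never passed in: it falls out of the two density parameters, and the isolated point is labeled `-1` instead of being attached to the nearest group. For real workloads, a maintained implementation such as scikit-learn's `DBSCAN` (which takes `eps` and `min_samples` and also marks noise as `-1`) is the more sensible choice; this quadratic-time sketch is only meant to show the mechanics.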