Density-based Clustering - MLforSEO

Density-based Clustering

Groups data points based on density and proximity. Does not require pre-defining the number of clusters and is good for finding arbitrarily shaped clusters and outliers.

Density-based clustering algorithms group data points that lie close together in regions of high density and treat points in sparse regions as noise or outliers. In topic modeling, each dense region in the data space corresponds to a topic. Examples of algorithms in this category include DBSCAN and HDBSCAN.
The method works directly on numeric data, and on text once the text has been converted to numeric vectors (embeddings). In practice, BERTopic uses this clustering type: it applies HDBSCAN to document embeddings after they have been reduced to a low-dimensional space. In SEO, density-based clustering is used for tasks such as backlink profile analysis and local search clustering.
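
To make the idea concrete, here is a minimal pure-Python sketch of DBSCAN-style clustering. It is a simplified illustration, not the implementation used by BERTopic or any library: the function names (`region_query`, `dbscan`), the parameter values (`eps=0.5`, `min_samples=3`), and the toy 2D points are all assumptions chosen for the example. A point with at least `min_samples` neighbors within distance `eps` starts or extends a cluster, and points that no dense region reaches are labeled as noise.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1  # label for points that belong to no dense region


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Assign each point a cluster id (0, 1, ...) or NOISE (-1)."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        # points[i] is a core point: start a new cluster and grow it outward
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins but does not expand
                continue
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is also a core point
    return labels


# Hypothetical toy data: two dense groups and one isolated point.
points = [
    (1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2),
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),
    (9.0, 0.5),
]
labels = dbscan(points, eps=0.5, min_samples=3)
print(labels)
```

Note that the number of clusters is never passed in; it falls out of the density parameters, and the isolated point ends up labeled as an outlier. For real workloads, an established implementation such as scikit-learn's `DBSCAN` or the `hdbscan` package is the more likely choice than a hand-rolled loop like this one.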

Sources & References

Introduction to Machine Learning for SEOs Course

academy.mlforseo.com
