KeyBERT is a keyword extraction technique that leverages the contextual word embeddings generated by BERT (Bidirectional Encoder Representations from Transformers) to identify the keywords and phrases most relevant to a document. This approach surpasses traditional frequency-based methods by incorporating semantic understanding.
The process involves tokenizing raw documents into N-grams (candidate keywords), embedding both the document and the N-grams with a language model (such as SBERT), and then computing the cosine similarity between each candidate and the document. The candidates with the highest similarity scores are extracted as keywords. Beyond extraction, the same embeddings can be used to cluster keywords by semantic similarity, organizing a keyword universe into groups or silos based on shared characteristics such as unigrams, bigrams, or trigrams. The resulting clusters can then be paired with metrics like search volume, difficulty, and entity data to build a nuanced map of the keyword universe.
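The pipeline above can be sketched in a few lines of plain Python. This is a toy illustration, not KeyBERT itself: the `embed` function here is a simple bag-of-words count vector standing in for a real BERT/SBERT embedding, and all function names are invented for this example. The candidate-generation, embedding, and cosine-ranking steps mirror the actual process.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # Slide a window of length n over the token list to produce candidates.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def embed(text):
    # Toy stand-in for a contextual embedding: a bag-of-words count vector.
    # KeyBERT would use a transformer model (e.g. SBERT) here instead.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def extract_keywords(doc, ngram_range=(1, 2), top_n=3):
    # 1. Tokenize and generate N-gram candidates.
    tokens = doc.lower().split()
    candidates = {g for n in range(ngram_range[0], ngram_range[1] + 1)
                    for g in ngrams(tokens, n)}
    # 2. Embed the document once, then 3. rank candidates by similarity.
    doc_vec = embed(doc)
    ranked = sorted(candidates,
                    key=lambda c: cosine(embed(c), doc_vec),
                    reverse=True)
    return ranked[:top_n]
```

The real library wraps these steps behind a single call along the lines of `KeyBERT().extract_keywords(doc, keyphrase_ngram_range=(1, 2), top_n=5)`, with the embedding supplied by a sentence-transformers model rather than raw counts.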
