Unigram/Bigram/Trigram/N-gram

Terms for keyword clusters or phrase patterns of one, two, three, or n words, identified during keyword analysis or when reverse-engineering search intent.

An N-gram is a contiguous sequence of n items (usually words or characters) extracted from text. N-grams are usually named by length: unigrams (1 word), bigrams (2 words), and trigrams (3 words).
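The definition above can be illustrated with a minimal Python sketch. The function name word_ngrams, the whitespace tokenization, and the sample phrase are assumptions made for this example; real pipelines usually apply a proper tokenizer and stop-word handling first.

```python
def word_ngrams(text, n):
    """Return the contiguous n-word sequences in `text` as tuples.

    Tokenization here is a plain lowercase whitespace split, which is
    an assumption for illustration only.
    """
    tokens = text.lower().split()
    # Slide a window of width n across the token list.
    # If the text has fewer than n tokens, the result is an empty list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sample query, not taken from any real dataset.
sample = "best running shoes for flat feet"

unigrams = word_ngrams(sample, 1)  # 1-word sequences
bigrams = word_ngrams(sample, 2)   # 2-word sequences
trigrams = word_ngrams(sample, 3)  # 3-word sequences

print(bigrams[0])   # first bigram of the sample
print(trigrams[0])  # first trigram of the sample
```

A text of k words yields k - n + 1 N-grams of length n, so the six-word sample gives six unigrams, five bigrams, and four trigrams.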
N-grams underpin several fuzzy matching techniques, which compare strings by the substrings they share rather than character by character or by meaning. N-gram based algorithms are efficient for fast extraction involving large patterns, which lets them scale to large datasets. In KeyBERT, candidate keywords are extracted and grouped by N-gram length (set through the keyphrase_ngram_range parameter). In competitor analysis, N-grams are also extracted and visualized to show the top phrases shared across content metadata.
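As a rough sketch of N-gram based fuzzy matching, the example below scores two strings by the overlap of their character trigrams, using the Jaccard index (shared N-grams divided by all distinct N-grams). The function names, the choice of n = 3, and the Jaccard measure are assumptions for illustration; other tools may use different N-gram sizes, padding, or similarity measures.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams in `text` (lowercased, trimmed)."""
    normalized = text.lower().strip()
    if len(normalized) < n:
        # Strings shorter than n are kept whole so they can still match.
        return {normalized} if normalized else set()
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Jaccard similarity of the character n-gram sets of `a` and `b`.

    Returns a float in [0.0, 1.0]; 1.0 means the n-gram sets are identical.
    """
    grams_a = char_ngrams(a, n)
    grams_b = char_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0  # two empty strings are treated as identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# A transposition typo still shares some trigrams with the original,
# while an unrelated word shares none.
print(ngram_similarity("keyword", "keywrod"))
print(ngram_similarity("keyword", "banana"))
```

Because the score depends only on which substrings the two strings share, a misspelling keeps a non-zero score even though the strings differ at several character positions.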