N-gram

A contiguous sequence of n items from a sequence of text, used for analysis.

An N-gram is a contiguous sequence of n items, typically words, taken from a text sequence. Unigrams (n=1), bigrams (n=2), and trigrams (n=3) are the most common sizes, and they are used to analyze recurring patterns and relationships in search queries. Keyword-extraction techniques such as KeyBERT build on them: KeyBERT takes candidate n-grams from a text and ranks them by how closely their embeddings match the text as a whole, which surfaces the phrases that best capture its core meaning. N-gram analysis can also feed rule-based classification, where frequently co-occurring terms (such as “how to fix” followed by an entity attribute) are used to reverse-engineer micro-intents or specific user personas.
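
As a rough illustration of the counting step, the sketch below extracts word-level n-grams from a handful of search queries and finds the most frequent trigram. The queries, the `ngrams` helper, and the `classify_intent` rule are all hypothetical and invented for this example; the tokenizer is a plain whitespace split, which a real pipeline would likely replace with something more robust.

```python
from collections import Counter


def ngrams(text, n):
    """Return the word-level n-grams of `text` as a list of tuples."""
    tokens = text.lower().split()  # naive whitespace tokenization (assumption)
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical search queries, made up for illustration only.
queries = [
    "how to fix leaking faucet",
    "how to fix squeaky door",
    "how to fix cracked screen",
    "best cordless drill",
]

# Count every trigram across all queries.
trigram_counts = Counter(g for q in queries for g in ngrams(q, 3))
top_trigram, top_count = trigram_counts.most_common(1)[0]
print(top_trigram, top_count)  # ('how', 'to', 'fix') 3


def classify_intent(query):
    """Toy rule: a query containing the trigram 'how to fix' is troubleshooting."""
    if ("how", "to", "fix") in ngrams(query, 3):
        return "troubleshooting"
    return "other"


print(classify_intent("how to fix flat tire"))  # troubleshooting
```

A frequent trigram such as “how to fix” is the kind of pattern a rule-based classifier can key on; the label names here are placeholders, not an established taxonomy.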