What Is the EAV Model in SEO? A Practical Framework for Semantic Keyword Research

Written by Lazarina Stoy.

If you’ve ever done programmatic SEO, you’ve already felt this model at work — even if you didn’t have a name for it. The Entity–Attribute–Variable (EAV) model is what lets you think past keywords to the real-world concepts people search for. It’s the closest thing search has to a unified framework that connects how users phrase queries with how systems like Google interpret them.

While keyword lists reduce queries to strings with estimated volumes, entities capture the things a searcher cares about — people, places, products, ideas — and the relationships between them. Layer in attributes (the ways those entities are described) and variables (the specific values those attributes take), and you have a structured way to reason about user intent, plan content that mirrors real decision journeys, and discover gaps competitors haven’t covered.

In this post, we’ll cover what the EAV model is, where it came from outside of SEO, why entities and keywords aren’t the same thing, what the model unlocks at the intro level, and why this framework matters more in AI search than it ever did in traditional Google ranking. The aim isn’t to give you the full programmatic implementation — that’s a much bigger workflow with validation steps, scaling decisions, and quality controls that take a while to walk through properly. The aim is to make sure you understand the model well enough to recognise where it applies, and to start sketching your own version of it.

Where the EAV model actually comes from

EAV isn’t an SEO invention. It’s a data modelling pattern borrowed from database design, where the acronym stands for Entity-Attribute-Value and the pattern is used to handle situations with lots of entities, lots of possible attributes per entity, and sparse data, meaning most entities only have values for a small subset of the possible attributes.

Medical records are the textbook example. You have entities (patients), attributes (every possible measurement, symptom, condition, prescription, lab result), and values (the specific readings for each patient). Most patients don’t have values for most attributes, so a traditional row-and-column structure is wasteful. EAV stores facts as triples — (entity, attribute, value) — and only records the ones that exist.
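
To make the storage argument concrete, here is a minimal Python sketch. The patient IDs, attribute names, and readings are hypothetical placeholders rather than real data. It only illustrates that a triple list records the facts that exist, while a row-and-column grid reserves a cell for every patient-attribute pair.

```python
# Hypothetical attribute catalogue and patients (illustrative placeholders only).
ATTRIBUTES = ["blood_pressure", "heart_rate", "allergy", "prescription", "lab_result"]
PATIENTS = ["patient_1", "patient_2", "patient_3"]

# Each fact is one (entity, attribute, value) triple; absent facts are simply not stored.
triples = [
    ("patient_1", "blood_pressure", "120/80"),
    ("patient_1", "allergy", "penicillin"),
    ("patient_2", "heart_rate", "72"),
    ("patient_3", "prescription", "ibuprofen"),
]


def wide_table_cells(patients, attributes):
    """Cells a row-and-column layout needs: one per patient-attribute pair."""
    return len(patients) * len(attributes)


def facts_for(entity, triples):
    """Collect the known attribute values for one entity."""
    return {attribute: value for e, attribute, value in triples if e == entity}


total_cells = wide_table_cells(PATIENTS, ATTRIBUTES)  # 3 patients x 5 attributes
empty_cells = total_cells - len(triples)              # cells with nothing to store
patient_1 = facts_for("patient_1", triples)
```

With three patients and five attributes, the grid needs 15 cells for 4 facts. The gap widens quickly as the attribute catalogue grows.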

That triple structure closely mirrors how knowledge graphs, including Google’s Knowledge Graph, represent facts. A fact like (Apple Inc., founded, 1976) is a triple. (Apple Inc., CEO, Tim Cook) is another. The whole graph is built of triples connected through shared entities. When Google answers a question like “when was Apple founded,” it can read the answer from a triple rather than pulling it out of a document.
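
As a rough illustration of what “reading a triple” means, the sketch below indexes a handful of triples by (entity, attribute) and answers a question with a lookup. It is a toy model under my own assumptions: the function names are hypothetical, and it says nothing about how Google actually stores or queries its graph.

```python
from collections import defaultdict

# A tiny hand-written fact set; the third triple links two entities together.
TRIPLES = [
    ("Apple Inc.", "founded", "1976"),
    ("Apple Inc.", "CEO", "Tim Cook"),
    ("Tim Cook", "employer", "Apple Inc."),
]


def build_index(triples):
    """Index values by (entity, attribute) so a question becomes a key lookup."""
    index = defaultdict(list)
    for entity, attribute, value in triples:
        index[(entity, attribute)].append(value)
    return index


def lookup(index, entity, attribute):
    """Return every known value for the pair, or an empty list if none exists."""
    return index.get((entity, attribute), [])


index = build_index(TRIPLES)
founded = lookup(index, "Apple Inc.", "founded")       # "when was Apple founded"
unknown = lookup(index, "Apple Inc.", "headquarters")  # no triple stored for this
```

A page that states the entity, the attribute, and the value plainly makes that kind of lookup easy to populate. A page that talks around the fact does not.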

The reason this matters for SEO is that the system we’re optimising for thinks in triples. Pages that align with how facts are structured in the graph are easier for the system to interpret, easier for it to extract specific facts from, and easier for it to associate with the right entity. Pages that don’t — pages that bury entities in passive prose, that reference attributes without specific values, that talk around facts rather than stating them — are harder to interpret. In AI search, where systems explicitly retrieve entity-attribute combinations to compose responses, that interpretability gap shows up as visibility loss.

Entities vs. keywords: the core distinction

Keywords are the phrases people type. Entities are the concepts they mean. A single string can point to multiple entities, and that ambiguity is exactly why entity understanding matters.

The cleanest illustration is the word cone. It can be an ice-cream cone, a traffic cone, or a pine cone. All three are valid entities. All three are something a real user might mean when they search “cone.” When Google interprets a query containing that word, it isn’t matching strings — it’s disambiguating meaning, deciding whether the user means dessert, construction, or botany. It does this through entity recognition, contextual signals (other words in the query, recent search history, location, time of day), and the structure of the Knowledge Graph. Keyword density has nothing to do with it.
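
A deliberately simplified sketch of disambiguation by context follows. The cue-word lists and the overlap scoring are assumptions for illustration. Real systems use far richer signals than shared words, so treat this as a way to see the idea rather than a description of Google’s method.

```python
# Hypothetical cue words for each sense of "cone" (hand-picked for illustration).
SENSES = {
    "ice-cream cone": {"ice", "cream", "waffle", "flavour", "dessert"},
    "traffic cone": {"traffic", "road", "safety", "construction", "orange"},
    "pine cone": {"pine", "tree", "seed", "forest", "conifer"},
}


def disambiguate(query):
    """Pick the sense whose cue words overlap most with the query's other words."""
    tokens = set(query.lower().split())
    scores = {sense: len(tokens & cues) for sense, cues in SENSES.items()}
    best = max(scores, key=scores.get)
    # With no contextual cue at all, the string alone cannot settle the meaning.
    return best if scores[best] > 0 else None


road_sense = disambiguate("orange cone for road works")
dessert_sense = disambiguate("waffle cone flavour ideas")
bare_sense = disambiguate("cone")
```

The bare query returns nothing here because the word itself carries no disambiguating signal. The meaning comes from everything around it.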

Keywords represent phrases or specific words that hold some value from an SEO perspective — they carry search volume, they have measurable competition, they map to ranking opportunities. Entities reflect the things that actually exist in the real world. Sometimes the two overlap. Sometimes they don’t. The query is composed of keywords of varying importance — and some of those keywords are also entities.

Take a richer example: “shop online Nike Jordan Air Force One.” A keyword-centric lens spots shop (transactional intent), online (web purchase intent), Nike (brand), Jordan Air Force One (product). Each is useful for understanding intent and brand relevance.

But an entity-centric lens identifies three distinct, classifiable entities: Nike (an Organization type), Air Force One (a Product type), and Michael Jordan (a Person type, referenced as just “Jordan”). The query isn’t really about the words at all — it’s about those three entities and the relationship between them.

Both views are useful. For revenue and traffic, you need to keep intent and brand signals in view, because they’re how SEO work gets commissioned, prioritised, and reported on. But for users, and for Google, the semantic, entity-level interpretation sits at the core of what the query actually means. As SEOs, we should blend both perspectives. We’re commissioned to increase traffic and revenue, but the systems we’re optimising for understand queries semantically before they consider keyword-level signals.

What the EAV model actually is

Before we go further, let’s define the parts.

Entity is a distinct, well-defined concept — a person, place, thing, or idea. Barack Obama is an entity. The Louvre is an entity. The smartphone is an entity. The iPhone is a more specific product entity that inherits from “smartphone.” Different types of concepts exist in the real world, and the classifications search engines use (Person, Place, Organization, Product, Event, Work, Concept) are how that knowledge gets organised.

Attribute is a characteristic or property that describes an entity. For the entity dog, attributes include breed, colour, size, food type, training need, age, life stage. Attributes are the dimensions on which an entity can be further specified. They’re how you say “not just any dog — a dog with this breed, this size, this dietary need.”

Variable is the specific value an attribute takes. For the attribute breed on the entity dog, variables include Labrador, Husky, French Bulldog, Poodle, Border Collie, and roughly three hundred and fifty other recognised breeds. For the attribute colour, variables include black, brown, golden, brindle, and so on.

Each layer adds detail, turning an abstract concept into structured information that both search engines and humans can interpret. When you build a keyword universe using EAV, you enumerate the attributes that matter for your entity, then the variables those attributes can take.
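
Here is a minimal Python sketch of that enumeration step, using the dog example. The attribute and variable lists are a small hand-picked subset, and the function name is hypothetical. A real map would have many more attributes, and not every combination would deserve a page.

```python
from itertools import product


def eav_combinations(entity, attributes):
    """Yield one row per combination of variables across all attributes."""
    names = list(attributes)
    value_lists = [attributes[name] for name in names]
    for values in product(*value_lists):
        yield {"entity": entity, **dict(zip(names, values))}


# Illustrative subset of the attribute space for the entity "dog".
attributes = {
    "breed": ["Labrador", "Husky", "Poodle"],
    "life stage": ["puppy", "senior"],
}

rows = list(eav_combinations("dog", attributes))
```

Three breeds and two life stages give six rows. The count is the product of the list lengths, which is why the size of each variable set matters so much.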

Finite vs. infinite variable sets

This is one of the most overlooked parts of the model, and it’s where most “I’ll just generate every possible combination” approaches go wrong.

For some attributes, the set of variables is finite. Dog breeds: roughly 356, depending on which kennel club you consult. Countries: under 200. Months of the year: 12. Programming languages with meaningful market share: a few dozen. iPhone models: a few dozen across the product’s history.

For other attributes, the set is effectively infinite. Names of individual people. URLs. Slogans. Social media handles. Personal goals. You could enumerate them forever and never reach the end.

There’s also a middle category — finite if you impose a cutoff, infinite if you don’t. Cities: practically infinite, but finite if you say “cities with more than 100,000 residents.” Influencers in a niche: practically infinite, but finite if you say “influencers with more than 10,000 followers.” Products in a category: practically infinite globally, but finite if you constrain to a specific marketplace or price range.

Identifying which attributes are finite, which are practically-finite-with-a-cutoff, and which are genuinely infinite is one of the most useful early decisions you’ll make. Finite attributes scale into content programmes cleanly. Infinite attributes need guardrails or they’ll generate millions of meaningless combinations.
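
One way to sketch those guardrails in Python is shown below. The city names, population figures, and the page budget are all made up for illustration, and the helper names are hypothetical. The idea is to apply the cutoff first, then count combinations before generating anything.

```python
from math import prod

# Hypothetical (name, population) pairs; the figures are invented for this example.
CITIES = [("City A", 2_500_000), ("City B", 140_000), ("City C", 60_000), ("City D", 9_000)]
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]
MAX_PAGES = 1_000  # assumed budget for how many pages the team could maintain


def apply_cutoff(items, minimum):
    """Make a practically infinite set finite by keeping items above a threshold."""
    return [name for name, size in items if size >= minimum]


def combination_count(attribute_values):
    """Number of pages implied if every variable is combined with every other."""
    return prod(len(values) for values in attribute_values.values())


attribute_values = {
    "city": apply_cutoff(CITIES, 100_000),  # middle category, finite after cutoff
    "month": MONTHS,                        # naturally finite
}
total = combination_count(attribute_values)
within_budget = total <= MAX_PAGES
```

Counting before generating is cheap, and it surfaces the attributes that would blow the map up while there is still time to constrain or drop them.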

Why search volume isn’t the only signal

If you build an EAV map for any reasonably mature niche, you’ll quickly find combinations that have near-zero search volume but real business value. And you’ll find combinations with high search volume that have no business reason to exist. Volume is one signal, not the only one.

A few patterns worth recognising at the intro level.

Demand can follow utility. The famous example here is Zapier. When the company started building out integration pages (landing pages for every individual app and every app-to-app integration combination), most of those queries had near-zero search volume. Nobody was Googling “connect Slack to Trello” in measurable numbers. Today, Zapier’s programmatic pages drive around 2.6 million monthly organic visitors and rank for tens of thousands of long-tail integration combinations. The pages created the demand by becoming the canonical answer to a question users learned to ask. EAV combinations with no search volume today can still have business value if search volume is likely to follow.

Niche utility beats broad volume. Consider an attribute like marital status on the entity influencer. A query like “influencers married with kids” has very low search volume — but a list serving exactly that combination has real value for brand-marketing teams researching influencer partnerships for children’s products. The query gets typed by a small number of high-intent professionals, not by mass-market consumers. That’s a legitimate EAV combination to pursue.

Infinite attributes are almost never useful. Consider the same entity (influencer) with an attribute like name. “Influencers named Anna” has the same low search volume but no business value — there’s no reason a real user would benefit from a page listing every influencer named Anna. The combination is enumerable but useless. This is the trap that catches teams who build EAV maps without thinking about whether the resulting page would have a reason to exist.

Validation matters. For every interesting EAV combination, the question isn’t just “is there volume?” but “is there a real user who would benefit from this page existing?” If the answer is yes, low volume is a green light. If the answer is no, even high volume can be the wrong reason to build.

What you can do with the EAV model at the intro level

You don’t need a full programmatic SEO implementation to start applying EAV thinking. A few intro-level applications you can use immediately.

Map your core entity and its attribute space. Pick the single most important entity for your business — your main product category, your industry, the central concept your audience cares about. Write down the entity, then list every attribute users care about when they’re researching, buying, or using something in that space. For a running shoes business, attributes include brand, distance type, terrain, cushioning, price range, width, runner experience level. The attribute list itself is useful — it tells you which dimensions your content needs to cover to be considered comprehensive.

Audit your existing content against the attribute space. For each attribute, check whether you have substantive content addressing it. If you sell running shoes but have nothing about marathon-specific cushioning, trail vs. road, or shoes for wide feet, those are content gaps you can fill. EAV gives you a structured way to identify gaps rather than relying on what competitors happen to have already written.
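
The audit can be sketched as a simple set comparison. In the Python below, the attribute list comes from the running shoes example, while the page URLs and their coverage tags are hypothetical. In practice you would tag pages by hand or with whatever classification process you trust.

```python
# Attribute space from the running shoes example.
ATTRIBUTE_SPACE = [
    "brand", "distance type", "terrain", "cushioning",
    "price range", "width", "runner experience level",
]

# Hypothetical inventory: which attributes each existing page substantively covers.
PAGES = {
    "/guides/choosing-a-brand": {"brand", "price range"},
    "/guides/cushioning-explained": {"cushioning"},
}


def find_gaps(attribute_space, pages):
    """Return attributes that no existing page covers, in the original order."""
    covered = set().union(*pages.values())
    return [attribute for attribute in attribute_space if attribute not in covered]


gaps = find_gaps(ATTRIBUTE_SPACE, PAGES)
```

The output is a to-do list derived from your own attribute space rather than from what competitors happen to have published.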

Identify pattern keywords. Look at your existing keyword data and find patterns that follow EAV structures: “can dogs eat {food},” “{brand} running shoes,” “best {product} for {situation}.” Each pattern is an entity, an attribute, and a variable slot — and once you see it as a pattern, you can extend it systematically. Patterns also tell you which attributes have established search behaviour around them (the head of the pattern is recurring across many queries) versus which don’t.
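
A small Python sketch of pattern matching against a keyword list follows. The helper names and the sample keywords are my own illustrations. It assumes slot names are valid Python identifiers, and it converts each {slot} into a named regex group so the variables can be pulled out.

```python
import re


def compile_pattern(pattern):
    """Turn 'can dogs eat {food}' into a regex with one named group per slot."""
    parts = re.split(r"\{(\w+)\}", pattern)
    pieces = []
    for position, part in enumerate(parts):
        # Even positions are literal text; odd positions are slot names.
        pieces.append(re.escape(part) if position % 2 == 0 else f"(?P<{part}>.+?)")
    return re.compile("^" + "".join(pieces) + "$", re.IGNORECASE)


def extract_slots(pattern, keywords):
    """Return the slot values for every keyword that fits the pattern."""
    regex = compile_pattern(pattern)
    matches = []
    for keyword in keywords:
        match = regex.match(keyword.strip())
        if match:
            matches.append(match.groupdict())
    return matches


# Hypothetical keyword export.
keywords = [
    "can dogs eat grapes",
    "can dogs eat cheese",
    "nike running shoes",
    "best shoes for flat feet",
]

foods = extract_slots("can dogs eat {food}", keywords)
brands = extract_slots("{brand} running shoes", keywords)
situations = extract_slots("best {product} for {situation}", keywords)
```

The extracted values are the variables you already have search behaviour for. The ones missing from the list are candidates for extending the pattern.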

Reorganise your site structure around entities. If your category pages are organised by surface taxonomy (alphabetical, popular, featured) rather than by entity attributes (by brand, by use case, by user type), you’re missing an opportunity to make the site easier for both users and search systems to navigate. Entity-aware navigation surfaces attribute filters that match how people actually shop.

These intro-level applications take an afternoon each. They’re not the full programmatic workflow, but they’re enough to start seeing the gaps in your current strategy and to make a case for going deeper.

Why EAV matters more in AI search

Everything above applies to traditional Google search. It applies more in AI search, and that’s worth flagging directly.

When Google AI Mode, ChatGPT search, or Perplexity expands a user’s query through fan-out, the sub-queries it generates are organised around entities and attributes. Google’s Thematic Search patent describes this explicitly: a single query like “moving to Denver” can fan out into thematic sub-queries about neighbourhoods, cost of living, things to do, pros and cons. Those themes are attributes of the core entity. The sub-queries are entity-attribute combinations the system generates on the fly to research the topic before synthesising an answer.
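
To see why attribute coverage matters here, consider this toy Python sketch. It is not how any of these systems generate sub-queries, since they use models rather than string templates. The function names are hypothetical, and the themes are the ones from the Denver example above.

```python
def fan_out(entity, themes):
    """Combine a core entity with each attribute theme into a sub-query."""
    return [f"{entity} {theme}" for theme in themes]


def theme_coverage(page_themes, themes):
    """Share of the fan-out themes a page addresses, from 0.0 to 1.0."""
    if not themes:
        return 0.0
    return len(set(themes) & set(page_themes)) / len(themes)


THEMES = ["neighbourhoods", "cost of living", "things to do", "pros and cons"]

sub_queries = fan_out("moving to Denver", THEMES)
# A hypothetical page that only discusses two of the four themes.
coverage = theme_coverage({"cost of living", "neighbourhoods"}, THEMES)
```

A page that covers half the themes can only be a candidate for half the sub-queries, however well it matches the query the user originally typed.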

I’ve covered the practical implications of this in depth on iPullRank — particularly in How AI Search Personalizes Fan-Out Queries, where the same query can generate completely different sub-queries depending on what the system has inferred about the user. The short version for the EAV discussion: content built around a clear entity-attribute structure aligns directly with how these systems retrieve information. Content that comprehensively covers an entity’s attribute space is more likely to be retrieved across the diverse fan-out queries an AI system generates for any given user. Conversely, thin content that targets one keyword and ignores related attributes is increasingly invisible — not because it ranks poorly, but because the system never asks a question that would surface it.

This shifts the optimisation target in a way that’s not always obvious to people coming from a traditional keyword-research background. You’re not just trying to rank for the user’s typed query. You’re trying to be retrievable across the full set of sub-queries the system generates around the topic. The query the user typed is the entry point, but the fan-out is where the actual competition happens — and the fan-out is generated against the entity-attribute space, not against keyword strings.

There’s also a citation dimension to this. AI systems are increasingly willing to use content as grounding context but selective about which sources they explicitly cite. Pages with weak entity recognition often get used to inform a response without ever being named as a source. Strong, well-structured entity coverage is part of what distinguishes content that gets cited from content that just gets read. (See my piece on the citation gap for more on this dynamic.)

Why EAV is the foundation for almost everything else in semantic SEO

A practical observation: most of the concepts that come up in semantic keyword research — query augmentation, query paths, information gain, knowledge graphs, search intent classification — rest on entity understanding. They’re different lenses on the same underlying reality.

Query augmentation works by recognising the entities in a query and proposing related attributes. Query paths describe how users move between EAV combinations as they refine their search. Information gain measures whether a new document covers entity-attribute combinations that the user hasn’t already seen elsewhere. Knowledge graph signals depend on entities being recognised, typed, and connected. Even search intent classification often comes down to recognising which attributes of an entity the user is interrogating — pricing attributes signal commercial intent, feature attributes signal investigative intent, location attributes signal transactional intent.

If you only learn one of the semantic concepts, EAV is the one that pays off across all the others. It’s the substrate the rest of semantic SEO sits on. Skipping it and trying to learn the others is much harder than the reverse — and it’s why most introductions to semantic SEO start here.

What this looks like at scale

Everything above is intro-level. The full implementation is more work, and it’s where the practical decisions get harder.

A complete EAV-driven keyword research workflow involves building a comprehensive keyword corpus across multiple data sources (search consoles, keyword tools, autocomplete scrapes, competitor analyses, internal site search, customer feedback). It involves running entity extraction on that corpus at scale — usually via Google’s Natural Language API or a custom NLP pipeline — and getting structured data on entity types, salience, sentiment, and Knowledge Graph linkages for every keyword in your set. It involves n-gram analysis to spot recurring patterns, manual pattern identification to validate and label them, and decisions about which attributes are finite, which need cutoffs, and which to exclude.
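
The n-gram step is the easiest part of that workflow to sketch. The Python below counts how many keywords contain each n-gram and keeps the recurring ones. The keyword list and the minimum count are illustrative assumptions, and a real corpus would need tokenisation that handles punctuation and plurals.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous runs of n tokens, joined back into strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def recurring_ngrams(keywords, n, min_count=2):
    """Count the keywords containing each n-gram; keep those seen min_count+ times."""
    counts = Counter()
    for keyword in keywords:
        # A set ensures one keyword contributes at most once per n-gram.
        counts.update(set(ngrams(keyword.lower().split(), n)))
    return [(gram, count) for gram, count in counts.most_common() if count >= min_count]


# Hypothetical keyword corpus.
keywords = [
    "can dogs eat grapes",
    "can dogs eat cheese",
    "can cats eat cheese",
    "best dog food",
]

heads = recurring_ngrams(keywords, n=3)
```

Recurring n-grams such as “can dogs eat” are candidate pattern heads. The manual step is deciding which of them map to a real attribute and which are noise.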

Then it involves combining the EAV map with SERP data to validate combinations: which actually have search volume, which face strong existing competition, which surface AI Overviews, which are dominated by specific content formats. It involves prioritisation frameworks that weight scalability, business value, and feasibility against each other. And it involves operationalising the model into something the content and PPC teams can actually use — usually a structured dataset with annotations, dashboards, and a clear handoff for what to build first.
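
A prioritisation framework can be as simple as a weighted score. The sketch below is one possible shape, not a recommended formula. The weights, the 1 to 5 scales, and the candidate scores are all assumptions I’ve made up for illustration.

```python
# Assumed weights: business value counts most, feasibility least.
WEIGHTS = {"scalability": 3, "business_value": 5, "feasibility": 2}

# Hypothetical candidates scored 1 (low) to 5 (high) on each criterion.
CANDIDATES = {
    "app-to-app integration pages": {"scalability": 5, "business_value": 4, "feasibility": 3},
    "influencers named {name}": {"scalability": 5, "business_value": 1, "feasibility": 5},
    "influencers married with kids": {"scalability": 2, "business_value": 5, "feasibility": 3},
}


def priority(scores):
    """Weighted sum of the criterion scores."""
    return sum(WEIGHTS[criterion] * scores[criterion] for criterion in WEIGHTS)


def rank(candidates):
    """Order candidate names from highest to lowest priority."""
    return sorted(candidates, key=lambda name: priority(candidates[name]), reverse=True)


ranked = rank(CANDIDATES)
```

With these made-up numbers, the highly scalable but low-value “named {name}” pattern lands last. That is the behaviour you want from any weighting you settle on.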

This is the part that takes time to do well, and it’s where the value compounds. The intro-level applications above will give you better content decisions. The full implementation gives you a content engine.

Continue your learning (MLforSEO)

This post covered what the EAV model is, where it comes from, why entities and keywords aren’t the same thing, and what you can start doing with it immediately. The full implementation — the entity extraction at scale, the pattern identification and labelling, the prioritisation frameworks for EAV combinations, the integration with SERP validation, and the operational workflow for using EAV maps in content programmes — is covered in the Semantic AI-Powered SEO Keyword Research course on MLforSEO. The Entities, Attributes, and Variables module sits alongside lessons on query semantics, search intent, query paths, query augmentation, knowledge graphs, and SERP analysis — together they form a complete semantic keyword research system, with the workflows, templates, and decision frameworks needed to turn the model into real outputs.

Lazarina Stoy is a Digital Marketing Consultant with expertise in SEO, Machine Learning, and Data Science, and the founder of MLforSEO. Lazarina’s expertise lies in integrating marketing and technology to improve organic visibility strategies and implement process automation.

A University of Strathclyde alumna, she has worked across sectors including B2B, SaaS, and big tech, with notable projects for AWS, Extreme Networks, neo4j, Skyscanner, and other enterprises.

Lazarina champions marketing automation by creating resources for SEO professionals and speaking at industry events globally on the significance of automation and machine learning in digital marketing. Her contributions to the field are recognized in publications like Search Engine Land, Wix, and Moz, to name a few.

As a mentor on GrowthMentor and a guest lecturer at the University of Strathclyde, Lazarina dedicates her efforts to education and empowerment within the industry.