Semrush put out an infographic last week. The kind built to be screenshotted into LinkedIn carousels and pasted into webinar decks. Four pillars. The fourth one is called “Technical GEO”: schema, structured data, clean architecture. The line that justifies it: “Ensures AI engines can parse and connect your content.”
Ensures.
That’s the whole piece in one word. The architecture of large language models is, by design, the opposite of ensured. And schema has nothing to do with whether an LLM can parse text. LLMs parse text by reading text.
Semrush is far from alone. Every SaaS vendor with skin in this game is running variations of the same play. SEO-era controllability, repackaged under a new acronym. The same percentages, pillars, and pyramids. All dressed up for a system that was built specifically not to work this way.
I’ve made the strategic version of this case before, in “Your AI Strategy Isn’t a Strategy.” This piece is the technical floor beneath it.
Built To Read Whatever’s There
Language models exist because the web is a mess. Forums, Wikipedia stubs, blog posts written at 2 a.m., scraped product copy, machine-translated junk, code comments, half-formed sentences, typos, contradictions, every register from journal article to subreddit shitpost. Pre-training data is the public web, and the public web has never been structured.
The transformer architecture handles this by treating language as sequences of tokens. There is no parser inside the model looking for tags. There is no preference for FAQ markup. The model reads the words. That’s the mechanism.
At inference time, the model generates more tokens conditioned on the input. None of that pipeline is reading microdata.
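To make that concrete, here is a minimal sketch. The whitespace “tokenizer” is a toy stand-in (real systems use subword tokenizers, and nothing here reflects any engine’s actual pipeline); the point survives the simplification: markup that reaches the model is just more tokens, not a separate parsing channel.

```python
# Toy illustration, not any engine's real pipeline: a model's input
# is a flat token sequence. A hypothetical whitespace "tokenizer"
# stands in for real subword tokenizers -- there is no side channel
# that interprets schema markup as schema.

def toy_tokenize(text: str) -> list[str]:
    return text.split()

plain = "Our widget costs 19 dollars."
with_schema = '{"@type":"Product","price":"19"} Our widget costs 19 dollars.'

# Both become flat lists of tokens. The JSON-LD fragment is not
# parsed as structured data; it is simply consumed as more text.
print(toy_tokenize(plain))
print(toy_tokenize(with_schema))
```

Swap in a real subword tokenizer and the picture only gets messier: the JSON-LD fragment splinters into punctuation and word-piece tokens, which is even further from a “parsing layer.”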
Schema.org has real jobs. It feeds rich results in classical search. It helps entity disambiguation in the knowledge graph. It helps voice assistants pull structured fields. These are well-defined functions within specific systems. They are not the mechanism by which an LLM understands a sentence.
So when a vendor claims structured data “ensures AI engines can parse and connect your content,” there is nothing to ensure. The parsing layer they’re imagining is not there. The model already parsed your sentence. It did so by reading the sentence.
One Trick, Three Brand Colors
Look at the biggest GEO and AEO explainers on the market right now, and you find the same SEO-era playbook with the acronym swapped.
Semrush is already covered. The fourth pillar of its “Technical GEO” presents schema and structured data as ensuring something the architecture cannot ensure.
AirOps published a graphic titled “15 Ways to Get Cited by ChatGPT, Perplexity, & Google.” It’s the most numbers-heavy specimen of the genre I’ve seen this year. Schema markup increases citation likelihood by 13%. Sequential H2 to H4 tags double your chances. Short paragraphs make content 49% more likely to appear in AI answers. Perplexity cites UGC in 91% of answers, versus Gemini’s 7%. Read the source notes and the methodology trail leads home. The numbers in the graphic trace back to AirOps’s own “2026 State of AI Search Report.” AirOps is citing AirOps on the question of whether AirOps’s prescriptions work.
Peec AI does a more honest job in places. Its full guide to GEO acknowledges the probabilistic nature of the system and concedes that foundation models are already trained, so optimization focuses on the retrieval layer. Then it lands the same prescriptions: heading hierarchy, bullet lists, FAQ markup, multiple schema types layered on each page, summaries at the top of sections – all built on the chunking claim that long paragraphs lose out because the engine extracts fragments rather than full articles.
Profound, citing Aleyda Solis’s checklist, is the most explicit in its piece: “Optimize for Chunk-Level Retrieval.” Each section, a standalone snippet. Each page, a buffet from which the engine takes what it wants. The engine, in this telling, is a polite guest who only takes what’s been laid out.
Three vendors. Same operating assumption: a controllable, prescriptive technical discipline sits between a publisher and a citation, and it occupies roughly the same shape as classical SEO. Schema, headings, structure, freshness, machine-readable formats. Familiar. Billable. Reportable up to a chief marketing officer.
What Schema Actually Does
Schema is not the target here. Schema has real, well-defined uses. Classical Google search uses it for rich results: prices, ratings, event times, the structured fields that drive search engine results page features. The knowledge graph uses it for entity disambiguation. Voice assistants pull structured fields out of it.
None of that goes away. If you’re responsible for technical SEO, keep implementing schema where it earns its keep.
Schema cannot reach into a transformer and improve its comprehension of your prose. The model isn’t architected to read schema as schema. It receives whatever text the engine fetched and chose to include, and processes that text as language tokens. The entire GEO/AEO marketing layer rests on conflating two distinct claims: that schema is useful in classical search, and that schema feeds the LLM. The first is true. The second is a category error.
Chunking Is Not Yours To Optimize

The chunking advice keeps reappearing because it sounds technical, sits neatly inside a flowchart, and gives a content team something concrete to do on Monday morning. It is also incoherent.
Chunking happens at retrieval time. Perplexity, ChatGPT, and Gemini each run a retriever over candidate documents, split them according to their own configurations (length, overlap, embedding model, sometimes semantic boundaries), and feed the top-k chunks into the model’s context. These configurations belong to the engine. They get tuned differently across systems and retuned on schedules no publisher is privy to. The publisher’s view of the chunker is the publisher’s view of the model: black box, outputs only.
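A sketch of what an engine-side chunker looks like, under stated assumptions: the `chunk_size` and `overlap` values below are invented, and real retrievers may also split on semantic boundaries rather than fixed windows. What the sketch shows is where the boundaries come from: the engine’s configuration, not anything on the page.

```python
# Hypothetical retrieval-side chunker. chunk_size and overlap are
# invented values; real engines tune their own and retune them on
# their own schedules. The publisher controls none of this.

def chunk(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into overlapping fixed-size character windows."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "some fetched page text " * 50  # stand-in for a retrieved document

# The same unchanged document yields different chunk sets under
# different engine configurations.
chunks_a = chunk(doc, chunk_size=200, overlap=50)
chunks_b = chunk(doc, chunk_size=500, overlap=100)
print(len(chunks_a), len(chunks_b))
```

Nothing a publisher writes changes which branch of this configuration runs, which is the whole problem with “optimizing” for it.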
So when a vendor says “optimize for chunk-level retrieval,” what is actually being recommended is good writing. Short, self-contained paragraphs. Clear definitions near the top of sections. Internal logical structure. These are recognizable disciplines: information architecture, technical writing, clarity. They’ve been recognizable disciplines since long before the transformer was invented. They are not a new technical layer.
A more honest version of the pitch would be: Hire someone competent at writing for the web. That sentence doesn’t fit on a pricing page.
The Paper They Don’t Read
There is an actual academic paper called “GEO.” Aggarwal and co-authors, KDD 2024. It’s the closest thing to a citable source the SaaS layer has when it sells generative engine optimization as a discipline. It’s also, as papers go, easy to skim. Nine “optimization methods” are tested on a 10,000-query benchmark, with results.
What did the paper find worked?
Adding citations from credible sources. Adding quotations from relevant sources. Adding statistics. Improving fluency. Making prose easier to understand. The methods that produced the largest visibility lifts were essentially: write content with more evidence in cleaner prose.
What did the paper test and find didn’t work?
Keyword stuffing, the closest analogue in the paper to the SEO-era playbook the current GEO and AEO vendors have repackaged. Result: below baseline. The paper’s authors note in plain terms that methods effective in search engines “may not translate to success in this new paradigm.”
Notice what is not in the list of nine methods. Schema. Structured data. FAQ markup. Heading hierarchy. Machine-readable formats. None of these are tested in the paper, because none of them are the optimization surface the paper studies. The paper is studying content-level interventions: what you put in the words, not metadata layered around the words.
The SaaS layer borrowed the acronym. The findings stayed in the paper. “Technical GEO” is the SEO playbook with different stickers on the same boxes, sold against research that points the other way.
The Assumption Smuggled In
The SaaS pitch only makes sense if you smuggle in one assumption: that the system you’re optimizing for has the same shape as the one that has been billing SEO clients for a quarter-century. Inputs you control. Outputs that respond. A traceable causal chain between the two.
That model was always a simplification of how search worked. It was close enough to keep the industry running, and close enough to keep the invoices going out.
None of that simplification survives contact with generative systems. The same prompt produces different answers across sessions, users, temperatures, model versions, and days. That is observed behavior across the major engines, not a quirk of any single one. The retrieval layer in front of the model also moves: candidate sources shift, ranking shifts, freshness windows shift. No causal chain runs between “I added FAQ schema” and “the model cited my page.” What runs between them is a probability distribution, and the things you control affect that distribution in ways nobody can cleanly attribute. Not even the people who built these systems.
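The distribution point can be made mechanical with a toy sketch of temperature sampling. The token names and scores below are invented, and no particular engine is being modeled; the mechanism (softmax over scaled scores, then a random draw) is simply how sampled generation works in general: the same input yields a draw from a distribution, not a fixed answer.

```python
# Toy temperature sampling over an invented next-token distribution.
# The brand names and scores are made up; the mechanism -- softmax
# over temperature-scaled logits, then a random draw -- is generic.
import math
import random

def sample(logits: dict[str, float], temperature: float, rng: random.Random) -> str:
    scaled = {tok: v / temperature for tok, v in logits.items()}
    z = sum(math.exp(v) for v in scaled.values())
    r, acc = rng.random(), 0.0
    for tok, v in scaled.items():
        acc += math.exp(v) / z
        if r < acc:
            return tok
    return tok  # floating-point edge case: fall back to the last token

# Hypothetical scores for the next token after "the best tool is ..."
logits = {"BrandA": 2.0, "BrandB": 1.6, "BrandC": 1.1}

rng = random.Random()
# Same input, sampled repeatedly: a distribution, not an answer.
print([sample(logits, temperature=1.0, rng=rng) for _ in range(5)])
```

Run the last line twice and the lists differ. That variance sits underneath every “citation rate” a dashboard reports, before retrieval drift and model updates are even counted.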
This is the established line on AI visibility tools, repeated here because it applies to the whole prescriptive layer. Statistically unverifiable data drawn from non-deterministic systems. A 13% citation lift, measured how, against what counterfactual, with what reproducibility? The methodological questions aren’t what these numbers are designed to answer. The numbers are the answer. They land in a graphic, get rendered as ROI in a board deck, and the conversation moves on.
Something To Say In The Meeting
Here is the part that the architecture argument and the methodology argument don’t, on their own, explain. Why does the entire SaaS layer keep successfully selling this stuff to people who are not stupid?
The honest version of the answer goes something like: We are operating with reduced visibility into a system that doesn’t expose its mechanics, that returns different outputs to different people for the same query, that’s changing month by month, and that has folded a substantial chunk of the funnel into a black box. We can keep doing the work that has always been the work: writing well, being useful, building authority, maintaining the site. We can monitor what shows up where. The deterministic dashboard we used to have is not coming back.
That answer is unsayable in a marketing meeting. It admits the lever is not connected. It tells leadership that the budget line they approved has no corresponding action. It gives the team nothing to put in next quarter’s plan.
So the SaaS layer fills the gap. It manufactures levers. Pillars, frameworks, percentage lifts, schema audits, chunking optimization, machine-readable formats. Reportable activity. Defensible expenditure. Something to say in the meeting. None of this gets you visibility. The engine decides that. What’s on offer is the appearance of control, sold to people who would rather pay than concede that control has left the room.
Once the lever is bought, it has to be operated. Schema audits get scheduled. Chunking checklists get reviewed. Citation likelihoods get tracked, refreshed, and compared. The dashboard the team paid for becomes the dashboard the team optimizes against, and the dashboard quietly replaces the actual problem with the part of the problem it can see. By the time anyone notices, the SaaS layer is writing the brief.
None of this is a moral failure on the buyer’s side. What you are watching is what happens when an industry has been organized for a quarter-century around the premise that you can pull a lever and watch the meter move, and the meter quietly disconnects from the lever. The vendors aren’t running a con. They’re filling demand for the one thing the buyer cannot afford to do without: an answer that fits in a slide.
Rank And Tank, All Over Again
I keep coming back to a phrase that fits this whole moment: dancing to the rank-and-tank tunes (I borrowed it from David McSweeney). The cycle goes: a vendor sells the controllable-discipline frame, agencies adopt it, content teams scale production around the prescriptions, AI-generated articles get pumped out at volume because the prescriptions are easy to template. Some of it ranks for a while. Most of it eventually tanks because the prescriptions were never the mechanism, and the engine adjusts, or the freshness window closes, or the system simply moves on.
The SEO industry has done this before. Spinning. Mass programmatic pages. Doorway content. Each cycle followed the same shape: a controllable input dressed as a discipline, sold at scale, briefly effective, eventually punished by the engine, replaced by the next controllable input dressed as a discipline.
GEO and AEO are the current cycle. The pillars and percentages and pyramids are this cycle’s templates. Beneath them, the strategies bifurcate.
One path is brand presence exploitation. Plant your name where the engines look. Reddit threads, top-X listicles, the same citation surfaces over and over. The cycle feeds itself: engines cite the surfaces, brands work the surfaces, surfaces feed the engines. I’ve written about this loop before; I called it the Ouroboros pattern. The short version is that the loop is less stable than the strategy assumes.
The other path is content at scale. Produce variations, pump out volume, treat the templated output as content that might earn a citation. I’ve written about this approach before, in the “Scaling Disappointment” piece. The short version is that uniqueness is not value, and at the pace these prescriptions enable, qualitative review stops being possible. The volume of AI-generated copy produced under this path is this cycle’s externality.
The next cycle will sell the cleanup.
Forget for a moment whether your “Technical GEO” is set up correctly. Ask whether the thing you are putting on the page is worth reading. Large language models were designed to read whatever is there. If what’s there is good, it will be read. If what’s there is templated, low-utility content optimized against a chunking heuristic that doesn’t exist, it will eventually be filtered out: by the engine, by the user, or by the next academic paper showing that retrieval quality is degraded by exactly this kind of slop.
The advantage, when it accrues, will accrue to the people who don’t get distracted. Who don’t subscribe to the dashboard. Who keep working on product-driven SEO and the fundamentals that have always connected content to people. There are early signs of this on the timelines I read. Practitioners openly questioning whether optimizing against a non-deterministic surface makes sense at all, and asking whether their attention belongs back on classical search; which, at the end of the chain, is what feeds these systems anyway.
The mess was always the point. The architecture handles it. The industry just needs to stop pretending the mess is the problem.
This post was originally published on The Inference.
Featured Image: Roman Samborskyi/Shutterstock
