The Science Of What AI Actually Rewards

In “The Science Of How AI Pays Attention,” I analyzed 1.2 million ChatGPT responses to understand exactly how AI reads a page. In “The Science Of How AI Picks Its Sources,” I analyzed 98,000 citation rows to understand which pages make it into the reading pool at all.

This is Part 3.

Where Part 1 told you where on a page AI looks, and Part 2 told you which pages AI routinely considers, this one tells you what AI actually rewards inside the content it reads.

The data shows:

  • Most AI SEO writing advice doesn’t hold at scale. There is no universal “write like this to get cited” formula: the signals that lift one industry’s citation rates can actively hurt another.
  • The entity types that predict citation are not the ones being targeted. DATE and NUMBER are universal positives. PRICE suppresses citation in five of six verticals, and KG-verified entities are a negative signal.
  • The one writing signal that holds across all seven verticals: declarative language in your intro, with a +14% aggregate lift.
  • Heading structure is binary. Commit to the right number for your vertical or use none. Three to four headings are worse than zero in every vertical.
  • Corporate content dominates. Reddit doesn’t. AI citation behavior doesn’t mirror what happened to organic search in 2023-2024.

1. Specific Writing Signals Influence Citation, While Others Harm It

While “The Science Of How AI Pays Attention” covers the parts of the page and types of writing that influence ChatGPT visibility, I wanted to understand which writing-level signals (word count, structure, language style) predict higher AI citation rates across verticals.

Approach

  1. I compared high-cited pages (more than three unique prompt citations) vs. low-cited pages across seven writing metrics: word count, definitive language, hedging, list items, named entity density, and intro-specific signals.
  2. I analyzed the first 1,000 words for list item count, named entity density, intro definitive language token density, and intro number count.
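
To make the metrics above concrete, here is a minimal sketch of how intro-level signals like these could be computed. The word lists, the function names, and the tokenizer are assumptions for illustration; the study does not publish its lexicons or code. Only the "more than three unique prompt citations" threshold and the 1,000-word window come from the approach described above.

```python
import re

# Assumed word lists for illustration only; the study does not publish its lexicons.
DEFINITIVE = {"is", "are", "does", "do", "will", "always", "never", "must"}
HEDGES = {"may", "might", "could", "possibly", "perhaps", "likely", "seems"}

# From the approach: high-cited = more than three unique prompt citations.
HIGH_CITED_THRESHOLD = 3


def is_high_cited(unique_prompt_citations: int) -> bool:
    """Label a page as high-cited using the threshold stated in the approach."""
    return unique_prompt_citations > HIGH_CITED_THRESHOLD


def token_density(text: str, lexicon: set) -> float:
    """Share of word tokens in `text` that appear in `lexicon`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in lexicon)
    return hits / len(tokens)


def intro_signals(page_text: str, intro_words: int = 1000) -> dict:
    """Compute simple intro-level signals over the first `intro_words` words."""
    intro = " ".join(page_text.split()[:intro_words])
    return {
        "definitive_density": token_density(intro, DEFINITIVE),
        "hedge_density": token_density(intro, HEDGES),
        # Count numeric tokens such as "3", "1,000", or "14.5".
        "number_count": len(re.findall(r"\d+(?:[.,]\d+)*", intro)),
    }


signals = intro_signals("A CRM is a database. It may help 3 teams.")
```

Comparing the averages of these values between the high-cited and low-cited groups, per vertical, would give lift figures in the same spirit as the ones reported here, although the exact numbers depend entirely on the lexicons chosen.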

Results: Across all verticals, definitive phrasing and including relevant entities matter. But most signals are flat.

Image Credit: Kevin Indig

What The Industry Patterns Showed

When splitting the data up by vertical, we immediately see preferences:

  • Total word count was strongest in CRM/SaaS (1.59x).
  • Finance was an anomaly with word count: Shorter pages win (0.86x word count).
  • Definitive phrases in the first 1,000 characters were positive for most verticals.
  • Education is a signal void. Writing style explains almost nothing about citation likelihood there.

Image Credit: Kevin Indig

Top Takeaways

1. There is no universal “write like this to get cited” formula. For example, the signals that lift CRM/SaaS citation rates actively hurt Finance. Instead, match content format to vertical norms.

2. The one universal rule: open with a direct declarative statement. Not a question, not context-setting, not preamble. The form is “[X] is [Y]” or “[X] does [Z].” This is the one writing instruction that holds regardless of vertical, content type, or length.

3. LLMs “penalize” hedging in your intro. “This may help teams understand” performs worse than “Teams that do X see Y.” Remove qualifiers from your opening paragraph before any other optimization.

2. The Entity Types That Predict Citation Are Not The Ones Being Targeted

Most AEO advice focuses on named entities as a category: Pack in more known brand names, tool names, numbers. The cross-vertical entity type analysis below tells a more specific (and more useful) story.

Approach

  1. Ran Google’s Natural Language API on the first 1,000 characters (about 200-250 words) of each unique URL.
  2. Computed lift per entity type: % of high-cited pages with that type / % of low-cited pages.
  3. Analyzed 5,000 pages across seven verticals.
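
The lift calculation in step 2 can be sketched as follows. This is a minimal illustration, not the study's code: the function names are hypothetical, each page is assumed to be represented as the set of entity types found in its intro, and the toy pages below are invented to show the arithmetic, not taken from the dataset.

```python
from typing import Iterable, Optional, Set


def share_with_type(pages: Iterable[Set[str]], entity_type: str) -> float:
    """Fraction of pages whose intro contains at least one entity of this type."""
    pages = list(pages)
    if not pages:
        return 0.0
    return sum(1 for types in pages if entity_type in types) / len(pages)


def entity_lift(
    high_cited: Iterable[Set[str]],
    low_cited: Iterable[Set[str]],
    entity_type: str,
) -> Optional[float]:
    """Lift = share of high-cited pages with the type / share of low-cited pages.

    Returns None when no low-cited page has the type (ratio undefined).
    """
    low_share = share_with_type(low_cited, entity_type)
    if low_share == 0:
        return None
    return share_with_type(high_cited, entity_type) / low_share


# Toy data (invented): entity types detected in each page's intro.
high_pages = [{"DATE", "NUMBER"}, {"DATE"}, {"PRICE"}, {"DATE"}]
low_pages = [{"PRICE"}, {"DATE", "PRICE"}, {"NUMBER"}, set()]

date_lift = entity_lift(high_pages, low_pages, "DATE")    # 0.75 / 0.25
price_lift = entity_lift(high_pages, low_pages, "PRICE")  # 0.25 / 0.50
```

A lift above 1.0x means the entity type shows up more often in high-cited intros than in low-cited ones; below 1.0x means the opposite.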

* A quick note on terminology: Google NLP classifies software products, apps, and SaaS tools as CONSUMER_GOOD, a legacy label from when the API was built for physical retail. Throughout this analysis, CONSUMER_GOOD means software/product entities.

Results: DATE and NUMBER are the most universal positive signals. Interestingly, PRICE is the strongest universal negative.

Image Credit: Kevin Indig

What The Industry Patterns Showed

  • DATE is the most universal positive signal, except for Finance (0.65x).
  • NUMBER is the second most universal. Specific counts, metrics, and statistics in the intro consistently predict higher citation rates. Finance (0.98x) and Product Analytics (1.10x) mark the floor and ceiling of that range.
  • PRICE is the strongest universal negative. Pages that open with pricing signal commercial intent. Finance is the sole exception at 1.16x, likely because price here means fee percentages and rate comparisons, which are the exact reference data financial queries are looking for.
  • CONSUMER_GOOD (software/product entities) is mixed. In Healthcare, product entities signal established brands and tools. In Crypto, naming specific protocols and products is core to answering technical queries.
  • PHONE_NUMBER is a positive signal in Healthcare (1.41x) and Education (1.40x). In both cases, it’s almost certainly a proxy for established brands/institutions/providers with real physical presence, not a literal signal to add phone numbers to your pages.

The Knowledge Graph inversion deserves its own note here:

  • The data showed that high-cited pages average 1.42 KG-verified entities vs. 1.75 for low-cited pages (lift: 0.81x).
  • Pages built around well-known, KG-verified entities (major brands, institutions, well-known people) tend toward generic coverage, which isn’t preferred by ChatGPT.
  • High-cited pages are dense with specific, niche entities: a particular methodology, a precise statistic, a named comparison. Many of those niche entities have no KG entries at all. That specificity is what AI reaches for.
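
Note that the Knowledge Graph lift is a ratio of averages (entities per page) rather than a ratio of page shares. A quick check of the arithmetic, using only the two averages reported above (the variable names are illustrative):

```python
# Average KG-verified entities per page, as reported in the analysis.
high_cited_avg = 1.42
low_cited_avg = 1.75

# Lift as a ratio of the two averages.
kg_lift = high_cited_avg / low_cited_avg
print(f"{kg_lift:.2f}x")  # 0.81x
```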

Top Takeaways

1. Add the publish date to your pages and aim to use at least one specific number in your content. That combination is the closest thing to a universal AI citation signal this dataset produced. But Finance gets there through price data and location specificity instead.

2. Avoid opening with pricing in non-finance verticals. Price-dominant intros correlate with lower citation rates.

3. KG presence and brand authority don’t translate to an AI citation advantage. Chasing Wikipedia entries, brand panels, or KG verification is the wrong lever. Specific, niche entities (even ones without KG entries) outperform well-known ones.

3. Heading Structure: Commit To One Or Don’t Bother

We know headings matter for citations from the previous two analyses. Next, I wanted to understand whether heading count predicts citation rates and whether the optimal structure varies by vertical.

Approach

  1. Counted total headings per page (H1+H2+H3) across all cited URLs.
  2. Grouped pages into 7 heading-count buckets: 0, 1-2, 3-4, 5-9, 10-19, 20-49, 50+.
  3. Computed high-cited rate (% of URLs that are high-cited) per bucket per vertical.
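
The bucketing and rate calculation above can be sketched like this. The bucket boundaries come from step 2; the function names and the (vertical, heading_count, is_high_cited) tuple layout are assumptions made for the example, and the sample rows are invented.

```python
from collections import defaultdict

# The seven heading-count buckets listed in the approach (50+ is the fallback).
BUCKETS = [
    (0, 0, "0"),
    (1, 2, "1-2"),
    (3, 4, "3-4"),
    (5, 9, "5-9"),
    (10, 19, "10-19"),
    (20, 49, "20-49"),
]


def heading_bucket(heading_count: int) -> str:
    """Map a total H1+H2+H3 count to its bucket label."""
    if heading_count < 0:
        raise ValueError("heading_count must be non-negative")
    for low, high, label in BUCKETS:
        if low <= heading_count <= high:
            return label
    return "50+"


def high_cited_rate_by_bucket(pages):
    """pages: iterable of (vertical, heading_count, is_high_cited) tuples.

    Returns {(vertical, bucket): share of URLs in that cell that are high-cited}.
    """
    totals = defaultdict(int)
    highs = defaultdict(int)
    for vertical, heading_count, is_high in pages:
        key = (vertical, heading_bucket(heading_count))
        totals[key] += 1
        highs[key] += int(is_high)
    return {key: highs[key] / totals[key] for key in totals}


# Invented sample rows, only to show the shape of the output.
sample = [("Crypto", 6, True), ("Crypto", 7, False), ("Crypto", 0, False)]
rates = high_cited_rate_by_bucket(sample)
```

Keeping the rate per (vertical, bucket) cell, rather than pooling all verticals, is what lets the vertical-specific sweet spots show up.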

Results: Including more headings in your content is not universally better. The sweet spot depends on vertical and content type. One finding holds everywhere: Oddly, 3-4 headings are worse than zero.

Image Credit: Kevin Indig

What The Industry Patterns Showed

  • CRM/SaaS is the only vertical where the 20+ heading lift is confirmed: 12.7% high-cited rate at 20-49 headings vs. a 5.9% baseline. The 50+ bucket reaches 18.2%. Long structured reference pages and comparison guides with one section per tool outperform everything else here.
  • Healthcare inverts most sharply. The high-cited rate drops from 15.1% at zero headings to 2.5% at 20-49 headings. A page with 30 H2s on telehealth topics signals optimization intent, not clinical authority.
  • Finance peaks at 10-19 headings (29.4% high-cited rate). Structured but not exhaustive: think rate tables, regulatory breakdowns, and advisor comparison pages with moderate heading depth.
  • Crypto peaks at 5 to 9 headings (34.7% high-cited rate). Technical documentation in this vertical tends toward dense prose with moderate navigation structure. Over-structuring breaks up the technical depth.
  • Education is flat across all heading counts, which is consistent with the writing signals finding. Heading structure explains almost nothing about citation likelihood in education content.
  • The three to four heading dead zone holds across every vertical without exception. Partial structure confuses AI navigation without providing the full benefit of a committed hierarchy.

Top Takeaways

1. The 20+ heading finding from Part 1 is a CRM/SaaS finding, not a universal one. Applying it to healthcare, education, or finance could actively suppress citation rates in those verticals.

2. The principle that holds everywhere: Commit to structure or don’t use it. The middle ground costs you in every vertical. A fully structured page with the right heading depth outperforms a half-structured page in every vertical.

3. Use the optimal heading range for your vertical. Crypto: 5-9. Finance and Education: 10-19. CRM/SaaS: 20+ (with H3s). Healthcare: 0 or 5-9 at most. Long CRM reference pages with 50+ sections are the only case where maximum heading depth pays off.

4. UGC Doesn’t Dominate

The “Reddit effect” reshaped organic search between 2024 and 2025. I wanted to understand whether ChatGPT cites user-generated content (Reddit, forums, reviews) at meaningful rates or whether corporate/editorial content dominates.

The common industry assumption, that AI also preferentially cites community voices, is not what we found in the data.

Approach

  1. Classified the cited URLs by domain as either (1) UGC: Reddit, Quora, Stack Overflow, forum subdomains, Medium, Substack, Product Hunt, Tumblr, and community/forum prefixes, or (2) corporate/editorial.
  2. Computed citation share per category per vertical.
  3. Dataset: 98,217 citations across 7 verticals.
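
A domain-based classifier along these lines could look like the sketch below. The platform list follows step 1, but the exact domain strings, the subdomain prefixes, and the function names are assumptions; the study does not publish its matching rules. The example URLs are invented.

```python
from collections import Counter
from urllib.parse import urlparse

# Platforms named in the approach; the domain strings are assumed.
UGC_DOMAINS = {
    "reddit.com", "quora.com", "stackoverflow.com", "medium.com",
    "substack.com", "producthunt.com", "tumblr.com",
}
# Assumed community/forum subdomain prefixes.
UGC_PREFIXES = ("forum.", "forums.", "community.")


def classify_url(url: str) -> str:
    """Return "ugc" or "corporate" based on the URL's host.

    Expects absolute URLs (with a scheme); anything unparseable
    falls through to "corporate".
    """
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host.startswith(UGC_PREFIXES):
        return "ugc"
    if any(host == d or host.endswith("." + d) for d in UGC_DOMAINS):
        return "ugc"
    return "corporate"


def ugc_share_by_vertical(citations):
    """citations: iterable of (vertical, url). Returns {vertical: UGC share}."""
    totals = Counter()
    ugc = Counter()
    for vertical, url in citations:
        totals[vertical] += 1
        if classify_url(url) == "ugc":
            ugc[vertical] += 1
    return {vertical: ugc[vertical] / totals[vertical] for vertical in totals}


# Invented sample rows, only to show the shape of the output.
shares = ugc_share_by_vertical([
    ("Crypto", "https://www.reddit.com/r/example/"),
    ("Crypto", "https://example.com/docs"),
])
```

Matching on the host rather than the full URL keeps the rule simple, at the cost of missing community sections that live under a path on a corporate domain.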

Results: Corporate content accounts for 94.7% of all citations. UGC is nearly invisible.

Image Credit: Kevin Indig

What The Industry Patterns Showed

  • Finance is the most corporate-locked vertical at 0.5% UGC. YMYL (Your Money, Your Life) content appears to systematically suppress citations to community opinion.
  • Healthcare sits at 1.8% UGC for the same structural reason. Medical, telehealth, and HIPAA content draws almost exclusively from institutional sources.
  • Crypto has the highest UGC penetration in the dataset at 9.2%. Community-generated content (Reddit technical threads, Medium tutorials, developer forum posts) answers a meaningful proportion of analyzed queries. In a fast-moving technical niche where official documentation consistently lags, community posts fill the gap.
  • Product Analytics and HR Tech sit at 6.9% and 5.8% UGC. Both are verticals where Reddit comparison threads and product review communities provide real signal alongside corporate content.

Top Takeaways

1. The “Reddit effect” in SEO has not translated proportionally to AI citations. In most verticals, reddit.com captures 2-5% of total citations. This finding is consistent with other industry research, including this report from Profound.

2. For finance and healthcare: UGC has near-zero AI citation value. Invest in structured, authoritative corporate content with clear sourcing. Community engagement may matter for other reasons, but it doesn’t contribute meaningfully to AI citation share in these verticals.

3. For crypto, product analytics, and HR tech: Community presence has measurable citation value. Detailed Reddit comparison threads, technical Medium posts, and structured developer forum answers can complement corporate content reach.

What This Means For How You Strategize For LLM Visibility

Across all three parts of this study, the consistent finding is that AI citation is not primarily a writing quality problem.

Part 2 showed it’s a content architecture problem: Thin single-intent pages are structurally locked out regardless of how well they’re written. This piece shows the same logic applies inside the content itself.

The aggregate writing signals table is the most important chart in this analysis. Not because it shows you what to do, but because it shows how much of what the AI SEO/GEO/AEO industry is telling you doesn’t survive cross-vertical scrutiny. Word count, list density, named entity counts … all flat or negative at the aggregate. The signals that work are vertical-specific and smaller than our industry’s consensus implies.

The meta-lesson from this analysis is that findings are vertical (and probably topic) specific, which is no different in SEO.

This part concludes the Science of AI, at least for now, because the AI ecosystem is constantly changing.

Methodology

We analyzed ~98,000 ChatGPT citation rows pulled from approximately 1.2 million ChatGPT responses from Gauge.

Because AI behaves differently depending on the topic, we isolated the data across seven distinct, verified verticals to ensure the findings weren’t skewed by one particular industry.

Analyzed verticals:

  • B2B SaaS
  • Finance
  • Healthcare
  • Education
  • Crypto
  • HR Tech
  • Product Analytics

Featured Image: CoreDESIGN/Shutterstock; Paulo Bobita/Search Engine Journal
