This week, I share my findings from analyzing 1.2 million ChatGPT responses to answer the question of how to improve your chances of getting cited.
For 20 years, SEOs have written "ultimate guides" designed to keep people on the page. We write long intros. We drag insights along through the draft and into the conclusion. We build suspense toward the final call to action.
The data shows that this style of writing is not ideal for AI visibility.
After analyzing 1.2 million verified ChatGPT citations, I found a pattern so consistent it has a p-value of 0.0: the "ski ramp." ChatGPT pays disproportionate attention to the top 30% of your content. Additionally, I found five clear characteristics of content that gets cited. To win in the AI era, you need to start writing like a journalist.
1. Which Sections Of A Text Are Most Likely To Be Cited By ChatGPT?

Not much is known about which parts of a text LLMs cite. We analyzed 18,012 citations and found a "ski ramp" distribution (see the sketch after this list):
- 44.2% of all citations come from the first 30% of the text (the intro). The AI reads like a journalist. It grabs the "Who, What, Where" from the top. If your key insight is in the intro, the chances it gets cited are high.
- 31.1% of citations come from the 30-70% range of a text (the middle). If you bury your key product features in paragraph 12 of a 20-paragraph post, the AI is 2.5x less likely to cite it.
- 24.7% of citations come from the last third of an article (the conclusion). It proves the AI does wake up at the end (much like humans). It skips the actual footer (see the 90-100% drop-off), but it loves the "Summary" or "Conclusion" section right before the footer.
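To make the bucketing concrete, here's a minimal Python sketch of how citation depths can be grouped into these three zones. The `depths` values are hypothetical placeholders, not the study's data:

```python
# Minimal sketch: bucket citation depths (0.0 = top of page, 1.0 = bottom)
# into the three zones above. The depths below are hypothetical placeholders.
from collections import Counter

depths = [0.05, 0.12, 0.15, 0.22, 0.28, 0.45, 0.51, 0.67, 0.83, 0.91]

def zone(depth: float) -> str:
    if depth < 0.30:
        return "intro (0-30%)"
    if depth < 0.70:
        return "middle (30-70%)"
    return "conclusion (70-100%)"

counts = Counter(zone(d) for d in depths)
total = sum(counts.values())
for name, n in counts.most_common():
    print(f"{name}: {n / total:.1%}")
```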
Possible explanations for the ski ramp pattern are training and efficiency:
- LLMs are trained on journalism and academic papers, which follow the "BLUF" (Bottom Line Up Front) structure. The model learns that the most "weighted" information is always at the top.
- While modern models can read up to 1 million tokens in a single interaction (~700,000-800,000 words), they aim to establish the frame as fast as possible, then interpret everything else through that frame.

18,000 out of 1.2 million citations gives us all the insight we need. The p-value of this analysis is 0.0, meaning it's statistically undeniable. I split the data into batches (randomized validation splits) to demonstrate the stability of the results:
- Batch 1 was slightly flatter, but batches 2, 3, and 4 are almost identical.
- Conclusion: Because batches 2, 3, and 4 locked onto the exact same pattern, the data is stable across all 1.2 million citations.
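The exact statistical test isn't published here, but a stability check of this kind could be sketched with randomized splits and a chi-square test against a "no positional bias" baseline (scipy assumed, data simulated):

```python
# Simulated stability check, assuming a chi-square test against a uniform
# "no positional bias" baseline (the article reports the p-value, not the test).
import random
from collections import Counter
from scipy.stats import chisquare

random.seed(42)
depths = [random.random() ** 2 for _ in range(18_012)]  # skewed toward the top

def zone_counts(sample):
    buckets = Counter("intro" if d < 0.3 else "middle" if d < 0.7 else "end"
                      for d in sample)
    return [buckets["intro"], buckets["middle"], buckets["end"]]

random.shuffle(depths)
batches = [depths[i::4] for i in range(4)]  # four randomized validation splits

for i, batch in enumerate(batches, start=1):
    observed = zone_counts(batch)
    n = sum(observed)
    stat, p = chisquare(observed, f_exp=[n / 3] * 3)
    print(f"batch {i}: {observed}, chi2={stat:.0f}, p={p:.3g}")
```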
While these batches confirm the macro-level stability of where ChatGPT looks across a document, they raise a new question about its granular behavior: Does this top-heavy bias persist even within a single block of text, or does the AI's focus change when it reads more deeply? Having established that the data is statistically undeniable at scale, I wanted to "zoom in" to the paragraph level.

A deep analysis of 1,000 pieces of content with a high volume of citations shows that 53% of citations come from the middle of a paragraph. Only 24.5% come from the first sentence and 22.5% from the last sentence of a paragraph. ChatGPT is not "lazy," reading only the first sentence of every paragraph. It reads deeply.
Takeaway: You don't have to force the answer into the first sentence of every paragraph. ChatGPT seeks the sentence with the highest "information gain" (the most complete use of relevant entities and additive, expansive information), regardless of whether that sentence is first, second, or fifth in the paragraph. Combined with the ski ramp pattern, we can conclude that the highest chances for citations come from the paragraphs in the first 20% of the page.
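For the paragraph-level view, a simplified sketch of classifying where a cited sentence falls within its paragraph might look like this (the study's actual sentence tokenizer isn't stated; the split here is naive):

```python
# Naive sketch: classify whether a cited sentence is the first, middle,
# or last sentence of its paragraph (the study's tokenizer isn't stated).
def sentence_position(paragraph: str, cited_fragment: str) -> str:
    sentences = [s.strip() for s in paragraph.split(". ") if s.strip()]
    idx = next(i for i, s in enumerate(sentences) if cited_fragment in s)
    if idx == 0:
        return "first"
    if idx == len(sentences) - 1:
        return "last"
    return "middle"

para = ("Demo automation tools are maturing fast. Salesforce and HubSpot "
        "both ship native options. Smaller vendors compete on price.")
print(sentence_position(para, "Salesforce and HubSpot"))  # -> middle
```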
2. What Makes ChatGPT More Likely To Cite Chunks?
We know where in content ChatGPT likes to cite from, but what are the characteristics that influence citation likelihood?
The analysis reveals five winning traits:
- Definitive language.
- Conversational question-answer structure.
- Entity richness.
- Balanced sentiment.
- Simple writing.
1. Definitive Vs. Vague Language

Citation winners are almost 2x more likely (36.2% vs. 20.2%) to contain definitive language ("is defined as," "refers to"). The cited language doesn't have to be a verbatim definition, but the relationships between concepts must be clear.
Possible explanations for the impact of direct, declarative writing:
- In a vector database, the word "is" acts as a strong bridge connecting a subject to its definition. When a user asks "What is X?" the model searches for the strongest vector path, which is almost always a direct "X is Y" sentence structure.
- The model tries to answer the user immediately. It prefers a text that allows it to resolve the query in a single sentence (zero-shot) rather than synthesizing an answer from five paragraphs.
Takeaway: Start your articles with a direct statement.
- Bad: "In this fast-paced world, automation is becoming key…"
- Good: "Demo automation is the process of using software to…"
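A minimal sketch of how definitive language could be flagged programmatically; the phrase list is illustrative, not the study's full lexicon:

```python
# Illustrative check for "definitive language"; the phrase list is a
# simplified stand-in, not the study's actual lexicon.
import re

DEFINITIVE = re.compile(r"\b(is defined as|refers to|is (?:a|an|the))\b",
                        re.IGNORECASE)

bad = "In this fast-paced world, automation is becoming key to success."
good = "Demo automation is the process of using software to run demos."

for text in (bad, good):
    print(bool(DEFINITIVE.search(text)), "-", text)
```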
2. Conversational Writing

Text that gets cited is 2x more likely (18% vs. 8.9%) to contain a question mark. When we talk about conversational writing, we mean the interplay between questions and answers.
Start with the user's query as a question, then answer it directly. For example:
- Winner Version: "What is Programmatic SEO? It is…"
- Loser Version: "In this article, we'll discuss the various nuances of…"
78.4% of citations with questions come from headings. The AI is treating your H2 tag as the user prompt and the paragraph immediately following it as the generated response.
Example winner structure (the 78%):
- "When did SEO start?" (literal query)
- "SEO started in…" (direct answer)
The reason that specific example wins is what I call "entity echoing": The header asks about SEO, and the very first word of the answer is "SEO."
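Here's a hedged sketch of how the heading-as-prompt pattern and "entity echoing" could be checked on a page (BeautifulSoup assumed; the echo test is simplified to a first-word comparison):

```python
# Sketch: treat each H2 as the "prompt" and the next paragraph as the
# "response," then check for entity echoing via a first-word comparison.
# BeautifulSoup is assumed; the HTML below is a toy example.
from bs4 import BeautifulSoup

html = """
<h2>When did SEO start?</h2>
<p>SEO started in the mid-1990s, when early search engines indexed the web.</p>
"""

soup = BeautifulSoup(html, "html.parser")
for h2 in soup.find_all("h2"):
    answer = h2.find_next_sibling("p")
    if answer is None:
        continue
    question = h2.get_text(strip=True)
    first_word = answer.get_text(strip=True).split()[0]
    echoed = first_word.lower().strip(".,") in question.lower()
    print(f"Q: {question}\nA opens with: {first_word} (entity echo: {echoed})")
```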
3. Entity Richness

Normal English text has an "entity density" (that is, the share of proper nouns like brands, tools, and people) of ~5-8%. Heavily cited text has an entity density of 20.6%!
- The 5-8% figure is a linguistic benchmark derived from standard corpora like the Brown Corpus (1 million words of representative English text) and the Penn Treebank (Wall Street Journal text).
Example:
- Loser sentence: "There are many good tools for this task." (0% density)
- Winner sentence: "Top tools include Salesforce, HubSpot, and Pipedrive." (30% density)
LLMs are probabilistic. Generic advice ("choose a good tool") is risky and imprecise, but a specific entity ("choose Salesforce") is grounded and verifiable. The model prioritizes sentences that contain "anchors" (entities) because they lower the perplexity (confusion) of the answer.
A sentence with three entities carries more "bits" of information than a sentence with zero entities. So, don't be afraid of namedropping (yes, even your competitors).
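A rough sketch of an entity-density metric using spaCy's part-of-speech tags. Note that the exact percentage depends on how tokens and entities are counted, so figures from this sketch won't necessarily match the study's:

```python
# Rough entity-density metric: share of non-punctuation tokens tagged as
# proper nouns. Requires: python -m spacy download en_core_web_sm
# The study's precise definition of "entity" may differ from PROPN tokens.
import spacy

nlp = spacy.load("en_core_web_sm")

def entity_density(text: str) -> float:
    doc = nlp(text)
    words = [t for t in doc if not t.is_punct and not t.is_space]
    proper_nouns = [t for t in words if t.pos_ == "PROPN"]
    return len(proper_nouns) / len(words) if words else 0.0

print(entity_density("There are many good tools for this task."))
print(entity_density("Top tools include Salesforce, HubSpot, and Pipedrive."))
```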
4. Balanced Sentiment

In my analysis, the cited text has a balanced subjectivity score of 0.47. The subjectivity score is a standard metric in natural language processing (NLP) that measures the amount of personal opinion, emotion, or judgment in a piece of text.
The score runs on a scale from 0.0 to 1.0:
- 0.0 (Pure Objectivity): The text contains only verifiable facts. No adjectives, no feelings. Example: "The iPhone 15 was released in September 2023."
- 1.0 (Pure Subjectivity): The text contains only personal opinions, emotions, or intense descriptors. Example: "The iPhone 15 is an absolutely stunning masterpiece that I love."
AI doesn't want dry Wikipedia text (0.1), nor does it want unhinged opinion (0.9). It wants the "analyst voice." It prefers sentences that explain how a fact applies, rather than just stating the stat alone.
The "winning" tone looks like this (score ~0.5): "While the iPhone 15 features a standard A16 chip (fact), its performance in low-light photography makes it a superior choice for content creators (analysis/opinion)."
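The study doesn't name its NLP stack, but TextBlob is a common way to compute this subjectivity metric, and a sketch looks like this:

```python
# TextBlob's subjectivity score runs 0.0 (objective) to 1.0 (subjective);
# the study's actual NLP stack isn't named, so this library is a stand-in.
from textblob import TextBlob

samples = [
    "The iPhone 15 was released in September 2023.",                     # fact
    "The iPhone 15 is an absolutely stunning masterpiece that I love.",  # opinion
    "While the iPhone 15 features a standard A16 chip, its low-light "
    "photography makes it a superior choice for content creators.",     # analyst voice
]

for text in samples:
    score = TextBlob(text).sentiment.subjectivity
    print(f"{score:.2f}  {text[:50]}...")
```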
5. Business-Grade Writing

Business-grade writing (think The Economist or Harvard Business Review) gets more citations. "Winners" have a Flesch-Kincaid score of 16 (college level) compared to the "losers" at 19.1 (academic/PhD level).
Even for complex topics, complexity can hurt. A grade 19 score means sentences are long, winding, and stuffed with multisyllabic jargon. The AI prefers simple subject-verb-object structures with short to moderately long sentences, because they're easier to extract facts from.
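A quick sketch of how the readability comparison could be reproduced with the textstat library (assumed here; the study's exact tooling isn't stated):

```python
# Comparing Flesch-Kincaid grade levels with textstat (an assumed tool;
# the study doesn't name its readability library).
import textstat

simple = ("Demo automation is the process of using software to run product "
          "demos. It saves sales teams hours of repetitive work.")
academic = ("Notwithstanding the multifarious considerations heretofore "
            "enumerated, the operationalization of demonstrative automation "
            "paradigms necessitates comprehensive organizational realignment.")

print(textstat.flesch_kincaid_grade(simple))    # lower grade = easier to extract
print(textstat.flesch_kincaid_grade(academic))  # ~19+ reads at a PhD level
```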
Conclusion
The "ski ramp" pattern quantifies a misalignment between narrative writing and information retrieval. The algorithm interprets the slow reveal as a lack of confidence. It prioritizes the fast classification of entities and facts.
High-visibility content functions more like a structured briefing than a story.
This imposes a "readability tax" on the writer. The winners in this dataset rely on business-grade vocabulary and high entity density, disproving the theory that AI rewards "dumbing down" content (with exceptions).
We're not only writing for robots … yet. But the gap between human preferences and machine constraints is closing. In business writing, humans scan for insights. By front-loading the conclusion, we satisfy both the algorithm's architecture and the human reader's scarcity of time.
Methodology
To understand exactly where and why AI cites content, we analyzed the code.
All data in this research comes from Gauge.
- Gauge provided approximately 3 million AI answers from ChatGPT, alongside 30 million citations. Each citation URL's web content was scraped at the time of the answer to provide a direct correlation between the real web content and the answer itself. Both raw HTML and plaintext were scraped.
1. The Dataset
We started with a universe of 1.2 million search results and AI-generated answers. From this, we isolated 18,012 verified citations for positional analysis and 11,022 citations for "linguistic DNA" analysis.
- Significance: This sample size is large enough to produce a p-value of 0.0, meaning the patterns we found are statistically undeniable.
2. The “Harvester” Engine
To find exactly which sentence the AI was quoting, we used semantic embeddings (a neural network approach).
- The Model: We used all-MiniLM-L6-v2, a sentence-transformer model that understands meaning, not just keywords.
- The Process: We converted every AI answer and every sentence of the source text into 384-dimensional vectors. We then matched them using cosine similarity.
- The Filter: We applied a strict similarity threshold (0.55) to discard weak matches or hallucinations, ensuring we only analyzed high-confidence citations.
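Based on the details above, the matching step can be sketched roughly like this (the surrounding pipeline is assumed; only the model name, vector size, and 0.55 threshold come from the methodology):

```python
# Sketch of the matching step: embed the answer and every source sentence
# with all-MiniLM-L6-v2 (384 dimensions), then keep cosine matches >= 0.55.
# The answer and sentences below are toy examples.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

answer = "SEO started in the mid-1990s as search engines began indexing the web."
source_sentences = [
    "Our agency was founded in 2010.",
    "SEO began in the mid-1990s, when the first crawlers indexed websites.",
    "Contact us for a free consultation.",
]

answer_vec = model.encode(answer, convert_to_tensor=True)
source_vecs = model.encode(source_sentences, convert_to_tensor=True)
scores = util.cos_sim(answer_vec, source_vecs)[0]

for sentence, score in zip(source_sentences, scores):
    verdict = "high-confidence match" if score.item() >= 0.55 else "discarded"
    print(f"{score.item():.2f}  {verdict}: {sentence}")
```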
3. The Metrics
Once we found the exact match, we measured two things:
- Positional Depth: We calculated exactly where the cited text appeared in the HTML (e.g., at the 10% mark vs. the 90% mark).
- Linguistic DNA: We compared "winners" (cited intros) vs. "losers" (skipped intros) using natural language processing (NLP) to measure:
  - Definition Rate: Presence of definitive verbs (is, are, refers to).
  - Entity Density: Frequency of proper nouns (brands, tools, people).
  - Subjectivity: A sentiment score from 0.0 (fact) to 1.0 (opinion).
Featured Image: Paulo Bobita/Search Engine Journal
