Google Search Advocate John Mueller has pushed back on the idea of building separate Markdown or JSON pages solely for large language models (LLMs), saying he doesn't see why LLMs would need pages that no one else sees.
The discussion began when Lily Ray asked on Bluesky about "creating separate markdown / JSON pages for LLMs and serving those URLs to bots," and whether Google could share its perspective.
Ray asked:
Not sure if you can answer, but starting to hear a lot about creating separate markdown / JSON pages for LLMs and serving those URLs to bots. Can you share Google's perspective on this?
The question draws attention to a growing trend in which publishers create "shadow" copies of important pages in formats that are easier for AI systems to understand.
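To make the practice concrete, here is a minimal sketch of the kind of setup being discussed: routing known AI crawlers to a Markdown "shadow" URL while regular visitors get the normal HTML page. The bot tokens and URL paths are illustrative assumptions, not a recommendation of the technique.

```python
# Hypothetical "shadow page" routing: AI crawlers get a Markdown
# variant, everyone else gets the normal HTML URL. Bot names and
# path conventions here are placeholders.

AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot")

def resolve_variant(user_agent: str, html_path: str) -> str:
    """Return a Markdown 'shadow' path for AI crawlers, else the HTML path."""
    if any(token in user_agent for token in AI_CRAWLER_TOKENS):
        # e.g. /blog/post.html -> /llm/blog/post.md
        return "/llm" + html_path.rsplit(".", 1)[0] + ".md"
    return html_path

print(resolve_variant("Mozilla/5.0 (compatible; GPTBot/1.0)", "/blog/post.html"))
print(resolve_variant("Mozilla/5.0 (Windows NT 10.0)", "/blog/post.html"))
```

This is exactly the setup Mueller questions below: it creates a page that no human visitor ever sees.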
There's a more active discussion on this topic happening on X.
This has been the hot topic lately, I've been getting pitched by companies who do this https://t.co/rVnbPKUxZj
— Lily Ray 😏 (@lilyraynyc) November 23, 2025
What Mueller Said About LLM-Only Pages
Mueller replied that he isn't aware of anything on Google's side that would call for this kind of setup.
He notes that LLMs have worked with regular web pages from the beginning:
I'm not aware of anything in that regard. In my POV, LLMs have trained on – read & parsed – normal web pages since the beginning, it seems a given that they have no problems dealing with HTML. Why would they want to see a page that no user sees? And, if they check for equivalence, why not use HTML?
When Ray followed up about whether a separate format could help "expedite getting key points across to LLMs quickly," Mueller argued that if file formats made a meaningful difference, you would likely hear that directly from the companies running these systems.
Mueller added:
If those creating and running these systems knew they could create better responses from sites with specific file formats, I expect they'd be very vocal about that. AI companies aren't really known for being shy.
He said some pages may still work better for AI systems than others, but he doesn't think that comes down to HTML versus Markdown:
That said, I can imagine some pages working better for users and some better for AI systems, but I doubt that's because of the file format, and it's definitely not generalizable to everything. (Excluding JS which still seems hard for many of these systems.)
Taken together, Mueller's comments suggest that, from Google's perspective, you don't need to create bot-only Markdown or JSON clones of existing pages just to be understood by LLMs.
How Structured Data Fits In
Others in the thread drew a line between speculative "shadow" formats and cases where AI platforms have clearly defined feed requirements.
A reply from Matt Wright pointed to OpenAI's ecommerce product feeds as an example where JSON schemas matter.
In that context, a defined spec governs how ChatGPT ingests and displays product data. Wright explains:
Interestingly, the OpenAI eCommerce product feeds are live: JSON schemas appear to have a key role in AI search already.
That example supports the idea that structured feeds and schemas matter most when a platform publishes a spec and asks you to use it.
Additionally, Wright points to a thread on LinkedIn where Chris Long observed that "editorial sites using product schemas tend to get included in ChatGPT citations."
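For context, product schema here refers to schema.org Product markup embedded as JSON-LD. A minimal sketch, with placeholder values, might look like this (built in Python for clarity; on a real page the JSON lands inside a `<script type="application/ld+json">` tag):

```python
import json

# Minimal schema.org Product markup as JSON-LD. All values are
# illustrative placeholders, not a platform-specific feed spec.
product_jsonld = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Widget",
    "description": "A placeholder product used to illustrate Product markup.",
    "offers": {
        "@type": "Offer",
        "price": "19.99",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}

print(json.dumps(product_jsonld, indent=2))
```

Unlike speculative Markdown clones, this kind of markup follows a published vocabulary that platforms explicitly document and consume.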
Why This Matters
If you're wondering whether to build "LLM-optimized" Markdown or JSON versions of your content, this exchange can help steer you back to the basics.
Mueller's comments reinforce that LLMs have long been able to read and parse standard HTML.
For most sites, it's more productive to keep improving speed, clarity, and content structure on the pages you already have, and to implement schema where there's clear platform guidance.
At the same time, the Bluesky thread shows that AI-specific formats are starting to emerge in narrow areas such as product feeds. These are worth monitoring, but they're tied to specific integrations, not a blanket rule that Markdown is better for LLMs.
Looking Ahead
The conversation highlights how quickly AI-driven search changes are turning into technical requests for SEO and dev teams, often before there's documentation to support them.
Until LLM providers publish more concrete guidelines, this thread points you back to work you can justify today: keep your HTML clean, reduce unnecessary JavaScript where it makes content hard to parse, and use structured data where platforms have clearly documented schemas.
Featured Image: Roman Samborskyi/Shutterstock
