The Publisher’s Third Option
Every website owner who has thought about AI agents has eventually arrived at the same binary. Block them with robots.txt and lose whatever discovery, traffic, or relevance they might deliver. Or let them through and watch them consume your content at the full cost of raw HTML — which is to say, at a cost you bear but cannot control.
Neither option is good. There is a third.
The cost nobody is calculating
When an AI agent reads a page on your site, it sends your HTML — all of it — to a language model. A typical news article page might run to 150,000 tokens of raw HTML. Of those, perhaps 3,000 are the actual article. The other 147,000 are navigation elements, ad slots, footer links, tracking scripts, stylesheet references, and the accumulated detritus of a modern web page.
You do not pay for this directly. But you are subsidizing it. The AI companies running these agents are paying to process your layout. Their models are spending context window on your script tags. Their users are waiting longer for responses because the model is reading your cookie consent banner.
Nobody wins. The publisher bears the server cost of serving full pages to agents that don’t render them. The AI company bears the token cost of processing markup that carries no semantic value. The end user gets slower responses. The economics are misaligned at every layer.
What robots.txt SOM Directives change
The proposal is simple. Add five lines to your existing robots.txt file that say: “Yes, you can read my content, and here is a more efficient way to get it.” The SOM endpoint returns only the meaningful content — typed, structured, and semantic. No scripts. No styles. No navigation chrome. Ten to one hundred times fewer tokens. Same information.
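To make the mechanism concrete, here is a minimal agent-side sketch of discovering such a declaration. The directive names (SOM-endpoint, SOM-format, SOM-version) and the sample values are illustrative assumptions for this example, not quotations from a published specification; the proposal itself fixes the exact spelling.

    # A minimal agent-side sketch. The SOM-* directive names and the sample
    # values are illustrative assumptions, not a published specification.
    SAMPLE_ROBOTS_TXT = """\
    User-agent: *
    Allow: /
    SOM-endpoint: https://som.example.com/
    SOM-format: json
    SOM-version: 1.0
    """

    def parse_som_directives(robots_txt):
        """Collect any SOM-* key/value pairs from a robots.txt body."""
        directives = {}
        for raw_line in robots_txt.splitlines():
            line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
            if ":" not in line:
                continue
            key, value = line.split(":", 1)
            if key.strip().lower().startswith("som-"):
                directives[key.strip().lower()] = value.strip()
        return directives

    print(parse_som_directives(SAMPLE_ROBOTS_TXT))
    # {'som-endpoint': 'https://som.example.com/', 'som-format': 'json',
    #  'som-version': '1.0'}

An agent that finds these directives can fetch the declared endpoint instead of the raw page; an agent that does not simply falls back to ordinary crawling.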
This is cooperative content negotiation. The publisher controls the representation. The publisher decides what content is included and how it is structured. The agent gets what it needs in a format optimized for machine consumption. The cost drops for everyone involved.
The key word is cooperative. This is not scraping. This is not blocking. This is the publisher saying: “I want you to read my content. Here is the best way to do it.” It is the same principle behind RSS, behind sitemaps, behind Open Graph tags — infrastructure that publishers deploy voluntarily because it serves their interests.
The economics
For a publisher serving 50,000 AI agent page views per day, the downstream token economics follow directly from the per-page numbers above.
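As a rough sketch, counting tokens rather than dollars (model prices vary and change often), the arithmetic using this article's figures looks like this:

    # Back-of-the-envelope token counts, using the per-page figures above:
    # ~150,000 tokens for a raw HTML page, ~3,000 for its SOM representation,
    # and 50,000 agent page views per day.
    RAW_TOKENS_PER_PAGE = 150_000
    SOM_TOKENS_PER_PAGE = 3_000
    AGENT_PAGE_VIEWS_PER_DAY = 50_000

    raw_daily = RAW_TOKENS_PER_PAGE * AGENT_PAGE_VIEWS_PER_DAY
    som_daily = SOM_TOKENS_PER_PAGE * AGENT_PAGE_VIEWS_PER_DAY

    print(f"Raw HTML:  {raw_daily:,} tokens per day")   # 7,500,000,000
    print(f"SOM:       {som_daily:,} tokens per day")   # 150,000,000
    print(f"Reduction: {raw_daily // som_daily}x")      # 50x

A fifty-fold reduction sits comfortably inside the ten-to-one-hundred-times range cited above; the exact factor depends on how bloated the original pages are.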
The publisher is not paying these token costs directly. But the agents consuming their content are. And agents that pay less to read your content have more reason to read it. More reason to index it. More reason to surface it in responses. The economic alignment is genuine: making your content cheaper to consume makes it more likely to be consumed.
This is not a theoretical argument. It is arithmetic. The economics calculator on this site will compute the numbers for your traffic volume.
The five-minute implementation
The implementation path for most publishers is straightforward. Add a CNAME record pointing your SOM subdomain to a SOM-compatible service. Add five lines to your robots.txt declaring the endpoint, the format, and the version. Use a service like somready.com to handle the actual SOM generation and serving.
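Concretely, the DNS change is a single CNAME on a subdomain you choose (som.example.com in the sketch below, pointed at whatever hostname your SOM service assigns), and the robots.txt addition might look like the following. The directive names are the same illustrative placeholders used earlier, not the proposal's canonical spelling.

    # Added to robots.txt. The SOM-* directive names are the same
    # illustrative placeholders used earlier in this piece.
    User-agent: *
    Allow: /
    SOM-endpoint: https://som.example.com/
    SOM-format: json
    SOM-version: 1.0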
That is the complete implementation. No code changes to your site. No new dependencies. No migration. Your existing pages continue to serve normally to browsers and traditional crawlers. The SOM endpoint serves structured content to agents that discover it. Check your configuration at somready.com/check.
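For a do-it-yourself check along the same lines, a short script can confirm that your robots.txt declares an endpoint and that the endpoint answers. It assumes the illustrative SOM-endpoint directive name from the sketches above, and example.com is a placeholder for your own domain.

    # Rough do-it-yourself check, assuming the illustrative SOM-endpoint
    # directive name from the sketches above. Replace example.com with
    # your own domain.
    import urllib.request

    SITE = "https://example.com"  # placeholder

    def fetch(url):
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")

    robots = fetch(f"{SITE}/robots.txt")

    endpoint = None
    for line in robots.splitlines():
        if line.strip().lower().startswith("som-endpoint:"):
            endpoint = line.split(":", 1)[1].strip()
            break

    if endpoint is None:
        print("No SOM-endpoint directive found in robots.txt")
    else:
        body = fetch(endpoint)
        print(f"SOM endpoint answered with {len(body):,} characters")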
The binary was never real
The binary — block or be consumed — was never the only option. It was the only option when the tooling did not exist to support anything else. Publishers have always had the ability to serve different representations to different consumers. We do it for mobile browsers. We do it for screen readers. We do it for search engine crawlers with structured data markup.
Serving a structured representation to AI agents is the same principle, applied to a new class of consumer. The robots.txt SOM Directives proposal gives publishers the mechanism. The tooling gives them the implementation path. The economics give them the reason.
The third option has always been there. Now it has infrastructure.