Cloudflare has announced new default settings that will block 'mixed-use' crawlers from accessing pages that host ads, starting September 15, 2026. The change applies to new customers, new sites set up by existing customers, and all free customers. According to the company, the policy aims to separate web crawlers used for traditional search from those used for AI agents and training. The move could affect how AI model providers access web content for training and for powering agentic services.
Cloudflare emphasized that most website owners want their content to be discoverable through search and AI services but seek protections against unauthorized use of their intellectual property. The company said the 'world’s largest search engine' has access to about '2x more information' than other AI companies because site owners find it difficult to stay discoverable in search without also having their content used for AI. Google has previously contested this generalization, noting that its Google-Extended control lets site owners opt out of having their content used for training and AI products without affecting their inclusion in Search. However, Googlebot crawls for Search, which includes AI features like AI Overviews and AI Mode.
'Now that the majority of traffic on the Internet is non-human, we must go further and act faster so that a sustainable ecosystem can emerge,' said Cloudflare co-founder and CEO Matthew Prince, referring to the recent milestone in which bot traffic surpassed human traffic for the first time. That shift had not been expected to occur until next year. 'Cloudflare’s new tools and partnerships give website owners increased visibility and commercial opportunities and benefit AI companies that have bots with clear and transparent intent,' Prince added. While Cloudflare offers products that help users launch AI systems, the company has also released tools to give publishers more control over their content in the AI era.
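For readers unfamiliar with how the Google-Extended opt-out mentioned above works in practice, here is a minimal sketch. It assumes the opt-out is expressed as a robots.txt rule keyed on the `Google-Extended` token, and it uses Python's standard-library robots.txt parser to show how such a rule would be read. The robots.txt contents, the `can_crawl` helper, and the example.com URL are illustrative assumptions, not taken from Cloudflare or Google.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of Google-Extended, allow everything else.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def can_crawl(user_agent: str, url: str) -> bool:
    """Return True if the sample robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"  # placeholder URL
    for agent in ("Googlebot", "Google-Extended"):
        print(agent, can_crawl(agent, page))
```

Under these rules Googlebot is still permitted, which matches the article's point: the opt-out does not remove a site from Search, and so does not cover AI features that are served through Search.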
In recent years, Cloudflare has launched tools to combat AI bots, including Pay Per Crawl, a marketplace that lets websites charge AI bots for scraping. Pay Per Crawl is now evolving into 'Pay Per Use,' which lets publishers charge AI companies when their content creates value, not just when it’s fetched. Cloudflare’s data suggests that over 50% of crawl traffic from AI crawlers is spent re-fetching unchanged pages. To implement Pay Per Use, Cloudflare is initially working with two partners, Ceramic.ai and You.com. Publishers who opt in are paid when their content appears in Ceramic’s AI search results or when You.com accesses a piece of their premium content. Other AI companies can customize this model to fit how they work, Cloudflare says.
Source: TechCrunch