Known Agents 2025 Year In Review

Known Agents lets websites track and control their bot traffic with features like Automatic Robots.txt, Agent Analytics, and LLM Referral Tracking. You can connect your website using any CDN, backend, or the WordPress plugin, for free.

The Known Agents 2025 Year in Review reveals how AI bots reshaped the web in 2025, and what these shifts mean for website owners in 2026 and beyond. We encourage you to explore the findings below. Detailed sources and methodology can be found in the appendix.

1. Bots represented ~40% of all website traffic

Non-human traffic reached a tipping point in 2025. Traditional crawlers like Googlebot remained the most active, but newer AI-related bots grew to make up 29% of bot traffic. Altogether, bots accounted for approximately 40% of visits to the average website. This shift fundamentally changes how website owners need to approach infrastructure costs, content optimization, and IP protection strategies. Website owners can use Agent Analytics to track all bot activity across categories.
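Combining the two percentages above: if bots are roughly 40% of all traffic and AI-related bots are 29% of bot traffic, then AI bots alone account for roughly 11.6% of all visits. A quick check of that arithmetic (the figures come from this section; the multiplication is ours):

```python
# Figures reported above; the multiplication is just a sanity check.
total_bot_share = 0.40    # bots as a share of all website traffic
ai_share_of_bots = 0.29   # AI-related bots as a share of bot traffic

ai_share_of_total = total_bot_share * ai_share_of_bots
print(f"AI bots: {ai_share_of_total:.1%} of all traffic")  # AI bots: 11.6% of all traffic
```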

Traffic Percentage By Agent Type

AI Agent: Uses an actual web browser to autonomously complete complex tasks on behalf of a human user
AI Assistant: Fetches website content in response to a user prompt, to include in an AI-generated answer
AI Data Scraper: Downloads website content to include in datasets used for training AI models such as LLMs
AI Search Crawler: Indexes website content to possibly include as citations in AI-powered search results
Archiver: Captures and stores historical website snapshots for long-term digital preservation
Automated Agent: Automates browser interactions programmatically without direct human supervision
Developer Helper: Assists with testing, debugging, and ensuring website functionality
Fetcher: Retrieves web page metadata to power app features like link previews or feeds
Intelligence Gatherer: Analyzes web content for brand safety, competitive insights, and ad targeting
Scraper: Extracts large amounts of web data, often without explicit website permission
Search Engine Crawler: Systematically scans and indexes web pages to include in search results
Security Scanner: Scans websites for security vulnerabilities, threats, and configuration weaknesses
SEO Crawler: Analyzes website structure and content to identify SEO improvement opportunities
Uncategorized: Not yet assigned a type
Undocumented AI Agent: AI-powered bot with an unclear purpose, often used for undisclosed data collection
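Categorization of this kind typically starts from user-agent strings. A minimal sketch, with a handful of illustrative patterns (the mapping below is our own example, not Known Agents' actual classification rules):

```python
# Illustrative user-agent tokens mapped to the categories above.
# Real classifiers also verify source IP ranges, since user agents
# can be spoofed.
AGENT_CATEGORIES = {
    "GPTBot": "AI Data Scraper",
    "ClaudeBot": "AI Data Scraper",
    "Googlebot": "Search Engine Crawler",
    "AhrefsBot": "SEO Crawler",
}

def classify(user_agent: str) -> str:
    """Return the first matching category, else 'Uncategorized'."""
    for token, category in AGENT_CATEGORIES.items():
        if token.lower() in user_agent.lower():
            return category
    return "Uncategorized"

print(classify("Mozilla/5.0 (compatible; GPTBot/1.1; +https://openai.com/gptbot)"))
```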

Traffic Percentage By Top Agent

[Chart: traffic share for the top agents: AhrefsBot, Amazonbot, Barkrowler, bingbot, ChatGPT-User, facebookexternalhit, Googlebot, GPTBot, meta-externalagent, SemrushBot]

2. AI bots fell into 4 major behavioral categories

AI bots weren't monolithic. They fell into four categories with distinct purposes and real business implications: AI Agents, AI Assistants, AI Data Scrapers, and AI Search Crawlers (defined in the taxonomy above).

Each category exhibited different traffic patterns depending on whether its activity was automated or user-initiated.

3. Publishers inconsistently blocked AI bots of the same category

Robots.txt analysis across the top 1,000 domains exposed a fragmented approach to bot management. Publishers blocked certain AI data scrapers, while allowing others that served the same purpose. The block rates for functionally identical bots varied dramatically, revealing that most sites lacked a coherent strategy or were unable to keep up with new bots. To maintain a consistent blocking strategy, we recommend using Automatic Robots.txt rather than adding individual bots manually.
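An audit of this kind can be reproduced with Python's standard library. A minimal sketch, assuming a site's robots.txt has already been fetched; the example policy below is hypothetical but shows the inconsistency pattern, with GPTBot blocked while CCBot, an AI data scraper in the same category, stays allowed:

```python
from urllib import robotparser

# Hypothetical robots.txt illustrating an inconsistent policy:
# one AI data scraper is blocked while another remains allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

def block_status(robots_txt: str, bots: list[str]) -> dict[str, bool]:
    """Map each bot name to True if it is blocked from the site root."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: not parser.can_fetch(bot, "/") for bot in bots}

print(block_status(ROBOTS_TXT, ["GPTBot", "CCBot"]))  # {'GPTBot': True, 'CCBot': False}
```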

More troublingly, many publishers blocked AI assistant and search crawler bots that could have driven real traffic from AI platforms like ChatGPT and Gemini. Publishers appeared unable to distinguish beneficial crawlers from extractive training bots. This confusion likely cost them significant referral traffic as AI search grew. Again, we recommend using Automatic Robots.txt to solve this problem.
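As a sketch, a category-consistent policy in robots.txt could group bots of the same type under one rule, using names that appear in this report; the grouping below is an illustration, not a recommendation for any particular site:

```
# Block AI data scrapers as a group: one rule per category
User-agent: GPTBot
User-agent: CCBot
User-agent: ClaudeBot
User-agent: Bytespider
Disallow: /

# Allow AI assistants that can send referral traffic from AI platforms
User-agent: ChatGPT-User
Allow: /

# Everything else keeps default access
User-agent: *
Allow: /
```

Grouping consecutive User-agent lines before a shared rule is standard robots.txt syntax, which keeps the policy for a whole category in one place.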

AI Data Scraper Blocked Percentage

[Chart: per-bot block rates for Ai2Bot-Dolma, Amazonbot, ApifyWebsiteContentCrawler, Applebot-Extended, Bytespider, CCBot, ChatGLM-Spider, ClaudeBot, CloudVertexBot, cohere-training-data-crawler, Cotoyogi, Datenbank Crawler, Diffbot, FacebookBot, FirecrawlAgent, Google-Extended, GoogleOther, GPTBot, ICC-Crawler, imageSpider, Kangaroo Bot, laion-huggingface-processor, LCC, meta-externalagent, netEstate Imprint Crawler, omgili, PanguBot, SBIntuitionsBot, Spider, Timpibot, VelenPublicWebCrawler, webzio-extended]

4. AI agents became potential new customers

Autonomous AI agents such as ChatGPT User and Manus User emerged. They browsed, compared, and transacted on behalf of human users. Companies like Browserbase built infrastructure to help businesses develop their own AI agents. With this early foundation, paired with advancements in model accuracy, AI agent activity was expected to pick up significantly in 2026. Website owners who want to capture this growing conversion channel should use Agent Analytics to see how AI agents are navigating their pages and improve conversion rates.

[Chart: AI Agent Traffic Percentage]

5. Initiatives to build trust accelerated, and adoption grew

The industry responded to bot proliferation with better mechanisms for transparency. Standards like HTTP Message Signatures (web bot auth) emerged to cryptographically verify bot identity, while bots increasingly included metadata in their requests to provide detail about their purpose and operator. This helped websites verify legitimate bots and optimize their experience.
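On the wire, a signed bot request under this scheme carries signature headers alongside a pointer identifying the bot's operator. The shape below follows RFC 9421 (HTTP Message Signatures) and the web bot auth proposal; all values are placeholders, not a real signature:

```
GET /article HTTP/1.1
Host: example.com
Signature-Agent: "https://bot-operator.example"
Signature-Input: sig1=("@authority" "signature-agent");created=1735689600;keyid="...";tag="web-bot-auth"
Signature: sig1=:...:
```

A receiving site can resolve the operator's published public key and verify the signature before deciding how to treat the request, rather than trusting the user-agent string alone.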

Appendix

Methodology

We Want Your Feedback

We're constantly working to make our analyses as helpful as possible for the web community. If you have questions, suggestions, requests, or would like to discuss these findings, please reach out to us. If you want these insights for your own website, simply sign up and connect your website.