Every AI agent and crawler touching your storefront.
A merchant-facing reference for the bots, indexers, and autonomous shopping agents that fetch your product pages, train on your copy, or attempt checkout on behalf of a human. Use it to tune robots.txt, recognize legitimate traffic in your logs, and decide who deserves a verified pass into your checkout flow.
User-agent strings change. Operators rotate IPs. Treat this as a starting map, not a firewall rule — Cartograph adds the cryptographic verification layer on top.
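To recognize this traffic in your logs, a first pass can bucket user-agent strings by the tokens listed in this directory. The sketch below is a minimal illustration, not a Cartograph feature: the function names `classify_user_agent` and `tally` are hypothetical, and the token list is copied from the entries in this reference. Google-Extended and Applebot-Extended are left out because they are control tokens rather than UA strings, and Operator is left out because it has no distinct UA. A match only tells you what a client claims to be.

```python
from collections import Counter
from typing import Iterable, Optional

# Tokens and categories copied from this directory; extend as operators add agents.
AGENT_CATEGORIES = {
    "OAI-SearchBot": "AI search index",
    "PerplexityBot": "AI search index",
    "Amazonbot": "AI search index",
    "GPTBot": "Model training",
    "ClaudeBot": "Model training",
    "Meta-ExternalAgent": "Model training",
    "CCBot": "Model training",
    "Bytespider": "Model training",
    "ChatGPT-User": "Shopping / task agent",
    "Claude-Web": "Shopping / task agent",
    "anthropic-ai": "Shopping / task agent",
    "Perplexity-User": "Shopping / task agent",
}


def classify_user_agent(user_agent: str) -> Optional[str]:
    """Return the directory category a UA string claims, or None if no token matches."""
    ua = user_agent.lower()
    for token, category in AGENT_CATEGORIES.items():
        if token.lower() in ua:
            return category
    return None


def tally(user_agents: Iterable[str]) -> Counter:
    """Count requests per claimed category; unmatched strings go to 'Unclassified'."""
    counts = Counter()
    for ua in user_agents:
        counts[classify_user_agent(ua) or "Unclassified"] += 1
    return counts
```

Feeding `tally` the user-agent column of an access log gives a rough breakdown of claimed agent traffic by category. Treat it as a starting estimate, since the strings are self-reported.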
Tracked agents: 14+
Operators: 9
Honor robots.txt: 11 / 14
AI search index (3 agents)

OAI-SearchBot
OpenAI
Builds OpenAI's search index for ChatGPT search answers.
User-agent: OAI-SearchBot/1.x
Honors robots.txt: ✓

PerplexityBot
Perplexity
Builds Perplexity's answer index and citation graph.
User-agent: PerplexityBot/1.0
Honors robots.txt: ✓

Amazonbot
Amazon
Indexes for Alexa and Amazon's product knowledge graph.
User-agent: Amazonbot/0.1
Honors robots.txt: ✓
Model training (7 agents)

GPTBot
OpenAI
Crawls public pages to train OpenAI's foundation models.
User-agent: GPTBot/1.x
Honors robots.txt: ✓

ClaudeBot
Anthropic
Trains Anthropic's Claude family of models.
User-agent: ClaudeBot/1.x
Honors robots.txt: ✓

Google-Extended
Google
Opt-out token Google honors when training Gemini and Vertex AI models on your content.
Token: Google-Extended (control token, not a UA string)
Honors robots.txt: ✓

Applebot-Extended
Apple
Opt-out token for Apple Intelligence training. Applebot itself still indexes for Siri/Spotlight.
Token: Applebot-Extended (control token)
Honors robots.txt: ✓

Meta-ExternalAgent
Meta
Crawls public pages to train Meta's Llama and related models.
User-agent: Meta-ExternalAgent/1.x
Honors robots.txt: ✓

CCBot
Common Crawl
Builds the Common Crawl corpus, a primary training set for most open LLMs.
User-agent: CCBot/2.0
Honors robots.txt: ✓

Bytespider
ByteDance
Trains ByteDance's Doubao/Cici models. Known for aggressive crawl rates.
User-agent: Bytespider
Honors robots.txt: ✗
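As one way to act on the model-training entries, the sketch below generates a robots.txt that disallows the seven training tokens listed in this section while leaving every other agent allowed. It then checks the result with Python's standard `urllib.robotparser`. The helper names `build_robots_txt` and `parse_robots` and the `shop.example` URL are illustrative assumptions, and blocking all training crawlers is only one possible policy.

```python
from urllib.robotparser import RobotFileParser

# Training tokens copied from the "Model training" entries in this directory.
TRAINING_TOKENS = [
    "GPTBot",
    "ClaudeBot",
    "Google-Extended",
    "Applebot-Extended",
    "Meta-ExternalAgent",
    "CCBot",
    "Bytespider",
]


def build_robots_txt(blocked_tokens):
    """Emit one Disallow group per blocked token, then a catch-all Allow group."""
    groups = [f"User-agent: {token}\nDisallow: /" for token in blocked_tokens]
    groups.append("User-agent: *\nAllow: /")
    return "\n\n".join(groups) + "\n"


def parse_robots(robots_txt):
    """Load robots.txt text into the standard-library parser for local checks."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


robots_txt = build_robots_txt(TRAINING_TOKENS)
parser = parse_robots(robots_txt)

# Hypothetical product URL, used only to exercise the rules.
product_url = "https://shop.example/products/1"
gptbot_allowed = parser.can_fetch("GPTBot", product_url)
searchbot_allowed = parser.can_fetch("OAI-SearchBot", product_url)
```

With this policy `gptbot_allowed` is false and `searchbot_allowed` is true, so search indexing continues while training crawls are declined. A robots.txt rule only binds agents that honor it, so any agent this directory marks with ✗ needs enforcement elsewhere, such as rate limiting.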
Shopping / task agent (4 agents)

ChatGPT-User
OpenAI
On-demand fetch when a ChatGPT user (or agent) follows a link, including shopping flows.
User-agent: ChatGPT-User/1.x
Honors robots.txt: ✓

Claude-Web / anthropic-ai
Anthropic
On-demand fetches initiated by Claude users, including agentic checkout tasks.
User-agent: Claude-Web/1.0 · anthropic-ai/1.0
Honors robots.txt: ✓

Perplexity-User
Perplexity
On-demand fetch for a Perplexity user following a citation or product link.
User-agent: Perplexity-User/1.0
Honors robots.txt: ✗

Operator (browser agent)
OpenAI
Headful browser agent that fills carts and completes checkouts on behalf of a user.
User-agent: Mozilla/5.0 … (no distinct UA; drives a real browser)
Honors robots.txt: ✗
User-agent strings aren't proof
Any client can claim to be GPTBot or ChatGPT-User. Operators publish IP ranges or reverse-DNS hostnames and, increasingly, signed request headers, but checking requests against them is on you. Cartograph's evidence layer verifies agent identity at the edge and gives you an auditable log of who actually touched the storefront, so you can let verified agents through and quietly rate-limit the rest.
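To make the spoofing problem concrete, the sketch below checks a claimed agent against a table of operator IP ranges using Python's standard `ipaddress` module. It is a simplified illustration, not how Cartograph's verification works. The CIDR blocks are placeholders from the reserved documentation ranges and are not real operator ranges. The names `PUBLISHED_RANGES` and `verify_request` are hypothetical. A real deployment would load each operator's current published list and would also check signed headers where they are available.

```python
import ipaddress

# PLACEHOLDER ranges from reserved documentation blocks, NOT real operator ranges.
# In practice, load each operator's currently published list instead.
PUBLISHED_RANGES = {
    "GPTBot": ["192.0.2.0/24"],
    "PerplexityBot": ["198.51.100.0/24"],
}


def verify_request(user_agent, remote_ip, ranges=PUBLISHED_RANGES):
    """Classify a request as 'verified', 'spoofed', or 'unclaimed'.

    'unclaimed' means the UA string names no agent we hold ranges for.
    """
    ua = user_agent.lower()
    claimed = next((token for token in ranges if token.lower() in ua), None)
    if claimed is None:
        return "unclaimed"

    addr = ipaddress.ip_address(remote_ip)
    for cidr in ranges[claimed]:
        if addr in ipaddress.ip_network(cidr):
            return "verified"
    # The UA names a known agent, but the source IP is outside its ranges.
    return "spoofed"
```

A request that claims to be GPTBot from outside the configured ranges comes back as `"spoofed"`. That result is a reasonable signal for rate limiting rather than a hard block, because published ranges can lag behind operator changes.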