AI Crawlability Tester
Check if AI search crawlers can access your website. Test your robots.txt and llms.txt configuration against GPTBot, PerplexityBot, ClaudeBot, and more in seconds.
Understanding AI Crawlers
AI search engines use specialized web crawlers to discover and index your content. If your website blocks these crawlers, your content cannot appear in AI search results — regardless of how well it is optimized. Here is what each major AI crawler does and how to configure access.
GPTBot
OpenAI's general-purpose web crawler. GPTBot crawls websites to collect content for model training and other OpenAI products. ChatGPT Search retrieval is handled primarily by the separate OAI-SearchBot, described next; allowing GPTBot additionally permits training use.
User-agent: GPTBot
Allow: /
OAI-SearchBot
OpenAI's search-specific crawler. Unlike GPTBot, OAI-SearchBot is used exclusively for ChatGPT Search and does not contribute to model training. If you want ChatGPT Search visibility without training contribution, allow this bot and block GPTBot.
User-agent: OAI-SearchBot
Allow: /
PerplexityBot
Perplexity AI's web crawler. Perplexity is the second-largest AI-native search platform and is known for providing more source citations per response than any other AI search engine (an average of 6.2 citations per response).
User-agent: PerplexityBot
Allow: /
ClaudeBot
Anthropic's web crawler for Claude. ClaudeBot indexes content for Claude's web search capabilities and training data. Allowing ClaudeBot ensures your content is available to Claude users when they search for topics you cover.
User-agent: ClaudeBot
Allow: /
Google-Extended
Google's AI-specific control. Google-Extended is a robots.txt token, separate from Googlebot, that governs whether your content is used for Gemini model training and grounding. Blocking Google-Extended does not affect your traditional Google Search rankings.
User-agent: Google-Extended
Allow: /
Amazonbot
Amazon's web crawler used for Alexa, Amazon search, and AI assistant features. Allowing Amazonbot ensures your content can be surfaced through Amazon's ecosystem, including Alexa voice responses and Amazon product search.
User-agent: Amazonbot
Allow: /
How to Configure Your robots.txt for AI Crawlers
Your robots.txt file is the primary mechanism for controlling AI crawler access. It is a plain text file located at your domain root (e.g., https://yourdomain.com/robots.txt). Here is a recommended configuration that allows all major AI crawlers:
# Traditional search engines
User-agent: Googlebot
Allow: /
User-agent: Bingbot
Allow: /
# AI Search Crawlers
User-agent: GPTBot
Allow: /
User-agent: OAI-SearchBot
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: Google-Extended
Allow: /
User-agent: Amazonbot
Allow: /
# Sitemap
Sitemap: https://yourdomain.com/sitemap.xml
Important
If you want to appear in ChatGPT Search but do not want your content used for AI model training, allow OAI-SearchBot and block GPTBot. OAI-SearchBot is used only for search, while GPTBot is used for both training and search.
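You can verify this kind of split configuration programmatically. The sketch below uses Python's standard-library robots.txt parser against an inline ruleset that blocks GPTBot while allowing OAI-SearchBot (the domain and rules are illustrative). Note that with `urllib.robotparser`, a user agent with no matching group and no `*` group defaults to allowed:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: search visibility without training contribution.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# can_fetch() matches the user-agent token case-insensitively; agents
# with no matching group (and no "*" group) fall back to allowed.
for agent in ("GPTBot", "OAI-SearchBot", "PerplexityBot"):
    verdict = "allowed" if parser.can_fetch(agent, "https://yourdomain.com/blog/post") else "blocked"
    print(f"{agent}: {verdict}")
```

Running the same check against your live site only requires swapping the inline rules for `parser.set_url("https://yourdomain.com/robots.txt")` followed by `parser.read()`.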
What Is llms.txt?
The llms.txt file is a proposed standard that provides AI-readable information about your website. While robots.txt tells AI crawlers what they can access, llms.txt tells them what your site is about, which pages are most important, and how to understand your content structure.
Think of llms.txt as a site manifest for AI engines. It sits at your domain root (e.g., https://yourdomain.com/llms.txt) and uses a simple Markdown-like format. Early data suggests that sites with llms.txt files receive marginally more AI engagement, though the specification is still emerging.
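Under the proposed llms.txt format, a minimal file might look like the following (the site name, section headings, and URLs are illustrative): an H1 title, a blockquote summary, and H2 sections listing important links with short descriptions.

```
# Your Site Name

> One-sentence summary of what this site covers and who it is for.

## Key Pages

- [Docs](https://yourdomain.com/docs): Product documentation
- [Pricing](https://yourdomain.com/pricing): Plans and billing details

## Optional

- [Blog](https://yourdomain.com/blog): Articles and updates
```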
Use our free llms.txt generator to create yours in minutes.
Why AI Crawlability Matters
Visibility in AI Search
If AI crawlers cannot access your site, your content will never appear in ChatGPT Search, Perplexity, or other AI search results. Crawlability is the foundation of AI search visibility.
Growing Traffic Source
AI search handles 8.1% of all web queries in 2026 and is growing rapidly. Sites that allow AI crawlers now are building visibility in a channel that will only become more important.
Control Over Access
Robots.txt gives you granular control. You can allow search crawlers while blocking training crawlers, or allow access to certain directories while restricting others.
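As a sketch of that granularity (directory names are illustrative), the following allows OpenAI's search crawler everywhere, blocks its training crawler entirely, and keeps PerplexityBot out of one directory while allowing the rest of the site:

```
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /internal/
Allow: /
```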
Want a Complete AI Search Audit?
Crawlability is just one part of AI search optimization. Run a full AEO audit to check your content structure, schema markup, and overall citation readiness.