Free Tool — No Signup Required

AI Crawlability Tester

Check if AI search crawlers can access your website. Test your robots.txt and llms.txt configuration against GPTBot, PerplexityBot, ClaudeBot, and more in seconds.

Understanding AI Crawlers

AI search engines use specialized web crawlers to discover and index your content. If your website blocks these crawlers, your content cannot appear in AI search results — regardless of how well it is optimized. Here is what each major AI crawler does and how to configure access.

GPTBot

OpenAI's general-purpose web crawler. GPTBot collects content used to train OpenAI's models. Blocking it keeps your content out of future model training; visibility in ChatGPT Search is governed primarily by OAI-SearchBot (below).

User-agent: GPTBot
Allow: /

OAI-SearchBot

OpenAI's search-specific crawler. Unlike GPTBot, OAI-SearchBot is used exclusively for ChatGPT Search and does not contribute to model training. If you want ChatGPT Search visibility without training contribution, allow this bot and block GPTBot.

User-agent: OAI-SearchBot
Allow: /

PerplexityBot

Perplexity AI's web crawler. Perplexity is the second-largest AI-native search platform and is known for citing more sources per response than any other AI search engine (an average of 6.2).

User-agent: PerplexityBot
Allow: /

ClaudeBot

Anthropic's web crawler for Claude. ClaudeBot indexes content for Claude's web search capabilities and training data. Allowing ClaudeBot ensures your content is available to Claude users when they search for topics you cover.

User-agent: ClaudeBot
Allow: /

Google-Extended

Google's AI-specific control token. Google-Extended is not a separate crawler — standard Googlebot does the crawling, and a Google-Extended rule in robots.txt controls whether your content can be used for training and grounding Gemini models. Blocking Google-Extended does not affect your traditional Google Search rankings, and AI Overviews are governed by Googlebot rather than Google-Extended.

User-agent: Google-Extended
Allow: /

Amazonbot

Amazon's web crawler used for Alexa, Amazon search, and AI assistant features. Allowing Amazonbot ensures your content can be surfaced through Amazon's ecosystem, including Alexa voice responses and Amazon product search.

User-agent: Amazonbot
Allow: /

How to Configure Your robots.txt for AI Crawlers

Your robots.txt file is the primary mechanism for controlling AI crawler access. It is a plain text file located at your domain root (e.g., https://yourdomain.com/robots.txt). Here is a recommended configuration that allows all major AI crawlers:

# Traditional search engines
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

# AI Search Crawlers
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Amazonbot
Allow: /

# Sitemap
Sitemap: https://yourdomain.com/sitemap.xml
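Before deploying a robots.txt file, you can simulate how each crawler will interpret it with Python's standard-library urllib.robotparser. This is a minimal sketch; the mixed rules below are a hypothetical example (search allowed, training blocked), not the recommended configuration above:

```python
import urllib.robotparser

# Hypothetical robots.txt: block the training crawler, allow everything else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: *
Allow: /
"""

AI_CRAWLERS = ["GPTBot", "OAI-SearchBot", "PerplexityBot", "ClaudeBot"]

def check_access(robots_txt: str, url: str, agents: list[str]) -> dict[str, bool]:
    """Return, per user agent, whether robots_txt permits fetching url."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}

results = check_access(ROBOTS_TXT, "https://example.com/blog/post", AI_CRAWLERS)
for agent, allowed in results.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

With these rules, GPTBot is reported as blocked while the search crawlers remain allowed. Note that stdlib parsing is a useful sanity check, but individual crawlers may interpret edge cases (wildcards, precedence) slightly differently.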

Important

If you want to appear in ChatGPT Search but do not want your content used for AI model training, allow OAI-SearchBot and block GPTBot. OAI-SearchBot is used only for search indexing, while GPTBot collects model training data.
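In robots.txt, that search-without-training split looks like this:

User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /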

What Is llms.txt?

The llms.txt file is a proposed standard that provides AI-readable information about your website. While robots.txt tells AI crawlers what they can access, llms.txt tells them what your site is about, which pages are most important, and how to understand your content structure.

Think of llms.txt as a site manifest for AI engines. It sits at your domain root (e.g., https://yourdomain.com/llms.txt) and uses a simple Markdown-like format. Early data suggests that sites with llms.txt files receive marginally more AI engagement, though the specification is still emerging.
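A minimal llms.txt following the proposed Markdown-like format might look like this (the site name, summary, and URLs below are placeholders for illustration):

# Your Company

> One-sentence summary of what your site offers and who it is for.

## Key Pages

- [AI Crawlability Tester](https://yourdomain.com/tools/crawlability-tester): Check robots.txt and llms.txt configuration
- [Blog](https://yourdomain.com/blog): Guides to AI search optimization

## Optional

- [Changelog](https://yourdomain.com/changelog): Product release notes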

Use our free llms.txt generator to create yours in minutes.

Why AI Crawlability Matters

Visibility in AI Search

If AI crawlers cannot access your site, your content will never appear in ChatGPT Search, Perplexity, or other AI search results. Crawlability is the foundation of AI search visibility.

Growing Traffic Source

AI search handles 8.1% of all web queries in 2026 and is growing rapidly. Sites that allow AI crawlers now are building visibility in a channel that will only become more important.

Control Over Access

Robots.txt gives you granular control. You can allow search crawlers while blocking training crawlers, or allow access to certain directories while restricting others.
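For example, directory-level rules can scope a single crawler's access (the directory names here are hypothetical):

# Allow ClaudeBot everywhere except a private directory
User-agent: ClaudeBot
Disallow: /private/

# Restrict PerplexityBot to published articles only
User-agent: PerplexityBot
Allow: /blog/
Disallow: /

Most major crawlers resolve conflicts between Allow and Disallow by the most specific (longest) matching rule, so in the second block /blog/ pages stay crawlable while everything else is off-limits to PerplexityBot.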

Want a Complete AI Search Audit?

Crawlability is just one part of AI search optimization. Run a full AEO audit to check your content structure, schema markup, and overall citation readiness.