A16 · Access & Crawlability

AI Crawlers Allowed

TL;DR

Your robots.txt may be blocking AI crawlers such as OAI-SearchBot and GPTBot. Add explicit Allow rules so your content can appear in AI-powered search results. Follow the steps below to diagnose, fix, and verify.

Why this matters

AI crawlers like OAI-SearchBot power ChatGPT Search. If blocked, your content won’t appear in AI search results regardless of how good it is. This is a sitewide configuration that affects all pages.

Where this shows up in Oversearch

In Oversearch, open AI Page Optimizer and run a scan. Check Benchmark Breakdown for the “AI crawlers allowed in robots.txt” benchmark to see the status of OAI-SearchBot and GPTBot.

What is OAI-SearchBot and why should I allow it?

OAI-SearchBot is OpenAI’s web crawler specifically for ChatGPT Search. When users ask ChatGPT questions that require current web information, OAI-SearchBot’s index is used to find relevant content.

If you want your content to appear when users search via ChatGPT, you must allow OAI-SearchBot access to your site.

  • OAI-SearchBot powers ChatGPT Search results.
  • Blocking it means zero visibility in ChatGPT Search.
  • It respects robots.txt rules strictly.
  • Allowing it does not mean your content is used for training.

If you use Oversearch, open AI Page Optimizer → Benchmark Breakdown to see the OAI-SearchBot status.

What is GPTBot and how is it different from OAI-SearchBot?

GPTBot is OpenAI’s crawler for training and improving AI models. It is separate from OAI-SearchBot, which is used only for search.

You can allow one while blocking the other depending on your preferences:

  • GPTBot: Used for AI model training and improvement.
  • OAI-SearchBot: Used for ChatGPT Search results only.
  • Blocking GPTBot does not affect ChatGPT Search visibility.
  • Blocking OAI-SearchBot does not affect model training.

If you use Oversearch, open AI Page Optimizer → Benchmark Breakdown to see both crawler statuses separately.

How do I check if AI crawlers are blocked?

Check your robots.txt file at https://yoursite.com/robots.txt for rules affecting OAI-SearchBot and GPTBot.

Look for explicit blocks or wildcard rules that might prevent access:

  • Check for User-agent: OAI-SearchBot with Disallow: /
  • Check for User-agent: GPTBot with Disallow: /
  • Check if User-agent: * has Disallow: / (blocks all including AI)
  • No explicit rule means the crawler inherits from wildcard (*).

If you use Oversearch, open AI Page Optimizer → Benchmark Breakdown for automatic detection, or use the sketch below for a quick manual check.
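
For a quick manual check outside Oversearch, the sketch below uses Python’s standard urllib.robotparser module to evaluate your live robots.txt for both crawlers. It’s a minimal sketch, assuming your site is reachable at https://yoursite.com (replace with your own domain); it only reports what robots.txt permits, not whether the crawlers have actually visited.

# Check whether OpenAI's crawlers are permitted by your live robots.txt.
# Assumes your site is reachable at https://yoursite.com (replace as needed).
from urllib.robotparser import RobotFileParser

SITE = "https://yoursite.com"

parser = RobotFileParser()
parser.set_url(f"{SITE}/robots.txt")
parser.read()  # fetches and parses the live robots.txt

for bot in ("OAI-SearchBot", "GPTBot"):
    allowed = parser.can_fetch(bot, f"{SITE}/")
    print(f"{bot}: {'allowed' if allowed else 'blocked'} at {SITE}/")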

How do I add explicit Allow rules for OpenAI crawlers?

Add User-agent stanzas for each AI crawler at the top of your robots.txt file, before any wildcard rules.

Here’s the recommended configuration for full AI search visibility:

# Allow OpenAI Search Bot (ChatGPT Search)
User-agent: OAI-SearchBot
Allow: /

# Allow GPTBot (OpenAI model improvement)
User-agent: GPTBot
Allow: /

# General crawlers
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /api/

If you want search visibility but not training, use this instead:

# Allow OpenAI Search Bot (ChatGPT Search)
User-agent: OAI-SearchBot
Allow: /

# Block GPTBot (no model training)
User-agent: GPTBot
Disallow: /

# General crawlers
User-agent: *
Allow: /

If you use Oversearch, run AI Page Optimizer after changes to verify the fix.
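
Before deploying, you can also sanity-check the edited file locally. The sketch below is a minimal example, assuming the edited copy is saved as robots.txt in the current directory; it parses the file with Python’s urllib.robotparser and prints the effective access for each crawler, so you can confirm the search-only configuration really allows OAI-SearchBot and blocks GPTBot. SomeOtherBot is a hypothetical name standing in for any crawler without its own stanza.

# Sanity-check a local robots.txt before deploying it.
# Assumes the edited file is saved as ./robots.txt (adjust the path as needed).
from urllib.robotparser import RobotFileParser

with open("robots.txt", encoding="utf-8") as fh:
    lines = fh.read().splitlines()

parser = RobotFileParser()
parser.parse(lines)  # parse the local copy instead of fetching over HTTP

# "SomeOtherBot" is a made-up name: it matches no explicit stanza, so it shows
# what any crawler without its own rules inherits from the wildcard (*) group.
for bot in ("OAI-SearchBot", "GPTBot", "SomeOtherBot"):
    status = "allowed" if parser.can_fetch(bot, "/") else "blocked"
    print(f"{bot}: {status} for /")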

What does allowed vs blocked vs partial mean?

Oversearch reports three possible statuses for each AI crawler:

  • Allowed: Crawler has full access to your site (explicit or via wildcard).
  • Blocked: Crawler is completely blocked via Disallow: /.
  • Partial: Some paths are blocked, others allowed.

Partial access may be intentional (for example, blocking /admin/) or accidental. Review your Disallow rules if you see a partial status.

If you use Oversearch, open AI Page Optimizer → Benchmark Breakdown to see detailed evidence.
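
The sketch below illustrates one way to approximate these three statuses by probing a few representative paths with Python’s urllib.robotparser. It is not Oversearch’s actual classification logic, and the sample paths (/, /blog/, /admin/) are placeholders; substitute paths that exist on your site.

# Rough approximation of the allowed / blocked / partial statuses by probing
# a handful of paths. Not Oversearch's exact logic; the paths are placeholders.
from urllib.robotparser import RobotFileParser

SITE = "https://yoursite.com"              # replace with your domain
SAMPLE_PATHS = ["/", "/blog/", "/admin/"]  # placeholder paths to probe

parser = RobotFileParser()
parser.set_url(f"{SITE}/robots.txt")
parser.read()

for bot in ("OAI-SearchBot", "GPTBot"):
    results = [parser.can_fetch(bot, f"{SITE}{path}") for path in SAMPLE_PATHS]
    if all(results):
        status = "allowed"
    elif not any(results):
        status = "blocked"
    else:
        status = "partial"
    print(f"{bot}: {status} ({sum(results)}/{len(results)} sample paths allowed)")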

Common root causes

  • Wildcard User-agent: * with Disallow: / blocking all crawlers.
  • Legacy robots.txt from the pre-AI era without explicit AI crawler rules.
  • Security-focused robots.txt that blocks unknown user agents.
  • CMS or hosting platform defaults that are overly restrictive.

How to detect

  • In Oversearch AI Page Optimizer, check the “AI crawlers allowed” benchmark.
  • Fetch robots.txt directly: curl https://yoursite.com/robots.txt
  • Search for OAI-SearchBot and GPTBot stanzas.
  • Check if wildcard rules might be blocking AI crawlers (see the sketch below).
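
If you prefer inspecting the raw file to evaluating access, the sketch below fetches robots.txt and reports whether each OpenAI crawler has its own User-agent stanza or will fall back to the wildcard group. It’s a rough sketch; replace https://yoursite.com with your own domain.

# Report whether robots.txt names each OpenAI crawler explicitly or leaves it
# to inherit from the wildcard (*) group. Replace the domain before running.
from urllib.request import urlopen

SITE = "https://yoursite.com"

with urlopen(f"{SITE}/robots.txt") as resp:
    text = resp.read().decode("utf-8", errors="replace")

declared = [
    line.split(":", 1)[1].strip().lower()
    for line in text.splitlines()
    if line.strip().lower().startswith("user-agent:")
]

for bot in ("OAI-SearchBot", "GPTBot"):
    if bot.lower() in declared:
        print(f"{bot}: explicit stanza found")
    elif "*" in declared:
        print(f"{bot}: no explicit stanza; inherits from the wildcard (*) group")
    else:
        print(f"{bot}: no explicit stanza and no wildcard group found")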

How to fix

  1. Locate your robots.txt file (served from your site root; often stored as /public/robots.txt in your project).
  2. Add explicit User-agent stanzas for OAI-SearchBot and GPTBot.
  3. Place these stanzas before any wildcard (*) rules.
  4. Use Allow: / for full access or specific Allow/Disallow rules as needed.
  5. Deploy the updated robots.txt.
  6. Verify with Oversearch AI Page Optimizer scan.

Verify the fix

  • Run an Oversearch AI Page Optimizer scan and confirm AI crawler benchmarks pass.
  • Fetch robots.txt and verify the new rules are present.
  • Check that AI crawler stanzas appear before wildcard rules.
  • Allow 24-48 hours for crawlers to see the updated robots.txt.

Prevention

  • Include AI crawler rules in your robots.txt template from the start.
  • Review robots.txt when updating security or access policies.
  • Add robots.txt validation to your deployment checklist.
  • Monitor Oversearch AI Page Optimizer for crawler access issues.

FAQ

What is OAI-SearchBot and why should I allow it?

OAI-SearchBot is OpenAI’s web crawler for ChatGPT Search. If you want your content to appear in ChatGPT Search results, you must allow this crawler in robots.txt. When in doubt, add an explicit Allow rule for OAI-SearchBot.

What is GPTBot and how is it different from OAI-SearchBot?

GPTBot is OpenAI’s crawler for training and improving AI models. OAI-SearchBot is for ChatGPT Search results. You can allow one while blocking the other. When in doubt, allow OAI-SearchBot for search visibility and decide separately on GPTBot based on your training data preferences.

Can I allow ChatGPT Search but block AI training?

Yes. Allow OAI-SearchBot and block GPTBot. This lets your content appear in ChatGPT Search results without being used to train AI models. When in doubt, this is a reasonable middle-ground approach.

Do AI crawlers inherit from wildcard (*) rules?

Yes. If no explicit User-agent stanza exists for OAI-SearchBot or GPTBot, they inherit from the wildcard (*) rules. If your wildcard blocks all crawlers, AI crawlers are blocked too. When in doubt, add explicit rules for AI crawlers above your wildcard rules.
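
As a small demonstration of this inheritance, the sketch below parses two hypothetical robots.txt snippets with Python’s urllib.robotparser, which applies the same convention of using the most specific matching User-agent group. How OpenAI’s crawlers interpret your file in practice is, of course, up to OpenAI.

# Demonstrate wildcard inheritance vs. an explicit stanza using two
# hypothetical robots.txt snippets (parsed locally, nothing is fetched).
from urllib.robotparser import RobotFileParser

wildcard_only = """\
User-agent: *
Disallow: /
"""

with_explicit_rule = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: *
Disallow: /
"""

for label, text in (("wildcard only", wildcard_only),
                    ("explicit OAI-SearchBot stanza", with_explicit_rule)):
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    allowed = parser.can_fetch("OAI-SearchBot", "/")
    print(f"{label}: OAI-SearchBot {'allowed' if allowed else 'blocked'} on /")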

Where should AI crawler rules go in robots.txt?

Place specific AI crawler rules before wildcard (*) rules. More specific User-agent rules take precedence. When in doubt, put AI crawler stanzas at the top of your robots.txt file.

How do I verify OpenAI crawler identity?

OpenAI publishes IP ranges for their crawlers at platform.openai.com/docs/bots. You can verify requests come from these IPs. When in doubt, trust robots.txt compliance first and add IP verification only if needed for security.