# Fairvisor vs. robots.txt
## The Situation
robots.txt has been the standard for crawler control since 1994. It was designed for a world where crawlers were polite and few. That world is gone.
AI training crawlers — GPTBot, ClaudeBot, Bytespider, Meta-ExternalAgent — often ignore robots.txt directives. Even when they comply, robots.txt is binary: allow or disallow. No rate limiting. No analytics. No enforcement.
## Comparison
| Capability | robots.txt | Fairvisor |
|---|---|---|
| Enforcement | Honor system — crawler decides whether to comply | Inline enforcement at the edge for traffic routed through Fairvisor |
| Granularity | Allow/disallow by path | Per-bot/per-category/per-network limits (ua:bot_category, ip:type, ip:asn) with staged actions |
| Rate limiting | No — it’s all-or-nothing | Token bucket with configurable RPS and burst |
| Staged response | No | Warn → throttle → reject (gradual backpressure) |
| Analytics | Server logs (if you parse them) | Real-time dashboard: volume, bandwidth, trends by bot |
| Safe rollout | No way to test | Shadow mode — measure before enforcing |
| Latency profile | Static file fetch | Additional decision step in edge request path |
| Maintenance | Edit a text file | JSON policy, version-controlled |
| Cost | Free | Free (OSS edge) or SaaS plan for analytics |
| SEO impact | Can accidentally block search engines | Targets AI crawlers specifically, search engines untouched |
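To make the granularity row concrete, here is a sketch of what such a policy file might look like. The selector keys (`ua:bot_category`, `ip:asn`) and the staged actions come from the table above; every other field name, the ASN value (a private-range placeholder), and the overall schema are hypothetical illustrations, not Fairvisor's actual format:

```json
{
  "mode": "shadow",
  "rules": [
    {
      "match": { "ua:bot_category": "ai_training" },
      "limit": { "rps": 10, "burst": 20 },
      "actions": ["warn", "throttle", "reject"]
    },
    {
      "match": { "ip:asn": 64512 },
      "limit": { "rps": 2, "burst": 5 },
      "actions": ["reject"]
    }
  ]
}
```

Because the policy is plain JSON, it can live in version control and be reviewed like any other change, and a top-level shadow/enforce switch is what makes safe rollout possible.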
## When robots.txt Is Enough
- Your site has low traffic and crawler cost is negligible
- You only need to signal intent to compliant crawlers (Google, Bing)
- You don’t need analytics on crawler behavior
## When You Need Fairvisor
- AI crawlers are a measurable share of your traffic (>10%)
- Your bandwidth bill is growing due to crawler activity
- You need rate limiting, not just allow/disallow
- You need visibility: which bots, how much, which endpoints
- You want to test before enforcing (shadow mode)
- robots.txt compliance is not enough; you need enforcement
## Use Them Together
Fairvisor doesn’t replace robots.txt — it complements it. Keep your robots.txt for search engine guidance. Use Fairvisor for AI crawler enforcement.
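For the signaling side, a robots.txt that addresses the AI crawlers named above while leaving search engines alone might look like this (remembering that compliance is entirely voluntary):

```
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: Googlebot
Allow: /
```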
robots.txt = "Please don't scrape too fast"
Fairvisor = "You will not scrape faster than 10 RPS"
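The "10 RPS" guarantee rests on the token-bucket scheme from the comparison table: each bot gets a bucket that refills at a configured rate up to a burst capacity, and a request passes only if a token is available. A minimal sketch of the idea (this is an illustration of the general algorithm, not Fairvisor's implementation; the clock is passed in explicitly to keep the example deterministic):

```python
class TokenBucket:
    """Token bucket: refills at `rps` tokens/sec, capped at `burst`."""

    def __init__(self, rps: float, burst: int, now: float = 0.0):
        self.rps = rps
        self.capacity = burst
        self.tokens = float(burst)  # start full: an idle bot may burst
        self.last = now

    def allow(self, now: float) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rps)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True  # request passes, one token consumed
        return False     # bucket empty: throttle or reject

bucket = TokenBucket(rps=10, burst=20)
results = [bucket.allow(0.0) for _ in range(21)]
print(results.count(True))   # → 20: the burst passes, the 21st is rejected
print(bucket.allow(0.1))     # → True: 100 ms refills one token at 10 RPS
```

The staged response maps naturally onto the same counter: warn when tokens run low, throttle when the bucket is empty, and reject sustained over-rate traffic.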