Enforcement at microsecond speed
Fairvisor is built for one thing: fast decisions under real production load, with sub-millisecond targets in documented deployment patterns.
What Fairvisor Does for SREs
Fast by Design
Fairvisor evaluates allow/throttle/reject in-process and in-memory. Optimized for hot-path performance first. → Performance tuning
Microsecond-Class Decision Time
Docs and benchmarks show a microsecond-class decision path with sub-millisecond targets in typical deployments. Validate latency in your own traffic profile and gateway topology.
Predictable Under Load
Counters and policy checks stay local to the edge process. Stable latency characteristics even during bursts. No remote calls on the hot path.
Policy Propagation Without Hot-Path Penalty
Policies sync asynchronously. Data-plane requests are never blocked waiting for a control-plane response.
Fail-Open by Default
If policy data is temporarily unavailable or stale, traffic is allowed by default with explicit telemetry. Enforcement never becomes a hard outage trigger.
Graceful Degradation
No cliff, no thundering herd. Controlled backpressure at 80% (warning header), 95% (throttle with 200–500ms delay), 100% (reject with Retry-After + jitter).
Decision Tracing from 429 to Root Cause
Reject responses include reason/retry metadata. For policy/rule attribution, use debug session headers (X-Fairvisor-Debug-*). → Decision tracing
Prometheus Metrics Out of the Box
fairvisor_decisions_total, fairvisor_decision_duration_seconds, fairvisor_config_version and related metrics are exposed via /metrics. Prometheus scrape/forwarding setup remains part of your infra config. → Metrics reference
Incident Runbook
What the first 10 minutes of a rate limiting incident look like with Fairvisor:
T+0 — Reject spike alert fires. fairvisor_decisions_total{action="reject"} crosses threshold.
T+1 — Check which route and limit key is triggering. fairvisor_decisions_total grouped by route and limit_key shows the source immediately.
T+2 — Pull decision trace for a sample 429. Use X-Fairvisor-Reason/Retry-After, then enable debug session headers for policy/rule attribution. → Debug session docs
T+5 — If abuse confirmed: activate kill-switch for the offending tenant. Propagation is designed to be fast and should be validated against your deployment.
T+10 — Incident contained. Audit log captures operator identity, action, and scope. → Kill-switch runbook
Total investigation time without Fairvisor: 20–40 minutes. With decision tracing: under 5 minutes.
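The T+1 triage step — grouping reject counts by route and limit key — can be sketched in Python over a sample of decision records. The record fields mirror the route/limit_key labels on fairvisor_decisions_total, but the values here are illustrative; in production this is the equivalent PromQL grouping:

```python
from collections import Counter

# Hypothetical sample of reject decisions, e.g. pulled from logs or a
# metrics snapshot; field names mirror the labels on
# fairvisor_decisions_total, values are made up for illustration.
rejects = [
    {"route": "/v1/search", "limit_key": "tenant:acme"},
    {"route": "/v1/search", "limit_key": "tenant:acme"},
    {"route": "/v1/search", "limit_key": "tenant:acme"},
    {"route": "/v1/orders", "limit_key": "tenant:globex"},
]

def top_offender(decisions):
    """Group reject counts by (route, limit_key) -- like
    `sum by (route, limit_key)` in PromQL -- and return the largest bucket."""
    counts = Counter((d["route"], d["limit_key"]) for d in decisions)
    return counts.most_common(1)[0]

(route, limit_key), count = top_offender(rejects)
print(route, limit_key, count)  # → /v1/search tenant:acme 3
```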
Who This Is For
- SREs and on-call engineers who own API reliability
- Platform engineers setting SLOs for shared rate limiting infrastructure
- DevOps teams deploying enforcement as a shared service
- Teams where enforcement latency affects production p99
FAQ
How much latency does Fairvisor add?
Fairvisor runs in-process and in-memory, with sub-millisecond targets in documented deployment patterns. Actual p95/p99 depends on gateway wiring, workload shape, and environment.
What happens if the policy control plane goes down?
Fail-open by default. If policy data is unavailable or stale, traffic is allowed through with explicit telemetry logged. Enforcement never becomes a hard outage trigger. You can configure fail-closed per route if your use case requires it.
How quickly do policy changes propagate to the edge?
Policy sync is asynchronous and designed for seconds-scale propagation in normal conditions. Validate propagation and alert thresholds in your own environment. → Performance tuning
What Prometheus metrics are available out of the box?
fairvisor_decisions_total (labeled by action, route, limit_key), fairvisor_decision_duration_seconds, fairvisor_config_version, fairvisor_loops_detected_total, fairvisor_circuit_breaker_trips_total and other counters/histograms via /metrics. Prometheus scrape wiring is configured in your stack. → Metrics reference
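To work with the /metrics endpoint directly, a few lines of stdlib Python can parse the Prometheus text exposition format. The metric names below come from the list above; the sample payload and label values are fabricated for illustration:

```python
import re

# Hypothetical /metrics snippet; metric names match the docs above,
# label values are illustrative.
METRICS_TEXT = """\
fairvisor_decisions_total{action="allow",route="/v1/search",limit_key="tenant:acme"} 9041
fairvisor_decisions_total{action="reject",route="/v1/search",limit_key="tenant:acme"} 317
fairvisor_config_version 42
"""

LINE_RE = re.compile(r'^(\w+)(?:\{([^}]*)\})?\s+(\S+)$')

def parse_metrics(text):
    """Parse Prometheus text-exposition lines into
    (metric name, frozenset of label pairs) -> float."""
    samples = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        name, raw_labels, value = m.groups()
        labels = frozenset(re.findall(r'(\w+)="([^"]*)"', raw_labels or ""))
        samples[(name, labels)] = float(value)
    return samples

samples = parse_metrics(METRICS_TEXT)
reject_total = sum(v for (name, labels), v in samples.items()
                   if name == "fairvisor_decisions_total"
                   and ("action", "reject") in labels)
print(reject_total)  # → 317.0
```

In practice you would scrape this via Prometheus rather than parsing by hand; the sketch only shows what the exposed samples look like.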
How does graceful degradation work?
No cliff, no thundering herd. At 80% of limit: warning header. At 95%: throttle with 200–500ms delay. At 100%: reject with Retry-After plus jitter. Jitter prevents the synchronized retry storm that would occur if every rejected client saw the same Retry-After value.
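The staged thresholds can be sketched as a single decision function. This is an illustrative model of the behavior described above, not Fairvisor's API — the function shape, return values, and the X-Fairvisor-Warning header name are assumptions:

```python
import random

# Sketch of staged degradation at 80% / 95% / 100% of a limit.
# Function signature and header names are illustrative assumptions.
def decide(utilization, base_retry_after=2.0):
    """Map current limit utilization (0.0-1.0+) to a staged action."""
    if utilization >= 1.0:
        # Reject, adding jitter to Retry-After so clients don't retry in sync.
        jitter = random.uniform(0.0, 1.0)
        return ("reject", {"Retry-After": round(base_retry_after + jitter, 2)})
    if utilization >= 0.95:
        # Throttle: delay the request 200-500ms before serving it.
        return ("throttle", {"delay_ms": random.randint(200, 500)})
    if utilization >= 0.80:
        # Still allow, but signal pressure via a warning header.
        return ("allow", {"X-Fairvisor-Warning": "approaching-limit"})
    return ("allow", {})
```

Because each rejected client gets a slightly different Retry-After, their retries spread out instead of landing in one synchronized wave.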
How do I trace why a specific request was rejected?
Start with reject headers (X-Fairvisor-Reason, Retry-After, RateLimit*). For policy/rule attribution, enable debug session headers (X-Fairvisor-Debug-*). → Decision tracing
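A minimal client-side triage helper for those headers might look like the sketch below. X-Fairvisor-Reason and Retry-After come from the docs; the RateLimit-Limit/RateLimit-Remaining names and all header values are example assumptions:

```python
# Sketch of turning a 429's headers into a triage summary.
# Header values below are fabricated; treat RateLimit-* names as assumptions.
def summarize_reject(status, headers):
    """Return a short triage dict for a 429 response, else None."""
    if status != 429:
        return None
    return {
        "reason": headers.get("X-Fairvisor-Reason", "unknown"),
        "retry_after_s": int(headers.get("Retry-After", "0")),
        "limit": headers.get("RateLimit-Limit"),
        "remaining": headers.get("RateLimit-Remaining"),
    }

sample_headers = {
    "X-Fairvisor-Reason": "limit_exceeded",
    "Retry-After": "3",
    "RateLimit-Limit": "100",
    "RateLimit-Remaining": "0",
}
print(summarize_reject(429, sample_headers))
```

For per-policy/per-rule attribution beyond this, the debug session headers (X-Fairvisor-Debug-*) are the next step, as noted above.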
What is the kill-switch and when should I use it?
The kill-switch blocks traffic for a specific scope (tenant, route, or descriptor value) and is intended for rapid incident containment. Use it when abuse is confirmed and verify propagation in your deployment runbook. → Kill-switch runbook
Can we scope limits per tenant without creating noisy-neighbor regressions?
Yes. Limits are keyed by tenant/user/route dimensions, so one tenant’s spike does not consume another tenant’s quota. This keeps enforcement isolation aligned with your SLO boundaries.
Why teams choose Fairvisor
100μs decisions that don't eat your latency budget
In-process, in-memory evaluation. Policy enforcement adds microseconds, not milliseconds. Never your bottleneck.
Controlled backpressure, not a cliff
Staged degradation at 80%, 95%, 100% prevents thundering herd on limit breach. Jitter on Retry-After prevents synchronized retries.
Trace from 429 to root cause with deterministic workflow
Reason/retry headers plus debug session attribution (X-Fairvisor-Debug-*) give an operator path from reject to policy/rule without blind log hunting.
Targets
| Metric | Target |
|---|---|
| Decision latency p50 | Microsecond-class target |
| Decision latency p99 | Sub-millisecond target (deployment-dependent) |
| Decision latency p99.9 | Low-millisecond target (deployment-dependent) |
| Bundle propagation | Seconds-scale target (deployment-dependent) |
| Kill-switch effect | Rapid containment target (deployment-dependent) |
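Since every latency target above is deployment-dependent, the docs recommend validating in your own environment. A generic stdlib harness for measuring an in-memory decision path's p50/p99 might look like this; the dummy decision function stands in for a real Fairvisor call, which this sketch does not have access to:

```python
import statistics
import time

def measure(decision_fn, iterations=100_000):
    """Time a hot-path decision function; report p50/p99 in microseconds."""
    samples_us = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        decision_fn()
        samples_us.append((time.perf_counter_ns() - start) / 1000)
    # statistics.quantiles with n=100 yields 99 cut points: index 49 is
    # the median (p50), index 98 approximates p99.
    q = statistics.quantiles(samples_us, n=100)
    return {"p50_us": q[49], "p99_us": q[98]}

counter = {"n": 0}
def dummy_decision():
    # Stand-in for an in-memory allow/throttle/reject check.
    counter["n"] += 1
    return counter["n"] % 100 != 0

report = measure(dummy_decision, iterations=10_000)
print(report)
```

Run it against real traffic shapes and your actual gateway wiring before treating any of the table's targets as met.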