Know where every AI dollar goes
Cost attribution by tenant, team, and endpoint. Budget enforcement in real time. CSV export for your FinOps tools. Fairvisor makes LLM spend visible and controllable.
What Fairvisor Does for FinOps
Cost Attribution
Every request is tagged with its cost and grouped by your business dimensions: org_id, team, user, endpoint, model. Query via API or export to CSV. → Cost-Based Budget docs
Real-Time Budget Enforcement
Don’t just observe overspend; prevent it. Set budgets per org, team, user, or endpoint. Fairvisor enforces them at the edge with staged actions: warn → throttle → reject.
Anomaly Alerting
When a tenant’s spend rate reaches 10x its usual level, Fairvisor fires an alert before the circuit breaker trips. Slack, webhook, or email: you pick the channel. → Budget exhaustion runbook
Export & Integration
CSV and JSON export for integration with Kubecost, CloudHealth, Vantage, or internal dashboards. → Metrics reference
Who Pays When Nobody Is Watching
Three scenarios that happen before you notice:
The agent in a loop. A misconfigured retry loop runs 200 calls per minute overnight. Monitoring fires at 3am. By 3:35am, when the on-call engineer kills it: $8,000 spent. With Fairvisor’s budget circuit breaker: stopped at $50.
The noisy tenant. One free-tier customer discovers your API doesn’t enforce limits. They run a batch job and consume 80% of your monthly LLM budget in 6 hours. With per-tenant budgets: hard stop at their limit, other tenants unaffected.
The expensive endpoint. A new RAG pipeline gets deployed. Nobody notices that the average prompt is 40,000 tokens. The bill arrives 30 days later. With real-time cost attribution: you see the spike in the dashboard within minutes.
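To make the third scenario concrete, here is a minimal sketch of per-request cost attribution. It is not Fairvisor's implementation: the `Request` type, the `request_cost` and `spend_by` helpers, the model name, and the per-1,000-token rates are all hypothetical and chosen only for illustration. It assumes cost is computed from observed token counts and configured per-model rates, then summed along whichever business dimensions you ask for.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical per-model rates in USD per 1,000 tokens (illustrative, not real pricing).
RATES_PER_1K = {
    "example-large": {"prompt": 0.01, "completion": 0.03},
}


@dataclass(frozen=True)
class Request:
    org_id: str
    endpoint: str
    model: str
    prompt_tokens: int
    completion_tokens: int


def request_cost(req: Request) -> float:
    """Estimate the cost of one request from its token counts and model rates."""
    rate = RATES_PER_1K[req.model]
    return (
        req.prompt_tokens * rate["prompt"]
        + req.completion_tokens * rate["completion"]
    ) / 1000


def spend_by(requests, *dims):
    """Sum estimated cost grouped by any combination of Request fields."""
    totals = defaultdict(float)
    for req in requests:
        key = tuple(getattr(req, dim) for dim in dims)
        totals[key] += request_cost(req)
    return dict(totals)


requests = [
    # A RAG call with a 40,000-token prompt next to an ordinary chat call.
    Request("acme", "/rag/answer", "example-large", 40_000, 500),
    Request("acme", "/chat", "example-large", 1_000, 500),
]

for key, total in spend_by(requests, "org_id", "endpoint").items():
    print(key, round(total, 4))
```

Grouping by `endpoint` makes the large-prompt pipeline stand out against ordinary traffic immediately, which is the point of seeing cost per request rather than per invoice.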
Who This Is For
- FinOps practitioners and finance leads tracking LLM spend
- Engineering managers with budget responsibility across teams or products
- Platform teams that need to charge back AI costs by org or team
- CFOs who got an unexpected LLM bill and want hard limits going forward
FAQ
How does cost attribution work?
Every request is tagged with its calculated cost and grouped by your business dimensions: org_id, team, user, endpoint path, model name, time window. Queryable in real time via API. No waiting 30 days for provider exports.
Can I set hard limits per team?
Yes. Set a daily or hourly budget per org, team, user, or endpoint. Fairvisor enforces it at the edge: warn → throttle → reject. Not just observability: actual enforcement before the overage happens.
What happens when a budget is reached?
At 80% of budget: warning header in the response. At 95%: throttle with 200–500ms delay. At 100%: reject with 429. Thresholds are configurable. The budget circuit breaker also trips automatically if spend rate spikes abnormally above baseline.
How do I export spend data to my FinOps tools?
CSV and JSON export. Compatible with Kubecost, CloudHealth, Vantage, or internal dashboards. Or query via API for real-time integration. → Metrics reference
How does anomaly alerting work?
When a tenant’s spend rate is significantly above their recent baseline (configurable multiplier), Fairvisor fires an alert before the circuit breaker trips. Channels: Slack, webhook, email. You find out before the engineer wakes up, not after.
How granular is cost attribution?
Per request, grouped by org_id, team_id, user_id, endpoint path, model name, and time window. Any combination of these is queryable via API.
Does Fairvisor integrate with our LLM provider’s usage data?
Fairvisor tracks cost based on token counts it observes directly: prompt tokens before forwarding, completion tokens during streaming. For providers that don’t expose token counts in the response, you configure per-model cost rates and Fairvisor estimates from those.
Can we separate soft alerts from hard budget cutoffs?
Yes. Configure warning thresholds (for alerting/visibility) independently from enforcement thresholds (throttle/reject). This lets finance see early risk signals while engineering keeps deterministic hard stops.
Why teams choose Fairvisor
Attribution before the bill arrives
Every request is tagged with cost by org, team, endpoint, and model, and is queryable in real time, not reconstructed 30 days later from provider exports.
Enforcement, not just observability
Most tools show you what happened. Fairvisor throttles at 95% of budget and rejects at 100%, before spend reaches 110%.
One tool for visibility and control
Cost attribution, budget enforcement, anomaly alerting, and a circuit breaker, all in the same policy layer. No separate observability stack required.
Dashboard Preview
(screenshot placeholder: consumption dashboard showing per-tenant spend over time, budget utilization bars, and projected exhaustion dates)
Key metrics at a glance:
- Total spend by tenant (7d / 30d)
- Budget utilization % per org
- Reject rate (are you over-limiting?)
- Projected exhaustion date
- Top spending endpoints
- Loop detection events
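Two of the metrics above, budget utilization and projected exhaustion date, follow from simple arithmetic, and utilization in turn drives the staged warn → throttle → reject actions. The sketch below is an illustration under stated assumptions, not Fairvisor's actual logic: the function names are hypothetical, the thresholds mirror the 80% / 95% / 100% defaults described in the FAQ, and the projection assumes spend continues at the average rate seen so far in the budget window.

```python
from datetime import datetime, timedelta
from typing import Optional

# Default stages as described in the FAQ; the page says they are configurable.
WARN_AT, THROTTLE_AT, REJECT_AT = 0.80, 0.95, 1.00


def utilization(spent: float, budget: float) -> float:
    """Fraction of the budget already consumed (1.0 means fully used)."""
    return spent / budget


def staged_action(spent: float, budget: float) -> str:
    """Map budget utilization to allow / warn / throttle / reject."""
    used = utilization(spent, budget)
    if used >= REJECT_AT:
        return "reject"    # e.g. respond with HTTP 429
    if used >= THROTTLE_AT:
        return "throttle"  # e.g. add a short delay before forwarding
    if used >= WARN_AT:
        return "warn"      # e.g. attach a warning header to the response
    return "allow"


def projected_exhaustion(
    spent: float, budget: float, window_start: datetime, now: datetime
) -> Optional[datetime]:
    """Linear projection of when the budget runs out (assumes a constant burn rate)."""
    elapsed = (now - window_start).total_seconds()
    if elapsed <= 0 or spent <= 0:
        return None  # no spend observed yet, so nothing to project
    if spent >= budget:
        return now   # already exhausted
    burn_per_second = spent / elapsed
    return now + timedelta(seconds=(budget - spent) / burn_per_second)


start = datetime(2024, 1, 1, 0, 0)
now = datetime(2024, 1, 1, 6, 0)
print(staged_action(96.0, 100.0))
print(projected_exhaustion(25.0, 100.0, start, now))
```

A linear projection is the simplest possible choice; bursty traffic would call for a rate measured over a shorter recent window instead of the whole budget period.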