# Cost tracking
Every request row carries a cost_usd field computed at ingest time from your actual token usage and the provider's current list price. No approximations, no post-hoc math in the dashboard — the number is deterministic and auditable.
## Why it matters
Provider dashboards show spend a day late and at the billing-period level. That's useless for the daily question — “is my new feature's LLM usage going to blow up the budget?” Spanlens computes cost the moment the response lands, so you can track per-minute, per-project, per-user cost without waiting for an invoice.
## How it works
### Price table
`apps/server/src/lib/cost.ts` ships a curated `MODEL_PRICES` map — USD per 1M tokens, separately for prompt and completion. Snapshot as of 2026-04:

```ts
'gpt-4o': { prompt: 2.5, completion: 10 },
'gpt-4o-mini': { prompt: 0.15, completion: 0.6 },
'gpt-4-turbo': { prompt: 10, completion: 30 },
'claude-opus-4-7': { prompt: 15, completion: 75 },
'claude-sonnet-4-6': { prompt: 3, completion: 15 },
'claude-haiku-4-5-20251001': { prompt: 0.8, completion: 4 },
'gemini-1.5-pro': { prompt: 1.25, completion: 5 },
'gemini-1.5-flash': { prompt: 0.075, completion: 0.3 },
// ...
```

### The formula
```ts
promptCost = (promptTokens / 1_000_000) * price.prompt
completionCost = (completionTokens / 1_000_000) * price.completion
totalCost = promptCost + completionCost
```

### The dated-variant problem (critical gotcha)
OpenAI returns dated variants in the `model` field of the response body (e.g. you request `gpt-4o-mini` and get back `gpt-4o-mini-2024-07-18`). That dated string is what lands in `requests.model`. A naive lookup against `MODEL_PRICES` with that dated key would miss, and the cost would come back `null`.
`calculateCost()` handles this by:

- Exact match first — if `gpt-4o-mini-2024-07-18` is in the table, use it.
- Otherwise, longest boundary-aware prefix match. The model id must start with a registered key followed by `-`, so `gpt-4` does not accidentally match `gpt-4o-mini`.
The same matching pattern is reused by Savings and any future feature that keys on model family.
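The matching rules and the formula together can be sketched like this. This is an illustrative reimplementation, not the actual `cost.ts` source: `PRICES` is a trimmed copy of the table, and `lookupPrice`/`calcCost` are hypothetical names.

```typescript
type Price = { prompt: number; completion: number }; // USD per 1M tokens

// Trimmed copy of the price table, for illustration only.
const PRICES: Record<string, Price> = {
  'gpt-4o': { prompt: 2.5, completion: 10 },
  'gpt-4o-mini': { prompt: 0.15, completion: 0.6 },
  'gpt-4-turbo': { prompt: 10, completion: 30 },
};

function lookupPrice(model: string): Price | null {
  // 1. Exact match first.
  if (model in PRICES) return PRICES[model];
  // 2. Longest boundary-aware prefix match: the id must start with a
  //    registered key followed by '-', so a short key like 'gpt-4'
  //    never swallows 'gpt-4o-mini-2024-07-18'.
  let best: string | null = null;
  for (const key of Object.keys(PRICES)) {
    if (model.startsWith(key + '-') && (best === null || key.length > best.length)) {
      best = key;
    }
  }
  return best !== null ? PRICES[best] : null;
}

function calcCost(model: string, promptTokens: number, completionTokens: number): number | null {
  const price = lookupPrice(model);
  if (price === null) return null; // unknown model: cost_usd stays NULL
  return (promptTokens / 1_000_000) * price.prompt
       + (completionTokens / 1_000_000) * price.completion;
}

calcCost('gpt-4o-mini-2024-07-18', 12_000, 800); // ≈ 0.00228 USD, via the 'gpt-4o-mini' row
calcCost('brand-new-model', 1_000, 100);         // null: no price on file
```

The real `calculateCost()` may differ in details, but the boundary check (`key + '-'`) is the load-bearing part: without it, longest-prefix matching alone is not enough to keep sibling model families apart.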
### Graceful degradation on unknown models
If a request comes in for a model we don't have pricing for (a brand-new release, a fine-tuned custom model), `calculateCost()` returns `null` and the row's `cost_usd` is `NULL`. The dashboard filters this out of cost aggregates — we never estimate or fabricate. The gap is visible, not hidden.
Fix: open a PR to add the model to `MODEL_PRICES` with the provider's official rate. The fix isn't retroactive; cost appears on new requests only, and existing rows aren't backfilled.
## Using it
### Programmatic access
Cost is a first-class field on every request. Fetch via:
```bash
# All requests with cost, last 7 days
GET /api/v1/requests?sinceHours=168

# Aggregate cost by model
GET /api/v1/stats?sinceHours=720&groupBy=model
# → [
#   { "model": "gpt-4o", "requestCount": 1204, "totalCostUsd": 12.84 },
#   { "model": "gpt-4o-mini", "requestCount": 42103, "totalCostUsd": 6.32 }
# ]
```

### Per-project rollup
Pass `projectId` to scope the query. Useful for chargeback when multiple teams share one Spanlens org:
```bash
GET /api/v1/stats?projectId=<uuid>&sinceHours=720&groupBy=model
```

## Design choices
- Computed at ingest, not at read time. Freezes the price at the moment of the call. If OpenAI drops gpt-4o prices tomorrow, your historical cost data doesn't retroactively change. Audit-friendly.
- Stored as `numeric(14,8)`. 8 decimal places of precision — enough to represent fractional cents on very cheap models without rounding error.
- Table, not API. We don't fetch live prices from provider APIs — they don't expose one. A hand-maintained table is the reality across the industry.
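A quick sanity check on the precision claim (the token count here is illustrative):

```typescript
// A small gpt-4o-mini request: 100 prompt tokens at $0.15 per 1M tokens.
const cost = (100 / 1_000_000) * 0.15; // 0.000015 USD
// numeric(14,8) keeps 8 fractional digits, so this value survives storage intact:
console.log(cost.toFixed(8)); // "0.00001500"
```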
## Limitations
- Price table drifts. When a provider changes prices, our table needs a PR. Tracked as a monthly maintenance item. If you're self-hosting, pin a specific commit or expect to cherry-pick updates.
- No separate line item for cache-token pricing yet. Anthropic's `cache_read_input_tokens` and `cache_creation_input_tokens` are currently folded into prompt tokens. We're adding separate accounting so cost reflects the 10× discount. Roadmap.
- No batch API discount. OpenAI and Anthropic both offer ~50% off batch calls. Spanlens treats a batch response as a normal request. Mark batch traffic with a custom tag and filter manually for now.
Note: this page is about per-request USD cost (how much each LLM call cost you at the provider). For Spanlens subscription billing — plan quotas, overage rates, invoicing — see Billing & quotas.
Related: Requests, Savings, /dashboard. Source: `apps/server/src/lib/cost.ts`.