# Requests
Every LLM call that flows through the Spanlens proxy produces one row in the requests table. /requests is the viewer: filter, sort, drill down, and read the actual request and response bodies. This is the raw substrate every other feature (Traces, Anomalies, Savings, etc.) aggregates from.
## Why it matters
Aggregate views summarize — they smooth over individual outliers. When something goes wrong — a user reports a wrong answer, a cost spike is unaccounted for, a prompt injection slips through — you need to see the actual bytes that went out and came back. Requests gives you that exact record.
## What gets logged
For every proxied call, Spanlens stores:
| Field | Description |
|---|---|
| `provider` | openai / anthropic / gemini |
| `model` | The dated variant returned by the provider (e.g. `gpt-4o-mini-2024-07-18`), not the alias you requested |
| `prompt_tokens` / `completion_tokens` / `total_tokens` | Parsed from the provider's response (or streamed deltas) |
| `cost_usd` | Computed via cost tracking |
| `latency_ms` | Time from our proxy receiving the request to the last byte sent |
| `status_code` | HTTP status from the provider (200, 429, 500, etc.) |
| `request_body` / `response_body` | Full payloads up to 10KB each. Truncated with a marker if larger. Authorization headers stripped before storage. |
| `project_id` | Scoped to the API key used (or the `X-Spanlens-Project` header) |
| `provider_key_id` | Which provider key was used to make the call (name shown in the drawer) |
| `trace_id` / `span_id` | Set when the call was made inside an SDK `observe()` wrapper. Links to the parent Trace. |
| `flags` | PII / injection flags (JSONB array) |
| `created_at` | When the request arrived at the proxy |
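Put together, a single logged row might look like this. This is a hypothetical illustration: values and id formats are ours, but the field names match the table above.

```python
# One logged request, with illustrative values (field names from the schema above).
sample_request = {
    "provider": "openai",
    "model": "gpt-4o-mini-2024-07-18",   # dated variant, not the alias you requested
    "prompt_tokens": 412,
    "completion_tokens": 96,
    "total_tokens": 508,
    "cost_usd": 0.000119,
    "latency_ms": 843,
    "status_code": 200,
    "request_body": '{"model": "gpt-4o-mini", "messages": []}',
    "response_body": '{"choices": []}',
    "project_id": "proj_demo",           # illustrative id, not a real format
    "provider_key_id": "key_demo",
    "trace_id": None,                    # only set for calls inside an SDK observe() wrapper
    "span_id": None,
    "flags": [],                         # PII / injection flags
    "created_at": "2024-07-18T09:30:00Z",
}
```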
## Dashboard

### Stat strip
Above the list, a five-cell strip shows real-time 24-hour metrics: total requests, average latency, spend, error rate, and active anomaly count. Each cell includes a mini spark chart. Cells turn accent-colored when a metric exceeds a threshold (latency > 1 s, error rate > 1%, any anomaly present).
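The highlight logic reduces to three comparisons. A minimal sketch, with thresholds from the text above (the function name and return shape are ours):

```python
def accented_cells(avg_latency_ms: float, error_rate: float, anomaly_count: int) -> dict:
    """Report which stat cells exceed their accent thresholds (sketch)."""
    return {
        "latency": avg_latency_ms > 1000,   # latency > 1 s
        "error_rate": error_rate > 0.01,    # error rate > 1%
        "anomalies": anomaly_count > 0,     # any active anomaly
    }
```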
### Traffic bars
A 30-day bar chart sits below the stat strip. Bar height corresponds to request volume; bars with at least one error flip to the error color. Hover a bar to see the date label.
## List view & filters
The main table is paginated (up to 100 rows/page) with these filters:
- Provider — exact match (openai / anthropic / gemini)
- Model — partial, case-insensitive match (e.g. searching “mini” matches `gpt-4o-mini-2024-07-18`)
- Provider key — dropdown of your registered keys, to isolate traffic by key
- Status — All / OK (2xx) / 4xx / 5xx
- Date range — from / to
Column headers for Latency, Cost, Tokens, and Age are clickable to sort ascending or descending. The default sort is newest-first by created_at.
Hovering the Age cell shows a tooltip with the full timestamp.
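The filter semantics above can be sketched as a row predicate. This is a simplified in-memory model (the real filtering happens server-side), but the matching rules are the ones described: exact provider, partial case-insensitive model, status buckets.

```python
def matches_filters(row, provider=None, model=None, status=None):
    """Sketch of the list-view filter semantics (in-memory model)."""
    if provider is not None and row["provider"] != provider:             # exact match
        return False
    if model is not None and model.lower() not in row["model"].lower():  # partial, case-insensitive
        return False
    code = row["status_code"]
    if status == "ok" and not 200 <= code < 300:
        return False
    if status == "4xx" and not 400 <= code < 500:
        return False
    if status == "5xx" and not 500 <= code < 600:
        return False
    return True
```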
## Detail drawer
Clicking any row opens a 480 px right-side drawer — no page navigation. The drawer shows:
- Request ID, timestamp, and error badge (if applicable)
- Metadata grid: Model, Provider, Status code, Provider key name, Prompt tokens, Completion tokens
- Trace / Span IDs with inline links and copy buttons. Trace ID links directly to the Traces waterfall view.
- Metrics row: Latency, Cost, Total tokens (with prompt / completion breakdown)
- Prev / Next navigation buttons — step through the current result set one row at a time. When you reach the end of a page, the drawer automatically loads the next page and jumps to its first (or last) row. An Open → link opens the standalone detail page `/requests/[id]` if you need a shareable URL.
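The prev/next stepping, including the page rollover, comes down to arithmetic on a global row offset. A sketch assuming zero-based row indexes and one-based pages (names are ours):

```python
def step_row(page, index, direction, page_size, total):
    """Move one row forward (+1) or back (-1), rolling over page boundaries.

    Returns the (page, index) the drawer should show next; clamps at the
    first and last row of the whole result set.
    """
    pos = (page - 1) * page_size + index + direction
    pos = max(0, min(pos, total - 1))            # clamp at result-set edges
    return pos // page_size + 1, pos % page_size
```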
### Drawer tabs
| Tab | Content |
|---|---|
| Request | Formatted message view. OpenAI and Anthropic messages[] are rendered as a conversation. Anthropic system strings/arrays are shown in a separate block above the messages. Gemini contents[].parts[] are normalized into the same layout. A copy button exports the raw JSON. |
| Response | Response body JSON when captured. Streaming responses are not buffered server-side (they pass through directly to your app), so this tab shows a note in that case. |
| Trace | Mini span list from the parent trace (up to 8 spans with type badges and durations) + a link to open the full waterfall. Shows a help note when the request has no associated trace. |
| Raw | Full request_body and response_body as pretty-printed JSON, each with a copy button. |
| Error | Conditionally shown when error_message is set. Displays the raw error string from the provider. |
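As a sketch of the Gemini normalization mentioned in the Request tab: Gemini uses a `model` role where OpenAI-style layouts use `assistant`, and each turn's text is spread across `parts[]`. The exact mapping below is our assumption, and non-text parts are ignored for brevity.

```python
def normalize_gemini(contents):
    """Flatten Gemini contents[].parts[] into role/content messages (sketch)."""
    messages = []
    for content in contents:
        # Map Gemini's "model" role onto the "assistant" role used elsewhere.
        role = "assistant" if content.get("role") == "model" else content.get("role", "user")
        text = "".join(part.get("text", "") for part in content.get("parts", []))
        messages.append({"role": role, "content": text})
    return messages
```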
## API

```bash
# List requests — paginated, sortable, filterable
GET /api/v1/requests
  ?projectId=<uuid>        # filter by project
  &provider=openai         # exact match
  &model=mini              # partial match (case-insensitive)
  &providerKeyId=<uuid>    # filter by provider key
  &status=ok               # ok | 4xx | 5xx
  &from=2024-01-01T00:00:00Z
  &to=2024-01-31T23:59:59Z
  &sortBy=latency_ms       # created_at | latency_ms | cost_usd | total_tokens
  &sortDir=desc            # asc | desc
  &page=1
  &limit=50                # max 100

# One request by id (includes full request_body + response_body)
GET /api/v1/requests/:id

# Replay a request (returns a proxy-ready payload — no UI button yet)
POST /api/v1/requests/:id/replay
Body: { "model": "gpt-4o-mini" }   # optional model override
```

The list endpoint returns `{ success, data, meta: { total, page, limit } }`. Each row includes a flattened `provider_key_name` field (the human-readable key label) so the dashboard can render it without a second round-trip.
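Calling the list endpoint from code mostly amounts to assembling the query string. A minimal sketch that only builds the URL (the `base` value and helper name are ours; authentication is omitted):

```python
from urllib.parse import urlencode

def list_requests_url(base, **params):
    """Build a GET /api/v1/requests URL from the query parameters above (sketch)."""
    query = {k: v for k, v in params.items() if v is not None}  # drop unused filters
    return f"{base}/api/v1/requests?{urlencode(query)}"
```

For example, `list_requests_url("https://spanlens.example", provider="openai", status="5xx", limit=100)` yields a URL filtering to OpenAI server errors at the maximum page size.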
## Privacy & retention

- Authorization headers are stripped from `request_body` before it's stored — your OpenAI/Anthropic/Gemini key never appears in logs.
- 10KB body cap. Large prompts (e.g. 40-page PDF extraction) are truncated at 10KB with a visible marker. Full bodies would blow up storage and cost.
- Retention policy. Free plan: 7 days. Paid plans: 30/90 days. Old rows are pruned nightly by `cron-prune-logs`.
- RLS-enforced. You can only see requests belonging to your own organization. The `requests` table has Row Level Security enabled.
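The first two rules (header stripping and the body cap) can be sketched as a pre-storage sanitization step. Names and the exact marker text are ours:

```python
BODY_CAP_BYTES = 10 * 1024          # 10KB cap from the rules above
TRUNCATION_MARKER = "[truncated]"   # the real marker text may differ

def sanitize_for_storage(body, headers):
    """Strip Authorization and cap the body before storage (sketch)."""
    safe_headers = {k: v for k, v in headers.items() if k.lower() != "authorization"}
    raw = body.encode("utf-8")
    if len(raw) > BODY_CAP_BYTES:
        body = raw[:BODY_CAP_BYTES].decode("utf-8", errors="ignore") + TRUNCATION_MARKER
    return body, safe_headers
```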
## Limitations

- 10KB body cap is fixed. A “full-body archive to S3” opt-in for Enterprise customers is on the roadmap.
- No full-text body search in the UI. The model filter uses `ilike`; there is no free-text search over request/response body content. Heavier search needs a separate OLAP layer (ClickHouse is the likely path).
- No UI replay button. The backend exposes `POST /api/v1/requests/:id/replay`, which returns a proxy-ready payload, but there is no one-click “send this request again” button in the dashboard yet.
- Streaming response bodies not captured. The proxy streams responses directly to your application without buffering, so `response_body` is `null` for streaming calls. Token counts and cost are still accurate (parsed from SSE deltas).
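Token usage for streaming calls can still be recovered from the pass-through SSE stream without buffering the content. A sketch assuming OpenAI-style `data:` lines where a late chunk carries a `usage` object (the function name and simplified event shape are ours):

```python
import json

def usage_from_sse(stream):
    """Pull token usage out of an SSE stream without keeping the content (sketch)."""
    usage = None
    for line in stream.splitlines():
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":          # end-of-stream sentinel
            break
        chunk = json.loads(payload)
        if chunk.get("usage"):           # a late chunk carries cumulative usage
            usage = chunk["usage"]
    return usage
```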
Related: Traces (grouped view), Cost tracking, Security flags, /requests dashboard.