Requests

Every LLM call that flows through the Spanlens proxy produces one row in the requests table, backed by ClickHouse for fast analytical reads. /requests is the viewer: filter, sort, drill down, and read the actual request and response bodies. This is the raw substrate every other feature (Traces, Anomalies, Savings, etc.) aggregates from.

Why it matters

Aggregate views summarize, they smooth over individual outliers. When something goes wrong , a user reports a wrong answer, a cost spike is unaccounted for, a prompt injection slips through, you need to see the actual bytes that went out and came back. Requests gives you that exact record.

What gets logged

For every proxied call, Spanlens stores:

FieldDescription
provideropenai / anthropic / gemini / azure
modelThe dated variant returned by the provider (e.g. gpt-4o-mini-2024-07-18), not the alias you requested
prompt_tokensGross input tokens parsed from the provider response (or streamed deltas). Includes any cached portion.
completion_tokensOutput tokens generated by the model.
total_tokensSum of prompt + completion. Convenience for billing queries.
cache_read_tokensSubset of prompt_tokensserved from the provider's prompt cache (Anthropic / OpenAI). Charged at reduced rate by cost tracking. 0 for rows before 2026-05-14.
cache_write_tokensPortion of prompt_tokens written to the cache (Anthropic). Billed at the cache-write premium.
cost_usdComputed via cost tracking
latency_msTime from our proxy receiving the request to last byte sent
status_codeHTTP status from the provider (200, 429, 500, etc.)
request_bodyOutgoing payload sent to the provider, up to 64KB. Authorization headers stripped before storage.
response_bodyIncoming payload from the provider, up to 64KB. Reconstructed into non-streaming shape for SSE responses.
project_idScoped to the API key used (or X-Spanlens-Project header)
provider_key_idWhich provider key was used to make the call (name shown in the drawer)
trace_idSet when the call ran inside an SDK observe() wrapper. Groups related calls into a Trace.
span_idIdentifies this specific call within the trace tree.
prompt_version_idSet when the call carried x-spanlens-prompt-version header. Links to a Prompts version row.
user_idSet from x-spanlens-user header. Customer-supplied end-user ID for attribution (Spanlens does not interpret the value).
session_idSet from x-spanlens-session header. Groups requests from one conversation or workflow.
flagsPII / injection flags (JSONB array)
created_atWhen the request arrived at the proxy

Dashboard

Stat strip

Above the list, a five-cell strip shows real-time 24-hour metrics: total requests, average latency, spend, error rate, and active anomaly count. Each cell includes a mini spark chart. Cells turn accent-colored when a metric exceeds a threshold (latency > 1 s, error rate > 1%, any anomaly present).

Traffic bars

A 30-day bar chart sits below the stat strip. Bar height corresponds to request volume; bars with at least one error flip to the error color. Hover a bar to see the date label.

List view & filters

The list auto-refreshes every 10 seconds so new requests appear without a page reload. A manual button in the toolbar forces an immediate refetch.

The main table is paginated (up to 100 rows/page) with these filters. Filter state is synced to the URL, copy and share the URL to hand off a pre-filtered view, or use the browser's back button to restore a previous filter state.

  • Provider, exact match (openai / anthropic / gemini / azure)
  • Model, partial, case-insensitive match (e.g. searching “mini” matches gpt-4o-mini-2024-07-18)
  • Provider key, dropdown of your registered keys, to isolate traffic by key
  • Status, All / OK (2xx) / 4xx / 5xx
  • Date range, from / to

URL-only filters

These filters are applied when navigating here via a drilldown from another page. An active filter banner appears at the top of the page and can be cleared with Clear ×.

URL paramMeaningPrimary entry point
?promptVersionId=<uuid>Only calls that used a specific prompt versionPrompts → Calls tab row click
?userId=<str>Only calls from a specific end-userRequest detail User field click
?sessionId=<str>Only calls from a specific sessionRequest detail Session field click

The user_id / session_id columns are only populated when the caller sends the x-spanlens-user / x-spanlens-session headers. See SDK helpers withUser() / withSession().

Column headers for Latency, Cost, Tokens, and Age are clickable to sort ascending or descending. The default sort is newest-first by created_at.

Hovering the Age cell shows a tooltip with the full timestamp.

Replay

Every request detail page has a Replay button. It opens a modal where you can re-run the original call against a different model and compare the result inline , without touching your application code.

  • Model selector. A dropdown pre-populated with models for the same provider. The original model is always available as the first option. Changing the model resets any previous result.
  • Run. Executes the replay server-side via POST /api/v1/requests/:id/replay/run. Spanlens decrypts your provider key, strips any stream: true flag, forwards the original request body with the new model, and returns a result card showing latency, token counts, and cost. The replayed call is also logged as a new row in /requests.
  • Copy curl. Fetches a ready-to-run curl snippet from POST /api/v1/requests/:id/replay and copies it to the clipboard. The snippet is also displayed in the modal so you can inspect or edit it before running.

Detail drawer

Clicking any row opens a 480 px right-side drawer, no page navigation. The drawer shows:

  • Request ID, timestamp, and error badge (if applicable)
  • Metadata grid: Model, Provider, Status code, Provider key name, Prompt tokens, Completion tokens
  • Trace / Span IDs with inline links and copy buttons. Trace ID links directly to the Traces waterfall view.
  • Metrics row: Latency, Cost, Total tokens (with prompt / completion breakdown)
  • Prev / Next navigation buttons, step through the current result set one row at a time. When you reach the end of a page the drawer automatically loads the next page and jumps to the first (or last) row. An Open → link opens the standalone detail page /requests/[id] if you need a shareable URL.

Drawer tabs

TabContent
RequestFormatted message view. OpenAI and Anthropic messages[] are rendered as a conversation. Anthropic system strings/arrays are shown in a separate block above the messages. Gemini contents[].parts[] are normalized into the same layout. A copy button exports the raw JSON.
ResponseResponse body JSON when captured. Streaming responses are not buffered server-side (they pass through directly to your app), so this tab shows a note in that case.
TraceMini span list from the parent trace (up to 8 spans with type badges and durations) + a link to open the full waterfall. Shows a help note when the request has no associated trace.
RawFull request_body and response_body as pretty-printed JSON, each with a copy button.
ErrorConditionally shown when error_message is set. Displays the raw error string from the provider.

API

# List requests, paginated, sortable, filterable
GET /api/v1/requests
  ?projectId=<uuid>      # filter by project
  &provider=openai       # exact match
  &model=mini            # partial match (case-insensitive)
  &providerKeyId=<uuid>  # filter by provider key
  &promptVersionId=<uuid> # filter by prompt version
  &userId=<str>          # filter by x-spanlens-user header value
  &sessionId=<str>       # filter by x-spanlens-session header value
  &status=ok             # ok | 4xx | 5xx
  &from=2024-01-01T00:00:00Z
  &to=2024-01-31T23:59:59Z
  &sortBy=latency_ms     # created_at | latency_ms | cost_usd | total_tokens
  &sortDir=desc          # asc | desc
  &page=1
  &limit=50              # max 100

# One request by id (includes full request_body + response_body)
GET /api/v1/requests/:id

# Replay, curl snippet (proxy-ready payload)
POST /api/v1/requests/:id/replay
  Body: { "model": "gpt-4o-mini" }  # optional model override

# Replay, execute server-side and return result (latency / tokens / cost)
POST /api/v1/requests/:id/replay/run
  Body: { "model": "gpt-4o-mini" }  # optional model override
bash

The list endpoint returns { success, data, meta: { total, page, limit } }. Each row includes a flattened provider_key_name field (the human-readable key label) so the dashboard can render it without a second round-trip.

Privacy & retention

  • Authorization headers are stripped from request_bodybefore it's stored, your OpenAI/Anthropic/Gemini key never appears in logs.
  • API key patterns are auto-masked in stored bodies. Anything matching sk-*, sk-proj-*, sk-ant-*, AIza*, or sl_live_* (≥12-char body) is replaced with <prefix>*** before insert. Defense-in-depth for keys that slip into prompts/tool output/error text. See Security for details.
  • 64KB body cap. Large prompts (e.g. 40-page PDF extraction) are truncated at 64KB with a visible marker. Full bodies would blow up storage and cost.
  • Body retention opt-out. Pass logBody: 'meta' in the SDK (or X-Spanlens-Log-Body: meta header) to skip body storage entirely while keeping tokens / cost / latency / identifiers. Set 'none' to additionally drop user_id and session_id. See SDK.
  • Retention policy.Free plan: 14 days. Pro: 90 days. Team and Enterprise: 365 days (Enterprise is extendable by contract). Enforced by the table's TTL plus a per-plan query-time clip, older rows are dropped by ClickHouse's background merge.
  • Tenant isolation. ClickHouse has no row-level security; every read path goes through the requestsScope() helper which injects an organization_id = ? filter on every query. Direct ClickHouse access is server-only. The dashboard cannot bypass the filter.

Limitations

  • 64KB body cap is fixed.A “full-body archive to S3” opt-in for Enterprise customers is on the roadmap.
  • No full-text body search in the UI yet.The model filter uses case-insensitive substring match; there is no free-text search over request/response body content. ClickHouse can do it efficiently, the dashboard hasn't exposed it yet.
  • Streaming response bodies are reconstructed, not original.SSE chunks are tee'd while pass-through to the client. After the stream closes, the assistant text is reassembled and written to response_bodyin the upstream's standard non-streaming shape (so the dashboard renders it identically to a non-stream response). Tool calls / images / non-text content blocks are not preserved, only the assistant-visible text portion. Aborted streams keep whatever was received up to the break.

Related: Traces (grouped view), Cost tracking, Security flags, /requests dashboard.