LLM observability
LLM observability is the practice of capturing every call your application makes to a large language model, then surfacing the cost, latency, token usage, and behavioral signals that decide whether the app stays profitable, fast, and safe. It is the LLM-specific equivalent of APM for traditional services. This page is the conceptual entry point; the marketing-side hub lives at /llm-observability.
Why standard APM is not enough
Datadog, New Relic, and Sentry track HTTP requests and database spans. LLM calls have shapes those tools were not built to capture: token counts that drive non-linear cost, model variants priced differently within the same provider, prompt versions that change behavior without code changes, tool calls that branch into agent flows, and non-deterministic output where the same input gives different results on each run.
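To make the pricing point concrete, here is a minimal sketch of per-request cost computed from token counts. The model names and per-million-token rates are hypothetical placeholders, not real provider prices, and `estimateCostUsd` is an illustrative helper rather than part of any SDK.

```typescript
// Hypothetical price table: USD per one million tokens.
// These numbers are placeholders for illustration, not real provider rates.
interface ModelPrice {
  inputPerMTok: number
  outputPerMTok: number
}

const PRICES: Record<string, ModelPrice> = {
  'example-small': { inputPerMTok: 0.5, outputPerMTok: 2 },
  'example-large': { inputPerMTok: 5, outputPerMTok: 20 },
}

// Cost depends on the model and on the input/output token split,
// neither of which a plain HTTP span records.
function estimateCostUsd(model: string, inputTokens: number, outputTokens: number): number {
  const price = PRICES[model]
  if (!price) throw new Error(`no price configured for model: ${model}`)
  return (inputTokens * price.inputPerMTok + outputTokens * price.outputPerMTok) / 1_000_000
}
```

Under this placeholder table, two requests with identical latency and status differ tenfold in cost depending only on which model served them, which is the kind of signal an HTTP-level trace does not carry.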
The five categories Spanlens captures
| Category | Spanlens entity | Surface |
|---|---|---|
| Cost (USD per request, per model, per customer) | requests.cost_usd + tags | /requests, /savings |
| Latency (p50/p95/p99, TTFT for streaming) | requests.latency_ms, requests.time_to_first_token_ms | /requests |
| Quality (eval scores, judge↔human correlation) | evals, annotations | /evals, /annotation |
| Reliability (error rates, retry counts) | requests.status, requests.retry_count | /anomalies, /alerts |
| Security (PII matches, prompt injection, key leakage) | requests.pii_flags, security_events | /security |
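As an illustration of how the latency and reliability rows might be derived from raw request records, the sketch below computes nearest-rank percentiles and an error rate. The `RequestRow` shape borrows the `latency_ms` and `status` column names from the table; everything else, including treating `status >= 400` as an error, is an assumption and not Spanlens's actual schema or query logic.

```typescript
interface RequestRow {
  latency_ms: number
  status: number // assumed here to be an HTTP-style status code
}

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error('no samples')
  const sorted = [...values].sort((a, b) => a - b)
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length))
  return sorted[rank - 1]
}

// Summarizes a non-empty batch of request rows.
function summarize(rows: RequestRow[]) {
  const latencies = rows.map((r) => r.latency_ms)
  const errors = rows.filter((r) => r.status >= 400).length
  return {
    p50: percentile(latencies, 50),
    p95: percentile(latencies, 95),
    p99: percentile(latencies, 99),
    errorRate: errors / rows.length,
  }
}
```

Nearest-rank is chosen for simplicity; a production dashboard would more likely use an interpolated or approximate percentile computed in the database.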
Three integration patterns
Spanlens supports the three patterns that dominate the space. Pick by stack, not by ideology.
1. Drop-in SDK
Swap the provider SDK import for a Spanlens-instrumented version. Same methods, same types. Fastest for single-language apps.
```ts
// Before
import OpenAI from 'openai'
const openai = new OpenAI()

// After
import { createOpenAI } from '@spanlens/sdk/openai'
const openai = createOpenAI()
```
2. Proxy
Point the provider baseURL at the Spanlens proxy. Works in any language including Ruby, Go, and raw HTTP.
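From TypeScript, a call through the proxy could be assembled as in the sketch below. It assumes the proxy accepts the same JSON body as the provider's chat completions endpoint and authenticates with a Spanlens key in the `Authorization` header. `buildProxyRequest`, `SPANLENS_PROXY_BASE`, and `ChatMessage` are illustrative names, not part of any Spanlens package.

```typescript
// Base URL taken from the proxy endpoint documented on this page.
const SPANLENS_PROXY_BASE = 'https://api.spanlens.io/proxy/openai/v1'

interface ChatMessage {
  role: 'system' | 'user' | 'assistant'
  content: string
}

// Builds the request without sending it, so it can be handed to fetch
// or any other HTTP client.
function buildProxyRequest(spanlensKey: string, model: string, messages: ChatMessage[]) {
  return {
    url: `${SPANLENS_PROXY_BASE}/chat/completions`,
    method: 'POST' as const,
    headers: {
      Authorization: `Bearer ${spanlensKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ model, messages }),
  }
}
```

Given `const req = buildProxyRequest(key, 'gpt-4o-mini', messages)`, sending it is a matter of `fetch(req.url, req)`. The equivalent raw request: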
```http
POST https://api.spanlens.io/proxy/openai/v1/chat/completions
Authorization: Bearer sl_live_...
Content-Type: application/json

{"model":"gpt-4o-mini","messages":[...]}
```
3. OpenTelemetry
Emit OTLP/HTTP spans from your existing OTel pipeline. Best when you already have OTel exporters set up and want LLM spans to flow through the same infrastructure. See /docs/otel for setup.
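For orientation, an LLM call expressed as an OTLP/HTTP JSON payload might look like the sketch below. The envelope (`resourceSpans`, `scopeSpans`, `spans`) follows the OTLP JSON encoding, and the `gen_ai.*` attribute keys come from the OpenTelemetry GenAI semantic conventions. Whether Spanlens reads exactly these keys is an assumption, so treat /docs/otel as the authority. `buildLlmSpanPayload`, `LlmCall`, and the service and scope names are hypothetical.

```typescript
interface LlmCall {
  traceId: string // 32 hex characters
  spanId: string // 16 hex characters
  model: string
  inputTokens: number
  outputTokens: number
  startMs: number // epoch milliseconds
  endMs: number // epoch milliseconds
}

// OTLP JSON carries 64-bit integers as decimal strings.
// Appending six zeros converts whole milliseconds to nanoseconds without precision loss.
const msToNanoString = (ms: number): string => `${Math.trunc(ms)}000000`

function buildLlmSpanPayload(call: LlmCall) {
  return {
    resourceSpans: [
      {
        resource: {
          // Hypothetical service name.
          attributes: [{ key: 'service.name', value: { stringValue: 'my-llm-app' } }],
        },
        scopeSpans: [
          {
            scope: { name: 'example-llm-instrumentation' }, // hypothetical scope name
            spans: [
              {
                traceId: call.traceId,
                spanId: call.spanId,
                name: `chat ${call.model}`,
                kind: 3, // SPAN_KIND_CLIENT
                startTimeUnixNano: msToNanoString(call.startMs),
                endTimeUnixNano: msToNanoString(call.endMs),
                attributes: [
                  { key: 'gen_ai.operation.name', value: { stringValue: 'chat' } },
                  { key: 'gen_ai.request.model', value: { stringValue: call.model } },
                  { key: 'gen_ai.usage.input_tokens', value: { intValue: String(call.inputTokens) } },
                  { key: 'gen_ai.usage.output_tokens', value: { intValue: String(call.outputTokens) } },
                ],
              },
            ],
          },
        ],
      },
    ],
  }
}
```

In practice an OTel SDK and exporter would produce this payload for you; the hand-built version is only meant to show which fields carry the model and token usage.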
Where to go next
- Data model: the eight entities that back every dashboard view.
- Agent tracing: span tree structure and critical path.
- Evals: scoring quality and tracking drift over prompt versions.
- Prompt management: versioning, A/B, and statistical rollout decisions.