LLM observability

LLM observability is the practice of capturing every call your application makes to a large language model, then surfacing the cost, latency, token usage, and behavioral signals that decide whether the app stays profitable, fast, and safe. It is the LLM-specific equivalent of APM for traditional services. This page is the conceptual entry point; the marketing-side hub lives at /llm-observability.

Why standard APM is not enough

Datadog, New Relic, and Sentry track HTTP requests and database spans. LLM calls have shapes those tools were not built to capture: token counts that drive non-linear cost, model variants priced differently within the same provider, prompt versions that change behavior without code changes, tool calls that branch into agent flows, and non-deterministic output, where the same input can give different results from one run to the next.
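
To make the cost point concrete, here is a minimal sketch of per-request cost estimation. The model names and prices are hypothetical placeholders, not real provider pricing, and `estimateCostUsd` is an illustrative helper rather than part of any Spanlens API. It shows why a request count alone says little about spend: cost follows the model and the input/output token split.

```typescript
// Hypothetical per-million-token prices in USD.
// Real prices vary by provider and model and change over time.
interface ModelPrice {
  inputPerMTok: number;
  outputPerMTok: number;
}

const PRICES: Record<string, ModelPrice> = {
  "small-model": { inputPerMTok: 0.5, outputPerMTok: 2 },
  "large-model": { inputPerMTok: 5, outputPerMTok: 20 },
};

interface Usage {
  model: string;
  inputTokens: number;
  outputTokens: number;
}

// Cost is driven by the model and by how many tokens go in and come out,
// so two requests with the same HTTP shape can differ widely in cost.
function estimateCostUsd(usage: Usage): number {
  const price = PRICES[usage.model];
  if (price === undefined) {
    throw new Error(`no price configured for model: ${usage.model}`);
  }
  return (
    (usage.inputTokens * price.inputPerMTok +
      usage.outputTokens * price.outputPerMTok) /
    1_000_000
  );
}
```

With these placeholder prices, the same token counts cost ten times more on "large-model" than on "small-model", which is a difference an HTTP-level span cannot show.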

The five categories Spanlens captures

Category	Spanlens entity	Surface
Cost (USD per request, per model, per customer)	`requests.cost_usd` + tags	/requests, /savings
Latency (p50/p95/p99, TTFT for streaming)	`requests.latency_ms`, `requests.time_to_first_token_ms`	/requests
Quality (eval scores, judge↔human correlation)	`evals`, `annotations`	/evals, /annotation
Reliability (error rates, retry counts)	`requests.status`, `requests.retry_count`	/anomalies, /alerts
Security (PII matches, prompt injection, key leakage)	`requests.pii_flags`, `security_events`	/security
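
As an illustration of the latency row, the sketch below computes p50/p95/p99 from a list of `latency_ms` values using the nearest-rank method. This is one common definition of a percentile; it is an assumption for illustration and not necessarily how Spanlens computes its dashboard figures. The sample values are made up.

```typescript
// Nearest-rank percentile over latencies in milliseconds.
// p is a percentage in the range (0, 100].
function percentile(values: number[], p: number): number {
  if (values.length === 0) {
    throw new Error("cannot take a percentile of an empty list");
  }
  if (p <= 0 || p > 100) {
    throw new Error("p must be in (0, 100]");
  }
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...values].sort((a, b) => a - b);
  // Multiply before dividing to keep the rank arithmetic in integers where possible.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Made-up sample: nine ordinary requests and one slow outlier.
const latencyMs = [120, 80, 200, 95, 1500, 110, 105, 90, 130, 100];

const p50 = percentile(latencyMs, 50);
const p95 = percentile(latencyMs, 95);
const p99 = percentile(latencyMs, 99);
```

In this sample the median stays near the typical request while p95 and p99 both land on the single slow outlier, which is why tail percentiles are tracked alongside p50.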

Three integration patterns

Spanlens supports the three patterns that dominate the space. Pick by stack, not by ideology.

1. Drop-in SDK

Swap the provider SDK import for a Spanlens-instrumented version. Same methods, same types. Fastest for single-language apps.

// Before
import OpenAI from 'openai'
const openai = new OpenAI()

// After
import { createOpenAI } from '@spanlens/sdk/openai'
const openai = createOpenAI()

2. Proxy

Point the provider baseURL at the Spanlens proxy. Works from any language, including Ruby and Go, or over raw HTTP.

POST https://api.spanlens.io/proxy/openai/v1/chat/completions
Authorization: Bearer sl_live_...
Content-Type: application/json

{"model":"gpt-4o-mini","messages":[...]}
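
The same request can be assembled in TypeScript. This is a minimal sketch: the URL and header names come from the raw HTTP example above, while `buildProxyRequest` and the `SPANLENS_API_KEY` environment variable are hypothetical names used only for illustration. The helper builds the request without sending it, so the shape can be inspected first.

```typescript
// URL taken from the raw HTTP example above.
const SPANLENS_PROXY_URL =
  "https://api.spanlens.io/proxy/openai/v1/chat/completions";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ProxyRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: assembles the proxy request but does not send it.
function buildProxyRequest(
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): ProxyRequest {
  return {
    url: SPANLENS_PROXY_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Sending it needs network access and a real key
// (SPANLENS_API_KEY is an assumed variable name):
//   const { url, init } = buildProxyRequest(
//     process.env.SPANLENS_API_KEY ?? "",
//     "gpt-4o-mini",
//     [{ role: "user", content: "Hello" }],
//   );
//   const response = await fetch(url, init);
```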

3. OpenTelemetry

Emit OTLP/HTTP spans from your existing OTel pipeline. Best when you already have OTel exporters set up and want LLM spans to flow through the same infrastructure. See /docs/otel for setup.
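
For orientation, here is a rough sketch of what one LLM call could look like as an OTLP/HTTP JSON payload. The `gen_ai.*` attribute names follow the OpenTelemetry GenAI semantic conventions as commonly documented; which attributes Spanlens actually reads is an assumption here, so treat /docs/otel as the authority. `buildLlmSpanPayload` and the `LlmCall` type are hypothetical, and no ingest endpoint is shown because this page does not state one.

```typescript
// Hypothetical description of one LLM call to be exported as a span.
interface LlmCall {
  serviceName: string;
  traceId: string; // 32 hex characters
  spanId: string; // 16 hex characters
  model: string;
  inputTokens: number;
  outputTokens: number;
  startMs: number; // Unix epoch milliseconds
  endMs: number;
}

// OTLP JSON carries timestamps as nanosecond strings.
// Appending six zeros converts whole milliseconds without BigInt.
function msToUnixNano(ms: number): string {
  return String(Math.trunc(ms)) + "000000";
}

// Builds an OTLP/HTTP JSON trace payload containing a single client span.
function buildLlmSpanPayload(call: LlmCall) {
  const attributes = [
    { key: "gen_ai.operation.name", value: { stringValue: "chat" } },
    { key: "gen_ai.request.model", value: { stringValue: call.model } },
    // 64-bit integers are encoded as strings in the protobuf JSON mapping.
    { key: "gen_ai.usage.input_tokens", value: { intValue: String(call.inputTokens) } },
    { key: "gen_ai.usage.output_tokens", value: { intValue: String(call.outputTokens) } },
  ];
  return {
    resourceSpans: [
      {
        resource: {
          attributes: [
            { key: "service.name", value: { stringValue: call.serviceName } },
          ],
        },
        scopeSpans: [
          {
            scope: { name: "llm-example" }, // placeholder scope name
            spans: [
              {
                traceId: call.traceId,
                spanId: call.spanId,
                name: `chat ${call.model}`,
                kind: 3, // SPAN_KIND_CLIENT
                startTimeUnixNano: msToUnixNano(call.startMs),
                endTimeUnixNano: msToUnixNano(call.endMs),
                attributes,
              },
            ],
          },
        ],
      },
    ],
  };
}
```

In practice an OTel SDK and exporter would produce this payload; the sketch only makes visible which fields carry the model and token usage.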

Where to go next

Data model: the eight entities that back every dashboard view.
Agent tracing: span tree structure and critical path.
Evals: scoring quality and tracking drift over prompt versions.
Prompt management: versioning, A/B, and statistical rollout decisions.