LLM observability

LLM observability is the practice of capturing every call your application makes to a large language model, then surfacing the cost, latency, token usage, and behavioral signals that decide whether the app stays profitable, fast, and safe. It is the LLM-specific equivalent of application performance monitoring (APM) for traditional services. This page is the conceptual entry point; the marketing-side hub lives at /llm-observability.

Why standard APM is not enough

Datadog, New Relic, and Sentry track HTTP requests and database spans. LLM calls have shapes those tools were not built to capture: token counts that drive non-linear cost, model variants priced differently within the same provider, prompt versions that change behavior without code changes, tool calls that branch into agent flows, and non-deterministic output where the same input gives different results on each run.
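To make the cost point concrete, here is a minimal sketch of per-request cost computed from token counts. The price table is hypothetical: the model names and numbers are placeholders, not real provider prices. The names `PRICES_PER_MILLION` and `requestCostUsd` are illustrative and not part of Spanlens.

```typescript
// Hypothetical price table in USD per one million tokens.
// Model names and numbers are placeholders, not real provider prices.
type ModelPrice = { inputPerMillion: number; outputPerMillion: number };

const PRICES_PER_MILLION: Record<string, ModelPrice> = {
  "small-model": { inputPerMillion: 0.5, outputPerMillion: 2.0 },
  "large-model": { inputPerMillion: 5.0, outputPerMillion: 20.0 },
};

// Cost of one request: input and output tokens are billed at different rates,
// and the rates depend on the model variant.
function requestCostUsd(model: string, inputTokens: number, outputTokens: number): number {
  const price = PRICES_PER_MILLION[model];
  if (price === undefined) {
    throw new Error(`No price configured for model "${model}"`);
  }
  return (
    (inputTokens / 1000000) * price.inputPerMillion +
    (outputTokens / 1000000) * price.outputPerMillion
  );
}

// Same token counts, different model variant, very different cost.
const smallCost = requestCostUsd("small-model", 1000000, 500000);
const largeCost = requestCostUsd("large-model", 1000000, 500000);
```

With these placeholder prices the identical request costs ten times more on the larger variant, which is why cost has to be tracked per model rather than per HTTP call.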

The five categories Spanlens captures

| Category | Spanlens entity | Surface |
|---|---|---|
| Cost (USD per request, per model, per customer) | requests.cost_usd + tags | /requests, /savings |
| Latency (p50/p95/p99, TTFT for streaming) | requests.latency_ms, requests.time_to_first_token_ms | /requests |
| Quality (eval scores, judge↔human correlation) | evals, annotations | /evals, /annotation |
| Reliability (error rates, retry counts) | requests.status, requests.retry_count | /anomalies, /alerts |
| Security (PII matches, prompt injection, key leakage) | requests.pii_flags, security_events | /security |
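
The latency row reports p50/p95/p99. As a rough illustration of what those numbers mean, the sketch below computes percentiles over a list of `latency_ms` samples using the nearest-rank method. The choice of nearest-rank is an assumption; Spanlens may compute percentiles differently, and the sample values are made up.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[rank - 1];
}

// Made-up latency samples in milliseconds, with one slow outlier.
const latenciesMs = [120, 95, 310, 150, 2200, 180, 140, 130, 160, 170];
const p50 = percentile(latenciesMs, 50);
const p95 = percentile(latenciesMs, 95);
```

The median here stays near typical latency while p95 lands on the single slow call, which is the reason tail percentiles are tracked alongside p50.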

Three integration patterns

Spanlens supports the three patterns that dominate the space. Pick by stack, not by ideology.

1. Drop-in SDK

Swap the provider SDK import for a Spanlens-instrumented version. Same methods, same types. Fastest for single-language apps.

// Before
import OpenAI from 'openai'
const openai = new OpenAI()

// After
import { createOpenAI } from '@spanlens/sdk/openai'
const openai = createOpenAI()
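
For intuition about how a drop-in client can keep "same methods, same types", here is a conceptual sketch of a wrapper that preserves a function's signature while timing each call. It is not the Spanlens SDK's implementation; `instrument`, `CallRecord`, and `fakeCompletion` are hypothetical names used only for illustration.

```typescript
// Hypothetical record shape; the fields loosely echo latency and status
// from the table above but are assumptions.
type CallRecord = { name: string; latencyMs: number; status: "ok" | "error" };

const records: CallRecord[] = [];

// Wrap an async function so callers see the same signature,
// while each call is timed and recorded.
function instrument<A extends unknown[], R>(
  name: string,
  fn: (...args: A) => Promise<R>,
): (...args: A) => Promise<R> {
  return async (...args: A): Promise<R> => {
    const start = Date.now();
    try {
      const result = await fn(...args);
      records.push({ name, latencyMs: Date.now() - start, status: "ok" });
      return result;
    } catch (err) {
      records.push({ name, latencyMs: Date.now() - start, status: "error" });
      throw err; // preserve the original failure for the caller
    }
  };
}

// Stand-in for a provider call; a real client method would go here.
async function fakeCompletion(prompt: string): Promise<string> {
  return `echo: ${prompt}`;
}

const tracedCompletion = instrument("chat.completions.create", fakeCompletion);
```

Because the wrapper returns a function with the same parameter and return types, call sites such as `await tracedCompletion("hi")` type-check exactly as they did before wrapping.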

2. Proxy

Point the provider baseURL at the Spanlens proxy. Works from any language, including Ruby and Go, and from raw HTTP clients.

POST https://api.spanlens.io/proxy/openai/v1/chat/completions
Authorization: Bearer sl_live_...
Content-Type: application/json

{"model":"gpt-4o-mini","messages":[...]}
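
The same request can be sketched from TypeScript. The URL and headers are copied from the HTTP example above; the helper name `buildProxyRequest` and the placeholder key string are assumptions. The sketch only builds the arguments for `fetch` and does not send anything.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Proxy URL copied from the HTTP example above.
const PROXY_URL = "https://api.spanlens.io/proxy/openai/v1/chat/completions";

// Build the fetch arguments without sending, so the request shape is easy to inspect.
function buildProxyRequest(apiKey: string, model: string, messages: ChatMessage[]) {
  return {
    url: PROXY_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// "<your Spanlens key>" is a placeholder; load the real key from configuration.
const proxyRequest = buildProxyRequest("<your Spanlens key>", "gpt-4o-mini", [
  { role: "user", content: "Hello" },
]);
// To send it: const res = await fetch(proxyRequest.url, proxyRequest.init)
```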

3. OpenTelemetry

Emit OTLP/HTTP spans from your existing OTel pipeline. Best when you already have OTel exporters set up and want LLM spans to flow through the same infrastructure. See /docs/otel for setup.
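
For a sense of what an LLM span looks like on the wire, the sketch below hand-builds an OTLP/JSON trace payload for one model call. The attribute names follow the OpenTelemetry GenAI semantic conventions; whether Spanlens reads these exact attributes is an assumption, so treat /docs/otel as the authority. The service name, scope name, and helper names are hypothetical, and in practice an OTel SDK exporter would produce this payload for you.

```typescript
// Minimal OTLP/JSON attribute types (only the two value kinds used here).
type OtlpValue = { stringValue: string } | { intValue: string };
type OtlpAttribute = { key: string; value: OtlpValue };

// Random lowercase hex string; OTLP/JSON encodes trace and span IDs as hex.
function randomHex(byteCount: number): string {
  let out = "";
  for (let i = 0; i < byteCount; i++) {
    const byte = Math.floor(Math.random() * 256);
    out += ("0" + byte.toString(16)).slice(-2);
  }
  return out;
}

// Integer milliseconds to a nanosecond string by appending six zeros,
// which avoids precision loss from multiplying large numbers.
function msToNanoString(ms: number): string {
  return String(Math.floor(ms)) + "000000";
}

function buildLlmSpanPayload(
  model: string,
  inputTokens: number,
  outputTokens: number,
  startMs: number,
  endMs: number,
) {
  const attributes: OtlpAttribute[] = [
    { key: "gen_ai.request.model", value: { stringValue: model } },
    { key: "gen_ai.usage.input_tokens", value: { intValue: String(inputTokens) } },
    { key: "gen_ai.usage.output_tokens", value: { intValue: String(outputTokens) } },
  ];
  return {
    resourceSpans: [
      {
        resource: {
          // "my-app" is a placeholder service name.
          attributes: [{ key: "service.name", value: { stringValue: "my-app" } }],
        },
        scopeSpans: [
          {
            scope: { name: "manual-llm-instrumentation" }, // hypothetical scope name
            spans: [
              {
                traceId: randomHex(16), // 16 bytes = 32 hex characters
                spanId: randomHex(8), // 8 bytes = 16 hex characters
                name: "chat " + model,
                kind: 3, // SPAN_KIND_CLIENT
                startTimeUnixNano: msToNanoString(startMs),
                endTimeUnixNano: msToNanoString(endMs),
                attributes,
              },
            ],
          },
        ],
      },
    ],
  };
}

const payload = buildLlmSpanPayload("gpt-4o-mini", 120, 45, 1700000000000, 1700000000850);
```

A payload like this is sent as JSON in the body of an OTLP/HTTP POST with `Content-Type: application/json`; the endpoint to use for Spanlens is not shown here and is covered in /docs/otel.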

Where to go next

  • Data model: the eight entities that back every dashboard view.
  • Agent tracing: span tree structure and critical path.
  • Evals: scoring quality and tracking drift over prompt versions.
  • Prompt management: versioning, A/B, and statistical rollout decisions.