Record every OpenAI, Anthropic, and Gemini call: cost, latency, tokens, full request and response. Then score quality, run experiments, catch anomalies and PII, and ship cheaper models with proof.
baseURL and you're done.
Per-team, per-model, per-route cost. Daily rollups. Budget alerts by Slack or webhook. One place to answer “why did our OpenAI bill jump?”
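To make the rollup idea concrete, here is a minimal Python sketch of a per-team, per-model daily cost rollup with a budget check. It is an illustration only, not Spanlens code: the record fields (`team`, `model`, `day`, `cost_usd`), the helpers `daily_rollup` and `over_budget`, and all sample figures are hypothetical.

```python
from collections import defaultdict

# Hypothetical call records; field names, dates, and costs are invented
# for illustration and do not reflect any real log schema.
CALLS = [
    {"team": "search", "model": "gpt-4o", "day": "2024-05-01", "cost_usd": 0.50},
    {"team": "search", "model": "gpt-4o", "day": "2024-05-01", "cost_usd": 0.25},
    {"team": "support", "model": "gpt-4o-mini", "day": "2024-05-01", "cost_usd": 0.125},
    {"team": "search", "model": "gpt-4o", "day": "2024-05-02", "cost_usd": 1.00},
]


def daily_rollup(calls):
    """Sum cost per (day, team, model) key."""
    totals = defaultdict(float)
    for call in calls:
        totals[(call["day"], call["team"], call["model"])] += call["cost_usd"]
    return dict(totals)


def over_budget(rollup, daily_budget_usd):
    """Return the (day, team, model) keys whose daily total exceeds the budget."""
    return sorted(key for key, total in rollup.items() if total > daily_budget_usd)


rollup = daily_rollup(CALLS)
alerts = over_budget(rollup, daily_budget_usd=0.70)
```

Each entry in `alerts` is a key that a real system might forward to Slack or a webhook; the delivery step is left out here because it depends on the deployment.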
Multi-step agents as waterfall trees. Critical path, cost attribution, and latency outliers, highlighted automatically.
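One simple way to think about the critical path in an agent trace is sketched below in Python. This is a deliberately simplified assumption, not a description of how Spanlens computes it: the `Span` class, its fields, and the rule "at each level, follow the child that finishes last" are all hypothetical, and the rule only fits cases where sibling spans run in parallel.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    """Hypothetical span record: a named step with start/end times in ms."""
    name: str
    start_ms: int
    end_ms: int
    children: List["Span"] = field(default_factory=list)

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def critical_path(span: Span) -> List[str]:
    """Walk down the tree, at each level following the child that ends last.

    Simplification: assumes the last-finishing child is what gates its parent.
    """
    path = [span.name]
    while span.children:
        span = max(span.children, key=lambda s: s.end_ms)
        path.append(span.name)
    return path


# Made-up trace: "retrieve" and "answer" run in parallel after "plan".
root = Span("agent", 0, 900, [
    Span("plan", 0, 200),
    Span("retrieve", 200, 500),
    Span("answer", 200, 900, [
        Span("moderation", 250, 400),
        Span("llm_call", 250, 850),
    ]),
])

path = critical_path(root)
```

In this made-up trace the path runs through `answer` and `llm_call`, so shortening `retrieve` or `moderation` would not reduce the end-to-end latency.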
Cost and latency tell you what happened. Spanlens tells you whether it got better. Capture real traffic into datasets, score it with an LLM judge or your own team, then run the next prompt version against it before you ship.
Run Spanlens in your cluster with Docker Compose or a single binary. Prompts and completions never leave your network.
Projects isolate workloads, roles and invitations manage the whole team, and an audit log records every change. Wire Spanlens into your stack with webhooks and alerts.
Free while you're small. Flat monthly fee, not per seat. Self-host is free forever.
30-second setup. Your first 50,000 requests are on us. Cancel anytime. There's nothing to cancel.
- import OpenAI from 'openai'
- const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })
+ import { createOpenAI } from '@spanlens/sdk/openai'
+ const openai = createOpenAI()
const res = await openai.chat.completions.create({ ... })

- from openai import OpenAI
- client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
+ from spanlens.integrations.openai import create_openai
+ client = create_openai()
res = client.chat.completions.create(...)