Traces
When your code makes one LLM call, a flat request log is enough. When it makes ten calls orchestrated across retrieval, generation, tool-use, and re-ranking, you need a tree. Spanlens Traces give you that tree — automatically when you wrap code in observe(), or manually with the low-level span API.
Why it matters
Agent workflows are hard to debug because the interesting failure is never in one call. It's in the interaction: retrieval returned garbage, so generation hallucinated, so the re-ranker picked a bad answer. Flat logs show three unrelated lines. A trace shows one tree with timings — immediately obvious where the bug lives.
LangSmith and LangFuse popularized this view. Spanlens delivers the same thing without requiring you to migrate to LangChain or adopt heavyweight decorators.
How it works
The data model
A trace groups related spans under one id. A span is any piece of async work — an LLM call, a vector DB search, a tool invocation, a custom function. Spans nest via `parent_span_id`, forming a tree.
trace: "user-session-abc123"
└── answer-question (1.8s)
├── retrieve (120ms)
├── generate (1.4s) ← where the time went
│ └── openai.chat.create (1.4s, $0.0043, gpt-4o-mini)
└── rerank (280ms)textEvery span records: start/end time, input, output (optional), status, and metadata. LLM spans automatically capture tokens, cost, model, and provider.
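For orientation, a span record looks roughly like this (a sketch, not the exact wire format; field names beyond those listed above are illustrative):

```ts
// Illustrative span shape; the actual wire format may differ.
interface Span {
  trace_id: string
  span_id: string
  parent_span_id?: string            // absent on root spans
  name: string                       // e.g. 'retrieve', 'generate'
  start_time: string
  end_time?: string                  // unset while the span is open
  input?: unknown
  output?: unknown                   // optional
  status: 'ok' | 'error'
  metadata?: Record<string, unknown>
  // captured automatically for LLM spans:
  model?: string                     // e.g. 'gpt-4o-mini'
  provider?: string                  // e.g. 'openai'
  tokens?: { input: number; output: number }
  cost_usd?: number                  // e.g. 0.0043
}
```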
Parallel spans are first-class
The database schema intentionally does NOT enforce a foreign key on `parent_span_id`. This lets you fire off parallel children, record them as they finish, and close the parent later — no ordering constraints. Essential for real agent code that runs `Promise.all([agentA(), agentB()])`.
Using it
Option 1 — observe() (recommended)
Wrap any async function. Nested `observe()` calls automatically become child spans:

```ts
import { observe } from '@spanlens/sdk'

// each observe() call opens a span; nesting produces the tree shown above
const answer = await observe('answer-question', async () => {
  const docs = await observe('retrieve', async () => {
    return vectorDb.search(query)
  })
  const response = await observe('generate', async () => {
    return openai.chat.completions.create({
      model: 'gpt-4o-mini',
      messages: buildMessages(docs),
    })
  })
  return response.choices[0].message.content
}, { trace: 'user-session-abc123' })
```

Option 2 — observeOpenAI() for single-call convenience
```ts
import { observeOpenAI } from '@spanlens/sdk/openai'

const res = await observeOpenAI(openai, {
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: question }],
}, { name: 'greeting', trace: 'session-1' })
```

Usage and cost are parsed automatically from the OpenAI response and attached to the span. The same wrappers exist for Anthropic (`observeAnthropic`) and Gemini (`observeGemini`).
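The Anthropic wrapper should look nearly identical. A minimal sketch, assuming `observeAnthropic` mirrors the OpenAI signature above (the import subpath and model id are illustrative, not confirmed by the docs):

```ts
import { observeAnthropic } from '@spanlens/sdk/anthropic' // assumed subpath, mirroring the OpenAI import

const res = await observeAnthropic(anthropic, {
  model: 'claude-3-5-sonnet-latest', // illustrative model id
  max_tokens: 1024,                  // required by the Anthropic API
  messages: [{ role: 'user', content: question }],
}, { name: 'greeting', trace: 'session-1' })
```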
Option 3 — Low-level handles (for parallel spans)
```ts
import { SpanlensClient } from '@spanlens/sdk'

const client = new SpanlensClient()
const trace = client.startTrace('multi-agent-workflow')

// open both spans up front; each closes independently as its agent finishes
const spanA = trace.startSpan('agent-a')
const spanB = trace.startSpan('agent-b')

const [resA, resB] = await Promise.all([
  runAgentA().then((r) => { spanA.end({ output: r }); return r }),
  runAgentB().then((r) => { spanB.end({ output: r }); return r }),
])

await trace.end()
```

Viewing traces in the dashboard
Open /traces. Each row is a trace, with total duration and span count. Click one to see the full tree: waterfall timeline, per-span latency, inputs/outputs, and direct links to the underlying /requests row for any LLM span.
Design choices worth knowing
- Fire-and-forget ingest. `startTrace()` and `startSpan()` return synchronously; network POSTs to Spanlens run in the background. Your request hot path is never blocked by span ingest — typical overhead is under 1ms per span call.
- Client-generated UUIDs. Idempotent — if your retry loop calls `span.end()` twice with the same UUID, the second call is a server-side no-op. No duplicated spans.
- Edge-compatible. Uses only `fetch` and `crypto.randomUUID()`. Works in Vercel Edge, Cloudflare Workers, Deno, Bun, and Node 18+.
- Errors don't break your request. The default `silent: true` swallows span-ingest failures. Provide an `onError` hook on `SpanlensClient` if you want visibility.
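A sketch of opting into error visibility; the constructor options object here is an assumption based on the `silent` default and `onError` hook named above:

```ts
import { SpanlensClient } from '@spanlens/sdk'

// assumed options shape; silent defaults to true (ingest failures are swallowed)
const client = new SpanlensClient({
  silent: false,
  onError: (err) => console.warn('Spanlens span ingest failed:', err),
})
```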
Limitations
- No zoom or pan yet. The waterfall fits the trace duration to the viewport width. For traces with thousands of spans you'll want to drill into a sub-tree — that lives on the roadmap.
- Inline label hides on narrow bars. Spans that take less than ~8% of total duration show only as a colored sliver; hover for the precise timing tooltip, or click to open the side panel.
- No OpenTelemetry export yet. If your team standardizes on OTel, you can't pipe Spanlens spans into Datadog/Honeycomb today. Planned for Phase 5.
- Trace IDs are opaque strings. We don't yet enforce W3C traceparent format — so linking Spanlens traces to your app's APM traces requires you to pass the same id to both (see the sketch after this list).
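A minimal sketch of that workaround, assuming your app runs under OpenTelemetry with an active span; `run` is a placeholder for your traced function:

```ts
import { trace as otel } from '@opentelemetry/api'
import { observe } from '@spanlens/sdk'

// reuse the active OTel trace id so both systems share one identifier
const traceId =
  otel.getActiveSpan()?.spanContext().traceId ?? crypto.randomUUID()

const answer = await observe('answer-question', run, { trace: traceId })
```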
Related: Requests (flat log), @spanlens/sdk (API reference), /traces dashboard.