Vercel AI SDK integration

The Vercel AI SDK exposes onStepFinish and onFinish callbacks on every call shape (generateText, streamText, generateObject, streamObject). createSpanlensTracker returns those two callbacks ready to spread directly into the AI SDK options, so a two-line change records the span, token usage, model name, and multi-step tool topology to /traces without touching the rest of the call. It works with AI SDK 4.x and 5.x via a duck-typed payload check, and it has no peer dependency on the ai package.

Install

pnpm add @spanlens/sdk
# the integration is exposed as a sub-path import — no extra peer dep

Minimal setup

import { generateText } from 'ai'
import { openai } from '@ai-sdk/openai'
import { SpanlensClient } from '@spanlens/sdk'
import { createSpanlensTracker } from '@spanlens/sdk/vercel-ai'

const client = new SpanlensClient({ apiKey: process.env.SPANLENS_API_KEY! })
const tracker = createSpanlensTracker({ client, modelName: 'gpt-4o' })

const result = await generateText({
  model: openai('gpt-4o'),
  messages: [{ role: 'user', content: 'Summarise the latest release notes.' }],
  onStepFinish: tracker.onStepFinish,  // optional — captures intermediate tool steps
  onFinish:     tracker.onFinish,       // required — closes the span with token totals
})

A fresh trace is opened the moment createSpanlensTracker is called, so latency is measured from the caller's perspective, not just from when the model starts emitting. The span closes when onFinish fires.
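To make the timing behavior concrete, here is a minimal, self-contained sketch of a tracker with the same lifecycle. It is not the Spanlens implementation: createSketchTracker, SpanRecord, and the injectable now clock are hypothetical names used only for illustration.

```typescript
// Hypothetical sketch of the lifecycle described above; not the real SDK code.
type SpanRecord = {
  name: string
  startedAt: number
  endedAt?: number
  steps: number
}

function createSketchTracker(modelName: string, now: () => number = Date.now) {
  // The clock starts at construction, before the AI SDK call is even issued.
  const span: SpanRecord = { name: `llm.${modelName}`, startedAt: now(), steps: 0 }

  return {
    span,
    // Fires once per intermediate step (for example, a tool round trip).
    onStepFinish: (_step: unknown): void => {
      span.steps += 1
    },
    // Fires once when the AI SDK call completes; this closes the span.
    onFinish: (_result: unknown): void => {
      span.endedAt = now()
    },
    // Latency as seen by the caller; undefined until onFinish has fired.
    latencyMs: (): number | undefined =>
      span.endedAt === undefined ? undefined : span.endedAt - span.startedAt,
  }
}
```

Because the start time is taken at construction, any work done between creating the tracker and calling the model counts toward the reported latency. Creating the tracker immediately before the call keeps the number close to the model round trip.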

What gets captured

AI SDK call	Captured	Notes
`generateText`	single span, prompt + completion tokens, model, latency, finish reason	token usage read from `response.usage`
`streamText`	same shape — `onFinish` fires after the stream closes with the totals	partial tokens during the stream are not split into sub-spans (kept flat)
`generateObject` / `streamObject`	span output carries the structured object as text via `JSON.stringify(...)`	token usage is identical to the text-mode siblings
Multi-step tool calls (`maxSteps` > 1)	`steps` count in span metadata; one final span per call	individual tool calls are not split out (keeps the trace tree small)
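The output and step-count rows in the table above can be sketched as two small helpers. This is an illustration under assumptions: FinishPayload, toSpanOutput, and stepCount are hypothetical names, and the text, object, and steps fields are assumed to be present on the callback payload where applicable.

```typescript
// Hypothetical payload shape; only the fields this sketch reads are listed.
type FinishPayload = {
  text?: string      // assumed: set by text-mode calls
  object?: unknown   // assumed: set by object-mode calls
  steps?: unknown[]  // assumed: set when the call ran multiple steps
}

// Text calls pass their text through; object calls are serialized to text.
function toSpanOutput(payload: FinishPayload): string {
  if (typeof payload.text === 'string') return payload.text
  if (payload.object !== undefined) return JSON.stringify(payload.object)
  return ''
}

// A call without a steps array is treated as a single step.
function stepCount(payload: FinishPayload): number {
  return Array.isArray(payload.steps) ? payload.steps.length : 1
}
```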

Trace tree shape

A two-step tool-using call (model picks a tool, runs it, then writes the final answer) produces one span by default — the multi-step structure is recorded as metadata.steps on the same span so the trace tree stays readable in the common case:

Trace: ai.generate            (3.2s)
└── llm.gpt-4o                (3.2s, steps=2, gpt-4o, 480/220 tokens, $0.0034)

To break tool calls out into their own spans, pass a parent trace and wrap your tool execution in trace.span(...) explicitly — see Attaching to a long-lived trace below.

Attaching to a long-lived trace

By default the tracker opens a fresh trace on each AI call and closes it when onFinish fires. To group multiple generateText turns under a single trace (chat sessions, agent loops, RAG pipelines), pass an existing trace at construction — the tracker leaves its lifecycle entirely to the caller:

const trace = client.startTrace({
  name: 'chat-session',
  metadata: { user_id: user.id, session_id: sessionId },
})

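// `conversation` and `history` come from your application state.
for (const userMessage of conversation) {
  // One tracker per turn; every turn attaches to the shared trace.
  const tracker = createSpanlensTracker({ client, trace, modelName: 'gpt-4o' })

  history.push({ role: 'user', content: userMessage })
  const { text } = await generateText({
    model: openai('gpt-4o'),
    messages: history,
    onFinish: tracker.onFinish,
  })
  // Keep the assistant reply so the next turn sees the full conversation.
  history.push({ role: 'assistant', content: text })
}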

await trace.end({ status: 'completed' })

Each turn lands as a child llm.gpt-4o span under the parent trace, so the chat appears as a single waterfall on /traces.

Pairing with the proxy for accurate cost

The tracker reads token totals from the AI SDK callback payload, which is reliable on non-streaming calls but occasionally drifts on the early AI SDK 5.x betas. For authoritative billing-grade cost numbers, route the underlying provider through the Spanlens proxy and the matching Request row will always carry the canonical figure:

import { createOpenAI } from '@ai-sdk/openai'

const openai = createOpenAI({
  apiKey: process.env.SPANLENS_API_KEY!,
  baseURL: 'https://api.spanlens.io/proxy/openai/v1',
})

const result = await generateText({
  model: openai('gpt-4o'),
  // ... onFinish: tracker.onFinish, etc.
})

With the proxy in place, every model call lands as a Request in /requests with the authoritative cost, and the matching llm.gpt-4o span links to it via request_id automatically.

Linking spans to prompt versions

Use the proxy approach above and attach a default header so the call is tagged with a Spanlens Prompts version. The matching Request row carries prompt_version_id, so the A/B view can compare versions on real traffic:

const openai = createOpenAI({
  apiKey: process.env.SPANLENS_API_KEY!,
  baseURL: 'https://api.spanlens.io/proxy/openai/v1',
  headers: { 'x-spanlens-prompt-version': 'chatbot-system@3' },
})

Verifying the integration

1. Make one generateText call with the tracker wired up.
2. Open /traces. A new trace appears with name ai.generate (or your custom traceName).
3. Click into the trace. One llm.<modelName> span sits underneath with token counts and computed cost on the right panel.
4. If you wired the proxy as well, the span links to the matching Request in /requests via request_id. Open it to see the raw request and response bodies.

Troubleshooting

Span shows zero tokens

AI SDK 4.x reports promptTokens / completionTokens; AI SDK 5.x renamed the fields to inputTokens / outputTokens. The tracker accepts both shapes via a fallback chain, so a zero count usually means the underlying provider didn't emit usage at all (some streaming responses on the AI SDK 5.x betas, certain Bedrock backends). Routing through the Spanlens proxy (see above) recovers the authoritative number from the raw stream.
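A fallback chain of this kind might look like the following sketch. The field names come from the paragraph above; UsageLike and readUsage are hypothetical names, and the tracker's actual code may differ.

```typescript
// Usage fields as named by each AI SDK major version.
type UsageLike = {
  promptTokens?: number      // AI SDK 4.x
  completionTokens?: number  // AI SDK 4.x
  inputTokens?: number       // AI SDK 5.x
  outputTokens?: number      // AI SDK 5.x
}

// Prefer the 5.x names, fall back to the 4.x names, then to zero.
function readUsage(payload: { usage?: UsageLike } | null | undefined) {
  const usage: UsageLike = payload?.usage ?? {}
  const input = usage.inputTokens ?? usage.promptTokens ?? 0
  const output = usage.outputTokens ?? usage.completionTokens ?? 0
  return { input, output, total: input + output }
}
```

With this shape, a zero total only appears when neither naming scheme is present, which matches the case where the provider emitted no usage at all.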

Trace closes before tool calls finish

The auto-managed trace closes when onFinish fires, which is at the end of the AI SDK call, not the end of your application logic. If you kick off background work (DB writes, downstream API calls) after the LLM returns, pass an external trace via the trace option and call trace.end() yourself when all work is done. See Attaching to a long-lived trace.
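One way to guarantee the external trace is closed only after all follow-up work is a small wrapper like the sketch below. TraceLike and runWithTrace are hypothetical names, the interface covers only the end call shown earlier on this page, and the 'error' status value is an assumption rather than a documented one.

```typescript
// Minimal stand-in for a caller-managed trace; only end() is needed here.
interface TraceLike {
  end(opts: { status: string }): Promise<void> | void
}

// Runs the LLM call plus any follow-up work, then closes the trace exactly once.
async function runWithTrace<T>(trace: TraceLike, work: () => Promise<T>): Promise<T> {
  let status = 'completed'
  try {
    return await work()
  } catch (err) {
    status = 'error' // assumed status value for failures
    throw err
  } finally {
    // Runs after work() settles, so background steps inside it are covered.
    await trace.end({ status })
  }
}
```

Anything awaited inside work, such as the model call, database writes, or downstream requests, finishes before the trace ends. Work that is started but not awaited still escapes the trace.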

Multi-step trace looks flat

By design, the tracker keeps multi-step tool calls collapsed into one span with a steps count in metadata — most users want a readable trace, not a fan-out of tool noise. To break each tool call into its own span, drop the tracker for that portion and use trace.span({name: 'tool.search', spanType: 'tool'}) inside your tool implementation.
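A wrapper along these lines keeps the span bookkeeping out of each tool body. It is a sketch under assumptions: only the trace.span({ name, spanType }) call is taken from the text above, while SpanHandle, its end() method, TraceWithSpans, and withToolSpan are hypothetical and should be checked against the SDK's actual span API.

```typescript
// Assumed shape of the object returned by trace.span(...).
interface SpanHandle {
  end(): void
}

// Minimal stand-in for a trace that can open child spans.
interface TraceWithSpans {
  span(opts: { name: string; spanType: string }): SpanHandle
}

// Opens a tool span, runs the tool, and closes the span even if the tool throws.
async function withToolSpan<T>(
  trace: TraceWithSpans,
  toolName: string,
  run: () => Promise<T>,
): Promise<T> {
  const span = trace.span({ name: `tool.${toolName}`, spanType: 'tool' })
  try {
    return await run()
  } finally {
    span.end()
  }
}
```

Inside a tool implementation this would be called as withToolSpan(trace, 'search', () => doSearch(query)), where doSearch stands for your own tool logic.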

TypeScript complains about callback signatures

The tracker is duck-typed against the AI SDK 4.x and 5.x payloads, so the parameter types are intentionally loose (any-equivalent on the framework side). If your TS config rejects implicit unknowns, cast the callbacks explicitly: onFinish: tracker.onFinish as Parameters<typeof generateText>[0]['onFinish'].

Next: LangGraph integration for multi-agent workflows, or data model for what ends up in ClickHouse.