OpenAI Assistants API integration
The Assistants API is organized around threads, runs, and run steps. Spanlens captures the full hierarchy: each thread becomes a trace, each run becomes an agent_step span, and each step (message creation, tool call, code interpreter invocation) becomes a child span of its run. Spans inherit the OpenAI thread_id, run_id, and step_id as tags so you can pivot or filter on any of them.
Setup
The Spanlens OpenAI drop-in instruments the underlying HTTP layer, so all Assistants API endpoints are captured automatically. No extra wiring per thread or per run.
```ts
import { createOpenAI } from '@spanlens/sdk/openai'

const openai = createOpenAI()

const assistant = await openai.beta.assistants.create({
  name: 'Support agent',
  model: 'gpt-4o-mini',
  tools: [{ type: 'code_interpreter' }],
})

const thread = await openai.beta.threads.create()

await openai.beta.threads.messages.create(thread.id, {
  role: 'user',
  content: 'Plot the last week of revenue from this CSV.',
})

const run = await openai.beta.threads.runs.createAndPoll(thread.id, {
  assistant_id: assistant.id,
})
```
What gets captured
| Assistants API resource | Spanlens entity | Notes |
|---|---|---|
| Thread | trace | thread_id is the trace_id; one thread, one trace. |
| Run | span (kind="agent_step") | One span per run. Includes model, usage, status. |
| Message creation step | span (kind="llm") | Full input/output captured. |
| Tool call step (function) | span (kind="tool") | Tool name, arguments, output captured. |
| Code interpreter step | span (kind="tool") with subtype="code" | Generated code and stdout captured. |
| File search / retrieval step | span (kind="tool") with subtype="retrieval" | Vector store ID and result count captured. |
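The mapping in the table can be sketched as a small function. This is an illustration only, not Spanlens's actual implementation: `SpanDescriptor` and `describeStep` are hypothetical names, the step types are a reduced subset of the OpenAI run step object, and emitting one descriptor per tool call inside a `tool_calls` step is an assumption.

```typescript
// Hypothetical span descriptor mirroring the "Spanlens entity" column.
interface SpanDescriptor {
  kind: 'llm' | 'tool'
  subtype?: 'code' | 'retrieval'
  name?: string
}

// Reduced subset of the OpenAI run step object: only the fields the mapping needs.
type ToolCall =
  | { type: 'function'; function: { name: string } }
  | { type: 'code_interpreter' }
  | { type: 'file_search' }

type RunStep =
  | { type: 'message_creation' }
  | { type: 'tool_calls'; step_details: { tool_calls: ToolCall[] } }

// Maps one run step to the span descriptors the table describes.
// Assumption: a tool_calls step yields one descriptor per tool call.
function describeStep(step: RunStep): SpanDescriptor[] {
  if (step.type === 'message_creation') {
    return [{ kind: 'llm' }]
  }
  return step.step_details.tool_calls.map((call): SpanDescriptor => {
    switch (call.type) {
      case 'code_interpreter':
        return { kind: 'tool', subtype: 'code' }
      case 'file_search':
        return { kind: 'tool', subtype: 'retrieval' }
      case 'function':
        return { kind: 'tool', name: call.function.name }
    }
  })
}
```

For example, a step containing a code interpreter call and a function call named `lookup_order` would produce two tool descriptors, the first with subtype `code` and the second carrying the function name.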
Multi-run thread cost
For a long-lived thread that runs the assistant multiple times, Spanlens aggregates cost across runs at the thread level. The trace view shows the cumulative cost for the thread and how it splits per user message, which is the unit most product teams want to bill on.
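To make the aggregation concrete, here is a minimal sketch of a thread-level rollup with a per-user-message split. The `RunSpan` shape and its field names (`threadId`, `userMessageId`, `costUsd`) are assumptions made for illustration; they are not a documented Spanlens export format.

```typescript
// Hypothetical run-level span record; field names are assumptions.
interface RunSpan {
  threadId: string
  runId: string
  userMessageId: string // the user message that triggered this run
  costUsd: number
}

interface ThreadCost {
  totalUsd: number
  perUserMessageUsd: Record<string, number>
}

// Sums run costs for one thread and splits the total by triggering user message.
function aggregateThreadCost(spans: RunSpan[], threadId: string): ThreadCost {
  const result: ThreadCost = { totalUsd: 0, perUserMessageUsd: {} }
  for (const span of spans) {
    if (span.threadId !== threadId) continue // ignore runs from other threads
    result.totalUsd += span.costUsd
    const previous = result.perUserMessageUsd[span.userMessageId] ?? 0
    result.perUserMessageUsd[span.userMessageId] = previous + span.costUsd
  }
  return result
}
```

A production rollup would likely store costs as integer micro-dollars rather than floating-point dollars, so that sums over many runs do not accumulate rounding error.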
Streaming runs
Streaming runs work without extra setup. The drop-in consumes the server-sent events (SSE) stream and emits one span per message delta, plus a final aggregating span for the full run.
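A streaming run might look like the sketch below. It relies on the OpenAI Node SDK's `runs.stream` helper with its `textDelta` event and `finalRun()` method. The `streamRunText` function and the two `...Like` interfaces are hypothetical: they declare only the parts of the client this example touches, and the assumption is that the client returned by `createOpenAI()` satisfies them.

```typescript
// Structural types covering only the SDK surface used below (assumed shapes).
interface RunStreamLike {
  on(event: 'textDelta', listener: (delta: { value?: string }) => void): unknown
  finalRun(): Promise<{ id: string; status: string }>
}

interface AssistantsClientLike {
  beta: {
    threads: {
      runs: {
        stream(threadId: string, body: { assistant_id: string }): RunStreamLike
      }
    }
  }
}

// Streams one run and returns the concatenated assistant text.
// No tracing code appears here; capture is expected to happen inside the client.
async function streamRunText(
  client: AssistantsClientLike,
  threadId: string,
  assistantId: string,
): Promise<{ runId: string; status: string; text: string }> {
  const stream = client.beta.threads.runs.stream(threadId, {
    assistant_id: assistantId,
  })
  let text = ''
  stream.on('textDelta', (delta) => {
    text += delta.value ?? '' // deltas may arrive without a text value
  })
  const run = await stream.finalRun() // resolves once the stream has ended
  return { runId: run.id, status: run.status, text }
}
```

Calling `await streamRunText(openai, thread.id, assistant.id)` with the client from the setup section would be the intended usage, under the assumption stated above.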