CrewAI integration

CrewAI orchestrates multiple agents that each have their own role, goal, backstory, and tool set. Spanlens captures the full crew execution tree: one root span per crew kickoff, one child span per agent task, and one leaf span per LLM call or tool invocation. Per-agent cost and latency surface in the trace view so you can see which agent burned the budget.

Install

pip install spanlens crewai crewai-tools
bash

Minimal setup

CrewAI internally uses LangChain for LLM calls. Attach the Spanlens LangChain callback handler to each agent's LLM, and the trace flows through the crew automatically.

from crewai import Agent, Task, Crew
from langchain_openai import ChatOpenAI
from spanlens import SpanlensClient
from spanlens.langchain import SpanlensCallbackHandler

client = SpanlensClient()
handler = SpanlensCallbackHandler(client=client)

llm = ChatOpenAI(model="gpt-4o-mini", callbacks=[handler])

researcher = Agent(
    role="Researcher",
    goal="Find the latest on a topic.",
    backstory="You read primary sources first.",
    llm=llm,
)

writer = Agent(
    role="Writer",
    goal="Turn research into a short brief.",
    backstory="You write for technical readers.",
    llm=llm,
)

research_task = Task(description="Research LLM observability tools.", agent=researcher)
write_task = Task(description="Write a brief from the research.", agent=writer)

crew = Crew(agents=[researcher, writer], tasks=[research_task, write_task])
result = crew.kickoff()
python

What gets captured

CrewAI eventSpanlens span
Crew kickoffkind="trace" (root)
Task executionkind="agent_step" with agent role as attribute
LLM call inside a taskkind="llm"
Tool call from an agentkind="tool"
Delegation between agentskind="agent_step" with delegation_target attribute

Hierarchical vs sequential crews

Spanlens renders both crew modes correctly. Sequential crews produce a linear span chain; hierarchical crews produce a tree where the manager agent has child spans per delegated task. Critical path computation works the same way for both.

Per-agent cost attribution

Spans are tagged with the agent role, so the /users and savings views aggregate cost per agent. For a crew where one researcher agent calls gpt-4o while writers use gpt-4o-mini, you see the cost breakdown automatically.

Where to go next