Direct proxy (any language)

If you're not using the TypeScript SDK, you can still use Spanlens by pointing any OpenAI / Anthropic / Gemini client at our proxy URL. Works with Python, Ruby, Go, Rust, Java, PHP, or raw HTTP.

⚡ Use streaming for long requests

The proxy enforces a 25-second first-byte timeout. Any request expected to take longer (large max_tokens, slow models, JSON mode with big outputs) must use stream: true. Streaming sidesteps the timeout entirely — first byte arrives in ~200ms regardless of total duration. If you need a single JSON object back, accumulate chunks server-side and return the merged string to your client (the “internal streaming” pattern).
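The "internal streaming" pattern could look roughly like the sketch below, which uses the official openai Python package. The helper names merge_stream and complete_via_stream are illustrative only and not part of Spanlens or the SDK, and the sketch assumes a text-only chat completion.

```python
import os


def merge_stream(chunks):
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        # Some chunks (for example a trailing usage chunk) have no choices.
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        if text:  # role-only and finish chunks carry no content
            parts.append(text)
    return "".join(parts)


def complete_via_stream(prompt, model="gpt-4o-mini"):
    """Stream from the proxy, but hand the caller a single merged string."""
    from openai import OpenAI  # imported lazily so merge_stream has no dependency

    client = OpenAI(
        api_key=os.environ["SPANLENS_API_KEY"],
        base_url="https://spanlens-server.vercel.app/proxy/openai/v1",
    )
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # bytes start flowing early, so the first-byte timeout is not hit
    )
    return merge_stream(stream)
```

Your own endpoint can then return the result of complete_via_stream(...) as an ordinary non-streaming response, parsing it as JSON first if you requested JSON mode.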

How it works

Spanlens exposes a 1:1 compatible proxy at:

https://spanlens-server.vercel.app/proxy/openai/v1
https://spanlens-server.vercel.app/proxy/anthropic
https://spanlens-server.vercel.app/proxy/gemini/v1beta

Send requests exactly as you would to the real provider, with two changes:

Base URL — point your SDK at the Spanlens proxy
API key — use your Spanlens API key (starts with sl_live_) instead of the provider's. The real provider key is pulled from your registered keys server-side.

Python — OpenAI

import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["SPANLENS_API_KEY"],
    base_url="https://spanlens-server.vercel.app/proxy/openai/v1",
)

res = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hi"}],
)


Python — Anthropic

import os

from anthropic import Anthropic

client = Anthropic(
    api_key=os.environ["SPANLENS_API_KEY"],
    base_url="https://spanlens-server.vercel.app/proxy/anthropic",
)

msg = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hi"}],
)


curl — raw HTTP

curl https://spanlens-server.vercel.app/proxy/openai/v1/chat/completions \
  -H "Authorization: Bearer $SPANLENS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hi"}]
  }'

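For runtimes without a provider SDK, the same request as the curl call can be built with nothing but the Python standard library. This is a minimal sketch: build_chat_request and send are hypothetical helper names, and the response shape is assumed to match the OpenAI chat completions format since the proxy is described as 1:1 compatible.

```python
import json
import urllib.request

PROXY_URL = "https://spanlens-server.vercel.app/proxy/openai/v1/chat/completions"


def build_chat_request(api_key, prompt, model="gpt-4o-mini"):
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        PROXY_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # Spanlens key, not the provider key
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Calling send(build_chat_request(os.environ["SPANLENS_API_KEY"], "Hi")) should return the completion as a dict; for requests that may run long, prefer the streaming variant so the first-byte timeout does not apply.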

Ruby

require "openai"

client = OpenAI::Client.new(
  access_token: ENV["SPANLENS_API_KEY"],
  uri_base: "https://spanlens-server.vercel.app/proxy/openai",
)

res = client.chat(parameters: {
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hi" }],
})


Go

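package main

import (
    "context"
    "fmt"
    "log"
    "os"

    openai "github.com/sashabaranov/go-openai"
)

func main() {
    config := openai.DefaultConfig(os.Getenv("SPANLENS_API_KEY"))
    config.BaseURL = "https://spanlens-server.vercel.app/proxy/openai/v1"
    client := openai.NewClientWithConfig(config)
    res, err := client.CreateChatCompletion(context.Background(), openai.ChatCompletionRequest{
        Model: "gpt-4o-mini",
        Messages: []openai.ChatCompletionMessage{
            {Role: openai.ChatMessageRoleUser, Content: "Hi"},
        },
    })
    if err != nil {
        log.Fatal(err) // proxy and provider errors surface here; don't discard them
    }
    fmt.Println(res.Choices[0].Message.Content)
}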

Streaming

Server-Sent Events streaming works transparently. Spanlens tees the stream — one copy flows to you in real time, the other is parsed asynchronously to extract token usage. Latency overhead is negligible (10–50ms).
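If you consume the stream over raw HTTP rather than through an SDK, the lines can be decoded with a small parser like the sketch below. It assumes the OpenAI chat completions stream format (data: lines carrying JSON, terminated by data: [DONE]); iter_sse_json is a hypothetical helper name, and other providers frame their events differently.

```python
import json


def iter_sse_json(lines):
    """Yield decoded JSON payloads from OpenAI-style SSE lines until [DONE]."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream sentinel
        yield json.loads(payload)
```

An HTTP response object that iterates line by line (for example the one returned by urllib.request.urlopen) can be passed in directly as lines.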

Passing project / metadata

Add an X-Spanlens-Project header to tag requests with a project scope:

-H "X-Spanlens-Project: my-backend-service"

Add an X-Spanlens-Prompt-Version header to link the request to a specific prompt version so it appears in the A/B comparison table. Accepts name@version, name@latest, or a raw UUID:

-H "X-Spanlens-Prompt-Version: chatbot-system@3"
# or
-H "X-Spanlens-Prompt-Version: chatbot-system@latest"
# or
-H "X-Spanlens-Prompt-Version: ae1c3c1e-99eb-2b98-5f05-012345678901"

Invalid or unknown values silently resolve to null — the proxy never fails because a prompt tag is stale. The request just isn't linked to a version.
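From an SDK, both headers can be attached per request. The sketch below builds them with a hypothetical helper, spanlens_headers, and passes them through the extra_headers argument that the openai Python package accepts on each call; tagged_chat is likewise an illustrative name.

```python
def spanlens_headers(project=None, prompt_name=None, prompt_version="latest"):
    """Build the optional Spanlens tagging headers for one request."""
    headers = {}
    if project:
        headers["X-Spanlens-Project"] = project
    if prompt_name:
        # Produces name@version or name@latest, the formats described above.
        headers["X-Spanlens-Prompt-Version"] = f"{prompt_name}@{prompt_version}"
    return headers


def tagged_chat(client, messages, model="gpt-4o-mini", **tags):
    """Send a chat completion through the proxy with tagging headers attached."""
    return client.chat.completions.create(
        model=model,
        messages=messages,
        extra_headers=spanlens_headers(**tags),
    )
```

For example, tagged_chat(client, [{"role": "user", "content": "Hi"}], project="my-backend-service", prompt_name="chatbot-system", prompt_version=3) sends both headers. To tag every request from a client instead, the same dict can be given to the client constructor as default_headers.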

Self-hosting

If you're running Spanlens on your own infra, replace the base URL:

https://your-spanlens-domain.com/proxy/openai/v1

See self-hosting for Docker deployment.
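One way to switch between the hosted proxy and your own deployment without touching call sites is to derive the base URL from configuration. In this sketch the environment variable SPANLENS_ORIGIN and the helper proxy_base_url are hypothetical names, and the Anthropic and Gemini paths on a self-hosted instance are assumed to mirror the hosted ones.

```python
import os

HOSTED_ORIGIN = "https://spanlens-server.vercel.app"

# Provider paths as listed for the hosted proxy.
PROXY_PATHS = {
    "openai": "/proxy/openai/v1",
    "anthropic": "/proxy/anthropic",
    "gemini": "/proxy/gemini/v1beta",
}


def proxy_base_url(provider, origin=None):
    """Return the proxy base URL for a provider, hosted or self-hosted."""
    # SPANLENS_ORIGIN is a variable name chosen for this sketch, not a Spanlens setting.
    origin = origin or os.environ.get("SPANLENS_ORIGIN", HOSTED_ORIGIN)
    return origin.rstrip("/") + PROXY_PATHS[provider]
```

Pass the result as base_url when constructing the client, for example base_url=proxy_base_url("openai").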
