Claude 3.5 Sonnet pricing

Anthropic's flagship working model. $3.00 per 1M input, $15.00 per 1M output, with cache reads at 10% of input price.

Input

$3.00

per 1M tokens

Cached input: $0.300 per 1M

Cache write: $3.75 per 1M

Output

$15.00

per 1M tokens

Provider: Anthropic
Context window: 200K tokens
Max output: 8K tokens
Released: 2024-06, refreshed 2024-10

What Claude 3.5 Sonnet is best for

✓Complex multi-turn agents and tool use
✓Long-form writing and editing
✓Code generation and review
✓Long-context document QA (up to 200K tokens)
✓Workflows that benefit from aggressive prompt caching

Monthly cost scenarios

Real-world estimates at common usage levels. Numbers assume no caching, no batching, and the standard tier price.

Use case	In / req	Out / req	Reqs / mo	Monthly
Coding assistant	4,000	1,500	20,000	$690.00
Document QA	50,000	800	5,000	$810.00
Production agent	2,500	1,000	100,000	$2250.00
Long-form writing	1,500	3,000	10,000	$495.00
Long-context summarizer	80,000	1,200	3,000	$774.00

Alternatives to Claude 3.5 Sonnet

Claude 3.5 Haiku

Smaller Anthropic model at $0.80 input / $4.00 output. About 4x cheaper. Use for high-volume narrow tasks where Sonnet quality is overkill.

GPT-4o

OpenAI frontier at $2.50 input / $10 output. Slightly cheaper than Sonnet, comparable quality. Pick based on which produces better outputs on your eval.

Claude Opus 4

Anthropic flagship at $15 input / $75 output (where available). 5x the cost of Sonnet. Use only for the hardest reasoning steps; route the rest to Sonnet.

Track Claude 3.5 Sonnet usage with Spanlens

Spanlens captures every Claude 3.5 Sonnet call with input + output tokens, exact cost, latency, and full request body. One line of code or a baseURL swap. Open source MIT licensed, self-hostable.

Start free →Anthropic integration guide Official pricing ↗

FAQ

What is the Claude 3.5 Sonnet cost per 1M tokens in 2026?

Claude 3.5 Sonnet is priced at $3.00 per 1M input tokens and $15.00 per 1M output tokens. Cache reads are $0.30 per 1M (10% of input) and cache writes are $3.75 per 1M (25% premium over input).

How does Anthropic prompt caching pricing work?

Anthropic charges a one-time cache write at $3.75 per 1M tokens (input + 25%), then 10% of base input price on every cache read until the cache expires (default 5 minutes). For a shared system prompt across many requests, the break-even point is roughly 2-3 cache hits.

What is the Claude 3.5 Sonnet context window?

200K tokens of input, 8K maximum output. The 200K context window is one of the largest among production models — particularly useful for document-grounded tasks.

How does Claude 3.5 Sonnet compare to GPT-4o on cost?

GPT-4o is slightly cheaper ($2.50 input / $10 output vs Anthropic's $3.00 / $15). However, Anthropic's prompt caching at 10% of input makes Sonnet cheaper than GPT-4o for any workload with shared context (e.g. RAG with stable system prompts).

Can I run Claude 3.5 Sonnet on AWS Bedrock?

Yes. Bedrock offers Claude 3.5 Sonnet at the same per-token pricing under most regions. Bedrock authentication uses SigV4 with your AWS credentials. Spanlens works with both direct Anthropic and Bedrock endpoints.

How do I monitor Claude 3.5 Sonnet usage in production?

Capture every Messages API call with input + output + cache token breakdown, latency, model variant, and cost. Aggregate by prompt version to catch caching regressions. Spanlens does this in one line — see /integrations/anthropic.

Last updated 2026-06-16. Prices in USD at the standard tier. Spot something out of date? Tell us.