AI API Pricing Comparison
Every major model's API cost in one table. Prices per million tokens, updated monthly.
Gemini 2.0 Flash is the cheapest quality option at $0.10/M input. For flagship models, GPT-4o offers the best price-to-performance ratio. Claude Opus 4 is the most expensive but leads in long-document analysis and coding tasks.
All Models — March 2026
| Model | Provider | Input $/M | Output $/M | Context | Best For |
|---|---|---|---|---|---|
| GPT-4o | OpenAI | $2.50 | $10.00 | 128K | Multimodal, general |
| Claude Opus 4 | Anthropic | $15.00 | $75.00 | 200K | Long docs, analysis |
| Claude Sonnet 4 | Anthropic | $3.00 | $15.00 | 200K | Balanced performance |
| Gemini 2.5 Pro | Google | $1.25 | $10.00 | 1M | Massive context |
| o3 | OpenAI | $10.00 | $40.00 | 200K | Complex reasoning |
| o3-mini | OpenAI | $1.10 | $4.40 | 200K | Budget reasoning |
| Grok 3 | xAI | $3.00 | $15.00 | 131K | Real-time data |
| Llama 4 Maverick | Meta | $0.20 | $0.60 | 1M | Cost efficiency |
| Mistral Large 2 | Mistral | $2.00 | $6.00 | 128K | EU compliance |
| Qwen 3 235B | Alibaba | $0.80 | $3.20 | 128K | Multilingual |
| Command R+ | Cohere | $2.50 | $10.00 | 128K | RAG / Enterprise |
| Claude Haiku 4.5 | Anthropic | $0.80 | $4.00 | 200K | Fast & cheap |
| GPT-4o-mini | OpenAI | $0.15 | $0.60 | 128K | Budget tasks |
| Gemini 2.0 Flash | Google | $0.10 | $0.40 | 1M | Cheapest quality |
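The per-token arithmetic behind the table is simple to apply in code. A minimal sketch (model names and the `request_cost` helper are illustrative; prices are copied from the table above, per million tokens):

```python
# Cost of a single API request, given per-million-token prices.
# (input $/M, output $/M) — values copied from the table above.
PRICES = {
    "gpt-4o": (2.50, 10.00),
    "claude-opus-4": (15.00, 75.00),
    "gemini-2.0-flash": (0.10, 0.40),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one request for a model in PRICES."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: a 10K-token prompt with a 1K-token reply on GPT-4o:
# 10,000 × $2.50/M + 1,000 × $10.00/M = $0.025 + $0.010 = $0.035
cost = request_cost("gpt-4o", 10_000, 1_000)
```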
Key Takeaways
- Cheapest flagship: Gemini 2.5 Pro at $1.25/$10.00, with GPT-4o close behind at $2.50/$10.00
- Best budget: Gemini 2.0 Flash at $0.10/$0.40 — 25x cheaper than GPT-4o
- Largest context: Gemini and Llama 4 at 1M tokens (2,500 pages)
- Open source winner: Llama 4 Maverick at $0.20/$0.60 via API providers
- Most expensive: Claude Opus 4 at $15/$75 — justified for complex analysis
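Because input and output are priced differently, a single blended rate makes models easier to rank for a given workload. A sketch assuming a hypothetical 3:1 input-to-output token mix (the ratio is an assumption, not from the table; prices are copied from the table above):

```python
# Rank a few models by blended $/M tokens under an assumed
# 3:1 input:output token mix. Prices copied from the table above.
PRICES = {
    "Gemini 2.0 Flash": (0.10, 0.40),
    "GPT-4o-mini": (0.15, 0.60),
    "Llama 4 Maverick": (0.20, 0.60),
    "GPT-4o": (2.50, 10.00),
    "Claude Opus 4": (15.00, 75.00),
}

def blended(in_price: float, out_price: float, ratio: float = 3.0) -> float:
    """Weighted $/M for `ratio` input tokens per output token."""
    return (ratio * in_price + out_price) / (ratio + 1)

ranking = sorted(PRICES, key=lambda m: blended(*PRICES[m]))
# Gemini 2.0 Flash stays cheapest; Claude Opus 4 stays most expensive.
```

Under this mix, GPT-4o blends to $4.375/M versus $0.175/M for Gemini 2.0 Flash, so the headline gap between flagship and budget tiers survives any realistic input/output ratio.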
Frequently Asked Questions
What are tokens?
Tokens are chunks of text — roughly 3/4 of a word. "Hello world" is 2 tokens. Pricing is per million tokens (M), so $1/M input means processing 750,000 words costs about $1.
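The words-to-cost conversion above can be written out directly. A sketch using the rough 3/4-word-per-token rule from the answer (the helper names are illustrative):

```python
def words_to_tokens(words: int) -> int:
    """Estimate tokens from a word count (1 token ≈ 0.75 words)."""
    return round(words / 0.75)

def input_cost(words: int, price_per_million: float) -> float:
    """Estimated input cost in dollars at a given $/M token rate."""
    return words_to_tokens(words) * price_per_million / 1_000_000

# 750,000 words ≈ 1,000,000 tokens → about $1.00 at a $1/M input rate.
cost = input_cost(750_000, 1.00)
```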
Why is output more expensive than input?
Generating text (output) requires more computation than reading it (input). The model has to make decisions for each token it produces, which is more GPU-intensive than encoding input.
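This asymmetry means the input/output split of a workload, not just its total token count, drives the bill. A quick check using GPT-4o's prices from the table (the token counts are hypothetical):

```python
IN_PRICE, OUT_PRICE = 2.50, 10.00  # GPT-4o $/M, from the table above

def cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a request at GPT-4o rates."""
    return (input_tokens * IN_PRICE + output_tokens * OUT_PRICE) / 1_000_000

# Same 101K total tokens, opposite mixes:
summarize = cost(100_000, 1_000)  # input-heavy: $0.25 + $0.01 = $0.26
generate = cost(1_000, 100_000)   # output-heavy: $0.0025 + $1.00 = $1.0025
# The output-heavy job costs roughly 4x more for the same total tokens.
```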
Which model offers the best value?
For most use cases, Gemini 2.0 Flash offers the best quality-per-dollar ratio. For tasks requiring top-tier reasoning, Claude Sonnet 4 offers the best balance of capability and cost.