| Anthropic: Claude Sonnet 4.6 | cachedInput | Derived at 10% of the input rate; Anthropic publishes a 90% cache-hit discount on this tier. |
| Anthropic: Claude Sonnet 4.5 | cachedInput | Derived at 10% of the input rate; same 90% cache-hit convention as Sonnet 4.6. |
| Anthropic: Claude Sonnet 4.5 | batchInput | Derived at 50% of the standard input rate; Anthropic documents a uniform 50% Batch discount. |
| Anthropic: Claude Sonnet 4.5 | batchOutput | Derived at 50% of the standard output rate; Anthropic documents a uniform 50% Batch discount. |
| Anthropic: Claude Haiku 4.5 | cachedInput | Derived at 10% of the input rate; Anthropic 90% cache-hit discount convention. |
| OpenAI: GPT-5.4 Mini | cachedInput | Derived at 10% of input; OpenAI documents an automatic 90% discount on cache hits across the GPT-5.x tier. |
| OpenAI: GPT-5.4 Nano | cachedInput | Derived at 10% of input; OpenAI 90% cache-hit convention. |
| OpenAI: GPT-5.4 Nano | batchInput | Derived at 50% of input; OpenAI Batch API uniform 50% discount. |
| OpenAI: GPT-5.4 Nano | batchOutput | Derived at 50% of output; OpenAI Batch API uniform 50% discount. |
| OpenAI: GPT-5.4 Pro | cachedInput | Derived at 10% of input; OpenAI 90% cache-hit convention. |
| OpenAI: GPT-5.4 Pro | batchInput | Derived at 50% of input; OpenAI Batch API uniform 50% discount. |
| OpenAI: GPT-5.4 Pro | batchOutput | Derived at 50% of output; OpenAI Batch API uniform 50% discount. |
| OpenAI: GPT-5.2 | cachedInput | Derived at 10% of input; no residency uplift. |
| OpenAI: GPT-5.2 | batchInput | Derived at 50% of input. |
| OpenAI: GPT-5.2 | batchOutput | Derived at 50% of output. |
| OpenAI: GPT-5 | cachedInput | Derived at 10% of input. |
| OpenAI: GPT-5 | batchInput | Derived at 50% of input. |
| OpenAI: GPT-5 | batchOutput | Derived at 50% of output. |
| OpenAI: GPT-5.5 Pro | cachedInput | Derived at 10% of input; OpenAI does not publish a cached rate for `*-pro` models, so the family convention is used. |
| OpenAI: GPT-5.5 Pro | batchInput | Derived at 50% of input. |
| OpenAI: GPT-5.5 Pro | batchOutput | Derived at 50% of output. |
| OpenAI: GPT-5.2 Pro | cachedInput | Derived at 10% of input; pro-tier convention. |
| OpenAI: GPT-5.2 Pro | batchInput | Derived at 50% of input. |
| OpenAI: GPT-5.2 Pro | batchOutput | Derived at 50% of output. |
| OpenAI: GPT-5.1 | batchInput | Derived at 50% of input. |
| OpenAI: GPT-5.1 | batchOutput | Derived at 50% of output. |
| OpenAI: GPT-5 Pro | batchInput | Derived at 50% of input. |
| OpenAI: GPT-5 Pro | batchOutput | Derived at 50% of output. |
| OpenAI: GPT-5 Nano | cachedInput | Derived at 10% of input. |
| OpenAI: GPT-5 Nano | batchInput | Derived at 50% of input. |
| OpenAI: GPT-5 Nano | batchOutput | Derived at 50% of output. |
| Google: Gemini 3 Flash | cachedInput | Derived at 10% of input; Google caching discount convention (~90%). |
| Google: Gemini 3.1 Flash-Lite | cachedInput | Derived at 10% of input; Google caching convention. |
| Google: Gemini 3.1 Flash-Lite | batchInput | Derived at 50% of input; Google Batch API uniform 50% discount. |
| Google: Gemini 3.1 Flash-Lite | batchOutput | Derived at 50% of output; Google Batch API uniform 50% discount. |
| Google: Gemini 2.5 Pro | cachedInput | Derived at 10% of input. |
| Google: Gemini 2.5 Flash | cachedInput | Derived at 10% of input. |
| Google: Gemini 2.5 Flash-Lite | cachedInput | Derived at 10% of input; Google caching convention. |
| Google: Gemini 2.5 Flash-Lite | batchInput | Derived at 50% of input; Google Batch API uniform 50% discount. |
| Google: Gemini 2.5 Flash-Lite | batchOutput | Derived at 50% of output; Google Batch API uniform 50% discount. |
| Google: Gemini 2.0 Flash | cachedInput | Derived at 25% of input per Google 2.0 family caching rates. |
| Google: Gemini 2.0 Flash | batchInput | Derived at 50% of input; Google Batch API uniform 50% discount. |
| Google: Gemini 2.0 Flash | batchOutput | Derived at 50% of output; Google Batch API uniform 50% discount. |
| Google: Gemini 2.0 Flash-Lite | cachedInput | Derived at 10% of input; Google caching convention. |
| Google: Gemini 2.0 Flash-Lite | batchInput | Derived at 50% of input; Google Batch API uniform 50% discount. |
| Google: Gemini 2.0 Flash-Lite | batchOutput | Derived at 50% of output; Google Batch API uniform 50% discount. |
| xAI: Grok 4 (legacy) | cachedInput | Extrapolated at 25% of base. |
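
The derivations noted in the table can be sketched in a few lines of Python. This is a minimal illustration, not the code that produced these entries: the function name `derive_rates`, the constant names, and the base rates in the example are all hypothetical. Only the ratios (10% for cache hits, 25% for the two exceptions listed, 50% for batch) come from the notes above. The sketch computes all three fields for any model, whereas the table lists only the fields that were actually derived for each one.

```python
from decimal import Decimal

# Ratios taken from the notes in the table above.
CACHE_RATIO_DEFAULT = Decimal("0.10")   # 90% cache-hit discount convention
CACHE_RATIO_OVERRIDES = {               # rows noted at 25% instead of 10%
    "Gemini 2.0 Flash": Decimal("0.25"),
    "Grok 4 (legacy)": Decimal("0.25"),
}
BATCH_RATIO = Decimal("0.50")           # uniform 50% batch discount


def derive_rates(model: str, input_rate: str, output_rate: str) -> dict[str, Decimal]:
    """Derive cachedInput, batchInput and batchOutput from the standard rates.

    Rates are passed as strings so Decimal parses them exactly.
    """
    base_in = Decimal(input_rate)
    base_out = Decimal(output_rate)
    cache_ratio = CACHE_RATIO_OVERRIDES.get(model, CACHE_RATIO_DEFAULT)
    return {
        "cachedInput": base_in * cache_ratio,
        "batchInput": base_in * BATCH_RATIO,
        "batchOutput": base_out * BATCH_RATIO,
    }


# Hypothetical base rates (USD per million tokens), for illustration only;
# these are not published prices for any model.
example = derive_rates("Example Model", "2.00", "10.00")
override = derive_rates("Gemini 2.0 Flash", "2.00", "10.00")
```

`Decimal` is used rather than `float` so that derived prices do not pick up binary rounding error before they are stored or displayed.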