Proud NVIDIA Inception AI Startup
AI COST INTELLIGENCE PUTS YOU IN CONTROL

AICost.ai helps you reduce costs, gain visibility, plan and forecast.

Get answers to your AI cost issues:

👉 Start here — Results in 10 minutes

Browse 46 calculators by intent

Pick a path — every link goes straight to the calculator.

🚀 New · Public Beta · For AI agents · MCP server

Ask AI cost questions inside Claude, ChatGPT, Cursor & Perplexity — natively.

AICost.ai has shipped a Model Context Protocol (MCP) server. Plug it into your favorite AI assistant and it can call our 48+ calculators mid-conversation. No more switching tabs to look up pricing: your AI answers with verified numbers and cites the source.

✓ Working today
  • Claude.ai Pro/Max/Enterprise
  • ChatGPT Plus/Pro/Team (OAuth)
  • Perplexity Pro/Max (OAuth)
  • Cursor · Continue · Zed · Cody · Goose
🔮 Coming next
  • Hybrid pricing (subscription vs API)
  • TCO + ROI playbooks for enterprise
  • Domain calcs (healthcare, finance, dev)
  • Self-serve API keys at aicost.ai/account
📨
Want a beta invite?
Email [email protected] with the AI client you'd use (Claude / ChatGPT / Cursor / Perplexity / etc.) and we'll send setup instructions. Free during beta.

Open standard · Model Context Protocol · live at https://mcp.aicost.ai
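
For readers curious what a tool call looks like on the wire: an MCP client invokes a server tool with a JSON-RPC 2.0 `tools/call` request. The sketch below only builds that message. The tool name `llm_cost_estimate` and its arguments are hypothetical placeholders, not the actual tools exposed at mcp.aicost.ai, and a real client also performs the initialize handshake and handles transport and auth.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical tool name and arguments, for illustration only.
message = build_tool_call(
    "llm_cost_estimate",
    {"model": "example-model", "input_tokens": 1200, "output_tokens": 300},
)
print(message)
```

In practice the AI client builds and sends this message for you; the sketch is only meant to show how little is involved in a single calculator call.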

📊 Comprehensive AI Vendor Pricing Guides

Interactive pricing breakdowns with model positioning cards, full rate variants (batch, caching, long context), subscription plans, partner programs, and community-verified gotchas. Verified daily, cross-checked against vendor docs.

2026 Focus: Agentic AI cost

Agent loops, multi-step workflows, voice agents, full stacks.

See agentic tools →

2026 Focus: RAG architecture cost

Pipeline, embeddings, chunking, hybrid search.

See RAG tools →
Free calculators · no signup · verified pricing

45+ calculators. Every AI cost question answered.

Pricing across 25 LLMs, 9 embedding models, 7 vision models, 12 audio services. Take a number to your CFO that holds up to scrutiny.

🧭

AI TCO + ROI Framework

NEW · vendor-agnostic · 90 sec

Total cost of ownership across 6 pillars. Workload × vertical × cloud aware. Combines our 38 calcs where precision matters with industry-typical ranges (cited) where it doesn't. Tool handoffs to ToolsInfo so YOU pick the vendor.

AI Subscription Strategy: June 15 Pivot

NEW · Anthropic SDK credit · multi-vendor

On June 15, 2026, Anthropic's Agent SDK credit goes live. Every Pro / Max / Team Premium / Enterprise Premium plan starts including a separate monthly SDK credit equal to the plan's base price. Pro at $20/mo effectively becomes $40 of value. GitHub Copilot transitions to usage-based billing two weeks earlier on June 1. Your team's subscription math just changed. Two new calculators answer the question every builder and FP&A director will be asked this quarter.
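
The "Pro at $20/mo effectively becomes $40 of value" arithmetic can be sketched in a few lines. The first function follows the rule stated above (credit equal to the plan's base price). The second is an assumption added for illustration: it treats unused credit as worth nothing, which may or may not match how the credit is actually accounted for.

```python
def effective_monthly_value(base_price: float) -> float:
    """Plan price plus a separate SDK credit equal to the base price."""
    sdk_credit = base_price  # rule as described in the text above
    return base_price + sdk_credit


def realized_monthly_value(base_price: float, sdk_spend: float) -> float:
    """Assumption: the credit only counts for the portion actually used."""
    used_credit = min(sdk_spend, base_price)
    return base_price + used_credit


print(effective_monthly_value(20.0))      # 40.0
print(realized_monthly_value(20.0, 5.0))  # 25.0
```

The gap between the two numbers is the practical question for a team: the headline value only materializes if the SDK credit is actually consumed.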

🚀

Integrated Stack Calculators & Playbooks

3 tools · NEW · start here

Get the full integrated cost on one screen. Each stack composes the right atomic calcs (planner + executor + verifier; ingest + storage + queries; STT + LLM + TTS) and surfaces what-if savings. Playbooks walk you through step by step.
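
As a rough illustration of composing atomic costs into one stack figure, here is a minimal sketch for a voice stack (STT + LLM + TTS). All rates, units, and the `what_if` helper are hypothetical placeholders, not figures or logic taken from the calculators.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Component:
    name: str
    unit_cost: float  # USD per unit (hypothetical placeholder rate)
    units: float      # units consumed per call

    def cost(self) -> float:
        return self.unit_cost * self.units


def stack_cost(components: list[Component]) -> float:
    """Total cost of one call through every component in the stack."""
    return sum(c.cost() for c in components)


def what_if(components: list[Component], name: str, new_unit_cost: float) -> float:
    """Savings from repricing one component while holding the rest fixed."""
    swapped = [
        replace(c, unit_cost=new_unit_cost) if c.name == name else c
        for c in components
    ]
    return stack_cost(components) - stack_cost(swapped)


voice_stack = [
    Component("STT", unit_cost=0.006, units=3.0),  # per audio minute
    Component("LLM", unit_cost=0.002, units=4.0),  # per 1K tokens
    Component("TTS", unit_cost=0.015, units=1.5),  # per 1K characters
]
total = stack_cost(voice_stack)
savings = what_if(voice_stack, "LLM", new_unit_cost=0.001)
print(round(total, 4), round(savings, 4))
```

Keeping each component as its own record makes the what-if step a one-line swap, which is the main reason to compose rather than hard-code a single blended rate.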

🧮

Gen AI Text Pricing Calculators

6 tools · start here
💸

AI Cost Optimization Calculators

5 tools · highest leverage
💰

AI Workload Finance & Planning Calculators

12 tools · for CFOs & founders
🛠

Specialty AI Workload Calculators

1 tool · for builders
👤

Consumer & Personal AI Calculators

6 tools · for individuals & creators
🛡️

Compliance & Enterprise Calculators

1 tool · for enterprise
Coming soon
🛡️
Compliance Cost Delta
HIPAA, SOC2, PCI, EU AI Act overhead on AI spend. Private endpoints, logging, residency.
Coming soon
SLA Tier Cost
Enterprise uptime + support tier deltas. Provisioned throughput vs on-demand.
Browse all 46 tools →
Pricing verified 2026-04-17 · Methodology + sources shown on every tool · No signup required
INSTANT ANSWERS

Or tell the Genie what’s going on. Get the right framework

Instant routing to the right product line, the exact playbook, and the tools that match your problem.

The six pillars of the AI cost invisible bill

Your AI bill isn't just tokens. It shows up across six dimensions — most of which never appear on a dashboard.

When enterprises track “AI cost,” they count tokens, GPU hours, and API calls. But the true cost extends far beyond the invoice.

  • Hallucinations create legal liability.
  • Missed compliance brings regulatory fines.
  • Silent model drift erodes quality for months.
  • PII leakage triggers breach events.
  • Overbuilt MLOps kills R&D budgets.
  • Rogue agent loops generate $20K overnight bills.

These are the dimensions of the invisible bill.

The categories below are pulled from our sister site ToolsInfo.com — 115K+ workflow tools that pair with AI cost optimization.

💰
Financial AI Cost: The Dollars
Your bill is no longer predictable. Token volatility + GPU inflation = budget surprises that kill roadmaps.
1,004 tools · 33 categories

from sister website ToolsInfo.com

Every AI feature has a per-request compute cost. Scale from 1K to 1M users and your bill can grow 10x while revenue grows only 2x. This is where traditional FinOps playbooks break: they were written before inference became the biggest line item.

⚙️ AI Cost & FinOps 167

Manage cloud spending with automated AI cost optimization

🪪 Cloud Cost Management 116

Scale your cloud footprint without exceeding your budget.

🛠️ Reserved Instance Management 96

Boost cloud savings by automating the lifecycle of your instances

📁 Spot Instance Management 92

Scale cloud workloads while cutting compute costs by up to 90%

🔐 FinOps Platforms 76

Manage cloud costs to maximize ROI and eliminate wasted budget

🏷️ Spot Bidding Tools 31

Scale workloads affordably with automated spot market bidding

⚙️ Cloud Spend Management 28

Manage cloud budgets to eliminate waste and increase margins

🛠️ Showback Tools 26

Track cloud spending by department to drive team accountability

🧭 FinOps Reporting 25

Track cloud investments to maximize your engineering ROI.

🪧 Savings Plans Optimization 24

Close the gap on cloud overspending with automated savings

🔍 LLM Cost Analytics 23

Track AI spending to optimize your operational margins

🛠️ Cloud Cost Visibility 22

Track every cloud dollar to eliminate wasteful spending.

🧩 Cloud Cost Optimization 20

Boost your cloud ROI by eliminating waste and hidden costs

📈 Commitment Management 20

Boost cloud savings by automating long-term discount strategies

📊 Model Cost Optimization 20

Manage AI expenses and maximize ROI by eliminating wasted compute

🛠️ AI Budget Management 19

Manage AI costs with granular visibility and governance tools.

📡 Cloud Unit Economics 17

Track profitability by linking cloud costs to business value.

📡 Preemptible VM Tools 16

Boost compute capacity while cutting costs using preemptible VMs.

🏷️ Spot Automation 16

Automate spot bidding to maximize savings and ensure app stability.

🪪 Reserved Capacity Planning 15

Scale savings by predicting and securing future capacity needs.

📁 AI Token Tracking 15

Track LLM expenses to slash your monthly API bills immediately

📈 Cloud Financial Management 14

Track cloud spending to eliminate waste and maximize your budget.

🛠️ RI Utilization Tools 14

Scale your cloud ROI by eliminating wasted reserved capacity

🎯 Spot Fleet Management 14

Manage spot fleets to cut compute costs without sacrificing uptime.

🪶 Cloud Chargeback 14

Track cloud spending by department to drive financial accountability.

📊 RI Management Tools 12

Automate RI lifecycle management to slash cloud infrastructure costs.

🪪 RI Marketplace Tools 11

Streamline RI liquidity to recover capital from unused commitments

⚙️ Spot Interruption Handling 10

Automate workload migration to slash cloud costs without downtime

📡 Cloud Cost Intelligence 9

Track cloud spend to eliminate waste and maximize your ROI

🪪 FinOps Automation 7

Scale cloud resources efficiently while automatically reducing monthly spend

📡 Cost Anomaly Detection 6

Track unexpected cloud spend to prevent budget overruns.

🔍 Spot Instance Tools 5

Scale infrastructure on spot instances to slash cloud costs.

📡 FinOps Tools 4

Boost cloud ROI by aligning engineering spend with business goals.

See all 33 categories →
🎯
Reliability AI Cost: Hallucinations & Drift
When your AI gets it wrong, the bill shows up in legal, reputation, and trust, not your AWS invoice.
196 tools · 6 categories

Hallucinations cost enterprises $14,200 per employee annually in human-in-the-loop verification time. Bias drift in regulated industries triggers fines in the billions. Every AI output that reaches a customer is a liability event waiting to happen.

🪶 LLM Testing & Evaluation 125

Streamline model launches with automated accuracy scoring

📡 LLM Benchmarking Tools 27

Manage AI performance by choosing the best model for your needs

📡 Prompt Regression Testing 18

Automate LLM testing to ensure consistent output quality at scale

🔐 LLM Eval Frameworks 12

Automate AI quality checks to ship production-ready models faster

📁 AI Output Validation 10

Boost AI accuracy and prevent hallucinations in your production

🔐 AI Quality Assurance 4

Track model performance to deliver reliable AI and build trust

See all 6 categories →
⚖️
Governance AI Cost: Compliance & Regulation
EU AI Act, HIPAA, SOC 2, state-level laws. Non-compliance isn't a fine; it's a business closure event.
275 tools · 6 categories

Every regulated industry (healthcare, finance, legal, government) is rebuilding AI programs to stay compliant. Miss an audit trail, skip a model card, fail to document bias — and you're not just losing money, you're losing the right to operate.

📡 AI Governance & Compliance 165

Streamline regulatory audits with automated AI governance

🔍 EU AI Act Compliance 28

Manage regulatory risk and ensure total AI Act compliance

🪶 AI Model Documentation 28

Streamline governance with automated model documentation

🪪 AI Audit Trail Tools 28

Track data lineage to maintain full audit accountability

🛡️ AI Policy Management 20

Automate regulatory compliance to reduce legal risk across teams

📁 AI Risk Management 6

Manage enterprise risk and deploy compliant AI with confidence

See all 6 categories →
🛡️
Privacy & Security AI Cost: The Breach Bill
PII leaked in a prompt. Prompt injection compromises your agent. Shadow AI exfiltrates data nobody authorized.
445 tools · 11 categories

The fastest way to lose 10x what you saved on AI infrastructure is a data leak through a prompt. Clean rooms, PII detection, red-team testing, and prompt injection defense aren't optional — they're the difference between AI as competitive advantage and AI as breach vector.

🔍 AI Data Privacy 149

Manage compliance risks and protect sensitive data automatically

📐 AI Security & Red Teaming 146

Track model vulnerabilities to prevent breaches and ensure safety

🪧 Differential Privacy Tools 25

Scale data sharing safely without compromising user privacy

🔧 AI Data Masking 23

Automate data privacy to secure sensitive information

🧩 AI Jailbreak Prevention 22

Manage LLM security risks to ensure safe and compliant AI

🛠️ AI Data Anonymization 20

Scale data usage safely while protecting sensitive user privacy

🛡️ AI Penetration Testing 20

Manage AI security risks by identifying vulnerabilities early

📊 AI Model Security 18

Manage AI security risks and prevent model data leaks effortlessly

🔐 LLM Vulnerability Scanning 15

Close security gaps by identifying model flaws before launch

🪪 Prompt Injection Defense 5

Automate LLM security to protect your data and brand reputation

📐 AI PII Detection 2

Detect PII in prompts and outputs before sending or storing.

See all 11 categories →
🔧
MLOps & Operational AI Cost: The Engineering Bill
Fine-tuning compute. Pipeline orchestration. Model versioning. The hidden "prod-ready" cost few price in.
257 tools · 5 categories

A single domain fine-tune run costs $8K-$60K in compute alone, plus a $200K/year MLE to supervise. Moving a model from notebook to production requires MLOps infrastructure most teams underestimate by 3-5x. This is where R&D budgets die.

🔧 MLOps & Model Lifecycle 194

Manage the entire ML lifecycle to ship reliable models faster

🔐 AI Model Registry 24

Manage the full ML lifecycle to ensure reliable model performance

🧩 AI Pipeline Orchestration 20

Automate end-to-end ML lifecycles to accelerate model delivery

📦 Feature Store Platforms 13

Streamline model deployment with consistent, real-time ML data

🧮 Model Versioning Tools 6

Track experiment history and reproduce successful models instantly

See all 5 categories →
🔍
Observability AI Cost: What You Can't See
Rogue agent loops. Silent model degradation. Prompt inefficiency. Costs that accumulate while you sleep.
2,069 tools · 61 categories

A single infinite-loop agent can rack up $20K overnight. Silent model drift reduces quality for months before anyone notices. Without observability purpose-built for AI workloads, you're flying blind — and the bill is how you learn.
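
One common defense against the runaway-loop scenario is a hard spend and step cap checked before every model or tool call. The sketch below is a minimal illustration under stated assumptions: `SpendGuard`, `BudgetExceeded`, and the per-step cost estimate are hypothetical names and numbers, and a production guard would also reconcile estimates against actual billed usage.

```python
class BudgetExceeded(RuntimeError):
    """Raised when an agent run would exceed its spend or step budget."""


class SpendGuard:
    def __init__(self, max_usd: float, max_steps: int) -> None:
        self.max_usd = max_usd
        self.max_steps = max_steps
        self.spent = 0.0
        self.steps = 0

    def authorize(self, estimated_usd: float) -> None:
        """Call before each invocation; raises instead of overspending."""
        if self.steps + 1 > self.max_steps:
            raise BudgetExceeded(f"step limit {self.max_steps} reached")
        if self.spent + estimated_usd > self.max_usd:
            raise BudgetExceeded(f"spend cap ${self.max_usd:.2f} would be exceeded")
        self.steps += 1
        self.spent += estimated_usd


def run_agent(guard: SpendGuard, step_cost_usd: float) -> str:
    """Stand-in for an agent that never decides it is done."""
    try:
        while True:
            guard.authorize(step_cost_usd)
            # ... call the model or tool here ...
    except BudgetExceeded as exc:
        return f"stopped: {exc}"


guard = SpendGuard(max_usd=1.00, max_steps=50)
outcome = run_agent(guard, step_cost_usd=0.30)
print(outcome, guard.steps, round(guard.spent, 2))
```

Checking before the call, rather than after, means the cap is never crossed, at the price of relying on a cost estimate for each step.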

🪪 AI Agents & Automation 227

Automate complex workflows to save hours of manual effort

📦 AI Observability & Monitoring 206

Track model performance to prevent costly production errors

📁 Prompt Engineering Tools 202

Build better AI responses with systematic prompt optimization

📦 AI & LLM Platforms 166

Automate complex workflows with custom AI that scales with you

📊 AIOps Platforms 119

Automate incident response to prevent outages before they happen

🛠️ LLM Routers & Gateways 114

Route queries across providers: failover, cost optimization, rate-limit handling.

📁 Distributed Tracing 81

Track request paths to resolve microservice bottlenecks faster

🧭 APM Tools 76

Boost application uptime to ensure a seamless user experience

🔐 Infrastructure Monitoring 60

Track resource health to prevent downtime and reduce cloud costs

🧮 Log Management 58

Track system logs to resolve critical errors before users notice

🪧 Anomaly Detection Ops 34

Streamline incident response by identifying hidden performance gaps

📡 AI Personal Assistants 33

Automate daily scheduling to reclaim your productive hours

📊 LLM Fine-Tuning Platforms 28

Build proprietary models that provide superior accuracy

📦 IT Operations AI 26

Boost system uptime by predicting and resolving IT issues faster.

🛡️ AI Chatbot Builders 25

Build intelligent agents to resolve customer issues instantly

🧮 AI Workflow Automation 25

Automate complex processes to reclaim hours of manual work

📦 LLM Evaluation & Testing 24

Manage AI model accuracy to deliver reliable user experiences.

🧭 Voice AI & Speech 24

Build human-like voice experiences to increase user engagement.

🧮 RAG Frameworks & Tools 23

Boost AI accuracy with real-time data to eliminate hallucinations.

🔍 AI Copilots & Assistants 23

Streamline software delivery and ship complex features faster.

🛡️ SIEM Log Management 20

Manage security risks by turning raw logs into actionable data

🧮 Monitoring & Observability 20

Streamline troubleshooting to keep your systems running smoothly.

📦 Open Source LLMs 20

Build private AI models to maintain complete data sovereignty

🧮 AIOps Tools 20

Automate incident response to reduce manual toil and resolve issues

🔧 Multimodal AI Platforms 20

Boost decision-making by analyzing video, text, and audio data

📈 LLM API Providers 19

Build intelligent features faster without training custom models

🧩 Auto-Remediation 18

Automate incident response to maintain 24/7 system availability

📊 Backend Monitoring 17

Boost application uptime by identifying server-side bottlenecks

🏷️ OpenTelemetry Platforms 17

Streamline system observability with vendor-neutral data.

🔧 Service Dependency Mapping 16

Track service connections to prevent outages during system changes.

🔐 AI Guardrails & Safety 16

Manage LLM risks and ensure brand safety in every interaction

🪪 Infrastructure Dashboards 15

Streamline troubleshooting with real-time system visibility

📡 Real User Monitoring 15

Boost conversion rates by optimizing real-world site performance

🧮 Distributed APM 14

Track distributed requests to find and fix performance bottlenecks

📡 APM Platforms 13

Boost app performance by resolving system bottlenecks fast

🧩 Transaction Tracing 13

Boost application performance by identifying slow user transactions

🛡️ Log Management Platforms 13

Track system logs in real-time to resolve production issues faster.

🪶 Span Analysis 13

Streamline debugging by isolating latency in specific spans

🔍 Root Cause Analysis AI 12

Streamline troubleshooting to reduce mean time to recovery

🛡️ AI Agent Platforms 12

Build autonomous workflows that execute complex business tasks

⚙️ LLM Hosting & Inference 12

Scale AI performance with high-speed hosting and low latency

📁 Error Tracking 12

Track software bugs and fix issues before users complain

🛠️ Trace Analytics 11

Track request flows to resolve complex performance issues

🔐 Distributed Tracing Tools 11

Track request paths to pinpoint bottlenecks in complex microservices

📐 Full-Stack APM 10

Boost application performance and reduce MTTR with total visibility.

📈 Centralized Logging 10

Close visibility gaps by aggregating all logs into a single view

🛡️ Network Monitoring Cloud 10

Streamline cloud traffic to eliminate latency and ensure connectivity

🪧 Metrics Collection 10

Automate metrics collection to make proactive scaling decisions

⚙️ Conversational AI Platforms 10

Automate customer support to resolve inquiries instantly at scale.

🧮 Infrastructure Monitoring Tools 9

Scale infrastructure confidently while ensuring maximum system uptime

🏷️ Incident Prediction 9

Boost system uptime by preventing critical IT failures before they happen

🎯 Cloud-Native Logging 9

Manage logs effortlessly across dynamic, ephemeral cloud systems

📊 Cloud Resource Monitoring 8

Track resource utilization to optimize performance and cut costs.

⚙️ Server Monitoring 8

Track server health to prevent downtime before it impacts customers.

🔐 Application Performance Monitoring 7

Boost application uptime and deliver a seamless user experience.

📦 Uptime Monitoring 7

Track system health in real-time to ensure maximum service reliability.

🛠️ AI Agent Frameworks 6

Scale autonomous workflows to handle complex business tasks.

📊 AI Browser Automation 6

Automate repetitive web tasks to save hours of manual work.

🪪 Log Aggregation 5

Streamline system debugging with unified log visibility.

🧩 Log Analytics 1

Search and analyze logs: Elasticsearch, OpenSearch, Loki.

🪶 Status Pages 1

Build customer trust through transparent real-time updates.

See all 61 categories →

Live cost intelligence & cloud connectivity

Weekly-refreshed AI cost intelligence by persona/vendor, 2.5 years of pricing history (charted), and one-click cloud cost connectors for AWS, Azure, and GCP.

🔌

Live Cloud Cost Connectors

LIVE · real billing data

Stop guessing and connect your real cloud bills. You create a read-only IAM role via CloudFormation Quick-Create, so no credentials are shared. We pull Cost Explorer + Bedrock data daily, surface savings recommendations, and feed real numbers into every calculator on this site.

Live
☁️
Connect Your AWS Account
Cross-account read-only IAM role via CloudFormation Quick-Create. We pull Cost Explorer + Bedrock data daily and surface savings.
Use: Real AWS billing data · Bedrock cost breakdown · daily savings recs
Saves: 25-60% via 7 automated optimization rules (model swaps, region, etc.)
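
For a sense of what a daily read-only pull might involve, here is a minimal sketch built around the AWS Cost Explorer `GetCostAndUsage` operation (exposed in boto3 as `get_cost_and_usage`). This is not the connector's actual code. The `SERVICE` filter value `"Amazon Bedrock"` is an assumption about how the service appears in your billing data, pagination via `NextPageToken` is omitted, and the stub client exists only so the sketch runs without AWS credentials.

```python
from datetime import date, timedelta


def bedrock_cost_request(days: int, today: date) -> dict:
    """Build keyword arguments for Cost Explorer's GetCostAndUsage call."""
    start = today - timedelta(days=days)
    return {
        # End date is exclusive in Cost Explorer.
        "TimePeriod": {"Start": start.isoformat(), "End": today.isoformat()},
        "Granularity": "DAILY",
        "Metrics": ["UnblendedCost"],
        # Assumed service name; check the value used in your own account.
        "Filter": {"Dimensions": {"Key": "SERVICE", "Values": ["Amazon Bedrock"]}},
    }


def daily_costs(ce_client, days: int, today: date) -> list[tuple[str, float]]:
    """Return (day, USD) pairs from any client exposing get_cost_and_usage."""
    response = ce_client.get_cost_and_usage(**bedrock_cost_request(days, today))
    return [
        (r["TimePeriod"]["Start"], float(r["Total"]["UnblendedCost"]["Amount"]))
        for r in response["ResultsByTime"]
    ]


class _StubCostExplorer:
    """Offline stand-in; with AWS access you would pass boto3.client("ce")."""

    def get_cost_and_usage(self, **kwargs):
        period = kwargs["TimePeriod"]
        return {"ResultsByTime": [
            {"TimePeriod": period,
             "Total": {"UnblendedCost": {"Amount": "12.34", "Unit": "USD"}}},
        ]}


rows = daily_costs(_StubCostExplorer(), days=7, today=date(2026, 6, 5))
print(rows)
```

Passing the client in as a parameter keeps the request-building logic testable offline and makes it clear that only a read call is being made.
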
Dashboard
📊
My Cloud Connections
Manage all connected accounts in one place. Per-connection Bedrock cost breakdown + savings recommendations.
Use: Multi-account orgs · disconnect / reconnect · audit log access
Refresh: Every 24h · 90-day audit trail · KMS-encrypted at rest
Coming soon
🔷
Connect Your Azure Account
Azure AD service principal + Cost Management Reader role. Same pattern as AWS — read-only, daily refresh.
Use: Azure OpenAI cost · PTU utilization · subscription breakdown
Saves: Same 7-rule savings engine as AWS
Coming soon
🔵
Connect Your GCP Account
Service account with billing.viewer role + BigQuery Detailed Billing export. Vertex AI cost breakdown.
Use: Vertex AI / Gemini API spend · multi-project rollups
Saves: Region rebalancing · model tier downgrades

Ready for expert help?

Three productized brands. Three sizes each. Personal · SMB · Enterprise.

Browse 46 free calculators. Read 46 expert guides.

If you still need help, we're here, from $19.99 up to enterprise.

📖 Data sources & methodology
161 text models · 9 embeddings · 24 vision · 41 audio · 8 vector DBs across 10 vendor pages · last verified 2026-06-05

Methodology

  • All prices are USD per 1 million tokens, current as of 2026-06-05.
  • Vendor-published values have no mark. Inferred/extrapolated values are marked with * and listed below.
  • Batch API discounts are 50% off standard rates across providers that offer Batch mode.
  • Prompt caching discounts vary by provider (typically 80-90% off cached input tokens).
  • Regional data-residency surcharges (Anthropic 1.1x, OpenAI 1.1x, Google regional tiers) are NOT included in base rates.
  • Long-context pricing tiers apply when input exceeds model threshold.
  • Embedding prices are input-only (no output tokens generated).
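
The conventions in this list can be sketched as a single cost function. The rates in the example are hypothetical placeholders, not vendor prices. Two things are assumptions of this sketch rather than statements from the methodology: that the batch discount and the cache discount stack, and that a residency surcharge multiplies the final total. Long-context tiers are not modeled.

```python
PER_MILLION = 1_000_000  # all rates are USD per 1 million tokens


def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_rate: float,
    output_rate: float,
    *,
    cached_input_tokens: int = 0,
    cache_discount: float = 0.90,       # varies by provider (typically 80-90%)
    batch: bool = False,                # 50% off where Batch mode is offered
    residency_multiplier: float = 1.0,  # e.g. 1.1 for a regional surcharge
) -> float:
    if cached_input_tokens > input_tokens:
        raise ValueError("cached tokens cannot exceed total input tokens")
    uncached = input_tokens - cached_input_tokens
    cached_rate = input_rate * (1.0 - cache_discount)
    cost = (
        uncached * input_rate
        + cached_input_tokens * cached_rate
        + output_tokens * output_rate  # pass 0 for embeddings (input-only)
    ) / PER_MILLION
    if batch:
        cost *= 0.5  # assumption: batch discount stacks with caching
    return cost * residency_multiplier


# Hypothetical rates: $3.00 input, $15.00 output per 1M tokens.
base = request_cost(1_000_000, 200_000, 3.00, 15.00, cached_input_tokens=400_000)
batched = request_cost(1_000_000, 200_000, 3.00, 15.00,
                       cached_input_tokens=400_000, batch=True)
regional = request_cost(1_000_000, 200_000, 3.00, 15.00,
                        cached_input_tokens=400_000, residency_multiplier=1.1)
print(round(base, 4), round(batched, 4), round(regional, 4))
```

With these placeholder rates the three results come to roughly 4.92, 2.46, and 5.412 USD, which shows how much the modifiers move a total relative to the base rate alone.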

Primary sources

Last-verified date is the most recent successful daily snapshot (aicost_pricing_snapshots) or, when no snapshot exists yet, the latest successful crawler run (aicost_crawler_runs). 10 of 10 vendors are currently verified. Aggregator services (TokenCost, AI Pricing Guru, etc.) are not listed.
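
The fallback rule for the last-verified date can be sketched as follows. The function takes plain lists of dates for successful snapshots and successful crawler runs; how those are actually queried from aicost_pricing_snapshots and aicost_crawler_runs is not described here, so the inputs are an assumption.

```python
from datetime import date
from typing import Optional


def last_verified(
    snapshot_dates: list[date],
    crawler_run_dates: list[date],
) -> Optional[date]:
    """Most recent successful snapshot, else latest successful crawler run."""
    if snapshot_dates:
        return max(snapshot_dates)
    if crawler_run_dates:
        return max(crawler_run_dates)
    return None  # vendor has never been verified


print(last_verified([date(2026, 6, 4), date(2026, 6, 5)], [date(2026, 6, 1)]))
print(last_verified([], [date(2026, 6, 1)]))
```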

Source · Last verified · URL · Snapshot history
Anthropic · 2026-06-05 · https://www.anthropic.com/pricing · Daily snapshot since Sep 2023 · 578 days captured
Anthropic Docs · 2026-06-05 · https://platform.claude.com/docs/en/about-claude/pricing · Daily snapshot since Sep 2023 · 578 days captured
OpenAI · 2026-06-05 · https://openai.com/api/pricing/ · Daily snapshot since Sep 2023 · 579 days captured
Google AI · 2026-06-05 · https://ai.google.dev/gemini-api/docs/pricing · Daily snapshot since Dec 2023 · 554 days captured
Google Vertex · 2026-06-05 · https://cloud.google.com/vertex-ai/generative-ai/pricing · Daily snapshot since Dec 2023 · 554 days captured
DeepSeek · 2026-06-05 · https://api-docs.deepseek.com/quick_start/pricing · Daily snapshot since May 2024 · 493 days captured
xAI · 2026-06-05 · https://x.ai/api · Daily snapshot since Nov 2024 · 411 days captured
Mistral · 2026-06-05 · https://mistral.ai/pricing · Daily snapshot since Dec 2023 · 552 days captured
Cohere · 2026-06-05 · https://cohere.com/pricing · Daily snapshot since Sep 2023 · 578 days captured

Inferred values (marked with * in calculator tables)

Derived from industry conventions, not directly published by the vendor. Typical conventions: cached input = 10% of base (90% off), Batch API = 50% of base (50% off).

Vendor / Model | Field | Why it’s inferred
Anthropic / Claude Sonnet 4.6 | cachedInput | Derived at 10% of input rate; Anthropic publishes 90% cache-hit discount on this tier.
Anthropic / Claude Sonnet 4.5 | cachedInput | Derived at 10% of input rate; same 90% cache-hit convention as Sonnet 4.6.
Anthropic / Claude Sonnet 4.5 | batchInput | Derived at 50% of standard input; Anthropic documents uniform 50% Batch discount.
Anthropic / Claude Sonnet 4.5 | batchOutput | Derived at 50% of standard output; Anthropic documents uniform 50% Batch discount.
Anthropic / Claude Haiku 4.5 | cachedInput | Derived at 10% of input rate; Anthropic 90% cache-hit discount convention.
OpenAI / GPT-5.4 Mini | cachedInput | Derived at 10% of input; OpenAI documents automatic 90% discount on cache hits across GPT-5.x tier.
OpenAI / GPT-5.4 Nano | cachedInput | Derived at 10% of input; OpenAI 90% cache-hit convention.
OpenAI / GPT-5.4 Nano | batchInput | Derived at 50% of input; OpenAI Batch API uniform 50% discount.
OpenAI / GPT-5.4 Nano | batchOutput | Derived at 50% of output; OpenAI Batch API uniform 50% discount.
OpenAI / GPT-5.4 Pro | cachedInput | Derived at 10% of input; OpenAI 90% cache-hit convention.
OpenAI / GPT-5.4 Pro | batchInput | Derived at 50% of input; OpenAI Batch API uniform 50% discount.
OpenAI / GPT-5.4 Pro | batchOutput | Derived at 50% of output; OpenAI Batch API uniform 50% discount.
OpenAI / GPT-5.2 | cachedInput | Derived at 10% of input; no residency uplift.
OpenAI / GPT-5.2 | batchInput | Derived at 50% of input.
OpenAI / GPT-5.2 | batchOutput | Derived at 50% of output.
OpenAI / GPT-5 | cachedInput | Derived at 10% of input.
OpenAI / GPT-5 | batchInput | Derived at 50% of input.
OpenAI / GPT-5 | batchOutput | Derived at 50% of output.
OpenAI / GPT-5.5 Pro | cachedInput | Derived at 10% of input. OpenAI does not publish a cached rate for *-pro models; using the family convention.
OpenAI / GPT-5.5 Pro | batchInput | Derived at 50% of input.
OpenAI / GPT-5.5 Pro | batchOutput | Derived at 50% of output.
OpenAI / GPT-5.2 Pro | cachedInput | Derived at 10% of input; pro-tier convention.
OpenAI / GPT-5.2 Pro | batchInput | Derived at 50% of input.
OpenAI / GPT-5.2 Pro | batchOutput | Derived at 50% of output.
OpenAI / GPT-5.1 | batchInput | Derived at 50% of input.
OpenAI / GPT-5.1 | batchOutput | Derived at 50% of output.
OpenAI / GPT-5 Pro | batchInput | Derived at 50% of input.
OpenAI / GPT-5 Pro | batchOutput | Derived at 50% of output.
OpenAI / GPT-5 Nano | cachedInput | Derived at 10% of input.
OpenAI / GPT-5 Nano | batchInput | Derived at 50% of input.
OpenAI / GPT-5 Nano | batchOutput | Derived at 50% of output.
Google / Gemini 3 Flash | cachedInput | Derived at 10% of input; Google caching discount convention ~90%.
Google / Gemini 3.1 Flash-Lite | cachedInput | Derived at 10% of input; Google caching convention.
Google / Gemini 3.1 Flash-Lite | batchInput | Derived at 50% of input; Google Batch API uniform 50% discount.
Google / Gemini 3.1 Flash-Lite | batchOutput | Derived at 50% of output; Google Batch API uniform 50% discount.
Google / Gemini 2.5 Pro | cachedInput | Derived at 10% of input.
Google / Gemini 2.5 Flash | cachedInput | Derived at 10% of input.
Google / Gemini 2.5 Flash-Lite | cachedInput | Derived at 10% of input; Google caching convention.
Google / Gemini 2.5 Flash-Lite | batchInput | Derived at 50% of input; Google Batch API uniform 50% discount.
Google / Gemini 2.5 Flash-Lite | batchOutput | Derived at 50% of output; Google Batch API uniform 50% discount.
Google / Gemini 2.0 Flash | cachedInput | Derived at 25% of input per Google 2.0 family caching rates.
Google / Gemini 2.0 Flash | batchInput | Derived at 50% of input; Google Batch API uniform 50% discount.
Google / Gemini 2.0 Flash | batchOutput | Derived at 50% of output; Google Batch API uniform 50% discount.
Google / Gemini 2.0 Flash-Lite | cachedInput | Derived at 10% of input; Google caching convention.
Google / Gemini 2.0 Flash-Lite | batchInput | Derived at 50% of input; Google Batch API uniform 50% discount.
Google / Gemini 2.0 Flash-Lite | batchOutput | Derived at 50% of output; Google Batch API uniform 50% discount.
xAI / Grok 4 (legacy) | cachedInput | Extrapolated at 25% of base.
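
The derivation pattern behind this table can be sketched as a function that fills only the missing fields and flags them for the `*` marker. The field names (`cachedInput`, `batchInput`, `batchOutput`) come from the table; the function names, the input dictionary shape, and the example rates are hypothetical.

```python
def fill_inferred(
    published: dict[str, float],
    cache_fraction: float = 0.10,  # 0.25 for the rows derived at 25%
    batch_fraction: float = 0.50,
) -> dict[str, tuple[float, bool]]:
    """Return {field: (rate, inferred)}; vendor-published values win."""
    derived = {
        "cachedInput": published["input"] * cache_fraction,
        "batchInput": published["input"] * batch_fraction,
        "batchOutput": published["output"] * batch_fraction,
    }
    rates = {field: (value, False) for field, value in published.items()}
    for field, value in derived.items():
        if field not in rates:
            rates[field] = (value, True)
    return rates


def render(rates: dict[str, tuple[float, bool]]) -> dict[str, str]:
    """Format rates the way the tables mark them: inferred values get '*'."""
    return {f: f"{v:.2f}{'*' if inferred else ''}" for f, (v, inferred) in rates.items()}


# Hypothetical model: input, output, and batchInput are vendor-published.
example = render(fill_inferred({"input": 2.00, "output": 8.00, "batchInput": 1.00}))
print(example)
```

Carrying the inferred flag alongside each value, rather than formatting early, keeps the distinction available to any downstream calculation that wants to treat derived rates with more caution.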

Pricing is cross-verified against the LiteLLM community registry when available. Daily snapshots are kept in aicost_pricing_snapshots; every change is logged to aicost_price_changelog with old & new values for full audit trail. Read the full methodology →
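
A changelog with old and new values can be produced by diffing two consecutive snapshots. The sketch below assumes each snapshot is a mapping from (model, field) to a price; the real schemas of aicost_pricing_snapshots and aicost_price_changelog are not described here, so the record layout and model names are hypothetical.

```python
from typing import Optional

Snapshot = dict[tuple[str, str], float]  # (model, field) -> USD per 1M tokens


def diff_snapshots(old: Snapshot, new: Snapshot, changed_on: str) -> list[dict]:
    """One changelog entry per added, removed, or changed price."""
    entries = []
    for key in sorted(old.keys() | new.keys()):
        before: Optional[float] = old.get(key)
        after: Optional[float] = new.get(key)
        if before != after:
            model, field = key
            entries.append({
                "date": changed_on,
                "model": model,
                "field": field,
                "old": before,  # None means the price is newly listed
                "new": after,   # None means the price was removed
            })
    return entries


yesterday = {("model-a", "input"): 3.0, ("model-a", "output"): 15.0}
today = {("model-a", "input"): 2.5, ("model-a", "output"): 15.0,
         ("model-b", "input"): 1.0}
changelog = diff_snapshots(yesterday, today, changed_on="2026-06-05")
for entry in changelog:
    print(entry)
```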