Anthropic pricing, complete breakdown

Verified 2026-05-27, cross-checked against Anthropic pricing page, litellm, openrouter

Anthropic's frontier model lineup is led by Claude Opus 4.7 and 4.6, both priced at $5.00 per million input tokens and $25.00 per million output tokens. For high-throughput applications, Claude Sonnet 4.6 offers a balanced rate of $3.00 per million input and $15.00 per million output tokens. The most efficient model, Claude Haiku 4.5, reduces costs further to $1.00 per million input and $5.00 per million output tokens. All current models support prompt caching to lower costs on repeated context. This guide helps you navigate these API rates alongside Anthropic's various subscription and enterprise tiers.

Claude Haiku 4.5 provides the lowest entry point at just $1.00 per million input tokens.

How Anthropic's pricing universe works

Anthropic employs a multi-channel pricing strategy to balance high-margin developer usage with predictable recurring revenue from consumers and enterprises. The API track serves builders who need metered, programmatic access to models for custom applications. Subscription tiers like Claude Pro and the high-usage Max plans target individual power users, while Team and Enterprise plans provide the administrative controls and security required for organizational deployment. By offering models through cloud marketplaces like AWS and Google Vertex, Anthropic also captures enterprise spend from companies that prefer to consolidate billing within their existing cloud infrastructure.

API (per-token, metered)

For: Developers, technical teams, startups building products on top of Claude
  • Pay only for tokens consumed
  • Full model lineup including batch, caching, long context
  • Programmatic via SDKs
When to use: When integrating Anthropic into your own product or running variable batch workloads
Best for: Builders with metered or unpredictable usage

Consumer subscriptions (Pro, Max tiers)

For: Individuals using Anthropic directly for writing, coding, research, analysis
  • Fixed monthly fee
  • Generous usage caps
  • Web/desktop/mobile apps
  • Often includes newer models first
When to use: When using Anthropic as a daily-driver AI assistant rather than building on it
Best for: Solo professionals, knowledge workers, vibe coders

Business/Team plans

For: Teams of 5-200 needing shared workspaces, admin controls, SSO
  • Per-seat billing
  • Centralized billing
  • Admin & audit controls
  • Sometimes shared usage pools
When to use: When deploying Anthropic across a team that does NOT need API integration
Best for: Mid-size organizations adopting AI for internal productivity

Enterprise (custom contract)

For: Large organizations with procurement requirements, compliance needs, or volume-discount leverage
  • Custom pricing and limits
  • SLAs
  • DPAs and BAAs
  • Dedicated support
  • Sometimes private cloud / VPC
When to use: When per-seat or per-token pricing exceeds ~$50K/year, or when compliance/contractual needs require it
Best for: Enterprises with procurement-led adoption

Cloud marketplaces (AWS Bedrock, Google Vertex, Azure)

For: Organizations with existing cloud commits or strict data-residency requirements
  • Same models, slightly different pricing (often parity or small premium)
  • Counts toward existing cloud spend commits
  • Stays within cloud's data-protection boundary
When to use: When you already burn down EDP/MACC/CCC commits and prefer single-bill
Best for: Cloud-committed enterprises
Which one should you pick? If you are building a product, use the API for granular control. For personal use, Claude Pro or the Claude Max 5x/20x tiers provide the best value for high-volume chat and coding. Organizations should deploy Claude Team Standard or Premium for seat-based management, while those with strict compliance or massive scale should move to a Claude Enterprise contract or a cloud marketplace provider.

🎁 Current promos and time-sensitive deals

What's active right now. Auto-hides expired items.
Claude for Startups
Provides $25,000 to $100,000+ in API credits valid for 12 months, plus premium rate limit boosts and access to the Anthology Fund.
expires 2026-12-31 · source
Claude Partner Network Funding
A $100 million program providing funding for training, market development, and technical certification for consulting partners.
expires expires_note · source
AWS Bedrock Batch Discount
50% discount on batch inference compared to on-demand rates for high-volume, non-latency-sensitive workloads.
expires no_expiration · source
Google Vertex AI Committed Use Discounts (CUDs)
Approximately 30% discount for 1-year and 50% for 3-year spend commitments on Claude models via Vertex AI.
expires no_expiration · source
Anthropic primarily distributes promotional credits through cloud partners (AWS/GCP) and venture capital accelerators rather than public coupon codes.

📅 What changed in the last 30 days

Populated from aicost_price_changelog. Hides automatically when no recent events.
No notable price or model changes detected in Anthropic's published pricing in the last 30 days.

Every Anthropic product, profiled

For each product, what it's for, who picks it, what to watch out for, pros and cons, and what we tell our consulting clients.

consumer free

Claude Free

Free
Free forever with daily usage limits
Target users
casual users, students, ai curious
Typical uses
  • Basic text summarization and drafting
  • Casual coding assistance and debugging
  • Testing Claude's reasoning capabilities
  • Occasional image analysis and OCR
Why pick it
Ideal for users who need high-quality reasoning for occasional tasks without a financial commitment.
Key features
  • Access to Claude 3.5 Sonnet
  • Web, iOS, and Android access
  • Basic Claude Code CLI access
  • Standard response speeds
  • Limited daily message caps
⚠ Marketing gimmicks to watch
Dynamic Message Caps
Usage limits are not fixed and can decrease significantly during periods of high demand.
Impact: You may find yourself locked out of the service during peak business hours when you need it most.
Model Gating
The Free tier typically only provides access to the mid-range 'Sonnet' model, not the flagship 'Opus'.
Impact: Complex reasoning tasks may fail or provide lower-quality results compared to the paid tiers.
Pros
  • No-cost access to industry-leading reasoning
  • Clean, distraction-free interface
  • Strong privacy reputation compared to competitors
Cons
  • Strict and unpredictable usage limits
  • No access to Claude Projects or Artifacts sharing
  • Data may be used for training (unless opted out)
Insider view
The Free tier is essentially a 'Sonnet' demo. While the reasoning is top-tier, the message limits are often too restrictive for any sustained professional workflow. It is best used as a secondary tool for cross-referencing other AI outputs.
Max bang for buck
Use the Free tier during off-peak hours (late evening or weekends) to maximize the number of messages allowed before hitting the cap.
🔒 Training-on-your-data policy
Anthropic may use data from the Free tier to improve its models. Users can opt-out in settings or by emailing [email protected]. Source: https://www.anthropic.com/legal/privacy
🔄 Migration path
Upgrade when:
You hit usage limits daily or need to upload large documents (Claude Projects).
Downgrade when:
Your usage drops to fewer than 5 prompts per day.
Switch vendor when:
You require a built-in web search or image generation features.
ScenarioMonthlyAnnualNotes
Standard casual use varies varies No cost, but limited by daily caps.
consumer flagship

Claude Pro

$20/mo · $200/yr
$20/mo or $200/year sticker price
Target users
solo professionals, developers, power users
Typical uses
  • Complex coding with Claude Code Pro
  • Managing long-form documents in Projects
  • High-volume research and analysis
  • Early access to new model releases
Why pick it
Designed for professionals who need 5x more usage than the free tier and access to the full model family.
Key features
  • Access to Claude 3.5 Opus and Sonnet
  • Claude Projects for organizing docs/code
  • Priority access during high-traffic periods
  • Claude Code Pro CLI integration
  • Artifacts for live code/UI previews
⚠ Marketing gimmicks to watch
The 5x Usage Multiplier
Anthropic markets Pro as having '5x more usage' than Free, but since Free limits are dynamic, Pro limits are also effectively moving targets.
Impact: You cannot calculate an exact number of messages you are guaranteed per day.
Annual Sticker vs. Monthly
The $200 annual price is a $40 discount over monthly, but it is billed as a single upfront payment.
Impact: Commitment is required; there are no partial refunds if you decide to switch vendors mid-year.
Pros
  • Best-in-class coding and reasoning performance
  • Projects feature allows for 200k context management
  • No data training on your prompts by default
Cons
  • Still subject to usage caps (not unlimited)
  • No native web browsing or image generation
  • Mobile app can be laggy with very long Project contexts
Insider view
Claude Pro is the gold standard for developers and writers. The 'Projects' feature is the real killer app here, allowing you to ground the AI in your specific codebase or style guides. However, the lack of a 'Search' tool means you still need a secondary tool for current events.
Max bang for buck
Opt for the $200 annual plan if you intend to use Claude as your primary coding assistant; it effectively brings the monthly cost down to $16.67.
🔒 Training-on-your-data policy
Anthropic does not use Claude Pro data to train its models. Source: https://www.anthropic.com/legal/commercial-terms
🔄 Migration path
Upgrade when:
You need to manage multiple knowledge bases (Projects) or require Opus-level reasoning.
Downgrade when:
You find yourself using less than 10% of your daily message limit.
Switch vendor when:
You need integrated web search or real-time data fetching.
ScenarioMonthlyAnnualNotes
Monthly subscription $20 $240 Standard monthly billing.
Annual sticker price $16.67 $200 Paid as $200 upfront.
consumer power low

Claude Max 5x

$100/mo · $1200/yr
High-volume individual usage
Target users
full stack developers, heavy ai users, independent researchers
Typical uses
  • Full-day coding sessions with Claude Code
  • Processing massive document sets (1M+ tokens)
  • Building complex applications via CLI
  • High-frequency iterative research
Why pick it
Designed for power users whose productivity is limited by the standard Pro tier's message caps.
Key features
  • 5x the message capacity of Claude Pro
  • Highest priority for new model features
  • Claude Code Max 5x access
  • Advanced Project management tools
  • Direct support channel access
⚠ Marketing gimmicks to watch
The 'Unlimited' Illusion
While 5x Pro is a massive amount of usage, it is still technically a capped tier.
Impact: Automated scripts or extremely heavy CLI usage can still trigger rate limits.
Feature Parity
Most features are identical to the Pro tier; you are primarily paying for 'headroom'.
Impact: If you don't hit Pro caps, this is a $80/month 'donation' to Anthropic.
Pros
  • Virtually eliminates 'usage exhausted' interruptions
  • Ideal for professional developers using Claude Code
  • Highest possible individual rate limits
Cons
  • Significant price jump from Pro ($20 to $100)
  • No annual discount currently available for Max tiers
  • Overkill for 95% of standard professional users
Insider view
This tier is specifically for the 'Claude-first' developer. If your IDE is essentially a window into Claude Code, the $100/mo is easily justified by the lack of context-switching when limits hit. For anyone else, it's hard to justify the 5x price increase.
Max bang for buck
Only subscribe if you have verified that you hit Pro limits at least 4 days per week.
🔒 Training-on-your-data policy
Data is not used for training. Source: https://www.anthropic.com/legal/commercial-terms
🔄 Migration path
Upgrade when:
Pro limits are consistently reached before midday.
Downgrade when:
You move to a project that requires less iterative AI assistance.
Switch vendor when:
You require API-level control and custom system prompts for the same price.
ScenarioMonthlyAnnualNotes
Power user monthly $100 $1,200 No annual discount available.
team standard

Claude Team Standard

$25/mo · $240/yr
Minimum 5 seats required
Target users
small agencies, startup teams, corporate departments
Typical uses
  • Collaborative project management
  • Sharing custom AI instructions across a team
  • Centralized billing for AI tools
  • Internal knowledge base grounding
Why pick it
The entry point for organizations needing administrative control and shared workspace features.
Key features
  • Higher usage limits than Claude Pro
  • Centralized admin console and billing
  • Shared Projects for team collaboration
  • Early access to collaboration features
  • Claude Code Team access
⚠ Marketing gimmicks to watch
The 5-Seat Floor
The price is marketed as $25/seat, but you cannot buy fewer than 5 seats.
Impact: The true minimum cost is $125/mo (monthly) or $1,200/yr (annual), even for a 2-person team.
The Annual Lock-in
The $20/seat price requires an annual commitment for all 5+ seats.
Impact: Adding or removing seats mid-year can lead to complex billing adjustments.
Pros
  • Shared Projects allow teams to work from the same context
  • Administrative control over user access
  • Higher usage caps than individual Pro accounts
Cons
  • Expensive for very small teams (under 5 people)
  • Lacks the advanced security of the Enterprise tier
  • No SSO (Single Sign-On) in the Standard tier
Insider view
Team Standard is perfect for small creative or dev agencies. The ability to share 'Projects'—essentially pre-loaded context windows—is a massive productivity booster for teams working on the same client or codebase. Avoid if you are a solo user; the 'higher limits' don't justify the 5-seat minimum cost.
Max bang for buck
Use the annual billing to save $300/year on the minimum 5-seat configuration.
🔒 Training-on-your-data policy
Data is not used for training. Source: https://www.anthropic.com/legal/commercial-terms
🔄 Migration path
Upgrade when:
You need SSO, more than 50 seats, or specialized enterprise support.
Downgrade when:
The team shrinks below 5 people and collaboration features aren't being used.
Switch vendor when:
You need deep integration with Microsoft 365 or Google Workspace.
ScenarioMonthlyAnnualNotes
Minimum 5-seat team (Monthly) $125 $1,500 5 seats @ $25/mo.
Minimum 5-seat team (Annual) $100 $1,200 5 seats @ $20/mo.
enterprise

Claude Enterprise

$20/mo · $240/yr
Custom enterprise pricing — contact sales
Target users
fortune 500, highly regulated industries, large scale engineering orgs
Typical uses
  • Company-wide AI deployment with SSO
  • Analyzing massive internal datasets (500k context)
  • Audit-ready AI interactions
  • Custom integration with internal GitHub/GitLab
Why pick it
The only choice for large organizations requiring SOC 2 compliance, SSO, and maximum security.
Key features
  • 500,000 token context window
  • SSO (SAML) and SCIM provisioning
  • Audit logs and activity monitoring
  • GitHub/GitLab integrations
  • Dedicated account management
⚠ Marketing gimmicks to watch
The 'Contact Sales' Wall
Pricing is entirely opaque and depends on seat count and volume commitments.
Impact: Procurement can take weeks; you cannot 'self-serve' into this tier.
Integration Gating
Key features like native GitHub integration are often locked behind this tier.
Impact: Teams may be forced to upgrade just for one specific workflow integration.
Pros
  • Maximum security and compliance features
  • Massive context window for large-scale analysis
  • Consolidated billing and usage insights
Cons
  • High entry cost and long procurement cycles
  • May include features that smaller enterprises don't need
  • Pricing varies wildly based on negotiation
Insider view
Anthropic Enterprise is built for the 'AI-safe' corporation. The 500k context window is the standout feature, allowing users to drop entire technical manuals or legal libraries into a single prompt. If you have more than 100 users, the administrative overhead of the Team tier makes this an eventual necessity.
Max bang for buck
Negotiate for 'Claude Marketplace' credits if you already have a large cloud spend with AWS or Google.
🔒 Training-on-your-data policy
Enterprise data is never used for training and is governed by a formal Data Processing Addendum (DPA). Source: https://www.anthropic.com/legal/commercial-terms
🔄 Migration path
Upgrade when:
Compliance (SOC2/SSO) is required or context needs exceed 200k tokens.
Downgrade when:
Organization-wide usage is low and security requirements can be met by Team tiers.
Switch vendor when:
You require a vendor with a broader suite of enterprise productivity tools (e.g., Microsoft).
ScenarioMonthlyAnnualNotes
100-seat Enterprise deployment varies varies Pricing is custom; typically ranges from $30-$60 per seat depending on volume.
developer api

Anthropic API

Per-token (see API rates above)
Pay-as-you-go usage based on tokens
Target users
software engineers, ai product builders, data scientists
Typical uses
  • Integrating Claude into custom applications
  • Automated batch processing of documents
  • Building autonomous agents
  • Fine-tuning and prompt engineering experiments
Why pick it
The most cost-effective way to use Claude for automated tasks or at massive scale.
Key features
  • Access to all models (Opus, Sonnet, Haiku)
  • Prompt Caching (90% discount on repeats)
  • Batch API (50% discount for 24h turnaround)
  • Computer Use (beta) capabilities
  • Client SDKs for Python and TypeScript
⚠ Marketing gimmicks to watch
Silent Prompt Cache TTL
The 5-minute default TTL for caching can effectively drop to 3 minutes in practice.
Impact: Pausing too long between messages can result in full-price re-processing of large contexts.
Input vs. Output Pricing
Output tokens are 5x more expensive than input tokens across all models.
Impact: Verbose model responses can quickly inflate costs compared to concise ones.
Pros
  • Lowest possible cost for high-volume tasks
  • Prompt caching significantly reduces costs for long contexts
  • No monthly subscription fee; pay only for what you use
Cons
  • Requires technical expertise to implement
  • Rate limits start low for new accounts (Tier 1)
  • No built-in UI; must use a third-party client or build your own
Insider view
The API is where Anthropic's 'Prompt Caching' shines. For applications with large, static contexts (like a legal library or a codebase), the 90% discount makes Claude significantly cheaper than GPT-4o for repeat queries. However, watch the TTL expiration closely.
Max bang for buck
Use the Batch API for any task that doesn't require an immediate response to save 50% on token costs.
🔒 Training-on-your-data policy
API data is not used for training. Source: https://www.anthropic.com/legal/commercial-terms
🔄 Migration path
Upgrade when:
You need to automate workflows or build a customer-facing AI product.
Downgrade when:
You only need a simple chat interface and don't want to manage API keys.
Switch vendor when:
You need specific features like OpenAI's 'Realtime API' or Google's 2M context window.
ScenarioMonthlyAnnualNotes
Light developer use (1M Sonnet tokens/mo) $18 $216 Assumes $3 input / $15 output mix.
Heavy Batch processing (100M Haiku tokens/mo) $300 $3,600 Assumes 50% Batch discount on Haiku rates.
team

Claude Team Premium

$125/mo · $1200/yr
$125/mo monthly billing, $100/mo billed annually (per user)
Target users
small teams, collaborative teams
Key features
  • All Team Standard features
  • 5x more usage than standard seats
  • Includes Claude Code and Claude Cowork
  • Connect Microsoft 365, Slack, and more
  • Enterprise search across your organization
  • Central billing and administration
  • Single sign-on (SSO)
  • Admin controls for remote and local connectors
ScenarioMonthlyAnnualNotes
Standard annual cost $125 $1,200 Lower rate with annual commitment ($100/mo)
consumer

Claude Max 20x

$200/mo · $2400/yr
$200/mo
Target users
individual users, professionals
Key features
  • Core Extended Thinking
  • Notes: Top consumer tier; 20x Pro usage; same model lineup as Max 5x, higher quotas
  • Tools Projects: unlimited
  • Tools Artifacts
  • Tools Claude Code
  • Tools File Creation
  • Tools Research Mode
  • Tools Code Execution
ScenarioMonthlyAnnualNotes
Standard annual cost $200 $2,400 Monthly subscription
team

Claude Team Standard Seat

$25/mo · $240/yr
$25/mo monthly billing, $20/mo billed annually (per user)
Target users
small teams, collaborative teams
Key features
  • All Claude features
  • More usage than Pro
  • Includes Claude Code and Claude Cowork
  • Connect Microsoft 365, Slack, and more
  • Enterprise search across your organization
  • Central billing and administration
  • Single sign-on (SSO)
  • Admin controls for remote and local connectors
ScenarioMonthlyAnnualNotes
Standard annual cost $25 $240 Lower rate with annual commitment ($20/mo)

All Anthropic products at a glance

Scroll up to the product profile for full detail

ProductPriceBest forHeadline featureYearly estimate
Claude Free $0 Occasional chat Access to latest Sonnet model $0
Claude Pro $20/mo Power users Claude Code & 5x usage limits $240
Claude Team Standard $25/mo/seat Small teams (5+) Shared Projects & Admin tools $1,500 (5 seats)
Claude Max 20x $200/mo Heavy automation 20x standard Pro limits $2,400
Anthropic API Usage-based App development Prompt Caching & Batch API Variable

Anthropic vs the field

Same-tier comparison across top 5 vendors

Comparison tierAnthropicOpenAIGooglexAIVerdict
Flagship Consumer Chat
Claude Pro
$20/mo
GPT-5.4 Pro
$30/mo
Gemini 3.1 Pro
$20/mo
Grok 4.20
$16/mo
Anthropic and Google maintain the $20 price point while OpenAI's flagship has moved to $30.
Developer (Reasoning API)
Claude 3.5 Sonnet
$3.00 / $15.00
GPT-5.4
$2.50 / $15.00
Gemini 3.1 Pro
$2.00 / $12.00
Grok 4.20 (reasoning)
$2.00 / $6.00
xAI and Google offer lower output pricing for reasoning-capable models compared to Anthropic.
Developer (Fast/Mini API)
Claude 3.5 Haiku
$0.25 / $1.25
GPT-5.4 Mini
$0.75 / $4.50
Gemini 3 Flash
$0.50 / $3.00
Grok 4.1 Fast
$0.20 / $0.50
Anthropic's Haiku pricing is highly competitive in the 'Mini' tier, undercutting OpenAI and Google on input costs.

🌳 Which Anthropic product fits you?

3 questions, 1 recommendation
How do you intend to use Claude?
Recommended
anthropic api
The API provides the flexibility and pay-as-you-go pricing required for programmatic integration.
See full profile ↑
Recommended
claude team standard + claude enterprise
Team plans offer shared context through Projects and administrative oversight for groups.
See full profile ↑
Recommended
claude pro + claude max 5x
Pro removes the restrictive caps of the free tier and adds priority access to new features like Claude Code.
See full profile ↑
Recommended
claude free
Best for users who want to experience Claude's intelligence without a monthly financial commitment.
See full profile ↑

Tracking the transition from the Claude 3 series to the 4.7 generation.

API or subscription: which is cheaper for you?

Cross-over math at current rates

claude-pro ($20/mo) vs claude-sonnet-4-6 API ($3/$15 per MTok)
Break-even: ~4,762 messages/month (avg ~600 tokens each)

At a 2:1 input-to-output ratio (400 tokens in, 200 tokens out), each message costs approximately $0.0042 via API.

👉 If you send more than 160 messages per day, the $20 subscription is cheaper than the API.
claude-pro ($20/mo) vs claude-opus-4-7 API ($5/$25 per MTok)
Break-even: ~2,857 messages/month (avg ~600 tokens each)

Using the flagship Opus model via API costs roughly $0.007 per average message.

👉 For heavy Opus users, the subscription pays for itself at just 95 messages per day.
claude-max-5x ($100/mo) vs claude-opus-4-7 API ($5/$25 per MTok)
Break-even: ~14,286 messages/month (avg ~600 tokens each)

The Max 5x tier provides significantly higher limits for power users who require consistent Opus access.

👉 Cheaper than API if you are a high-volume professional sending 475+ high-complexity messages daily.
Rule of thumb
Subscriptions are best for high-volume chat and creative writing where token counts are unpredictable. API is superior for automated workflows, low-volume utility tasks, or building custom applications.

🧮 Estimate your annual Anthropic cost

Pick a profile, see the all-in annual estimate

All estimates use 2026-05-27 rates. API rates verified against LiteLLM.

Current pricing (all production models)

ModelInput $/MOutput $/MCached $/MContext
Claude Opus 4.7
claude-opus-4-7
$5 $25 $0.50 1,000,000
Claude Opus 4.6
claude-opus-4-6
$5 $25 $0.50 1,000,000
Claude Sonnet 4.6
claude-sonnet-4-6
$3 $15 $0.30 1,000,000
Claude Haiku 4.5
claude-haiku-4-5
$1 $5 $0.10 200,000

Prices are in USD per 1M tokens. Caching discounts apply to input tokens only. Verified 2026-05-27.

Full rate breakdown (all variants)

Variants beyond standard API: batch (async, 50% off), cached read (0.1x), cache writes (1.25x or 2x base), long-context tier (~2x above threshold).

Claude Opus 4.7 claude-opus-4-7

Highest reasoning density for complex logic and agentic workflows
Primary useOptimized for deep research, complex coding, and high-stakes decision-making requiring maximum logical consistency.
Who picks itEngineering teams building autonomous agents and data scientists handling large, unstructured datasets.
Vs other Anthropic modelsAt $5/$25 per million tokens, it is the most resource-intensive model, offering higher reasoning depth than Sonnet 4.6. It supports a 1M token context window for massive datasets.
When to useUse when the task requires extreme logical precision or complex multi-step planning. Look to Sonnet 4.6 if you need faster responses at a lower price point.
Equivalents at other vendors
openai
GPT-5.4 Matches the high-reasoning tier with a lower $2.5/$15 entry price.
google
Gemini 3.1 Pro Provides comparable 1M+ context handling at a $2/$12 rate.
xai
Grok 4 (legacy) Targets the same complex logic use cases at a $3/$15 price point.

Claude Opus 4.7 claude-opus-4-7

VariantInput $/MOutput $/MNotes
Standard $5 $25 Default per-token API rate
Batch API $2.5 $12.5 Async batch processing, results within 24 hours, typically 50% off
Cached read $0.50 $25 Cached prompt input (~0.1x base); output rate unchanged
Cache write (5-min TTL) $6.25 Premium for caching prompt; ~1.25x base input
Cache write (1-hour TTL) $10 Premium for longer cache lifetime; ~2x base input

Claude Opus 4.6 claude-opus-4-6

Stable high-end reasoning for established production logic pipelines
Primary useDesigned for sophisticated tool use and document analysis where output stability is paramount.
Who picks itEnterprises maintaining legacy high-performance workflows that require consistent output patterns over time.
Vs other Anthropic modelsMatches the $5/$25 pricing of Opus 4.7 but provides a stable baseline for existing logic-heavy workflows.
When to useUse for maintaining consistency in existing workflows that were tuned for this specific model version. For new projects requiring maximum intelligence, use Opus 4.7.
Equivalents at other vendors
openai
GPT-5.4 Provides a similar logic-heavy alternative at a $2.5/$15 price point.
google
Gemini 3.1 Pro Aligns on reasoning depth and context capacity for $2/$12.

Claude Opus 4.6 claude-opus-4-6

VariantInput $/MOutput $/MNotes
Standard $5 $25 Default per-token API rate
Batch API $2.5 $12.5 Async batch processing, results within 24 hours, typically 50% off
Cached read $0.50 $25 Cached prompt input (~0.1x base); output rate unchanged

Claude Sonnet 4.6 claude-sonnet-4-6

Balanced performance and speed for high-scale production applications
Primary useOptimized for fast coding assistance, real-time customer interactions, and sophisticated data extraction.
Who picks itProduct teams scaling AI features that require high reasoning at a moderate price point.
Vs other Anthropic modelsPriced at $3/$15, it offers a significant discount over the Opus series while maintaining high reasoning capabilities. It is the primary choice for balancing cost and intelligence.
When to useUse as the default for most production applications. Move to Haiku 4.5 for simple classification or Opus 4.7 for extreme logic.
Equivalents at other vendors
openai
GPT-5.4 Offers similar reasoning capabilities for slightly less at $2.5/$15.
google
Gemini 3.1 Pro Matches the 1M context window for a lower $2/$12 rate.
mistral
Mistral Large 3 A competitive alternative for enterprise reasoning at a $2/$6 price point.

Claude Sonnet 4.6 claude-sonnet-4-6

VariantInput $/MOutput $/MNotes
Standard $3 $15 Default per-token API rate
Batch API $1.5 $7.5 Async batch processing, results within 24 hours, typically 50% off
Cached read $0.30 $15 Cached prompt input (~0.1x base); output rate unchanged

Claude Haiku 4.5 claude-haiku-4-5

Cost-efficient intelligence for high-volume, low-latency data processing
Primary useBuilt for rapid classification, lightweight chat, and high-throughput data extraction tasks.
Who picks itDevelopers building high-throughput systems where sub-second latency and cost-efficiency are requirements.
Vs other Anthropic modelsAt $1/$5, it is the most economical model in the lineup, designed for speed. It has a smaller 200k context window compared to the 1M window in Sonnet and Opus.
When to useUse for high-throughput classification and simple extraction tasks. Switch to Sonnet 4.6 if the task requires more nuanced reasoning or a larger context window.
Equivalents at other vendors
openai
GPT-5.4 Mini Competes in the high-speed utility tier with a slightly lower $0.75/$4.5 rate.
google
Gemini 3 Flash Optimized for similar low-latency tasks at a $0.5/$3 price point.
cohere
Command R Focuses on RAG and tool-use efficiency at a lower $0.15/$0.6 cost.

Claude Haiku 4.5 claude-haiku-4-5

VariantInput $/MOutput $/MNotes
Standard $1 $5 Default per-token API rate
Batch API $0.50 $2.5 Async batch processing, results within 24 hours, typically 50% off
Cached read $0.10 $5 Cached prompt input (~0.1x base); output rate unchanged

Subscription plans (consumer + business)

PlanAudienceMonthlyAnnualPer seatWhat's included
Claude Team Premium
Claude Team
team $125 $100/mo billed annually
($1,200/yr total)
$1 All Team Standard features · 5x more usage than standard seats · Includes Claude Code and Claude Cowork · Connect Microsoft 365, Slack, and more · Enterprise search across your organization · Central billing and administration · Single sign-on (SSO) · Admin controls for remote and local connectors · Enterprise deployment for the Claude desktop app · No model training on your content by default · Mix and match seat types
Limits: usage: 5x more than Team Standard
claude.com ↗
Claude Enterprise Custom enterprise $20 $20 $1 All Team plan features · Admins set user and org spend limits · Google Docs cataloging · Role-based access with fine grained permissioning · System for Cross-domain Identity Management (SCIM) · Audit logs · Compliance API for observability and monitoring · Custom data retention controls · Network-level access control · IP allowlisting · HIPAA-ready offering available · Claude Security (beta)
Limits: usage: scales with model and task · seat price: 20
claude.com ↗
Claude Team Standard
Claude Team
team $25 $20/mo billed annually
($240/yr total)
$1 All Claude features · More usage than Pro · (plus all shared Team features) · Includes Claude Code and Claude Cowork · Connect Microsoft 365, Slack, and more · Enterprise search across your organization · Central billing and administration · Single sign-on (SSO) · Admin controls for remote and local connectors · Enterprise deployment for the Claude desktop app · No model training on your content by default · Mix and match seat types
Limits: max seats: 150 · min seats: 5 · limits source: transcript: 'All Claude features, plus more usage than Pro'
claude.com ↗
Claude Free
Claude Free
consumer $0 Notes: Daily usage limits; no credit card required
Limits: opus access: none · default model: sonnet-4-6 · usage multiplier: baseline
anthropic.com ↗
Claude Pro
Claude
consumer $20 $200/yr
(≈ $17/mo)
Notes: Annual = $200/yr = $17/mo (~17% discount); Claude Code inclusion contested in some changelogs ? verify on anthropic.com/pricing
Limits: opus access: limited · default model: sonnet-4-6 · usage multiplier: baseline_paid · context window tokens: 1000000 · annual effective monthly: 17
claude.com ↗
Claude Max 5x
Claude
consumer $100 Notes: Annual not publicly disclosed; designed for users running multiple Claude Code instances concurrently · Early Access · Priority Access · Priority Routing
Limits: opus access: full · default model: opus-4-7 · usage multiplier: 5x_pro · context window tokens: 1000000
anthropic.com ↗
Claude Max 20x
Claude
consumer $200 Notes: Top consumer tier; 20x Pro usage; same model lineup as Max 5x, higher quotas · Early Access · Priority Access · Priority Routing
Limits: opus access: full · default model: opus-4-7 · usage multiplier: 20x_pro · context window tokens: 1000000
anthropic.com ↗
Claude Free
Claude Free
consumer $0 $0 Chat on web, iOS, Android, and on your desktop · Generate code and visualize data · Write, edit, and create content · Ability to search the web · Memory across conversations · Create files and execute code · Unlock more from Claude with desktop extensions · Connect Slack and Google Workspace services · Integrate any context or tool through connectors with remote MCP · Extended thinking for complex work
Limits: projects: unlimited · context tokens: unlimited · messages per 5h: unlimited
anthropic.com ↗
Claude Pro
Claude
consumer $20 $200/yr
(≈ $17/mo)
Everything in Free · More usage · Includes Claude Code · Includes Claude Cowork · Access to unlimited projects to organize chats and documents · Access to Research · Ability to use more Claude models · Claude for Microsoft 365 · Claude for Microsoft Outlook
Limits: usage: more than Free tier
anthropic.com ↗
Claude Max 5x
Claude
consumer $100 Everything in Pro · Choose 5x more usage than Pro · Higher output limits for all tasks · Early access to advanced Claude features · Priority access at high traffic times
Limits: usage: 5x more than Pro tier · output limits: higher
anthropic.com ↗
Claude Max 20x
Claude
consumer $200 Everything in Pro · Choose 20x more usage than Pro · Higher output limits for all tasks · Early access to advanced Claude features · Priority access at high traffic times
Limits: usage: 20x more than Pro tier · output limits: higher
anthropic.com ↗
Claude Team Standard Seat
Claude
team $25 $20/mo billed annually
($240/yr total)
$1 All Claude features · More usage than Pro · Includes Claude Code and Claude Cowork · Connect Microsoft 365, Slack, and more · Enterprise search across your organization · Central billing and administration · Single sign-on (SSO) · Admin controls for remote and local connectors · Enterprise deployment for the Claude desktop app · No model training on your content by default
Limits: usage: more than Pro tier
anthropic.com ↗
Claude Team Premium Seat
Claude
team $125 $100/mo billed annually
($1,200/yr total)
$1 All Team Standard Seat features · 5x more usage than standard seats · Includes Claude Code and Claude Cowork · Connect Microsoft 365, Slack, and more · Enterprise search across your organization · Central billing and administration · Single sign-on (SSO) · Admin controls for remote and local connectors · Enterprise deployment for the Claude desktop app · No model training on your content by default
Limits: usage: 5x more than Team Standard Seat
anthropic.com ↗
Claude Enterprise
Claude
enterprise $1 All Team plan features · Admins set user and org spend limits · Role-based access with fine grained permissioning · System for Cross-domain Identity Management (SCIM) · Audit logs · Compliance API for observability and monitoring · Custom data retention controls · Network-level access control · IP allowlisting · HIPAA-ready offering available · Claude Security (beta)
Limits: seats: 100+ · usage: scales with model and task
anthropic.com ↗
Claude Code Free
Claude Code Free
developer $0 $0 Included with Claude Free tier
Limits: usage: included with Claude Free tier
anthropic.com ↗
Claude Code Pro
Claude Code
developer $20 $200/yr
(≈ $17/mo)
Included with Claude Pro tier
Limits: usage: included with Claude Pro tier
anthropic.com ↗
Claude Code Max 5x
Claude Code
developer $100 Included with Claude Max 5x tier
Limits: usage: 5x more than Pro tier
anthropic.com ↗
Claude Code Max 20x
Claude Code
developer $200 Included with Claude Max 20x tier
Limits: usage: 20x more than Pro tier
anthropic.com ↗
Claude Code Team Standard Seat
Claude Code
team $25 $20/mo billed annually
($240/yr total)
$1 Included with Team Standard Seat
Limits: usage: included with Team Standard Seat
anthropic.com ↗
Claude Code Team Premium Seat
Claude Code
team $125 $100/mo billed annually
($1,200/yr total)
$1 Included with Team Premium Seat
Limits: usage: included with Team Premium Seat
anthropic.com ↗
Claude Code Enterprise
Claude Code
enterprise $1 All Team plan features · Admins set user and org spend limits · Role-based access with fine grained permissioning · System for Cross-domain Identity Management (SCIM) · Audit logs · Compliance API for observability and monitoring · Custom data retention controls · Network-level access control · IP allowlisting · HIPAA-ready offering available
Limits: seats: 100+ · usage: scales with model and task
anthropic.com ↗

Subscription pricing is separate from per-token API rates above.

How buyers think about Anthropic pricing

Each scenario below is interactive — tweak the inputs to see how the math changes for your workload.

Cheapest Claude for high-volume classification

vibe-codersolopreneurdeveloper

The problem: You need to process hundreds of thousands of short classification tasks daily without the premium cost of reasoning models. Managing these workloads at scale requires a balance between speed and per-token efficiency.

What to do: Claude Haiku 4.5 is the recommended model for high-frequency, low-latency tasks.

A request with 2000 input tokens and 50 output tokens costs $0.00225 ($0.002 input + $0.00025 output). With prompt caching, the input cost drops to $0.0002, reducing the total to $0.00045 per request. At 100,000 daily requests, this totals $45 per day (as of 2026-05-27).

→ Using Haiku 4.5 with prompt caching can reduce daily operating costs for classification by up to 80 percent compared to base rates.

Quick calc — adjust for your workload
Per request:  ·  Monthly:  ·  Annual:
Open full calculator with caching, batch, charts →

Strongest Claude for hard reasoning problems

developerit-buyerenterprise

The problem: Complex architectural decisions or deep code audits require the highest level of cognitive performance, where accuracy is more critical than the cost of a single inference.

What to do: Claude Opus 4.7 provides the most advanced reasoning capabilities in the Anthropic suite.

A heavy reasoning task with 10,000 input tokens and 2,000 output tokens costs $0.10 ($0.05 input + $0.05 output). For a team running 1,000 such deep-reasoning queries daily, the cost is $100 per day (as of 2026-05-27).

→ Opus 4.7 maintains a premium price point of $5 per million input tokens to deliver maximum accuracy on complex logic.

Quick calc — adjust for your workload
Per request:  ·  Monthly:  ·  Annual:
Open full calculator with caching, batch, charts →

Is Claude Pro at the listed monthly rate worth it for solo use

vibe-codersolopreneur

The problem: You are deciding between a flat monthly subscription and paying for API usage. You need to know the break-even point where the subscription becomes more economical than per-token billing.

What to do: Claude Sonnet 4.6 is the balanced model used for comparison against the standard Pro subscription.

At 20 requests per day with 5,000 input and 1,000 output tokens, API costs total $0.60 daily ($0.015 input + $0.015 output). Over 30 days, this equals $18 (as of 2026-05-27). If your daily volume exceeds 22 requests of this size, the monthly Pro subscription typically becomes more cost-effective.

→ The break-even point for a solo user is approximately 132,000 total tokens processed per day at balanced tier rates.

Quick calc — adjust for your workload
Per request:  ·  Monthly:  ·  Annual:
Open full calculator with caching, batch, charts →

When does prompt caching pay off

vibe-codersolopreneurdeveloperit-buyerenterprise

The problem: You are sending the same large context or system prompt repeatedly. You want to know how many repeat requests are needed to justify the engineering effort of implementing caching.

What to do: Anthropic's prompt caching offers a 90 percent discount on input tokens for cached content.

For a 5,000 token prompt on Sonnet 4.6, the base cost is $0.015. The cached read cost is $0.0015. Including the 25 percent cache-write premium, the break-even point occurs after 3 cached reads within the 5-minute TTL window (as of 2026-05-27).

→ Prompt caching reduces input costs by 90 percent for any workload that reuses context at least three times in a five-minute window.

Quick calc — adjust for your workload
Per request:  ·  Monthly:  ·  Annual:
Open full calculator with caching, batch, charts →

Anthropic API direct vs AWS Bedrock vs Google Vertex

enterpriseit-buyerdeveloper

The problem: You need to choose a hosting provider for Claude models. You are looking for the lowest price while maintaining enterprise security and compliance.

What to do: While base per-token rates are identical across providers, regional surcharges and commitment programs create significant total cost differences.

Sonnet 4.6 costs $3/$15 per million tokens on all platforms. However, Google Vertex AI applies a 10 percent surcharge for regional endpoints, and AWS Bedrock adds approximately 10 percent for cross-region inference (as of 2026-05-27).

→ Base prices are identical, but regional configurations on Vertex and Bedrock can increase effective costs by 10 percent.

Quick calc — adjust for your workload
Per request:  ·  Monthly:  ·  Annual:
Open full calculator with caching, batch, charts →

How does Claude compare on cost to GPT and Gemini

vibe-codersolopreneurdeveloperit-buyerenterprise

The problem: You are comparing the balanced tier of Claude against other major providers to ensure you are not overpaying for similar performance.

What to do: Claude Sonnet 4.6 serves as the benchmark for balanced performance and cost efficiency.

Sonnet 4.6 is priced at $3 per million input tokens and $15 per million output tokens. For a standard workload of 5,000 input and 1,000 output tokens, the cost is $0.03 per request (as of 2026-05-27). This pricing is positioned to compete directly with other mid-tier frontier models.

→ Claude Sonnet 4.6 maintains a competitive $3 per million input token rate for balanced enterprise workloads.

Quick calc — adjust for your workload
Per request:  ·  Monthly:  ·  Annual:
Open full calculator with caching, batch, charts →

Volume discounts & partner programs

Heads up — these are community-sourced and analyst-reported terms. Specific credit amounts, discount percentages, and program thresholds change frequently. Always verify current terms directly with Anthropic before relying on a specific number. Treat reported figures as ballpark, not contract language.

Claude Partner Network

Threshold: reportedly 10 team members must complete Anthropic Academy certifications for consulting partners

Typical discount (reported): varies; includes $100 million in total program funding for training and market development

Benefits:

How to engage: Apply through the Anthropic Partner Portal; requires 4-8 weeks for review and a technical interview

Source: anthropic.comvendor_official · cited 2026-03-12

Claude Marketplace

Threshold: requires an existing Anthropic enterprise spend commitment

Typical discount (reported): allows spend consolidation; no commission taken by Anthropic from partner transactions

Benefits:

How to engage: Enterprises with existing commitments should contact their Anthropic account team

Source: claude.comvendor_official · cited 2026-03-06

Claude for Startups

Threshold: varies by stage (Pre-seed to Series B); often requires VC or accelerator partnership

Typical discount (reported): reportedly $25,000 to $100,000+ in API credits

Benefits:

How to engage: Apply via partner VC firms or directly through the Anthropic Startup Program portal

Source: anthropic.comvendor_official · cited 2025-05-22

AWS Bedrock Provisioned Throughput

Threshold: minimum 1-month or 6-month commitment

Typical discount (reported): fixed hourly rates; monthly costs range from approximately $15,000 to $37,000+ per Model Unit

Benefits:

How to engage: Purchase through the AWS Bedrock console or via AWS Marketplace

Source: aws.amazon.comvendor_official · cited 2026-05-14

Google Vertex AI Committed Use Discounts (CUDs)

Threshold: 1-year or 3-year spend commitment

Typical discount (reported): approximately 30% for 1-year and 50% for 3-year commitments

Benefits:

How to engage: Enable through the Google Cloud Console under Vertex AI Model Garden

Source: cloud.google.comvendor_official · cited 2026-05-04

Microsoft Foundry (Azure) Integration

Threshold: requires an active Azure subscription (Enterprise or MCA-E)

Typical discount (reported): uses standard API pricing; eligible for Microsoft Azure Consumption Commitment (MACC)

Benefits:

How to engage: Deploy Claude models as serverless APIs through the Azure AI Foundry Model Catalog

Source: anthropic.comvendor_official · cited 2025-11-18

Multi-cloud availability

Cloud-marketplace terms change frequently. Model availability dates, pricing parity, and regional features can drift week to week. Verify with each cloud's pricing page (AWS Bedrock, Google Vertex, Azure AI Foundry) before architecting around specifics.
CloudModel availabilityPrice vs vendor-directReasons to pick
AWS Bedrock Claude Opus (4.7, 4.6, 4.5, 4.1), Claude Sonnet (4.6, 4.5, 4), Claude Haiku 4.5, and Claude Mythos Preview (gated research preview). Identical per-token rates in standard regions ($5/$25 for Opus 4.6, $3/$15 for Sonnet 4.6, $1/$5 for Haiku 4.5); cross-region inference adds approximately 10%.
  • AWS-native IAM and VPC PrivateLink integration for security
  • AWS Activate credits can be applied to Bedrock usage
  • Unified billing on a single AWS invoice
  • Provisioned Throughput options for guaranteed capacity
  • HIPAA-eligible with AWS BAA

pecollective.com ↗
Google Vertex AI Claude Opus (4.7, 4.6, 4.5), Claude Sonnet (4.6, 4.5, 4), Claude Haiku 4.5, and Claude Mythos Preview (private preview). Base per-token pricing matches direct API, but regional endpoints reportedly carry a 10% surcharge over global endpoint pricing.
  • VPC-SC boundaries and Private Service Connect (PSC) for strict perimeter control
  • Direct integration with BigQuery-backed request logging
  • FedRAMP High authorization context for partner models
  • Unified ML lifecycle management within the Google Cloud ecosystem

cloud.google.com ↗
Microsoft Foundry (Azure) Claude Opus (4.7, 4.6, 4.5, 4.1), Claude Sonnet (4.6, 4.5), and Claude Haiku 4.5. Reportedly identical to direct API rates ($5/$25 per 1M tokens for Opus 4.6/4.7).
  • Integration with Microsoft 365 Work IQ and Fabric IQ data
  • Support for secure, governed agents within the Azure ecosystem
  • Interoperable platform allowing switching between Claude and OpenAI models
  • Reliability guarantees for high-volume enterprise token processing

microsoft.com ↗
Claude Platform on AWS (Marketplace) Full feature parity with direct Anthropic API, including beta features like Managed Agents and MCP connectors. Identical per-token rates but billed in 'Claude Consumption Units' (CCUs) at a fixed rate of $0.01 per CCU.
  • Access to native Anthropic features (Managed Agents, fast mode) on day one while keeping AWS billing
  • Applies against AWS Enterprise Discount Program (EDP) commitments
  • Operated by Anthropic but appears as a single line item on the AWS invoice

cloudzero.com ↗

Free credits & startup programs

Program details and credit amounts shift often. Apply directly through each program's official page for current values, eligibility windows, and application requirements.

Anthropic Startup Program

Reported value: $25,000 to $100,000+ in API credits

Eligibility: Early-stage startups (Pre-seed to Series A) building AI-powered products; primarily available through Anthropic's VC partner network or direct application for incorporated companies.

How to apply: Apply via the official startup program page or through a referral from an approved Anthropic VC partner.

Apply / learn more at anthropic.com ↗

Anthology Fund (Menlo Ventures partnership)

Reported value: $25,000 in free credits

Eligibility: Startups selected for investment by the $100 million Anthology Fund; focuses on AI infrastructure, frontier applications, and social impact.

How to apply: Submit an investment application through the Menlo Ventures Anthology Fund portal.

Apply / learn more at menlovc.com ↗

Claude for Open Source

Reported value: approximately $1,200 value (6 months of Claude Max 20x)

Eligibility: Maintainers of public GitHub repositories with 5,000+ stars or 1M+ monthly npm downloads; must have been active in the last 3 months.

How to apply: Apply through the dedicated Claude for OSS contact form (applications close June 30, 2026).

Apply / learn more at claude.com ↗

Anthropic AI for Science Program

Reported value: up to $20,000 in API credits

Eligibility: Academics, researchers, and nonprofits working on high-impact scientific projects, particularly in biology and life sciences.

How to apply: Complete the AI for Science Program application form; submissions are evaluated on the first Monday of each month.

Apply / learn more at support.anthropic.com ↗

External Researcher Access Program

Reported value: $1,000 in API credits

Eligibility: Researchers working on high-priority AI safety and alignment topics; credits apply to standard model suite API use.

How to apply: Submit the External Researcher Access Program application form with details on the research team and topic.

Apply / learn more at support.anthropic.com ↗

Google for Startups Cloud Program (AI Track)

Reported value: $10,000 specific to partner models (including Anthropic)

Eligibility: Qualifying startups in the Scale or Scale AI Tier; must be less than 5 years old and pre-Series A.

How to apply: Join the Google for Startups Cloud Program and contact a Google Cloud Account Executive to request partner model credits.

Apply / learn more at cloud.google.com ↗

AWS Activate (Portfolio Package)

Reported value: up to $100,000 in AWS credits

Eligibility: Startups associated with an AWS Activate Provider (VC, accelerator, or incubator); credits can be used for Claude models via Amazon Bedrock.

How to apply: Apply through the AWS Activate console using an Organization ID from a partner provider.

Apply / learn more at aws.amazon.com ↗

Pricing gotchas to watch

Most gotchas below were surfaced by community reports. Some may have been fixed, changed, or never been the user-facing issue they appeared. Verify against current vendor docs before architecting around a workaround.

Silent Prompt Cache TTL Expiration and Traffic Gaps

While Anthropic's documentation specifies a 5-minute default TTL for ephemeral prompt caching, users have reportedly observed silent reductions in effective cache life to approximately 3 minutes in practice. Additionally, for users on the 1-hour TTL tier, community reports suggest a quiet shift in default behavior back to 5 minutes for certain session types, leading to unexpected full-price re-processing of large contexts (up to 1M tokens) if a user pauses for more than 5 minutes.

Workaround: Implement a 'heartbeat' request or 'Claude Pulse' style monitoring to track live cache countdowns and ensure follow-up replies occur within the 3-5 minute window to maintain the 90% read discount.

Source: github.comgithub_issue · cited 2026-05-27

Minimum Token Thresholds Vary by Model Generation

Prompt caching only activates if the prefix meets a minimum token threshold, which varies significantly by model. As of May 2026, Claude Sonnet 4.6 and 4.5 require 1,024 tokens, while higher-tier models like Claude Opus 4.7, 4.6, 4.5, and the Claude Mythos Preview require a much higher 4,096 tokens. Claude Haiku 4.5 also reportedly requires 4,096 tokens due to its longer attention window architecture.

Workaround: Verify system prompt length using the count_tokens API; if below the threshold, 'pad' the prompt with additional few-shot examples or domain context to trigger the 90% read discount.

Source: claude.comvendor_docs · cited 2026-05-27

Dynamic Content and Whitespace Cache Invalidation

Anthropic's prompt caching requires an exact byte-for-byte prefix match. Common 'gotchas' that invalidate the cache include injecting dynamic timestamps or session IDs into system prompts, changing the order of tool definitions, or even inconsistent trailing whitespace (e.g., concatenating with '\n' in one process but not another).

Workaround: Move dynamic elements like timestamps to the user message block rather than the system prompt, and ensure tool definitions are sorted or stored in a stable-order Map.

Source: dev.toblog_post · cited 2026-05-27

Google Vertex AI Regional Endpoint Surcharge

While base per-token pricing for Claude models is generally identical across the Anthropic API, AWS Bedrock, and Google Vertex AI, Vertex AI reportedly applies a 10% surcharge specifically for regional endpoints compared to its global endpoint pricing.

Workaround: Use global endpoints where data residency requirements allow to avoid the 10% regional premium.

Source: hikari-dev.comblog_post · cited 2026-05-27

Managed vs. Unmanaged Agent Billing Disparity

Anthropic's beta 'Managed Agents' (Tier 3) provides a hosted runtime with optimized prompt caching and pay-as-you-go billing. In contrast, users of the 'Agent SDK' (Tier 2) reportedly face a 'hard-capped' credit system and must manually implement caching logic, which can lead to significantly higher costs if third-party tools reprocess context inefficiently.

Workaround: Migrate agentic workloads to Managed Agents or Routines to leverage Anthropic's native runtime optimizations and avoid the 'unmanaged' credit squeeze.

Source: reddit.comreddit · cited 2026-05-27

Hidden costs (25-40% beyond per-token rates)

Typical overhead: 25-40% beyond raw per-token rates.

What it costs to leave Anthropic

Leaving Anthropic requires rewriting prompts to remove XML-style tagging and adjusting tool-calling schemas. Lock-in is highest for users of Managed Agents or the Claude Partner Network due to proprietary runtime optimizations.

Who is this for?

For vibe coders & solo devs

Focus on maximizing the value of Claude Pro or the Claude for Open Source program if you maintain popular repositories. Use prompt caching for long coding sessions to keep costs low while maintaining context. Leverage the $25,000 to $100,000 in startup credits if you are building a new product through a VC partner.

* Apply for the Claude for Open Source program for approximately $1,200 in value.
* Use the count_tokens API to ensure prompts meet the 1,024 token caching threshold.
* Move dynamic timestamps to user messages to avoid cache invalidation.
* Implement a heartbeat request to maintain the 5-minute cache TTL during long pauses.

For SMBs and growing teams

Consolidate your spend through the Claude Marketplace if you already have an enterprise commitment. Consider the Claude Partner Network to access training and $100 million in total program funding for market development. Use the Team Standard or Premium plans to manage user access without the complexity of direct API integration.

* Apply for the Claude Partner Network to get dedicated Applied AI engineer support.
* Use the Claude Marketplace to consolidate third-party app spend into one invoice.
* Monitor the 10 percent regional surcharge on Google Vertex AI if using regional endpoints.
* Evaluate the $20,000 AI for Science program if your business conducts academic research.

For enterprise buyers

Prioritize AWS Bedrock or Google Vertex AI to use existing cloud credits and meet strict security requirements like VPC PrivateLink. Use Provisioned Throughput on AWS for guaranteed capacity at fixed hourly rates, which can range from $15,000 to $37,000 per Model Unit. Leverage Committed Use Discounts (CUDs) on Google Cloud for up to 50 percent savings on three-year terms.

* Negotiate AWS Bedrock Provisioned Throughput for high-volume, consistent workloads.
* Use Google Vertex AI global endpoints to avoid the 10 percent regional premium.
* Apply AWS Activate or Google Cloud startup credits to offset initial Claude usage.
* Utilize the Claude Marketplace to reduce the risk of cost overruns on large commitments.
Need help deciding which Anthropic tier or model fits your workload? Book a $19.99 quick call →

Sources verified for this page

Primary: Anthropic pricing page

View all 22 cited insider sources across 13 domains

Generator: gen-v5.0.8-2026-05-25 · Last refreshed: Tue May 26 2026 22:46:16 GMT-0400 (Eastern Daylight Time) · Pricing snapshot: Tue May 26 2026 00:00:00 GMT-0400 (Eastern Daylight Time)

📖 Data sources & methodology 161 text models · 9 embeddings · 24 vision · 41 audio · 8 vector DBs across 10 vendor pages · last verified 2026-06-05

Methodology

  • All prices are USD per 1 million tokens, current as of 2026-06-05.
  • Vendor-published values have no mark. Inferred/extrapolated values are marked with * and listed below.
  • Batch API discounts are 50% off standard rates across providers that offer Batch mode.
  • Prompt caching discounts vary by provider (typically 80-90% off cached input tokens).
  • Regional data-residency surcharges (Anthropic 1.1x, OpenAI 1.1x, Google regional tiers) are NOT included in base rates.
  • Long-context pricing tiers apply when input exceeds model threshold.
  • Embedding prices are input-only (no output tokens generated).

Primary sources

Last-verified date is the most recent successful daily snapshot (aicost_pricing_snapshots) or, when no snapshot exists yet, the latest successful crawler run (aicost_crawler_runs). 10 of 10 vendors are currently verified. Aggregator services (TokenCost, AI Pricing Guru, etc.) are not listed.

Anthropic
2026-06-05
https://www.anthropic.com/pricing
Daily snapshot since Sep 2023 · 578 days captured
Anthropic Docs
2026-06-05
https://platform.claude.com/docs/en/about-claude/pricing
Daily snapshot since Sep 2023 · 578 days captured
OpenAI
2026-06-05
https://openai.com/api/pricing/
Daily snapshot since Sep 2023 · 579 days captured
Google AI
2026-06-05
https://ai.google.dev/gemini-api/docs/pricing
Daily snapshot since Dec 2023 · 554 days captured
Google Vertex
2026-06-05
https://cloud.google.com/vertex-ai/generative-ai/pricing
Daily snapshot since Dec 2023 · 554 days captured
DeepSeek
2026-06-05
https://api-docs.deepseek.com/quick_start/pricing
Daily snapshot since May 2024 · 493 days captured
xAI
2026-06-05
https://x.ai/api
Daily snapshot since Nov 2024 · 411 days captured
Mistral
2026-06-05
https://mistral.ai/pricing
Daily snapshot since Dec 2023 · 552 days captured
Cohere
2026-06-05
https://cohere.com/pricing
Daily snapshot since Sep 2023 · 578 days captured

Inferred values (marked with * in calculator tables)

Derived from industry conventions, not directly published by the vendor. Typical conventions: cached input = 10% of base (90% off), Batch API = 50% of base (50% off).

Vendor / Model Field Why it’s inferred
Anthropic — Claude Sonnet 4.6 cachedInput Derived at 10% of input rate — Anthropic publishes 90% cache-hit discount on this tier.
Anthropic — Claude Sonnet 4.5 cachedInput Derived at 10% of input rate; same 90% cache-hit convention as Sonnet 4.6.
Anthropic — Claude Sonnet 4.5 batchInput Derived at 50% of standard input — Anthropic documents uniform 50% Batch discount.
Anthropic — Claude Sonnet 4.5 batchOutput Derived at 50% of standard output — Anthropic documents uniform 50% Batch discount.
Anthropic — Claude Haiku 4.5 cachedInput Derived at 10% of input rate — Anthropic 90% cache-hit discount convention.
OpenAI — GPT-5.4 Mini cachedInput Derived at 10% of input — OpenAI documents automatic 90% discount on cache hits across GPT-5.x tier.
OpenAI — GPT-5.4 Nano cachedInput Derived at 10% of input — OpenAI 90% cache-hit convention.
OpenAI — GPT-5.4 Nano batchInput Derived at 50% of input — OpenAI Batch API uniform 50% discount.
OpenAI — GPT-5.4 Nano batchOutput Derived at 50% of output — OpenAI Batch API uniform 50% discount.
OpenAI — GPT-5.4 Pro cachedInput Derived at 10% of input — OpenAI 90% cache-hit convention.
OpenAI — GPT-5.4 Pro batchInput Derived at 50% of input — OpenAI Batch API uniform 50% discount.
OpenAI — GPT-5.4 Pro batchOutput Derived at 50% of output — OpenAI Batch API uniform 50% discount.
OpenAI — GPT-5.2 cachedInput Derived at 10% of input; no residency uplift.
OpenAI — GPT-5.2 batchInput Derived at 50% of input.
OpenAI — GPT-5.2 batchOutput Derived at 50% of output.
OpenAI — GPT-5 cachedInput Derived at 10% of input.
OpenAI — GPT-5 batchInput Derived at 50% of input.
OpenAI — GPT-5 batchOutput Derived at 50% of output.
OpenAI — GPT-5.5 Pro cachedInput Derived at 10% of input — OpenAI does not publish a cached rate for *-pro models; using the family convention.
OpenAI — GPT-5.5 Pro batchInput Derived at 50% of input.
OpenAI — GPT-5.5 Pro batchOutput Derived at 50% of output.
OpenAI — GPT-5.2 Pro cachedInput Derived at 10% of input — pro-tier convention.
OpenAI — GPT-5.2 Pro batchInput Derived at 50% of input.
OpenAI — GPT-5.2 Pro batchOutput Derived at 50% of output.
OpenAI — GPT-5.1 batchInput Derived at 50% of input.
OpenAI — GPT-5.1 batchOutput Derived at 50% of output.
OpenAI — GPT-5 Pro batchInput Derived at 50% of input.
OpenAI — GPT-5 Pro batchOutput Derived at 50% of output.
OpenAI — GPT-5 Nano cachedInput Derived at 10% of input.
OpenAI — GPT-5 Nano batchInput Derived at 50% of input.
OpenAI — GPT-5 Nano batchOutput Derived at 50% of output.
Google — Gemini 3 Flash cachedInput Derived at 10% of input — Google caching discount convention ~90%.
Google — Gemini 3.1 Flash-Lite cachedInput Derived at 10% of input — Google caching convention.
Google — Gemini 3.1 Flash-Lite batchInput Derived at 50% of input — Google Batch API uniform 50% discount.
Google — Gemini 3.1 Flash-Lite batchOutput Derived at 50% of output — Google Batch API uniform 50% discount.
Google — Gemini 2.5 Pro cachedInput Derived at 10% of input.
Google — Gemini 2.5 Flash cachedInput Derived at 10% of input.
Google — Gemini 2.5 Flash-Lite cachedInput Derived at 10% of input — Google caching convention.
Google — Gemini 2.5 Flash-Lite batchInput Derived at 50% of input — Google Batch API uniform 50% discount.
Google — Gemini 2.5 Flash-Lite batchOutput Derived at 50% of output — Google Batch API uniform 50% discount.
Google — Gemini 2.0 Flash cachedInput Derived at 25% of input per Google 2.0 family caching rates.
Google — Gemini 2.0 Flash batchInput Derived at 50% of input — Google Batch API uniform 50% discount.
Google — Gemini 2.0 Flash batchOutput Derived at 50% of output — Google Batch API uniform 50% discount.
Google — Gemini 2.0 Flash-Lite cachedInput Derived at 10% of input — Google caching convention.
Google — Gemini 2.0 Flash-Lite batchInput Derived at 50% of input — Google Batch API uniform 50% discount.
Google — Gemini 2.0 Flash-Lite batchOutput Derived at 50% of output — Google Batch API uniform 50% discount.
xAI — Grok 4 (legacy) cachedInput Extrapolated at 25% of base.

Pricing is cross-verified against the LiteLLM community registry when available. Daily snapshots are kept in aicost_pricing_snapshots; every change is logged to aicost_price_changelog with old & new values for full audit trail. Read the full methodology →