Mistral AI pricing, complete breakdown
Verified 2026-05-16, cross-checked against Mistral AI pricing page, litellm, openrouter
Mistral AI currently offers two primary production models with distinct price points for different use cases. Mistral Large 3 serves as the flagship frontier model at $2 per million input tokens and $6 per million output tokens. For high-volume, cost-sensitive tasks, Mistral Small 4 provides a more economical path at $0.1 per million input and $0.3 per million output tokens. These models feature context windows of 262,000 and 128,000 tokens respectively. This page helps you calculate projected costs, compare model efficiency, and track recent pricing volatility.
How Mistral AI's pricing universe works
Mistral AI operates a multi-channel pricing strategy to maximize market reach across the developer, consumer, and enterprise segments. By offering metered API access alongside fixed-rate subscriptions and cloud marketplace deployments, they capture both high-margin builder activity and predictable recurring revenue. This hybrid approach allows Mistral to monetize their frontier models through whichever procurement path best fits the customer's technical and budgetary constraints.
API (per-token, metered)
- Pay only for tokens consumed
- Full model lineup including batch, caching, long context
- Programmatic via SDKs
Consumer subscriptions (Pro, Max tiers)
- Fixed monthly fee
- Generous usage caps
- Web/desktop/mobile apps
- Often includes newer models first
Business/Team plans
- Per-seat billing
- Centralized billing
- Admin & audit controls
- Sometimes shared usage pools
Enterprise (custom contract)
- Custom pricing and limits
- SLAs
- DPAs and BAAs
- Dedicated support
- Sometimes private cloud / VPC
Cloud marketplaces (AWS Bedrock, Google Vertex, Azure)
- Same models, slightly different pricing (often parity or small premium)
- Counts toward existing cloud spend commits
- Stays within cloud's data-protection boundary
Current pricing (all production models)
| Model | Input $/M | Output $/M | Cached $/M | Context |
|---|---|---|---|---|
| Mistral Large 3 (mistral-large-3) | $2 | $6 | — | 262,000 |
| Mistral Small 4 (mistral-small-4) | $0.10 | $0.30 | — | 128,000 |
Pricing is based on standard API rates per million tokens. No explicit cache or batch pricing is currently stated for these versions. Verified as of May 16, 2026.
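As a rough illustration of how the standard rates above turn into a bill, here is a minimal sketch. The function name `estimate_cost` and the example workload (40M input tokens, 10M output tokens) are hypothetical; only the per-million rates come from the table.

```python
# Standard API rates in USD per million tokens, copied from the table above.
RATES = {
    "mistral-large-3": {"input": 2.00, "output": 6.00},
    "mistral-small-4": {"input": 0.10, "output": 0.30},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload at standard rates."""
    rate = RATES[model]
    total = input_tokens * rate["input"] + output_tokens * rate["output"]
    return total / 1_000_000


# Hypothetical monthly workload: 40M input tokens and 10M output tokens.
for model in RATES:
    print(model, round(estimate_cost(model, 40_000_000, 10_000_000), 2))
```

For this assumed workload the estimate is about $140 on Mistral Large 3 and about $7 on Mistral Small 4, before any of the overheads discussed later on this page.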
Full rate breakdown (all variants)
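Rate variants that can apply beyond the standard API rate: batch (asynchronous, 50% off), cached reads (0.1x the input rate), cache writes (1.25x or 2x the base rate), and a long-context tier (roughly 2x above a token threshold). Only the standard rate is listed for each model below.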
Mistral Large 3 mistral-large-3
| Variant | Input $/M | Output $/M | Notes |
|---|---|---|---|
| Standard | $2 | $6 | Default per-token API rate |
Mistral Small 4 mistral-small-4
| Variant | Input $/M | Output $/M | Notes |
|---|---|---|---|
| Standard | $0.10 | $0.30 | Default per-token API rate |
What changed in the last 30-90 days
- 2026-05-11: Mistral Large 3 input and output prices increased by 50%. — Production costs for flagship model workloads have risen from $2/$6 to $3/$9 per million tokens.
- 2026-05-11: Mistral Small 4 input and output prices increased by 100%. — The cost of running high-efficiency workloads has doubled, though it remains the most affordable option in the lineup.
How buyers think about Mistral AI pricing
Cheapest Mistral model for high-volume tasks
The problem: You need to process millions of simple classification or extraction tasks without exhausting your budget on frontier-class models. High-volume workloads can quickly become unsustainable if you use flagship models for basic logic.
What to do: Deploy Mistral Small 4 for high-throughput utility tasks while reserving Mistral Large 3 for complex reasoning.
→ Mistral Small 4 provides a 95% cost reduction compared to Large 3 for high-volume utility workloads.
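The 95% figure can be checked directly from the listed rates. This is a small sketch; `percent_savings` is a hypothetical helper name, and the rates are the standard ones quoted on this page.

```python
def percent_savings(cheaper_rate: float, pricier_rate: float) -> float:
    """Percentage saved by paying cheaper_rate instead of pricier_rate."""
    return (1 - cheaper_rate / pricier_rate) * 100


# Small 4 vs Large 3, USD per million tokens.
input_savings = percent_savings(0.10, 2.00)
output_savings = percent_savings(0.30, 6.00)
print(f"input: {input_savings:.0f}%, output: {output_savings:.0f}%")
```

Because the input and output rates differ by the same factor of 20, the saving works out to 95% regardless of the input/output mix of the workload.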
Self-host open weights vs pay La Plateforme API
The problem: You are weighing the infrastructure overhead of self-hosting Mistral models against the simplicity of the managed API. You need to know when the operational complexity of a private cluster pays for itself.
What to do: Utilize La Plateforme API for development and scaling, then consider self-hosting once monthly volume exceeds 50 million tokens.
→ The API is generally more cost-effective for workloads under 50 million tokens per month due to zero maintenance overhead.
When Mistral Large 3 beats frontier alternatives
The problem: You require top-tier performance for complex reasoning but want to avoid the high price points or data residency concerns of other frontier models. You need a competitive alternative that balances power with predictable pricing.
What to do: Standardize on Mistral Large 3 for production reasoning tasks to leverage its competitive $2/$6 pricing structure.
→ Mistral Large 3 offers frontier-level intelligence at a transparent $2 input / $6 output per million tokens, or $8.00 for one million tokens of each.
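A single per-token figure for Mistral Large 3 depends on the share of input versus output tokens in the workload. The sketch below computes a weighted rate; `blended_rate` and the example shares are illustrative assumptions, not figures published by Mistral.

```python
def blended_rate(input_rate: float, output_rate: float, input_share: float) -> float:
    """Weighted USD rate per million tokens.

    input_share is the fraction of all tokens that are input tokens (0.0 to 1.0).
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_share * input_rate + (1 - input_share) * output_rate


# Mistral Large 3 at $2 input / $6 output per million tokens.
even_mix = blended_rate(2.00, 6.00, 0.50)      # equal input and output
input_heavy = blended_rate(2.00, 6.00, 0.75)   # 3 input tokens per output token
print(even_mix, input_heavy)
```

Under these assumptions an even mix averages $4 per million tokens and a 3:1 input-heavy mix averages $3, so prompt-heavy workloads sit closer to the input rate.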
Le Chat Pro and Team subscriptions
The problem: You need consistent access to Mistral models for daily research and drafting but find per-token API billing difficult to predict for human-in-the-loop tasks. You want a flat-rate option for your team.
What to do: Use Le Chat Pro for individual power users or Le Chat Team for collaborative environments to cap monthly spend.
→ Le Chat Pro pays for itself if a user generates more than 2.5 million output tokens per month on flagship models.
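The break-even point above follows from dividing the subscription price by the flagship output rate. This sketch assumes the $14.99 monthly Le Chat Pro price quoted elsewhere on this page and ignores input tokens; `breakeven_output_tokens` is a hypothetical helper name.

```python
def breakeven_output_tokens(subscription_usd: float, output_rate_per_million: float) -> float:
    """Output tokens per month at which API spend equals the subscription fee."""
    return subscription_usd / output_rate_per_million * 1_000_000


# Le Chat Pro at $14.99/month vs Mistral Large 3 output at $6 per million tokens.
tokens = breakeven_output_tokens(14.99, 6.00)
print(f"{tokens / 1_000_000:.2f}M output tokens per month")
```

Counting input tokens as well would lower the break-even volume, so the 2.5 million figure is best read as an upper bound for output-only usage.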
Mistral on AWS Bedrock vs direct
The problem: Your organization is already committed to the AWS ecosystem and you need to decide whether to use Mistral's direct API or the Bedrock-hosted version. You need to balance feature access with procurement efficiency.
What to do: Use AWS Bedrock for production workloads to utilize existing cloud commits and Provisioned Throughput discounts.
→ AWS Bedrock is the preferred choice for enterprises seeking to reduce effective rates through provisioned capacity commitments.
EU data residency premium with Mistral
The problem: Strict GDPR requirements or internal compliance policies require your data to remain within the European Union. You need a high-performance model that satisfies these residency requirements without a massive price premium.
What to do: Deploy Mistral models via La Plateforme's European regions or Azure's EU-based data centers.
→ Mistral provides native EU data residency at standard global pricing rates.
Volume discounts & partner programs
Mistralship (Startup Program)
Threshold: Startups founded less than 7 years ago that have not raised a Series B or later funding round
Typical discount (reported): 30,000 credits for La Plateforme
Benefits:
- One-on-one support from the Solutions & Science team
- Early access to new models and products
- Six-month cohort participation
How to engage: Apply via the official Mistral AI startup program form using a business email
Source: dataphoenix.info (community) · cited 2024-12-18
Mistral AI Ambassador Program
Threshold: Startups building AI applications with Mistral models
Typical discount (reported): Free API credits (value varies)
Benefits:
- Equity-free benefits
- Early access to new features
- VIP recognition
- Six-month program duration
How to engage: Apply through the official Mistral AI Startup Program portal
Source: startup-perks.com (community) · cited 2026-02-14
Mistral AI Enterprise Plan
Threshold: Reportedly starts at approximately $20,000 per month or equivalent annual commitment
Typical discount (reported): Volume discounts vary by contract
Benefits:
- SAML SSO and ACL permissions
- Comprehensive administrative controls and audit logs
- Private or on-premises deployment options
- 99.9% uptime SLA guarantees
- Priority access to new models
How to engage: Contact Mistral AI sales team for a custom quote
Source: wise.com (analyst_report) · cited 2025-08-19
Mistral AI Usage Tiers (La Plateforme)
Threshold: Automatic upgrades based on cumulative billing: Tier 1 ($0), Tier 2 (>$20), Tier 3 (>$100), Tier 4 (>$500)
Typical discount (reported): Standard pay-as-you-go rates with increased rate limits
Benefits:
- Increased requests per second (RPS)
- Higher tokens per minute (TPM) throughput
- Higher overall monthly consumption caps
How to engage: Upgrade to a Scale plan in the Mistral AI Admin console; tiers advance automatically with spend
Source: docs.mistral.ai (vendor_official) · cited 2025-10-30
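The tier thresholds listed above can be expressed as a simple lookup. This is a sketch only: `usage_tier` is a hypothetical helper, and it assumes each threshold is a strict "greater than" on cumulative billing, as the ">" signs in the list suggest.

```python
# (tier, cumulative billing threshold in USD), highest tier first.
TIER_THRESHOLDS = [(4, 500.0), (3, 100.0), (2, 20.0)]


def usage_tier(cumulative_billing_usd: float) -> int:
    """Return the usage tier (1 to 4) for a given cumulative billing amount."""
    if cumulative_billing_usd < 0:
        raise ValueError("cumulative billing cannot be negative")
    for tier, threshold in TIER_THRESHOLDS:
        if cumulative_billing_usd > threshold:
            return tier
    return 1


for spend in (0, 25, 150, 750):
    print(spend, "-> Tier", usage_tier(spend))
```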
Azure AI Foundry Provisioned Throughput Reservations
Threshold: Available for 1-month or 1-year terms
Typical discount (reported): Reportedly up to 70% savings compared to hourly pay-as-you-go pricing
Benefits:
- Reserved capacity for predictable consumption
- Not model-dependent (applies to PTUs in a specific region/deployment type)
- Eliminates rate-limit variability at peak load
How to engage: Purchase via Azure AI Foundry portal or contact Azure sales
Source: techcommunity.microsoft.com (vendor_official) · cited 2025-05-19
AWS Bedrock Provisioned Throughput
Threshold: 1-month or 6-month commitment terms
Typical discount (reported): Reportedly 20–40% for 6-month commitments
Benefits:
- Guaranteed throughput for production workloads
- No per-token charges (billed hourly per model unit)
- Model-agnostic within a provider family
How to engage: Purchase Model Units (MUs) through the AWS Bedrock console
Source: medium.com (community) · cited 2026-03-10
Google Vertex AI Committed Use Discounts (CUDs)
Threshold: 1-year or 3-year spending commitments
Typical discount (reported): Approximately 25% to 55% savings
Benefits:
- Applies to Vertex AI training and inference workloads
- Spend-based CUDs cover multiple machine families and regions
- Resource-based CUDs for predictable GPU/vCPU workloads
How to engage: Purchase via Google Cloud Console under Billing > Commitments
Source: cloud.google.com (vendor_official) · cited 2024-12-02
Multi-cloud availability
| Cloud | Model availability | Price vs vendor-direct | Source |
|---|---|---|---|
| AWS Bedrock | Mistral Large 3, Ministral 3 (3B, 8B, 14B), Mistral Large, Mistral Small, Mixtral 8x7B, Mistral 7B, Pixtral Large | On-demand pricing reportedly matches provider direct API rates; batch inference is 50% off | mistral.ai |
| Google Vertex AI | Mistral Medium 3, Mistral OCR (25.05), Mistral Small 3.1 (25.03), Codestral 2 | Pay-as-you-go; context caching available at a 90% discount | cloud.google.com |
| Microsoft Azure | Mistral Large, Mistral Small, Mistral Nemo, Codestral, Mixtral | Standard Azure ML pay-per-token pricing | azure.microsoft.com |
| Together AI | Mistral Large, Mixtral 8x22B, Mistral 7B variants | Mistral Large at $9.00 per million output tokens; Mixtral 8x22B at $1.20 per million tokens | together.ai |
| Anyscale | Mistral-7B-Instruct-v0.1 | $0.15 per 1M tokens for both input and output | anyscale.com |
| Snowflake Cortex | mistral-large2, mistral-7b | mistral-large2 at 1.00 credits/M input and 3.00 credits/M output; AI Credits priced at $2.00 per credit | docs.snowflake.com |
| IBM watsonx | Mistral Large | Pay-as-you-go pricing per million tokens; varies by plan | ibm.com |
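Credit-based pricing such as the Snowflake Cortex row is easier to compare once converted to dollars. The sketch below uses the credit figures and the $2.00 credit price reported in the table; `credits_to_usd_per_million` is a hypothetical helper, and actual credit prices may vary by Snowflake contract.

```python
USD_PER_CREDIT = 2.00  # AI Credit price reported in the table above


def credits_to_usd_per_million(credits_per_million: float,
                               usd_per_credit: float = USD_PER_CREDIT) -> float:
    """Convert a credits-per-million-tokens rate into USD per million tokens."""
    return credits_per_million * usd_per_credit


# mistral-large2 on Snowflake Cortex: 1.00 credits/M input, 3.00 credits/M output.
input_usd = credits_to_usd_per_million(1.00)
output_usd = credits_to_usd_per_million(3.00)
print(f"input ${input_usd:.2f}/M, output ${output_usd:.2f}/M")
```

At the reported credit price this works out to $2 per million input tokens and $6 per million output tokens for mistral-large2.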
Free credits & startup programs
Mistralship (Mistral AI Startup Program)
Reported value: 30,000 platform credits
Eligibility: Startups founded less than 7 years ago that have not raised a Series B or later round; requires a business email and online presence
How to apply: Fill out the application form on Mistral AI's platform (La Plateforme) during open cohort windows
Mistral AI Ambassador Program
Reported value: Free API credits and early access
Eligibility: Startups building AI applications with Mistral models; typically pre-seed, seed, or Series A stages
How to apply: Apply through the official Mistral AI startup portal or partner referral links
Google for Startups Cloud Program (Scale Tier)
Reported value: $10,000 USD in credits for partner models
Eligibility: Qualifying startups in the Scale and Scale AI Tier of the Google for Startups Cloud Program
How to apply: Members must contact their Google Cloud Account Executive to request access to these partner model credits
AWS Activate
Reported value: up to $100,000 in AWS Activate Credits
Eligibility: Self-funded or pre-Series B startups founded in the past 10 years; Portfolio tier requires association with an Activate Provider
How to apply: Apply via the AWS Activate console; credits are redeemable for third-party models in Amazon Bedrock including Mistral AI
Microsoft for Startups Founders Hub
Reported value: up to $150,000 in Azure credits
Eligibility: Privately held, for-profit startups that have not gone through a Series D or later funding round
How to apply: Sign up through the Microsoft for Startups Founders Hub portal; credits can be used for Mistral models available on Azure AI
Mistral AI 2026 Worldwide Hackathon
Reported value: $15,000 in Mistral credits (Grand Prize)
Eligibility: Participants in the 48-hour global hackathon event
How to apply: Register for the hackathon through the official Mistral AI event page
Mistral AI Academic Partnership (ESSEC)
Reported value: Licenses for Le Chat Enterprise and research support
Eligibility: Researchers, professors, and students at ESSEC Business School
How to apply: Access provided through the ESSEC Metalab interface as part of the institutional partnership
Mistral AI Researcher Access (Aalborg University)
Reported value: API integration and AI Studio access
Eligibility: Researchers at Aalborg University (AAU)
How to apply: Access via the university's internal AI services portal
Pricing gotchas to watch
Prompt Cache 64-Token Minimum Block Size
Mistral's prompt caching mechanism operates on fixed blocks of 64 tokens. Prompts with a shared prefix of fewer than 64 tokens will not trigger a cache hit, and all cached token counts reported in the API response will be multiples of 64.
Workaround: Ensure system prompts or shared context prefixes are at least 64 tokens long to benefit from the 90% discount on cached tokens.
Source: docs.mistral.ai (vendor_docs) · cited 2026-05-16
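To see how the 64-token block size affects a bill, the following sketch rounds a shared prefix down to whole blocks and prices the cached portion at a 90% discount. It is an approximation under stated assumptions: the helper names are hypothetical, the discount is assumed to apply to the model's standard input rate, and the real cache may behave differently (for example on eviction or a first, uncached request).

```python
CACHE_BLOCK = 64        # block size described above
CACHED_DISCOUNT = 0.90  # discount on cached tokens described above


def cached_tokens(shared_prefix_tokens: int, block: int = CACHE_BLOCK) -> int:
    """Round the shared prefix down to a whole number of cache blocks."""
    return (shared_prefix_tokens // block) * block


def input_cost_usd(total_input_tokens: int, shared_prefix_tokens: int,
                   rate_per_million: float) -> float:
    """Estimated input cost in USD when the shared prefix hits the cache."""
    if shared_prefix_tokens > total_input_tokens:
        raise ValueError("shared prefix cannot exceed total input tokens")
    cached = cached_tokens(shared_prefix_tokens)
    uncached = total_input_tokens - cached
    cached_rate = rate_per_million * (1 - CACHED_DISCOUNT)
    return (uncached * rate_per_million + cached * cached_rate) / 1_000_000


print(cached_tokens(63), cached_tokens(64), cached_tokens(200))
```

A 63-token prefix yields no cached tokens, while a 200-token prefix yields 192, which is why padding a short system prompt past the 64-token boundary can matter.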
Vibe API Spending Limit Bypass
Users have reported that monthly API spending limits configured in the Mistral Admin Console may only apply to standard API usage and not to the 'Vibe API' (used by Mistral Vibe CLI). This has reportedly led to cases where users exceeded their set limits by hundreds of dollars without the API being throttled.
Workaround: Monitor usage for Mistral Vibe separately in the dashboard and manually track spending if using both the standard and Vibe APIs concurrently.
Source: reddit.com (reddit) · cited 2026-03-04
Le Chat Pro Subscription vs. Vibe CLI Billing
A common point of confusion for production users is that a Le Chat Pro subscription ($14.99/mo) does not grant unlimited or free usage of the Mistral Vibe CLI. While it provides higher usage limits, activity beyond those limits is billed at standard pay-as-you-go (PAYG) API rates.
Workaround: Check the Vibe CLI configuration (~/.vibe/config.toml) to ensure it is using the intended model and monitor the 'usage %' in the online interface to avoid unexpected PAYG charges.
Source: github.com (github_issue) · cited 2026-01-26
Experimental Model Pricing Transitions
Models released for experimental periods, such as 'Devstral Small 2' (labs-devstral-small-2512), can transition from free to paid status with minimal notice. Users have reported that the pricing page may continue to display a '$0' price tag (often with the original price crossed out) even after the model has moved to a paid tier, leading to unexpected billing spikes.
Workaround: Verify the current billing status of 'labs' or experimental models via support or recent Discord announcements before deploying them in high-volume production workflows.
Source: reddit.com (reddit) · cited 2026-03-11
Image Token Usage Surprises
Integrating images into workflows using multimodal models like Pixtral or Devstral can cause token consumption to increase significantly faster than text-only prompts. Users have reported usage 'skyrocketing' from minimal levels to near-quota limits shortly after enabling image support, reportedly due to high per-image token costs that are not always transparently documented in standard calculators.
Workaround: Perform small-scale testing with images to establish a baseline token cost per image before scaling multimodal applications.
Source: reddit.com (reddit) · cited 2026-03-11
Tokenizer V3 Tool Calling Overhead
The transition from Tokenizer V2 to V3 changed the encoding of tool messages. In V3, tool results are no longer wrapped in a list, and the entire history of tool calls is tokenized, which can alter the total token count and associated costs for complex agentic workflows compared to older versions.
Workaround: Review the 'prompt_tokens' and 'completion_tokens' in API responses when upgrading to Tokenizer V3 to ensure cost estimates remain accurate for tool-heavy applications.
Source: docs.mistral.ai (vendor_docs) · cited 2026-05-16
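One way to follow the workaround is to price each response from its reported token counts and compare before and after the upgrade. This sketch assumes the usage record is available as a dictionary with `prompt_tokens` and `completion_tokens` keys; the helper names and the two sample records are hypothetical and not measured values.

```python
def usage_cost_usd(usage: dict, input_rate: float, output_rate: float) -> float:
    """USD cost of one response, given its usage record and per-million rates."""
    total = (usage["prompt_tokens"] * input_rate
             + usage["completion_tokens"] * output_rate)
    return total / 1_000_000


def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


# Hypothetical usage records for the same tool-heavy request.
usage_old = {"prompt_tokens": 1200, "completion_tokens": 300}
usage_new = {"prompt_tokens": 1500, "completion_tokens": 300}

cost_old = usage_cost_usd(usage_old, 2.00, 6.00)
cost_new = usage_cost_usd(usage_new, 2.00, 6.00)
print(f"cost change: {percent_change(cost_old, cost_new):.1f}%")
```

Running the same comparison over a representative sample of real requests gives a more reliable picture than any single call.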
Hidden costs (25-40% beyond per-token rates)
- Prompt cache misses due to the 64-token minimum block size requirement.
- Unexpected token spikes when integrating images into multimodal Pixtral workflows.
- Retry overhead from network errors and rate limits adds 5-15% to effective cost.
- Tokenizer V3 overhead in agentic workflows where tool call history is re-processed.
- Vibe CLI activity that may bypass configured API spending limits in the admin console.
- Experimental 'labs' models transitioning from free to paid status with minimal notice.
- Internal engineering time required to manage private deployments or self-hosted weights.
Typical overhead: 25-40% beyond raw per-token rates.
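The overhead range above can be applied to a raw per-token estimate to get a planning budget. This is a minimal sketch; `effective_cost_range` is a hypothetical helper and the $1,000 raw figure is only an example.

```python
def effective_cost_range(raw_cost_usd: float,
                         low_overhead: float = 0.25,
                         high_overhead: float = 0.40) -> tuple[float, float]:
    """Return (low, high) budget estimates after adding typical overhead."""
    return (raw_cost_usd * (1 + low_overhead),
            raw_cost_usd * (1 + high_overhead))


# Example: a raw per-token estimate of $1,000 per month.
low, high = effective_cost_range(1000.0)
print(f"budget between ${low:,.0f} and ${high:,.0f}")
```

For a raw estimate of $1,000 this gives a budget of roughly $1,250 to $1,400 per month.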
What it costs to leave Mistral AI
Switching from Mistral is relatively straightforward due to their adherence to OpenAI-compatible API structures and the availability of open weights. The primary lock-in risks are specific tool-calling implementations in Tokenizer V3 and any deep integrations with Mistral-specific features like their native prompt caching blocks.
- small project (1-5 prompts): 1-2 engineer-days to update API endpoints and verify output consistency
- mid-size (10-50 prompts): 1-2 engineer-weeks to re-tune system prompts and adjust for tokenizer differences
- large agentic system: 1-3 engineer-months to migrate complex tool-calling logic and re-evaluate cost-performance
Who is this for?
For vibe coders & solo devs
Mistral is a favorite for the 'vibe coding' community due to the Vibe CLI and the low-cost Small 4 model. The $14.99 Le Chat Pro subscription is an excellent way to get high usage limits for research without worrying about per-token API spikes. However, be careful with the Vibe CLI as it may bypass your standard API spending limits set in the dashboard.
* Use Mistral Small 4 for rapid prototyping and code boilerplate generation.
* Monitor Vibe CLI usage separately to avoid unexpected monthly billing surprises.
* Leverage the 64-token prompt cache block by keeping your system prompts consistent.
* Apply for the Mistralship program if you are a seed-stage startup to get 30,000 credits.
For SMBs and growing teams
Small and medium businesses can leverage Mistral's transparent tier system to scale costs alongside growth. The automatic transition through usage tiers (Tier 1 to Tier 4) ensures that as your spend increases, your rate limits grow automatically. This makes Mistral a predictable partner for businesses that cannot commit to large upfront enterprise contracts.
* Start with the pay-as-you-go Tier 1 to test product-market fit with zero commitment.
* Use the Mistralship program to secure 30,000 credits if your company is less than 7 years old.
* Implement prompt caching for repetitive customer support queries to save 90% on input costs.
* Consider Le Chat Team subscriptions for internal staff to provide AI tools at a fixed monthly cost.
For enterprise buyers
For enterprise buyers, Mistral offers the flexibility of managed API access or private deployments on AWS, Azure, or Google Cloud. The Enterprise Plan, reportedly starting at $20,000 per month, provides the SLAs and administrative controls required for regulated industries. Multi-cloud availability ensures you can deploy Mistral models wherever your data currently resides.
* Negotiate volume discounts via the Enterprise Plan for commitments over $20,000 per month.
* Use AWS Bedrock or Azure AI Foundry to apply existing cloud credits toward Mistral usage.
* Utilize Provisioned Throughput on cloud providers to save up to 70% on predictable production loads.
* Ensure your security team reviews the SAML SSO and ACL permissions available in the Enterprise tier.
Sources verified for this page
Primary: Mistral AI pricing page
28 cited insider sources across 19 domains:
- Mistralship (Startup Program) (community, verified 2024-12-18)
- Mistral AI Ambassador Program (community, verified 2026-02-14)
- Mistral AI Enterprise Plan (analyst_report, verified 2025-08-19)
- Mistral AI Usage Tiers (La Plateforme) (vendor_official, verified 2025-10-30)
- Azure AI Foundry Provisioned Throughput Reservations (vendor_official, verified 2025-05-19)
- AWS Bedrock Provisioned Throughput (community, verified 2026-03-10)
- Google Vertex AI Committed Use Discounts (CUDs) (vendor_official, verified 2024-12-02)
- Prompt Cache 64-Token Minimum Block Size (vendor_docs, verified 2026-05-16)
- Vibe API Spending Limit Bypass (reddit, verified 2026-03-04)
- Le Chat Pro Subscription vs. Vibe CLI Billing (github_issue, verified 2026-01-26)
- Experimental Model Pricing Transitions (reddit, verified 2026-03-11)
- Image Token Usage Surprises (reddit, verified 2026-03-11)
- Tokenizer V3 Tool Calling Overhead (vendor_docs, verified 2026-05-16)
- AWS Bedrock (grounded_research, verified 2026-05-16)
- Google Vertex AI (grounded_research, verified 2026-04-30)
- Microsoft Azure (grounded_research, verified 2026-05-16)
- Together AI (grounded_research, verified 2026-05-14)
- Anyscale (grounded_research, verified 2026-05-16)
- Snowflake Cortex (grounded_research, verified 2026-04-08)
- IBM watsonx (grounded_research, verified 2026-05-14)
- Mistralship (Mistral AI Startup Program) (grounded_research, verified 2024-12-18)
- Mistral AI Ambassador Program (grounded_research, verified 2026-02-14)
- Google for Startups Cloud Program (Scale Tier) (grounded_research, verified 2026-05-16)
- AWS Activate (grounded_research, verified 2026-05-16)
- Microsoft for Startups Founders Hub (grounded_research, verified 2026-05-16)
- Mistral AI 2026 Worldwide Hackathon (grounded_research, verified 2026-03-16)
- Mistral AI Academic Partnership (ESSEC) (grounded_research, verified 2025-05-20)
- Mistral AI Researcher Access (Aalborg University) (grounded_research, verified 2026-05-16)
Generator: gen-v4.13-2026-05-15 · Last refreshed: Sat May 16 2026 18:11:06 GMT-0400 (Eastern Daylight Time) · Pricing snapshot: Sat May 16 2026 00:00:00 GMT-0400 (Eastern Daylight Time)