Guides → Playground & Guide → Vendor Concentration Risk - How Exposed Is Your AI Portfolio?

Vendor Concentration Risk - How Exposed Is Your AI Portfolio?

Meet Diana Sokolov. CTO at a 250-person Series D. "We're 85% on Anthropic. Board asked: what if Anthropic raises prices 50% or has a 6-week outage?"

🔥 No good answer. Need a number + strategy by next quarter board meeting.

The story

Single-vendor AI is one of the highest-leverage risks most companies don't price. If 80%+ of AI spend goes to one vendor, you're exposed to: surprise pricing changes (15-50%), capacity throttling during their outage, model deprecation (vendor sunsets the model you depend on), and contract renegotiation power asymmetry.

Diana's exposure: 85% Anthropic, 10% OpenAI, 5% Google. Single-vendor concentration score: 8/10 (red zone). Mitigation: build LiteLLM-style abstraction now (3-4 weeks), maintain prompt portability (test on Sonnet AND GPT-5.5 weekly), keep at least 20% of workload routed elsewhere as a 'living hedge.'

Three risk dimensions. (1) Pricing risk - how much can vendor raise prices before you must absorb? (2) Operational risk - how long can you survive a vendor outage? (3) Strategic risk - vendor changes terms (no-train tier sunset, model deprecation, geographic restrictions). Each has different mitigation.

About this calculator: Vendor Concentration Risk - How Exposed Is Your AI Portfolio?

Single-vendor AI is a board-level risk. Quantify your concentration, model migration cost, and design the multi-vendor strategy that won't bankrupt you.

Inputs you control

Input	Impact on result	Range	Typical
Total monthly AI spend ($)	All vendors combined. The blast radius.	500 – 5M	100000
Top vendor share (%)	Diana: 85% Anthropic. Healthy multi-vendor: <60%. Concentration: >75%.	20 – 100	85
Estimated switching time (months)	How long to migrate top vendor to alt. Re-prompt + eval + edge cases. Production migrations: 3-6 months typical.	0.5 – 24	4

Outputs computed for you · model: `risk`

Output	How inputs affect it
Monthly cost ($)	computed from inputs
Annual cost ($)	monthlyUsd × 12

Below: live sliders. Move them to see numbers change in real time. * Output uses the generic compute model — for precise numbers use the full calculator below.

What you're looking at

Each input shapes your cost. Move the slider — see the impact.

Total monthly AI spend ($) 100,000

All vendors combined. The blast radius.

Estimated: —

Top vendor share (%) 85

Diana: 85% Anthropic. Healthy multi-vendor: <60%. Concentration: >75%.

Estimated: —

Estimated switching time (months) 4

How long to migrate top vendor to alt. Re-prompt + eval + edge cases. Production migrations: 3-6 months typical.

Estimated: —

Ready to run the numbers?

Open the full calculator — pick a model, enter your tokens, see per-call, daily, monthly, and annual cost.

🚀 Open the full calculator →

Reading your result

Concentration score: top_vendor_share / 10. 85% = 8.5/10 (red). 60% = 6/10 (yellow). 40% = 4/10 (green). Below 30% concentration is rarely worth chasing - diversification cost > marginal risk reduction.

Pricing shock cost = total_spend × top_share × shock_pct. Diana: $100K × 0.85 × 0.25 = $21.25K/mo extra if Anthropic raises prices 25%. Annual $255K. Real money.

Outage exposure = total_spend × top_share × (outage_days / 30). 1-week Anthropic outage: $100K × 0.85 × 7/30 = ~$20K of business at risk (assuming AI is revenue-generating, not just cost).

Mitigation costs are small relative to risk. LiteLLM-style abstraction layer + dual-vendor testing = ~$1-3K/mo overhead. Insurance premium against $250K+ shock. Worth it at most enterprise scales.

What "good" looks like:

Healthy multi-vendor: No vendor >60%. Quarterly migration drills.
Acceptable concentration: 60-75% on one. Abstraction layer present. Tested fallback path.
Red zone: >75% on one. No abstraction. Long migration time. Mitigate within 6 months.
Catastrophic: 95%+ single-vendor. No fallback. >12 month switch time. Board-level risk.

Top vendors for diversification (alternates to current)

Verified 20 hours ago

1

GPT-5 Mini

$0.250 in · $2.00 out ·
2

Command

$1.00 in · $2.00 out ·
3

devstral-2

$0.400 in · $2.00 out ·

Three real scenarios

Same calculator, three different team sizes. Click a tab to see how the numbers shift.

$80,000 / month ≈ $960,000 / year

Mid-size SaaS, 50% Anthropic + 30% OpenAI + 20% Google. Multi-vendor abstraction in place. Pricing shock exposure: $10K/mo. Switching time 2 months. Healthy.

Healthy range: Concentration score 5 - green zone

See inputs used

totalMonthlySpendUsd: 80,000
topVendorSharePct: 50
switchingTimeMonths: 2
expectedPriceShockPct: 25
abstractionLayerCostMonthlyUsd: 1,500

Trade-offs

Cost isn't the only dimension. Click any constraint — see how recommendations change.

What matters most to you? Click any dimension — recommendations update.

Best fit for "cost":

Multi-vendor adds 5-15% operational overhead Abstraction layer, dual maintenance
Multi-vendor saves 10-30% via competitive pricing Negotiating leverage

Multi-vendor isn't free - it adds complexity. But it gives you negotiating leverage and operational hedging. Net cost is usually neutral or slightly positive.

Use cases

Pre-loaded scenarios for the most common applications. Click a tab to see realistic numbers — then the "Try this scenario" button to load it into the calculator above.

$5,000 / month ≈ $60,000 / year

10-person startup. $5K bill. 90% one vendor for simplicity. Switch time low (small codebase). Pricing shock tolerable at this scale. Don't over-engineer; revisit at $20K+ bill.

Healthy range: Concentration high but acceptable at this stage

See inputs used

totalMonthlySpendUsd: 5,000
topVendorSharePct: 90
switchingTimeMonths: 1
expectedPriceShockPct: 30
abstractionLayerCostMonthlyUsd: 0

What this calculator can't tell you

Honest limitations — every model is wrong; some are useful. Where this one falls short:

Pricing shock estimates are heuristic - vendors don't pre-announce price changes.
Switching time depends heavily on architecture cleanliness.
Doesn't model business-impact of outages (revenue loss varies by use case).
Doesn't include reputational risk from vendor incidents (Anthropic outage → your customer impact).

For these, use: Multi-Model Router for routing layer. Self-Host Break-even for ultimate hedge.

Where to go next

Implement routing as hedge →

Multi-vendor in production.

Self-host as ultimate hedge →

Eliminate vendor dependency.

Track pricing changes →

Early warning system.

Methodology

Source: /ai-cost-economics
Extraction: Risk model calibrated against 12 historical vendor incidents (price changes, outages, deprecations).
Editorial gate: 8-layer defense — see aicost.ai/ai-cost-economics
Last verified: 6/4/2026, 8:00:00 PM

Author: Subu Vdaygiri, Founder & CEO of CloudIntelligence.ai. 17 years Fortune 100 (Ingram Micro, Siemens). Wharton CTO program · Kellogg CPO program · 10× AWS+Azure certified.

3 years of pricing history

Why this matters: pricing for major vendors has dropped 40-90% in the last 24 months. A budget set 12 months ago is probably wrong by 30%+.

View 3-year history for →

📖 Data sources & methodology 161 text models · 9 embeddings · 24 vision · 41 audio · 8 vector DBs across 10 vendor pages · last verified 2026-06-05

Methodology

All prices are USD per 1 million tokens, current as of 2026-06-05.
Vendor-published values have no mark. Inferred/extrapolated values are marked with * and listed below.
Batch API discounts are 50% off standard rates across providers that offer Batch mode.
Prompt caching discounts vary by provider (typically 80-90% off cached input tokens).
Regional data-residency surcharges (Anthropic 1.1x, OpenAI 1.1x, Google regional tiers) are NOT included in base rates.
Long-context pricing tiers apply when input exceeds model threshold.
Embedding prices are input-only (no output tokens generated).

Primary sources

Last-verified date is the most recent successful daily snapshot (aicost_pricing_snapshots) or, when no snapshot exists yet, the latest successful crawler run (aicost_crawler_runs). 10 of 10 vendors are currently verified. Aggregator services (TokenCost, AI Pricing Guru, etc.) are not listed.

Anthropic

2026-06-05

https://www.anthropic.com/pricing

Daily snapshot since Sep 2023 · 578 days captured

Anthropic Docs

2026-06-05

https://platform.claude.com/docs/en/about-claude/pricing

Daily snapshot since Sep 2023 · 578 days captured

OpenAI

2026-06-05

https://openai.com/api/pricing/

Daily snapshot since Sep 2023 · 579 days captured

Google AI

2026-06-05

https://ai.google.dev/gemini-api/docs/pricing

Daily snapshot since Dec 2023 · 554 days captured

Google Vertex

2026-06-05

https://cloud.google.com/vertex-ai/generative-ai/pricing

Daily snapshot since Dec 2023 · 554 days captured

DeepSeek

2026-06-05

https://api-docs.deepseek.com/quick_start/pricing

Daily snapshot since May 2024 · 493 days captured

xAI

2026-06-05

https://x.ai/api

Daily snapshot since Nov 2024 · 411 days captured

Mistral

2026-06-05

https://mistral.ai/pricing

Daily snapshot since Dec 2023 · 552 days captured

Cohere

2026-06-05

https://cohere.com/pricing

Daily snapshot since Sep 2023 · 578 days captured

Voyage AI

2026-06-05

https://docs.voyageai.com/docs/pricing

Inferred values (marked with * in calculator tables)

Derived from industry conventions, not directly published by the vendor. Typical conventions: cached input = 10% of base (90% off), Batch API = 50% of base (50% off).

Vendor / Model	Field	Why it’s inferred
Anthropic — Claude Sonnet 4.6	`cachedInput`	Derived at 10% of input rate — Anthropic publishes 90% cache-hit discount on this tier.
Anthropic — Claude Sonnet 4.5	`cachedInput`	Derived at 10% of input rate; same 90% cache-hit convention as Sonnet 4.6.
Anthropic — Claude Sonnet 4.5	`batchInput`	Derived at 50% of standard input — Anthropic documents uniform 50% Batch discount.
Anthropic — Claude Sonnet 4.5	`batchOutput`	Derived at 50% of standard output — Anthropic documents uniform 50% Batch discount.
Anthropic — Claude Haiku 4.5	`cachedInput`	Derived at 10% of input rate — Anthropic 90% cache-hit discount convention.
OpenAI — GPT-5.4 Mini	`cachedInput`	Derived at 10% of input — OpenAI documents automatic 90% discount on cache hits across GPT-5.x tier.
OpenAI — GPT-5.4 Nano	`cachedInput`	Derived at 10% of input — OpenAI 90% cache-hit convention.
OpenAI — GPT-5.4 Nano	`batchInput`	Derived at 50% of input — OpenAI Batch API uniform 50% discount.
OpenAI — GPT-5.4 Nano	`batchOutput`	Derived at 50% of output — OpenAI Batch API uniform 50% discount.
OpenAI — GPT-5.4 Pro	`cachedInput`	Derived at 10% of input — OpenAI 90% cache-hit convention.
OpenAI — GPT-5.4 Pro	`batchInput`	Derived at 50% of input — OpenAI Batch API uniform 50% discount.
OpenAI — GPT-5.4 Pro	`batchOutput`	Derived at 50% of output — OpenAI Batch API uniform 50% discount.
OpenAI — GPT-5.2	`cachedInput`	Derived at 10% of input; no residency uplift.
OpenAI — GPT-5.2	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5.2	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5	`cachedInput`	Derived at 10% of input.
OpenAI — GPT-5	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5.5 Pro	`cachedInput`	Derived at 10% of input — OpenAI does not publish a cached rate for *-pro models; using the family convention.
OpenAI — GPT-5.5 Pro	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5.5 Pro	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5.2 Pro	`cachedInput`	Derived at 10% of input — pro-tier convention.
OpenAI — GPT-5.2 Pro	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5.2 Pro	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5.1	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5.1	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5 Pro	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5 Pro	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5 Nano	`cachedInput`	Derived at 10% of input.
OpenAI — GPT-5 Nano	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5 Nano	`batchOutput`	Derived at 50% of output.
Google — Gemini 3 Flash	`cachedInput`	Derived at 10% of input — Google caching discount convention ~90%.
Google — Gemini 3.1 Flash-Lite	`cachedInput`	Derived at 10% of input — Google caching convention.
Google — Gemini 3.1 Flash-Lite	`batchInput`	Derived at 50% of input — Google Batch API uniform 50% discount.
Google — Gemini 3.1 Flash-Lite	`batchOutput`	Derived at 50% of output — Google Batch API uniform 50% discount.
Google — Gemini 2.5 Pro	`cachedInput`	Derived at 10% of input.
Google — Gemini 2.5 Flash	`cachedInput`	Derived at 10% of input.
Google — Gemini 2.5 Flash-Lite	`cachedInput`	Derived at 10% of input — Google caching convention.
Google — Gemini 2.5 Flash-Lite	`batchInput`	Derived at 50% of input — Google Batch API uniform 50% discount.
Google — Gemini 2.5 Flash-Lite	`batchOutput`	Derived at 50% of output — Google Batch API uniform 50% discount.
Google — Gemini 2.0 Flash	`cachedInput`	Derived at 25% of input per Google 2.0 family caching rates.
Google — Gemini 2.0 Flash	`batchInput`	Derived at 50% of input — Google Batch API uniform 50% discount.
Google — Gemini 2.0 Flash	`batchOutput`	Derived at 50% of output — Google Batch API uniform 50% discount.
Google — Gemini 2.0 Flash-Lite	`cachedInput`	Derived at 10% of input — Google caching convention.
Google — Gemini 2.0 Flash-Lite	`batchInput`	Derived at 50% of input — Google Batch API uniform 50% discount.
Google — Gemini 2.0 Flash-Lite	`batchOutput`	Derived at 50% of output — Google Batch API uniform 50% discount.
xAI — Grok 4 (legacy)	`cachedInput`	Extrapolated at 25% of base.

Pricing is cross-verified against the LiteLLM community registry when available. Daily snapshots are kept in aicost_pricing_snapshots; every change is logged to aicost_price_changelog with old & new values for full audit trail. Read the full methodology →

Vendor Concentration Risk - How Exposed Is Your AI Portfolio?

The story

About this calculator: Vendor Concentration Risk - How Exposed Is Your AI Portfolio?

Inputs you control

Outputs computed for you · model: `risk`

What you're looking at

Ready to run the numbers?

Reading your result

Top vendors for diversification (alternates to current)

Three real scenarios

Trade-offs

Best fit for "cost":

Best fit for "hallucination":

Best fit for "compliance":

Best fit for "privacy":

Best fit for "latency":

Best fit for "vendor lock-in":

Best fit for "mlops overhead":

Use cases

What this calculator can't tell you

Where to go next

Methodology

3 years of pricing history

Methodology

Primary sources

Inferred values (marked with * in calculator tables)

The story

About this calculator: Vendor Concentration Risk - How Exposed Is Your AI Portfolio?

Inputs you control

Outputs computed for you · model: risk

What you're looking at

Ready to run the numbers?

Reading your result

Top vendors for diversification (alternates to current)

Three real scenarios

Trade-offs

Best fit for "cost":

Best fit for "hallucination":

Best fit for "compliance":

Best fit for "privacy":

Best fit for "latency":

Best fit for "vendor lock-in":

Best fit for "mlops overhead":

Use cases

What this calculator can't tell you

Where to go next

Methodology

3 years of pricing history

Methodology

Primary sources

Inferred values (marked with * in calculator tables)

Outputs computed for you · model: `risk`