Voyage AI: Embedding Pricing for Retrieval Workflows

Voyage AI pricing decoded for vibe coders, solopreneurs, developers, IT buyers, and enterprise — with cross-vendor comparison and procurement intel. Refreshed weekly.

Last refreshed: 2026-05-02 🔴 Pricing data may be stale — refresh in progress

Live-tracked weekly via aicost crawlers against docs.voyageai.com. Discrepancies surfaced in changelog — see how this page is sourced.

How Voyage AI stacks up

Cross-vendor synthesis — what you can’t get from Voyage AI’s own marketing.

Among the AI vendors tracked here in 2026, Voyage AI (voyage) stands out because it sells embedding models for retrieval rather than general-purpose generation models. OpenAI (openai) and Cohere (cohere) list their flagship generation models, gpt-5-4 and command-r-plus, at $2.5/M input tokens. Voyage AI's listed models are also billed per token, but at embedding rates: $0.06/M for voyage-3 and $0.18/M for voyage-3-large. The comparison table on this page records no subscription tiers for Voyage AI, so its cost scales with the volume of text embedded, which matters in high-density retrieval tasks where large corpora are indexed.

OpenAI (openai) has the lowest listed input price via gpt-5-nano ($0.05/M input), slightly below voyage-3 ($0.06/M), although the two do different jobs: one generates text and the other produces embeddings. For production RAG systems, Voyage AI's single per-token input rate keeps the retrieval side of a TCO calculation simple. In the RAG knowledge base archetype, where the LLM typically represents only about 25% of total TCO, most of the budget goes to engineering and observability rather than to API bills.

How the pricing actually works

Tier structure, batch discounts, caching, mechanism details.

According to the pricing data collected for this page, Voyage AI (voyage) bills per token, as do peers such as OpenAI (openai) and Cohere (cohere). The embedding table lists only an input price for each Voyage AI model, and the vendor comparison records no subscription tiers, so costs are tied to the volume of text embedded rather than to the number of users or seats.

This structure is relevant for high-throughput applications. A developer using gpt-5-4 pays $2.5/M input tokens on every request, including any retrieved context sent with the prompt. A Voyage AI user pays $0.06/M (voyage-3) or $0.18/M (voyage-3-large) for the tokens that are embedded. Cost-saving measures such as prompt caching or token limits therefore matter most on the generation side, for example when managing the variable costs of command-r-plus ($2.5/M input).
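
The per-token arithmetic behind the $2.5/M figures quoted above can be sketched as follows. This is a minimal illustration, not a billing tool: the prices are the list prices quoted on this page, while the request volume and token counts are hypothetical values chosen only to make the numbers concrete.

```python
# List prices quoted on this page, in USD per 1M tokens.
PRICES_PER_MILLION = {
    "gpt-5-4": {"input": 2.5, "output": 15.0},
    "command-r-plus": {"input": 2.5, "output": 10.0},
}


def token_cost(model: str, input_tokens: int, output_tokens: int = 0) -> float:
    """Return the USD cost of a workload that is billed per token."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload (an assumption, not data from this page):
# 10,000 requests per month, each with 8,000 input tokens and 500 output tokens.
requests_per_month = 10_000
monthly = token_cost(
    "gpt-5-4",
    input_tokens=8_000 * requests_per_month,
    output_tokens=500 * requests_per_month,
)
print(f"gpt-5-4 monthly cost: ${monthly:,.2f}")  # $275.00
```

Under these assumptions, input tokens account for $200 of the $275, which is why context-heavy retrieval workloads are sensitive to the input rate.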

What's changed recently

Last 30 days of price + plan movement.

No notable price movements in last 30 days. Pricing has been stable.

Vendor comparison

Flagship + cheapest tier across 3 vendors. Voyage AI highlighted.

Vendor     Flagship model   Input / output          Cheapest model              Subscription tiers   Recent changes (30d)
Voyage AI  -                -                       -                           0                    stable
Cohere     command-r-plus   $2.5/M in · $10/M out   command-r ($0.15 / $0.6)    0                    stable
OpenAI     gpt-5-4          $2.5/M in · $15/M out   gpt-5-nano ($0.05 / $0.4)   6                    2 changes
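
To compare the listed models on a concrete workload, the rows of the table can be turned into a small ranking helper. This is a sketch under stated assumptions: the prices are copied from the table, the second figure of each cheapest-model pair is read as the output price, Voyage AI is left out because the table lists no generation prices for it, and the 50M input / 10M output workload is hypothetical.

```python
from typing import NamedTuple


class Price(NamedTuple):
    vendor: str
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# Prices copied from the comparison table on this page.
TABLE = {
    "command-r-plus": Price("Cohere", 2.5, 10.0),
    "command-r": Price("Cohere", 0.15, 0.6),
    "gpt-5-4": Price("OpenAI", 2.5, 15.0),
    "gpt-5-nano": Price("OpenAI", 0.05, 0.4),
}


def blended_cost(price: Price, input_m: float, output_m: float) -> float:
    """USD cost for a workload given in millions of input and output tokens."""
    return input_m * price.input_per_m + output_m * price.output_per_m


def rank_models(input_m: float, output_m: float) -> list[tuple[str, float]]:
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {name: blended_cost(p, input_m, output_m) for name, p in TABLE.items()}
    return sorted(costs.items(), key=lambda item: item[1])


# Hypothetical monthly workload: 50M input tokens and 10M output tokens.
ranking = rank_models(input_m=50, output_m=10)
for name, cost in ranking:
    print(f"{name:<15} ${cost:>8.2f}")
```

Changing the input-to-output ratio shifts the gaps between models, since output prices differ far more across vendors than input prices do.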

Who wins for what

6 common scenarios — best vendor for each.

  • Predictable budgeting for high-frequency RAG
    Winner: voyage  · voyage
    voyage-3 embedding input is listed at $0.06/M tokens, far below the $2.5/M input tokens seen in gpt-5-4 or command-r-plus, so retrieval-side spend stays small and scales linearly with volume.
  • Lowest entry price for a single developer
    Winner: openai  · gpt-5-nano
    gpt-5-nano offers input tokens at $0.05/M, allowing for sub-dollar experimentation.
  • Predictable monthly spend for small teams
    Winner: voyage  · voyage
    voyage-3 lists a single input rate of $0.06/M with no output price, so monthly spend tracks the volume of text embedded.
  • High-volume output generation efficiency
    Winner: cohere  · command-r
    command-r offers output at $0.6/M, significantly lower than the $15/M for gpt-5-4.
  • Individual user chat with fixed monthly cost
    Winner: openai  · chatgpt-plus
    ChatGPT Plus provides a flat $20.00/mo entry point for individual users.
  • Enterprise search with massive context requirements
    Winner: voyage  · voyage
    Embedding large corpora costs $0.06/M (voyage-3) to $0.18/M (voyage-3-large), compared with the $2.5/M input cost of sending the same text to gpt-5-4.
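
The trade-off between a flat monthly fee and per-token billing in the scenarios above comes down to a break-even volume. The sketch below uses the $20.00/mo ChatGPT Plus price and the per-token input prices quoted on this page. It counts input tokens only, and comparing a chat subscription with API billing is only a rough yardstick, so treat the result as illustrative.

```python
def break_even_tokens_millions(flat_fee_usd: float, price_per_million_usd: float) -> float:
    """Millions of tokens per month at which per-token spend equals a flat fee."""
    if price_per_million_usd <= 0:
        raise ValueError("price_per_million_usd must be positive")
    return flat_fee_usd / price_per_million_usd


FLAT_FEE = 20.00  # ChatGPT Plus, USD per month, as quoted on this page

# Input prices in USD per 1M tokens, as quoted on this page.
for model, input_price in {"gpt-5-nano": 0.05, "gpt-5-4": 2.5}.items():
    volume = break_even_tokens_millions(FLAT_FEE, input_price)
    print(f"{model}: break-even at about {volume:.0f}M input tokens per month")
```

Below the break-even volume the per-token option is cheaper, and above it the flat fee wins, which is the reasoning behind the "lowest entry price" and "fixed monthly cost" scenarios.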

Integration & TCO context

The vendor bill is one line item. These archetypes show full TCO, including engineering, observability, and compliance.

  • Inference-only Chatbot (no retrieval): LLM is ~95% of total TCO
    Workflow: general-q-and-a  · Fit for: vibe coder, smb
    Solo developer with ChatGPT Plus + Claude Pro = $40/mo. Total monthly cost is ~$40 because there are no integration costs.
    Implementation: ~1 eng-week initial + ~2 hrs/month ongoing
  • RAG Knowledge Base / Internal Q&A: LLM is ~25% of total TCO
    Workflow: enterprise-search  · Fit for: smb, enterprise
    SMB support RAG: $400/mo LLM tokens, $1500/mo total TCO including eng + observability + eval.
    Implementation: ~4 eng-weeks initial + ~12 hrs/month ongoing
  • Code Agent Deployment (Cursor / Copilot at team scale): LLM is ~70% of total TCO
    Workflow: developer-productivity  · Fit for: developer, smb, enterprise
    50-dev team on Copilot Business = $950/mo seats + $200/mo overage + $1500/mo eng oversight = $2650 actual.
    Implementation: ~2 eng-weeks initial + ~6 hrs/month ongoing
  • Customer Support Agent (stateful, multi-channel): LLM is ~30% of total TCO
    Workflow: customer-service  · Fit for: smb, enterprise
    SMB with 10K tickets/mo: $800 agent runtime + $2500 eng + $400 platform = ~$3700/mo.
    Implementation: ~8 eng-weeks initial + ~24 hrs/month ongoing
  • Voice Agent (Call Center / Receptionist): LLM is ~35% of total TCO
    Workflow: voice-customer-service  · Fit for: smb, enterprise
    Restaurant chain with 5K calls/mo on Gemini Live: $25 voice + $300 LLM + $4000 eng/observability = ~$4300.
    Implementation: ~6 eng-weeks initial + ~16 hrs/month ongoing
  • Multi-tool Autonomous Agent (research / sales / ops): LLM is ~20% of total TCO
    Workflow: agentic-automation  · Fit for: enterprise
    Fortune 1000 with research agent: $2500 LLM + $1500 platform + $12K eng = ~$16K/mo for ONE agent in production.
    Implementation: ~12 eng-weeks initial + ~40 hrs/month ongoing
  • Self-hosted OSS LLM (vLLM / Ollama / TensorRT): LLM is ~50% of total TCO
    Workflow: data-sovereignty  · Fit for: enterprise, developer
    Healthcare OSS deployment: $4500/mo H100 rental + $12K eng = $16.5K/mo. Break-even vs Claude Sonnet around 100M tokens/month.
    Implementation: ~6 eng-weeks initial + ~60 hrs/month ongoing
  • Office Productivity Rollout (Copilot org-wide): LLM is ~80% of total TCO
    Workflow: workforce-enablement  · Fit for: smb, enterprise
    500-seat enterprise on M365 Copilot: $15K/mo seats + $700/mo overage + $700 governance = $16.4K/mo.
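
The archetype figures above are simple sums of monthly line items. A minimal sketch of that arithmetic, using only dollar amounts quoted in the list (the function and key names are illustrative, not part of any tool referenced on this page):

```python
def total_tco(components: dict[str, float]) -> float:
    """Sum monthly cost components (USD) into a total cost of ownership."""
    return sum(components.values())


def llm_share(llm_usd: float, total_usd: float) -> float:
    """Fraction of total monthly TCO that is spent on the LLM itself."""
    if total_usd <= 0:
        raise ValueError("total_usd must be positive")
    return llm_usd / total_usd


# RAG knowledge base example: $400/mo LLM tokens out of $1500/mo total TCO.
rag_share = llm_share(llm_usd=400, total_usd=1500)
print(f"RAG knowledge base LLM share: {rag_share:.1%}")  # 26.7%

# Line items quoted for two of the archetypes above.
copilot_team = {"seats": 950, "overage": 200, "eng_oversight": 1500}
support_agent = {"agent_runtime": 800, "engineering": 2500, "platform": 400}
print(f"Copilot team total:  ${total_tco(copilot_team):,}/mo")   # $2,650/mo
print(f"Support agent total: ${total_tco(support_agent):,}/mo")  # $3,700/mo
```

The computed RAG share of 26.7% is in line with the rounded "~25%" headline figure. The headline shares are approximations, so recompute them from your own line items rather than relying on the rounded values.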

Raw data appendix (pricing tables, all models, all sources)

Current API Pricing

Per 1M tokens, USD. Refreshed nightly from Voyage AI's pricing pages.

Last refreshed 2026-05-02 from vendor pages

Embedding Models

Model            Input ($/1M tok)   Context   Dimensions   Tags
voyage-3         $0.06              -         1024         rag-specialist, long-input
voyage-3-large   $0.18              -         2048         high-quality, rag-specialist, long-input
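
A monthly embedding bill can be estimated directly from the per-token prices in the table. The sketch below is plain arithmetic and calls no vendor API. The corpus size and average document length are hypothetical, and real token counts depend on the tokenizer.

```python
# Input prices from the embedding table on this page, in USD per 1M tokens.
EMBEDDING_PRICE_PER_M = {
    "voyage-3": 0.06,
    "voyage-3-large": 0.18,
}


def embedding_cost(model: str, tokens: int) -> float:
    """Return the USD cost of embedding `tokens` tokens with `model`."""
    return tokens / 1_000_000 * EMBEDDING_PRICE_PER_M[model]


# Hypothetical corpus (an assumption): 200,000 documents averaging 1,500 tokens.
corpus_tokens = 200_000 * 1_500  # 300M tokens

for model in EMBEDDING_PRICE_PER_M:
    print(f"{model}: ${embedding_cost(model, corpus_tokens):,.2f} to embed the corpus")
```

At these list prices, voyage-3-large costs three times as much as voyage-3 for the same corpus, so model choice is the main cost lever on the embedding side.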

Recent Price Movements

Changes detected by our crawler in the last 30 days

No price changes detected in the last 30 days. Pricing has been stable.

Pricing Mechanism Facts

Cache rates, batch discounts, SLAs — every claim cited verbatim from vendor docs.

* 1 additional fact not verified character-for-character

Rows marked * could not be verified character-for-character against the source. Display kept for context with explicit flag.

* Voyage AI voyage-3 is priced at $0.06 per 1M tokens; voyage-3-lite at $0.02/M (recorded value: $0.06 per 1M tokens)

Source: Voyage AI Pricing

How this page is sourced  v2
  • Hybrid pricing version: 2026.04.30-1
  • Bundle data version: 2026.04.30-1
  • Agent data version: 2026.04.30-1
  • Integration archetypes: 2026.04.30-1
  • Procurement intel: 2026.04.30-1
  • Pricing-data.js last updated: 2026-04-17
  • Generator: vendor-pricing-v2-batch-1.0
  • Last refreshed: 2026-05-02

Published list prices crawled weekly. Sales-led plans publish public ranges with sources cited. Inferred values marked with asterisks. Persona narratives synthesized from cross-vendor data — refreshed weekly via Gemini 3 Flash.