Methodology: Agent Loop Cost

Cost of agentic workloads with multiple LLM iterations per task.

Citations last refreshed: 2026-05-02 Pricing snapshot: 2026-06-05 ← Back to calculator

How we keep this honest

Every number on aicost.ai is verified by 11 independent audit layers that run every day at 03:30 EDT — covering structural integrity, math correctness, source-side freshness, and cross-source agreement. We publish today's snapshot date and per-vendor verification timestamps below so you can verify any number yourself.

calculators

vendors verified

cited claims

581

days of history

audit layers

See the 8 audit layers

Layer 1: Architecture (12 structural invariants)
Layer 2: Smoke test (every calc page renders)
Layer 3: Golden values (math correctness vs reference)
Layer 4: Source resilience (independent reference data sources reachable)
Layer 5: Math gotchas (static code analysis)
Layer 6: Hybrid reconciliation (cross-source agreement)
Layer 7: Drift detection (day-over-day price changes)
Layer 8: Vendor cache (per-vendor freshness wiring)
Layer 9: Cross-vendor reachability (live vendor pricing page probes)
Layer 10: Rendered HTML drift (calc page DOM contracts, 45 pages daily)
Layer 11: Pricing freshness (cron heartbeat + per-vendor age tracking)

All 8 layers must pass before any pricing data is considered fresh. The infrastructure runs daily and publishes results to an internal dashboard. If any layer flags an issue, it is treated as stop-the-line work.

How this calculator sources its numbers

Every value falls into one of five categories. Numbers without an asterisk are vendor-published, directly observable, or computed by arithmetic on published data. Numbers marked with * are typical best-target values — we state the working range and invite you to override with your own number.

Vendor-published Directly from the vendor's pricing or docs page. No asterisk.

Published benchmark Independent benchmark (e.g. Chatbot Arena, ANN-Benchmarks, vLLM). Cited with date.

Research paper Peer-reviewed or widely-accepted research (e.g. LLMLingua, RAGAS).

Typical target * No single canonical source exists. We state the working range and explain why.

Computed Arithmetic on vendor-published values (e.g. batch discount × standard rate).

Any individual claim may also be tagged with ^* if its source has not yet been re-verified against the current vendor page — treat such claims as approximate until the next verification cycle resolves them.

Vendor verification freshness

Each vendor's pricing page is independently re-checked on a cadence ranging from daily to weekly. Below: when each relevant vendor was last verified by our automated pipeline.

anthropic

2026-06-05

verified today

by auto-pipeline

chatgpt-plus

2026-06-05

verified today

by auto-pipeline

chroma

2026-06-05

verified today

by auto-crawler

claude-pro

2026-06-05

verified today

by auto-pipeline

cohere

2026-06-05

verified today

by auto-pipeline

* Typical best-target values (defaults you may want to override)

These values have no single canonical source. We've stated the working range and its basis. Your actual numbers will likely differ — every starred field in the calculator has an override input.

* Production agentic deployments add a 15ÔÇô30% safety buffer for unexpected loop iterations and retries.

Default used

1.200000 multiplier

Typical range

1.100000–1.500000 multiplier

Source

aicost.ai editorial ÔÇö operational guidance · Sat May 02 2026 00:00:00 GMT-0400 (Eastern Daylight Time)

Default 1.2 (20% buffer) reflects industry practice for autonomous agent budgets. Bare-minimum 1.10 is optimistic; high-variance workloads (research, exploratory analysis) trend toward 1.50.

Published benchmarks and research

Independent, reproducible, and cited.

Default 22 working days per month for monthly cost projections.

Value

22.000000 days

Source

US Bureau of Labor Statistics · Mon Jan 01 2024 00:00:00 GMT-0500 (Eastern Standard Time)

Type

Benchmark

US standard work-month is 22 days (52 weeks ├ù 5 days ├À 12 months Ôëê 21.7, rounded). Calculator default; user can override.

Vendor pricing pages referenced

All vendor-published prices used by this calculator are sourced from the pages below. See the verification panel above for when each was last re-checked.

Anthropic https://www.anthropic.com/pricing · verified 2026-04-17
Anthropic Docs https://platform.claude.com/docs/en/about-claude/pricing · verified 2026-04-17
OpenAI https://openai.com/api/pricing/ · verified 2026-04-17
Google AI https://ai.google.dev/gemini-api/docs/pricing · verified 2026-04-17
Google Vertex https://cloud.google.com/vertex-ai/generative-ai/pricing · verified 2026-04-17
DeepSeek https://api-docs.deepseek.com/quick_start/pricing · verified 2026-04-17
xAI https://x.ai/api · verified 2026-04-17
Mistral https://mistral.ai/pricing · verified 2026-04-17
Cohere https://cohere.com/pricing · verified 2026-04-17
Voyage AI https://docs.voyageai.com/docs/pricing · verified 2026-04-17

See an error or stale value?

We treat methodology as a living document. If a price is wrong, a benchmark is outdated, or you have a better citation, let us know and we will verify within 48 hours.

Email [email protected] →

📖 Data sources & methodology 161 text models · 9 embeddings · 24 vision · 41 audio · 8 vector DBs across 10 vendor pages · last verified 2026-06-05

Methodology

All prices are USD per 1 million tokens, current as of 2026-06-05.
Vendor-published values have no mark. Inferred/extrapolated values are marked with * and listed below.
Batch API discounts are 50% off standard rates across providers that offer Batch mode.
Prompt caching discounts vary by provider (typically 80-90% off cached input tokens).
Regional data-residency surcharges (Anthropic 1.1x, OpenAI 1.1x, Google regional tiers) are NOT included in base rates.
Long-context pricing tiers apply when input exceeds model threshold.
Embedding prices are input-only (no output tokens generated).

Primary sources

Last-verified date is the most recent successful daily snapshot (aicost_pricing_snapshots) or, when no snapshot exists yet, the latest successful crawler run (aicost_crawler_runs). 10 of 10 vendors are currently verified. Aggregator services (TokenCost, AI Pricing Guru, etc.) are not listed.

Anthropic

2026-06-05

https://www.anthropic.com/pricing

Daily snapshot since Sep 2023 · 578 days captured

Anthropic Docs

2026-06-05

https://platform.claude.com/docs/en/about-claude/pricing

Daily snapshot since Sep 2023 · 578 days captured

OpenAI

2026-06-05

https://openai.com/api/pricing/

Daily snapshot since Sep 2023 · 579 days captured

Google AI

2026-06-05

https://ai.google.dev/gemini-api/docs/pricing

Daily snapshot since Dec 2023 · 554 days captured

Google Vertex

2026-06-05

https://cloud.google.com/vertex-ai/generative-ai/pricing

Daily snapshot since Dec 2023 · 554 days captured

DeepSeek

2026-06-05

https://api-docs.deepseek.com/quick_start/pricing

Daily snapshot since May 2024 · 493 days captured

xAI

2026-06-05

https://x.ai/api

Daily snapshot since Nov 2024 · 411 days captured

Mistral

2026-06-05

https://mistral.ai/pricing

Daily snapshot since Dec 2023 · 552 days captured

Cohere

2026-06-05

https://cohere.com/pricing

Daily snapshot since Sep 2023 · 578 days captured

Voyage AI

2026-06-05

https://docs.voyageai.com/docs/pricing

Inferred values (marked with * in calculator tables)

Derived from industry conventions, not directly published by the vendor. Typical conventions: cached input = 10% of base (90% off), Batch API = 50% of base (50% off).

Vendor / Model	Field	Why it’s inferred
Anthropic — Claude Sonnet 4.6	`cachedInput`	Derived at 10% of input rate — Anthropic publishes 90% cache-hit discount on this tier.
Anthropic — Claude Sonnet 4.5	`cachedInput`	Derived at 10% of input rate; same 90% cache-hit convention as Sonnet 4.6.
Anthropic — Claude Sonnet 4.5	`batchInput`	Derived at 50% of standard input — Anthropic documents uniform 50% Batch discount.
Anthropic — Claude Sonnet 4.5	`batchOutput`	Derived at 50% of standard output — Anthropic documents uniform 50% Batch discount.
Anthropic — Claude Haiku 4.5	`cachedInput`	Derived at 10% of input rate — Anthropic 90% cache-hit discount convention.
OpenAI — GPT-5.4 Mini	`cachedInput`	Derived at 10% of input — OpenAI documents automatic 90% discount on cache hits across GPT-5.x tier.
OpenAI — GPT-5.4 Nano	`cachedInput`	Derived at 10% of input — OpenAI 90% cache-hit convention.
OpenAI — GPT-5.4 Nano	`batchInput`	Derived at 50% of input — OpenAI Batch API uniform 50% discount.
OpenAI — GPT-5.4 Nano	`batchOutput`	Derived at 50% of output — OpenAI Batch API uniform 50% discount.
OpenAI — GPT-5.4 Pro	`cachedInput`	Derived at 10% of input — OpenAI 90% cache-hit convention.
OpenAI — GPT-5.4 Pro	`batchInput`	Derived at 50% of input — OpenAI Batch API uniform 50% discount.
OpenAI — GPT-5.4 Pro	`batchOutput`	Derived at 50% of output — OpenAI Batch API uniform 50% discount.
OpenAI — GPT-5.2	`cachedInput`	Derived at 10% of input; no residency uplift.
OpenAI — GPT-5.2	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5.2	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5	`cachedInput`	Derived at 10% of input.
OpenAI — GPT-5	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5.5 Pro	`cachedInput`	Derived at 10% of input — OpenAI does not publish a cached rate for *-pro models; using the family convention.
OpenAI — GPT-5.5 Pro	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5.5 Pro	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5.2 Pro	`cachedInput`	Derived at 10% of input — pro-tier convention.
OpenAI — GPT-5.2 Pro	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5.2 Pro	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5.1	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5.1	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5 Pro	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5 Pro	`batchOutput`	Derived at 50% of output.
OpenAI — GPT-5 Nano	`cachedInput`	Derived at 10% of input.
OpenAI — GPT-5 Nano	`batchInput`	Derived at 50% of input.
OpenAI — GPT-5 Nano	`batchOutput`	Derived at 50% of output.
Google — Gemini 3 Flash	`cachedInput`	Derived at 10% of input — Google caching discount convention ~90%.
Google — Gemini 3.1 Flash-Lite	`cachedInput`	Derived at 10% of input — Google caching convention.
Google — Gemini 3.1 Flash-Lite	`batchInput`	Derived at 50% of input — Google Batch API uniform 50% discount.
Google — Gemini 3.1 Flash-Lite	`batchOutput`	Derived at 50% of output — Google Batch API uniform 50% discount.
Google — Gemini 2.5 Pro	`cachedInput`	Derived at 10% of input.
Google — Gemini 2.5 Flash	`cachedInput`	Derived at 10% of input.
Google — Gemini 2.5 Flash-Lite	`cachedInput`	Derived at 10% of input — Google caching convention.
Google — Gemini 2.5 Flash-Lite	`batchInput`	Derived at 50% of input — Google Batch API uniform 50% discount.
Google — Gemini 2.5 Flash-Lite	`batchOutput`	Derived at 50% of output — Google Batch API uniform 50% discount.
Google — Gemini 2.0 Flash	`cachedInput`	Derived at 25% of input per Google 2.0 family caching rates.
Google — Gemini 2.0 Flash	`batchInput`	Derived at 50% of input — Google Batch API uniform 50% discount.
Google — Gemini 2.0 Flash	`batchOutput`	Derived at 50% of output — Google Batch API uniform 50% discount.
Google — Gemini 2.0 Flash-Lite	`cachedInput`	Derived at 10% of input — Google caching convention.
Google — Gemini 2.0 Flash-Lite	`batchInput`	Derived at 50% of input — Google Batch API uniform 50% discount.
Google — Gemini 2.0 Flash-Lite	`batchOutput`	Derived at 50% of output — Google Batch API uniform 50% discount.
xAI — Grok 4 (legacy)	`cachedInput`	Extrapolated at 25% of base.

Pricing is cross-verified against the LiteLLM community registry when available. Daily snapshots are kept in aicost_pricing_snapshots; every change is logged to aicost_price_changelog with old & new values for full audit trail. Read the full methodology →