API Pricer

Compare AI API Costs Across Every Major Provider

Enter your estimated token usage and instantly see side-by-side pricing for Claude, GPT-4o, Gemini, Llama, Mistral, DeepSeek, and more. No signup required — just real numbers from official provider pricing pages.

Models


Pricing varies by prompt length. Rates shown are for prompts up to 200K tokens.

Pricing varies by prompt length and thinking mode. Rates shown are for standard (non-thinking) prompts up to 200K tokens.

Llama 4 is open-source. Pricing reflects Together.ai inference. Rates vary by provider.

DeepSeek offers reduced pricing for cache hits. Rates shown are standard (cache miss) pricing.

Cost Comparison

Model               Provider
Claude Sonnet 4.6   Anthropic   $0.00   $0.00   $0.00   $0.00
GPT-4o              OpenAI      $0.00   $0.00   $0.00   $0.00
GPT-4o mini         OpenAI      $0.00   $0.00   $0.00   $0.00
Gemini 2.5 Flash    Google      $0.00   $0.00   $0.00   $0.00
Mistral Large       Mistral     $0.00   $0.00   $0.00   $0.00
Cost of Switching

Compare the monthly and annual cost difference between two models at your current usage level.

Select both models above to see the cost difference.
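The switching comparison is simple arithmetic: the difference in monthly cost, projected over a year. A minimal sketch (the dollar figures below are placeholders, not real model rates):

```python
def switching_delta(monthly_cost_current: float, monthly_cost_new: float):
    """Positive values mean the new model saves money at your usage level."""
    monthly = monthly_cost_current - monthly_cost_new
    return monthly, monthly * 12  # (monthly savings, annual savings)

# e.g. moving a $100/month workload to a model costing $60/month:
monthly_saved, annual_saved = switching_delta(100.0, 60.0)  # 40.0, 480.0
```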

Frequently Asked Questions

How much does it cost to use AI APIs?

AI APIs are priced on a per-token basis, billed separately for input tokens (your prompts) and output tokens (the model's response). Costs vary widely by provider and model tier. Budget models like GPT-4o mini or Gemini 2.5 Flash start as low as $0.15 per million input tokens, while premium frontier models like Claude Opus can reach $15 per million input tokens. Your actual monthly cost depends entirely on how many tokens you send and receive — use this calculator to estimate from your own usage numbers.
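The per-token billing described above can be sketched in a few lines. The helper below is illustrative; the $0.60/M output rate in the example is an assumption not stated on this page:

```python
def api_cost(input_tokens: int, output_tokens: int,
             input_rate_per_m: float, output_rate_per_m: float) -> float:
    """Cost in USD, with rates quoted per million tokens."""
    return (input_tokens / 1_000_000 * input_rate_per_m
            + output_tokens / 1_000_000 * output_rate_per_m)

# Example: 10M input + 2M output tokens in a month on a budget model
# priced at $0.15/M input and (assumed) $0.60/M output:
monthly = api_cost(10_000_000, 2_000_000, 0.15, 0.60)  # $1.50 + $1.20 = $2.70
```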

Which AI API is the cheapest?

It depends on your use case. For high-volume, cost-sensitive workloads, GPT-4o mini ($0.15/M input), Gemini 2.5 Flash ($0.15/M input), and Llama 4 via Together.ai ($0.27/M input) are among the most affordable options. For tasks requiring frontier intelligence, you'll pay more — Claude Sonnet 4.6 ($3/M input) and GPT-4o ($2.50/M input) offer a balance of capability and cost. Open-source models like Llama 4 can also be self-hosted to eliminate per-token fees entirely if you have the infrastructure.
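Using the input rates quoted in this answer, a cheapest-model ranking is a one-line sort. Treat the table as a snapshot of these quoted figures, not live pricing data:

```python
# USD per million input tokens, as quoted in the answer above.
INPUT_RATES = {
    "GPT-4o mini": 0.15,
    "Gemini 2.5 Flash": 0.15,
    "Llama 4 (Together.ai)": 0.27,
    "GPT-4o": 2.50,
    "Claude Sonnet 4.6": 3.00,
}

def rank_by_input_cost(tokens: int) -> list:
    """Models ordered from cheapest to priciest for a given input volume."""
    return sorted(((m, tokens / 1_000_000 * r) for m, r in INPUT_RATES.items()),
                  key=lambda pair: pair[1])
```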

What is the difference between input and output tokens?

Input tokens are the tokens in the message you send to the model: your system prompt, conversation history, and user message. Output tokens are the tokens in the model's response. Providers charge different rates for each, and output tokens are consistently more expensive, often three to five times the input rate. A typical API call might use 500 input tokens and generate 300 output tokens. For long-context or retrieval-augmented workloads where you embed large documents in every prompt, input costs can dominate your bill; many providers offer prompt caching that steeply discounts repeated context.
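A worked version of the 500-in / 300-out example, under assumed rates of $3/M input and $15/M output (a 5x spread typical of frontier tiers; both numbers are assumptions for illustration):

```python
IN_RATE, OUT_RATE = 3.00, 15.00  # assumed USD per million tokens

input_cost = 500 / 1_000_000 * IN_RATE    # $0.0015 for 500 input tokens
output_cost = 300 / 1_000_000 * OUT_RATE  # $0.0045 for 300 output tokens
call_cost = input_cost + output_cost      # $0.0060 per call

# Despite fewer output tokens, output is 75% of this call's cost:
output_share = output_cost / call_cost
```

Note how the higher output rate dominates even when the call generates fewer tokens than it consumes.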

Do AI providers offer batch pricing discounts?

Yes. Anthropic and OpenAI both offer a Batch API that processes requests asynchronously (with a turnaround time of up to 24 hours) at roughly 50% off standard pricing. This is ideal for workloads that are not latency-sensitive: data labeling, offline document processing, bulk classification, and similar tasks. Google's Gemini models do not currently have a formal batch discount tier in the pricing data tracked by API Pricer. When batch pricing is available for a model, you can toggle between standard and batch rates directly in the calculator above.
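The standard-versus-batch toggle described above amounts to applying a discount multiplier. A sketch under assumed GPT-4o-style rates ($2.50/M in, $10/M out; both are assumptions here):

```python
BATCH_DISCOUNT = 0.5  # "roughly 50% off", per the Batch API terms above

def monthly_cost(in_tok: int, out_tok: int,
                 in_rate: float, out_rate: float, batch: bool = False) -> float:
    """USD cost; rates are per million tokens. Batch halves the standard rate."""
    base = in_tok / 1_000_000 * in_rate + out_tok / 1_000_000 * out_rate
    return base * (1 - BATCH_DISCOUNT) if batch else base

standard = monthly_cost(1_000_000, 1_000_000, 2.50, 10.00)             # $12.50
batched = monthly_cost(1_000_000, 1_000_000, 2.50, 10.00, batch=True)  # $6.25
```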

How often do AI API prices change?

AI API pricing has trended downward significantly since 2023 as providers compete for market share and infrastructure costs decline. Price drops can happen at any time — OpenAI, Anthropic, and Google have each cut prices multiple times in the past year. The pricing data on this site is updated manually on a monthly schedule by reviewing each provider's official pricing page.

What is a token in AI API pricing?

A token is the basic unit of text that a language model reads and produces. Tokens are not the same as words — a single word may be one token or several, and punctuation marks and spaces are often their own tokens. As a rough rule of thumb, one token equals about four characters of English text, and 100 tokens equals roughly 75 words. A short paragraph like this one is approximately 80–100 tokens. Most providers publish tokenizer tools you can use to get an exact count for your specific content before estimating costs.
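The four-characters-per-token rule of thumb above makes for a quick back-of-envelope estimator. This is only a heuristic sketch; for exact counts use the provider's published tokenizer (e.g. OpenAI's tiktoken):

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate using the ~4 characters-per-token rule of thumb.
    Real tokenizers give exact, model-specific counts."""
    return max(1, round(len(text) / 4))

# 400 characters of English is roughly 100 tokens (about 75 words)
# under this heuristic.
```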

About This Data

Pricing data is sourced directly from each provider's official pricing page: Anthropic, OpenAI, Google, Meta (via Together.ai), Mistral, and DeepSeek. No third-party aggregators are used.

Data is updated manually on a monthly schedule. Every update is verified directly against each provider's official documentation before publishing.