Which AI model fits your task, and what does it really cost?
Tell us what you want to do — we'll show you the right model, the honest token bill and why ChatGPT Plus, Claude Pro & co. are not directly comparable.
Find the right AI model in just a few clicks
Pick your use case and three requirements — we suggest the best models and compare the costs for you.
Pick the use case that fits best.
Tells us whether to prefer cheap or top-tier models.
Multi-select. For sensitive data usually EU (GDPR).
The “context window” — the AI's short-term memory.
Auto-matched to your selected use case – you can override anytime.
★Your personal recommendation
Out of 40+ models, freshly calculated — this fits you best:
Costs estimated for 500k input + 200k output tokens / month (≈ a few thousand requests). Need an exact calculation? Let's run the numbers together →
ByteDance Doubao 🇨🇳
Doubao Pro 1.5
Extremely cheap for mass content, social tagging and content moderation.
Google
Gemini 2.5 Flash-Lite
Tagging product data, bulk translations, simple classification — when every cent per call matters.
Meta (Llama)
Llama 4 Maverick
Open-weights flagship for on-prem and fine-tuning when you want to host or adapt the weights yourself.
OpenAI
GPT-5 mini
The workhorse for production apps: chat assistants, content drafts, RAG answers, mid-tier tool calling.
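The monthly estimate behind these cards can be reproduced with a few lines of Python. This is a minimal sketch, assuming the 500k input + 200k output tokens per month stated above and the list prices from the pricing tables on this page; the function name `monthly_cost` and the `PRICES` dictionary are illustrative, not part of any provider API.

```python
# List prices in USD per 1M tokens (input, output), copied from the tables on this page.
PRICES = {
    "Doubao Pro 1.5": (0.11, 0.28),
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
    "Llama 4 Maverick": (0.50, 1.50),
    "GPT-5 mini": (0.40, 1.60),
}


def monthly_cost(model, input_tokens=500_000, output_tokens=200_000):
    """Return the estimated monthly bill in USD at list price."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


for name in PRICES:
    print(f"{name}: ${monthly_cost(name):.2f} per month")
```

At this volume all four recommendations stay well below one dollar per month, so the differences only start to matter at much higher token counts.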
Don't want to decide yourself?
We build you multi-model routing that automatically picks the cheapest good-enough model per task — and integrate it into your ERP, CRM or PIM. Why “chat only” isn't enough →
Stay up to date on new AI models and prices
A short email whenever a new model is released or prices change. No spam, unsubscribe anytime.
Picking the model is just the start
Turning a model into real business value takes clean data, clear processes and the right distribution. That's where we come in.
Need clean data as input?
No model in the world will rescue bad input data. We bundle, clean and enrich your data so every token counts.
Go to DataNaicer
Want to turn data into content directly?
Product descriptions, SEO copy and variants at scale — generated from data instead of typed laboriously into ChatGPT.
Go to ContentNaicer
Need continuous reach?
News Stream takes over LinkedIn, blog and newsletter fully automatically — we pick the right AI model per format for you.
Go to News Stream
Publisher or media house?
Built for publishers: News Stream produces editorial content, SEO articles and social posts at scale — with transparent tiered pricing.
See publisher pricing
All AI models – pricing in detail
List prices per 1M tokens (USD) plus a recommended use for each model.
OpenAI
| Model | Input / 1M | Output / 1M | Context | What you'd use it for |
|---|---|---|---|---|
| GPT-5 | $5.00 | $15.00 | 400k | When you need a real second opinion: strategy papers, legal analysis, hard code refactors, agents that orchestrate multiple tools cleanly. Subscription equivalent: ChatGPT Plus $20 / Pro $200 per month |
| GPT-5 mini | $0.40 | $1.60 | 400k | The workhorse for production apps: chat assistants, content drafts, RAG answers, mid-tier tool calling. |
| GPT-4o | $2.50 | $10.00 | 128k | Voice assistants, image description and OCR — anywhere you need to mix text, image and audio. |
| o4-mini | $1.10 | $4.40 | 200k | Maths, logic, unit-test generation and coding agents when you need reasoning but don't want to pay GPT-5 prices. |
Anthropic
| Model | Input / 1M | Output / 1M | Context | What you'd use it for |
|---|---|---|---|---|
| Claude Fable 5 | $10.00 | $50.00 | 1M | Anthropic's new top tier (Mythos class). For the truly hard problems: multi-step research agents, autonomous coding sessions running for hours, deep legal analysis. Subscription equivalent: only via Claude Max ($200) or the API |
| Claude Opus 4.8 | $5.00 | $25.00 | 1M | Currently the first choice for long-running coding agents (Claude Code, Cursor) and whole codebases or document stacks in context. Writes at a top level with style and consistency. Subscription equivalent: Claude Pro $20 / Max $100–200 per month |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 1M | The workhorse for coding agents and long-context RAG — cheaper than Opus, almost as good for most tasks. |
| Claude Haiku 4.5 | $1.00 | $5.00 | 200k | Customer-support bots, ticket routing, sentiment and intent classification at high volume. |
Google
| Model | Input / 1M | Output / 1M | Context | What you'd use it for |
|---|---|---|---|---|
| Gemini 2.5 Pro | $1.25 | $10.00 | 2M | When you need to understand entire codebases, videos or hundreds of PDFs at once: the king of long context. Subscription equivalent: Gemini Advanced $21.99 / AI Ultra $250 per month |
| Gemini 2.5 Flash | $0.30 | $2.50 | 1M | RAG, translations, image recognition at scale — excellent price/performance with long context. |
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | 1M | Tagging product data, bulk translations, simple classification — when every cent per call matters. |
Meta (Llama)
| Model | Input / 1M | Output / 1M | Context | What you'd use it for |
|---|---|---|---|---|
| Llama 4 Maverick | $0.50 | $1.50 | 1M | Open-weights flagship for on-prem and fine-tuning when you want to host or adapt the weights yourself. |
| Llama 4 Scout | $0.15 | $0.60 | 10M | Load huge knowledge bases fully into context without having to build a RAG pipeline. |
Mistral
| Model | Input / 1M | Output / 1M | Context | What you'd use it for |
|---|---|---|---|---|
| Mistral Large 2 | $2.00 | $6.00 | 128k | First choice for GDPR requirements and EU data residency. Multilingual, solid function calling. Subscription equivalent: Le Chat Pro €14.99 per month |
| Mistral Small 3 | $0.20 | $0.60 | 32k | EU-compliant low-latency apps, edge deployments, fast internal tools. |
xAI
| Model | Input / 1M | Output / 1M | Context | What you'd use it for |
|---|---|---|---|---|
| Grok 4 | $3.00 | $15.00 | 256k | When real-time web and X data have to be part of the answer, e.g. trend research or social monitoring. Subscription equivalent: X Premium+ $40 / SuperGrok $30–300 per month |
| Grok 4 mini | $0.30 | $1.50 | 128k | Fast, cheap answers with up-to-date web knowledge. |
DeepSeek 🇨🇳
| Model | Input / 1M | Output / 1M | Context | What you'd use it for |
|---|---|---|---|---|
| DeepSeek-V3.2 | $0.27 | $1.10 | 128k | Very cheap general-purpose model with surprisingly strong coding performance — the price breaker for volume. |
| DeepSeek-R1 | $0.55 | $2.19 | 128k | Frontier-level reasoning at a fraction of the cost of GPT-5 or Claude Opus 4.8. |
Alibaba Qwen 🇨🇳
| Model | Input / 1M | Output / 1M | Context | What you'd use it for |
|---|---|---|---|---|
| Qwen3-Max | $1.20 | $6.00 | 1M | Top all-rounder from China, very strong in coding and multilingual tasks (CN, EN, DE). |
| Qwen3-Coder | $0.30 | $1.20 | 1M | Search huge repos, refactor, generate tests — cheap coding agents. |
Moonshot Kimi 🇨🇳
| Model | Input / 1M | Output / 1M | Context | What you'd use it for |
|---|---|---|---|---|
| Kimi K2 | $0.60 | $2.50 | 2M | Long documents, research agents, "read this book and answer my questions" scenarios. |
Zhipu GLM 🇨🇳
| Model | Input / 1M | Output / 1M | Context | What you'd use it for |
|---|---|---|---|---|
| GLM-4.6 | $0.60 | $2.20 | 200k | Solid all-rounder with good tool use and CN/EN strength; a good backup target for model routing. |
Baidu Ernie 🇨🇳
| Model | Input / 1M | Output / 1M | Context | What you'd use it for |
|---|---|---|---|---|
| Ernie 4.5 Turbo | $0.55 | $2.20 | 128k | Chinese-language customer communication, marketing content for the CN market. |
ByteDance Doubao 🇨🇳
| Model | Input / 1M | Output / 1M | Context | What you'd use it for |
|---|---|---|---|---|
| Doubao Pro 1.5 | $0.11 | $0.28 | 256k | Extremely cheap for mass content, social tagging and content moderation. |
MiniMax 🇨🇳
| Model | Input / 1M | Output / 1M | Context | What you'd use it for |
|---|---|---|---|---|
| MiniMax M2 | $0.30 | $1.20 | 1M | Multimodal apps (text + image) at a low price, popular in Asia. |
Note: Prices are list prices for the providers' APIs (as of June 2026) and change regularly. Discounts via batch, cache or volume contracts and regional hosting surcharges (Azure, Vertex AI, Bedrock) are not included.
Common questions about AI model costs
Five things nobody says out loud and that aren't on any price list.
Are output tokens really more expensive than input tokens?
Yes, usually by a factor of 3–5×. If you generate long answers you pay disproportionately. Tip: cap answer length to cut output cost, and keep system prompts short to cut input cost.
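The factor can be checked directly against the list prices on this page. The sketch below uses a handful of models from the tables; the helper names and the example request sizes (1,000 input tokens, 1,000 versus 300 output tokens) are assumptions chosen for illustration.

```python
# List prices in USD per 1M tokens (input, output), copied from the tables on this page.
PRICES = {
    "GPT-5": (5.00, 15.00),
    "GPT-5 mini": (0.40, 1.60),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "DeepSeek-V3.2": (0.27, 1.10),
}


def output_input_ratio(model):
    """How many times more an output token costs than an input token."""
    price_in, price_out = PRICES[model]
    return price_out / price_in


def request_cost(model, input_tokens, output_tokens):
    """List-price cost in USD of a single request."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


for name in PRICES:
    print(f"{name}: output costs {output_input_ratio(name):.1f}x input")

# Hypothetical request: same prompt, answer capped at 300 instead of 1,000 tokens.
uncapped = request_cost("GPT-5", 1_000, 1_000)
capped = request_cost("GPT-5", 1_000, 300)
print(f"GPT-5 per request: ${uncapped:.4f} uncapped vs ${capped:.4f} capped")
```

Most of these models land between 3× and 5×, while Gemini 2.5 Pro comes out at 8× on list price, so it is worth checking the ratio per model rather than relying on the rule of thumb.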
Is a large context window automatically more expensive?
Not by itself: you pay per token actually sent, so cost scales linearly with prompt length. A 1M context only pays off if you actually need it; RAG is often cheaper.
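A rough comparison shows why. The sketch below assumes a hypothetical workload (1,000 requests per month, an 800k-token knowledge base, 8k retrieved tokens per request with RAG, a 200-token question) and the Gemini 2.5 Flash input list price from the table above. It only counts input tokens and ignores caching discounts as well as the cost of running a retrieval pipeline.

```python
def input_cost(tokens_per_request, requests, price_per_million):
    """Input-side cost in USD: linear in the number of tokens sent."""
    return tokens_per_request * requests * price_per_million / 1_000_000


PRICE_IN = 0.30           # Gemini 2.5 Flash, USD per 1M input tokens (list price)
REQUESTS = 1_000          # assumed requests per month
QUESTION = 200            # assumed tokens per user question
KNOWLEDGE_BASE = 800_000  # assumed size of the full knowledge base in tokens
RETRIEVED = 8_000         # assumed tokens retrieved per request with RAG

full_context = input_cost(KNOWLEDGE_BASE + QUESTION, REQUESTS, PRICE_IN)
rag = input_cost(RETRIEVED + QUESTION, REQUESTS, PRICE_IN)

print(f"Full knowledge base in every prompt: ${full_context:.2f} per month")
print(f"RAG with retrieved passages only:    ${rag:.2f} per month")
```

Under these assumptions the full-context variant costs roughly a hundred times more on the input side, which is the gap a retrieval pipeline has to stay under to be worth building.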
What are reasoning or thinking tokens?
OpenAI's o-series, DeepSeek-R1 and Claude Opus “think” internally. These thinking tokens get billed too, which is why reasoning models often come out higher than expected.
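The effect on a single request can be sketched as follows. The example assumes thinking tokens are billed at the output rate and uses the o4-mini list prices from the table above; the token counts are hypothetical and real reasoning lengths vary widely per task.

```python
PRICE_IN, PRICE_OUT = 1.10, 4.40  # o4-mini, USD per 1M tokens (list price)


def request_cost(input_tokens, visible_output_tokens, thinking_tokens=0):
    """Cost in USD, assuming thinking tokens are billed like output tokens."""
    billed_output = visible_output_tokens + thinking_tokens
    return (input_tokens * PRICE_IN + billed_output * PRICE_OUT) / 1_000_000


# Hypothetical request: 2,000 input tokens, a 500-token visible answer.
expected = request_cost(2_000, 500)
# Same request when the model spends an assumed 4,000 tokens on internal reasoning.
actual = request_cost(2_000, 500, thinking_tokens=4_000)

print(f"Estimated from visible tokens: ${expected:.4f}")
print(f"Billed including thinking:     ${actual:.4f}")
print(f"Factor: {actual / expected:.1f}x")
```

An estimate based only on the visible answer can therefore be off by a multiple, so budget on the billed output tokens reported in the usage data rather than on answer length.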
What does multi-model routing actually give you?
Route simple tasks to Flash-Lite or DeepSeek and hard ones to Sonnet or GPT-5 — in our projects this typically saves 60–80 % of the cost.
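A minimal sketch of the idea, not the routing we deploy in projects: each task carries a difficulty label and is sent to the cheapest model considered good enough for that label. The `ROUTES` mapping, the labels and the 80/20 split of the monthly token volume are assumptions for illustration; in practice the classification step is the hard part.

```python
# List prices in USD per 1M tokens (input, output), copied from the tables on this page.
PRICES = {
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
    "GPT-5": (5.00, 15.00),
}

# Hypothetical routing table: difficulty label -> cheapest good-enough model.
ROUTES = {"simple": "Gemini 2.5 Flash-Lite", "hard": "GPT-5"}


def cost(model, input_tokens, output_tokens):
    """List-price cost in USD for the given token volume."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def routed_cost(workload):
    """Sum the cost of each workload slice on the model its label routes to."""
    return sum(cost(ROUTES[label], tin, tout) for label, tin, tout in workload)


# Assumed monthly workload: 500k input + 200k output tokens, 80 % of it simple.
workload = [("simple", 400_000, 160_000), ("hard", 100_000, 40_000)]

routed = routed_cost(workload)
baseline = sum(cost("GPT-5", tin, tout) for _, tin, tout in workload)
saving = 1 - routed / baseline

print(f"Everything on GPT-5: ${baseline:.2f}")
print(f"Routed:              ${routed:.2f}")
print(f"Saving:              {saving:.0%}")
```

With this assumed split the saving comes out at about 78 %; a workload with a larger share of hard tasks saves correspondingly less.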
Which AI models can I use under GDPR?
For GDPR-critical workloads, Mistral (EU hosting) or self-hosted Llama are usually the only clean choices. Chinese models have no place in workloads involving sales or personal data.
Not sure which model fits your use case?
We pick for you — data-driven, vendor-neutral and with an eye on total cost of ownership.