Synthetic offers both subscription and usage-based pricing. Choose the plan that works best for you.
- Run any agent for $1/day.
- Agents for enterprise.
We keep popular open-source models always on: there's no boot time; they're ready the moment you call them. On usage-based plans, always-on models are billed per token.
Much as we read word by word, LLMs break text down into tokens, which can be whole words or pieces of words. On average, two words come out to about three tokens.
Always-on models are very affordable, usually costing only fractions of a cent per conversation.
Not sure how many tokens your prompt takes? Try our interactive token calculator →
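As a rough illustration of the rule of thumb above (two words per three tokens), here's a quick estimate in Python. Real tokenizers vary by model, so treat this as a ballpark, not an exact count:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the 2-words-to-3-tokens rule of thumb.

    Real tokenizers vary by model; use the interactive calculator
    for exact counts.
    """
    words = len(text.split())
    # 3 tokens for every 2 words, rounded up
    return -(-words * 3 // 2)

print(estimate_tokens("The quick brown fox jumps over the lazy dog"))  # 9 words -> 14
```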
Here's the list of our always-on models:
| Model | Context length | Input price (per million tokens) | Output price (per million tokens) |
|---|---|---|---|
| deepseek-ai/DeepSeek-R1-0528 | 128k tokens | $3.00/mtok | $8.00/mtok |
| deepseek-ai/DeepSeek-V3 | 128k tokens | $1.25/mtok | $1.25/mtok |
| deepseek-ai/DeepSeek-V3.2 | 159k tokens | $0.56/mtok | $1.68/mtok |
| meta-llama/Llama-3.3-70B-Instruct | 128k tokens | $0.88/mtok | $0.88/mtok |
| MiniMaxAI/MiniMax-M2.5 | 187k tokens | $0.40/mtok | $2.00/mtok |
| moonshotai/Kimi-K2.5 | 256k tokens | $0.45/mtok | $3.40/mtok |
| nvidia/Kimi-K2.5-NVFP4 | 256k tokens | $0.45/mtok | $3.40/mtok |
| nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 | 256k tokens | $0.30/mtok | $1.00/mtok |
| openai/gpt-oss-120b | 128k tokens | $0.10/mtok | $0.10/mtok |
| Qwen/Qwen3-235B-A22B-Thinking-2507 | 256k tokens | $0.65/mtok | $3.00/mtok |
| Qwen/Qwen3-Coder-480B-A35B-Instruct | 256k tokens | $2.00/mtok | $2.00/mtok |
| Qwen/Qwen3.5-397B-A17B | 256k tokens | $0.60/mtok | $3.60/mtok |
| zai-org/GLM-4.7 | 198k tokens | $0.45/mtok | $2.19/mtok |
| zai-org/GLM-4.7-Flash | 192k tokens | $0.10/mtok | $0.50/mtok |
| zai-org/GLM-5 | 192k tokens | $1.00/mtok | $3.00/mtok |
| zai-org/GLM-5.1 (Beta!) | 192k tokens | $1.00/mtok | $3.00/mtok |
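To see how per-token billing adds up, here's a small sketch. The prices come from the table above; the token counts are made-up example numbers:

```python
def usage_cost(input_tokens: int, output_tokens: int,
               input_price_per_mtok: float, output_price_per_mtok: float) -> float:
    """Cost in dollars for one request under per-token billing."""
    return (input_tokens * input_price_per_mtok
            + output_tokens * output_price_per_mtok) / 1_000_000

# deepseek-ai/DeepSeek-V3: $1.25/mtok for both input and output (see table above)
cost = usage_cost(input_tokens=2_000, output_tokens=500,
                  input_price_per_mtok=1.25, output_price_per_mtok=1.25)
print(f"${cost:.6f}")  # $0.003125 -- a fraction of a cent
```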
Low-rank adapters ("LoRAs") are small, efficient fine-tunes that run on top of existing models. They can make a model much more effective at specific tasks.
LoRAs of the following models are always on and, on usage-based plans, are billed per token.
| Model | Context length | Input price (per million tokens) | Output price (per million tokens) |
|---|---|---|---|
| meta-llama/Llama-3.2-1B-Instruct | 128k tokens | $0.06/mtok | $0.06/mtok |
| meta-llama/Llama-3.2-3B-Instruct | 128k tokens | $0.06/mtok | $0.06/mtok |
| meta-llama/Meta-Llama-3.1-8B-Instruct | 128k tokens | $0.20/mtok | $0.20/mtok |
| meta-llama/Meta-Llama-3.1-70B-Instruct | 128k tokens | $0.90/mtok | $0.90/mtok |
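Why LoRAs are so small: instead of updating a full d x d weight matrix, a LoRA learns two low-rank factors, B (d x r) and A (r x d). A quick parameter count makes the savings concrete (the dimensions here are illustrative, not the actual layer sizes of the models above):

```python
def lora_params(d: int, r: int) -> tuple[int, int]:
    """Parameters for a full d x d weight update vs. a rank-r LoRA (B: d x r, A: r x d)."""
    full = d * d        # full fine-tune of one square layer
    lora = 2 * d * r    # low-rank factors B and A
    return full, lora

# Example: a 4096-wide layer with a rank-16 adapter
full, lora = lora_params(d=4096, r=16)
print(full, lora, f"{lora / full:.2%}")  # the adapter is under 1% of the full update
```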
Embedding models convert text into numerical vectors, placing similar text close together and dissimilar text far apart: these vectors are called "embeddings". Embedding models are often used by AI-enabled tools for tasks like codebase indexing and search.
Embedding models are billed per token on usage-based plans.
| Model | Context length | Input price (per million tokens) |
|---|---|---|
| nomic-ai/nomic-embed-text-v1.5 | 8k tokens | $0.01/mtok |
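A tiny sketch of how embeddings power search: compare vectors by cosine similarity, where related text scores near 1 and unrelated text near 0. The 3-dimensional vectors below are toy values; real embedding models such as nomic-embed-text-v1.5 produce vectors with hundreds of dimensions:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Higher score = more similar text; embeddings of related text point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-d "embeddings"; a real model returns much longer vectors.
query    = [0.9, 0.1, 0.0]
doc_hit  = [0.8, 0.2, 0.1]
doc_miss = [0.0, 0.1, 0.9]

print(cosine_similarity(query, doc_hit))   # close to 1.0: similar text
print(cosine_similarity(query, doc_miss))  # close to 0.0: unrelated text
```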