Synthetic

    Synthetic can use either subscription or usage-based pricing. Choose the plan that works best for you.

    Subscription Packs

    1 pack
    Price: $1/day ($30/mo)
    Rate limit: 135 messages per 5 hours
    vs Claude: 3× higher limits

    Run any agent for $1/day.
    ✓ Works with any agent framework
    ✓ 500 free tool calls per day
    ✓ 3× higher rate limits than Claude's $20/month plan
    ✓ 1 concurrent request per model (buy more packs to increase)
    ✓ UI and API access

    Usage-based

    Agents for enterprise.
    ✓ Works with any agent framework
    ✓ Pay for what you use
    ✓ UI and API access
    ✓ All models are pay-per-token

    All-inclusive pricing

    With your subscription, all always-on models are included for one flat monthly price. No per-token billing: just simple, predictable pricing.

    Switch to "Usage-based" to see token-based pricing for when you don't need a subscription.

    Small models

    These models count as only 0.5 requests toward your rate limit:

    Model | Context length | Status
    nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 | 256k tokens | ✓ Included
    zai-org/GLM-4.7-Flash | 192k tokens | ✓ Included

    Standard models

    These models count as 1 request toward your rate limit:

    Model | Context length | Status
    deepseek-ai/DeepSeek-R1-0528 | 128k tokens | ✓ Included
    deepseek-ai/DeepSeek-V3 | 128k tokens | ✓ Included
    deepseek-ai/DeepSeek-V3.2 | 159k tokens | ✓ Included
    meta-llama/Llama-3.3-70B-Instruct | 128k tokens | ✓ Included
    MiniMaxAI/MiniMax-M2.1 | 192k tokens | ✓ Included
    MiniMaxAI/MiniMax-M2.5 | 187k tokens | ✓ Included
    moonshotai/Kimi-K2-Instruct-0905 | 256k tokens | ✓ Included
    moonshotai/Kimi-K2-Thinking | 256k tokens | ✓ Included
    moonshotai/Kimi-K2.5 | 256k tokens | ✓ Included
    nvidia/Kimi-K2.5-NVFP4 | 256k tokens | ✓ Included
    openai/gpt-oss-120b | 128k tokens | ✓ Included
    Qwen/Qwen3-235B-A22B-Thinking-2507 | 256k tokens | ✓ Included
    Qwen/Qwen3-Coder-480B-A35B-Instruct | 256k tokens | ✓ Included
    Qwen/Qwen3.5-397B-A17B | 256k tokens | ✓ Included
    zai-org/GLM-4.7 | 198k tokens | ✓ Included
    zai-org/GLM-5 (Beta!) | 192k tokens | ✓ Included
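    The request weighting above reduces to simple arithmetic. A sketch of the per-pack budget (the 135-message budget and the 0.5/1.0 weights come from this page; the example message mix is illustrative):

    ```python
    # Per-5-hour rate-limit budget on a single subscription pack.
    BUDGET = 135  # requests per 5 hours

    # Request weights by model class (from the tables above).
    WEIGHT = {"small": 0.5, "standard": 1.0}

    def budget_used(counts: dict) -> float:
        """Total budget consumed by a mix of messages."""
        return sum(n * WEIGHT[cls] for cls, n in counts.items())

    # Small models count half, so one pack covers up to 270 small-model messages.
    assert BUDGET / WEIGHT["small"] == 270

    # An illustrative mix: 100 standard + 60 small messages uses 130 of the 135.
    assert budget_used({"standard": 100, "small": 60}) == 130.0
    ```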

    LoRA models

    What's a LoRA?

    Low-rank adapters — called "LoRAs" — are small, efficient fine-tunes that run on top of existing models. They can modify a model to be much more effective at specific tasks.

    All LoRAs for the following base models are included in your subscription:

    Model | Context length | Status
    meta-llama/Llama-3.2-1B-Instruct | 128k tokens | ✓ Included
    meta-llama/Llama-3.2-3B-Instruct | 128k tokens | ✓ Included
    meta-llama/Meta-Llama-3.1-8B-Instruct | 128k tokens | ✓ Included
    meta-llama/Meta-Llama-3.1-70B-Instruct | 128k tokens | ✓ Included
    LoRA sizes are measured in "ranks," starting at rank-8; we keep LoRAs up to rank-64 always-on and run them in FP8 precision. The rank is set during the fine-tuning process: if you create your own LoRA, you can choose exactly the rank you want through your training framework's standard configuration.
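    As a sketch, the rank is just one field in your training configuration. The field names below mirror Hugging Face PEFT's LoraConfig as an example; any training framework exposes an equivalent knob:

    ```python
    # Hypothetical LoRA fine-tune settings; field names follow Hugging Face
    # PEFT's LoraConfig, but adjust for whatever trainer you actually use.
    lora_settings = {
        "base_model": "meta-llama/Meta-Llama-3.1-8B-Instruct",  # a supported base
        "r": 64,               # the rank: rank-8 up to rank-64 is kept always-on
        "lora_alpha": 128,     # scaling factor, often set to 2x the rank
        "target_modules": ["q_proj", "v_proj"],  # which projections to adapt
    }

    # Stay within the always-on range described above.
    assert 8 <= lora_settings["r"] <= 64
    ```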

    Embedding models

    What are embeddings?

    Embedding models convert text into numerical vectors, called "embeddings," that place similar texts close together and dissimilar texts far apart. AI-enabled tools often use embeddings for tasks like codebase indexing and search.
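    "Closer" and "more distant" are usually measured with cosine similarity. A dependency-free sketch using toy 3-dimensional vectors (real embeddings from a model like nomic-ai/nomic-embed-text-v1.5 have 768 dimensions):

    ```python
    import math

    def cosine_similarity(a, b):
        """Cosine of the angle between two vectors: 1.0 means identical direction."""
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        return dot / (norm_a * norm_b)

    # Toy vectors standing in for embeddings of three texts.
    cat = [0.9, 0.1, 0.0]      # "a small cat"
    kitten = [0.8, 0.2, 0.0]   # "a young kitten"
    invoice = [0.0, 0.1, 0.9]  # "Q3 invoice totals"

    # Related texts score higher than unrelated ones.
    assert cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice)
    ```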

    The following embedding models are included in your subscription. There's no additional charge for using embeddings, and embeddings requests don't count against your subscription rate limit.

    Model | Context length | Status
    nomic-ai/nomic-embed-text-v1.5 | 8k tokens | ✓ Included
    Since embedding models aren't full LLMs and can't be used for chat (they only produce embeddings), they're accessible only via the API.
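    Since embeddings are API-only, a request looks like a standard OpenAI-compatible /v1/embeddings call. A standard-library sketch; the base URL and key below are placeholders, not Synthetic's actual values:

    ```python
    import json
    import urllib.request

    API_KEY = "YOUR_API_KEY"                 # placeholder: your real API key
    BASE_URL = "https://api.example.com/v1"  # placeholder: substitute the real base URL

    req = urllib.request.Request(
        f"{BASE_URL}/embeddings",
        data=json.dumps({
            "model": "hf:nomic-ai/nomic-embed-text-v1.5",  # note the "hf:" prefix
            "input": ["text to index"],
        }).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # urllib.request.urlopen(req) would return JSON with a 768-dimensional
    # vector under data[0]["embedding"] for this model.
    ```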

    Roo & KiloCode setup

    In the Codebase Indexing setup, select the "OpenAI Compatible" embedding provider and paste in your API key. Set "Model Dimension" to the embedding model's default dimension: in the case of nomic-ai/nomic-embed-text-v1.5, use 768.

    Make sure to copy the model string with an "hf:" prefix (short for "Hugging Face," the platform that hosts these open models); for example, hf:nomic-ai/nomic-embed-text-v1.5

    Your configuration should look roughly like so:

    [Screenshot: Codebase Indexing configuration in KiloCode and Roo Code]