Synthetic can use either subscription or usage-based pricing. Choose the plan that works best for you.
All always-on models are included in your subscription at one flat monthly price. There's no per-token billing and no additional charge for using any of these models: just simple, predictable pricing.
Switch to "Pay per Use" to see token-based pricing if you'd rather not subscribe.
Here's the list of all always-on models included in your subscription:
Model | Context length | Status |
---|---|---|
deepseek-ai/DeepSeek-R1 | 128k tokens | ✓ Included |
deepseek-ai/DeepSeek-R1-0528 | 128k tokens | ✓ Included |
deepseek-ai/DeepSeek-V3 | 128k tokens | ✓ Included |
deepseek-ai/DeepSeek-V3-0324 | 128k tokens | ✓ Included |
deepseek-ai/DeepSeek-V3.1 | 128k tokens | ✓ Included |
meta-llama/Llama-3.1-405B-Instruct | 128k tokens | ✓ Included |
meta-llama/Llama-3.1-70B-Instruct | 128k tokens | ✓ Included |
meta-llama/Llama-3.1-8B-Instruct | 128k tokens | ✓ Included |
meta-llama/Llama-3.3-70B-Instruct | 128k tokens | ✓ Included |
meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | 524k tokens | ✓ Included |
meta-llama/Llama-4-Scout-17B-16E-Instruct | 328k tokens | ✓ Included |
moonshotai/Kimi-K2-Instruct | 128k tokens | ✓ Included |
moonshotai/Kimi-K2-Instruct-0905 | 256k tokens | ✓ Included |
openai/gpt-oss-120b | 128k tokens | ✓ Included |
Qwen/Qwen2.5-Coder-32B-Instruct | 32k tokens | ✓ Included |
Qwen/Qwen3-235B-A22B-Instruct-2507 | 256k tokens | ✓ Included |
Qwen/Qwen3-235B-A22B-Thinking-2507 | 256k tokens | ✓ Included |
Qwen/Qwen3-Coder-480B-A35B-Instruct | 256k tokens | ✓ Included |
zai-org/GLM-4.5 | 128k tokens | ✓ Included |
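Each model in the table above is addressed by its full identifier (e.g. `deepseek-ai/DeepSeek-V3`). As a minimal sketch, assuming an OpenAI-compatible chat-completions request shape (the endpoint URL and auth details come from the API docs and are not shown here), a request body would look like this:

```python
import json

# Any always-on model from the table above, referenced by its full identifier.
MODEL = "deepseek-ai/DeepSeek-V3"

def build_chat_request(model: str, user_message: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completion request body (illustrative sketch)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request(MODEL, "Summarize the benefits of flat-rate pricing.")
print(json.dumps(payload, indent=2))
```

With a subscription, a request like this to any always-on model incurs no extra charge regardless of how many tokens it consumes.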
Low-rank adapters — called "LoRAs" — are small, efficient fine-tunes that run on top of existing models. They can modify a model to be much more effective at specific tasks.
All LoRAs for the following base models are included in your subscription:
Base model | Status |
---|---|
meta-llama/Llama-3.2-1B-Instruct | ✓ Included |
meta-llama/Llama-3.2-3B-Instruct | ✓ Included |
meta-llama/Meta-Llama-3.1-8B-Instruct | ✓ Included |
meta-llama/Meta-Llama-3.1-70B-Instruct | ✓ Included |
We support launching all other LLMs on-demand on cloud GPUs. There's no configuration necessary: just enter the Hugging Face link for any model, and we'll automatically run it for you in our friendly chat UI or API.
On-demand models are not included in your subscription. Even as a subscriber, they're billed separately, per minute, at the standard GPU rates below.
We'll automatically detect the number of GPUs you need to run the model. Here's our current GPU pricing:
GPU memory | Price |
---|---|
80GB | 3 cents/min per GPU |
48GB | 1.5 cents/min per GPU |
24GB | 1.2 cents/min per GPU |
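The table above translates into a simple cost formula: rate (cents/min) × number of GPUs × minutes. A quick sketch, with the rates copied from the table (the GPU count is whatever we auto-detect for your model):

```python
# Rates from the GPU pricing table above, in cents per minute per GPU.
CENTS_PER_MIN = {"80GB": 3.0, "48GB": 1.5, "24GB": 1.2}

def on_demand_cost_usd(gpu: str, num_gpus: int, minutes: float) -> float:
    """Estimate the on-demand cost in dollars for a model run."""
    cents = CENTS_PER_MIN[gpu] * num_gpus * minutes
    return round(cents / 100, 2)

# e.g. a model that needs two 80GB GPUs, running for an hour:
print(on_demand_cost_usd("80GB", num_gpus=2, minutes=60))  # → 3.6 (i.e. $3.60)
```

So an hour on a pair of 80GB GPUs costs $3.60, and smaller models that fit on a single 24GB GPU cost well under a dollar per hour.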