
Not affiliated with Meta. Pricing may change without notice. Always verify on the official Llama documentation before making purchasing decisions.

Released: v4 / 4 Maverick · Updated 2026-02-08

Llama API Pricing Tracker

Open-source foundation model for self-hosting

API Pricing

Input (per 1M tokens)
$0.00

Prompt & context tokens sent to the model

Output (per 1M tokens)
$0.00

Completion tokens generated by the model

Free Tier Available

Cost Estimator

Cost calculator coming soon

Estimate monthly API costs based on your token usage. Input your expected volume and get an instant breakdown.
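Until the calculator ships, the estimate it describes is simple arithmetic: tokens divided by one million, times the per-million price, summed for input and output. A minimal sketch (the prices here are placeholders for illustration, not real Llama rates — self-hosted Llama has no per-token API fee):

```python
def estimate_monthly_cost(input_tokens: int, output_tokens: int,
                          input_price_per_m: float,
                          output_price_per_m: float) -> float:
    """Estimated monthly cost in dollars for the given token volumes."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# Example: 50M input + 10M output tokens at hypothetical
# $0.20 / $0.60 per 1M tokens.
cost = estimate_monthly_cost(50_000_000, 10_000_000, 0.20, 0.60)
print(f"${cost:.2f}")  # → $16.00
```

Swap in a hosting partner's published rates to get a real figure; for self-hosted deployments both prices are effectively zero and the cost lives in hardware instead.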

Model Parameters

Context
10M tokens
Max Output
64K tokens
Multimodal
Yes
Languages
200+

Key Takeaways

1

Fully open-source with Meta's permissive community license.

2

Llama 4 (Scout/Maverick) introduces a mixture-of-experts (MoE) architecture with a massive 10M-token context.

3

Zero API cost for self-hosted deployments; cloud hosting available via partners.

Frequently Asked Questions

What is Llama 4?
Llama 4 is Meta's latest open-source model with mixture-of-experts architecture and up to 10M token context.
Is Llama free?
Yes. Llama models are free to download and use under Meta's community license.
Can I self-host Llama?
Yes. Llama weights are freely available for self-hosting on your own infrastructure.
What hardware do I need to run Llama 4?
Llama 4 Scout (17B active params) can run quantized on a single GPU with 32GB+ VRAM. Maverick (400B+ total params) requires multi-GPU setups or cloud instances with 4-8 A100/H100 GPUs.
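A rough rule of thumb behind those numbers: VRAM for the weights alone is parameter count times bytes per parameter at the chosen precision. This sketch ignores KV cache and activation memory, so real requirements are higher, and the precision labels are just common conventions:

```python
# Approximate bytes per parameter at common precisions.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weights_vram_gb(n_params: float, precision: str = "fp16") -> float:
    """Approximate GB of VRAM needed just to hold the weights."""
    return n_params * BYTES_PER_PARAM[precision] / 1e9

print(weights_vram_gb(17e9, "fp16"))  # 17B params at fp16 → 34.0 GB
print(weights_vram_gb(17e9, "int4"))  # int4-quantized → 8.5 GB
```

Note that for MoE models the *total* parameter count, not just the active 17B, must typically be resident in memory, which is why Maverick's 400B+ total pushes it onto multi-GPU setups.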
What is the Llama community license?
Meta's community license allows free use for research and commercial applications; organizations with over 700M monthly active users must obtain a separate license agreement from Meta.

Explore More

Stay Updated on Llama Pricing

Track API price changes across all major AI providers. Never overpay for tokens again.

About Llama

Llama is Meta's open-source model family. Llama 4 introduces mixture-of-experts architecture and a 10M token context window.