
Llama 4 Scout

Meta's efficient Mixture of Experts (MoE) model, optimized for speed and cost. Its 16 experts deliver competitive quality at half the cost of Maverick, with the same 128K-token context length.


Model Specifications

Parameters: 17B x 16 Experts

Architecture: Mixture of Experts (MoE)

Context Length: 128K tokens

Active Parameters: ~17B per token

Developer: Meta AI

License: Llama License

Why Choose Llama 4 Scout

Efficient Architecture

The model has 16 experts, but only about 17B parameters are active per token, so it delivers strong quality with lower resource needs.

128K Context

The same 128K-token context as Maverick, suited to long-document processing.

Fast Inference

Pricing

Serverless API

Pay per token with auto-scaling

₹15 per 1M input tokens · ₹30 per 1M output tokens

Auto-scaling
No minimum
99.9% uptime
Rate limits apply
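
To make the per-token pricing concrete, here is a minimal sketch that estimates a bill from the rates listed above. The function name and the sample workload are hypothetical, chosen only for illustration; the estimate ignores taxes, rate limits, and any other charges not stated on this page.

```python
# Rates copied from the pricing line above (rupees per 1M tokens).
INPUT_RATE_INR_PER_M = 15
OUTPUT_RATE_INR_PER_M = 30


def estimate_cost_inr(input_tokens: int, output_tokens: int) -> float:
    """Estimate the charge in rupees for a given number of tokens.

    Hypothetical helper for illustration, not part of any official SDK.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        input_tokens * INPUT_RATE_INR_PER_M
        + output_tokens * OUTPUT_RATE_INR_PER_M
    ) / 1_000_000


# Assumed workload: 10,000 requests, each with 2,000 input tokens
# and 500 output tokens.
requests = 10_000
total = estimate_cost_inr(requests * 2_000, requests * 500)
print(f"Estimated cost: ₹{total:.2f}")  # prints "Estimated cost: ₹450.00"
```

Because output tokens cost twice as much as input tokens, workloads with long responses are billed more heavily than the same total token count spent mostly on prompts.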

Use Cases

High-Volume Tasks

Cost-effective solution for high-throughput applications.

Real-time Chat

Fast response times for interactive applications.

Ready to Deploy Llama 4 Scout?

Get competitive quality at half the cost of Maverick.
