In March 2026, Mistral AI released Mistral Small 4, a 119-billion-parameter language model under the Apache 2.0 license. The model achieves GPT-4-class performance on standard benchmarks while remaining fully open source and deployable on a single 8-GPU server, making it the largest high-performance language model available without licensing restrictions.
Mistral Small 4 Capabilities at a Glance
- 119 billion parameters with mixture-of-experts architecture
- Scores 86.3% on MMLU, competitive with GPT-4 Turbo’s 86.5%
- 128K token context window with reliable performance across the full range
- Apache 2.0 license with no commercial use restrictions
- Runs on 8x A100 80GB or equivalent hardware
- Native support for 24 languages including English, French, German, Spanish, and Chinese
Mixture-of-Experts Architecture Explained
Mistral Small 4 uses a mixture-of-experts (MoE) architecture in which only a subset of the 119B parameters activates for any given input. The model contains multiple specialized sub-networks, and a routing mechanism selects the most relevant experts for each token. As a result, the model stores 119B total parameters but activates only about 25B on each forward pass.
Mistral Small 4’s mixture-of-experts architecture activates only 25B of its 119B parameters per token, delivering GPT-4-class quality at inference costs comparable to much smaller dense models.
The practical result is that Small 4 runs faster and cheaper than a dense 119B model while achieving the quality benefits of the full parameter count during training. This is the same architectural approach that Google used for Gemini and that OpenAI reportedly uses for GPT-5.4.
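The routing described above can be sketched in a few lines. This is an illustrative toy, not Mistral's actual implementation: the expert count, top-k value, and layer sizes are invented, and real MoE layers use gated feed-forward experts plus load-balancing losses.

```python
# Toy top-k mixture-of-experts routing (illustrative; all sizes are invented).
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS = 8   # hypothetical number of experts
TOP_K = 2       # experts activated per token
D_MODEL = 16    # toy hidden dimension

# Each "expert" is a small feed-forward layer (here just one weight matrix).
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.02 for _ in range(N_EXPERTS)]
router = rng.standard_normal((D_MODEL, N_EXPERTS)) * 0.02  # gating network

def moe_forward(x):
    """Route a batch of token vectors (n, d) through the top-k experts each."""
    logits = x @ router                               # (n, N_EXPERTS) gate scores
    top_idx = np.argsort(logits, axis=-1)[:, -TOP_K:] # best k experts per token
    out = np.zeros_like(x)
    for i, token in enumerate(x):
        chosen = logits[i, top_idx[i]]
        weights = np.exp(chosen - chosen.max())
        weights /= weights.sum()                      # softmax over the k chosen experts
        for w, e in zip(weights, top_idx[i]):
            out[i] += w * (token @ experts[e])        # weighted sum of expert outputs
    return out

tokens = rng.standard_normal((4, D_MODEL))
y = moe_forward(tokens)
print(y.shape)  # (4, 16)
```

Only TOP_K of the N_EXPERTS weight matrices are multiplied per token, which is why compute scales with the active parameter count rather than the total.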
Deployment Options for Enterprise Teams
Self-hosting Small 4 requires 8 GPUs with 80GB of VRAM each, such as 8x A100 or 8x H100. Mistral provides optimized Docker containers and Kubernetes Helm charts for deployment. Quantized variants reduce hardware requirements: the AWQ INT4 version runs on 4x A100 GPUs with minor quality degradation.
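A back-of-the-envelope check shows why those GPU counts work out. This estimate covers model weights only; KV cache and activations require additional headroom on top.

```python
# Rough VRAM estimate for model weights (weights only; KV cache and
# activations add more on top of these figures).
def weights_gb(n_params_billion, bytes_per_param):
    """Return weight memory in GB for a model of the given size."""
    return n_params_billion * 1e9 * bytes_per_param / 1e9

fp16 = weights_gb(119, 2)    # FP16/BF16: 2 bytes per parameter
int4 = weights_gb(119, 0.5)  # AWQ INT4: ~0.5 bytes per parameter

print(f"FP16 weights: {fp16:.1f} GB vs 8 x 80 GB = 640 GB")  # 238.0 GB
print(f"INT4 weights: {int4:.1f} GB vs 4 x 80 GB = 320 GB")  # 59.5 GB
```

At FP16 the 238 GB of weights fit comfortably across eight 80 GB GPUs, and the INT4 variant leaves ample room on four, which matches the configurations above.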
For teams that prefer managed hosting, Small 4 is available on the Mistral API, Amazon Bedrock, Azure, and Google Cloud. The managed options provide enterprise SLAs and support while using the same underlying model.
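For the API route, a request would follow the standard chat-completions shape. Note the model identifier `mistral-small-4` is a guess for illustration, not a confirmed id; check the provider's model list before use.

```python
# Sketch of a chat-completions request payload. The model id below is
# hypothetical -- verify the actual identifier in the provider's model list.
import json

payload = {
    "model": "mistral-small-4",  # hypothetical model id
    "messages": [
        {"role": "user", "content": "Summarize our Q3 incident report."}
    ],
    "max_tokens": 512,
}

# To send (requires an API key; not executed here):
# requests.post("https://api.mistral.ai/v1/chat/completions",
#               headers={"Authorization": f"Bearer {API_KEY}"},
#               json=payload)
print(json.dumps(payload, indent=2))
```

Because Bedrock, Azure, and Google Cloud expose the same underlying model, switching providers mostly means changing the endpoint and authentication, not the prompt or payload structure.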
Why 119B Open-Source Matters for the Industry
Previous open-source models at this quality level either carried restrictive licenses (such as Llama 2’s commercial limitations, since lifted) or demanded impractical amounts of hardware. Small 4 removes both barriers. Any company with a modest GPU budget can now run a GPT-4-class model in-house, fully under its control, with no licensing fees or API rate limits.
For regulated industries that cannot send data to third-party APIs, this is transformative. Banks, hospitals, and government agencies can deploy capable AI without the compliance risks of external API dependencies. The Apache 2.0 license means legal teams do not need to negotiate enterprise agreements to start building.