NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model that activates just 12B parameters per token, pairing compute efficiency with accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer Mixture-of-Experts architecture with multi-token prediction (MTP), it delivers over 50% higher token-generation throughput than leading open models. A 1M-token context window supports long-term agent coherence, cross-document reasoning, and multi-step task planning, while latent MoE calls four experts for the inference cost of only one, improving intelligence and generalization. RL training across 10+ environments delivers leading accuracy on benchmarks including AIME 2025, TerminalBench, and SWE-Bench Verified. Fully open, with weights, datasets, and recipes released under the NVIDIA Open License, Nemotron 3 Super allows easy customization and secure deployment anywhere, from workstation to cloud.
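Since the page emphasizes open weights and easy deployment, here is a minimal sketch of loading and prompting the model with Hugging Face `transformers`. The repo ID `nvidia/Nemotron-3-Super` is an assumption, as are the dtype and device settings; check NVIDIA's official release for the actual identifier.

```python
# Minimal sketch: load and prompt the model via Hugging Face transformers.
# The repo ID below is hypothetical; substitute the official identifier.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "nvidia/Nemotron-3-Super"  # hypothetical repo ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # halves memory vs. fp32
    device_map="auto",           # shard the 120B weights across available GPUs
    trust_remote_code=True,      # hybrid Mamba-Transformer blocks may ship custom code
)

prompt = "Summarize the trade-offs of Mixture-of-Experts inference."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```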
| Signal | Strength (0–100) | Weight | Impact |
|---|---|---|---|
| Capabilities | 83 | 30% | +25.0 |
| Recency | 100 | 15% | +15.0 |
| Output Capacity | 90 | 15% | +13.5 |
| Context Window | 86 | 15% | +12.9 |
| Pricing | 30 | 25% | +7.5 |
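The Impact column is each signal's weighted contribution to the composite score. A minimal sketch of that arithmetic, assuming Impact = Strength × Weight and that the composite score is the sum of the per-signal impacts (displayed figures appear rounded):

```python
# Sketch of the weighted-signal scoring shown in the table above.
# Assumes Impact = Strength * Weight and a composite score equal to the sum
# of per-signal impacts; the table's displayed values may be rounded.
signals = {
    # name: (strength on a 0-100 scale, weight as a fraction)
    "Capabilities":    (83, 0.30),
    "Recency":         (100, 0.15),
    "Output Capacity": (90, 0.15),
    "Context Window":  (86, 0.15),
    "Pricing":         (30, 0.25),
}

composite = 0.0
for name, (strength, weight) in signals.items():
    impact = strength * weight
    composite += impact
    print(f"{name:<16} {impact:+.1f}")

print(f"Composite score: {composite:.1f}")  # ~73.8 with these rounded inputs
```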
Community and practitioner feedback adds real-world signal on top of benchmarks and pricing.
Share your experience with Nemotron 3 Super (free) and help the community make better decisions.