NVIDIA Releases Nemotron-3-Nano Omni-Modal MoE
The new 30-billion-parameter Mixture-of-Experts model handles any combination of modalities with just 3 billion active parameters.
NVIDIA has introduced Nemotron-3-Nano-Omni, a new model designed for multimodal tasks. The model employs a Mixture-of-Experts (MoE) architecture, a technique in which a router activates only a fraction of the model's total parameters for each input token rather than running the whole network, which substantially reduces the compute needed per token.
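To make the routing idea concrete, here is a minimal sketch of top-k expert selection in plain Python. It is a toy illustration, not NVIDIA's implementation: the expert count, the top-k value, the router scores, and the helper names (`route_token`, `moe_layer`) are all assumptions chosen for readability, since the article does not describe the model's router.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(router_logits, top_k):
    """Pick the top_k experts for one token and renormalize their weights."""
    probs = softmax(router_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]

def moe_layer(token, experts, router_logits, top_k):
    """Run only the selected experts and mix their outputs by router weight."""
    output = [0.0] * len(token)
    for index, weight in route_token(router_logits, top_k):
        expert_out = experts[index](token)  # unselected experts are never called
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output

# Hypothetical toy configuration: 8 experts, 2 active per token.
NUM_EXPERTS, TOP_K = 8, 2
# Each stand-in "expert" simply scales its input by (index + 1).
experts = [lambda v, scale=i + 1: [scale * x for x in v] for i in range(NUM_EXPERTS)]
# Made-up router scores for a single token.
router_logits = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.7]

selected = route_token(router_logits, TOP_K)
output = moe_layer([1.0, 2.0], experts, router_logits, TOP_K)
print("experts used:", [i for i, _ in selected])
print("layer output:", output)
```

Only two of the eight toy experts run for this token, which is the source of the compute savings: the cost per token scales with the active experts, not with the total number of experts.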
With a total of 30 billion parameters, Nemotron-3-Nano-Omni uses just 3 billion active parameters per token during inference. This gives it the capacity of a 30-billion-parameter model at a per-token compute cost closer to that of a 3-billion-parameter dense model, although the full set of weights still has to be held in memory. The model is available with BF16 (bfloat16) weights, a 16-bit floating-point format that keeps the exponent range of FP32 while halving storage per parameter.
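The headline figures translate into some quick back-of-the-envelope arithmetic. The sketch below uses the rounded parameter counts from the article and the fact that BF16 stores each value in 2 bytes. It estimates the weight footprint only, so activations, caches, and runtime overhead are not included, and the exact parameter count of the released checkpoint may differ from the rounded numbers.

```python
TOTAL_PARAMS = 30e9    # total parameters, as stated in the article (rounded)
ACTIVE_PARAMS = 3e9    # active parameters per token, as stated in the article (rounded)
BYTES_PER_BF16 = 2     # bfloat16 is a 16-bit (2-byte) format

# Share of the model that participates in computing each token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Storage needed for the weights alone; every expert must be stored,
# even though only some of them run for a given token.
weight_bytes = TOTAL_PARAMS * BYTES_PER_BF16
weight_gb = weight_bytes / 1e9
weight_gib = weight_bytes / 2**30

print(f"Active parameter share per token: {active_fraction:.0%}")
print(f"BF16 weight footprint: ~{weight_gb:.0f} GB ({weight_gib:.1f} GiB)")
# prints:
# Active parameter share per token: 10%
# BF16 weight footprint: ~60 GB (55.9 GiB)
```

In other words, the sparse design cuts the compute per token to roughly a tenth of the parameters, but it does not shrink the memory needed to hold the model.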
Any-to-Any Reasoning
The model's key feature is its "omni-modal" capability: it can process and generate information across different formats. It handles what the company calls "any-to-any" tasks, meaning it can ingest a mix of inputs, such as text, images, and video, and produce a mix of outputs in response. This flexibility matters for applications that need to relate context across multiple data types.
Nemotron-3-Nano-Omni pairs a sparse, compute-efficient design with broad modality coverage. While it is governed by a custom NVIDIA license rather than a permissive open-source one, its availability on the Hugging Face Hub lets researchers and developers experiment with its multimodal reasoning capabilities.
Sources
- nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16 (Hugging Face)