Nemotron 3 Nano Omni: A New Open Multimodal Model with Native Audio Support

May 2, 2026 · 3:55 PM

Nemotron 3 Nano Omni, the latest model in the Nemotron series, is the first in the series to natively support audio inputs alongside text, images, and video. It builds on its predecessor, Nemotron Nano V2 VL, and delivers consistent accuracy improvements over it across all modalities. The model is open and efficient, and it targets researchers and developers who need a compact yet capable multimodal model. Details are available in the paper at https://huggingface.co/papers/2604.24954.
