The latest model in the Nemotron series, Nemotron 3 Nano Omni, has been released as the first to natively support audio inputs alongside text, images, and video. Building on its predecessor, Nemotron Nano V2 VL, it delivers consistent accuracy improvements across all modalities. The model is open and efficient, targeting researchers and developers who need a compact yet powerful multimodal AI. Details are available in the paper at https://huggingface.co/papers/2604.24954.
Nemotron 3 Nano Omni: A New Open Multimodal Model with Native Audio Support
AI
May 2, 2026 · 3:55 PM