DailyGlimpse

NVIDIA Nemotron OCR v2: Blazing-Fast Multilingual Document Recognition Powered by Synthetic Data

AI
April 27, 2026 · 4:16 PM

NVIDIA has unveiled Nemotron OCR v2, an optical character recognition (OCR) model built for fast multilingual document processing and trained on synthetic data. Designed for document AI applications, the model can process 34.7 pages per second on a single A100 GPU.
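To put the reported throughput in perspective, the short Python sketch below converts 34.7 pages per second into an average time per page and an hourly volume. Only the 34.7 figure comes from the announcement. The helper names and the one-million-page corpus are hypothetical, and the calculation assumes the rate is sustained with no I/O or preprocessing overhead, which real deployments may not achieve.

```python
# Reported figure: pages per second on a single A100 GPU.
PAGES_PER_SECOND = 34.7


def seconds_per_page(pages_per_second: float) -> float:
    """Average (amortized) time per page, not single-page latency."""
    return 1.0 / pages_per_second


def pages_per_hour(pages_per_second: float) -> float:
    """Pages processed in one hour at a sustained rate."""
    return pages_per_second * 3600


def hours_for_corpus(num_pages: int, pages_per_second: float) -> float:
    """Hours needed for a corpus, assuming the rate never drops."""
    return num_pages / pages_per_hour(pages_per_second)


if __name__ == "__main__":
    ms = seconds_per_page(PAGES_PER_SECOND) * 1000
    print(f"{ms:.1f} ms per page on average")
    print(f"{pages_per_hour(PAGES_PER_SECOND):,.0f} pages per hour")
    # Hypothetical corpus size, chosen only for illustration.
    print(f"{hours_for_corpus(1_000_000, PAGES_PER_SECOND):.1f} hours for 1,000,000 pages")
```

At that rate, a single GPU would average roughly 29 milliseconds per page, or about 125,000 pages per hour.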

According to NVIDIA, the key to Nemotron OCR v2's performance lies in its data strategy: the model was trained on 12 million synthetic images spanning six languages. By the company's account, this large-scale dataset matters as much as the architectural changes in boosting OCR accuracy.

The architecture cuts redundant computation by sharing the detection backbone with the recognizer and the relational model, so image features are computed once rather than separately for each component. This speeds up inference without sacrificing quality. NVIDIA says the model generalizes well to real-world documents, making it a practical tool for enterprise document processing.
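The idea of a shared backbone can be illustrated with a toy Python sketch. This is not NVIDIA's code: every class and function name here is hypothetical, and the "features" are plain text lines standing in for image feature maps. It only shows the general pattern of extracting features once and handing the same result to the detection, recognition, and relational stages.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SharedBackbone:
    """Toy stand-in for a detection backbone; counts how often it runs."""
    calls: int = 0

    def extract(self, page: list[str]) -> list[tuple[int, str]]:
        self.calls += 1
        # Toy "features": one (line_index, text) pair per non-empty line.
        return [(i, line) for i, line in enumerate(page) if line.strip()]


def detect(features: list[tuple[int, str]]) -> list[int]:
    """Detection head: which regions (here, line indices) contain text."""
    return [i for i, _ in features]


def recognize(features: list[tuple[int, str]]) -> list[str]:
    """Recognition head: the text of each region."""
    return [text.strip() for _, text in features]


def relate(features: list[tuple[int, str]]) -> list[tuple[int, int]]:
    """Relational head: reading-order links between consecutive regions."""
    idx = [i for i, _ in features]
    return list(zip(idx, idx[1:]))


def run_pipeline(backbone: SharedBackbone, page: list[str]) -> dict:
    features = backbone.extract(page)  # computed once, reused by all heads
    return {
        "regions": detect(features),
        "text": recognize(features),
        "reading_order": relate(features),
    }


if __name__ == "__main__":
    backbone = SharedBackbone()
    result = run_pipeline(backbone, ["Invoice 42", "", "  Total: 10 EUR"])
    print(result)
    print("backbone calls:", backbone.calls)
```

In this sketch the backbone runs once per page no matter how many heads consume its output. A design in which each stage ran its own feature extractor would repeat that work three times.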

Nemotron OCR v2 marks a notable step forward for multilingual vision models, showing how synthetic data can compensate for the scarcity of annotated real-world data. The release is most relevant to researchers and developers working on document AI, OCR, and multilingual models.