DailyGlimpse

Hugging Face Unveils IDEFICS: An Open-Source Multimodal AI Model Rivaling Proprietary Systems

April 26, 2026 · 4:44 PM
Hugging Face has released IDEFICS, an open-access visual language model that reproduces DeepMind's proprietary Flamingo and performs comparably to it on image-text benchmarks. The model accepts sequences of interleaved images and text and generates coherent text in response. It comes in 9-billion- and 80-billion-parameter sizes, each with a base and an instruction-tuned variant.

Built entirely on publicly available data and models (LLaMA v1 and OpenCLIP), IDEFICS aims to increase transparency in AI development. The team trained only on open datasets, including OBELICS, a newly created corpus of 115 billion text tokens, and documented the training process to share the lessons learned.

Ethical considerations were integral to the project. The team developed an ethical charter and conducted adversarial testing (red teaming) to evaluate potential biases before release. The weights the team trained are released under an MIT license, but the underlying LLaMA component remains subject to Meta's own license, which users must obtain separately.

IDEFICS is available via Hugging Face's Transformers library, with a demo and code samples provided. The project represents a significant step toward democratizing multimodal AI research.
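For readers who want to try the model, the following is a minimal sketch of querying an instruction-tuned checkpoint through Transformers. It assumes the checkpoints are published under the `HuggingFaceM4/idefics-*` names on the Hugging Face Hub and that `torch` and `transformers` are installed; the helper names (`checkpoint_name`, `build_prompt`, `generate`) and the image URL are illustrative placeholders, not part of the release.

```python
def checkpoint_name(size="9b", instruct=True):
    """Return the assumed Hub ID for one of the four released variants."""
    if size not in ("9b", "80b"):
        raise ValueError("size must be '9b' or '80b'")
    suffix = "-instruct" if instruct else ""
    return f"HuggingFaceM4/idefics-{size}{suffix}"


def build_prompt(image_url, question):
    """Build one interleaved image-text prompt in a User/Assistant dialogue layout."""
    return [
        f"User: {question}",
        image_url,               # images are passed inline as URLs or PIL images
        "<end_of_utterance>",
        "\nAssistant:",
    ]


def generate(prompts, checkpoint=None, max_new_tokens=64):
    """Run generation for a batch of prompts. Downloads several GB of weights."""
    # Heavy dependencies are imported lazily so the helpers above stay lightweight.
    import torch
    from transformers import AutoProcessor, IdeficsForVisionText2Text

    checkpoint = checkpoint or checkpoint_name("9b", instruct=True)
    device = "cuda" if torch.cuda.is_available() else "cpu"

    processor = AutoProcessor.from_pretrained(checkpoint)
    model = IdeficsForVisionText2Text.from_pretrained(
        checkpoint, torch_dtype=torch.bfloat16
    ).to(device)

    inputs = processor(
        prompts, add_end_of_utterance_token=False, return_tensors="pt"
    ).to(device)

    # Stop at the end of the assistant turn and never emit image placeholder tokens.
    tokenizer = processor.tokenizer
    stop_ids = tokenizer("<end_of_utterance>", add_special_tokens=False).input_ids
    bad_words_ids = tokenizer(
        ["<image>", "<fake_token_around_image>"], add_special_tokens=False
    ).input_ids

    generated_ids = model.generate(
        **inputs,
        eos_token_id=stop_ids,
        bad_words_ids=bad_words_ids,
        max_new_tokens=max_new_tokens,
    )
    return processor.batch_decode(generated_ids, skip_special_tokens=True)


if __name__ == "__main__":
    # Placeholder URL: replace with a real, reachable image before running.
    prompt = build_prompt("https://example.com/dog.jpg", "What is in this image?")
    print(generate([prompt])[0])
```

The 9-billion-parameter instruct variant is the more practical starting point on a single GPU; the 80-billion-parameter checkpoints need considerably more memory. Access to the LLaMA-derived weights is subject to the licensing terms described above.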