Sentence Transformers has introduced multimodal embedding and reranker models that process both text and images. The models generate unified representations for mixed-modal data, with the goal of improving search and retrieval. Target applications include visual search, cross-modal retrieval, and content-based recommendation systems. The models are available through the Sentence Transformers library, which is already widely used for text embeddings. With this release, developers can build search engines that handle both textual and visual content.
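To illustrate what a unified representation makes possible, here is a minimal sketch of cross-modal retrieval by cosine similarity. It does not call the new models. The three-dimensional vectors are made-up stand-ins for embeddings that a multimodal model would produce, and the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical, chosen only for this example. The only assumption is that text and image embeddings live in the same vector space, as the announcement describes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, corpus):
    """Return (item_id, score) pairs sorted from most to least similar."""
    scored = [
        (item_id, cosine_similarity(query_vec, vec))
        for item_id, vec in corpus.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up stand-in embeddings (NOT real model output). Because images and
# text are assumed to share one vector space, they can sit in one index.
corpus = {
    "photo_of_cat.jpg": [0.9, 0.1, 0.0],
    "photo_of_car.jpg": [0.1, 0.9, 0.1],
    "caption: a kitten sleeping": [0.8, 0.2, 0.1],
}

# Stand-in for the embedding of a text query such as "cat".
query = [1.0, 0.0, 0.0]

ranking = rank_by_similarity(query, corpus)
for item_id, score in ranking:
    print(f"{score:.3f}  {item_id}")
```

With these toy vectors the cat photo ranks first and the car photo last, which is the behavior a text-to-image search relies on. In a two-stage setup, a reranker model would then rescore the top few candidates from this list; that stage is omitted here.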
Sentence Transformers Launches Multimodal Embedding and Reranker Models
AI
April 26, 2026 · 4:00 PM