DailyGlimpse

Hands-On Contrastive Learning: Fine-Tuning BERT and MiniLM for Scientific Claim Retrieval

AI
April 26, 2026 · 11:17 PM

In the latest installment of the Generative AI lecture series, Constantine Caramanis demonstrates practical applications of InfoNCE loss and contrastive learning. The session focuses on fine-tuning BERT and MiniLM models on the SciFact dataset—a collection of scientific claims paired with related passages—to improve the semantic embeddings they produce for search tasks.
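The core training objective described above can be sketched as follows. This is a minimal, illustrative InfoNCE implementation in NumPy (not the lecture's actual code): it assumes each query's positive passage sits at the same batch index, with all other in-batch passages serving as negatives, and uses a hypothetical `temperature` value of 0.05.

```python
import numpy as np

def info_nce_loss(query_emb, passage_emb, temperature=0.05):
    """InfoNCE with in-batch negatives: query i's positive passage is
    passage i; every other passage in the batch is a negative."""
    # L2-normalize so the dot product becomes cosine similarity
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    p = passage_emb / np.linalg.norm(passage_emb, axis=1, keepdims=True)
    # Similarity matrix: sim[i, j] = cos(query_i, passage_j) / temperature
    sim = q @ p.T / temperature
    # Row-wise log-softmax; the matching pairs lie on the diagonal
    log_probs = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    # Cross-entropy against the diagonal, averaged over the batch
    return -np.mean(np.diag(log_probs))
```

During fine-tuning, minimizing this loss pulls each claim's embedding toward its matching passage while pushing it away from the other passages in the batch, which is what sharpens the similarity matrix the lecture discusses.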

Retrieval performance is measured before and after training with NDCG and Recall metrics, showing quantitative gains from domain-specific fine-tuning. The lecture includes a live Colab notebook where viewers can follow along with the implementation.
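For readers unfamiliar with the two evaluation metrics, here is a minimal sketch (not the lecture's code) of binary-relevance NDCG@k and Recall@k, assuming `relevances` lists the relevance (1 = relevant, 0 = not) of the retrieved passages in rank order:

```python
import numpy as np

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k ranked results."""
    rel = np.asarray(relevances[:k], dtype=float)
    return float(np.sum(rel / np.log2(np.arange(2, rel.size + 2))))

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of an ideal (perfectly sorted) ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

def recall_at_k(relevances, k, total_relevant):
    """Fraction of all relevant passages retrieved in the top k."""
    return sum(1 for r in relevances[:k] if r > 0) / total_relevant
```

NDCG rewards placing relevant passages near the top of the ranking, while Recall only asks whether they appear in the top k at all, so reporting both gives a fuller picture of the fine-tuned models' gains.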

Key concepts covered: InfoNCE, Contrastive Loss, embeddings, fine-tuning, and similarity matrices.