DailyGlimpse

SetFit: A Prompt-Free Approach to Efficient Few-Shot Learning

AI
April 26, 2026 · 5:20 PM
SetFit is a new framework for few-shot fine-tuning of Sentence Transformers, developed by Hugging Face in collaboration with Intel Labs and the UKP Lab. Unlike traditional methods that rely on handcrafted prompts or verbalisers, SetFit generates rich embeddings directly from labeled text examples, eliminating the need for prompts entirely.

How It Works

SetFit operates in two stages. First, it fine-tunes a Sentence Transformer model on a small number of labeled examples (typically 8 or 16 per class) using contrastive training. Positive and negative pairs are created by selecting examples from the same or different classes. This produces dense embeddings for each example. Second, a classifier head is trained on these embeddings to predict class labels. At inference, new examples are passed through the fine-tuned Sentence Transformer to generate embeddings, which are then classified.
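The pair-generation step of the first stage can be sketched in plain Python. This is a simplified illustration, not the setfit library's internal implementation; `make_contrastive_pairs` is a hypothetical helper name:

```python
import random

def make_contrastive_pairs(texts, labels, pairs_per_example=2, seed=42):
    """Sketch of SetFit's contrastive pair generation: build
    (text_a, text_b, similarity) triples, where same-class pairs get
    similarity 1.0 and cross-class pairs get 0.0."""
    rng = random.Random(seed)

    # Group the labeled examples by class
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)

    pairs = []
    for text, label in zip(texts, labels):
        positives = [t for t in by_label[label] if t != text]
        negatives = [t for other, ts in by_label.items() if other != label for t in ts]
        for _ in range(pairs_per_example):
            if positives:
                pairs.append((text, rng.choice(positives), 1.0))  # same class
            if negatives:
                pairs.append((text, rng.choice(negatives), 0.0))  # different class
    return pairs

texts = ["great phone", "loved it", "battery died fast", "awful screen"]
labels = [1, 1, 0, 0]
pairs = make_contrastive_pairs(texts, labels, pairs_per_example=1)
```

The Sentence Transformer is then fine-tuned on these triples with a similarity loss, pulling same-class embeddings together and pushing different-class embeddings apart.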

SetFit also supports multilingual classification by simply switching to a multilingual Sentence Transformer checkpoint. Experiments show promising results in German, Japanese, Mandarin, French, and Spanish, both in-language and cross-linguistically.
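Because multilingual support is only a checkpoint change, it can be captured in a one-line lookup. The helper below is a hypothetical illustration, not part of the setfit API; the checkpoint names are real Sentence Transformer models on the Hugging Face Hub:

```python
# Hypothetical helper showing that multilingual SetFit is just a checkpoint swap.
CHECKPOINTS = {
    "en": "sentence-transformers/paraphrase-mpnet-base-v2",
    # One multilingual checkpoint covers German, Japanese, Mandarin, French, Spanish, ...
    "multilingual": "sentence-transformers/paraphrase-multilingual-mpnet-base-v2",
}

def checkpoint_for(language: str) -> str:
    """Fall back to the multilingual checkpoint for any non-English language."""
    return CHECKPOINTS.get(language, CHECKPOINTS["multilingual"])
```

The rest of the training pipeline stays identical regardless of which checkpoint is chosen.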

Benchmark Performance

Despite using much smaller models, SetFit matches or exceeds state-of-the-art few-shot methods. On the RAFT benchmark, SetFit with a RoBERTa Large backbone (355M parameters) outperforms PET and GPT-3, and is competitive with T-Few (11B parameters) – a model 30 times larger. SetFit even surpasses the human baseline on 7 of the 11 RAFT tasks.

Rank  Method                   Accuracy  Model Size
2     T-Few                    75.8      11B
4     Human Baseline           73.5      N/A
6     SetFit (RoBERTa Large)   71.3      355M
9     PET                      69.6      235M
11    SetFit (MPNet)           66.9      110M
12    GPT-3                    62.7      175B

On other datasets, SetFit with only 8 examples per class typically outperforms PERFECT, ADAPET, and vanilla fine-tuned transformers, while achieving comparable results to T-Few 3B despite being 27 times smaller.

Speed and Cost Efficiency

SetFit is significantly faster and cheaper to train than competing methods. Training SetFit on an NVIDIA V100 with 8 labeled examples takes only 30 seconds and costs $0.025. In contrast, training T-Few 3B on an NVIDIA A100 takes 11 minutes and costs about $0.70 – a 28x increase. SetFit can even be trained in minutes on a single GPU, such as those available on Google Colab, or on a CPU. Through distillation, inference speed-ups of up to 123x are possible.
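The quoted cost gap follows directly from the figures above, as a back-of-envelope check:

```python
# Back-of-envelope check of the training-cost comparison quoted above.
setfit_cost = 0.025  # USD: NVIDIA V100, 8 labeled examples, ~30 seconds
t_few_cost = 0.70    # USD: NVIDIA A100 (T-Few 3B), ~11 minutes

ratio = t_few_cost / setfit_cost
print(f"T-Few 3B is roughly {ratio:.0f}x more expensive to train")
```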

Getting Started

To train your own SetFit model, first install the setfit library:

pip install setfit

Then fine-tune on a handful of labeled examples – here, 16 sentences from the SentEval-CR customer-review dataset:

from datasets import load_dataset
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import SetFitModel, SetFitTrainer

# Load a SetFit-formatted dataset and sample a small training set
dataset = load_dataset("SetFit/SentEval-CR")
train_ds = dataset["train"].shuffle(seed=42).select(range(16))
test_ds = dataset["test"]

# Start from a pretrained Sentence Transformer checkpoint
model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

trainer = SetFitTrainer(
    model=model,
    train_dataset=train_ds,
    eval_dataset=test_ds,
    loss_class=CosineSimilarityLoss,  # loss used in the contrastive stage
    batch_size=16,
    num_iterations=20,  # number of text pairs generated per example
)
trainer.train()
metrics = trainer.evaluate()  # accuracy on the test split

For more details, see the paper, code, and models on the Hub.