Google's Gemma models, a family of lightweight open-weight language models, are now fully supported on the Hugging Face platform, allowing developers to fine-tune them for specialized tasks. This guide walks through the process of adapting Gemma to custom datasets using the Transformers library.
Setting Up the Environment
First, install the necessary dependencies:
pip install transformers datasets accelerate
Gemma is a gated model, so you must accept Google's usage terms on the model page and authenticate with a Hugging Face access token before downloading; the token also needs write access if you plan to push your fine-tuned model to the Hub. Log in using:
huggingface-cli login
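If you prefer to authenticate from a script or notebook instead of the CLI, the huggingface_hub library provides an equivalent programmatic login (a minimal sketch; replace the placeholder with your own access token):
from huggingface_hub import login

login(token="hf_...")  # placeholder; use your own access token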
Loading the Base Model
Choose a Gemma variant (e.g., google/gemma-2b for the base model or google/gemma-2b-it for the instruction-tuned one). Load the model and tokenizer:
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it")
The -it suffix indicates an instruction-tuned version, ideal for chat-style fine-tuning.
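Before investing in training, it can help to run a quick generation with the instruction-tuned checkpoint to confirm everything loads correctly. The sketch below uses the tokenizer's chat template; the prompt is only an illustrative example:
# Quick smoke test: build a chat prompt and generate a short reply
messages = [{"role": "user", "content": "Summarize what fine-tuning does in one sentence."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_ids, max_new_tokens=60)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))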
Preparing the Dataset
For fine-tuning, use a conversational dataset. Here's an example using the datasets library:
from datasets import load_dataset
dataset = load_dataset("your_dataset_name")
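The tokenization step below expects a single "text" column. If your dataset instead stores conversations as lists of role/content messages (assumed here to live in a "messages" column; adjust to your schema), you can flatten each conversation into a training string with the tokenizer's chat template first:
def format_chat(example):
    # Render the conversation into one training string using Gemma's chat template
    return {"text": tokenizer.apply_chat_template(example["messages"], tokenize=False)}

dataset = dataset.map(format_chat)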
Tokenize the data, ensuring proper padding and truncation:
def tokenize_function(examples):
    return tokenizer(examples["text"], truncation=True, padding="max_length", max_length=512)
tokenized_datasets = dataset.map(tokenize_function, batched=True)
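The tokenized examples contain only input_ids and attention_mask, but causal language modeling also needs a labels field to compute the loss. A standard way to supply it is a data collator that copies input_ids into labels at batch time:
from transformers import DataCollatorForLanguageModeling

# mlm=False produces causal-LM labels (a copy of input_ids) rather than masked-LM targets
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)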
Fine-Tuning with Trainer
Use Hugging Face's Trainer API for efficient fine-tuning:
from transformers import Trainer, TrainingArguments
training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    num_train_epochs=3,
    weight_decay=0.01,
    push_to_hub=True,
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
    data_collator=data_collator,
    tokenizer=tokenizer,
)
trainer.train()
Saving and Sharing
After training, push your model to Hugging Face Hub:
trainer.push_to_hub()
Your fine-tuned Gemma model is now available for inference. Experiment with hyperparameters and dataset sizes to optimize performance for your specific use case.
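To use the uploaded checkpoint elsewhere, load it by its Hub repository id. By default push_to_hub names the repository after output_dir, so the id below is a placeholder; substitute your actual username and repo name:
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "your-username/results"  # placeholder Hub repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)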
For more advanced techniques like LoRA (Low-Rank Adaptation) or PEFT, refer to the Hugging Face documentation on parameter-efficient fine-tuning.
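As a rough illustration of what parameter-efficient fine-tuning looks like, here is a minimal LoRA sketch using the peft library (install it with pip install peft); the rank, alpha, and target module names are illustrative defaults, not tuned values:
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,                                   # rank of the low-rank update matrices
    lora_alpha=16,                         # scaling factor for the adapter updates
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices are trained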