If you're a software developer, you've likely used GitHub Copilot or ChatGPT for tasks like translating code or generating functions from natural language descriptions. While powerful, these proprietary tools often lack transparency and can't be customized to your specific needs. Enter StarCoder from BigCode: a 16-billion-parameter model trained on one trillion tokens drawn from 80+ programming languages, GitHub issues, Git commits, and Jupyter notebooks, all permissively licensed. With its 8,192-token context window and multi-query attention for fast inference, StarCoder is one of the strongest open models available for code applications.
In this post, we show how to fine-tune StarCoder for chat to create a personalized coding assistant, dubbed StarChat. We cover:
- Prompting LLMs to act as conversational agents.
- OpenAI's Chat Markup Language (ChatML) for structured message formats.
- Fine-tuning with Transformers and DeepSpeed ZeRO-3 on dialogue data.
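To give a flavor of the structured message format covered below, here is a minimal sketch of rendering a dialogue with ChatML-style special tokens. The `to_chatml` helper is a hypothetical name for illustration, not part of any library; the `<|im_start|>`/`<|im_end|>` delimiters follow OpenAI's ChatML convention.

```python
# Sketch: serialize a list of chat messages into a ChatML-style string.
# `to_chatml` is an illustrative helper, not a library function.
def to_chatml(messages):
    """Render a list of {"role", "content"} dicts as a ChatML string."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    )

dialogue = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a function that reverses a string."},
]
print(to_chatml(dialogue))
```

During fine-tuning, the special tokens are added to the tokenizer's vocabulary so the model learns where each speaker's turn begins and ends.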
To see it in action, try the demo below. You'll find code, dataset, and model links here, here, and here.