In this tutorial, I'll show you how I created Doodle Dash, a real-time ML-powered web game that runs entirely in your browser using Transformers.js. The game is inspired by Google's Quick, Draw! — you have one minute to draw as many items as possible while a neural network guesses your sketches in real time. The model achieves over 60 predictions per second, all locally in your browser.
Overview
The game prompts you to draw an object. As you draw, a MobileViT model predicts the label. If correct, the canvas clears and you get a new prompt. You keep going until the timer runs out. This tutorial covers three stages:
- Training the neural network
- Running it in the browser with Transformers.js
- Game design
1. Training the Neural Network
Training Data
We use a subset of Google's Quick, Draw! dataset. The subset contains over 5 million drawings across 345 categories.
Model Architecture
We fine-tune apple/mobilevit-small, a lightweight Vision Transformer with only 5.6M parameters (~20 MB file size), ideal for in-browser inference.
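As a rough sanity check on the file size, the parameter count can be turned into an on-disk estimate. The sketch below assumes every parameter is stored as a 32-bit float with no quantization (an assumption, not something stated above). The helper name `estimateModelSizeMB` is made up for illustration.

```javascript
// Back-of-the-envelope model size estimate.
const PARAMETER_COUNT = 5.6e6;  // parameter count quoted for apple/mobilevit-small
const BYTES_PER_PARAMETER = 4;  // assumption: float32 weights, no quantization

// Hypothetical helper: convert a parameter count into decimal megabytes.
function estimateModelSizeMB(parameterCount, bytesPerParameter) {
  return (parameterCount * bytesPerParameter) / 1e6;
}

const estimatedSizeMB = estimateModelSizeMB(PARAMETER_COUNT, BYTES_PER_PARAMETER);
console.log(`${estimatedSizeMB.toFixed(1)} MB`);
```

The result lands in the same ballpark as the ~20 MB figure. The exact size of a real file also depends on the serialization format and on any quantization applied.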
Fine-Tuning
A Colab notebook details the steps: loading the dataset, transforming images, defining a collate function and evaluation metric, loading the pre-trained model, and training with the Hugging Face Trainer. The fine-tuned model is available here.
2. Running in the Browser with Transformers.js
Transformers.js lets you run 🤗 Transformers models directly in the browser using ONNX Runtime. First, convert the PyTorch model to ONNX using 🤗 Optimum (see the tutorial for details). Then set up a JavaScript project, load the model, and run inference on canvas drawings.
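A minimal sketch of the loading and inference step might look like the following. It assumes the library is installed as the npm package `@xenova/transformers` and that the image-classification pipeline accepts a data URL produced by `canvas.toDataURL()`. `modelId` is a placeholder for the Hub ID of the converted model, and the function names are illustrative rather than taken from the Doodle Dash source.

```javascript
// Load an image-classification pipeline for the converted ONNX model.
// Assumption: Transformers.js is available as "@xenova/transformers".
// `modelId` is a placeholder; pass the Hub ID of your own converted model.
async function createSketchClassifier(modelId) {
  const { pipeline } = await import('@xenova/transformers');
  return pipeline('image-classification', modelId);
}

// Pick the highest-scoring entry from an array of { label, score } objects.
// Returns null when the array is empty.
function bestPrediction(predictions) {
  let best = null;
  for (const prediction of predictions) {
    if (best === null || prediction.score > best.score) {
      best = prediction;
    }
  }
  return best;
}

// Classify the current contents of a <canvas> element.
// Assumption: the pipeline accepts a data URL as its image input.
async function classifyCanvas(classifier, canvas) {
  const predictions = await classifier(canvas.toDataURL());
  return bestPrediction(predictions);
}
```

In a page, you would call `createSketchClassifier` once at startup and reuse the returned classifier for every call to `classifyCanvas`. Running inference in a Web Worker is a common way to keep drawing responsive, though that is a design choice outside this sketch.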
3. Game Design
Take advantage of the model's real-time performance by making a prediction on every stroke. Add quality-of-life features such as clear feedback when the model guesses correctly, a timer, and a score counter. The source code for Doodle Dash is on GitHub.
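The game rules described in this tutorial can be sketched as a small piece of state plus a handler that runs after each prediction. This is an illustrative sketch, not the Doodle Dash implementation: every name here (`createGame`, `handlePrediction`, and so on) is hypothetical, and rendering, canvas clearing, and feedback are left to the caller.

```javascript
const GAME_DURATION_MS = 60 * 1000; // one minute per round

// Create a new game from a list of prompt labels and a start timestamp (ms).
function createGame(prompts, startTimeMs) {
  return { prompts, promptIndex: 0, score: 0, startTimeMs };
}

// The label the player is currently asked to draw (cycles through the list).
function currentPrompt(game) {
  return game.prompts[game.promptIndex % game.prompts.length];
}

// Milliseconds left on the timer, never negative.
function timeRemainingMs(game, nowMs) {
  return Math.max(0, GAME_DURATION_MS - (nowMs - game.startTimeMs));
}

// Call after each stroke with the model's top label.
// Returns true on a correct guess, so the caller can clear the canvas
// and show feedback; returns false on a wrong guess or when time is up.
function handlePrediction(game, predictedLabel, nowMs) {
  if (timeRemainingMs(game, nowMs) === 0) return false;
  if (predictedLabel !== currentPrompt(game)) return false;
  game.score += 1;
  game.promptIndex += 1;
  return true;
}
```

Passing the timestamp in as an argument, rather than reading the clock inside the functions, keeps the logic easy to test. In the browser you would pass `Date.now()` or `performance.now()`.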
Ready to build your own? Join the Open Source AI Game Jam (July 7-9, 2023)!