Hugging Face has unveiled a new open-source tool called AI vs AI, a deep reinforcement learning multi-agent competition system hosted on Spaces. The system enables multi-agent tournaments by combining a matchmaking algorithm, a results dataset, and an ELO-based leaderboard. Users can push trained models to the Hub, where they are automatically evaluated and ranked against others.
How AI vs AI Works
The tool uses the ELO rating system to provide a relative measure of skill. After each match, both players' ratings are updated based on the result and their pre-match ratings. A low-rated player beating a high-rated opponent produces a large rating shift, while an expected outcome barely moves either rating. Because each update is zero-sum (points gained by one player are lost by the other), the population's average ELO stays constant at 1200.
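As a concrete sketch of the update rule, here is a standard Elo update with a fixed K-factor (the value K = 32 is an illustrative assumption; the source does not state which K the system uses):

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Win probability of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Update both ratings after a match.

    score_a is 1.0 for an A win, 0.5 for a draw, 0.0 for a loss.
    The update is zero-sum, so the pool's average rating never drifts.
    K = 32 is an assumed constant, not confirmed by the source.
    """
    ea = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - ea)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - ea))
    return new_a, new_b
```

With K = 32, an upset (a 1200-rated player beating a 1400-rated one) shifts both ratings by about 24 points, while the favorite winning the same pairing shifts them by only about 8.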
Matchmaking follows a simple algorithm:
- Gather all available models from the Hub (new models start at 1200 ELO).
- Create a queue of models.
- Pop the first model in the queue and select another model with a similar rating.
- Simulate the match by loading both models into the environment (e.g., a Unity executable) and record results.
- Update both models' ELO ratings with the update formula.
- Continue until the queue is empty.
- Save ratings and repeat.
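The queue-based loop above might look like the following sketch. The pairing window, the function names, and the local `_update` helper are illustrative assumptions, not the actual implementation:

```python
import random


def _update(ra: float, rb: float, score_a: float, k: float = 32.0):
    """Standard zero-sum Elo update; score_a is 1.0 / 0.5 / 0.0 for model A.
    K = 32 is an assumed constant."""
    ea = 1.0 / (1.0 + 10 ** ((rb - ra) / 400))
    return ra + k * (score_a - ea), rb + k * ((1.0 - score_a) - (1.0 - ea))


def run_matchmaking_round(ratings: dict, play_match, window: float = 100.0) -> dict:
    """One pass over the model queue (hypothetical sketch).

    ratings:    model_id -> current ELO (new models would enter at 1200)
    play_match: callable(model_a, model_b) -> score for model_a, standing in
                for loading both models into the Unity executable
    """
    queue = list(ratings)
    random.shuffle(queue)
    while len(queue) >= 2:
        model_a = queue.pop(0)
        # Prefer an opponent whose rating is within `window` points.
        nearby = [m for m in queue if abs(ratings[m] - ratings[model_a]) <= window]
        pool = nearby or queue  # fall back to anyone if no close match exists
        model_b = min(pool, key=lambda m: abs(ratings[m] - ratings[model_a]))
        queue.remove(model_b)
        score_a = play_match(model_a, model_b)
        ratings[model_a], ratings[model_b] = _update(
            ratings[model_a], ratings[model_b], score_a
        )
    return ratings
```

Because every update is zero-sum, a full round leaves the pool's total (and hence average) rating unchanged.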
This process runs continuously on free Hugging Face Spaces hardware, using a Scheduler for background execution. A leaderboard displays the latest ELO ratings.
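The source does not show the Scheduler's internals; as an illustrative stand-in, continuous background execution can be sketched with Python's stdlib `sched` module (the interval and all names here are assumptions):

```python
import sched
import time


def schedule_rounds(run_round, interval_s: float, max_rounds=None) -> int:
    """Re-run the matchmaking round on a fixed interval (hypothetical sketch).

    run_round:  callable executing one full matchmaking pass
    max_rounds: stop after this many rounds; None means run indefinitely,
                as the Space does
    Returns the number of rounds executed.
    """
    scheduler = sched.scheduler(time.time, time.sleep)
    count = 0

    def tick():
        nonlocal count
        run_round()
        count += 1
        if max_rounds is None or count < max_rounds:
            # Re-schedule the next pass after the interval elapses.
            scheduler.enter(interval_s, 1, tick)

    scheduler.enter(0, 1, tick)
    scheduler.run()  # blocks, executing rounds until max_rounds is reached
    return count
```

Calling `schedule_rounds(run_round, interval_s=300)` would then keep simulating matches and refreshing ratings for as long as the Space stays up.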
First Challenge: SoccerTwos
The inaugural competition uses Unity ML-Agents' SoccerTwos environment, in which two teams of two agents each play soccer. Participants train their agents and push them to the Hub to be ranked. The challenge runs from February 1 to April 30, 2023, and is open to everyone; 48 models have been submitted so far.
ELO ratings are relative rather than absolute, but with a sufficiently large and diverse pool of models and enough matches, they become a reliable indicator of relative performance. Hugging Face plans to extend the system to other adversarial multi-agent settings.
For more details, visit the SoccerTwos challenge page.