Hugging Face has integrated Stable-Baselines3, a popular PyTorch library for deep reinforcement learning, into its Hub. This move enables researchers and enthusiasts to host, share, and load pretrained reinforcement learning models with ease.
Installation
To use the integration, install both libraries:

```bash
pip install huggingface_hub
pip install huggingface_sb3
```
Finding and Downloading Models
Models for environments like Space Invaders and CartPole are available. Browse the collection and copy a repository ID. Then load a model in Python:
```python
import gym

from huggingface_sb3 import load_from_hub
from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy

# Download the checkpoint from the Hub
checkpoint = load_from_hub(
    repo_id="sb3/demo-hf-CartPole-v1",
    filename="ppo-CartPole-v1.zip",
)
model = PPO.load(checkpoint)

# Evaluate the agent over 5 episodes
eval_env = gym.make("CartPole-v1")
mean_reward, std_reward = evaluate_policy(
    model, eval_env, render=True, n_eval_episodes=5, deterministic=True, warn=False
)
print(f"mean_reward={mean_reward:.2f} +/- {std_reward}")
```
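Under the hood, `evaluate_policy` returns the mean and the (population) standard deviation of the total reward collected over the `n_eval_episodes` rollouts. A minimal sketch of that aggregation, using made-up per-episode returns rather than real rollout data:

```python
import statistics

# Hypothetical total rewards from 5 evaluation episodes
episode_rewards = [500.0, 500.0, 487.0, 500.0, 463.0]

mean_reward = statistics.mean(episode_rewards)
# evaluate_policy uses the population standard deviation (NumPy's default)
std_reward = statistics.pstdev(episode_rewards)

print(f"mean_reward={mean_reward:.2f} +/- {std_reward}")
```

A mean near the environment's maximum episode return (500 for CartPole-v1) with a small standard deviation indicates a consistently strong policy.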
Sharing Your Models
After logging in with `huggingface-cli login`, train a model and push it to the Hub:
```python
from huggingface_sb3 import push_to_hub
from stable_baselines3 import PPO

# Train a PPO agent on CartPole-v1
model = PPO("MlpPolicy", "CartPole-v1", verbose=1)
model.learn(total_timesteps=10_000)

# Save the trained model to disk
model.save("ppo-CartPole-v1")

# Upload the saved checkpoint to the Hub
push_to_hub(
    repo_id="ThomasSimonini/demo-hf-CartPole-v1",
    filename="ppo-CartPole-v1.zip",
    commit_message="Added CartPole-v1 model trained with PPO",
)
```
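Note how the name passed to `model.save()` relates to the `filename` that gets pushed: Stable-Baselines3 appends a `.zip` extension when saving, so the file on disk is `ppo-CartPole-v1.zip`. A quick sketch of that convention (string manipulation only, no training):

```python
# Name passed to model.save()
save_name = "ppo-CartPole-v1"

# SB3 appends ".zip" when writing the checkpoint to disk
saved_file = save_name + ".zip"

# This must match the filename= argument given to push_to_hub
print(saved_file)  # ppo-CartPole-v1.zip
```

If the two names drift apart, `push_to_hub` will look for a file that does not exist, so keeping a single variable for the checkpoint name is a simple safeguard.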
What's Next?
Upcoming integrations include RL-baselines3-zoo, RL-trained-agents, and more deep RL libraries. A tutorial on training a lunar lander agent and sharing it on the Hub is also available.
Conclusion
Hugging Face looks forward to seeing the community's models and feedback. Thanks to the SB3 team for their collaboration.