Hugging Face and Baidu's PaddlePaddle have announced an open source collaboration aimed at democratizing AI. PaddlePaddle, first open sourced by Baidu in 2016, enables developers of all skill levels to implement deep learning at scale. As of Q4 2022, it is used by over 5.35 million developers and 200,000 enterprises, making it the leading deep learning platform in China by market share.
PaddlePaddle features popular repositories such as the Paddle deep learning framework, model libraries like PaddleOCR, PaddleDetection, PaddleNLP, and PaddleSpeech, as well as PaddleSlim for model compression and FastDeploy for deployment.
With PaddleNLP leading the way, PaddlePaddle will gradually integrate its libraries with the Hugging Face Hub. Users will soon be able to access a full suite of pre-trained PaddlePaddle models across text, image, audio, video, and multimodal tasks on the Hub.
Find PaddlePaddle Models
All PaddlePaddle models can be found on the Model Hub by filtering with the PaddlePaddle library tag. Over 75 models are already available, including the multi-task Information Extraction model series UIE, the state-of-the-art Chinese Language Model series ERNIE 3.0, and the novel document pre-training model ERNIE-Layout. Users can also explore Spaces such as the text-to-image model ERNIE-ViLG, the cross-modal Information Extraction engine UIE-X, and the multilingual OCR toolkit PaddleOCR.
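The same filter can also be applied programmatically. The following is a minimal sketch, not an official client: it assumes the public Hub listing endpoint at https://huggingface.co/api/models accepts a filter query parameter and that the library tag is spelled paddlepaddle. The helper names build_listing_url and list_model_ids are hypothetical and only used for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: public Hub endpoint that lists models as JSON.
HUB_API_MODELS = "https://huggingface.co/api/models"


def build_listing_url(tag="paddlepaddle", limit=10):
    """Return the URL that lists Hub models carrying the given tag.

    The tag value "paddlepaddle" is an assumption about how the
    PaddlePaddle library tag is spelled on the Hub.
    """
    query = urlencode({"filter": tag, "limit": limit})
    return f"{HUB_API_MODELS}?{query}"


def list_model_ids(tag="paddlepaddle", limit=10):
    """Fetch the listing (performs a network call) and return model ids."""
    with urlopen(build_listing_url(tag, limit)) as response:
        return [entry["id"] for entry in json.load(response)]
```

Calling list_model_ids() performs the HTTP request and returns repository ids; build_listing_url() alone only constructs the URL, which makes it easy to inspect before sending anything.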
Inference API and Widgets
PaddlePaddle models are accessible through the Inference API over HTTP, for example with cURL or Python's requests library. Models whose task is supported also come with an interactive widget for trying them directly in the browser.
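As a rough illustration of such an HTTP call, the sketch below builds a POST request with the Python standard library (urllib is used here instead of requests so the example has no dependencies). The endpoint root https://api-inference.huggingface.co/models, the {"inputs": ...} payload shape, and the bearer-token header are assumptions about the hosted Inference API, and the helper names build_request and query are hypothetical. Whether a given model is actually served depends on the Hub.

```python
import json
from urllib.request import Request, urlopen

# Assumption: root URL of the hosted Inference API.
API_ROOT = "https://api-inference.huggingface.co/models"


def build_request(model_id, inputs, token):
    """Build (but do not send) a POST request for one model.

    `token` is a Hugging Face access token supplied by the caller.
    """
    payload = json.dumps({"inputs": inputs}).encode("utf-8")
    return Request(
        f"{API_ROOT}/{model_id}",
        data=payload,  # supplying a body makes urllib issue a POST
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def query(model_id, inputs, token):
    """Send the request (network call) and decode the JSON response."""
    with urlopen(build_request(model_id, inputs, token)) as response:
        return json.load(response)
```

A call such as query("PaddlePaddle/ernie-3.0-base-zh", text, token) would send the request; the expected input format and the shape of the response depend on the model's task.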
Use Existing Models
To load a specific model, click "Use in paddlenlp" to get a working snippet. For example:
from paddlenlp.transformers import AutoTokenizer, AutoModelForMaskedLM
# from_hf_hub=True loads the files from the Hugging Face Hub repository
tokenizer = AutoTokenizer.from_pretrained("PaddlePaddle/ernie-3.0-base-zh", from_hf_hub=True)
model = AutoModelForMaskedLM.from_pretrained("PaddlePaddle/ernie-3.0-base-zh", from_hf_hub=True)
Share Models
Users can share PaddleNLP models by pushing to the Hub using the save_to_hf_hub method:
tokenizer.save_to_hf_hub(repo_id="<my_org_name>/<my_repo_name>")
model.save_to_hf_hub(repo_id="<my_org_name>/<my_repo_name>")
Conclusion
PaddlePaddle is an open source deep learning platform that grew out of industrial practice and has been open sourcing innovative projects since 2016. The collaboration with the Hugging Face Hub will bring these state-of-the-art projects to the community. Stay updated by following @PaddlePaddle on Twitter.