DailyGlimpse

Hugging Face Integrates DeepInfra as New Inference Provider for Cost-Effective AI Model Hosting

AI
April 30, 2026 · 1:46 AM

Hugging Face has announced the integration of DeepInfra as a new Inference Provider on its Hub, expanding the platform's serverless AI inference capabilities. DeepInfra, known for its cost-effective per-token pricing, offers access to over 100 models, including popular open-weight LLMs like DeepSeek V4, Kimi-K2.6, and GLM-5.1.

Initially, the integration supports conversational and text-generation tasks, with text-to-image, text-to-video, and embeddings planned to follow. Users can access DeepInfra-hosted models directly in the Hugging Face website UI or through the Python and JavaScript client SDKs, authenticating either with their own DeepInfra API key or with a Hugging Face token, in which case requests are routed through Hugging Face.

For billed requests, users are charged standard provider rates with no additional markup from Hugging Face. PRO subscribers receive $2 in monthly inference credits, which can be used across providers. The integration also works with popular agent harnesses like Pi, OpenCode, and Hermes Agents without extra configuration.

Feedback on the new provider is welcome via the Hugging Face discussions space.