Gradio, the popular Python library for creating machine learning demos, has introduced a new feature called Gradio-Lite. This serverless version allows developers to run Gradio applications entirely within the user's browser, eliminating the need for backend servers.
Traditionally, Gradio apps require a Python server to handle inference and serve the interface. With Gradio-Lite, the entire app runs client-side: the Python code is executed in the browser by Pyodide, a CPython runtime compiled to WebAssembly, and model inference can run in-browser as well (for example through libraries like Transformers.js). This means no hosting costs, no latency from server round trips, and stronger privacy, since user data never leaves the browser.
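As a sketch of what this looks like in practice (following the pattern in the Gradio-Lite documentation, which publishes the runtime as the `@gradio/lite` package on a CDN), a complete, server-free page is just an HTML file with the Python app inline:

```html
<html>
  <head>
    <!-- Load the Gradio-Lite runtime and styles from the jsDelivr CDN -->
    <script type="module" crossorigin
            src="https://cdn.jsdelivr.net/npm/@gradio/lite/dist/lite.js"></script>
    <link rel="stylesheet"
          href="https://cdn.jsdelivr.net/npm/@gradio/lite/dist/lite.css" />
  </head>
  <body>
    <!-- The Python source inside this tag is executed in the browser by Pyodide -->
    <gradio-lite>
import gradio as gr

def greet(name):
    return f"Hello, {name}!"

gr.Interface(fn=greet, inputs="textbox", outputs="textbox").launch()
    </gradio-lite>
  </body>
</html>
```

Opening this file in a browser renders the full interactive interface; no Python process runs anywhere except inside the page itself.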
Gradio-Lite is a good fit for lightweight models, educational demos, and prototyping. Interactive demos can be embedded in static hosts like GitHub Pages, or in pages such as Notion, without setting up a server. Because the app runs on Pyodide, dependencies are limited to packages with pure-Python wheels or builds available for the Pyodide runtime, and all computation uses the visitor's local hardware.
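Extra packages can be declared directly in the page. A hedged sketch, assuming the `<gradio-requirements>` tag described in the Gradio-Lite docs and using `python-dateutil` as an example of a dependency that ships a pure-Python wheel installable in Pyodide:

```html
<gradio-lite>
<gradio-requirements>
python-dateutil
</gradio-requirements>

import gradio as gr
from dateutil import parser

def to_iso(text):
    # Parse a free-form date string and return it in ISO 8601 format
    return parser.parse(text).isoformat()

gr.Interface(fn=to_iso, inputs="textbox", outputs="textbox").launch()
</gradio-lite>
```

The listed packages are fetched and installed in the browser at load time, which is also why heavyweight or native-extension dependencies may not be available.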
This development marks a significant step towards making machine learning more accessible and portable, especially for small-scale deployments where server overhead is a barrier.