Developers who want a public AI API without paying for cloud hosting can run a FastAPI server inside a Google Colab notebook and expose it through an Ngrok tunnel at no cost. This step-by-step guide walks through setting up the FastAPI application in Colab and publishing it at a secure public URL generated by Ngrok.
Step 1: Set up Google Colab
Open a new notebook in Google Colab and enable GPU acceleration if your AI model requires it. Mount Google Drive if you need persistent storage.
Step 2: Install FastAPI and Ngrok
In the first code cell, run:
!pip install fastapi uvicorn pyngrok
Sign up for a free Ngrok account and copy your auth token. Authenticate using:
!ngrok authtoken YOUR_AUTH_TOKEN
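As an alternative to pasting the token into a shell command, the token can be read from an environment variable and handed to pyngrok, which keeps it out of a shared notebook. The sketch below is one way to do that; the variable name `NGROK_AUTHTOKEN` and both helper names are assumptions for illustration, while `ngrok.set_auth_token` is pyngrok's own call.

```python
import os


def load_ngrok_token(env_var: str = "NGROK_AUTHTOKEN") -> str:
    """Return the ngrok auth token stored in an environment variable.

    The variable name is an assumption for this sketch; use whatever
    name you set in your own notebook.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} before opening a tunnel")
    return token


def authenticate_ngrok(env_var: str = "NGROK_AUTHTOKEN") -> None:
    """Register the token with pyngrok (imported lazily, so it must be installed)."""
    from pyngrok import ngrok

    ngrok.set_auth_token(load_ngrok_token(env_var))
```

In a cell, the variable would be set first (for example from `getpass.getpass()`), followed by a call to `authenticate_ngrok()`.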
Step 3: Write a simple FastAPI app
Create a Python cell defining your API endpoints. For example:
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def read_root():
    return {"Hello": "World"}

@app.post("/predict")
def predict(data: dict):
    # Your AI prediction logic here
    return {"prediction": "result"}
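The placeholder in `predict` can delegate to a plain function, which keeps the model logic testable outside the web server. The following is a minimal sketch with a stand-in "model"; the `text` field, the word lists, and the name `predict_from_payload` are hypothetical and not part of the example above.

```python
POSITIVE_WORDS = {"good", "great", "excellent"}
NEGATIVE_WORDS = {"bad", "poor", "terrible"}


def predict_from_payload(data: dict) -> dict:
    """Toy stand-in for a model: label text by counting keyword hits."""
    text = data.get("text")
    if not isinstance(text, str) or not text.strip():
        # Report bad input in the response body instead of raising
        return {"error": "expected a non-empty 'text' field"}

    words = text.lower().split()
    score = sum(w in POSITIVE_WORDS for w in words) - sum(
        w in NEGATIVE_WORDS for w in words
    )
    label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    return {"prediction": label, "score": score}
```

Inside the `/predict` handler, the body would then be `return predict_from_payload(data)`; a real model call would replace the keyword counting.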
Step 4: Expose the API with Ngrok
Start the FastAPI server on a local port (e.g., 8000) using uvicorn. In a notebook, run it in a background thread or process so the cell does not block. Then use the following code to create a public tunnel to that port:
from pyngrok import ngrok

# In current pyngrok, connect() returns an NgrokTunnel object; public_url holds the address
tunnel = ngrok.connect(8000)
print(f"Public API URL: {tunnel.public_url}")
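Because `uvicorn.run` blocks the cell it runs in, a notebook usually needs the server started in the background before the tunnel is useful. One possible approach, sketched below, runs a `uvicorn.Server` in a daemon thread and waits until the port accepts connections. The helper names `start_server_in_background` and `wait_for_port` are assumptions, and `uvicorn` is imported lazily, so it must already be installed.

```python
import socket
import threading
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0) -> bool:
    """Poll until something accepts TCP connections on host:port."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=0.5):
                return True
        except OSError:
            time.sleep(0.1)  # not listening yet; retry shortly
    return False


def start_server_in_background(app, port: int = 8000):
    """Run uvicorn in a daemon thread so the notebook cell returns."""
    import uvicorn  # lazy import: only needed when the server is started

    config = uvicorn.Config(app, host="127.0.0.1", port=port, log_level="warning")
    server = uvicorn.Server(config)
    thread = threading.Thread(target=server.run, daemon=True)
    thread.start()
    if not wait_for_port("127.0.0.1", port):
        raise RuntimeError(f"Server did not start on port {port}")
    return server
```

With the `app` from Step 3, `server = start_server_in_background(app, 8000)` would be called before `ngrok.connect(8000)`; setting `server.should_exit = True` asks uvicorn to shut down.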
That's it! Your AI API is now live and reachable from anywhere for as long as the Colab session and the tunnel stay up. This method is well suited to prototyping, testing, and sharing demos with collaborators, all without incurring hosting costs. Keep in mind that free Ngrok tunnels limit concurrent connections and session duration, but for development purposes this is a powerful solution.
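A collaborator can then call the endpoint from any machine with plain Python. The sketch below builds the POST request with the standard library; the helper names are assumptions, and the base URL is a placeholder, since the real address is whatever Ngrok printed for your session.

```python
import json
import urllib.request


def build_predict_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for the /predict endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_predict(base_url: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response."""
    request = build_predict_request(base_url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `call_predict(public_url, {"text": "hello"})` with the URL from Step 4 should return the dictionary produced by the `/predict` handler.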