SHAHRIAR LABS · Intelligence in Motion
    AI Engineering · June 1, 2026

    freelm: The Ultimate Free LLM API Gateway for Python

    Why we built freelm, the open-source Python client that pools six free-tier LLM providers behind one OpenAI-compatible call.

    Almost every app now needs an LLM, and tokens cost money. But there's a lot of free capacity if you know how to spread your requests. freelm is our solution.

    Definition: What is freelm?

    freelm (Noun): An open-source Python gateway developed by Shahriar Labs that pools free-tier providers—OpenRouter, Google AI Studio (Gemini), NVIDIA NIM, Groq, Cerebras, and Mistral—behind one fault-tolerant, OpenAI-compatible call.

    The Problem: Provider Lock-in and Rate Limits

    Each provider meters its free tier independently, so combining them adds their quotas together and raises your effective free throughput. But the operational burden is high: six SDKs, six auth schemes, and constantly changing free model IDs.
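As a rough illustration of how independent quotas combine, here is a minimal sketch. The provider names and per-minute numbers are placeholders invented for the example, not real limits from any provider, and `pooled_rpm` is a hypothetical helper rather than part of freelm.

```python
# Placeholder quotas (requests per minute); NOT real provider limits.
PLACEHOLDER_RPM = {"provider_a": 20, "provider_b": 30, "provider_c": 15}


def pooled_rpm(quotas: dict[str, int]) -> int:
    """Upper bound on pooled requests per minute when quotas are metered independently."""
    return sum(quotas.values())


best_single = max(PLACEHOLDER_RPM.values())
total = pooled_rpm(PLACEHOLDER_RPM)
print(f"best single provider: {best_single} rpm, pooled: {total} rpm")
```

The pooled figure is a ceiling: it is only reachable if requests are actually spread across every provider, which is the job of the gateway layer.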

    "We wanted to write our AI logic once and never worry about a provider going down," explains Shihab Shahriar Antor. "freelm collapses that complexity into one client."

    How It Works

    The gateway layer owns reliability:

    • Automatic Key Rotation: Give it multiple API keys; it balances load across them.
    • Cross-Provider Failover: On a 429 (rate limit) response, it puts the affected key on cooldown and rotates to the next provider automatically.
    • Live Model Discovery: It queries provider endpoints at runtime to find currently active :free models and caches the results to disk.
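The rotation and cooldown behavior described in the list above can be sketched roughly as follows. This is not freelm's actual implementation: `KeyPool`, `Key`, `RateLimited`, the 60-second cooldown, and the `send` callback are all hypothetical names and values chosen for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable


class RateLimited(Exception):
    """Stand-in for an HTTP 429 returned by a provider."""


@dataclass
class Key:
    provider: str
    value: str
    cooldown_until: float = float("-inf")  # ready immediately


@dataclass
class KeyPool:
    keys: list[Key]
    cooldown_seconds: float = 60.0  # assumed value, not a freelm default
    clock: Callable[[], float] = time.monotonic
    _next: int = 0

    def call(self, send: Callable[[Key, str], str], prompt: str) -> str:
        """Try each ready key once, round-robin, cooling any key that hits a 429."""
        n = len(self.keys)
        for offset in range(n):
            idx = (self._next + offset) % n
            key = self.keys[idx]
            if key.cooldown_until > self.clock():
                continue  # still cooling down, skip it
            try:
                result = send(key, prompt)
            except RateLimited:
                key.cooldown_until = self.clock() + self.cooldown_seconds
                continue  # fail over to the next key or provider
            self._next = (idx + 1) % n  # start after this key next time to spread load
            return result
        raise RuntimeError("all keys are rate limited or cooling down")


def fake_send(key: Key, prompt: str) -> str:
    """Stub transport: the first provider always answers 429."""
    if key.provider == "groq":
        raise RateLimited()
    return f"{key.provider}: ok"


pool = KeyPool([Key("groq", "k1"), Key("cerebras", "k2")])
print(pool.call(fake_send, "hi"))  # prints "cerebras: ok"
```

Injecting the clock keeps the cooldown logic testable without sleeping; a real gateway would also need to distinguish 429s from other errors and persist discovered model IDs, which this sketch leaves out.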

    Get Started

    Install the package via pip for Python, or npm for Node.js:

    # Python
    pip install freelm
    
    # Node.js
    npm install freelm

    Then call it in your Python code:

    import freelm
    
    llm = freelm.FreeLLM.from_env()
    print(llm.text("Explain black holes in one sentence."))

    Frequently Asked Questions (FAQ)

    Q: Is freelm really free?
    A: freelm itself is MIT-licensed and free. It runs on providers' free tiers, so actual limits depend on their quotas.

    Q: Which providers are supported?
    A: OpenRouter, Google Gemini, NVIDIA NIM, Groq, Cerebras, and Mistral.

    Q: Does it support streaming?
    A: Yes! Token streaming works out of the box and routes through the same failover logic.
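One way streaming can share failover logic is to switch providers only before the first token has been emitted. The sketch below illustrates that idea under stated assumptions; `stream_with_failover`, `RateLimited`, and the stub providers are hypothetical and do not reflect freelm's actual API or internals.

```python
from typing import Callable, Iterable, Iterator


class RateLimited(Exception):
    """Stand-in for an HTTP 429 returned by a provider."""


def stream_with_failover(
    providers: Iterable[Callable[[str], Iterator[str]]], prompt: str
) -> Iterator[str]:
    """Yield tokens from the first provider that starts streaming successfully."""
    for open_stream in providers:
        tokens = open_stream(prompt)
        try:
            first = next(tokens)  # assume a 429 surfaces before any token arrives
        except RateLimited:
            continue  # nothing emitted yet, so switching providers is safe
        except StopIteration:
            return  # provider returned an empty stream
        yield first
        yield from tokens
        return
    raise RuntimeError("every provider was rate limited")


def limited(prompt: str) -> Iterator[str]:
    """Stub provider that is always rate limited."""
    raise RateLimited()
    yield  # unreachable; makes this function a generator


def healthy(prompt: str) -> Iterator[str]:
    """Stub provider that streams a fixed answer."""
    yield from ["Black ", "holes ", "bend ", "light."]


print("".join(stream_with_failover([limited, healthy], "hi")))
```

This sketch does not handle a failure in the middle of a stream; recovering from that would require deciding whether to restart the answer or surface the error to the caller.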

    Summary

    By pooling free tiers and handling rate limits for you, freelm lets developers build AI features that keep working when a single provider is throttled or down, without a massive monthly bill. Available now on GitHub.