How We Build AI That Actually Works: The Agency Stack
Beyond chatbots. How Shahriar Labs orchestrates multi-agent systems to solve complex engineering problems autonomously.
Avoid LLM lock-in by routing across providers — DeepSeek, Gemini, Claude — behind one normalized interface with automatic fallback. Model quality and pricing shift quarterly. A codebase that calls openai.chat.completions.create() directly is one deprecation notice away from an emergency migration. The provider-agnostic pattern costs 30 minutes to set up and saves weeks when the model landscape shifts.
In 2025: GPT-4 pricing increased 3×. DeepSeek R1 launched at 95% lower cost than comparable models. Gemini 2.5 Flash became the fastest model for high-volume tasks. Every time a new frontier model ships, teams locked into a single provider spend two weeks on migration engineering where a provider-agnostic codebase needs only a config change.
Provider-specific features compound the problem. If your function-calling schema uses OpenAI's exact JSON format in 50 places, switching to Claude's tool use format is a full refactor. Keep provider-specific adapters at the edge.
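One way to keep tool definitions out of provider-specific formats is to store them in a neutral shape and convert at the edge. The sketch below assumes a hypothetical ToolSpec type and two hypothetical adapter functions. The output shapes follow the publicly documented OpenAI function-tool and Anthropic tool formats as generally understood, so check them against the current API references before relying on them.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class ToolSpec:
    """Provider-neutral tool definition (hypothetical type for this sketch)."""
    name: str
    description: str
    # JSON Schema describing the tool's arguments.
    parameters: dict[str, Any] = field(default_factory=dict)


def to_openai_tool(tool: ToolSpec) -> dict[str, Any]:
    """Edge adapter: wrap the neutral spec in OpenAI's function-tool envelope."""
    return {
        "type": "function",
        "function": {
            "name": tool.name,
            "description": tool.description,
            "parameters": tool.parameters,
        },
    }


def to_anthropic_tool(tool: ToolSpec) -> dict[str, Any]:
    """Edge adapter: Anthropic's format is flat and names the schema input_schema."""
    return {
        "name": tool.name,
        "description": tool.description,
        "input_schema": tool.parameters,
    }


# Example tool, defined once and converted per provider.
weather = ToolSpec(
    name="get_weather",
    description="Look up the current weather for a city.",
    parameters={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)
```

With this split, the 50 call sites reference ToolSpec only, and adding a provider means writing one more converter function.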
The architecture has three layers:
Application code calls a single llm.complete(prompt, options) interface. No provider names, no SDK imports.
LiteLLM implements this pattern as a drop-in OpenAI-compatible proxy. openrouter-free adds free-model routing on top of it for zero-cost fallback paths.
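A minimal sketch of the llm.complete(prompt, options) interface with automatic fallback might look like the following. The LLM class, the AllProvidersFailed error, and the two stub adapters are assumptions made for illustration, not the article's implementation. Real adapters would wrap each provider's SDK behind the same (prompt, options) signature.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# A provider adapter is any callable taking (prompt, options) and returning text.
ProviderFn = Callable[[str, dict], str]


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider raised an error."""


@dataclass
class LLM:
    # (name, adapter) pairs, tried in order. Reordering this list is the
    # "config change" that replaces a migration.
    providers: Sequence[tuple[str, ProviderFn]]

    def complete(self, prompt: str, options: Optional[dict] = None) -> str:
        errors: list[str] = []
        for name, call in self.providers:
            try:
                return call(prompt, options or {})
            except Exception as exc:  # broad on purpose in this sketch
                errors.append(f"{name}: {exc}")
        raise AllProvidersFailed("; ".join(errors))


# Stub adapters standing in for real SDK wrappers (hypothetical).
def flaky_primary(prompt: str, options: dict) -> str:
    raise TimeoutError("simulated outage")


def stub_fallback(prompt: str, options: dict) -> str:
    return f"[fallback] {prompt}"


llm = LLM(providers=[("primary", flaky_primary), ("fallback", stub_fallback)])
print(llm.complete("Summarize this ticket."))  # prints "[fallback] Summarize this ticket."
```

A production version would likely catch only retryable errors (timeouts, rate limits, 5xx responses) and add logging, but the calling code would stay the same.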
Some capabilities are worth coupling to a specific provider: Claude's extended thinking for multi-step reasoning, Gemini's long-context document analysis, GPT-4o's vision quality. The key is isolating these to named service functions — reasoningService.analyzeDocument(), not claude.messages.create() scattered everywhere. When you switch providers, you update one service, not the whole codebase.
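The named-service idea can be sketched as a small class that is the only place aware of which provider feature is in use. Here analyze_document mirrors the article's reasoningService.analyzeDocument() in Python naming. The CompletionClient protocol, the RecordingClient stub, and the "provider" and "reasoning" option keys are all hypothetical.

```python
from typing import Any, Protocol


class CompletionClient(Protocol):
    """Anything exposing complete(prompt, options); hypothetical interface."""

    def complete(self, prompt: str, options: dict[str, Any]) -> str: ...


class ReasoningService:
    """The single place that knows which provider capability is being used."""

    def __init__(self, client: CompletionClient) -> None:
        self._client = client

    def analyze_document(self, text: str, question: str) -> str:
        prompt = f"Document:\n{text}\n\nQuestion: {question}"
        # Made-up option keys for this sketch; a real adapter would translate
        # them into the provider-specific request settings.
        options = {"provider": "claude", "reasoning": "extended"}
        return self._client.complete(prompt, options)


class RecordingClient:
    """Test double that records calls instead of contacting a provider."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, dict[str, Any]]] = []

    def complete(self, prompt: str, options: dict[str, Any]) -> str:
        self.calls.append((prompt, options))
        return "stub answer"


client = RecordingClient()
service = ReasoningService(client)
answer = service.analyze_document("Q3 revenue rose 12%.", "What changed?")
```

Switching the reasoning provider then means editing the options inside analyze_document, while callers keep using the same method.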
Written by Shihab Shahriar Antor — AI Engineer & Founder of Shahriar Labs. Building LetX, QuantumSketch, and open-source AI agent skills.