9router: Route AI Coding Agents Through 40+ Providers with 3-Tier Fallback
If you’re running AI coding agents seriously, you’ve hit the economics wall. Claude Code burns through API credits. Cursor Pro has usage caps. Codex bills per token. The per-session cost adds up fast when you’re iterating on real codebases — especially once context rot forces expensive re-prompts.
9router is a local routing proxy that sits between your coding agent and the LLM providers, routing each request through a 3-tier fallback chain: Subscription → Cheap → Free. It has 13.4K GitHub stars and has been trending for good reason — it makes multi-provider routing dead simple.
How It Works
Install globally and start routing:
npm install -g 9router
9router start
The dashboard opens at localhost:20128. Point your coding tool at 9router’s local endpoint instead of directly at OpenAI/Anthropic/Google, and it handles the rest.
The 3-tier fallback logic is straightforward:
- Subscription tier — Your paid API keys (Anthropic, OpenAI, etc.). Used first when available and within budget limits.
- Cheap tier — Lower-cost providers like DeepSeek, Groq, Together AI. Falls back here when subscription quotas are exhausted.
- Free tier — Providers with free access like Kiro AI, OpenCode Free, and Vertex AI (which gives $300 in credits). The safety net that keeps your agent running even when you’ve burned through paid credits.
Each tier supports multi-account round-robin, so you can load-balance across multiple API keys per provider. If one key hits a rate limit, the next one picks up seamlessly.
Format Translation
This is where 9router gets genuinely useful beyond simple proxying. It translates between OpenAI, Claude, and Gemini API formats on the fly. Your tool speaks OpenAI format? 9router can route that request to Claude’s API or Gemini’s API without any configuration on the tool side.
This means you can use Claude Code’s interface but route certain requests through cheaper Gemini models, or use Cursor with providers it doesn’t natively support. The format translation covers 100+ models across 40+ providers.
RTK Token Saver
9router includes a built-in token compression feature called RTK (not to be confused with rtk the CLI proxy — though they’re complementary). 9router’s RTK specifically targets tool_result content in the conversation, compressing it by 20-40% before sending to the LLM.
The approach is different from standalone rtk, which intercepts shell commands at the OS level and compresses their output before it enters the agent’s context. 9router’s token saver operates at the API request level, compressing tool results that are already in the conversation history. You could actually run both — rtk compresses output at the shell, and 9router compresses what’s left at the API layer. Double compression for the token-conscious. If you’re also looking at MCP vs skills for token efficiency, the savings stack.
The Dashboard
The web dashboard at localhost:20128 gives you real-time visibility into:
- Request routing — which provider handled each request and why
- Token usage — per-provider, per-model breakdowns
- Cost tracking — estimated spend across all tiers
- Fallback events — when and why requests fell through to cheaper tiers
- Account status — rate limits, quotas, and health per API key
This is the kind of observability that’s painful to build yourself. When you’re juggling multiple providers and API keys, knowing where your tokens are going matters.
Free Provider Highlights
The free tier is surprisingly capable:
- Kiro AI — Amazon’s coding agent, accessible through 9router’s routing
- OpenCode Free — Free tier access to coding-optimized models
- Vertex AI — Google Cloud gives $300 in free credits, which goes a long way with Gemini models
For side projects or experimentation, you can run AI coding agents at literally zero cost by configuring only the free tier.
When to Use This
9router makes sense if you:
- Run multiple AI coding tools and want unified provider management
- Want automatic fallback when one provider is down or rate-limited
- Have multiple API keys and want round-robin load balancing
- Need to stretch a budget by mixing paid and free providers
- Want visibility into where your AI spend is actually going
It’s a routing and orchestration layer. It doesn’t change how your coding agent works — it changes how requests reach the LLM. That’s a clean separation of concerns.
Getting Started
npm install -g 9router
9router start
# Dashboard: http://localhost:20128
# Configure providers, add API keys, set fallback tiers
# Point your coding tool at 9router's local endpoint
The GitHub repo has detailed setup guides for Claude Code, Cursor, Codex, and Cline. Configuration is YAML-based and the defaults are sensible.
At 13.4K stars and actively maintained, 9router is one of the more mature tools in the AI coding infrastructure space. If you’re spending real money on AI coding and want more control over where those tokens go, it’s worth the 5-minute setup.
Related Reading
- rtk: A Rust CLI Proxy That Cuts AI Agent Token Usage 60-90% — Complementary shell-level compression
- Skills vs MCP: Token Efficiency for AI Agents — Another angle on reducing token waste
- Context Rot in AI Coding Agents — Why long sessions get expensive
- Personal AI Agents: OpenRouter Rankings 2026 — Comparing provider options
- Karpathy’s Four Rules for AI Coding Agents — Best practices for agent configuration