rtk: A Rust CLI Proxy That Cuts AI Agent Token Usage 60-90%

By Prahlad Menon. Published 2026-04-05.

Every AI coding session bleeds tokens on CLI noise. git status returns 2,000 tokens of output when the agent needs 200. pytest dumps 8,000 tokens of test output when the agent needs to know which tests failed. cat file.rs returns the entire file when the agent needs the function signatures.

None of this is the LLM’s fault. It’s a plumbing problem — raw CLI output is built for humans to scan, not for LLMs to reason over. rtk is a thin Rust proxy that fixes the plumbing.

What It Does

rtk intercepts shell commands before their output reaches your AI agent, applies four compression strategies, and returns a semantically equivalent but dramatically smaller result.

The four strategies:

Smart Filtering — removes comments, whitespace, boilerplate that carry no signal for the agent
Grouping — aggregates similar items (files by directory, errors by type) instead of listing each individually
Truncation — keeps relevant context, cuts redundancy
Deduplication — collapses repeated log lines with counts (“error X appeared 47 times” instead of 47 lines of error X)
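To make two of these strategies concrete, here is a minimal Python sketch of deduplication and grouping. It is not rtk's code (rtk is written in Rust and its internals are not shown here); the function names `dedupe_lines` and `group_by_directory` and the output formats are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict
from pathlib import PurePosixPath


def dedupe_lines(lines):
    """Collapse repeated lines into one entry with a count, in first-seen order."""
    # Counter is a dict subclass, so keys keep the order in which they first appeared.
    counts = Counter(lines)
    return [line if n == 1 else f"{line} (x{n})" for line, n in counts.items()]


def group_by_directory(paths):
    """List file names once per parent directory instead of one full path per line."""
    groups = defaultdict(list)
    for p in paths:
        path = PurePosixPath(p)
        groups[str(path.parent)].append(path.name)
    return [f"{d}/ ({len(names)}): {', '.join(names)}" for d, names in groups.items()]


if __name__ == "__main__":
    log = ["error X", "error X", "warning Y", "error X"]
    print(dedupe_lines(log))

    files = ["src/main.rs", "src/lib.rs", "tests/cli.rs"]
    print(group_by_directory(files))
```

On the sample input, the four log lines collapse to two entries ("error X (x3)" and "warning Y"), and the three paths collapse to one line per directory. Note that this sketch counts repeats anywhere in the input, not only adjacent ones; whether rtk does the same is not stated in the source.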

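The benchmark table from the repo, on a medium-sized TypeScript/Rust project. The Total row is the repo's own figure; the nine rows reproduced here sum to about 108,100 and 22,620 tokens, so the full table presumably includes commands not shown: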

Command	Frequency	Standard (tokens)	rtk (tokens)	Savings
`ls / tree`	10x	2,000	400	-80%
`cat / read`	20x	40,000	12,000	-70%
`grep / rg`	8x	16,000	3,200	-80%
`git status`	10x	3,000	600	-80%
`git diff`	5x	10,000	2,500	-75%
`git log`	5x	2,500	500	-80%
`git add/commit/push`	8x	1,600	120	-92%
`cargo test / npm test`	5x	25,000	2,500	-90%
`pytest`	4x	8,000	800	-90%
Total		~118,000	~23,900	-80%

Across a typical coding session, that’s 118K tokens → 24K tokens on shell operations alone. At Claude Sonnet pricing, that’s not trivial if you’re running agents at any volume.
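A quick way to sanity-check the table and put a rough dollar figure on it is to redo the arithmetic. The sketch below recomputes the per-row savings from the table's numbers and converts the reported session total into cost. The price of $3.00 per million input tokens is an assumption for illustration only; substitute whatever rate applies to your model and plan.

```python
# (standard_tokens, rtk_tokens) per command group, copied from the table above.
ROWS = {
    "ls / tree": (2_000, 400),
    "cat / read": (40_000, 12_000),
    "grep / rg": (16_000, 3_200),
    "git status": (3_000, 600),
    "git diff": (10_000, 2_500),
    "git log": (2_500, 500),
    "git add/commit/push": (1_600, 120),
    "cargo test / npm test": (25_000, 2_500),
    "pytest": (8_000, 800),
}

# ASSUMPTION: illustrative input price in USD per million tokens, not a quoted rate.
PRICE_PER_MTOK_USD = 3.00


def savings_pct(standard, compressed):
    """Percentage of tokens removed by compression."""
    return 100 * (standard - compressed) / standard


def cost_usd(tokens, price_per_mtok=PRICE_PER_MTOK_USD):
    """Cost of sending `tokens` input tokens at the given price per million."""
    return tokens / 1_000_000 * price_per_mtok


# Totals as reported in the table's Total row.
saved = 118_000 - 23_900

if __name__ == "__main__":
    for name, (std, rtk) in ROWS.items():
        print(f"{name:24s} {savings_pct(std, rtk):5.1f}%")
    print(f"saved per session: {saved:,} tokens, about ${cost_usd(saved):.2f}")
    print(f"saved over 1,000 sessions: about ${cost_usd(saved * 1_000):.2f}")
```

Under that assumed price, one session's savings are worth well under a dollar, which is why the number only starts to matter at volume.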

How the Integration Works

rtk installs a shell hook that transparently rewrites commands before execution. The agent issues git status, the hook silently rewrites it to rtk git status, and the agent receives compressed output. Claude never sees the rewrite — it just gets a smaller, cleaner result.

# Install for Claude Code (default)
rtk init -g

# Or for other tools
rtk init -g --gemini    # Gemini CLI
rtk init -g --codex     # Codex
rtk init --agent cursor # Cursor
rtk init --agent windsurf
rtk init --agent cline

One important caveat from the docs: the hook only rewrites Bash tool calls. Claude Code’s built-in tools (Read, Grep, Glob) don’t pass through the Bash hook. For those workflows, call rtk directly: `rtk read file.rs`, `rtk grep "pattern" .`, or `rtk find "*.rs" .` (the trailing `.` is the directory argument).
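The rewrite step itself is conceptually small. Here is a hedged Python sketch of the idea: look at the first word of a shell command and, if it is one the proxy handles, prefix it with `rtk`. The `REWRITABLE` allow-list and the `rewrite_command` function are hypothetical; the real hook's matching rules and command coverage are defined by rtk, not by this example.

```python
import shlex

# ASSUMPTION: a tiny hypothetical allow-list. The article confirms `rtk git ...` and
# `rtk grep ...`; the full set of commands rtk rewrites is not listed here.
REWRITABLE = {"git", "grep"}


def rewrite_command(command: str) -> str:
    """Prefix a supported command with `rtk`; return anything else unchanged."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar: do not guess, pass the command through.
        return command
    if not tokens or tokens[0] == "rtk" or tokens[0] not in REWRITABLE:
        return command
    return f"rtk {command.lstrip()}"


if __name__ == "__main__":
    for cmd in ["git status", "rtk git status", "make build"]:
        print(f"{cmd!r} -> {rewrite_command(cmd)!r}")
```

The sketch only inspects the first word, so a compound command such as `make && git status` would pass through untouched. It also leaves already-prefixed commands alone, which keeps the rewrite idempotent.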

Aggressive Compression Modes

Beyond the standard filtering, rtk has explicit compression levels:

# Read only function signatures, strip bodies entirely
rtk read file.rs -l aggressive

# 2-line heuristic summary of a file
rtk smart file.rs

The aggressive mode for file reading is the interesting one — for agents doing codebase navigation, getting signatures without bodies is often exactly the right level of detail. The agent understands what’s available without consuming the full implementation.
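To show what "signatures without bodies" means in practice, here is a rough Python approximation using a regular expression. It is a sketch under simplifying assumptions, not how rtk does it: it only handles `fn` signatures that fit on one line, and the names `FN_START` and `signatures_only` are made up for this example.

```python
import re

# Start of a Rust function item: optional visibility and qualifiers, then `fn name`.
FN_START = re.compile(
    r"^\s*(?:pub(?:\([^)]*\))?\s+)?(?:(?:const|async|unsafe)\s+)*fn\s+\w+"
)


def signatures_only(source: str) -> list[str]:
    """Return single-line `fn` signatures with their bodies dropped."""
    sigs = []
    for line in source.splitlines():
        if FN_START.match(line):
            # Keep the text before the body's opening brace, if it is on this line.
            sig = line.split("{", 1)[0].strip()
            sigs.append(sig.rstrip(";") + ";")
    return sigs


SAMPLE = """
/// Adds two numbers.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

impl Parser {
    pub(crate) async fn parse(&self, input: &str) -> Result<Ast, Error> {
        todo!()
    }
}
"""

if __name__ == "__main__":
    for sig in signatures_only(SAMPLE):
        print(sig)
```

For the sample, the agent would see two lines (`add` and `parse` with their parameter and return types) instead of eleven. A real implementation would need an actual parser to cope with multi-line signatures, `where` clauses, and macros.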

Single Binary, No Dependencies

rtk is a single Rust binary with zero runtime dependencies. Install it with `brew install rtk` on macOS or `curl -fsSL .../install.sh | sh` on Linux. It adds under 10ms of overhead per command.

One name collision warning from the repo: another crate named rtk (Rust Type Kit) exists on crates.io. If you install via `cargo install rtk` and `rtk gain` then fails, you have the wrong package. Use `cargo install --git https://github.com/rtk-ai/rtk` instead.
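If you provision machines with a script, the same check can be automated. The sketch below runs `rtk gain` and treats a non-zero exit code, or a missing binary, as "wrong or absent package". That mapping from exit code to diagnosis is an assumption based on the warning above, and `is_rtk_proxy` is a hypothetical helper name.

```python
import subprocess


def is_rtk_proxy(run=subprocess.run) -> bool:
    """Return True if `rtk gain` exits successfully, False if it fails or rtk is missing."""
    try:
        result = run(["rtk", "gain"], capture_output=True, text=True)
    except FileNotFoundError:
        return False  # no `rtk` binary on PATH at all
    return result.returncode == 0


if __name__ == "__main__":
    if is_rtk_proxy():
        print("rtk gain succeeded: this looks like the token-saving proxy")
    else:
        print("rtk gain failed or rtk is missing: reinstall from the GitHub repo")
```

The `run` parameter exists only so the check can be exercised without rtk installed, by passing in a stand-in for `subprocess.run`.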

The Bigger Problem It Addresses

This fits into a pattern worth naming: most AI coding context problems are solved at the application layer (better prompts, better retrieval, better chunking), but some are better solved at the infrastructure layer. rtk is infrastructure — it operates below the agent, transparently, on every command.

The framing from the repo is right: “Claude never sees the rewrite, it just gets compressed output.” That’s the correct abstraction. The agent doesn’t need to know how to be efficient about CLI output; the proxy handles it.

We’ve covered related approaches from different angles: SocratiCode reduces tokens by improving what the agent retrieves; GitNexus reduces tool calls by pre-computing dependency graphs. rtk reduces tokens on every CLI operation regardless of what the agent is doing. Complementary layers.

MIT license. Single binary. `brew install rtk`. GitHub: https://github.com/rtk-ai/rtk