soul.py v0.2.0: Modulizer - 50% Token Savings with Zero Infrastructure
Part of the soul.py ecosystem; see the original soul.py post for the full story on persistent AI memory.
The Problem: MEMORY.md Gets Big
soul.py stores memories in plain markdown. It's human-readable, git-versionable, and works everywhere. But as your agent accumulates knowledge, MEMORY.md grows: 10KB, 25KB, 50KB.
Every conversation loads the full file into context. At 25KB, that's ~6,000 tokens before you even ask a question. Most of it is irrelevant to the current query.
The old approach:
Query: "What tools have I used?"
→ Load full MEMORY.md (25KB, ~6,000 tokens)
→ 94% of tokens wasted on irrelevant context
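The ~6,000-token figure follows from the common heuristic of roughly 4 bytes per token for English prose; actual tokenizers vary by model, so treat this as a back-of-the-envelope check:

```python
def estimate_tokens(size_bytes: int, bytes_per_token: float = 4.0) -> int:
    """Rough token count for a text file, using the ~4 bytes/token
    heuristic for English prose. Real tokenizers vary by model."""
    return round(size_bytes / bytes_per_token)

estimate_tokens(25 * 1024)  # a 25KB MEMORY.md is roughly 6,400 tokens
```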
The Fix: Modulizer
soul.py v0.2.0 introduces Modulizer, a zero-dependency memory segmentation system inspired by progressive memory patterns.
pip install --upgrade soul-agent
soul modulize MEMORY.md
This creates:
modules/
├── INDEX.md (1.7KB, table of contents)
├── projects.md (6KB)
├── tools.md (2KB)
├── procedures.md (5KB)
├── learnings.md (2KB)
└── reference.md (9KB)
The new approach:
Query: "What tools have I used?"
→ Read INDEX.md (1.7KB)
→ LLM picks: tools.md, projects.md
→ Read only those (8KB instead of 25KB)
→ 47% average token savings
How It Works
Phase 1: Modulization (One-Time)
soul modulize MEMORY.md --output ./modules/
Behind the scenes:
- Chunker → splits markdown by headers
- Classifier → LLM categorizes each chunk (projects, tools, people, decisions, etc.)
- Splitter → groups chunks into modules
- Indexer → generates INDEX.md with summaries
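The Chunker step can be sketched in a few lines, assuming standard `#`-style markdown headers (a minimal sketch; the actual soul.py internals may differ):

```python
import re

def chunk_by_headers(markdown: str) -> list[dict]:
    """Split a markdown document into chunks, one per header section.
    Each chunk keeps its header line and the body lines under it."""
    chunks = []
    current = {"header": None, "lines": []}
    for line in markdown.splitlines():
        if re.match(r"^#{1,6}\s", line):  # a markdown header starts a new chunk
            if current["header"] or current["lines"]:
                chunks.append(current)
            current = {"header": line.strip(), "lines": []}
        else:
            current["lines"].append(line)
    if current["header"] or current["lines"]:
        chunks.append(current)
    return chunks
```

Each chunk then goes to the Classifier, which only needs the header and a few body lines to assign a category.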
Phase 2: Two-Phase Retrieval (Every Query)
from soul import Agent
agent = Agent(use_modules=True) # default
response = agent.ask("What tools have I used?")
stats = agent.get_memory_stats()
# {
# 'mode': 'modules',
# 'modules_read': ['tools.md', 'projects.md'],
# 'total_kb': 8.5,
# 'index_kb': 1.7
# }
The agent:
- Reads INDEX.md (always small)
- Asks the LLM: "Which modules are relevant to this query?"
- Reads only the selected modules
- Answers with full context, fewer tokens
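The steps above can be sketched as a small helper, with the LLM call stubbed out as a `pick_modules` callable (hypothetical names; not soul.py's actual internals):

```python
from pathlib import Path

def two_phase_read(query: str, modules_dir: str, pick_modules) -> str:
    """Phase 1: read the small INDEX.md. Phase 2: read only the modules
    the LLM deems relevant. `pick_modules(query, index)` stands in for
    the LLM call and returns a list of module filenames."""
    root = Path(modules_dir)
    index = (root / "INDEX.md").read_text()
    selected = pick_modules(query, index)  # e.g. ["tools.md", "projects.md"]
    return "\n\n".join((root / name).read_text() for name in selected)
```

The key property: the full memory file is never loaded; only the index (always small) plus the handful of modules the LLM selects.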
CLI Integration
# Modulize your memory
soul modulize MEMORY.md
# Auto-detect large files (>50KB)
soul modulize --auto
# Chat with modules (automatic)
soul chat
# Chat without modules
soul chat --no-modules
# View module stats during chat
> /modules
Modules in ./modules/:
  INDEX.md (1.7KB)
  projects.md (6.0KB)
  tools.md (2.0KB)
...
Last query used: tools.md, projects.md
Real-World Results
Tested on a 25KB MEMORY.md with 44 sections:
| Metric | Before | After | Savings |
|---|---|---|---|
| Memory read per query | 25KB | 13KB avg | 47% |
| INDEX size | N/A | 1.7KB | – |
| Module count | 1 file | 6 files | – |
| Categories | – | 7 unique | – |
For larger memory files (100KB+), expect 60-80% savings.
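The savings column is just the fraction of bytes the agent no longer reads: per query it is 1 − (index + selected modules) / full file. Here 11.3KB of selected modules is an assumed figure, chosen so that index + modules matches the 13KB average above:

```python
def byte_savings(full_kb: float, index_kb: float, selected_kb: float) -> float:
    """Fraction of memory bytes avoided per query: the agent reads the
    index plus the selected modules instead of the whole file."""
    return 1 - (index_kb + selected_kb) / full_kb

byte_savings(25, 1.7, 11.3)   # ~0.48 for the 13KB-average case in the table
byte_savings(100, 2.0, 20.0)  # larger files skip a bigger fraction
```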
Zero Infrastructure
Unlike RAG, Modulizer needs:
- No vector database (Qdrant, Pinecone, etc.)
- No embedding model or API
- No background services
Just your existing LLM provider. Works with Anthropic, OpenAI, Gemini, and Ollama.
When to Use Modulizer vs RAG
| Use Case | Modulizer | RAG (v2.0) |
|---|---|---|
| Memory size | 10-100KB | 100KB+ |
| Infrastructure | None | Qdrant + embeddings |
| Query style | Category-based | Semantic search |
| Setup time | 1 command | ~30 min |
| Offline/airgapped | Yes | Depends |
Modulizer is the "right tool" for solo agents with moderate memory. It's the 90% solution with 0% infrastructure.
RAG is better when you need true semantic search across massive knowledge bases.
Backwards Compatible
If you don't run soul modulize, everything works exactly as before:
- No modules? → Full MEMORY.md is loaded
- `--no-modules` flag? → Full MEMORY.md is loaded
- Modules exist? → Two-phase retrieval kicks in automatically
Your existing workflows are unaffected.
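The fallback amounts to a simple existence check, sketched here with hypothetical helper names (not soul.py's actual internals):

```python
from pathlib import Path

def load_memory(memory_file: str = "MEMORY.md",
                modules_dir: str = "modules",
                use_modules: bool = True) -> str:
    """Backwards-compatible load: use two-phase retrieval only when it
    is both enabled and available; otherwise read the full file, as in
    pre-v0.2.0 versions."""
    index = Path(modules_dir) / "INDEX.md"
    if use_modules and index.exists():
        return index.read_text()          # phase 1; module reads follow
    return Path(memory_file).read_text()  # old behavior, unchanged
```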
Try It Now
pip install --upgrade soul-agent
# If you have an existing memory file
soul modulize MEMORY.md
# Start chatting
soul chat
Check /modules during chat to see which modules are being used.
Links:
Modulizer was inspired by the progressive-memory pattern: scan an index first, fetch details on demand. We brought this to soul.py's file-based memory system.