Phantom: An AI Agent With Its Own Computer, Email, and Self-Rewriting Brain
Most AI agents are disposable. You open a chat, get an answer, close the tab. Next session: day one again. The context is gone, the work is gone, the agent doesn’t know your name.
Phantom takes a different approach: give the AI its own computer.
A dedicated VM where it installs software, runs 24/7, builds infrastructure, remembers what you told it last week, registers its own MCP tools, and — the part that makes this technically interesting — rewrites its own configuration after every session, with the rewrites validated by LLM judges.
What It Actually Does
Three production stories from the README, none of them mockups:
It built an analytics platform nobody asked for. A Phantom was asked to help with data analysis. It installed ClickHouse on its own VM, downloaded the full Hacker News dataset, loaded 28.7 million rows spanning 2007–2021, built an analytics dashboard with interactive charts, created a REST API, and registered that API as an MCP tool so future sessions (and other agents) could query it. Nobody asked it to do any of this beyond “help with data analysis.”
It added a communication channel it was never designed with. Phantom ships with Slack, Telegram, Email, and Webhook channels — not Discord. When asked “Can I talk to you on Discord?”, it replied honestly that Discord wasn’t wired up, then offered to build it. It walked the user through creating a Discord application, spun up the container, and went live on Discord. It permanently gained a capability it wasn’t born with.
It started monitoring its own infrastructure. A Phantom discovered Vigil, a 3-star open-source system monitor. It integrated Vigil into its ClickHouse instance, built a sync pipeline batching metrics every 30 seconds, and created a real-time dashboard showing service health, Docker container status, network I/O, disk I/O, and data pipeline health. 890,450 rows, 25 metrics, auto-refreshing. The agent is watching itself.
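The "batch every 30 seconds" part of that pipeline is a common pattern and easy to picture in code. What follows is a minimal sketch, not Phantom's or Vigil's actual implementation: the `MetricBatcher` name, the `sink` callback, and the injectable `clock` are all assumptions made for illustration. The sink stands in for whatever performs the bulk insert into ClickHouse.

```python
import time
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


class MetricBatcher:
    """Buffer metric rows and hand them to a sink once per interval.

    Hypothetical helper: `sink` stands in for a bulk insert into a database.
    """

    def __init__(
        self,
        sink: Callable[[List[Row]], None],
        interval_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._sink = sink
        self._interval_s = interval_s
        self._clock = clock
        self._buffer: List[Row] = []
        self._last_flush = clock()

    def add(self, row: Row) -> None:
        """Queue one row; flush if the interval has elapsed since the last flush."""
        self._buffer.append(row)
        if self._clock() - self._last_flush >= self._interval_s:
            self.flush()

    def flush(self) -> int:
        """Send all buffered rows as one batch and return how many were sent."""
        rows, self._buffer = self._buffer, []
        self._last_flush = self._clock()
        if rows:
            self._sink(rows)
        return len(rows)
```

Batching this way trades up to 30 seconds of dashboard latency for far fewer, larger inserts, which is generally the access pattern column stores like ClickHouse prefer.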
The Architecture
```bash
# Docker quick start
curl -fsSL https://raw.githubusercontent.com/ghostwright/phantom/main/docker-compose.user.yaml -o docker-compose.yaml
curl -fsSL https://raw.githubusercontent.com/ghostwright/phantom/main/.env.example -o .env
# Add ANTHROPIC_API_KEY, Slack tokens, OWNER_SLACK_USER_ID
docker compose up -d
```
On boot: Qdrant starts for vector memory, Ollama pulls the embedding model, the agent initializes and DMs you on Slack when ready.
Key components:
- Own VM — agent workspace isolated from your machine
- Qdrant — persistent vector memory across sessions
- MCP server — agent registers new tools it builds and exposes them to other agents
- Secure credential collection — magic links for token submission (no plaintext in chat)
- Email identity — own address, can send and receive
- Self-rewriting config — after every session, the agent updates its own system config; changes are validated by LLM judges before applying
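To make the credential-collection item concrete, here is a rough sketch of how a magic-link flow can work: the agent posts a single-use, expiring link in chat, and the secret itself is submitted out of band. This is an assumption about the general pattern, not Phantom's code. `MagicLinkStore`, the `/submit/` path, and the `phantom.example` host are hypothetical.

```python
import secrets
import time
from typing import Callable, Dict, Tuple


class MagicLinkStore:
    """Issue single-use, expiring links for submitting a secret out of band.

    Hypothetical sketch: a real deployment would serve the link over HTTPS
    and store submitted values encrypted rather than in a dict.
    """

    def __init__(
        self,
        base_url: str,
        ttl_s: float = 600.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._base_url = base_url.rstrip("/")
        self._ttl_s = ttl_s
        self._clock = clock
        self._pending: Dict[str, Tuple[str, float]] = {}  # token -> (name, expiry)
        self._secrets: Dict[str, str] = {}

    def issue(self, secret_name: str) -> str:
        """Return a link that is safe to post in chat; it carries no secret."""
        token = secrets.token_urlsafe(32)
        self._pending[token] = (secret_name, self._clock() + self._ttl_s)
        return f"{self._base_url}/submit/{token}"

    def redeem(self, token: str, value: str) -> bool:
        """Store the submitted value if the token is known and unexpired."""
        entry = self._pending.pop(token, None)  # popped first, so single use
        if entry is None:
            return False
        name, expires_at = entry
        if self._clock() > expires_at:
            return False
        self._secrets[name] = value
        return True

    def has(self, secret_name: str) -> bool:
        return secret_name in self._secrets
```

The point of the design is that the chat transcript only ever contains the link, which is useless once redeemed or expired.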
The self-rewriting config is the technically distinctive piece. Most agent frameworks have static system prompts. Phantom’s is dynamic: the agent reflects on what happened in the session and proposes edits to its own instructions, which are then evaluated for safety and coherence before they are committed. The aim is an agent that gets better at your specific job over time.
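The propose, judge, commit loop can be sketched in a few lines. This is an illustration under stated assumptions, not Phantom's implementation: `Verdict`, `apply_rewrite`, and the two judges are hypothetical names, and the judges here are deterministic stand-ins for what would be LLM calls in the real system. The sketch also assumes every judge must approve; the README summary above does not say how verdicts are combined.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Verdict:
    approve: bool
    reason: str


# A judge compares the current config with the proposed one.
Judge = Callable[[str, str], Verdict]


def keeps_safety_lines(current: str, proposed: str) -> Verdict:
    """Stand-in judge: reject rewrites that drop lines tagged [safety]."""
    kept = set(proposed.splitlines())
    missing = [
        line for line in current.splitlines()
        if line.startswith("[safety]") and line not in kept
    ]
    if missing:
        return Verdict(False, f"removed {len(missing)} safety line(s)")
    return Verdict(True, "safety lines preserved")


def bounded_growth(current: str, proposed: str) -> Verdict:
    """Stand-in judge: reject rewrites that more than double the config size."""
    if len(proposed) > 2 * max(len(current), 1):
        return Verdict(False, "config grew by more than 2x")
    return Verdict(True, "size within bounds")


def apply_rewrite(
    current: str,
    proposed: str,
    judges: Sequence[Judge],
    history: List[str],
) -> Tuple[str, List[Verdict]]:
    """Commit `proposed` only if every judge approves; otherwise keep `current`."""
    verdicts = [judge(current, proposed) for judge in judges]
    if verdicts and all(v.approve for v in verdicts):
        history.append(current)  # keep the old version so a rewrite can be rolled back
        return proposed, verdicts
    return current, verdicts
```

Keeping the previous config in `history` is a design choice worth copying regardless of how the judging works: a self-modifying system is much easier to trust when every change can be reverted.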
The soul.py Connection
Persistent state and identity that survive across sessions are precisely what soul.py was designed to enable at the memory layer. Phantom is the full-stack version: persistent memory (Qdrant), self-improving identity (config rewriting), and autonomous tool creation, all running on dedicated infrastructure.
The difference in philosophy: soul.py is a lightweight portable memory layer you embed in any agent. Phantom is an opinionated full deployment with its own VM, channels, and infrastructure. Different use cases, same underlying insight: agents need persistent state and identity across sessions to be genuinely useful.
The Safety Question
Phantom builds infrastructure on its own initiative. In the stories above it installed a database, spun up containers, registered an API as an MCP tool, and modified its own config, mostly without being asked. The README is honest about this: “This is not a chatbot.”
That’s a meaningful capability and a meaningful risk surface. The credential collection pattern (magic links rather than plaintext tokens in chat) is a thoughtful detail. The LLM-judge validation on config rewrites is another. But running a self-modifying agent on infrastructure with real credentials requires thinking carefully about what permissions you grant it and what it can reach.
For development and personal productivity use cases, the risk profile is manageable. For enterprise deployment, you’d want to think hard about network segmentation and blast radius before pointing a Phantom at production infrastructure.
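One way to picture "what it can reach" is an egress allowlist: the agent's outbound requests are only permitted to a short list of hosts. The sketch below is a hypothetical illustration, not a Phantom feature. The `is_allowed` function and the example host list are assumptions, and real segmentation belongs at the network layer (firewall rules, VPC policy), not only in application code.

```python
from typing import Iterable
from urllib.parse import urlsplit


def is_allowed(url: str, allowed_hosts: Iterable[str]) -> bool:
    """Return True only for HTTPS URLs whose host is on the allowlist.

    A host matches if it equals an allowed entry or is a subdomain of one.
    """
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower()
    return any(
        host == allowed or host.endswith("." + allowed)
        for allowed in (h.lower() for h in allowed_hosts)
    )


# Illustrative allowlist only; choose entries to match your own deployment.
ALLOWED = ["api.anthropic.com", "slack.com"]
```

Matching on `"." + allowed` rather than a bare suffix is deliberate: it accepts `hooks.slack.com` but rejects a lookalike such as `evilslack.com`.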
Repo: github.com/ghostwright/phantom — Apache 2.0, v0.18.2, 822 tests passing.