Strix: The Open-Source AI Hacker That Finds, Proves, and Fixes Your Vulnerabilities
A startup called Strix raised $117 million to build an AI-powered application hacker. The pitch: autonomous agents that attack your app the way a real penetration tester would — dynamically, adaptively, with proof that findings are real.
This week, the open-source version dropped.
Same concept. Same agent architecture. Free to run locally or in your CI/CD pipeline, with your own API key.
Repo: github.com/usestrix/strix
The Problem With Existing Security Tooling
Most security tools operate on the wrong layer. Scanners like Snyk and Trivy check your code and dependency manifests for known CVEs. They’re valuable (we’ve written about running Trivy across our own stack and found real vulnerabilities), but they have a fundamental limitation: they don’t run your application.
A vulnerability that exists in code isn’t necessarily exploitable in your deployment. Conversely, a deployment can be catastrophically exploitable at runtime through a misconfiguration even when the code itself is sound. Static analysis generates false positives for the first category and misses the second almost entirely.
The security industry’s answer has been manual penetration testing — human experts who actually attack your running application. It works. It’s also expensive ($30K–$150K per engagement), slow (weeks to complete), and point-in-time (your codebase keeps changing after the test).
Strix’s answer: autonomous agents that do what the human pentester does, at software speed, on every deploy.
How Strix Works
Strix deploys a team of AI agents equipped with a full offensive security toolkit. The key distinction from scanner-style tools: it validates findings with working proof-of-concepts before reporting them.
No PoC, no finding. In theory that means zero false positives: if Strix reports a vulnerability, it has already exploited it. In practice the guarantee is only as strong as the check that decides whether an exploit succeeded.
The Agent Toolkit
Each Strix agent has access to:
- Full HTTP proxy — intercepts and manipulates requests/responses, the same technique human pentesters use to find injection points and auth flaws
- Browser automation — multi-tab browser for testing XSS, CSRF, and authentication flows that require real browser interaction
- Terminal environments — interactive shells for command execution and exploit testing
- Python runtime — custom exploit development and validation
- Reconnaissance — automated OSINT and attack surface mapping
- Code analysis — both static and dynamic analysis capabilities
- Knowledge management — structured finding documentation that persists across runs
The agents collaborate: one maps the attack surface, another probes specific endpoints, a third writes and executes exploit code to validate a potential finding. If the exploit succeeds, the vulnerability is confirmed and documented with reproduction steps.
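The validate-before-report rule can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Strix’s actual code: the `Candidate` class, the PoC functions, and the toy in-memory app with its deliberate IDOR are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

# Toy in-memory "application" (hypothetical) with a deliberate IDOR:
# any user can read any invoice.
INVOICES = {1: {"owner": "alice", "total": 120}, 2: {"owner": "bob", "total": 75}}


def get_invoice(current_user: str, invoice_id: int) -> dict:
    # Bug: no check that the invoice belongs to current_user.
    return INVOICES[invoice_id]


@dataclass
class Candidate:
    """A suspected vulnerability plus a PoC that tries to exploit it."""
    title: str
    poc: Callable[[], bool]  # returns True only if the exploit actually worked


def idor_poc() -> bool:
    # Exploit attempt: alice requests bob's invoice.
    leaked = get_invoice("alice", 2)
    return leaked["owner"] != "alice"


def sqli_poc() -> bool:
    # The toy app has no SQL layer, so this attempt cannot succeed.
    return False


def confirmed_findings(candidates: List[Candidate]) -> List[str]:
    """Keep only candidates whose PoC succeeded: no PoC, no finding."""
    confirmed = []
    for candidate in candidates:
        try:
            if candidate.poc():
                confirmed.append(candidate.title)
        except Exception:
            # A crashing PoC proves nothing, so the candidate is dropped.
            continue
    return confirmed


report = confirmed_findings([
    Candidate("IDOR on invoice lookup", idor_poc),
    Candidate("SQL injection on invoice lookup", sqli_poc),
])
print(report)  # ['IDOR on invoice lookup']
```

Only the candidate with a working exploit reaches the report; the speculative SQL injection is discarded rather than listed as a possible issue.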
What It Finds
Strix targets a broad range of application-layer vulnerabilities:
- Access control — IDOR (insecure direct object references), privilege escalation, auth bypass
- Injection — SQL, NoSQL, command injection
- Server-side — SSRF, XXE, deserialization flaws
- Client-side — XSS, prototype pollution, DOM-based vulnerabilities
- Business logic — race conditions, workflow manipulation
- Authentication — JWT vulnerabilities, session management flaws
- Infrastructure — misconfigurations in cloud resources, containers, APIs
Auto-Fix as a PR
After finding and validating a vulnerability, Strix doesn’t just hand you a PDF report. It generates a fix and submits it as a ready-to-merge pull request. You review, approve, merge — the security loop closes without leaving your normal workflow.
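To make the idea concrete, here is a toy before-and-after for an access-control bug. It is an illustrative sketch only: the `INVOICES` data, both handler functions, and the `exploit_works` helper are hypothetical and not taken from Strix. Re-running the exploit against the patched code is simply one reasonable way a reviewer might sanity-check such a PR; the source does not say whether Strix does this itself.

```python
from typing import Callable, Dict

# Hypothetical data store for a toy app.
INVOICES: Dict[int, dict] = {
    1: {"owner": "alice", "total": 120},
    2: {"owner": "bob", "total": 75},
}


def get_invoice_vulnerable(current_user: str, invoice_id: int) -> dict:
    # Before: returns any invoice to any caller (IDOR).
    return INVOICES[invoice_id]


def get_invoice_fixed(current_user: str, invoice_id: int) -> dict:
    # After: the lookup also enforces ownership.
    invoice = INVOICES.get(invoice_id)
    if invoice is None or invoice["owner"] != current_user:
        raise PermissionError("invoice not found")
    return invoice


def exploit_works(handler: Callable[[str, int], dict]) -> bool:
    """Return True if alice can read bob's invoice through `handler`."""
    try:
        return handler("alice", 2)["owner"] == "bob"
    except PermissionError:
        return False


print(exploit_works(get_invoice_vulnerable))  # True
print(exploit_works(get_invoice_fixed))       # False
```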
CI/CD Integration
The new feature that makes this genuinely production-relevant is the CI/CD integration. Strix runs inside GitHub Actions (and other standard CI/CD pipelines) and can block pull requests when new vulnerabilities are detected.
# .github/workflows/strix-security.yml
# (step fragment; goes under a job's steps: list)
- name: Strix Security Scan
  uses: usestrix/strix-action@v1
  with:
    target: ./
    fail-on: high,critical
  env:
    STRIX_LLM: openai/gpt-4o
    LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
This shifts security left in a meaningful way. Instead of a quarterly pentest catching vulnerabilities that have been in production for months, every PR gets scanned before merge. The developer who introduced the bug is still in context and can fix it immediately.
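The gating behaviour implied by `fail-on: high,critical` can be sketched as follows. This is an assumption-laden illustration, not the action’s implementation: the finding dictionaries and their `severity` field are hypothetical, since the report schema is not described here.

```python
from typing import Iterable, Mapping


def should_fail(findings: Iterable[Mapping[str, str]], fail_on: str) -> bool:
    """Return True if any finding has a severity named in the fail_on list."""
    blocking = {s.strip().lower() for s in fail_on.split(",") if s.strip()}
    return any(f.get("severity", "").lower() in blocking for f in findings)


# Hypothetical findings from a scan of one pull request.
findings = [
    {"title": "Verbose error page", "severity": "low"},
    {"title": "IDOR on invoice lookup", "severity": "high"},
]

# A CI step would exit with this code; non-zero blocks the merge.
exit_code = 1 if should_fail(findings, "high,critical") else 0
print(exit_code)  # 1
```

Keeping the threshold configurable lets a team start by blocking only on critical findings and tighten the gate later.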
Getting Started
# Install (requires Docker running)
curl -sSL https://strix.ai/install | bash
# Configure your LLM provider
export STRIX_LLM="openai/gpt-4o"
export LLM_API_KEY="your-api-key"
# Run against your app
strix --target ./your-app-directory
Results land in strix_runs/<run-name> with validated findings, PoC reproduction steps, and suggested fixes. The first run pulls the sandbox Docker image automatically — subsequent runs are faster.
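Since each run gets its own directory, a small helper can locate the most recent one and list its files. This sketch assumes only what is stated above, namely that runs live in subdirectories of `strix_runs`. It makes no assumption about file names or formats inside a run, and `latest_run` is a hypothetical helper, not part of the Strix CLI.

```python
from pathlib import Path
from typing import Optional


def latest_run(root: Path) -> Optional[Path]:
    """Return the most recently modified run directory under root, if any."""
    if not root.is_dir():
        return None
    runs = [p for p in root.iterdir() if p.is_dir()]
    return max(runs, key=lambda p: p.stat().st_mtime, default=None)


run = latest_run(Path("strix_runs"))
if run is None:
    print("no runs found")
else:
    # List every file in the run so you can see what the scan produced.
    for path in sorted(run.rglob("*")):
        if path.is_file():
            print(path.relative_to(run))
```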
Supported LLM providers include OpenAI, Anthropic (Claude), Google Gemini, and others. Azure OpenAI endpoints work via the compatible API format.
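A quick preflight check can catch configuration mistakes before a scan starts spending tokens. The sketch below uses the two environment variables named above; the `provider/model` shape is inferred from the single `openai/gpt-4o` example rather than from a documented rule, and `config_problems` is a hypothetical helper.

```python
import os
from typing import List, Mapping


def config_problems(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems with the Strix-related settings in env."""
    problems = []
    provider, sep, model = env.get("STRIX_LLM", "").partition("/")
    if not (sep and provider and model):
        # Assumed format, based on the example value openai/gpt-4o.
        problems.append("STRIX_LLM should look like provider/model, e.g. openai/gpt-4o")
    if not env.get("LLM_API_KEY"):
        problems.append("LLM_API_KEY is not set")
    return problems


for problem in config_problems(os.environ):
    print(problem)
```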
How It Compares
| | Strix | Trivy/Snyk | Manual Pentest | PentAGI |
|---|---|---|---|---|
| Runs the app | ✅ | ❌ | ✅ | ✅ |
| Validates PoC | ✅ | ❌ | ✅ | ✅ |
| CI/CD native | ✅ | ✅ | ❌ | ❌ |
| Auto-fix PRs | ✅ | Partial | ❌ | ❌ |
| Speed | Hours | Minutes | Weeks | Hours |
| Cost | LLM tokens | Free/paid | $30K–$150K | LLM tokens |
| Open source | ✅ | ✅/❌ | N/A | ✅ |
The honest comparison: Strix and PentAGI occupy similar space — autonomous AI pentesting agents. Strix’s differentiators are the CI/CD-first design, the auto-fix PR workflow, and a more polished developer-facing CLI. PentAGI’s differentiator is the persistent knowledge graph (Graphiti + Neo4j) that learns across engagements.
For a team that wants security embedded in their development workflow rather than run periodically by a security team, Strix is the better fit.
The $117M Question
The funding context is worth addressing. When a startup raises $117M on an idea and open-sources the core, one of two things is happening: either the open-source version is a lead-gen tool for the enterprise product (the Datadog model), or the core is commoditizing while value moves elsewhere.
In Strix’s case, it’s clearly the former. The managed platform at app.strix.ai offers continuous monitoring, cloud and infrastructure scanning, team collaboration, and integrations with Jira and Linear — things that require persistent infrastructure rather than a local agent run. The open-source version is the engine; the SaaS is the complete workshop.
For individual developers and small teams: the open-source version is genuinely useful and free (beyond LLM costs). For security teams at larger organizations: the managed platform is the likely path.
Either way, the shift it represents is real: security testing is becoming a continuous, automated, AI-driven process rather than a periodic audit. Just as static analysis moved from manual code review to an automated CI/CD check, dynamic penetration testing is heading in the same direction.
The supply chain attack that hit LiteLLM through a compromised Trivy GitHub Action last week is exactly the kind of thing this generation of tools is designed to catch earlier. Not after the credentials are already in attacker hands.