AgentBreaker is not another tool shipping prompt fuzzing and calling it red teaming.
It probes the system first, identifies capabilities such as multi-turn behavior, tool use, and multimodal handling, then generates targeted payloads shaped to the surface it found. The result is a red-team engine that is built to get you outcomes, not just logs.
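The probe-then-target idea can be sketched in a few lines of Python. This is only an illustration: the `Surface` class, the capability flags, and the payload family names are hypothetical and are not taken from AgentBreaker's code.

```python
from __future__ import annotations

from dataclasses import dataclass


# Hypothetical capability flags; the real probe output format is not
# documented in this README.
@dataclass(frozen=True)
class Surface:
    multi_turn: bool = False
    tool_use: bool = False
    multimodal: bool = False


# Illustrative mapping from a detected capability to payload families.
# The family names are invented for this sketch.
PAYLOAD_FAMILIES = {
    "multi_turn": ["gradual-escalation"],
    "tool_use": ["tool-argument-injection"],
    "multimodal": ["image-embedded-instruction"],
}


def select_payload_families(surface: Surface) -> list[str]:
    """Return only the payload families that match detected capabilities."""
    selected = ["single-turn-baseline"]  # applies to every system
    for capability, families in PAYLOAD_FAMILIES.items():
        if getattr(surface, capability):
            selected.extend(families)
    return selected


print(select_payload_families(Surface(tool_use=True)))
```

A system that only exposes tool use gets the baseline plus tool-focused payloads, so no run time is spent on multimodal or multi-turn attacks it cannot receive.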
Most AI security testing is still too manual, too noisy, or too one-off.
AgentBreaker gives you:
- repeatable campaign runs
- structured evidence instead of one-off screenshots
- judge-, planner-, and generator-assisted workflows
- an operator control plane for launches and review
This repo now starts with a clean slate. It does not ship a bundled public seed corpus. What matters is what AgentBreaker has already been able to surface across real systems.
AgentBreaker is built to help teams uncover issues such as:
- system prompt leakage and hidden instruction disclosure
- jailbreak and policy bypass paths
- unsafe tool behavior and action chaining
- sensitive data exposure and retrieval abuse
- browser and API workflow weaknesses around agent execution
- weak refusal patterns that collapse under pressure
The public results corpus already demonstrates outcomes such as:
- resistance-level-1: completion-style prompt extraction that disclosed a protected flag
- promptairlines: structured JSON export that disclosed protected runtime values
- promptairlines: authority-override framing that yielded restricted coupon data
- promptairlines: multimodal injection flows that exfiltrated protected artifacts from uploaded content
- gpt-5.2: successful runs across jailbreak, prompt injection, tool misuse, and data exposure patterns
- gpt-5.4: successful runs across prompt injection, guardrail bypass, tool misuse, and prompt extraction
flowchart LR
A["Configure system"] --> B["Launch campaign"]
B --> C["Generate probes"]
C --> D["Execute and score"]
D --> E["Store evidence and results"]
E --> F["Review in control plane"]
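The middle of that flow (generate probes, execute and score, store evidence) might look roughly like the loop below. It is a minimal sketch, not AgentBreaker's campaign code: `Evidence`, `run_campaign`, `to_jsonl`, and the stub target and scorer are all hypothetical names made up for illustration.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass
from typing import Callable, Iterable


@dataclass
class Evidence:
    """One structured record per probe (hypothetical schema)."""
    probe: str
    response: str
    score: float


def run_campaign(
    probes: Iterable[str],
    execute: Callable[[str], str],
    score: Callable[[str, str], float],
) -> list[Evidence]:
    """Send each probe to the target, score the reply, keep the evidence."""
    results = []
    for probe in probes:
        response = execute(probe)
        results.append(Evidence(probe, response, score(probe, response)))
    return results


def to_jsonl(results: list[Evidence]) -> str:
    """Serialize evidence as JSON Lines so runs can be stored and compared."""
    return "\n".join(json.dumps(asdict(r)) for r in results)


# Stub target and scorer, standing in for a real system and judge.
def stub_target(probe: str) -> str:
    return "SECRET" if probe == "leak" else "ok"


def stub_score(probe: str, response: str) -> float:
    return 1.0 if "SECRET" in response else 0.0


print(to_jsonl(run_campaign(["hello", "leak"], stub_target, stub_score)))
```

Keeping the target and scorer as plain callables means the same loop can be re-run against a different system or judge, which is what makes a campaign repeatable.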
git clone https://github.com/kagexai/agentbreaker.git
cd agentbreaker
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
cp .env.example .env
agentbreaker validate --check-env
agentbreaker run <system-id> --loop
Open the control plane:
agentbreaker serve --port 1337
Then visit http://127.0.0.1:1337.
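If you want to script the validate-then-run sequence, a thin Python wrapper is one option. The sketch below uses only the `agentbreaker validate --check-env` and `agentbreaker run <system-id> --loop` commands shown in this README. It assumes, without confirmation from the source, that `validate` exits with a non-zero status when the check fails. The helper names are hypothetical.

```python
from __future__ import annotations

import subprocess
from typing import Callable


def build_commands(system_id: str) -> list[list[str]]:
    """Return the CLI invocations in the order they should run."""
    return [
        ["agentbreaker", "validate", "--check-env"],
        ["agentbreaker", "run", system_id, "--loop"],
    ]


def validate_then_run(system_id: str, runner: Callable = subprocess.run) -> None:
    """Run validation first; start the campaign only if it succeeds.

    Assumption: the CLI signals a failed validation with a non-zero exit
    code. With check=True, subprocess.run then raises CalledProcessError
    and the run command is never started.
    """
    for cmd in build_commands(system_id):
        runner(cmd, check=True)


# Example (not executed here): validate_then_run("my-system")
```

The `runner` parameter exists so the sequence can be dry-run or tested without invoking the real CLI.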
Run a campaign:
agentbreaker run <system-id> --loop
Validate config before a run:
agentbreaker validate --check-env
Inspect configured systems:
agentbreaker targets
Start the review surface:
agentbreaker serve --port 1337

- agentbreaker/cli.py - main CLI entrypoint
- agentbreaker/campaign.py - campaign loop and strategy selection
- agentbreaker/attack.py - payload construction
- agentbreaker/target.py - execution harness and scoring
- agentbreaker/control_plane.py - operator backend
- frontend/ - control plane frontend
- taxonomy/agentbreaker_taxonomy.yaml - strategy library
- target_config.yaml - system and model configuration
AgentBreaker is for authorized testing only. Do not run it against systems you do not own or do not have explicit permission to assess.
MIT. See LICENSE.