AgentBreaker

AgentBreaker is not another tool shipping prompt fuzzing and calling it red teaming.

It probes the system first to identify capabilities such as multi-turn behavior, tool use, and multimodal handling, then generates payloads targeted at the surface it found. The result is a red-team engine built to produce outcomes, not just logs.
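
As a rough illustration of the probe-then-target idea, the sketch below maps discovered capabilities to payload families. Everything in it is hypothetical: the `Capabilities` class, the family names, and `select_payload_families` are invented for this example and are not taken from the AgentBreaker codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    """Hypothetical summary of what a probe pass discovered about a target."""
    multi_turn: bool = False
    tool_use: bool = False
    multimodal: bool = False


# Hypothetical mapping from a discovered capability to payload families.
PAYLOAD_FAMILIES = {
    "multi_turn": ["gradual_escalation"],
    "tool_use": ["tool_argument_injection"],
    "multimodal": ["image_embedded_instruction"],
}


def select_payload_families(caps: Capabilities) -> list[str]:
    """Return the payload families that match the surface the probe found."""
    # Assumption: plain single-turn text payloads apply to every target.
    selected = ["direct_prompt_extraction"]
    for name, families in PAYLOAD_FAMILIES.items():
        if getattr(caps, name):
            selected.extend(families)
    return selected


if __name__ == "__main__":
    print(select_payload_families(Capabilities(tool_use=True)))
```

The point of the sketch is only the ordering: capability discovery comes first, and payload choice is a function of what was discovered.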

Why It Exists

Most AI security testing is still too manual, too noisy, or too one-off.

AgentBreaker gives you:

repeatable campaign runs
structured evidence instead of one-off screenshots
judge, planner, and generator assisted workflows
an operator control plane for launches and review
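
To make "structured evidence" concrete, here is a minimal sketch of what one evidence record could look like when written as a JSON line. The field names and the `EvidenceRecord` class are assumptions made for illustration; the actual storage schema used by AgentBreaker is not described here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class EvidenceRecord:
    """Hypothetical shape for one scored attempt; not AgentBreaker's schema."""
    system_id: str
    strategy: str
    payload: str
    response: str
    score: float


def to_jsonl_line(record: EvidenceRecord) -> str:
    """Serialize a record as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


if __name__ == "__main__":
    record = EvidenceRecord(
        system_id="demo-system",
        strategy="prompt_extraction",
        payload="example payload",
        response="example response",
        score=0.5,
    )
    print(to_jsonl_line(record))
```

A record like this can be diffed, filtered, and replayed across campaign runs, which a screenshot cannot.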

This repo starts from a clean slate: it does not ship a bundled public seed corpus. What matters is what AgentBreaker has already surfaced across real systems.

What It Can Surface

AgentBreaker is built to help teams uncover issues such as:

system prompt leakage and hidden instruction disclosure
jailbreak and policy bypass paths
unsafe tool behavior and action chaining
sensitive data exposure and retrieval abuse
browser and API workflow weaknesses around agent execution
weak refusal patterns that collapse under pressure

Public Showcase

The public results corpus already demonstrates outcomes such as:

resistance-level-1: completion-style prompt extraction that disclosed a protected flag
promptairlines: structured JSON export that disclosed protected runtime values
promptairlines: authority-override framing that yielded restricted coupon data
promptairlines: multimodal injection flows that exfiltrated protected artifacts from uploaded content
gpt-5.2: successful runs across jailbreak, prompt injection, tool misuse, and data exposure patterns
gpt-5.4: successful runs across prompt injection, guardrail bypass, tool misuse, and prompt extraction

See docs/results-showcase.md.

How It Flows

flowchart LR
  A["Configure system"] --> B["Launch campaign"]
  B --> C["Generate probes"]
  C --> D["Execute and score"]
  D --> E["Store evidence and results"]
  E --> F["Review in control plane"]
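
The generate, execute, score, and store stages in the flow above can be pictured as a simple loop. This is a minimal sketch under stated assumptions: `run_campaign` and its callback parameters are hypothetical names, and the real campaign loop in `agentbreaker/campaign.py` may be structured quite differently.

```python
from typing import Callable


def run_campaign(
    generate: Callable[[int], str],
    execute: Callable[[str], str],
    score: Callable[[str], float],
    rounds: int,
) -> list[dict]:
    """Run a fixed number of rounds and return one result row per round."""
    results = []
    for round_index in range(rounds):
        probe = generate(round_index)   # "Generate probes"
        response = execute(probe)       # "Execute ..."
        results.append(                 # "... and score", then store
            {
                "round": round_index,
                "probe": probe,
                "response": response,
                "score": score(response),
            }
        )
    return results


if __name__ == "__main__":
    # Stub callbacks stand in for a real generator, target, and judge.
    rows = run_campaign(
        generate=lambda i: f"probe-{i}",
        execute=lambda p: p.upper(),
        score=lambda r: float(len(r)),
        rounds=2,
    )
    for row in rows:
        print(row)
```

Passing the stages in as callables keeps the loop testable with stubs before it touches a live system.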

Quick Start

git clone https://github.com/kagexai/agentbreaker.git
cd agentbreaker

python3 -m venv .venv
source .venv/bin/activate
pip install -e .

cp .env.example .env
agentbreaker validate --check-env
agentbreaker run <system-id> --loop

Open the control plane:

agentbreaker serve --port 1337

Then visit http://127.0.0.1:1337.

Operator Paths

Run a campaign:

agentbreaker run <system-id> --loop

Validate config before a run:

agentbreaker validate --check-env

Inspect configured systems:

agentbreaker targets

Start the review surface:

agentbreaker serve --port 1337
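
If you want to script the validate-then-run path, a thin wrapper such as the one below is one option. It uses only the commands shown above; the helper names `build_commands` and `validate_then_run` are hypothetical, and the sketch assumes (without confirmation from this README) that `agentbreaker validate --check-env` exits with a non-zero status when validation fails.

```python
import subprocess
import sys


def build_commands(system_id: str) -> list[list[str]]:
    """Return the validate and run commands from this README, in order."""
    return [
        ["agentbreaker", "validate", "--check-env"],
        ["agentbreaker", "run", system_id, "--loop"],
    ]


def validate_then_run(system_id: str, runner=subprocess.run) -> int:
    """Run each command in turn and stop at the first non-zero exit code."""
    for command in build_commands(system_id):
        completed = runner(command)
        # Assumption: a failed validation is reported via the exit code.
        if completed.returncode != 0:
            return completed.returncode
    return 0


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: python run_checked.py <system-id>")
    sys.exit(validate_then_run(sys.argv[1]))
```

The `runner` parameter exists so the wrapper can be exercised with a fake in tests without launching a campaign.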

Core Files

agentbreaker/cli.py - main CLI entrypoint
agentbreaker/campaign.py - campaign loop and strategy selection
agentbreaker/attack.py - payload construction
agentbreaker/target.py - execution harness and scoring
agentbreaker/control_plane.py - operator backend
frontend/ - control plane frontend
taxonomy/agentbreaker_taxonomy.yaml - strategy library
target_config.yaml - system and model configuration

Safety

AgentBreaker is for authorized testing only. Do not run it against systems you do not own or do not have explicit permission to assess.

License

MIT. See LICENSE.
