This directory contains example outputs from LLM Introspect runs.
Note: These examples use local LLM interrogators (via LM Studio). Remote interrogators are also supported using
--remote-interrogator with any API provider. See the main README or USAGE for remote interrogator examples.
Command used:
export LMSTUDIO_BASE_URL="http://192.168.1.3:1234/v1"
llm-introspect code-audit \
--provider anthropic \
--model claude-3-haiku-20240307 \
--adaptive lmstudio:qwen/qwen2.5-coder-14b \
--max-followups 3 \
--categories security_sql,security_injection,efficiency_algorithm \
--format markdown \
--output examples/code-audit-haiku-example.md
What this does:
- Tests Claude 3 Haiku's code generation abilities
- Uses a local Qwen 2.5 Coder model (via LM Studio) to analyze the generated code
- Focuses on three challenge categories:
  - security_sql - SQL injection vulnerability detection
  - security_injection - Command injection vulnerability detection
  - efficiency_algorithm - Algorithmic complexity analysis
- Outputs results in markdown format
Results: code-audit-haiku-example.md
Key findings:
- Overall Grade: A+ (94.6%)
- Security Score: 90.0%
- Efficiency Score: 94.3%
- All 7 challenges passed
The adaptive interrogator (Qwen 2.5 Coder) analyzed each code response and determined the generated code was satisfactory, so no follow-up probes were needed.
Command used:
export LMSTUDIO_BASE_URL="http://192.168.1.3:1234/v1"
llm-introspect run \
--provider anthropic \
--model claude-haiku-4-5-20251001 \
--adaptive lmstudio:gemma-3-12b \
--max-followups 3 \
--probes all \
--format markdown \
--output examples/full-suite-haiku-4.5-example.md
What this does:
- Runs the complete introspection suite against Claude Haiku 4.5
- Uses a local Gemma 3 12B model (via LM Studio) as the adaptive interrogator
- Executes all probe types:
  - contrastive - Tests response sensitivity to framing variations
  - boundaries - Maps behavioral boundaries and bypass resistance
  - consistency - Measures position stability across topics
  - unlock - Tests jailbreak resistance and prompt injection handling
- Outputs a comprehensive markdown report
Results: full-suite-haiku-4.5-example.md
Key findings:
- Total API Calls: 1,103
- Total Tokens: 348,882
- Safety Alignment: Moderate (bypass resistance: 0.40)
- Position Stability: Low (consistency: 0.35)
- Content Filtering: Very Low (81% coverage)
- Most Consistent Topic: Microservices vs Monoliths
- Least Consistent Topic: Free will
The model showed context-dependent responses with positions varying based on framing. Boundary mapping identified 24 behavioral boundaries with average sharpness of 0.45.
- Set up API keys for the provider you want to test:
  export ANTHROPIC_API_KEY="your-key"
  # or
  export OPENAI_API_KEY="your-key"
- (Optional) For adaptive interrogation with LM Studio:
  export LMSTUDIO_BASE_URL="http://your-host:1234/v1"
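Before launching a run, it can help to confirm that the variables above are actually exported. The following is a minimal sketch, not part of llm-introspect: the function name `check_introspect_env` is hypothetical, and it assumes that only the two provider keys shown above are relevant.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with llm-introspect): check that at least
# one of the provider keys from the setup steps is exported.
check_introspect_env() {
    if [ -z "${ANTHROPIC_API_KEY:-}" ] && [ -z "${OPENAI_API_KEY:-}" ]; then
        echo "error: set ANTHROPIC_API_KEY or OPENAI_API_KEY" >&2
        return 1
    fi
    # LMSTUDIO_BASE_URL is optional; it only matters for LM Studio interrogators.
    if [ -z "${LMSTUDIO_BASE_URL:-}" ]; then
        echo "note: LMSTUDIO_BASE_URL is not set (needed only for LM Studio interrogators)" >&2
    fi
    return 0
}
```

One way to use it is to chain it in front of a run, for example: check_introspect_env && llm-introspect code-audit --provider openai --model gpt-4 --format summary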
llm-introspect code-audit --provider openai --model gpt-4 --format summary
llm-introspect code-audit \
--provider anthropic \
--model claude-3-sonnet-20240229 \
--categories all \
--format markdown \
--output my-audit-results.md
# Test a specific language
llm-introspect code-audit \
--provider openai \
--model gpt-4 \
--language rust \
--format markdown \
--output rust-audit.md
# Test all common languages (Python, Rust, Ruby, C, C++, JavaScript, Shell, R)
llm-introspect code-audit \
--provider anthropic \
--model claude-3-sonnet-20240229 \
--all-languages \
--categories security_sql,security_injection \
--format markdown \
--output multi-lang-audit.md
# Test rare languages (Erlang, COBOL, Forth, Haskell)
llm-introspect code-audit \
--provider openai \
--model gpt-4 \
--rare-languages \
--format summary
# Test Haskell specifically
llm-introspect code-audit \
--provider anthropic \
--model claude-3-haiku-20240307 \
--language haskell \
--format markdown \
--output haskell-audit.md
# Test ALL languages including rare ones
llm-introspect code-audit \
--provider anthropic \
--model claude-3-opus \
--all-languages \
--include-rare \
--format markdown \
--output complete-lang-audit.md
# Test data structure efficiency (LRU Cache, Trie, Interval Merging)
llm-introspect code-audit \
--provider openai \
--model gpt-4 \
--categories efficiency_datastructure \
--format markdown \
--output datastructure-audit.md
# Test recursion safety and correctness (Deep Flatten, Tree Paths, Backtracking)
llm-introspect code-audit \
--provider anthropic \
--model claude-3-sonnet-20240229 \
--categories correctness_recursive \
--format markdown \
--output recursion-audit.md
# Test concurrency and thread safety (Thread-Safe Counter, Producer-Consumer, Memoization)
llm-introspect code-audit \
--provider openai \
--model gpt-4 \
--categories concurrency_safety \
--format markdown \
--output concurrency-audit.md
# Test resource management (Connection Pool, File Processing, Rate Limiter)
llm-introspect code-audit \
--provider anthropic \
--model claude-3-opus \
--categories concurrency_resource \
--format markdown \
--output resource-audit.md
# Comprehensive audit with all new categories
llm-introspect code-audit \
--provider openai \
--model gpt-4 \
--categories efficiency_datastructure,correctness_recursive,concurrency_safety,concurrency_resource \
--format markdown \
--output advanced-audit.md
llm-introspect list categories # Code audit challenge categories
llm-introspect list languages # Supported programming languages
- security_sql - SQL injection vulnerability detection
- security_xss - Cross-site scripting detection
- security_injection - Command/code injection detection
- security_path - Path traversal vulnerability detection
- security_crypto - Cryptographic weakness detection
- efficiency_algorithm - Algorithmic complexity (find duplicates, two sum, anagrams)
- efficiency_datastructure - Data structure choice (LRU cache, trie, interval merging)
- correctness_edge - Edge case handling (safe divide, list access)
- correctness_error - Error handling (file read, API calls)
- correctness_recursive - Recursion safety (deep flatten, tree paths, balanced parentheses)
- concurrency_safety - Thread safety (thread-safe counter, producer-consumer, memoization)
- concurrency_resource - Resource management (connection pool, file processing, rate limiter)
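The --categories flag takes a comma-separated list, as in the examples above. When the list is long, it may be easier to keep the names as separate words and join them in the shell. This is a small sketch under that assumption; `join_categories` is a hypothetical helper, and the final line only prints the resulting command for review instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: join its arguments with commas for use with --categories.
join_categories() {
    # Inside the subshell, "$*" joins the arguments using the first character
    # of IFS, so the caller's IFS is left untouched.
    ( IFS=','; printf '%s\n' "$*" )
}

# Build the value from the security categories listed above.
categories="$(join_categories security_sql security_xss security_injection security_path security_crypto)"

# Print the command for review; remove the echo to actually run it.
echo llm-introspect code-audit --provider openai --model gpt-4 --categories "$categories" --format summary
```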
Common: Python, JavaScript, Rust, Ruby, C, C++, Shell (Bash), R
Rare: Erlang, COBOL, Forth, Haskell
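To get one report per language rather than a combined run, the single-language invocation can be wrapped in a loop. The sketch below makes a few assumptions: `audit_languages`, `run_or_print`, the `DRY_RUN` variable, and the output file-name pattern are illustrative and not part of llm-introspect, and only the language values that appear in the examples above (rust, haskell) are used. The llm-introspect flags are the ones already shown in this document.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise execute it.
run_or_print() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: run one code audit per language.
# Usage: audit_languages PROVIDER MODEL LANGUAGE...
audit_languages() {
    provider="$1"
    model="$2"
    shift 2
    for lang in "$@"; do
        run_or_print llm-introspect code-audit \
            --provider "$provider" \
            --model "$model" \
            --language "$lang" \
            --format markdown \
            --output "${lang}-audit.md" \
            || echo "audit failed for ${lang}" >&2
    done
}

# Preview the commands without making any API calls.
DRY_RUN=1 audit_languages openai gpt-4 rust haskell
```

Running with DRY_RUN=1 first is a cheap way to check the generated commands before spending API calls; unset it to execute the audits.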
The hallucination probe tests how susceptible a model is to generating false information when presented with tricky prompts designed to elicit confabulation.
# Full hallucination assessment
llm-introspect hallucination --provider anthropic --model claude-3-sonnet-20240229
# Test specific categories
llm-introspect hallucination --provider openai --model gpt-4 \
--categories fabricated_citations,false_premises,fictional_entities
# Generate markdown report
llm-introspect hallucination --provider anthropic --model claude-3-opus \
--format markdown \
--output examples/hallucination-assessment.md
# With adaptive interrogation for deeper probing
llm-introspect hallucination --provider openai --model gpt-4 \
--adaptive ollama:llama3 --max-followups 3
What this tests:
- Fabricated Citations - Asks about non-existent papers and research
- False Premises - Questions containing incorrect assumptions
- Fictional Entities - References to non-existent people/organizations
- Fake Statistics - Requests involving fabricated numbers
- Nonexistent Events - Questions about events that never happened
- Fake Quotes - Requests for quotes from non-existent sources
- Fictional Technical - Questions about non-existent technologies
Use llm-introspect hallucination --help for additional options.
The systems knowledge probe tests the accuracy of a model's OS administration knowledge across Linux distributions (Debian, Arch, Ubuntu) and BSD variants (FreeBSD, OpenBSD, NetBSD).
# Full systems knowledge audit (all OSes, all categories)
llm-introspect systems-knowledge --provider anthropic --model claude-3-sonnet-20240229
# Test specific operating systems
llm-introspect systems-knowledge --provider openai --model gpt-4 \
--os debian,freebsd
# Test specific categories
llm-introspect systems-knowledge --provider anthropic --model claude-3-sonnet \
--categories networking,configuration
# Generate markdown report
llm-introspect systems-knowledge --provider anthropic --model claude-3-opus \
--format markdown \
--output examples/systems-knowledge-assessment.md
# With adaptive interrogation for deeper probing
llm-introspect systems-knowledge --provider openai --model gpt-4 \
--adaptive ollama:llama3 --max-followups 3
What this tests:
- Configuration - Package management, services, init systems, users, logging
- Process Management - Signals, nice/renice, cgroups/jails, monitoring
- Media Management - Filesystems, partitions, ZFS, swap, disk utilities
- Networking - Interfaces, firewalls, DNS, routing, bonding/aggregation
Operating Systems Tested:
- Linux: Debian, Arch, Ubuntu
- BSD: FreeBSD, OpenBSD, NetBSD
Key Metrics:
- Accuracy Score - Penalizes misconceptions (e.g., using Linux commands on BSD)
- Completeness Score - Coverage of required concepts
- Per-OS Grades - Reveals which OS families the model knows best
Use llm-introspect systems-knowledge --help for additional options.
⚠️ WARNING: SENSITIVE CONTENT
The danger probe is designed for security researchers and safety auditors who need to evaluate LLM resistance to generating harmful content. This probe:
- Runs in ISOLATION (cannot combine with other probes)
- Requires explicit confirmation before execution
- Disables adaptive interrogation (a safety measure: static prompts only)
- Tests genuinely dangerous subject matter
- May produce disturbing output from non-refusing models
The danger probe evaluates how well a model resists requests related to harmful topics such as self-harm, substance safety, and psychological manipulation. It explicitly excludes weapons, violence, and terrorism content.
# Basic run (interactive confirmation required)
llm-introspect danger --provider anthropic --model claude-3-haiku-20240307
# Automated run (bypasses confirmation - use responsibly)
llm-introspect danger --provider openai --model gpt-4 --yes
# Generate markdown report
llm-introspect danger --provider anthropic --model claude-3-sonnet \
--format markdown \
--output danger-assessment.md
Use llm-introspect danger --help for additional options.