Gareth Roberts edited this page Jan 19, 2026 · 2 revisions

CLI

Common Commands

# Run a single experiment from a config
insidellms run experiment.yaml

# Run a cross-model behavioural harness
insidellms harness harness.yaml

# Rebuild summary/report from records only
insidellms report ./my_run

# Diff two runs (text or JSON)
insidellms diff ./baseline ./candidate --fail-on-regressions
insidellms diff ./baseline ./candidate --format json --output diff.json

# Quick smoke test
insidellms quicktest "What is 2 + 2?" --model dummy

# List available components
insidellms list models
insidellms list probes

# Benchmark or compare models
insidellms benchmark --models openai,anthropic --probes logic,bias
insidellms compare --models openai,anthropic --input "Explain gradient descent"

# Validate a config file
insidellms validate experiment.yaml
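
The diff command above lends itself to a CI regression gate. The following is a minimal sketch, not part of the tool itself: the function name regression_gate and the INSIDELLMS_BIN variable are hypothetical, and it assumes that insidellms diff exits with a non-zero status when --fail-on-regressions finds a regression.

```shell
#!/bin/sh
# Hypothetical override so the wrapper can be dry-run with a stub binary.
INSIDELLMS_BIN="${INSIDELLMS_BIN:-insidellms}"

# regression_gate BASELINE_DIR CANDIDATE_DIR
# Assumption: a non-zero exit status from `diff --fail-on-regressions`
# means the candidate run regressed relative to the baseline.
regression_gate() {
    baseline="$1"
    candidate="$2"
    if "$INSIDELLMS_BIN" diff "$baseline" "$candidate" --fail-on-regressions; then
        echo "no regressions detected"
    else
        status=$?
        echo "diff exited with status $status" >&2
        return "$status"
    fi
}
```

In a pipeline this would be called as regression_gate ./baseline ./candidate, with the step failing whenever the function returns non-zero.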

Helpful Flags

  • --verbose: show progress and tracebacks
  • --output: save the results of run
  • --output-dir: choose where harness outputs are written
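
As a rough illustration of how these flags combine with the commands above, the sketch below validates a config before running it. The function name validate_and_run and the INSIDELLMS_BIN variable are hypothetical. It also assumes that validate exits non-zero on an invalid config, that --output takes a file path, and that run accepts --verbose; check the CLI help to confirm.

```shell
#!/bin/sh
# Hypothetical override so the wrapper can be dry-run with a stub binary.
INSIDELLMS_BIN="${INSIDELLMS_BIN:-insidellms}"

# validate_and_run CONFIG RESULTS_PATH
# Assumptions: `validate` fails with a non-zero status on a bad config,
# and `--output` expects a path for the saved results.
validate_and_run() {
    config="$1"
    results="$2"
    # Stop early if the config does not validate.
    "$INSIDELLMS_BIN" validate "$config" || return 1
    "$INSIDELLMS_BIN" run "$config" --output "$results" --verbose
}
```

Setting INSIDELLMS_BIN=echo prints the commands instead of executing them, which is a cheap way to check the wiring.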
