AI-First operating system for running a small company with Markdown source of truth, agent prompts, and an agent control plane: deny-by-default permissions, runtime governance, telemetry, and a feedback loop for safe delegation.
This repository packages the reusable public shell of an AI-first company operating system:
- Live operating state stays local and out of git history
- `05 AI Control Plane/schemas/` holds reusable schemas for the local control plane
- `agents/` contains role prompts for specialized execution
- `skills/` contains reusable callable skills for slash-style invocation and UI surfacing
- `docs/` contains public-safe architecture notes and project briefs
- `scripts/` provides validation, telemetry, runtime helpers, the additive mission runtime nucleus, and publication guardrails
- `.hq/` is reserved for private local runtime artifacts and live operating state; it must never enter public git history
- large work should use a private `.hq/specs/` packet plus `.hq/handoffs/` continuity instead of reopening the whole repo context in each new chat
- AI-first control plane for governed delegation
- additive `Mission`/`Run`/`Step`/`Approval` state nucleus for durable local execution
- append-only runtime event log with a schema-validated envelope: actor, correlation/causation ids, and privacy class for causal audit
- optional per-run execution budgets (`--max-steps`, `--max-failed-steps`) that stop runaway step loops before they burn the session
- deny-by-default permission model with a `can()` decision function and capability grants
- pre-action approval gates with approval-as-continuation and RESUMED-state handling
- byte-stable, secret-scrubbed run receipts for an audit trail
- release gate with required-evidence checks and diff-based permission-expansion enforcement
- short feedback loop: per-run receipts, baseline/best metrics, and a review-due trigger
- role-based agent prompts
- shared role-prompt skeleton with generator-backed normalization
- reusable skills with UI metadata
- machine-readable workflow and policy layer
- telemetry and runtime helper scripts
- public-safety validation for GitHub publication
- allowlist-based publication guardrails for GitHub
- `AGENTS.md`: shared repository rules
- `agents/*/AGENTS.md`: role-specific prompts
- `skills/*/SKILL.md`: reusable skill definitions
- `05 AI Control Plane/schemas/`: reusable schema definitions
- `docs/`: public-safe architecture and project documentation
- `scripts/`: validation, runtime, telemetry, and publication-safety tools
- `tests/`: automated coverage for core behavior
- `scripts/hq_role_prompt_scaffold.py`: shared role-prompt generator for `agents/*/AGENTS.md`
- `scripts/hq_private_prompt_lint.py`: local lint for private `.hq/prompts/`
- `scripts/hq_permissions.py`: deny-by-default `can()` authorization decisions
- `scripts/hq_policy_hooks.py`: pre-action gates, approval checkpoints, and run receipts
- `scripts/hq_feedback_loop.py`: per-run feedback receipts, metrics, and the review-due trigger
- `scripts/hq_gate.py`: release gate with required-evidence and permission-expansion checks
- `scripts/hq_reference_scan.py`: reference-pattern analysis with a dynamic identifier scan
python3 -m venv .venv && source .venv/bin/activate # needed when system Python is externally managed (PEP 668)
python3 -m pip install -r requirements-dev.txt
python3 scripts/hq_runtime.py bootstrap
python3 scripts/hq_mission_runtime.py init
python3 scripts/hq_control_plane.py validate
python3 scripts/hq_control_plane.py sync
python3 scripts/hq_control_plane.py status
python3 scripts/hq_runtime.py spec --task "Example large task" --goal "Define the next narrow slice"
python3 scripts/hq_role_prompt_scaffold.py --check
python3 -m unittest discover tests
python3 scripts/hq_public_safety.py
python3 scripts/hq_private_prompt_lint.py
python3 scripts/hq_gate.py

`python3 scripts/hq_runtime.py bootstrap` creates the local-only HQ scaffold that is intentionally not published to GitHub: `now.md`, `projects.md`, planning/notes/project pages, and the local control-plane JSON files.
python3 scripts/hq_runtime.py bootstrap
python3 scripts/hq_mission_runtime.py init
python3 scripts/hq_control_plane.py status
python3 scripts/hq_runtime.py spec --task "Example large task"
python3 scripts/hq_runtime.py handoff --task "Example large task" --spec-file .hq/specs/example-large-task/LATEST.md
python3 scripts/hq_control_plane.py validate
python3 scripts/hq_control_plane.py sync
python3 scripts/hq_role_prompt_scaffold.py --check
python3 scripts/hq_permissions.py check --agent ai_operations_lead --action update_task_state --scope hq:control-plane/task-board
python3 scripts/hq_feedback_loop.py summary
python3 scripts/hq_feedback_loop.py review-status
python3 -m unittest discover tests
python3 scripts/hq_public_safety.py
python3 scripts/hq_private_prompt_lint.py
python3 scripts/hq_gate.py

The supported local test runner is `python3 -m unittest discover tests`. pytest is not a required project dependency or part of the official local/CI gate.
Start a new session with python3 scripts/hq_control_plane.py status. That command writes .hq/state/session-bootstrap.json and prints a compact live-state projection with:
- one `startup_focus` task with the current primary move
- up to two adjacent `support_tracks`, preferring the same project corridor before cross-project fallback
- active tasks without `done`
- blocked tasks with a short reason
- the current `next_step` per live task
- stale spec/handoff signals in a separate block
- the recommended next command for the next slice
Use python3 scripts/hq_control_plane.py status --json when another script or tool needs the same projection in machine-readable form.
The same run also refreshes .hq/state/memory-index.json as a smaller startup capsule for future runtime consumers.
Use spec for large or ambiguous work. The spec is a private, task-scoped context packet under .hq/specs/ so the next chat can read the narrow brief first instead of reloading broad bootstrap context. Use handoff to capture execution continuity, blockers, and next steps around that spec.
scripts/hq_runtime.py remains the compatibility surface for bootstrap, spec, and handoff helpers. scripts/hq_mission_runtime.py is the additive runtime nucleus for first-class Mission, Run, Step, Approval, and Artifact records; it should grow before any deep rewrite of the older helper surface.
Every runtime mutation appends an event to .hq/state/mission-runtime/events/ using the runtime-event schema envelope (actor, correlation_id, causation_id, privacy_class), so a run's history can be read causally instead of as flat log lines. start-run accepts optional --max-steps and --max-failed-steps budgets; once a budget is exhausted, checkpoint-step refuses new steps with a run_budget_exhausted event, and the run must be finished, interrupted, or restarted with a higher budget.
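As a rough illustration of the envelope and budget behavior described above, the following in-memory sketch appends events carrying `actor`, `correlation_id`, `causation_id`, and `privacy_class`, and refuses new steps once a budget is used up. It is not the code in `scripts/hq_mission_runtime.py`: the `Run` class, the `event_id` and `type` fields, the `step_checkpointed` event name, and the "exhausted when the count reaches the limit" rule are all assumptions made for the example.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


def make_event(event_type, actor, correlation_id, causation_id=None,
               privacy_class="internal"):
    """Build one event with the envelope fields named in the text."""
    return {
        "event_id": uuid.uuid4().hex,      # hypothetical field
        "type": event_type,                # hypothetical field
        "actor": actor,
        "correlation_id": correlation_id,  # shared by all events of one run
        "causation_id": causation_id,      # id of the event that led to this one
        "privacy_class": privacy_class,
    }


@dataclass
class Run:
    """Hypothetical in-memory stand-in for a run with optional budgets."""
    run_id: str
    max_steps: Optional[int] = None
    max_failed_steps: Optional[int] = None
    steps: int = 0
    failed_steps: int = 0
    events: list = field(default_factory=list)  # append-only log

    def _append(self, event_type, causation_id=None):
        event = make_event(event_type, "runtime", self.run_id, causation_id)
        self.events.append(event)
        return event

    def budget_exhausted(self):
        # Assumption: a budget counts as exhausted once the limit is reached.
        if self.max_steps is not None and self.steps >= self.max_steps:
            return True
        return (self.max_failed_steps is not None
                and self.failed_steps >= self.max_failed_steps)

    def checkpoint_step(self, failed=False):
        """Record a step, or refuse it with a run_budget_exhausted event."""
        last_id = self.events[-1]["event_id"] if self.events else None
        if self.budget_exhausted():
            self._append("run_budget_exhausted", last_id)
            return False
        self.steps += 1
        if failed:
            self.failed_steps += 1
        self._append("step_checkpointed", last_id)
        return True


def causal_chain(events, event_id):
    """Follow causation_id links back to the start of the chain."""
    by_id = {e["event_id"]: e for e in events}
    chain = []
    while event_id is not None:
        event = by_id[event_id]
        chain.append(event["type"])
        event_id = event["causation_id"]
    return list(reversed(chain))


run = Run(run_id="run-1", max_steps=2)
print([run.checkpoint_step() for _ in range(3)])  # [True, True, False]
print(causal_chain(run.events, run.events[-1]["event_id"]))
```

Because each event points at the one that caused it, the refused step can be traced back through the two accepted steps rather than inferred from log order.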
Tracked role prompts are generated from the shared skeleton. The generated prompts now include a short Quick Start plus split Always Read / Read When Needed paths, so update scripts/hq_role_prompt_scaffold.py and regenerate instead of hand-editing agents/*/AGENTS.md. After changing the scaffold, run python3 scripts/hq_role_prompt_scaffold.py --write and then python3 scripts/hq_role_prompt_scaffold.py --check.
When .hq/prompts/ exists locally, run python3 scripts/hq_private_prompt_lint.py to catch broken local paths, invalid absolute references, and weak audit-prompt feedback loops before relying on those prompts in a new session.
Agent authority is least-privilege and policy-as-code. The agent registry carries capability_grants, and 05 AI Control Plane/permission-grants.json holds the grant/deny table (live data files are gitignored, so CI bootstrap samples mirror them).
- `python3 scripts/hq_permissions.py check ...` evaluates a single request through `can()`: deny-by-default, fixed deny precedence (`explicit_deny` > `invalid_input` > `too_broad_scope` > `unsatisfied_approval` > `no_matching_grant`), scope-hierarchy inheritance, and a frozen `Decision`.
- Pre-action gates in `scripts/hq_policy_hooks.py` enforce approval checkpoints, approval-as-continuation, and `RESUMED`-state handling before risky steps run.
- Every governed step can emit a byte-stable, secret- and timestamp-scrubbed run receipt under `.hq/receipts/` for an audit trail.
- `python3 scripts/hq_gate.py` runs the release gate: required-evidence checks plus a redundant diff-based permission-expansion enforcement layer that requires a recorded human approval for any widened grant.
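The deny precedence listed above can be pictured with a small sketch. This is an illustration under stated assumptions, not the code in `scripts/hq_permissions.py`: the `Rule` shape, the `approved` flag, treating a `*` in the requested scope as "too broad", and `/`-separated scope inheritance are all hypothetical. Only the reason names, their order, and the frozen `Decision` come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """Immutable result of one authorization check."""
    allowed: bool
    reason: str


@dataclass(frozen=True)
class Rule:
    """Hypothetical grant or deny entry."""
    agent: str
    action: str
    scope: str
    requires_approval: bool = False


def covers(rule_scope, requested_scope):
    """A rule on a parent scope is inherited by its child scopes."""
    return (requested_scope == rule_scope
            or requested_scope.startswith(rule_scope.rstrip("/") + "/"))


def can(agent, action, scope, grants, denies, approved=False):
    """Deny by default; the first failing check in the fixed order wins."""
    valid = all(isinstance(v, str) and v for v in (agent, action, scope))

    def matches(rule):
        return (valid and rule.agent == agent and rule.action == action
                and covers(rule.scope, scope))

    if any(matches(r) for r in denies):
        return Decision(False, "explicit_deny")
    if not valid:
        return Decision(False, "invalid_input")
    if "*" in scope:  # assumption: wildcard requests count as too broad
        return Decision(False, "too_broad_scope")
    matching = [r for r in grants if matches(r)]
    if matching and not approved and all(r.requires_approval for r in matching):
        return Decision(False, "unsatisfied_approval")
    if not matching:
        return Decision(False, "no_matching_grant")
    return Decision(True, "granted")


grants = [
    Rule("ai_operations_lead", "update_task_state", "hq:control-plane"),
    Rule("ai_operations_lead", "publish_release", "hq:releases",
         requires_approval=True),
]
denies = [Rule("ai_operations_lead", "update_task_state",
               "hq:control-plane/secrets")]

print(can("ai_operations_lead", "update_task_state",
          "hq:control-plane/task-board", grants, denies))
# Decision(allowed=True, reason='granted')
```

The grant on `hq:control-plane` is inherited by `hq:control-plane/task-board`, while the deny on the `secrets` child scope still overrides it for anything beneath that path.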
Relevant schemas (draft 2020-12, additionalProperties: false): permission-grants, approval-checkpoint, run-receipt, agent-release, and the capability_grants extension to agent-registry.
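One way to read "byte-stable, secret- and timestamp-scrubbed" for run receipts is sketched below: drop volatile fields, redact values under secret-looking keys, and serialize with sorted keys and fixed separators so the same logical receipt always yields the same bytes. The key patterns, the `[REDACTED]` marker, and the set of volatile field names are assumptions for the example, not the project's run-receipt schema.

```python
import json
import re

# Assumption: keys matching this pattern hold secrets.
SECRET_KEY = re.compile(r"(token|secret|password|api_key)", re.IGNORECASE)
# Assumption: these fields vary between otherwise identical runs.
VOLATILE_KEYS = {"timestamp", "started_at", "finished_at"}


def scrub(value):
    """Recursively drop volatile fields and redact secret-looking values."""
    if isinstance(value, dict):
        cleaned = {}
        for key, item in value.items():
            if key in VOLATILE_KEYS:
                continue
            cleaned[key] = "[REDACTED]" if SECRET_KEY.search(key) else scrub(item)
        return cleaned
    if isinstance(value, list):
        return [scrub(item) for item in value]
    return value


def render_receipt(record):
    """Serialize deterministically: sorted keys, fixed separators, ASCII."""
    text = json.dumps(scrub(record), sort_keys=True,
                      separators=(",", ":"), ensure_ascii=True)
    return (text + "\n").encode("utf-8")


first = {"step": "deploy", "timestamp": "2024-01-01T00:00:00Z",
         "env": {"api_key": "abc123", "region": "eu"}}
second = {"env": {"region": "eu", "api_key": "zzz"},
          "timestamp": "2025-06-01T12:00:00Z", "step": "deploy"}

print(render_receipt(first) == render_receipt(second))  # True
```

Byte-stable output makes receipts diffable and hashable, so two runs that did the same thing produce identical files.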
The control plane keeps a short feedback loop on disk so execution learns across sessions instead of relying on chat memory:
- governed runs auto-write append-only receipts (`before`/`after`) under `.hq/telemetry/feedback-loop/`, linked by attempt id
- receipts carry a numeric metric plus direction, so `summary` reports baseline, best, and latest delta per task
- `python3 scripts/hq_feedback_loop.py review-status` reports whether a review is due: immediately on any adverse outcome, or by cadence after a batch of clean successes (`HQ_FEEDBACK_REVIEW_CADENCE`, default 5); `mark-reviewed` resets it
- the aggregated summary and review signal are surfaced in `status`, `.hq/state/session-bootstrap.json`, and the memory index so they survive context compaction
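The review-due rule and the per-task summary above can be sketched as two small functions. This is a simplified model, not `scripts/hq_feedback_loop.py`: the `"success"`/`"adverse"` labels, the `"higher"`/`"lower"` direction values, and reading "latest delta" as latest value minus baseline are assumptions. Only the `HQ_FEEDBACK_REVIEW_CADENCE` variable and its default of 5 come from the description.

```python
import os


def review_due(outcomes_since_review, cadence=None):
    """Due at once on any adverse outcome, else after `cadence` clean runs."""
    if cadence is None:
        cadence = int(os.environ.get("HQ_FEEDBACK_REVIEW_CADENCE", "5"))
    if any(outcome != "success" for outcome in outcomes_since_review):
        return True
    return len(outcomes_since_review) >= cadence


def summarize(values, direction="higher"):
    """Baseline, best, and latest delta for one task's metric history."""
    if not values:
        return None
    pick = max if direction == "higher" else min
    return {
        "baseline": values[0],
        "best": pick(values),
        # Assumption: delta is measured against the baseline.
        "latest_delta": values[-1] - values[0],
    }


print(review_due(["success"] * 4, cadence=5))       # False
print(review_due(["success", "adverse"], cadence=5))  # True
print(summarize([10, 8, 9], direction="lower"))
# {'baseline': 10, 'best': 8, 'latest_delta': -1}
```

Under this model, marking a review as done amounts to clearing the list of outcomes recorded since the last review.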
The public repository is allowlist-only. These path classes are allowed:
- `README.md`, `AGENTS.md`, `.gitignore`, `requirements-dev.txt`
- `docs/`
- `.github/workflows/`
- `agents/*/AGENTS.md`
- `skills/`
- `scripts/`
- `tests/`
- `05 AI Control Plane/schemas/`
Everything else is local-only and must not be tracked.
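The allowlist above can be expressed as a small path check. The sketch below is a hypothetical helper rather than the logic in `scripts/hq_public_safety.py`; it mirrors the listed path classes and normalizes paths first so that something like `docs/../.hq/...` does not slip through.

```python
import posixpath

ALLOWED_FILES = {"README.md", "AGENTS.md", ".gitignore", "requirements-dev.txt"}
ALLOWED_DIRS = (
    "docs/", ".github/workflows/", "skills/", "scripts/", "tests/",
    "05 AI Control Plane/schemas/",
)


def is_publishable(path):
    """Return True only for paths inside the allowlisted classes."""
    path = posixpath.normpath(path.replace("\\", "/"))
    if path in ALLOWED_FILES:
        return True
    if any(path.startswith(prefix) for prefix in ALLOWED_DIRS):
        return True
    # agents/*/AGENTS.md: exactly one directory level under agents/
    parts = path.split("/")
    return len(parts) == 3 and parts[0] == "agents" and parts[2] == "AGENTS.md"


def violations(tracked_paths):
    """List tracked paths that fall outside the allowlist."""
    return sorted(p for p in tracked_paths if not is_publishable(p))


print(violations(["README.md", "now.md", "scripts/hq_gate.py",
                  ".hq/state/session-bootstrap.json"]))
# ['.hq/state/session-bootstrap.json', 'now.md']
```

A check like this could be fed the output of `git ls-files`, so anything tracked by mistake is reported before a push.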
Keep these private:
- `.hq/` runtime state, telemetry, handoffs, evals, reflections, releases, and local prompts
- `.hq/specs/` private task packets for large work
- live operating docs such as `now.md`, `projects.md`, `routines.md`, `stack.md`, and Markdown work under `02 Planning/`, `03 Notes/`, and `04 Projects/`
- raw customer or prospect data
- personal notes, journals, archives, and imported research dumps
- credentials, API keys, private keys, payment exports, and banking material
- temporary datasets and local environment files
If a file contains personal data, customer data, raw imports, credentials, payment artifacts, or private working memory, keep it under .hq/ or outside this repository.
Before pushing, run python3 scripts/hq_public_safety.py or the full python3 scripts/hq_gate.py.
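Part of what the full gate adds is the diff-based permission-expansion check: any grant that is new relative to the previous table needs a recorded human approval. A minimal sketch of that idea follows, assuming grants can be modeled as `(agent, action, scope)` tuples; the function names and result shape are hypothetical and do not reflect the real `scripts/hq_gate.py`.

```python
def widened_grants(old_grants, new_grants):
    """Grants present in the new table but absent from the old one."""
    return sorted(set(new_grants) - set(old_grants))


def check_expansion(old_grants, new_grants, approved_grants):
    """Fail when any widened grant lacks a recorded approval."""
    widened = widened_grants(old_grants, new_grants)
    unapproved = [g for g in widened if g not in approved_grants]
    return {"ok": not unapproved, "widened": widened, "unapproved": unapproved}


old = {("ai_operations_lead", "update_task_state", "hq:control-plane/task-board")}
new = old | {("ai_operations_lead", "update_task_state", "hq:control-plane")}

print(check_expansion(old, new, approved_grants=set())["ok"])  # False
print(check_expansion(old, new, approved_grants=new - old)["ok"])  # True
```

Removing or narrowing grants produces no widened entries, so only expansions are blocked pending approval.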