Digital Mind System is a local-first AI memory and operator dashboard prototype for building a personal digital brain. It ingests approved files, URLs, and selected GitHub repository knowledge into SQLite-backed memory, trains a local TF-IDF retrieval model, and exposes a Flask API plus an operator dashboard for inspection, scoring, audit trails, and controlled actions.
It is useful as a practical reference for:
- local AI memory and retrieval-augmented knowledge workflows
- personal knowledge base ingestion with source tracking
- GitHub repository intelligence, classification, and approval-gated ingest
- operator dashboards for safe AI/autonomy supervision
- ASTRA Control Bridge task/result architecture planning
- Flask, SQLite, pytest, and browser-validation-based AI tooling prototypes
This project is not production-safe yet. Run it on localhost only. Do not expose it to a network or the public internet without stronger authentication, authorization, CSRF protection, deployment hardening, and a secrets review.
Run every command from the repo root:
cd /path/to/digital_mind_system
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -e '.[dev]'
The canonical environment is .venv. If new_venv exists, treat it as an alternate or legacy environment and prefer .venv unless you intentionally switch.
source .venv/bin/activate
python -m pytest -q
Development server:
source .venv/bin/activate
digital-mind-api \
--db ./data/digital_mind.db \
--model ./data/digital_mind_model.pkl \
--host 127.0.0.1 \
--port 8000 \
--upload-dir ./data/uploads
Waitress server for a local production-style run:
source .venv/bin/activate
digital-mind-api-prod \
--db ./data/digital_mind.db \
--model ./data/digital_mind_model.pkl \
--host 127.0.0.1 \
--port 8000 \
--upload-dir ./data/uploads \
--threads 4
Bundled script, defaults to Waitress/local prod mode:
./scripts/run_api.sh
Use API_MODE=dev ./scripts/run_api.sh for the Flask development server.
Serve the repo root and open the dashboard through localhost:
python3 -m http.server 5500
Then open:
http://127.0.0.1:5500/digital_mind_operator_dashboard.html?apiBase=http://127.0.0.1:8000
The older digital_mind_web_ui.html file may still exist, but the operator dashboard is digital_mind_operator_dashboard.html.
Read-only endpoints such as GET /health, GET /ready, GET /live, and GET /stats remain open and do not require a key.
For mutating endpoints, set a local API key before starting the server:
export DIGITAL_MIND_LOCAL_API_KEY='replace-with-a-local-secret'
./scripts/run_api.sh
Mutating API calls must then include:
curl -X POST http://127.0.0.1:8000/pause-runtime \
-H 'X-Digital-Mind-Key: replace-with-a-local-secret'
Dashboard URL with key stored in browser local storage:
http://127.0.0.1:5500/digital_mind_operator_dashboard.html?apiBase=http://127.0.0.1:8000&apiKey=replace-with-a-local-secret
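If you launch the dashboard from a script, the URL above can be assembled rather than typed by hand. This is a minimal sketch; the function name `dashboard_url` and its arguments are illustrative and not part of the project. Only the page name and the `apiBase` and `apiKey` query parameters come from this README.

```python
from typing import Optional
from urllib.parse import urlencode

# Page name taken from this README.
DASHBOARD_PAGE = "digital_mind_operator_dashboard.html"


def dashboard_url(static_base: str, api_base: str, api_key: Optional[str] = None) -> str:
    """Build the operator dashboard URL (hypothetical helper, not project code)."""
    params = {"apiBase": api_base}
    if api_key:
        params["apiKey"] = api_key
    # Keep ':' and '/' unescaped so the result looks like the URLs in this README;
    # other reserved characters in a value (such as '&' or '#') are percent-encoded.
    query = urlencode(params, safe=":/")
    return f"{static_base.rstrip('/')}/{DASHBOARD_PAGE}?{query}"
```

Calling `dashboard_url("http://127.0.0.1:5500", "http://127.0.0.1:8000")` without a key yields the plain dashboard URL with only `apiBase` set.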
If DIGITAL_MIND_LOCAL_API_KEY is not set, mutation auth is intentionally disabled for local development and /brain/status reports this warning:
Mutation authentication is disabled because DIGITAL_MIND_LOCAL_API_KEY is not set.
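The same mutating call can be made from Python. The sketch below only builds the request, using the standard library; the helper name `build_mutation_request` is illustrative. It assumes the endpoint accepts a POST without a body, as the curl example suggests. The header name and environment variable are the ones documented above.

```python
import os
import urllib.request

API_KEY_ENV = "DIGITAL_MIND_LOCAL_API_KEY"  # documented environment variable
API_KEY_HEADER = "X-Digital-Mind-Key"       # documented request header


def build_mutation_request(api_base, path, api_key=None):
    """Return a POST request, attaching the key header when a key is available.

    Hypothetical helper: the key is taken from the argument first, then from
    the environment. With no key at all, no header is sent, which matches the
    local-development mode where mutation auth is disabled.
    """
    key = api_key if api_key is not None else os.environ.get(API_KEY_ENV)
    request = urllib.request.Request(f"{api_base.rstrip('/')}{path}", method="POST")
    if key:
        request.add_header(API_KEY_HEADER, key)
    return request
```

With the API running, `urllib.request.urlopen(build_mutation_request("http://127.0.0.1:8000", "/pause-runtime"), timeout=5)` would send it. The response format is not described here, so inspect it before relying on any field.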
curl -s http://127.0.0.1:8000/health | python -m json.tool
curl -s http://127.0.0.1:8000/ready | python -m json.tool
curl -s http://127.0.0.1:8000/github/health | python -m json.tool
curl -s http://127.0.0.1:8000/autonomy/pipeline/config | python -m json.tool
curl -s http://127.0.0.1:8000/github/ingest-limits | python -m json.tool
curl -s http://127.0.0.1:8000/brain/status | python -m json.tool
Direct POST /github/ingest is approval-gated. Unknown repositories are queued and blocked for write ingest until approved. Avoid-classified repositories allow metadata/reference-only handling by default, but full or selective ingest requires an explicit audited admin override.
Final classifications:
Excellent, Good, Needs Review, Avoid, Unknown, Blocked
Final use modes:
full_ingest, selective_ingest, docs_only, metadata_only, reference_only, skip, blocked
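The gating rules above can be read as a small decision function. The following is a sketch of that reading, not the project's implementation. The function name, the `allow`/`queue`/`deny` return values, and the handling of cases this README does not spell out are all assumptions: Blocked is always denied, docs_only on an Avoid repository needs the override, and every other classification waits for approval.

```python
CLASSIFICATIONS = {"Excellent", "Good", "Needs Review", "Avoid", "Unknown", "Blocked"}
USE_MODES = {
    "full_ingest", "selective_ingest", "docs_only",
    "metadata_only", "reference_only", "skip", "blocked",
}
REFERENCE_MODES = {"metadata_only", "reference_only"}


def ingest_decision(classification, use_mode, approved=False, admin_override=False):
    """Return 'allow', 'queue', or 'deny' (hypothetical sketch of the gate)."""
    if classification not in CLASSIFICATIONS or use_mode not in USE_MODES:
        raise ValueError("unknown classification or use mode")
    # Assumption: Blocked repositories and skip/blocked modes never ingest.
    if classification == "Blocked" or use_mode in {"skip", "blocked"}:
        return "deny"
    if classification == "Avoid":
        # Stated above: metadata/reference-only handling is allowed by default.
        if use_mode in REFERENCE_MODES:
            return "allow"
        # Stated above for full/selective ingest; assumed here for docs_only.
        return "allow" if admin_override else "deny"
    # Unknown repositories are queued until approved. Assumption: the same
    # approval gate applies to Excellent, Good, and Needs Review.
    return "allow" if approved else "queue"
```

For example, `ingest_decision("Unknown", "full_ingest")` returns `"queue"`, and `ingest_decision("Avoid", "full_ingest", admin_override=True)` returns `"allow"`.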
Default GitHub ingest limits are non-null and conservative:
GITHUB_FULL_INGEST_MAX_KB=250000
GITHUB_SELECTIVE_INGEST_MAX_KB=500000
GITHUB_MAX_ARCHIVE_KB=250000
GITHUB_MAX_EXTRACTED_KB=250000
GITHUB_MAX_FILE_KB=512
GITHUB_MAX_FILES_SCANNED=2000
GITHUB_ENABLE_ADMIN_OVERRIDE=false
Environment variables can override these limits. Avoid unlimited settings except in deliberate local testing.
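As an illustration of the override behaviour, here is a minimal sketch that reads the limits from an environment mapping and falls back to the defaults listed above. The function name `load_github_limits` and the parsing rules are assumptions: integer values only, and `true`/`1`/`yes` for the boolean. The project's own parser may differ, for instance in how it represents an unlimited value.

```python
import os

# Defaults copied from the list above.
DEFAULT_LIMITS = {
    "GITHUB_FULL_INGEST_MAX_KB": 250000,
    "GITHUB_SELECTIVE_INGEST_MAX_KB": 500000,
    "GITHUB_MAX_ARCHIVE_KB": 250000,
    "GITHUB_MAX_EXTRACTED_KB": 250000,
    "GITHUB_MAX_FILE_KB": 512,
    "GITHUB_MAX_FILES_SCANNED": 2000,
}


def load_github_limits(env=None):
    """Resolve ingest limits from an environment mapping (hypothetical helper)."""
    env = os.environ if env is None else env
    # int() raises ValueError on a non-integer override, so typos fail loudly.
    limits = {name: int(env.get(name, default)) for name, default in DEFAULT_LIMITS.items()}
    raw_flag = str(env.get("GITHUB_ENABLE_ADMIN_OVERRIDE", "false"))
    limits["GITHUB_ENABLE_ADMIN_OVERRIDE"] = raw_flag.strip().lower() in {"1", "true", "yes"}
    return limits
```

Passing a plain dict instead of `os.environ` makes the function easy to exercise in tests, for example `load_github_limits({"GITHUB_MAX_FILE_KB": "1024"})`.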
The autonomy pipeline defaults are conservative:
auto_discovery_enabled=false
auto_ingestion_enabled=false
approval_required=true
safe_mode=true
github_discovery_enabled=false
huggingface_discovery_enabled=false
code_execution_enabled=false
dependency_install_enabled=false
model_binary_loading_enabled=false
Hugging Face items remain metadata-oriented at the source boundary. The pipeline must not execute downloaded code, install dependencies, or load model binaries.
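One way to check that a running instance still uses these defaults is to compare its reported configuration against them. The sketch below assumes that GET /autonomy/pipeline/config returns a flat JSON object with the keys listed above. That response shape is an assumption, and `risky_settings` is an illustrative name, not a project function.

```python
# Conservative defaults copied from the list above.
CONSERVATIVE_DEFAULTS = {
    "auto_discovery_enabled": False,
    "auto_ingestion_enabled": False,
    "approval_required": True,
    "safe_mode": True,
    "github_discovery_enabled": False,
    "huggingface_discovery_enabled": False,
    "code_execution_enabled": False,
    "dependency_install_enabled": False,
    "model_binary_loading_enabled": False,
}


def risky_settings(config):
    """Return the sorted keys whose value is missing or differs from the default."""
    return sorted(
        key for key, expected in CONSERVATIVE_DEFAULTS.items()
        if config.get(key) != expected
    )
```

An empty list means the configuration matches the conservative defaults. A missing key is reported too, so an unexpected response shape shows up as a long list rather than a silent pass.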
Normal run scripts do not pass --skip-hash-verification. Keep hash verification enabled for normal local and production-style runs.
If you intentionally run with --skip-hash-verification, /brain/status reports:
Model hash verification is disabled because --skip-hash-verification is active.
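Hash verification matters because the model is stored as a .pkl file, and unpickling a tampered file can execute arbitrary code. The sketch below shows the general idea of checking a file digest before loading it. SHA-256 and the helper names are assumptions; this README does not state which digest the project uses or where the expected value is stored.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def model_hash_matches(path, expected_hex):
    """Return True only if the file digest equals the expected hex digest."""
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

A loader built on this would refuse to unpickle the model when `model_hash_matches` returns False.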
API log rotation can be configured before starting the server:
export DIGITAL_MIND_API_LOG_PATH=./logs/digital_mind_api.log
export DIGITAL_MIND_API_LOG_MAX_BYTES=5242880
export DIGITAL_MIND_API_LOG_BACKUP_COUNT=5
Tail logs:
tail -f ./logs/digital_mind_api.log
source .venv/bin/activate
digital-mind --db ./data/digital_mind.db --model ./data/digital_mind_model.pkl
Common interactive commands:
ingest-file <path>
ingest-url <url>
train
query <text>
upgrade
help
exit
Continuous retraining mode:
digital-mind \
--db ./data/digital_mind.db \
--model ./data/digital_mind_model.pkl \
--continuous-retrain \
--retrain-interval 1.0
If the API is running and browser tooling is available:
source .venv/bin/activate
python playwright_validation.py
Do not install browser tooling just to run this unless you are intentionally doing UI validation work.
LaunchAgent plist files under config/ are templates and require machine-specific absolute paths before use. Update paths to this repo root, then load through launchctl.