Security: Open-Paws/open-paws-intelligence

docs/security.md

Security Model — open-paws-intelligence

This system handles potential legal evidence. Every design decision must account for the three-adversary threat model: state surveillance, industry infiltration, and AI model bias. Data compromise here can endanger witnesses, investigators, and ongoing operations.

Encrypted Storage

All investigation data (evidence documents, field notes, chain of custody logs) must be stored encrypted at rest using AES-256-GCM.

For the evidence database and custody log:

  • Use an encrypted filesystem volume (LUKS on Linux, FileVault on macOS) for the entire evidence/ directory.
  • The SQLite evidence index (evidence/index.db) should additionally be encrypted using SQLCipher if the pysqlcipher3 package is available.
  • Set FIELD_DB_KEY in .env for the field server's local SQLite encryption.

For uploaded documents in transit:

  • The AutomatedFOIA pattern (AES-256-GCM, 12-byte nonce prepended) is implemented in src/api/server.py for document uploads.
  • Documents are processed in RAM using io.BytesIO — no unencrypted temp files.
  • On completion, decrypted bytes are explicitly zeroed from memory where possible.
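The in-memory handling above can be sketched with the standard library alone. This is a minimal sketch, not the code in src/api/server.py: the helper names (`split_nonce`, `zero_stream`, `sha256_in_memory`) are hypothetical, the 16-byte GCM tag length is an assumption, and the AES-256-GCM decryption step is omitted because it needs a third-party library.

```python
import hashlib
import io

NONCE_LEN = 12  # bytes; the nonce is prepended to the ciphertext
TAG_LEN = 16    # bytes; assumed full-length GCM tag


def split_nonce(blob: bytes) -> tuple[bytes, bytes]:
    """Split an uploaded blob into (nonce, ciphertext plus tag)."""
    if len(blob) < NONCE_LEN + TAG_LEN:
        raise ValueError("blob is too short to hold a nonce and a GCM tag")
    return blob[:NONCE_LEN], blob[NONCE_LEN:]


def zero_stream(stream: io.BytesIO) -> None:
    """Best-effort overwrite of a BytesIO buffer with zero bytes."""
    with stream.getbuffer() as view:  # writable view of the buffer, no copy
        view[:] = bytes(len(view))


def sha256_in_memory(plaintext: bytearray) -> str:
    """Hash decrypted bytes held only in RAM, then zero both buffers."""
    stream = io.BytesIO()
    stream.write(plaintext)
    try:
        with stream.getbuffer() as view:
            return hashlib.sha256(view).hexdigest()
    finally:
        zero_stream(stream)
        plaintext[:] = bytes(len(plaintext))
```

Zeroing is best-effort only: Python may hold other copies of the data (for example immutable `bytes` objects returned by a decryption call), which is why the guidance above says "where possible".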

Key management:

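  • Store AES_SECRET_KEY and FIELD_DB_KEY in .env and never commit that file to git. The .gitignore excludes .env.*; that pattern alone does not match a bare .env, so confirm the file is ignored before the first commit.
  • Rotate keys after any suspected compromise.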
  • For production: use a secrets manager (Vault, AWS Secrets Manager) not .env files.
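One way to back up the key-management rules above is to fail fast when a key is missing or malformed. The sketch below assumes keys are stored as 64 hex characters (32 bytes); the encoding this project actually uses is not stated here, and `load_key` is a hypothetical helper.

```python
import os


def load_key(name: str, environ=os.environ) -> bytes:
    """Read a 256-bit key from the environment, assuming hex encoding."""
    raw = environ.get(name)
    if not raw:
        raise RuntimeError(f"{name} is not set")
    try:
        key = bytes.fromhex(raw.strip())
    except ValueError:
        # Never echo the value: it may be a real secret with a typo in it.
        raise RuntimeError(f"{name} is not valid hex") from None
    if len(key) != 32:
        raise RuntimeError(f"{name} must decode to 32 bytes for AES-256")
    return key
```

Calling `load_key("AES_SECRET_KEY")` at startup surfaces a misconfigured key before any document is handled. The same function works when a secrets manager injects the value into the process environment.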

What NOT to Store

These data types must NEVER appear in plaintext logs, database fields, or anywhere that could be subpoenaed or seized:

  • Witness identities — real names, contact information, locations
  • Investigator identities — names, pseudonyms that link to real identities, device IDs
  • Undercover operation details — facility entry methods, timing, cover stories
  • Source recruitment or communication — any record of how a witness was contacted

Use pseudonymous IDs instead (e.g. OP-001, WIT-042). Maintain the mapping from pseudonym to real identity ONLY in a separate, separately secured system that does not sync with this pipeline.
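A write-time check can help keep real identities out of the pipeline. The sketch below assumes IDs use the two prefixes shown above (OP- and WIT-) followed by three digits; the field names and the `assert_pseudonymous` helper are hypothetical.

```python
import re

# Assumed ID shape, based on the examples OP-001 and WIT-042.
PSEUDONYM = re.compile(r"(OP|WIT)-\d{3}")

# Hypothetical names of record fields that identify a person.
ID_FIELDS = ("witness_id", "investigator_id")


def assert_pseudonymous(record: dict) -> None:
    """Raise if an identity field holds anything other than a pseudonymous ID."""
    for field in ID_FIELDS:
        value = record.get(field)
        if value is not None and not PSEUDONYM.fullmatch(str(value)):
            # The offending value is left out of the message on purpose,
            # so a real name cannot leak into logs through the exception.
            raise ValueError(f"{field} must be a pseudonymous ID")
```

Running this before every database insert or log write rejects a record such as `{"witness_id": "Jane Doe"}` without repeating the name anywhere.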

Regulatory data (USDA APHIS inspection records) is public and carries no restriction. FOIA request tracking (agency, subject, dates) is also low-sensitivity.

Offline Mode Security

The field server (src/offline/field_server.py) is designed for air-gapped operation:

  • Binds to 127.0.0.1 by default. Never expose to the internet.
  • Network access requires the explicit --network flag and, even then, is meant to stay local rather than internet-facing.
  • pysqlcipher3 encrypts the local SQLite database. If unavailable, a warning is logged. Do not use the unencrypted fallback for active investigations.
  • Sync is always a deliberate operator action. There is no automatic background sync.
  • Investigators should disable Wi-Fi and mobile data before running the field server in high-risk environments.
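The rule against the unencrypted fallback can be enforced in code rather than left to a log warning. This is a sketch only, not the logic in src/offline/field_server.py: `sqlcipher_available`, `choose_backend`, and the `active_investigation` parameter are hypothetical names.

```python
import importlib.util
import logging

log = logging.getLogger("field_server")


def sqlcipher_available() -> bool:
    """Return True if the pysqlcipher3 package is installed."""
    return importlib.util.find_spec("pysqlcipher3") is not None


def choose_backend(encrypted: bool, active_investigation: bool) -> str:
    """Pick a storage backend, refusing plaintext for active investigations."""
    if encrypted:
        return "sqlcipher"
    if active_investigation:
        raise RuntimeError(
            "refusing unencrypted SQLite for an active investigation"
        )
    log.warning("pysqlcipher3 unavailable; falling back to unencrypted SQLite")
    return "sqlite"
```

A caller would pass `choose_backend(sqlcipher_available(), active_investigation=True)` and let the exception stop startup when encryption is missing.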

Device seizure preparation:

  • Enable full-disk encryption (LUKS/FileVault) on any device running this software.
  • Configure remote wipe capability.
  • Set an auto-lock timeout. Consider panic-wipe trigger (e.g. multiple wrong PINs).
  • Do not keep unencrypted backups.
  • When legally permissible, destroy devices that cannot be wiped before an anticipated seizure.

Ag-Gag Legal Exposure

Ag-gag statutes criminalize undercover investigation of agricultural operations. Exposure risk varies by jurisdiction:

| Jurisdiction | Ag-Gag Law | Risk Level | Notes |
| --- | --- | --- | --- |
| Iowa | Iowa Code § 717A.3A | High | Criminal penalties for trespass + documentation |
| North Carolina | N.C. Gen. Stat. § 99A-2 | High | Civil liability for gaining employment under false pretenses |
| Kansas | K.S.A. § 47-1827 | High | Criminal for entering ag facilities to photograph |
| Alabama | Ala. Code § 2-15-110 | Medium | Recording without consent at commercial facilities |
| Montana | Mont. Code Ann. § 81-30-103 | Medium | Civil cause of action for facility operators |
| Idaho | Struck down (9th Cir. 2018) | Low | Animal Legal Defense Fund v. Wasden |
| Utah | Utah Code § 76-6-112 | High | Criminal ag-gag, survived facial challenge |
| Arkansas | Ark. Code Ann. § 2-5-101 | Medium | Agricultural operation recording |

Operational guidance:

  • Before any investigation, consult legal counsel on current ag-gag exposure.
  • Evidence from ag-gag jurisdictions may be inadmissible or may expose investigators to criminal prosecution. Handle separately and flag clearly.
  • FOIA requests are protected speech and carry no ag-gag exposure risk.
  • Public records from USDA APHIS are public information — no exposure.
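The "handle separately and flag clearly" guidance could be applied at ingestion time with a lookup like the one below. It is a sketch under stated assumptions: the risk levels are copied from the table above and go stale as the law changes, while the flag strings and both function names are hypothetical.

```python
# Risk levels copied from the jurisdiction table; re-check before relying on them.
AG_GAG_RISK = {
    "Iowa": "High",
    "North Carolina": "High",
    "Kansas": "High",
    "Alabama": "Medium",
    "Montana": "Medium",
    "Idaho": "Low",
    "Utah": "High",
    "Arkansas": "Medium",
}


def handling_flag(jurisdiction: str) -> str:
    """Return a label to attach to evidence collected in this jurisdiction."""
    risk = AG_GAG_RISK.get(jurisdiction)
    return f"AG-GAG-{risk.upper()}" if risk else "AG-GAG-UNREVIEWED"


def handle_separately(jurisdiction: str) -> bool:
    """True unless the jurisdiction is listed with Low risk."""
    return AG_GAG_RISK.get(jurisdiction) != "Low"
```

Jurisdictions missing from the table are treated as unreviewed and kept separate, so an unlisted state is never assumed safe by default.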

See docs/jurisdiction-guide.md for full jurisdiction analysis.

AI Provider Routing

Zero-retention requirement: Investigation documents (evidence, witness testimony, operational data) must NEVER be routed through AI providers that retain inputs.

Permitted providers for investigation data:

  • Locally hosted models (Ollama, llama.cpp, LM Studio)
  • Providers with verified zero-retention agreements (verify contractually, not just from marketing copy — see the provider's DPA)

Permitted for public regulatory data only:

  • Google Gemini (verify zero-retention tier — standard API retains inputs)
  • OpenAI (verify zero-retention tier — standard API retains inputs)

The ai_provider parameter in src/documents/ingester.py defaults to "auto". Set GEMINI_API_KEY or OPENAI_API_KEY in .env only if you have verified zero-retention agreements. For investigations, use ai_provider="local".
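A guard in front of the ingester can make the routing rule hard to bypass by accident. The sketch below is not the logic in src/documents/ingester.py: `resolve_provider`, the sensitivity labels, and the provider names other than "auto" and "local" are hypothetical, and forcing "auto" to "local" for investigation data is an assumption about the desired behaviour.

```python
def resolve_provider(
    requested: str,
    sensitivity: str,
    zero_retention_verified: frozenset = frozenset(),
) -> str:
    """Return the ai_provider value to use, or raise if routing is not allowed.

    sensitivity is "investigation" or "public" (hypothetical labels).
    zero_retention_verified holds providers with a contractually verified
    zero-retention agreement.
    """
    if sensitivity == "investigation":
        if requested in ("auto", "local"):
            return "local"  # never let "auto" pick a hosted provider here
        if requested in zero_retention_verified:
            return requested
        raise PermissionError(
            f"{requested!r} is not approved for investigation data"
        )
    if sensitivity == "public":
        return requested
    raise ValueError(f"unknown sensitivity label: {sensitivity!r}")
```

With this in place, `resolve_provider("auto", "investigation")` yields "local", while a hosted provider is accepted for investigation data only when it appears in the verified set.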

Coalition API Security

API keys are tiered: PUBLIC (no key), COALITION, INVESTIGATOR.

  • PUBLIC endpoints serve only USDA APHIS public data.
  • COALITION keys allow FOIA request generation and cross-org violation tracking.
  • INVESTIGATOR keys allow document ingestion and evidence access.
  • All keys are masked in logs (last 4 chars only) — never log full keys.
  • Rotate COALITION and INVESTIGATOR keys if any partner organization is compromised.
  • Consider IP allowlisting for INVESTIGATOR-tier endpoints.
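The masking rule above is small enough to show in full. This is a minimal sketch with a hypothetical helper name; hiding keys shorter than eight characters entirely is an added assumption, since the last four characters of a very short key reveal too much of it.

```python
def mask_key(key: str) -> str:
    """Return a log-safe form of an API key: a fixed mask plus the last 4 chars."""
    if len(key) < 8:
        # Too short to reveal any part safely; hide it completely.
        return "****"
    # A fixed-width mask also avoids leaking the key length.
    return "****" + key[-4:]
```

Log statements would then use `mask_key(api_key)` wherever a key needs to be identified.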

Set keys in .env:

COALITION_API_KEYS=key1,key2,key3
INVESTIGATOR_API_KEYS=key4,key5
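The comma-separated format above could be read and checked along these lines. This is a sketch, not the project's code: `parse_keys` and `resolve_tier` are hypothetical names, and rejecting an unknown key outright (rather than downgrading it to PUBLIC) is an assumed design choice.

```python
import hmac
import os

# Checked in order, so a key listed in both variables gets the higher tier.
TIER_VARS = (
    ("INVESTIGATOR", "INVESTIGATOR_API_KEYS"),
    ("COALITION", "COALITION_API_KEYS"),
)


def parse_keys(value) -> list:
    """Split a comma-separated key list, dropping blanks and whitespace."""
    return [k.strip() for k in (value or "").split(",") if k.strip()]


def resolve_tier(presented, environ=os.environ) -> str:
    """Map a presented API key to its tier; no key means PUBLIC."""
    if not presented:
        return "PUBLIC"
    candidate = presented.encode()
    for tier, var in TIER_VARS:
        for key in parse_keys(environ.get(var)):
            # Constant-time comparison avoids leaking key prefixes via timing.
            if hmac.compare_digest(candidate, key.encode()):
                return tier
    raise PermissionError("unknown API key")
```

A request handler would call `resolve_tier(header_value)` once and compare the result against the tier an endpoint requires.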

Supply Chain Verification

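Before adding any new dependency, verify that it exists and has legitimate maintainers. AI assistants regularly recommend packages that do not exist; the rate is roughly 20% in some studies of LLM-generated code and varies by model. Run: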

pip index versions <package-name>  # verify package exists on PyPI

Current dependencies are verified. New dependencies must be added deliberately — never accept AI suggestions for package names without checking PyPI directly.
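The same existence check can be scripted against PyPI's JSON endpoint. This is a sketch with hypothetical helper names; it confirms only that a package is published, so maintainer legitimacy still needs a manual look at the project page.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def pypi_json_url(package: str) -> str:
    """Build the PyPI JSON metadata URL for a package name."""
    return f"https://pypi.org/pypi/{urllib.parse.quote(package, safe='')}/json"


def package_exists(package: str, timeout: float = 10.0) -> bool:
    """Return True if PyPI serves metadata for the package, False on a 404."""
    try:
        with urllib.request.urlopen(pypi_json_url(package), timeout=timeout) as resp:
            json.load(resp)  # raises if the body is not valid JSON
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other HTTP errors say nothing about existence
    return True
```

Run it from a machine that is allowed network access, never from an air-gapped field device.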
