SecureAlphaAI

IP-safe, LLM-powered financial strategy analysis — built on Anthropic Claude.

Python 3.11+ · License: MIT

SecureAlphaAI lets quant firms and asset managers harness frontier AI for portfolio commentary and report generation without leaking proprietary alpha — the signals, strategy names, thresholds, and position sizes that constitute their intellectual property.


Table of Contents

  1. The Problem
  2. Theoretical Framework
  3. Architecture
  4. Defence Layers in Detail
  5. Frontend Dashboard
  6. API Reference
  7. Quick Start (Docker — recommended)
  8. Quick Start (Local Dev)
  9. Configuration
  10. Running the Tests
  11. Python Client SDK
  12. Threat Model
  13. Tech Stack

The Problem

Modern quantitative finance is built on proprietary alpha — carefully engineered signals, strategy parameters, and portfolio construction logic that represent years of research and millions in development cost. When firms began integrating LLMs into their workflows, an uncomfortable tension emerged:

LLMs need context to be useful. Context contains your IP. Sending your IP to an external API is a risk.

The four failure modes this creates:

| Failure Mode | What Happens | Consequence |
|---|---|---|
| Training data leakage | Your prompt is retained and used to fine-tune future models | Competitors' models learn your alpha |
| Prompt injection | A malicious payload in analyst notes causes the LLM to echo back sensitive values | Secrets exfiltrated through the model response |
| Inference attacks | High-precision numbers (e.g. a 6-decimal factor loading) uniquely identify your strategy | IP reconstructable even from "anonymised" outputs |
| Compliance exposure | Non-public portfolio positions and strategy parameters may constitute MNPI | Regulatory liability under SEC Rule 10b-5, MiFID II, etc. |

The naive solution — "just don't send sensitive data" — breaks the value proposition entirely. The correct solution is a sanitisation pipeline that strips the identifying information while preserving the analytical signal.

That is what SecureAlphaAI is.


Theoretical Framework

Information Theory: Why Precision Is the Attack Surface

Consider a conviction score reported as 0.8347561. The Shannon entropy of a 7-decimal floating-point value drawn from a continuous distribution approaches log₂(10⁷) ≈ 23 bits. That is enough entropy to uniquely identify the strategy from a universe of ~8 million candidates.

By contrast, a value reported as ~0.83 (2 decimal places) carries only log₂(100) ≈ 6.6 bits — insufficient to distinguish your strategy from hundreds of others running similar momentum overlays.

The Sanitiser's threshold rules are therefore not arbitrary. They are grounded in the observation that:

  • Public market data (prices, returns, volumes) uses at most 4 decimal places in standard data feeds.
  • Internal risk parameters, factor loadings, and signal scores are routinely computed to 6–8 decimal places.
  • Anything ≥ 6 decimal places is therefore almost certainly an internal parameter, not a public market datum.
Identifying entropy by decimal precision
─────────────────────────────────────────────────────────────────────────
Decimal places │ Example       │ Entropy  │ Risk
───────────────┼───────────────┼──────────┼─────────────────────────────
2              │ 0.83          │  6.6 bit │ Safe — ambiguous
3              │ 0.835         │  9.9 bit │ Caution — borderline
4              │ 0.8347        │ 13.3 bit │ Sensitive — internal range
5              │ 0.83475       │ 16.6 bit │ High risk
6+             │ 0.834756...   │ 20+ bit  │ BLOCKED — uniquely identifying
─────────────────────────────────────────────────────────────────────────
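The entropy column above can be reproduced with a few lines of Python; this is a sketch of the calculation, not project code:

```python
import math

def precision_entropy_bits(decimal_places: int) -> float:
    """Bits of identifying entropy in a value quoted to the given precision:
    a d-decimal fraction distinguishes 10**d equally likely values."""
    return math.log2(10 ** decimal_places)

for d in (2, 4, 7):
    print(f"{d} dp -> {precision_entropy_bits(d):.1f} bits")
# 2 dp -> 6.6 bits
# 4 dp -> 13.3 bits
# 7 dp -> 23.3 bits
```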

Fail-Closed Design Philosophy

The PromptGuard operates on a fail-closed principle borrowed from network security: when in doubt, deny.

This asymmetry is deliberate. In access control systems, false positives (legitimate traffic blocked) cause operational friction. False negatives (malicious traffic allowed) cause breaches. In IP protection, the cost asymmetry is even more extreme: a false positive means a developer gets a 422 and adds a new config entry; a false negative means proprietary signals leave the building.

Defence in Depth

No single control is sufficient. SecureAlphaAI layers three independent mechanisms:

  ┌─────────────────────────────────────────────────────────────────────┐
  │  LAYER 1 — Sanitiser (transform)                                    │
  │  Replace known-sensitive values with stable placeholders.           │
  │  Maintains reverse_map for internal audit. Never sent to LLM.       │
  └─────────────────────────────────────┬───────────────────────────────┘
                                        │ sanitised text
  ┌─────────────────────────────────────▼───────────────────────────────┐
  │  LAYER 2 — PromptGuard (detect + reject)                            │
  │  Regex-based inspection for residual sensitive patterns.            │
  │  Catches config gaps. Fail-closed: reject on any match.             │
  └─────────────────────────────────────┬───────────────────────────────┘
                                        │ approved context only
  ┌─────────────────────────────────────▼───────────────────────────────┐
  │  LAYER 3 — API Boundary (validate + limit)                          │
  │  Pydantic schema validation. String length limits on free-text.     │
  │  Prevents prompt injection via oversized payloads.                  │
  └─────────────────────────────────────────────────────────────────────┘

Even if an attacker crafts a payload that defeats one layer, the remaining two provide independent barriers.


Architecture

System Overview

  ┌──────────────────────────────────────────────────────────────────────┐
  │  Client (Browser / SDK / curl)                                       │
  └──────────────────────────────┬───────────────────────────────────────┘
                                 │ HTTPS
  ┌──────────────────────────────▼───────────────────────────────────────┐
  │  Next.js Frontend  (port 3000)                                       │
  │  Strategy list · Detail tabs · Compare · Report · Admin              │
  └──────────────────────────────┬───────────────────────────────────────┘
                                 │ REST + SSE
  ┌──────────────────────────────▼───────────────────────────────────────┐
  │  FastAPI Backend  (port 8000)                                        │
  │                                                                      │
  │   ┌──────────────────────────────────────────────────────────────┐   │
  │   │  Request Boundary                                            │   │
  │   │  Pydantic v2 validation · API-key auth · Rate limiting       │   │
  │   └──────────────────────┬───────────────────────────────────────┘   │
  │                          │ typed domain objects                      │
  │   ┌──────────────────────▼───────────────────────────────────────┐   │
  │   │  ContextBuilder                                              │   │
  │   │  ┌────────────────────────────────────────────────────────┐  │   │
  │   │  │  Sanitiser — replace protected values → placeholders   │  │   │
  │   │  └───────────────────┬────────────────────────────────────┘  │   │
  │   │  ┌───────────────────▼────────────────────────────────────┐  │   │
  │   │  │  PromptGuard — inspect, reject if residual IP found    │  │   │
  │   │  └───────────────────┬────────────────────────────────────┘  │   │
  │   └──────────────────────┼───────────────────────────────────────┘   │
  │                          │ approved context                          │
  │   ┌──────────────────────▼───────────────────────────────────────┐   │
  │   │  StrategyAnalyst / ReportGenerator                           │   │
  │   │  Prompt templates wrapping the safe context                  │   │
  │   └──────────────────────┬───────────────────────────────────────┘   │
  │                          │ formatted prompt                          │
  │   ┌──────────────────────▼───────────────────────────────────────┐   │
  │   │  LLMClient (AsyncAnthropic)                                  │   │
  │   │  claude-opus-4-6 · Prompt caching · Streaming SSE            │   │
  │   └──────────────────────┬───────────────────────────────────────┘   │
  └──────────────────────────┼───────────────────────────────────────────┘
                             │
  ┌──────────────────────────▼───────────────────────────────────────┐
  │  Anthropic API                                                   │
  │  Receives: sanitised context only. Never sees your raw IP.       │
  └──────────────────────────────────────────────────────────────────┘

  ┌─────────────────────────────────────────────────────────────────┐
  │  Supporting Services                                            │
  │  Redis ── arq job queue ── background report generation         │
  │  SQLite/Postgres ── strategy store, API keys, audit log         │
  │  Prometheus ── metrics endpoint (/metrics)                      │
  └─────────────────────────────────────────────────────────────────┘

Data Flow: Concrete Example

POST /analyse

1. Raw request:
   {
     "asset_returns": [{"ticker": "AAPL", "daily_return_pct": 2.3}],
     "strategy_signal": {
       "strategy_name": "AlphaV1",
       "conviction_score": 0.8347561,        ← internal precision
       "notes": "TRD-ABC12345 closed above target"
     }
   }

2. Pydantic validates types and field constraints. ✓

3. ContextBuilder formats raw text:
   "AAPL: +2.30% | Strategy: AlphaV1 | Conviction: 0.8347561 | TRD-ABC12345"

4. Sanitiser transforms:
   "ASSET_A: +2.30% | Strategy: STRATEGY_1 | Conviction: [REDACTED_THRESHOLD]
    | [REDACTED_ID]"
   reverse_map = {"ASSET_A": "AAPL", "STRATEGY_1": "AlphaV1"}
   (stored internally, never transmitted)

5. PromptGuard inspects sanitised text:
   ✓ No numbers with ≥ 6 decimal places
   ✓ No forbidden internal phrases
   ✓ No residual trade/order IDs
   ✓ No suspicious all-caps tickers
   → APPROVED

6. StrategyAnalyst wraps context in analysis prompt template.

7. LLMClient sends to Anthropic (claude-opus-4-6):
   - Stable system prompt (cached — ~90% token cost reduction)
   - Sanitised user prompt only

8. Response returned to caller.
   ✓ Your IP never left the system.

Defence Layers in Detail

Sanitiser (core/sanitiser.py)

A deterministic, reversible transformation layer. Every substitution is recorded in a reverse_map for internal audit; the map stays inside the system and is never sent to the LLM.

| Rule | Input Example | Output | Why |
|---|---|---|---|
| Ticker replacement | AAPL | ASSET_A | Identifies specific positions (MNPI) |
| Strategy replacement | AlphaV1 | STRATEGY_1 | Trade secret — identifies proprietary model |
| Dollar amounts | $1,234,567 | [REDACTED_AMOUNT] | Position size reveals NAV/exposure |
| Threshold redaction | 0.0347% | [REDACTED_THRESHOLD] | High-precision % = internal risk param |
| UUID redaction | 550e8400-… | [REDACTED_ID] | Internal system identifiers |
| Trade/order IDs | TRD-ABC12345 | [REDACTED_ID] | Links to specific internal orders |
| BIC/SWIFT codes | GSBEFRPP | [REDACTED_BIC] | Identifies counterparties |
| ISIN/CUSIP/SEDOL | US0378331005 | [REDACTED_ISIN] | Identifies specific securities globally |
| Dates (contextual) | 2024-03-15 | [REDACTED_DATE] | Can reveal entry/exit timing |

Stability guarantee: tickers are replaced in the order they appear in PROTECTED_TICKERS, so a given ticker always maps to the same placeholder (AAPL is always ASSET_A), consistent across requests and over time.
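That guarantee follows directly from enumerating the configured list. A minimal sketch, with the placeholder scheme assumed from the examples above (the real logic lives in core/sanitiser.py):

```python
import string

PROTECTED_TICKERS = ["AAPL", "MSFT", "NVDA"]  # normally read from the env var

def build_placeholder_map(tickers: list[str]) -> dict[str, str]:
    # Enumeration order is fixed by the config list, so the same ticker
    # receives the same placeholder on every request.
    return {t: f"ASSET_{string.ascii_uppercase[i]}" for i, t in enumerate(tickers)}

def replace_tickers(text: str, mapping: dict[str, str]) -> str:
    for raw, placeholder in mapping.items():
        text = text.replace(raw, placeholder)
    return text

mapping = build_placeholder_map(PROTECTED_TICKERS)
print(replace_tickers("AAPL up 2.3%, MSFT down 0.8%", mapping))
# -> ASSET_A up 2.3%, ASSET_B down 0.8%
```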

PromptGuard (core/prompt_guard.py)

A fail-closed inspection layer that runs after the Sanitiser. Its job is to catch anything the Sanitiser missed — because a config gap exists, or a new data field was added without sanitisation.

| Check | Rationale |
|---|---|
| Numbers with ≥ 6 decimal places | Only internal parameters reach this precision; public data uses ≤ 4 dp |
| Ratio expressions (3.14159:1) | Internal risk-ratio formatting is distinctive |
| Forbidden phrases (live PnL, strategy id:, internal signal) | Indicator of unsanitised internal system output |
| Trade/order/position IDs (TRD-, ORD-, POS-) | Systematic internal ID prefixes |
| Residual all-caps tokens | Flags tickers not listed in PROTECTED_TICKERS |

Custom rules can be added by passing extra_rules: list[GuardRule] to the constructor — each rule is a compiled regex + human-readable description.
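A rule in this shape is just a compiled regex plus a description. The sketch below shows two of the built-in checks and the fail-closed decision; the names mirror the README, but the actual fields in core/prompt_guard.py may differ:

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class GuardRule:
    pattern: re.Pattern
    description: str

DEFAULT_RULES = [
    GuardRule(re.compile(r"\d+\.\d{6,}"), "number with >= 6 decimal places"),
    GuardRule(re.compile(r"\b(?:TRD|ORD|POS)-[A-Z0-9]+"), "internal trade/order/position ID"),
]

def inspect(text: str, extra_rules: tuple[GuardRule, ...] = ()) -> list[str]:
    """Fail-closed: any matching rule is a violation; an empty list means approved."""
    return [r.description for r in (*DEFAULT_RULES, *extra_rules)
            if r.pattern.search(text)]

print(inspect("Conviction: 0.8347561"))   # -> ['number with >= 6 decimal places']
print(inspect("ASSET_A: +2.30%"))         # -> []
```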

ContextBuilder (core/context_builder.py)

The single, mandatory path from your internal domain model to the LLM. There is no other route — you cannot accidentally bypass sanitisation by formatting data yourself. It chains Sanitiser → PromptGuard and returns the approved string or raises ValueError.
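The chaining can be pictured as a small class that either returns the approved string or raises. This is a structural sketch with injected callables, not the actual API of core/context_builder.py:

```python
from typing import Callable

class ContextBuilder:
    """Single mandatory path to the LLM: sanitise, then inspect, else raise."""

    def __init__(self, sanitise: Callable[[str], str],
                 inspect: Callable[[str], list[str]]):
        self._sanitise = sanitise   # replaces protected values with placeholders
        self._inspect = inspect     # returns a list of residual-IP violations

    def build(self, raw_text: str) -> str:
        safe = self._sanitise(raw_text)
        violations = self._inspect(safe)
        if violations:              # fail-closed: any finding rejects the request
            raise ValueError(f"PromptGuard rejected context: {violations}")
        return safe

builder = ContextBuilder(
    sanitise=lambda s: s.replace("AAPL", "ASSET_A"),
    inspect=lambda s: ["unlisted ticker"] if "MSFT" in s else [],
)
print(builder.build("AAPL up 2.3%"))   # -> ASSET_A up 2.3%
```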

LLMClient (core/llm_client.py)

Thin wrapper around AsyncAnthropic with:

  • complete() — single-shot async completion
  • stream_complete() — async generator streaming text chunks via SSE

Prompt caching: The base system prompt is marked cache_control: {type: "ephemeral"}. Since it is stable across all calls, Anthropic caches it server-side — reducing token costs by up to 90% on repeated requests.
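With the anthropic Python SDK, this corresponds to passing the system prompt as a content block carrying cache_control. The sketch below only builds the request body; the model name comes from this README and the system prompt text is illustrative:

```python
def build_request(sanitised_context: str) -> dict:
    """Request body usable as AsyncAnthropic().messages.create(**build_request(...)).
    The system block is identical across calls, so marking it ephemeral lets
    Anthropic cache it server-side and charge reduced rates on cache hits."""
    return {
        "model": "claude-opus-4-6",
        "max_tokens": 1024,
        "system": [{
            "type": "text",
            "text": "You are a portfolio analyst. Never echo raw identifiers.",
            "cache_control": {"type": "ephemeral"},
        }],
        "messages": [{"role": "user", "content": sanitised_context}],
    }
```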


Frontend Dashboard

A full Next.js 15 dashboard ships alongside the API, providing a visual interface for all backend capabilities.

Pages

| Page | Path | What it does |
|---|---|---|
| Strategy List | /strategies | Browse all strategies, create new, delete |
| Strategy Detail | /strategies/[id] | Four tabs: Overview · Performance · Snapshots · Reports |
| Compare | /compare | Side-by-side multi-strategy comparison with chip picker |
| Admin — API Keys | /admin/keys | Create/revoke API keys with role assignment |
| Admin — Audit Log | /admin/audit | Full audit trail of all API actions |
| Admin — Guard | /admin/guard | Live PromptGuard rule inspection and testing |
| Settings | /settings | Configure API base URL and key for the session |

All pages are covered by Playwright end-to-end tests (frontend/e2e/).


API Reference

Full interactive docs available at http://localhost:8000/docs (Swagger UI) and http://localhost:8000/redoc (ReDoc) when the server is running.

Core Endpoints

| Method | Path | Description |
|---|---|---|
| POST | /analyse | Analyse a market snapshot + strategy signal |
| POST | /report | Generate a full narrative report (queued via arq) |
| GET | /report/{id} | Poll report job status |
| GET | /report/{id}/stream | Stream report output via SSE |

Strategy Management

| Method | Path | Description |
|---|---|---|
| POST | /strategies | Create a new strategy |
| GET | /strategies | List all strategies |
| GET | /strategies/{id} | Get a single strategy |
| DELETE | /strategies/{id} | Delete a strategy |
| POST | /strategies/{id}/snapshots | Record a performance snapshot |
| GET | /strategies/{id}/snapshots | Retrieve snapshot history |
| GET | /strategies/{id}/trend | Analyse performance trend |
| GET | /strategies/compare | Compare multiple strategies |

Admin & Governance

| Method | Path | Description |
|---|---|---|
| POST | /admin/keys | Create an API key |
| GET | /admin/keys | List all API keys |
| DELETE | /admin/keys/{id} | Revoke an API key |
| GET | /admin/audit | Query the audit log |
| GET | /admin/dsar/export/{user_id} | GDPR data export |
| DELETE | /admin/dsar/delete/{user_id} | GDPR right-to-erasure |

Observability

| Method | Path | Description |
|---|---|---|
| GET | /health | Liveness check |
| GET | /metrics | Prometheus metrics |

Quick Start (Docker — recommended)

Prerequisites: Docker and Docker Compose installed.

# 1. Clone the repo
git clone https://github.com/MrRobotop/SecureAlphaAI.git
cd SecureAlphaAI/secure-alpha-ai

# 2. Create your environment file
cp .env.example .env

Open .env and set your Anthropic API key:

ANTHROPIC_API_KEY=sk-ant-your-key-here

Get a key at console.anthropic.com.

# 3. Build and start all services
docker compose up --build

That starts four containers:

| Container | URL | Description |
|---|---|---|
| secure-alpha-frontend | http://localhost:3000 | Next.js dashboard |
| secure-alpha-ai | http://localhost:8000 | FastAPI backend |
| secure-alpha-arq-worker | | Background job worker |
| secure-alpha-redis | localhost:6379 | Job queue / cache |

# 4. Try it — analyse a strategy signal
curl -s -X POST http://localhost:8000/analyse \
  -H "Content-Type: application/json" \
  -d '{
    "timestamp": "2024-01-15T09:30:00Z",
    "volatility_regime": "high",
    "sector_rotation_signal": 0.32,
    "asset_returns": [
      {"ticker": "AAPL", "daily_return_pct": 2.3},
      {"ticker": "MSFT", "daily_return_pct": -0.8}
    ],
    "analyst_note": "Tech outperforming on earnings beat.",
    "strategy_signal": {
      "strategy_name": "AlphaV1",
      "signal_direction": "long",
      "conviction_score": 0.75,
      "target_tickers": ["AAPL"],
      "notes": "Momentum building post-earnings."
    },
    "analysis_question": "Is the long signal well-supported by market conditions?"
  }' | python -m json.tool

Quick Start (Local Dev)

# 1. Clone and enter
git clone https://github.com/MrRobotop/SecureAlphaAI.git
cd SecureAlphaAI/secure-alpha-ai

# 2. Python virtual environment
python -m venv .venv
source .venv/bin/activate        # Windows: .venv\Scripts\activate

# 3. Install backend dependencies
pip install -e ".[dev]"

# 4. Configure environment
cp .env.example .env
# Edit .env — set ANTHROPIC_API_KEY and your protected tickers/strategies

# 5. Run database migrations
alembic upgrade head

# 6. Start the API (auto-reload)
uvicorn api.routes:app --reload --host 0.0.0.0 --port 8000

For the frontend:

cd frontend
npm install
npm run dev     # http://localhost:3000

Configuration

All configuration is via environment variables. Copy .env.example to .env and fill in your values — the file is gitignored and will never be committed.

| Variable | Required | Default | Description |
|---|---|---|---|
| ANTHROPIC_API_KEY | Yes | | Your Anthropic API key (console.anthropic.com) |
| PROTECTED_TICKERS | No | AAPL,MSFT,NVDA,GOOGL,AMZN | Comma-separated ticker symbols to anonymise |
| PROTECTED_STRATEGY_NAMES | No | (empty) | Comma-separated strategy names to redact |
| PROTECTED_COUNTERPARTIES | No | (empty) | Comma-separated counterparty names to redact |
| REDIS_URL | No | redis://localhost:6379/0 | Redis connection string (needed for background reports) |
| STRATEGY_RETENTION_DAYS | No | 90 | Days before strategies are swept by the retention cleaner |
| APP_ENV | No | development | development or production |
| LOG_LEVEL | No | INFO | Python logging level |
| API_PORT | No | 8000 | Port to expose the API |
| FRONTEND_PORT | No | 3000 | Port to expose the frontend |

Extending the protection list: Add tickers and strategy names to PROTECTED_TICKERS and PROTECTED_STRATEGY_NAMES. No code changes required — both the Sanitiser and PromptGuard read these at startup.
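Because both layers read the same comma-separated variables, extending protection is a one-line .env change. Parsing such a variable is the usual split-and-strip; a sketch (the real loader in the codebase may differ):

```python
import os

def load_csv_env(name: str, default: str = "") -> list[str]:
    """Parse a comma-separated env var such as PROTECTED_TICKERS,
    ignoring blank entries and surrounding whitespace."""
    return [item.strip() for item in os.environ.get(name, default).split(",")
            if item.strip()]

os.environ["PROTECTED_TICKERS"] = "AAPL, MSFT,NVDA,"
print(load_csv_env("PROTECTED_TICKERS"))   # -> ['AAPL', 'MSFT', 'NVDA']
```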


Running the Tests

The test suite runs entirely offline — no Anthropic API key required.

# Install dev dependencies (if not already done)
pip install -e ".[dev]"

# Run all 292 tests
pytest

# Run with coverage report
pytest --cov=core --cov=api --cov=analysis --cov-report=term-missing

# Run the adversarial corpus (58 injection payloads)
pytest tests/test_adversarial.py -v

Test Coverage

| Module | Tests |
|---|---|
| core/sanitiser.py | Unit + property-based (Hypothesis) |
| core/prompt_guard.py | Unit + adversarial corpus (58 payloads) |
| core/context_builder.py | Integration — sanitiser + guard pipeline |
| core/output_validator.py | Unit — hallucination detection |
| api/routes.py | Integration — full HTTP round-trip (respx) |
| client/ | SDK unit tests |

The adversarial corpus covers:

  • Direct injection attempts (ignore previous instructions)
  • Unicode homoglyph substitution
  • Whitespace obfuscation around sensitive values
  • Nested JSON inside free-text fields
  • Base64-encoded payloads
  • Prompt continuation attacks
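One common defence against the homoglyph and width tricks in that corpus is Unicode normalisation before pattern matching. NFKC folds compatibility characters (for example fullwidth Latin) back to ASCII, though it does not catch every confusable; a sketch, not necessarily the project's exact approach:

```python
import unicodedata

def normalise(text: str) -> str:
    """Fold compatibility characters (fullwidth forms, ligatures, etc.)
    so guard regexes see canonical ASCII where possible."""
    return unicodedata.normalize("NFKC", text)

# Fullwidth 'TRD' folds back to ASCII, so the TRD- prefix rule can match it.
print(normalise("\uff34\uff32\uff24-ABC12345"))   # -> TRD-ABC12345
```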

Frontend E2E

cd frontend
npx playwright test         # headless
npx playwright test --ui    # interactive UI mode

Python Client SDK

A typed Python client ships in client/ for use in your own pipelines:

from secure_alpha_ai_client import SecureAlphaAIClient

client = SecureAlphaAIClient(
    base_url="http://localhost:8000",
    api_key="your-api-key",
)

# Analyse a signal
result = client.analyse(
    asset_returns=[{"ticker": "AAPL", "daily_return_pct": 2.3}],
    strategy_signal={
        "strategy_name": "AlphaV1",
        "signal_direction": "long",
        "conviction_score": 0.75,
    },
    analysis_question="Is the long signal supported by current conditions?",
)
print(result.analysis)

Async client available as AsyncSecureAlphaAIClient.

CLI (built on Typer):

pip install -e client/
secure-alpha analyse --ticker AAPL --strategy AlphaV1 --conviction 0.75
secure-alpha report --strategy-id <uuid> --style executive_summary

Threat Model

In Scope (mitigated)

  • Accidental inclusion of proprietary data in LLM prompts
  • Config gaps (unlisted tickers/strategies) caught by PromptGuard
  • Prompt injection via analyst notes (length limits + guard pattern matching)
  • Internal ID leakage (trade IDs, order IDs, position IDs)
  • High-precision parameter fingerprinting

Out of Scope (requires additional controls)

| Risk | Recommended Control |
|---|---|
| LLM provider data retention | Use Anthropic's zero data retention tier |
| Network-level exfiltration | TLS everywhere + egress filtering |
| Insider threat (config tampering) | Access-control .env / secrets manager |
| Model training data contamination | If data was ever public, it may already be in the model |
| Side-channel timing attacks | Not addressed at the application layer |

Tech Stack

| Layer | Technology |
|---|---|
| LLM | Anthropic Claude (claude-opus-4-6) via anthropic Python SDK |
| Backend | FastAPI + Pydantic v2 + Uvicorn |
| Database | SQLite (dev) / PostgreSQL (prod) via SQLAlchemy async + Alembic |
| Auth | API key auth with RBAC (admin / analyst / viewer) |
| Job Queue | arq + Redis — background report generation |
| Observability | Prometheus metrics, structured logging, request-ID tracing |
| Frontend | Next.js 15 (App Router) + Tailwind CSS + TanStack Query |
| Testing | pytest + Hypothesis (property-based) + respx + Playwright (E2E) |
| CI/CD | GitHub Actions — lint (ruff), type-check (mypy), tests, coverage |
| Containerisation | Docker multi-stage build + Docker Compose |
| IP Protection | Custom Sanitiser + PromptGuard pipeline (zero external deps) |

Built by Rishabh Patil — Quant Developer.
