MCP Observatory

  ███╗   ███╗ ██████╗██████╗
  ████╗ ████║██╔════╝██╔══██╗
  ██╔████╔██║██║     ██████╔╝
  ██║╚██╔╝██║██║     ██╔═══╝
  ██║ ╚═╝ ██║╚██████╗██║
  ╚═╝     ╚═╝ ╚═════╝╚═╝
     O B S E R V A T O R Y

Test, secure, and monitor MCP servers before agents depend on them.

MCP Observatory gives MCP servers the production safety rails every dependency eventually needs: CI checks, security scans, schema drift detection, PR reports, score badges, and agent-accessible diagnostics.

Add MCP CI in one command:

npx @kryptosai/mcp-observatory init-ci --all --command "npx -y my-mcp-server"

Or test a server immediately:

npx @kryptosai/mcp-observatory test npx -y @modelcontextprotocol/server-everything

Use it as a CLI, a GitHub Action, or an MCP server that lets your AI agent scan, test, record, replay, and verify other MCP servers autonomously.

Why MCP Observatory

MCP servers are becoming production dependencies. If agents rely on them, teams need a way to catch broken tools, unsafe schemas, schema drift, slow responses, and security footguns before those failures reach users.

Observatory gives maintainers and teams:

One-command CI setup with init-ci --all
GitHub PR comments for compatibility, drift, and security findings
Health score badges for public trust signals
Record/replay/verify workflows for regression testing
MCP server mode so agents can inspect other MCP servers directly
Production pilot path for hosted history, private repo reporting, certification, support, and fleet visibility

See public proof, the MCP safety report, the certification distribution loop, and commercial pilots.

Production / Enterprise

Free for local OSS use. Paid pilots are available for hosted reporting, private repo CI, security reports, production monitoring, certification, support, and MCP fleet visibility.

| Pilot | Starts At | Best Fit |
| --- | --- | --- |
| Team Pilot | $299/month | Small teams adding MCP checks to CI |
| Business Pilot | $999/month | Private repos and recurring security reports |
| Enterprise Pilot | $3,000/month | Production monitoring, support, and fleet visibility |
| Strategic Accounts | Custom, $250k+/year | Major companies running MCP in production |

Run `npx @kryptosai/mcp-observatory cloud` to see pilot options, or contact william@banksey.com about production MCP usage.

See commercial pilots, privacy and telemetry, and terms for production use. For a fuller narrative, see the project case study.

Quick Start

Scan every MCP server in your Claude config:

npx @kryptosai/mcp-observatory

Go deeper — also invoke safe tools to verify they actually run:

npx @kryptosai/mcp-observatory scan deep

Test a specific server:

npx @kryptosai/mcp-observatory test npx -y @modelcontextprotocol/server-everything

Add it to Claude Code as an MCP server:

claude mcp add mcp-observatory -- npx -y @kryptosai/mcp-observatory serve

Or add it manually to your config:

{
  "mcpServers": {
    "mcp-observatory": {
      "command": "npx",
      "args": ["-y", "@kryptosai/mcp-observatory", "serve"]
    }
  }
}

Commands

| Command | What it does |
| --- | --- |
| `scan` | Auto-discover servers from config files and check them all (default) |
| `scan deep` | Scan and also invoke safe tools to verify they execute |
| `test <cmd>` / `test --target <file>` | Test a specific server by command or target config |
| `record <cmd>` | Record a server session to a cassette file for offline replay |
| `replay <cassette>` | Replay a cassette offline — no live server needed |
| `verify <cassette> <cmd>` | Verify a live server still matches a recorded cassette |
| `diff <base> <head>` | Compare two run artifacts for regressions and schema drift |
| `watch <config>` | Watch a server for changes, alert on regressions |
| `suggest` | Detect your stack and recommend MCP servers from the registry |
| `serve` | Start as an MCP server for AI agents |
| `lock` | Snapshot MCP server schemas into a lock file |
| `lock verify` | Verify live servers match the lock file |
| `history` | Show health score trends for your MCP servers |
| `init-ci` | Create a GitHub Action and badge snippet for MCP compatibility/security checks |
| `ci-report` | Generate CI report for GitHub issue creation |
| `enterprise-report` | Generate a static production/security report from run artifacts |
| `score <cmd>` | Score an MCP server's health (0-100) |
| `badge <cmd>` | Generate an SVG health score badge for README |
| `cloud` | Show hosted reporting, production monitoring, and enterprise pilot options |

Run with no arguments for an interactive menu.

What It Does

Check capabilities — connects to a server and verifies tools, prompts, and resources respond correctly.

Invoke tools — goes beyond listing. Actually calls safe tools (no required params / readOnlyHint) and reports which ones work and which ones crash.

npx @kryptosai/mcp-observatory scan deep
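To make the "safe tools" idea concrete, here is a minimal sketch of how a tool list could be filtered before invocation. It is not Observatory's actual implementation. It assumes tool definitions shaped like an MCP `tools/list` result (`inputSchema`, optional `annotations.readOnlyHint`), and it assumes that either signal alone is enough to qualify a tool; the function name `is_safe_to_invoke` is hypothetical.

```python
from __future__ import annotations

from typing import Any


def is_safe_to_invoke(tool: dict[str, Any]) -> bool:
    """Return True when a tool looks like a candidate for a safe test call.

    Assumption (not confirmed by the README): a tool qualifies if its input
    schema lists no required parameters OR it carries readOnlyHint=True.
    """
    schema = tool.get("inputSchema") or {}
    required = schema.get("required") or []
    annotations = tool.get("annotations") or {}
    return len(required) == 0 or annotations.get("readOnlyHint") is True


# Illustrative tool definitions; the names are made up for this example.
tools = [
    {"name": "list_items", "inputSchema": {"type": "object", "properties": {}}},
    {
        "name": "read_file",
        "inputSchema": {"type": "object", "required": ["path"]},
        "annotations": {"readOnlyHint": True},
    },
    {"name": "delete_file", "inputSchema": {"type": "object", "required": ["path"]}},
]

safe_names = [tool["name"] for tool in tools if is_safe_to_invoke(tool)]
print(safe_names)
```

A stricter variant could require both signals at once; which rule Observatory applies is not stated here, so treat the `or` as a placeholder.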

Detect schema drift — diffs two runs and surfaces added/removed fields, type changes, and breaking parameter changes.

npx @kryptosai/mcp-observatory diff run-a.json run-b.json
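As a rough illustration of what schema drift detection involves, the sketch below compares two versions of a tool's JSON Schema input definition. It is an assumption-laden example, not the logic behind `diff`: the function name `diff_input_schema`, the result keys, and the rule for what counts as breaking (removed fields, type changes, newly required parameters) are all hypothetical.

```python
from __future__ import annotations

from typing import Any


def diff_input_schema(base: dict[str, Any], head: dict[str, Any]) -> dict[str, list[str]]:
    """Compare two JSON Schema objects and report field-level drift."""
    base_props = base.get("properties", {})
    head_props = head.get("properties", {})

    added = sorted(set(head_props) - set(base_props))
    removed = sorted(set(base_props) - set(head_props))
    # A field present on both sides whose declared type differs.
    type_changed = sorted(
        name
        for name in set(base_props) & set(head_props)
        if base_props[name].get("type") != head_props[name].get("type")
    )
    # Parameters callers must now supply that were optional or absent before.
    newly_required = sorted(set(head.get("required", [])) - set(base.get("required", [])))

    # Assumption: these three kinds of change can break existing callers.
    breaking = sorted(set(removed) | set(type_changed) | set(newly_required))
    return {
        "added": added,
        "removed": removed,
        "type_changed": type_changed,
        "newly_required": newly_required,
        "breaking": breaking,
    }


base = {
    "properties": {"path": {"type": "string"}, "limit": {"type": "integer"}},
    "required": ["path"],
}
head = {
    "properties": {
        "path": {"type": "string"},
        "limit": {"type": "string"},
        "encoding": {"type": "string"},
    },
    "required": ["path", "encoding"],
}

drift = diff_input_schema(base, head)
print(drift)
```

Here `encoding` is both added and newly required, and `limit` changed type, so both names land in `breaking`.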

Recommend servers — scans your project for languages, frameworks, databases, and cloud providers, then cross-references the MCP registry to suggest servers you're missing.

npx @kryptosai/mcp-observatory suggest

Or ask your agent "what MCP servers should I add?" when running in MCP server mode.

Security scanning — analyzes tool schemas for dangerous patterns: shell injection surfaces, broad filesystem access, missing auth, and credential leakage in responses.

npx @kryptosai/mcp-observatory test --security npx -y my-mcp-server

Record / replay / verify — capture a live session, replay it offline in CI, and verify nothing changed. Like VCR for MCP.

# Record a session
npx @kryptosai/mcp-observatory record npx -y @modelcontextprotocol/server-everything

# Replay offline (no server needed)
npx @kryptosai/mcp-observatory replay .mcp-observatory/cassettes/latest.cassette.json

# Verify the live server still matches
npx @kryptosai/mcp-observatory verify cassette.json npx -y @modelcontextprotocol/server-everything

Watch for regressions — re-runs checks on an interval and alerts when something changes.

npx @kryptosai/mcp-observatory watch target.json

Scan locations

When you run `scan`, Observatory looks for MCP configs in:

~/.claude.json (Claude Code)
~/Library/Application Support/Claude/claude_desktop_config.json (Claude Desktop, macOS)
%APPDATA%/Claude/claude_desktop_config.json (Claude Desktop, Windows)
.claude.json and .mcp.json (current directory)
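The discovery order above can be sketched as a small path-building helper. This is only an illustration of the listed locations, not Observatory's discovery code: the function names are hypothetical, and the choice to include the Windows path only when `APPDATA` is set is an assumption.

```python
from __future__ import annotations

import os
from pathlib import Path


def candidate_config_paths(home: Path, cwd: Path, appdata: str | None = None) -> list[Path]:
    """Build the list of config locations named in the README, in order."""
    paths = [
        home / ".claude.json",  # Claude Code
        # Claude Desktop, macOS
        home / "Library" / "Application Support" / "Claude" / "claude_desktop_config.json",
    ]
    if appdata:
        # Claude Desktop, Windows (assumed to apply only when APPDATA is set)
        paths.append(Path(appdata) / "Claude" / "claude_desktop_config.json")
    # Project-level configs in the current directory
    paths += [cwd / ".claude.json", cwd / ".mcp.json"]
    return paths


def existing_configs(paths: list[Path]) -> list[Path]:
    """Keep only the candidates that exist as files on this machine."""
    return [path for path in paths if path.is_file()]


if __name__ == "__main__":
    candidates = candidate_config_paths(Path.home(), Path.cwd(), os.environ.get("APPDATA"))
    for config in existing_configs(candidates):
        print(config)
```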

CI / GitHub Action

Add Observatory to your MCP server's CI pipeline:

npx @kryptosai/mcp-observatory init-ci --all --command "npx -y my-mcp-server"

Or create the workflow manually:

# .github/workflows/observatory.yml
name: MCP Server Check
on: [pull_request]

jobs:
  observatory:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: KryptosAI/mcp-observatory/action@main
        with:
          command: npx -y my-mcp-server
          security: true

Action inputs:

| Input | Description | Default |
| --- | --- | --- |
| `command` | Server command to test | (required if no `target`) |
| `target` | Path to target config JSON | |
| `targets` | Path to MCP config file for multi-server matrix scan | |
| `deep` | Also invoke safe tools | `false` |
| `security` | Run security analysis | `false` |
| `fail-on-regression` | Fail the action on issues | `true` |
| `fail-on-baseline-drift` | Fail the action when baseline verification detects drift | `true` |
| `comment-on-pr` | Post report as PR comment | `true` |
| `set-status` | Set a commit status check (green/red) on the HEAD SHA | `true` |
| `github-token` | Token for PR comments and commit statuses | `${{ github.token }}` |

The action runs checks on every PR, posts a Markdown report as a PR comment, and fails the check on regressions, which blocks the merge when that check is required by branch protection. See action/README.md for all options.

Production teams can add hosted CI history, private-repo reporting, security reports, production monitoring, support, and fleet visibility. Run `npx @kryptosai/mcp-observatory cloud` for pilot options.

Certified by MCP Observatory

MCP server maintainers can add a public compatibility/security signal to their README:

[![MCP Observatory](https://img.shields.io/badge/MCP%20Observatory-enabled-2563eb)](https://github.com/KryptosAI/mcp-observatory)

Or generate a score badge from a live check:

npx @kryptosai/mcp-observatory badge npx -y my-mcp-server --output docs/mcp-health.svg

See the certification distribution loop for the GitHub Action template, maintainer PR body, and badge rollout playbook.

Generate a pilot-ready production/security report from local run artifacts:

npx @kryptosai/mcp-observatory enterprise-report \
  --account "Your Company" \
  --format html \
  --output observatory-enterprise-report.html

For clearer internal account attribution in CI, set:

MCP_OBSERVATORY_ORG=your-company.com
MCP_OBSERVATORY_CONTACT=mcp-owner@your-company.com

Testing Feishu/Lark integrations? See the Feishu/Lark MCP guide.

Lock Files

npx @kryptosai/mcp-observatory lock              # Snapshot all server schemas
npx @kryptosai/mcp-observatory lock verify       # Verify no drift since last lock

Trend Tracking

npx @kryptosai/mcp-observatory history           # Show health trends over time

Nightly Scans

npx @kryptosai/mcp-observatory ci-report         # Generate regression report for CI

MCP Server Mode

Unlike the other testing tools listed under How It Compares, Observatory is itself an MCP server. Add it as a server and your AI agent can autonomously test, diagnose, and monitor your other MCP servers.

claude mcp add mcp-observatory -- npx -y @kryptosai/mcp-observatory serve

Your agent gets 9 tools:

| Tool | When to use it |
| --- | --- |
| `scan` | Check if all your configured MCP servers are healthy |
| `check_server` | Test a specific server before installing or after updating |
| `record` | Capture a baseline of a working server for future comparison |
| `replay` | Test against a recorded session — no live server needed |
| `verify` | Confirm a server update didn't break anything |
| `watch` | Check a server and see what changed since the last check |
| `diff_runs` | Find regressions between two check results |
| `get_last_run` | Retrieve previous check results for a server |
| `suggest_servers` | Discover MCP servers that match your project stack |

An AI tool that checks other AI tools. It's a tool testing tools that serve tools.*

* I'm a dude playing a dude disguised as another dude.

Security

The MCP server runs inside AI hosts where an LLM chooses which tools to call. To limit the damage a prompt-injection attack can do, server mode applies these restrictions:

Command allowlist: Only `npx`, `node`, `python`, `python3`, `uvx`, `docker`, `deno`, and `bun` are permitted as base executables.
Path validation: File-reading tools are constrained to the runs and cassettes directories.
No arbitrary execution: The CLI has no such restrictions; use it for unrestricted commands.
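The two server-mode guards can be pictured with a short sketch. This is a hedged illustration rather than Observatory's code: the function names are hypothetical, the rule that the first token must be a bare allowlisted name (no path prefix) is an assumption, and the example directories under `/tmp/obs-demo` are placeholders for wherever runs and cassettes are actually stored.

```python
from __future__ import annotations

import shlex
from pathlib import Path

# The executables named in the README's allowlist.
ALLOWED_EXECUTABLES = frozenset(
    {"npx", "node", "python", "python3", "uvx", "docker", "deno", "bun"}
)


def is_allowed_command(command: str) -> bool:
    """Accept a command line only if its first token is an allowlisted name.

    Assumption: a path such as /tmp/evil/node is rejected, because only the
    bare executable name is compared against the allowlist.
    """
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar parse errors
        return False
    return bool(tokens) and tokens[0] in ALLOWED_EXECUTABLES


def is_within_allowed_dirs(path: str | Path, allowed_dirs: list[Path]) -> bool:
    """Return True if the resolved path is inside one of the allowed directories."""
    resolved = Path(path).resolve()  # collapses ".." segments and symlinks
    for directory in allowed_dirs:
        root = directory.resolve()
        if resolved == root or root in resolved.parents:
            return True
    return False


# Hypothetical storage locations used only for this example.
demo_dirs = [Path("/tmp/obs-demo/runs"), Path("/tmp/obs-demo/cassettes")]

print(is_allowed_command("npx -y my-mcp-server"))
print(is_allowed_command("bash -c 'echo hi'"))
print(is_within_allowed_dirs("/tmp/obs-demo/runs/../../secret.txt", demo_dirs))
```

Resolving the path before comparing it is what defeats `..` traversal; a plain string-prefix check would not.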

CLI vs MCP: Intentional Differences

| Feature | CLI | MCP Server | Why |
| --- | --- | --- | --- |
| `watch` | Polling loop | Single check + diff | Request/response doesn't support long-polling |
| Interactive menu | Arrow-key navigation | Not available | MCP has no interactive UI |
| Color output | `--no-color` flag | Always plain text | MCP returns structured content |
| `report` | Renders saved artifacts | Not available | Agents read artifacts directly |
| `serve` | Starts MCP server | N/A | Is the MCP server |
| `run` | Reads target config files | Inline params | MCP tools accept params directly |
| `get_last_run` | Not available (use `ls` + `diff`) | Available | Convenience for agents |

Compatibility

Works with any MCP server that uses standard transports:

| Transport | Examples | Adapter |
| --- | --- | --- |
| stdio (most servers) | filesystem, memory, context7, brave-search, sentry, notion, stripe | `local-process` |
| HTTP/SSE (remote) | Cloudflare, Exa, Tavily | `http` |
| Docker | All `@modelcontextprotocol/server-*` images | `local-process` via `docker run -i` |

Servers that need API keys can receive them through `env` in the target config. Python servers can be launched with `uvx`. See the full compatibility matrix for tested servers and known issues.

Target config files

For more control (env vars, metadata, custom timeout):

{
  "targetId": "filesystem-server",
  "adapter": "local-process",
  "command": "npx",
  "args": ["-y", "@modelcontextprotocol/server-filesystem", "."],
  "timeoutMs": 15000,
  "skipInvoke": false
}

npx @kryptosai/mcp-observatory run --target ./target.json

HTTP / SSE targets

{
  "targetId": "my-remote-server",
  "adapter": "http",
  "url": "http://localhost:3000/mcp",
  "authToken": "${MCP_SERVER_TOKEN}",
  "headers": {
    "X-Api-Key": "$MCP_SERVER_API_KEY"
  },
  "timeoutMs": 15000
}

Target configs support `${VAR}`, `$VAR`, and `env:VAR` references in `authToken`, `headers`, and local-process `env` values.
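To show what resolving those three reference forms might look like, here is a minimal sketch. It is not the resolver Observatory ships: the function name `resolve_env_refs` is hypothetical, and two behaviors are assumptions, namely that `env:VAR` must be the entire value and that a missing variable raises an error instead of expanding to an empty string.

```python
from __future__ import annotations

import re
from typing import Mapping

# Matches ${VAR} (group 1) or $VAR (group 2).
_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}|\$([A-Za-z_][A-Za-z0-9_]*)")


def resolve_env_refs(value: str, env: Mapping[str, str]) -> str:
    """Expand ${VAR}, $VAR, and env:VAR references using the given environment.

    Assumptions: env:VAR is only recognized as the whole value, and an
    undefined variable raises KeyError so misconfiguration fails loudly.
    """
    if value.startswith("env:"):
        return env[value[len("env:"):]]
    return _ENV_REF.sub(lambda match: env[match.group(1) or match.group(2)], value)


# Example values; in practice the mapping would be os.environ.
example_env = {"MCP_SERVER_TOKEN": "t0ken", "MCP_SERVER_API_KEY": "k3y"}

print(resolve_env_refs("${MCP_SERVER_TOKEN}", example_env))
print(resolve_env_refs("$MCP_SERVER_API_KEY", example_env))
print(resolve_env_refs("env:MCP_SERVER_TOKEN", example_env))
```

Keeping secrets as references means the target file can be committed while the token itself stays in the CI environment.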

How It Compares

| Feature | Observatory | mcp-recorder | MCPBench | mcp-jest |
| --- | --- | --- | --- | --- |
| Auto-discover servers | ✅ | — | — | — |
| Check capabilities | ✅ | — | ✅ | ✅ |
| Invoke tools | ✅ | — | — | ✅ |
| Schema drift detection | ✅ | — | — | — |
| Record / replay | ✅ | ✅ | — | — |
| Verify against cassette | ✅ | — | — | — |
| Response snapshot diffs | ✅ | — | — | — |
| Benchmarking / latency | — | — | ✅ | — |
| Jest integration | — | — | — | ✅ |
| MCP proxy mode | — | ✅ | — | — |
| Works as MCP server | ✅ | — | — | — |

Each tool has strengths. Observatory focuses on regression detection and CI-friendly workflows. mcp-recorder is great as a transparent proxy. MCPBench is the go-to for performance benchmarking. mcp-jest is ideal if you're already in a Jest workflow.

Prior Art

The record/replay/verify pattern is inspired by:

VCR (Ruby) — pioneered cassette-based HTTP record/replay
Polly.js (Netflix) — HTTP interaction recording for JavaScript
mcp-recorder — MCP-specific traffic recording proxy
MCPBench — MCP server benchmarking
mcp-jest — Jest-style testing for MCP servers

Limitations

Servers requiring interactive OAuth (e.g., Google Drive) need pre-authentication before Observatory can connect
Custom WebSocket transports (e.g., BrowserTools MCP) are not supported
A few servers time out or close before init — see known issues and compatibility

Contributing

See CONTRIBUTING.md for guidelines. The fastest way to contribute is to add a real passing target with a distinct capability shape, a clearer report surface, or a cleaner startup diagnosis.

If Observatory saved you a broken deploy, consider giving it a star. It helps others find the project.
