gagan1510/kubeguard

KubeGuard

Helm-aware Kubernetes Pull Request Risk Analyzer — a GitHub App that analyzes Helm charts and rendered Kubernetes manifests in PRs, applies risk rules, scores findings, and posts inline comments plus a Check Run (pass/fail).

Features

  • Webhook-based: Processes pull_request (opened, synchronize, reopened)
  • Helm-first: Detects charts (Chart.yaml / templates/ / values.yaml), runs helm dependency build and helm template
  • No cluster access: Rendering only; no kubeconfig or helm install
  • Rule engine: 20+ rules across Resource Safety, Availability, Scheduling, Security, Networking, and Helm-specific
  • Risk scoring: Environment-aware (prod: stricter weights and threshold 70; nonprod: relaxed weights and threshold 85). See Risk scoring (prod vs nonprod) below.
  • PR output: Summary comment, optional inline comments, GitHub Check Run

Tech stack

  • Python 3.11+, FastAPI, Uvicorn, Pydantic, PyYAML, httpx, python-jose
  • Helm CLI in container for helm template / helm dependency build
  • Docker + Kubernetes-ready, stateless

Configuration (environment)

KubeGuard is configured via environment variables. You can set them in the shell or in a .env file (or pass a file to kubeguard-analyze serve --config).

Full reference: docs/configuration.md — all variables with types, defaults, and descriptions.

Summary:

| Category | Key variables |
|---|---|
| GitHub | GITHUB_APP_ID, GITHUB_PRIVATE_KEY, GITHUB_WEBHOOK_SECRET (App); or GITHUB_TOKEN (PAT for local testing) |
| Webhook | SKIP_WEBHOOK_SECRET_VERIFICATION (dev only), WEBHOOK_VERBOSE, KUBEGUARD_CONFIG_PATH |
| Risk | RISK_PROFILE, RISK_FAIL_THRESHOLD (prod, default 70), RISK_FAIL_THRESHOLD_NONPROD (default 85) |
| Helm | HELM_CHART_SOURCE_URL, HELM_CHART_PATH, HELM_CHART_SOURCE_REF; HELM_VALUES_PATH, HELM_VALUES_SOURCE_URL, HELM_VALUES_SOURCE_REF; HELM_ENVIRONMENT, PROD_VALUES_REPO_URL, NON_PROD_VALUES_REPO_URL |

Two-environment criticality: For prod, all rules run at full severity and the PR fails on any CRITICAL or score > 70. For nonprod, each finding’s severity is downgraded by one level and scoring is relaxed (LOW = 0 points; higher threshold 85). See Risk scoring below.

Risk scoring (prod vs nonprod)

| | Production | Nonprod |
|---|---|---|
| Severity | Findings keep original severity (LOW, MEDIUM, HIGH, CRITICAL). | Each finding is downgraded by one level (CRITICAL→HIGH, HIGH→MEDIUM, MEDIUM→LOW, LOW→LOW). The run-as-root rule is further set to MEDIUM in nonprod. |
| Points per finding | LOW = 5, MEDIUM = 10, HIGH = 20, CRITICAL = 40 | LOW = 0, MEDIUM = 5, HIGH = 10, CRITICAL = 20 |
| Total score | Sum of points (capped at 100) | Sum of points (capped at 100) |
| Risk level | 0–20 LOW, 21–50 MEDIUM, 51–80 HIGH, 81–100 CRITICAL | Same bands |
| Fail threshold | Any CRITICAL finding or score > 70 (RISK_FAIL_THRESHOLD) | Any CRITICAL finding (after downgrade) or score > 85 (RISK_FAIL_THRESHOLD_NONPROD) |

Example (nonprod): Four findings: 2 MEDIUM, 2 LOW. With the nonprod point scale (which already encodes the one-level downgrade), points are 2×5 + 2×0 = 10 → score 10 (LOW). No CRITICAL and 10 ≤ 85 → pass.

Example (prod): Same four findings at original severity. Prod points: 2×10 + 2×5 = 30 → score 30 (MEDIUM). No CRITICAL and 30 ≤ 70 → pass. If one of those were CRITICAL in prod, the run would fail regardless of score.
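The two worked examples above can be reproduced with a short Python sketch. This is an illustrative re-implementation of the documented tables, not KubeGuard's actual scorer: it takes severities at their original level and applies the per-environment point scale, 100-point cap, risk bands, and fail rule.

```python
# Illustrative sketch of the documented scoring tables (not the real scorer):
# per-environment points, 100-point cap, risk bands, and the fail rule
# (any CRITICAL finding, or score above the environment threshold).
POINTS = {
    "prod":    {"LOW": 5, "MEDIUM": 10, "HIGH": 20, "CRITICAL": 40},
    "nonprod": {"LOW": 0, "MEDIUM": 5,  "HIGH": 10, "CRITICAL": 20},
}
THRESHOLD = {"prod": 70, "nonprod": 85}

def risk_level(score):
    if score <= 20:
        return "LOW"
    if score <= 50:
        return "MEDIUM"
    if score <= 80:
        return "HIGH"
    return "CRITICAL"

def score_findings(severities, env):
    score = min(100, sum(POINTS[env][s] for s in severities))
    fail = "CRITICAL" in severities or score > THRESHOLD[env]
    return score, risk_level(score), fail
```

Running it on the examples above reproduces their results: `score_findings(["MEDIUM", "MEDIUM", "LOW", "LOW"], "nonprod")` gives `(10, "LOW", False)` and the same findings in `"prod"` give `(30, "MEDIUM", False)`.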

AWS Context Engine (optional): When ENABLE_AWS_CONTEXT=true, the engine validates AWS resources referenced in manifests (ACM, ELB, Secrets Manager, IRSA, ECR, SGP). It uses the default AWS credential chain. Set ENABLE_AWS_CHECKS=false to disable all AWS checks while keeping the engine enabled. Full AWS env list: docs/configuration.md#aws-context-engine.

ECR checks: Repo presence (aws-ecr-repo-missing) and tag presence (aws-ecr-tag-missing) can be toggled with ENABLE_ECR_REPO_CHECK and ENABLE_ECR_TAG_CHECK.

Three-repo setup: Install the app on the repo where you want PR comments/checks (e.g. an app repo). Set HELM_CHART_SOURCE_URL to the chart repo and HELM_VALUES_SOURCE_URL to the deployment repo. Chart and values are then loaded from those remotes; the PR repo is only used for posting results.

GitHub App setup

  1. Create a GitHub App with:

    • Repository permissions: Contents (Read), Pull requests (Read & Write), Checks (Read & Write), Metadata (Read)
    • Webhook: Subscribe to Pull requests; set URL to https://<your-host>/webhook
    • Webhook secret: Set and use as GITHUB_WEBHOOK_SECRET
  2. Install the app on the desired repos.

Local testing (no GitHub)

The CLI has three subcommands:

  • analyze — Run risk analysis against a local Helm chart and local values file(s) (no clone, no webhook).
  • test — Test a remote PR (GitHub API + clone) or a local repo using the resolver from .kubeguard.yaml (same flow as the webhook).
  • serve — Start the KubeGuard server to receive GitHub PR webhooks (POST /webhook). Use for local or self-hosted deployment.

1. Install

cd /path/to/kubeguard
pip install -e .

2. Run the CLI

Subcommand: analyze (chart + values paths)

Basic (chart + one or more values files):

python -m app.cli analyze --chart-path /path/to/chart --values /path/to/values.yaml
# or with the script name:
kubeguard-analyze analyze -c /path/to/chart -v /path/to/values.yaml

Multiple values files (comma-separated or multiple -v):

kubeguard-analyze analyze -c /path/to/chart \
  -v values.yaml \
  -v env/prod.yaml
# or one argument:
kubeguard-analyze analyze -c /path/to/chart -v "values.yaml,env/prod.yaml"

Example script (from repo root):

./scripts/run-test-analyze.sh

Subcommand: test (remote PR or local repo)

Test a PR or local repo using the resolver configured in .kubeguard.yaml (or --config). No dashboard or PR comments; results are printed to stdout. See Config file (.kubeguard.yaml) below for how to set up the config.

Remote PR (requires GITHUB_TOKEN in the environment):

export GITHUB_TOKEN=ghp_...
kubeguard-analyze test --repo org/repo --pr 123
# Optional: --config /path/to/.kubeguard.yaml --resolver single --format json --verbose

Local repo (no GitHub; use a directory that contains .kubeguard.yaml or pass --config):

kubeguard-analyze test --local-path ./path/to/repo [--branch main]
kubeguard-analyze test --local-path ./repo --config .kubeguard.yaml --format json --verbose

| Option | Description |
|---|---|
| --repo | Repository org/repo (required for remote PR; use with --pr) |
| --pr | PR number (required for remote PR) |
| --local-path | Local repo path (for offline mode; use instead of --repo/--pr) |
| --branch | Branch when using --local-path (default: main) |
| --config | Path to config file (default: .kubeguard.yaml in repo) |
| --resolver | Override resolver type: single, layered, helm_template, pattern |
| --format | text (default) or json |
| --verbose / -V | Verbose output to stderr; repeat for more detail (-V verbose, -VV debug) |
| --environment / -e | prod or nonprod for scoring (default: from HELM_ENVIRONMENT) |

Test exit codes:

| Code | Meaning |
|---|---|
| 0 | Pass (risk below threshold) |
| 1 | Validation fail (CRITICAL or score above threshold) |
| 2 | Resolver error (e.g. values/chart not found, helm template failed) |
| 3 | Config error (no .kubeguard.yaml or invalid config) |
| 4 | GitHub API failure (missing GITHUB_TOKEN or API error) |
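In CI you will typically branch on these exit codes. A minimal, illustrative sketch (the wrapped command and repo/PR values are placeholders):

```python
import subprocess

# Exit codes documented for `kubeguard-analyze test`.
EXIT_MEANINGS = {
    0: "pass",
    1: "validation fail",
    2: "resolver error",
    3: "config error",
    4: "github api failure",
}

def run_gate(cmd):
    """Run an analyzer command and return (exit_code, meaning)."""
    code = subprocess.run(cmd).returncode
    return code, EXIT_MEANINGS.get(code, "unknown")

# Hypothetical CI usage:
#   code, meaning = run_gate(["kubeguard-analyze", "test", "--repo", "org/repo", "--pr", "123"])
#   raise SystemExit(0 if code == 0 else 1)
```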

Config file (.kubeguard.yaml)

The test command (and the webhook when using the resolver) reads resolver config from the repo. Place a config file in the repository root (or pass --config /path/to/file). Accepted names: .kubeguard.yaml, .kubeguard.yml, or kubeguard.yaml.

The file must define a resolver under values.resolver. All paths are relative to the repo root (or the directory given by --local-path).

Example (minimal):

values:
  resolver: single
  path: helm/my-app/values.yaml
  chart_path: helm/my-app

Resolver types and required keys:

| Resolver | Description | Required config |
|---|---|---|
| single | One values file, then helm template → validate. | values.path (path to values file), values.chart_path (path to chart directory) |
| layered | Multiple values files; <app> in filenames is replaced by the repo/app name; files are deep-merged in order, then helm template. | values.base_path, values.files (list of filenames, e.g. ["<app>.yaml", "env-<app>.yaml"]), values.chart_path |
| helm_template | Same as layered but passes the resolved file list directly to helm template (no merge in Python). Best reflects real deployment. | values.chart_path, values.values_files (list with <app> placeholder, e.g. ["<app>.yaml", "env-<app>.yaml"]) |
| pattern | Discover values files by glob under a base path; deterministic sort, deep-merge, then helm template. | values.base_path, values.include_patterns (e.g. ["*.yaml"]), values.chart_path |
| central_chart | Chart from a central repo (clone by ref), values from PR repo; then helm template. Use for split chart repo + app values. | values.chart.repo_url, values.chart.chart_path, values.chart.ref (default main), values.values.base_path, values.values.files (list with <app> placeholder) |
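For the pattern resolver, which has no full example below, a minimal config following the required keys from the table might look like this (paths illustrative):

```yaml
values:
  resolver: pattern
  base_path: helm/values
  include_patterns:
    - "*.yaml"
  chart_path: helm/my-app
```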

Full example (layered resolver):

values:
  resolver: layered
  base_path: helm
  files:
    - "<app>.yaml"
    - "env-<app>.yaml"
    - "tag-<app>.yaml"
  chart_path: helm/my-app

Here <app> is replaced by the repository name (e.g. my-service for org/my-service), so the resolver looks for helm/my-service.yaml, helm/env-my-service.yaml, and helm/tag-my-service.yaml.
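The <app> substitution and the ordered deep merge can be sketched in Python. This is illustrative, not the actual resolver code; later files win on conflicting keys, and nested mappings merge recursively.

```python
import copy

def substitute_app(files, app):
    """Replace the <app> placeholder in each configured filename."""
    return [f.replace("<app>", app) for f in files]

def deep_merge(base, override):
    """Recursively merge override into base; returns a new dict, later values win."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged
```

For example, `substitute_app(["<app>.yaml", "env-<app>.yaml"], "my-service")` yields `["my-service.yaml", "env-my-service.yaml"]`, and merging `{"image": {"repo": "r", "tag": "1.0"}}` with `{"image": {"tag": "2.0"}}` keeps `repo` while overriding `tag`.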

Example (central_chart resolver — chart from central repo, values from PR):

values:
  resolver: central_chart
  source: pr
  chart:
    repo_url: git@github.com:platform/base-helm-charts.git
    chart_path: charts/base-app
    ref: main
  values:
    base_path: helm/
    files:
      - "<app>.yaml"
      - "env-<app>.yaml"
      - "tag-<app>.yaml"

Optional (central_chart auto-detect): you can omit the values: block and KubeGuard will infer it from the PR’s changed YAML files (it picks the directory with the most changed .yaml/.yml files, then includes the canonical set if present and any changed files in that directory). This requires that at least one values file is changed in the PR.
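The "directory with the most changed .yaml/.yml files" part of that heuristic can be sketched as follows (an illustration of the documented behavior, not the actual detection code):

```python
from collections import Counter
from pathlib import PurePosixPath

def guess_values_dir(changed_files):
    """Pick the directory containing the most changed .yaml/.yml files,
    or None if the PR changed no YAML files at all."""
    counts = Counter(
        str(PurePosixPath(f).parent)
        for f in changed_files
        if f.endswith((".yaml", ".yml"))
    )
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```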

Override from CLI: Use --config /path/to/config.yaml to load from a custom path, and --resolver single (or another type) to override the resolver type in the file.

3. CLI options (analyze)

| Option | Short | Description |
|---|---|---|
| --chart-path | -c | Path to Helm chart directory (required) |
| --values | -v | Path(s) to values file(s); repeat or comma-separate (required) |
| --namespace | -n | Namespace for helm template (default: default) |
| --environment | -e | prod or nonprod for criticality (default: from HELM_ENVIRONMENT) |
| --show-rendered | -R | Print rendered Helm template YAML in output |
| --json | | Output report as JSON |
| --verbose | -V | Verbose output to stderr; repeat for debug (-V or -VV) |

Examples:

# Staging/nonprod criticality (downgraded severities, higher fail threshold)
kubeguard-analyze analyze -c ./chart -v values.yaml -e nonprod

# Full report + rendered manifests
kubeguard-analyze analyze -c ./chart -v values.yaml --show-rendered

# JSON only (e.g. for scripting)
kubeguard-analyze analyze -c ./chart -v values.yaml --json

Example output (analyze): When you run the CLI against a local chart and values files, the report looks like this (paths and app names below are generic for illustration):

## Kubernetes Helm Risk Report (local)

Chart:       /path/to/charts/my-app
App:         my-app
Environment: nonprod
Values:      /path/to/deployments/staging/platform/values-my-service.yaml, /path/to/deployments/staging/platform/env-my-service.yaml, /path/to/deployments/staging/platform/tag-my-service.yaml
Score:       20/100 (LOW)

Passed checks

  - ✅ CPU request set
  - ✅ Memory request set
  - ✅ Requests equal limits
  - ✅ Multiple replicas
  - ✅ HPA min replicas OK
  - ✅ Readiness probe set
  - ✅ Readiness with liveness
  - ✅ Topology spread configured
  - ✅ Node selector set
  - ✅ Tolerations set
  - ✅ Not privileged
  - ✅ Host network not used
  - ✅ No host path volumes
  - ✅ No dangerous capabilities
  - ✅ Service type OK
  - ✅ Ingress has TLS
  - ✅ Replica count not default
  - ✅ PDB not disabled by default
  - ✅ HPA template present
  - ✅ Service type safe
  - ✅ No hardcoded namespace

  🟡 MEDIUM:
    - [Deployment/my-app] Container 'my-app' has no resource limits
    - [Deployment/my-app] Deployment has replicas > 1 but no PodDisruptionBudget found
    - [Deployment/my-app] Container 'my-app' has no securityContext
    - [Deployment/my-app] Container 'my-app' does not set runAsNonRoot: true

  🟢 LOW:
    - [Deployment/my-app] No podAntiAffinity (pods may schedule on same node)
    - [Deployment/my-app] No default resources in values.yaml
    - [Pod/my-app-test-connection] No podAntiAffinity (pods may schedule on same node)

Recommendation: Review findings and improve where needed.
  • Chart / Values: Paths you passed with -c and -v.
  • App: Inferred from the values file names (e.g. tag-my-service.yaml → my-service) or chart directory name.
  • Score: 0–100; risk level is LOW / MEDIUM / HIGH / CRITICAL.
  • Passed checks: Rules that ran and found no issue.
  • Findings: Grouped by severity (🔴 CRITICAL, 🟠 HIGH, 🟡 MEDIUM, 🟢 LOW) with resource kind/name and message.

Use --json for machine-readable output or --show-rendered to include the rendered Helm template.

4. Optional: AWS context (CLI)

To run AWS validators (ACM, ELB, Secrets, IAM, ECR, SGP) from the CLI, use credentials from the environment or an attached IAM role, then set env vars and run as usual:

export ENABLE_AWS_CONTEXT=true
export ENABLE_AWS_CHECKS=true   # default; set false to skip all AWS checks
export ENABLE_ECR_REPO_CHECK=true
export ENABLE_ECR_TAG_CHECK=true
export AWS_REGION=ap-south-1
export DEPLOY_ENV=staging   # or prod, dev
# optional:
# export AWS_STRICT_MODE=true
# export AWS_TIMEOUT_SECONDS=2
# For SGP VPC validation (optional but recommended when ENABLE_AWS_CHECKS=true):
# export PROD_CLUSTER_NAME=my-prod-eks
# export NONPROD_CLUSTER_NAME=my-nonprod-eks

# Credentials: use env vars (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) or
# let the default chain use the IAM role attached to the instance/task.
kubeguard-analyze analyze -c ./chart -v values.yaml

If any AWS call fails, you get a warning and the run still completes with K8s/Helm findings only.

5. Exit codes (analyze)

| Code | Meaning |
|---|---|
| 0 | Pass (risk below threshold) |
| 1 | Fail (CRITICAL finding or score above threshold) |
| 2 | Error (e.g. missing chart or values path) |

Examples: See examples/ for sample config and values.

Local run (server for GitHub PR webhooks)

Start KubeGuard so it can receive GitHub PR webhook events (e.g. for local testing or a self-hosted deployment).

Option 1 — CLI serve command (recommended):

pip install -e .
export GITHUB_APP_ID=... GITHUB_PRIVATE_KEY="..." GITHUB_WEBHOOK_SECRET=...
kubeguard-analyze serve
# Optional: custom host/port or auto-reload for development
kubeguard-analyze serve --host 0.0.0.0 --port 8000 --reload

# Verbose webhook logging (repo, PR, clone, score, conclusion)
export WEBHOOK_VERBOSE=true
kubeguard-analyze serve

| Option | Description |
|---|---|
| --host | Bind address (default: 0.0.0.0) |
| --port / -p | Port (default: 8000) |
| --reload | Auto-reload on code changes (development) |
| --config / -c | Path to env file (e.g. .env) to load before starting; existing env vars take precedence |

Example: kubeguard-analyze serve --config .env.production -p 8080
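The precedence rule for --config (variables already set in the environment win over the file) behaves like os.environ.setdefault. A sketch of that behavior, not the actual loader:

```python
import os

def load_env_file(path):
    """Load KEY=VALUE lines from an env file.
    Variables already present in the environment are NOT overwritten."""
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, malformed lines
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```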

Using the same config as test: If you have a resolver config file that works with kubeguard-analyze test --config .config.value, you can make the webhook use it by setting KUBEGUARD_CONFIG_PATH to that file. An example env file is .config.webhook.env:

# .config.webhook.env
KUBEGUARD_CONFIG_PATH=.config.value
# GITHUB_TOKEN=ghp_xxx
# GITHUB_WEBHOOK_SECRET=...

Then run: kubeguard-analyze serve --config .config.webhook.env (from the repo root so .config.value resolves). The webhook will load the resolver config from .config.value and run the same central_chart (or other) flow as test.

Option 2 — uvicorn directly:

uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Configure your GitHub App webhook URL to https://<your-host>/webhook (use HTTPS in production; for local dev you can use a tunnel like ngrok).

If you get {"detail":"Invalid signature"} (401):

  • The request must include the X-Hub-Signature-256 header (GitHub sends this when a webhook secret is set).
  • Set GITHUB_WEBHOOK_SECRET to the exact value from your GitHub App → Webhook → "Secret" (create one if missing). No quotes or extra spaces.
  • Ensure the server receives the raw request body (no re-encoding). Proxies that modify the body will break the signature.
  • For local testing only (e.g. curl without signing), you can skip verification:
    export SKIP_WEBHOOK_SECRET_VERIFICATION=true
    Do not use this in production.
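For reference, GitHub's X-Hub-Signature-256 header is an HMAC-SHA256 of the raw request body keyed by the webhook secret; verification looks roughly like this (a sketch, not KubeGuard's actual handler):

```python
import hashlib
import hmac

def verify_signature(secret: str, raw_body: bytes, signature_header: str) -> bool:
    """Compare GitHub's X-Hub-Signature-256 header against the raw-body HMAC."""
    expected = "sha256=" + hmac.new(
        secret.encode(), raw_body, hashlib.sha256
    ).hexdigest()
    # Constant-time comparison to avoid timing leaks.
    return hmac.compare_digest(expected, signature_header or "")
```

This is also why the raw body matters: any proxy that re-encodes the payload changes the digest and the comparison fails.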

Docker

docker build -t kubeguard .
docker run -p 8000:8000 \
  -e GITHUB_APP_ID=... \
  -e GITHUB_PRIVATE_KEY="..." \
  -e GITHUB_WEBHOOK_SECRET=... \
  kubeguard

Endpoints

  • POST /webhook — GitHub webhook (validates X-Hub-Signature-256)
  • GET /health — Health check
  • GET / — Service info

Documentation

Detailed docs are in the docs/ directory:

  • docs/configuration.md — All environment variables (main app + AWS) with types, defaults, and descriptions.
  • docs/architecture.md — Execution flow, config, data structures, and key modules.
  • docs/resolvers.md — All five resolvers (single, layered, helm_template, pattern, central_chart) with required config and when to use each.
  • docs/validators.md — Manifest rules, Helm rules, and AWS context validators with rule IDs and severities.

Project layout

app/
  main.py           # FastAPI app
  webhook.py        # Webhook handler + PR processing
  core/             # Engine (per spec: resolvers, validators, models)
    resolver/       # Resolvers: base + central_chart (helm resolvers in helm/resolver/)
    validators/     # Re-exports analyzer + rules
    models.py       # Re-exports app.models
  cli/              # CLI package: analyze and test
    run.py          # CLI implementation
    __main__.py     # Entry for python -m app.cli
  github/           # Auth, API client, pr_fetch (GITHUB_TOKEN)
  helm/             # Detector, renderer, chart loader, values
    resolver/       # Resolvers: single, layered, helm_template, pattern + factory
  parser/           # Manifest YAML parsing
  rules/            # Base + manifest + helm rules
  analyzer/         # Engine, scorer, correlator
  models/           # Config, RuleResult

Status

  • PRs that modify Helm charts trigger analysis
  • Helm templates are rendered; misconfigurations are detected via rules; risk score is computed
  • PR comment and GitHub Check Run reflect pass/fail
  • No cluster access required; service is stateless and suitable for concurrent PRs
