Helm-aware Kubernetes Pull Request Risk Analyzer — a GitHub App that analyzes Helm charts and rendered Kubernetes manifests in PRs, applies risk rules, scores findings, and posts inline comments plus a Check Run (pass/fail).
- Webhook-based: Processes `pull_request` events (opened, synchronize, reopened)
- Helm-first: Detects charts (Chart.yaml / templates/ / values.yaml), runs `helm dependency build` and `helm template`
- No cluster access: Rendering only; no kubeconfig or `helm install`
- Rule engine: 20+ rules across Resource Safety, Availability, Scheduling, Security, Networking, and Helm-specific
- Risk scoring: Environment-aware (prod: stricter weights and threshold 70; nonprod: relaxed weights and threshold 85). See Risk scoring (prod vs nonprod) below.
- PR output: Summary comment, optional inline comments, GitHub Check Run
- Python 3.11+, FastAPI, Uvicorn, Pydantic, PyYAML, httpx, python-jose
- Helm CLI in container for `helm template` / `helm dependency build`
- Docker + Kubernetes-ready, stateless
KubeGuard is configured via environment variables. You can set them in the shell or in a .env file (or pass a file to kubeguard-analyze serve --config).
Full reference: docs/configuration.md — all variables with types, defaults, and descriptions.
Summary:
| Category | Key variables |
|---|---|
| GitHub | GITHUB_APP_ID, GITHUB_PRIVATE_KEY, GITHUB_WEBHOOK_SECRET (App); or GITHUB_TOKEN (PAT for local testing) |
| Webhook | SKIP_WEBHOOK_SECRET_VERIFICATION (dev only), WEBHOOK_VERBOSE, KUBEGUARD_CONFIG_PATH |
| Risk | RISK_PROFILE, RISK_FAIL_THRESHOLD (prod, default 70), RISK_FAIL_THRESHOLD_NONPROD (default 85) |
| Helm | HELM_CHART_SOURCE_URL, HELM_CHART_PATH, HELM_CHART_SOURCE_REF; HELM_VALUES_PATH, HELM_VALUES_SOURCE_URL, HELM_VALUES_SOURCE_REF; HELM_ENVIRONMENT, PROD_VALUES_REPO_URL, NON_PROD_VALUES_REPO_URL |
Two-environment criticality: For prod, all rules run at full severity and the PR fails on any CRITICAL or score > 70. For nonprod, each finding’s severity is downgraded by one level and scoring is relaxed (LOW = 0 points; higher threshold 85). See Risk scoring below.
| | Production | Nonprod |
|---|---|---|
| Severity | Findings keep original severity (LOW, MEDIUM, HIGH, CRITICAL). | Each finding is downgraded by one level (CRITICAL→HIGH, HIGH→MEDIUM, MEDIUM→LOW, LOW→LOW). The run-as-root rule is further set to MEDIUM in nonprod. |
| Points per finding | LOW = 5, MEDIUM = 10, HIGH = 20, CRITICAL = 40 | LOW = 0, MEDIUM = 5, HIGH = 10, CRITICAL = 20 |
| Total score | Sum of points (capped at 100) | Sum of points (capped at 100) |
| Risk level | 0–20 LOW, 21–50 MEDIUM, 51–80 HIGH, 81–100 CRITICAL | Same bands |
| Fail threshold | Any CRITICAL finding or score > 70 (RISK_FAIL_THRESHOLD) | Any CRITICAL finding (after downgrade) or score > 85 (RISK_FAIL_THRESHOLD_NONPROD) |
Example (nonprod): Four findings: 2 MEDIUM, 2 LOW. After downgrade they stay MEDIUM and LOW. Nonprod points: 2×5 + 2×0 = 10 → score 10 (LOW). No CRITICAL and 10 ≤ 85 → pass.
Example (prod): Same four findings at original severity. Prod points: 2×10 + 2×5 = 30 → score 30 (MEDIUM). No CRITICAL and 30 ≤ 70 → pass. If one of those were CRITICAL in prod, the run would fail regardless of score.
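The weights and thresholds above can be sketched in a few lines of Python (an illustration of the documented rules, not the actual scorer; note that the nonprod point table already encodes the one-level downgrade — nonprod CRITICAL scores 20, the same as prod HIGH):

```python
# Sketch of the documented scoring rules (the real scorer lives in app/analyzer/).
POINTS = {
    "prod":    {"LOW": 5, "MEDIUM": 10, "HIGH": 20, "CRITICAL": 40},
    "nonprod": {"LOW": 0, "MEDIUM": 5,  "HIGH": 10, "CRITICAL": 20},
}
THRESHOLD = {"prod": 70, "nonprod": 85}

def score(severities, env):
    """severities: list of severity strings; env: 'prod' or 'nonprod'."""
    total = min(100, sum(POINTS[env][s] for s in severities))
    # Prod fails on any CRITICAL finding; both environments fail above the threshold.
    failed = (env == "prod" and "CRITICAL" in severities) or total > THRESHOLD[env]
    return total, failed

# The README's worked examples: 2 MEDIUM + 2 LOW findings.
print(score(["MEDIUM", "MEDIUM", "LOW", "LOW"], "nonprod"))  # (10, False)
print(score(["MEDIUM", "MEDIUM", "LOW", "LOW"], "prod"))     # (30, False)
```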
AWS Context Engine (optional): When ENABLE_AWS_CONTEXT=true, the engine validates AWS resources referenced in manifests (ACM, ELB, Secrets Manager, IRSA, ECR, SGP). It uses the default AWS credential chain. Set ENABLE_AWS_CHECKS=false to disable all AWS checks while keeping the engine enabled. Full AWS env list: docs/configuration.md#aws-context-engine.
ECR checks: Repo presence (aws-ecr-repo-missing) and tag presence (aws-ecr-tag-missing) can be toggled with ENABLE_ECR_REPO_CHECK and ENABLE_ECR_TAG_CHECK.
Three-repo setup: Install the app on the repo where you want PR comments/checks (e.g. an app repo). Set HELM_CHART_SOURCE_URL to the chart repo and HELM_VALUES_SOURCE_URL to the deployment repo. Chart and values are then loaded from those remotes; the PR repo is only used for posting results.
- Create a GitHub App with:
  - Repository permissions: Contents (Read), Pull requests (Read & Write), Checks (Read & Write), Metadata (Read)
  - Webhook: Subscribe to Pull requests; set URL to `https://<your-host>/webhook`
  - Webhook secret: Set and use as `GITHUB_WEBHOOK_SECRET`
- Install the app on the desired repos.
The CLI has three subcommands:
- `analyze` — Run risk analysis against a local Helm chart and local values file(s) (no clone, no webhook).
- `test` — Test a remote PR (GitHub API + clone) or a local repo using the resolver from `.kubeguard.yaml` (same flow as the webhook).
- `serve` — Start the KubeGuard server to receive GitHub PR webhooks (POST `/webhook`). Use for local or self-hosted deployment.
```bash
cd /path/to/kubeguard
pip install -e .
```

Basic (chart + one or more values files):

```bash
python -m app.cli analyze --chart-path /path/to/chart --values /path/to/values.yaml
# or with the script name:
kubeguard-analyze analyze -c /path/to/chart -v /path/to/values.yaml
```

Multiple values files (comma-separated or multiple -v):

```bash
kubeguard-analyze analyze -c /path/to/chart \
  -v values.yaml \
  -v env/prod.yaml
# or one argument:
kubeguard-analyze analyze -c /path/to/chart -v "values.yaml,env/prod.yaml"
```

Example script (from repo root):

```bash
./scripts/run-test-analyze.sh
```

Test a PR or local repo using the resolver configured in .kubeguard.yaml (or --config). No dashboard or PR comments; results are printed to stdout. See Config file (.kubeguard.yaml) below for how to set up the config.
Remote PR (requires GITHUB_TOKEN in the environment):

```bash
export GITHUB_TOKEN=ghp_...
kubeguard-analyze test --repo org/repo --pr 123
# Optional: --config /path/to/.kubeguard.yaml --resolver single --format json --verbose
```

Local repo (no GitHub; use a directory that contains .kubeguard.yaml or pass --config):

```bash
kubeguard-analyze test --local-path ./path/to/repo [--branch main]
kubeguard-analyze test --local-path ./repo --config .kubeguard.yaml --format json --verbose
```

| Option | Description |
|---|---|
| `--repo` | Repository org/repo (required for remote PR, use with `--pr`) |
| `--pr` | PR number (required for remote PR) |
| `--local-path` | Local repo path (for offline mode; use instead of `--repo`/`--pr`) |
| `--branch` | Branch when using `--local-path` (default: main) |
| `--config` | Path to config file (default: .kubeguard.yaml in repo) |
| `--resolver` | Override resolver type: single, layered, helm_template, pattern |
| `--format` | text (default) or json |
| `--verbose` / `-V` | Verbose output to stderr; repeat for more detail (-V verbose, -VV debug) |
| `--environment` / `-e` | prod or nonprod for scoring (default: from HELM_ENVIRONMENT) |
Test exit codes:
| Code | Meaning |
|---|---|
| 0 | Pass (risk below threshold) |
| 1 | Validation fail (CRITICAL or score above threshold) |
| 2 | Resolver error (e.g. values/chart not found, helm template failed) |
| 3 | Config error (no .kubeguard.yaml or invalid config) |
| 4 | GitHub API failure (missing GITHUB_TOKEN or API error) |
The test command (and the webhook when using the resolver) reads resolver config from the repo. Place a config file in the repository root (or pass --config /path/to/file). Accepted names: .kubeguard.yaml, .kubeguard.yml, or kubeguard.yaml.
The file must define a resolver under values.resolver. All paths are relative to the repo root (or the directory given by --local-path).
Example (minimal):

```yaml
values:
  resolver: single
  path: helm/my-app/values.yaml
  chart_path: helm/my-app
```

Resolver types and required keys:
| Resolver | Description | Required config |
|---|---|---|
| single | One values file, then helm template → validate. | `values.path` (path to values file), `values.chart_path` (path to chart directory). |
| layered | Multiple values files; `<app>` in filenames is replaced by the repo/app name; files are deep-merged in order, then helm template. | `values.base_path`, `values.files` (list of filenames, e.g. `["<app>.yaml", "env-<app>.yaml"]`), `values.chart_path`. |
| helm_template | Same as layered but passes the resolved file list directly to helm template (no merge in Python). Best reflects real deployment. | `values.chart_path`, `values.values_files` (list with `<app>` placeholder, e.g. `["<app>.yaml", "env-<app>.yaml"]`). |
| pattern | Discover values files by glob under a base path; deterministic sort, deep-merge, then helm template. | `values.base_path`, `values.include_patterns` (e.g. `["*.yaml"]`), `values.chart_path`. |
| central_chart | Chart from a central repo (clone by ref), values from PR repo; then helm template. Use for split chart repo + app values. | `values.chart.repo_url`, `values.chart.chart_path`, `values.chart.ref` (default main), `values.values.base_path`, `values.values.files` (list with `<app>` placeholder). |
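The deep merge used by the layered and pattern resolvers can be pictured with a minimal sketch. The merge semantics assumed here (nested mappings merged recursively, later files winning on conflicts) follow the usual Helm-style values merge; the actual implementation may differ in detail:

```python
def deep_merge(base, override):
    """Recursively merge override into base; later values win (sketch only)."""
    out = dict(base)
    for key, val in override.items():
        if isinstance(val, dict) and isinstance(out.get(key), dict):
            out[key] = deep_merge(out[key], val)  # recurse into nested mappings
        else:
            out[key] = val                        # scalars and lists are replaced
    return out

base = {"image": {"repository": "my-app", "tag": "1.0"}, "replicas": 2}
env  = {"image": {"tag": "1.1"}, "replicas": 3}
print(deep_merge(base, env))
# {'image': {'repository': 'my-app', 'tag': '1.1'}, 'replicas': 3}
```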
Full example (layered resolver):

```yaml
values:
  resolver: layered
  base_path: helm
  files:
    - "<app>.yaml"
    - "env-<app>.yaml"
    - "tag-<app>.yaml"
  chart_path: helm/my-app
```

Here `<app>` is replaced by the repository name (e.g. my-service for org/my-service), so the resolver looks for helm/my-service.yaml, helm/env-my-service.yaml, and helm/tag-my-service.yaml.
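That substitution is essentially a string replacement over the configured file list. A hypothetical sketch (the real resolver also verifies that each resolved file exists):

```python
def resolve_values_files(base_path, files, repo_name):
    """Expand <app> placeholders against the repo name (illustrative only)."""
    app = repo_name.split("/")[-1]  # "org/my-service" -> "my-service"
    return [f"{base_path}/{f.replace('<app>', app)}" for f in files]

print(resolve_values_files(
    "helm", ["<app>.yaml", "env-<app>.yaml", "tag-<app>.yaml"], "org/my-service"
))
# ['helm/my-service.yaml', 'helm/env-my-service.yaml', 'helm/tag-my-service.yaml']
```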
Example (central_chart resolver — chart from central repo, values from PR):

```yaml
values:
  resolver: central_chart
  source: pr
  chart:
    repo_url: git@github.com:platform/base-helm-charts.git
    chart_path: charts/base-app
    ref: main
  values:
    base_path: helm/
    files:
      - "<app>.yaml"
      - "env-<app>.yaml"
      - "tag-<app>.yaml"
```

Optional (central_chart auto-detect): you can omit the values: block and KubeGuard will infer it from the PR’s changed YAML files (it picks the directory with the most changed .yaml/.yml files, then includes the canonical set if present and any changed files in that directory). This requires that at least one values file is changed in the PR.
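The "directory with the most changed YAML files" step of that heuristic can be approximated as follows (a sketch with hypothetical helper names; the real auto-detect also applies the canonical-set logic described above):

```python
from collections import Counter
from pathlib import PurePosixPath

def guess_values_dir(changed_files):
    """Pick the directory containing the most changed .yaml/.yml files (sketch)."""
    counts = Counter(
        str(PurePosixPath(f).parent)
        for f in changed_files
        if f.endswith((".yaml", ".yml"))
    )
    return counts.most_common(1)[0][0] if counts else None

print(guess_values_dir([
    "helm/my-service.yaml",
    "helm/env-my-service.yaml",
    "src/main.py",          # not YAML, ignored
    "docs/readme.yaml",
]))  # helm
```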
Override from CLI: Use --config /path/to/config.yaml to load from a custom path, and --resolver single (or another type) to override the resolver type in the file.
| Option | Short | Description |
|---|---|---|
| `--chart-path` | `-c` | Path to Helm chart directory (required) |
| `--values` | `-v` | Path(s) to values file(s); repeat or comma-separate (required) |
| `--namespace` | `-n` | Namespace for helm template (default: default) |
| `--environment` | `-e` | prod or nonprod for criticality (default: from HELM_ENVIRONMENT) |
| `--show-rendered` | `-R` | Print rendered Helm template YAML in output |
| `--json` | | Output report as JSON |
| `--verbose` | `-V` | Verbose output to stderr; repeat for debug (-V or -VV) |
Examples:
```bash
# Staging/nonprod criticality (downgraded severities, higher fail threshold)
kubeguard-analyze analyze -c ./chart -v values.yaml -e nonprod

# Full report + rendered manifests
kubeguard-analyze analyze -c ./chart -v values.yaml --show-rendered

# JSON only (e.g. for scripting)
kubeguard-analyze analyze -c ./chart -v values.yaml --json
```

Example output (analyze): When you run the CLI against a local chart and values files, the report looks like this (paths and app names below are generic for illustration):
```
## Kubernetes Helm Risk Report (local)
Chart: /path/to/charts/my-app
App: my-app
Environment: nonprod
Values: /path/to/deployments/staging/platform/values-my-service.yaml, /path/to/deployments/staging/platform/env-my-service.yaml, /path/to/deployments/staging/platform/tag-my-service.yaml
Score: 20/100 (LOW)

Passed checks
- ✅ CPU request set
- ✅ Memory request set
- ✅ Requests equal limits
- ✅ Multiple replicas
- ✅ HPA min replicas OK
- ✅ Readiness probe set
- ✅ Readiness with liveness
- ✅ Topology spread configured
- ✅ Node selector set
- ✅ Tolerations set
- ✅ Not privileged
- ✅ Host network not used
- ✅ No host path volumes
- ✅ No dangerous capabilities
- ✅ Service type OK
- ✅ Ingress has TLS
- ✅ Replica count not default
- ✅ PDB not disabled by default
- ✅ HPA template present
- ✅ Service type safe
- ✅ No hardcoded namespace

🟡 MEDIUM:
- [Deployment/my-app] Container 'my-app' has no resource limits
- [Deployment/my-app] Deployment has replicas > 1 but no PodDisruptionBudget found
- [Deployment/my-app] Container 'my-app' has no securityContext
- [Deployment/my-app] Container 'my-app' does not set runAsNonRoot: true

🟢 LOW:
- [Deployment/my-app] No podAntiAffinity (pods may schedule on same node)
- [Deployment/my-app] No default resources in values.yaml
- [Pod/my-app-test-connection] No podAntiAffinity (pods may schedule on same node)

Recommendation: Review findings and improve where needed.
```
- Chart / Values: Paths you passed with `-c` and `-v`.
- App: Inferred from the values file names (e.g. tag-my-service.yaml → my-service) or chart directory name.
- Score: 0–100; risk level is LOW / MEDIUM / HIGH / CRITICAL.
- Passed checks: Rules that ran and found no issue.
- Findings: Grouped by severity (🔴 CRITICAL, 🟠 HIGH, 🟡 MEDIUM, 🟢 LOW) with resource kind/name and message.
Use --json for machine-readable output or --show-rendered to include the rendered Helm template.
To run AWS validators (ACM, ELB, Secrets, IAM, ECR, SGP) from the CLI, use credentials from the environment or an attached IAM role, then set env vars and run as usual:
```bash
export ENABLE_AWS_CONTEXT=true
export ENABLE_AWS_CHECKS=true   # default; set false to skip all AWS checks
export ENABLE_ECR_REPO_CHECK=true
export ENABLE_ECR_TAG_CHECK=true
export AWS_REGION=ap-south-1
export DEPLOY_ENV=staging       # or prod, dev

# optional:
# export AWS_STRICT_MODE=true
# export AWS_TIMEOUT_SECONDS=2

# For SGP VPC validation (optional but recommended when ENABLE_AWS_CHECKS=true):
# export PROD_CLUSTER_NAME=my-prod-eks
# export NONPROD_CLUSTER_NAME=my-nonprod-eks

# Credentials: use env vars (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) or
# let the default chain use the IAM role attached to the instance/task.
kubeguard-analyze analyze -c ./chart -v values.yaml
```

If any AWS call fails, you get a warning and the run still completes with K8s/Helm findings only.
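That fail-open behavior can be pictured as a wrapper along these lines (function names here are hypothetical, not KubeGuard's actual API):

```python
import logging

log = logging.getLogger("kubeguard.aws")

def run_aws_checks_safely(check_fns, manifests):
    """Run each AWS validator; on any error, warn and continue (fail-open sketch)."""
    findings = []
    for check in check_fns:
        try:
            findings.extend(check(manifests))
        except Exception as exc:  # credential/network/API errors must not fail the run
            log.warning("AWS check %s skipped: %s", check.__name__, exc)
    return findings

# Hypothetical validators to demonstrate the behavior:
def broken_check(_manifests):
    raise RuntimeError("no credentials")

def ok_check(_manifests):
    return [("aws-ecr-repo-missing", "MEDIUM")]

print(run_aws_checks_safely([broken_check, ok_check], []))
# [('aws-ecr-repo-missing', 'MEDIUM')]
```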
| Code | Meaning |
|---|---|
| 0 | Pass (risk below threshold) |
| 1 | Fail (CRITICAL finding or score above threshold) |
| 2 | Error (e.g. missing chart or values path) |
Examples: See examples/ for sample config and values.
Start KubeGuard so it can receive GitHub PR webhook events (e.g. for local testing or a self-hosted deployment).
Option 1 — CLI serve command (recommended):
```bash
pip install -e .
export GITHUB_APP_ID=... GITHUB_PRIVATE_KEY="..." GITHUB_WEBHOOK_SECRET=...
kubeguard-analyze serve

# Optional: custom host/port or auto-reload for development
kubeguard-analyze serve --host 0.0.0.0 --port 8000 --reload

# Verbose webhook logging (repo, PR, clone, score, conclusion)
export WEBHOOK_VERBOSE=true
kubeguard-analyze serve
```

| Option | Description |
|---|---|
| `--host` | Bind address (default: 0.0.0.0) |
| `--port` / `-p` | Port (default: 8000) |
| `--reload` | Auto-reload on code changes (development) |
| `--config` / `-c` | Path to env file (e.g. .env) to load before starting; existing env vars take precedence |
Example: kubeguard-analyze serve --config .env.production -p 8080
Using the same config as test: If you have a resolver config file that works with kubeguard-analyze test --config .config.value, you can make the webhook use it by setting KUBEGUARD_CONFIG_PATH to that file. An example env file is .config.webhook.env:
```
# .config.webhook.env
KUBEGUARD_CONFIG_PATH=.config.value
# GITHUB_TOKEN=ghp_xxx
# GITHUB_WEBHOOK_SECRET=...
```

Then run: kubeguard-analyze serve --config .config.webhook.env (from the repo root so .config.value resolves). The webhook will load the resolver config from .config.value and run the same central_chart (or other) flow as test.
Option 2 — uvicorn directly:
```bash
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

Configure your GitHub App webhook URL to https://<your-host>/webhook (use HTTPS in production; for local dev you can use a tunnel like ngrok).
If you get {"detail":"Invalid signature"} (401):
- The request must include the `X-Hub-Signature-256` header (GitHub sends this when a webhook secret is set).
- Set `GITHUB_WEBHOOK_SECRET` to the exact value from your GitHub App → Webhook → "Secret" (create one if missing). No quotes or extra spaces.
- Ensure the server receives the raw request body (no re-encoding). Proxies that modify the body will break the signature.
- For local testing only (e.g. curl without signing), you can skip verification with `export SKIP_WEBHOOK_SECRET_VERIFICATION=true`. Do not use this in production.
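For reference, GitHub computes X-Hub-Signature-256 as an HMAC-SHA256 of the raw request body keyed by the webhook secret, hex-encoded and prefixed with `sha256=`. A minimal verification sketch:

```python
import hashlib
import hmac

def verify_signature(secret: str, raw_body: bytes, signature_header: str) -> bool:
    """Check X-Hub-Signature-256 against HMAC-SHA256 of the raw request body."""
    expected = "sha256=" + hmac.new(
        secret.encode(), raw_body, hashlib.sha256
    ).hexdigest()
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(expected, signature_header)

body = b'{"action": "opened"}'
sig = "sha256=" + hmac.new(b"mysecret", body, hashlib.sha256).hexdigest()
print(verify_signature("mysecret", body, sig))         # True
print(verify_signature("mysecret", body + b" ", sig))  # False: modified body breaks it
```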
```bash
docker build -t kubeguard .
docker run -p 8000:8000 \
  -e GITHUB_APP_ID=... \
  -e GITHUB_PRIVATE_KEY="..." \
  -e GITHUB_WEBHOOK_SECRET=... \
  kubeguard
```

- `POST /webhook` — GitHub webhook (validates `X-Hub-Signature-256`)
- `GET /health` — Health check
- `GET /` — Service info
Detailed docs are in the docs/ directory:
- docs/configuration.md — All environment variables (main app + AWS) with types, defaults, and descriptions.
- docs/architecture.md — Execution flow, config, data structures, and key modules.
- docs/resolvers.md — All five resolvers (`single`, `layered`, `helm_template`, `pattern`, `central_chart`) with required config and when to use each.
- docs/validators.md — Manifest rules, Helm rules, and AWS context validators with rule IDs and severities.
```
app/
  main.py        # FastAPI app
  webhook.py     # Webhook handler + PR processing
  core/          # Engine (per spec: resolvers, validators, models)
    resolver/    # Resolvers: base + central_chart (helm resolvers in helm/resolver/)
    validators/  # Re-exports analyzer + rules
    models.py    # Re-exports app.models
  cli/           # CLI package: analyze and test
    run.py       # CLI implementation
    __main__.py  # Entry for python -m app.cli
  github/        # Auth, API client, pr_fetch (GITHUB_TOKEN)
  helm/          # Detector, renderer, chart loader, values
    resolver/    # Resolvers: single, layered, helm_template, pattern + factory
  parser/        # Manifest YAML parsing
  rules/         # Base + manifest + helm rules
  analyzer/      # Engine, scorer, correlator
  models/        # Config, RuleResult
```
- PRs that modify Helm charts trigger analysis
- Helm templates are rendered; misconfigurations are detected via rules; risk score is computed
- PR comment and GitHub Check Run reflect pass/fail
- No cluster access required; service is stateless and suitable for concurrent PRs