feat: Improve startup time with init containers by adityathebe · Pull Request #363 · flanksource/mission-control-chart

adityathebe · 2026-03-24T14:58:10Z

We have a complex dependency between the pods installed by this chart.

postgres: kratos, config-db, canary-checker
kratos: mission-control

On first installation, it takes time for the db and kratos to come up.
We want to avoid running into the exponential backoff before the
dependencies are even ready

Summary by CodeRabbit

Chores
- Enhanced deployment startup reliability by adding initialization checks that wait for service dependencies to be available before starting main containers.
- Added health checks for mission-control service and PostgreSQL database connectivity during pod initialization.
- Configured Kratos authentication provider to wait for service readiness with 15-minute timeout.

We have a complex dependency between the pods installed by this chart. postgres: kratos, config-db, canary-checker kratos: mission-control On first installation, it takes time for the db and kratos to come up. We want to avoid running into the exponential backoff before the dependencies are even ready

coderabbitai · 2026-03-24T14:58:27Z

Walkthrough

Changes add startup dependency checks via init containers in Kubernetes deployments. The deployment template conditionally includes a wait-for-kratos init container when Kratos authentication is configured. Configuration values enable HTTP and TCP health checks for database and service dependencies before main containers start.

Changes

Cohort / File(s)	Summary
Init Container Configuration `chart/templates/deployment.yaml`, `chart/values.yaml`	Adds conditional init container to deployment for Kratos TCP health check, and configuration blocks for HTTP health checks (mission-control) and TCP checks (postgres) with 15-minute timeouts.
Dependency Management `chart/test/go.mod`	Promotes `github.com/chromedp/cdproto` from indirect to direct dependency.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately reflects the main change: adding init containers to the deployment to improve startup behavior by waiting for dependencies, as detailed in the objectives.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/improve-installation-duration

✨ Simplify code

Create PR with simplified code
Commit simplified code in branch feat/improve-installation-duration

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (3)

chart/values.yaml (2)
410-422: Init container image doesn't follow the global registry pattern.

The wait4x image is hardcoded to docker.io/atkrad/wait4x:3.6, whereas other images in this chart use .Values.global.imageRegistry. This could cause issues in air-gapped or restricted environments where docker.io is not accessible.

Consider making this configurable or documenting the external dependency.

445-457: Duplicated init container configuration.

This block is identical to the canary-checker init container (lines 410-422). If more components need this pattern, consider extracting a reusable template or helper, or documenting the pattern for maintainability.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@chart/values.yaml` around lines 445 - 457, The initContainer block for
wait-for-mission-control is duplicated with the canary-checker init container;
remove duplication by extracting the shared initContainer spec into a reusable
definition (e.g., a Helm helper template or a YAML anchor) and reference it from
both places instead of copying the block; specifically, create a reusable
snippet for the wait4x/http check and replace the concrete initContainer entries
for "wait-for-mission-control" and "canary-checker" with references to that
snippet (or call the helper), keeping the unique name fields (name:
wait-for-mission-control, name: canary-checker) while sharing the common
image/command configuration.
chart/templates/deployment.yaml (1)
35-46: Use HTTP health check with Kratos health endpoint for more robust readiness detection.

The init container uses a TCP check for kratos-public:80, but Kratos provides dedicated health endpoints. A TCP connection succeeding doesn't guarantee the service is fully ready to handle requests. Kratos's /health/ready endpoint provides more robust readiness detection by verifying both HTTP server availability and database connectivity.
💡 Suggested improvement using HTTP health check
       {{- if and (eq .Values.authProvider "kratos") .Values.kratos.enabled }}
       initContainers:
         - name: wait-for-kratos
           image: docker.io/atkrad/wait4x:3.6
           imagePullPolicy: IfNotPresent
           command:
             - wait4x
-            - tcp
-            - kratos-public:80
+            - http
+            - http://kratos-public:80/health/ready
+            - --expect-status-code
+            - "200"
             - -t
             - 15m
       {{- end }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@chart/templates/deployment.yaml` around lines 35 - 46, The initContainer
wait-for-kratos currently uses a TCP check which can report ready before Kratos
is fully initialized; update the initContainer named wait-for-kratos to use an
HTTP health probe against Kratos's readiness endpoint (e.g., /health/ready on
kratos-public:80) by switching the wait4x command from tcp to http and adding
the appropriate path and expected status options so the container only proceeds
when the HTTP readiness check succeeds; modify the initContainers block in
deployment.yaml (look for initContainers and the wait-for-kratos container
definition) to use wait4x http targeting /health/ready.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@chart/templates/deployment.yaml`:
- Around line 35-46: The initContainer wait-for-kratos currently uses a TCP
check which can report ready before Kratos is fully initialized; update the
initContainer named wait-for-kratos to use an HTTP health probe against Kratos's
readiness endpoint (e.g., /health/ready on kratos-public:80) by switching the
wait4x command from tcp to http and adding the appropriate path and expected
status options so the container only proceeds when the HTTP readiness check
succeeds; modify the initContainers block in deployment.yaml (look for
initContainers and the wait-for-kratos container definition) to use wait4x http
targeting /health/ready.

In `@chart/values.yaml`:
- Around line 445-457: The initContainer block for wait-for-mission-control is
duplicated with the canary-checker init container; remove duplication by
extracting the shared initContainer spec into a reusable definition (e.g., a
Helm helper template or a YAML anchor) and reference it from both places instead
of copying the block; specifically, create a reusable snippet for the
wait4x/http check and replace the concrete initContainer entries for
"wait-for-mission-control" and "canary-checker" with references to that snippet
(or call the helper), keeping the unique name fields (name:
wait-for-mission-control, name: canary-checker) while sharing the common
image/command configuration.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 886d66d0-3169-4806-a5bd-5189fea23e87

📥 Commits

Reviewing files that changed from the base of the PR and between e80802e and f70b96d.

⛔ Files ignored due to path filters (1)

chart/test/go.sum is excluded by !**/*.sum

📒 Files selected for processing (3)

chart/templates/deployment.yaml
chart/test/go.mod
chart/values.yaml

moshloop · 2026-03-24T20:57:54Z

+      {{- if and (eq .Values.authProvider "kratos") .Values.kratos.enabled }}
+      initContainers:
+        - name: wait-for-kratos
+          image: docker.io/atkrad/wait4x:3.6


I think this introduces 2 reliability issues:

Docker rate limiting will cause the instance to fail to start

A kratos failure will prevent mission-control starting i.e. moving it into the critical path

Maybe we should just copy the kraros seed scripts and run them ourselves?

coderabbitai Bot reviewed Mar 24, 2026

View reviewed changes

moshloop reviewed Mar 24, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Improve startup time with init containers#363

feat: Improve startup time with init containers#363
adityathebe wants to merge 1 commit into
mainfrom
feat/improve-installation-duration

adityathebe commented Mar 24, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Mar 24, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

moshloop Mar 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

adityathebe commented Mar 24, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

moshloop Mar 24, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

adityathebe commented Mar 24, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Mar 24, 2026 •

edited

Loading