feat(approval): reviewer-aware auto-approval + in-app approve/decline for protected envs by krusche · Pull Request #1098 · ls1intum/Helios

krusche · 2026-05-28T09:20:12Z

Motivation

Today, when Helios dispatches a deployment to a GitHub environment with a required-reviewer protection rule, three things go wrong:

The webhook handler blindly impersonates the deployment creator and tries to approve, regardless of whether they're actually a required reviewer. When they aren't, GitHub returns 403 and the error is silently swallowed — the deployment sits in WAITING with no signal to the user.
The only way to actually resolve such a pending deployment is to leave Helios, open GitHub's UI, find the workflow run, and click Review pending deployments. Reviewers forget; valuable time is lost.
The Keycloak SSO session is hard-capped at 8h, so users effectively need to re-auth roughly daily — independent of the approval problem but symptomatic of the same "leaves Helios to do basic things" pain.

This PR is Phase 1 + Phase 2 of a focused approval flow that keeps everything inside Helios. Phase 3 (email one-click links) is intentionally out of scope and can land later.

What changes

Reviewer-aware auto-approval (Phase 1)

When GitHub fires deployment_status: waiting for a Helios-originated deployment, ApprovalService.reviewDeployment now branches on the env's protection rule:

Situation	Outcome	`HeliosDeployment.autoApprovalDecision`
No required-reviewer rule	Do nothing — GitHub handles any wait-timer / branch policy itself	`NOT_APPLICABLE`
Team-type reviewers present	Legacy "impersonate deployer, try anyway" (Helios doesn't yet expand teams to members; punting silently would regress working setups)	`TEAM_REVIEWER_FALLBACK`
`preventSelfReview` is on, or deployer not in User-type reviewer list	Defer; surface in the in-app pending-approvals list for someone else to resolve	`DEFERRED_TO_REVIEWERS`
Deployer is a User-type required reviewer	Auto-approve as them via Keycloak impersonation, with a source-tagged comment so the GitHub audit trail says `Auto-approved by Helios (deployer is required reviewer)`	`AUTO_APPROVED`

A bounded retry (up to 3 attempts, 100 ms + 300 ms backoff = 400 ms worst case) handles the race where the webhook arrives before the deploy-side transaction commits. The service is idempotent against webhook redelivery: once autoApprovalDecision is non-null, a second invocation is a no-op.

In-app approve / decline (Phase 2)

Two endpoints, gated on "current user is in this env's User-type required reviewer list":

POST /api/deployments/{id}/approve — impersonates the caller, approves on GitHub, marks any sibling pending requests CONSUMED_BY_OTHER in the same transaction.
POST /api/deployments/{id}/decline body {comment?} — same but state: rejected (the workflow run is killed for everyone; the UI surfaces that warning explicitly).
GET /api/deployments/pending-approvals — caller's outstanding requests, joined with deployment + env + repo summary.

Concurrency: SELECT FOR UPDATE on the HeliosDeployment row serialises concurrent clicks; only one reviewer's GitHub call ever lands. FAILED_AT_GITHUB distinguishes "clicked Approve" from "GitHub accepted Approve", letting the UI offer a clean retry. The transaction is annotated noRollbackFor = ResponseStatusException.class so the FAILED_AT_GITHUB audit row survives the 502 throw — without this, Spring's default rollback would silently discard the row.

`DeploymentApprovalRequest` entity (Phase 1 → 3)

One row per (deployment, reviewer) tracking the lifecycle (PENDING → APPROVED | DECLINED | EXPIRED | CONSUMED_BY_OTHER | FAILED_AT_GITHUB), tagged with via in {AUTO, IN_APP, EMAIL_LINK}. Auto rows are created in terminal state. Email-link Phase 3 will populate tokenHash (SHA-256 of the link token, plaintext only in the URL). Flyway V53 adds the table plus auto_approval_decision / auto_approval_at on helios_deployment for fast UI lookups + at-a-glance audit. ApprovalDecisionWriter atomically commits the audit row + deployment stamp together so a crash between the two writes cannot leave a half-state.

UI

New cross-repo /pending-approvals page (Approve / Decline per row, decline opens a dialog with an optional reason and the "this kills the workflow for everyone" warning). Badge on the profile button + a Pending approvals entry in the profile popover, both driven by the same 30 s-polled query — visible across every page since the queue is cross-repo.

Realm session lifespans

ssoSessionMaxLifespan 8 h → 30 d, ssoSessionIdleTimeout 8 h → 7 d on helios-example-realm.json. Prod Keycloak must mirror these for the user-visible effect; the example file is only the dev/staging seed.

GitHubService surface

approveDeploymentOnBehalfOfUser gains a comment parameter so the source of every approval (auto / in-app / email-link) is captured in the GitHub-side audit comment. A new rejectDeploymentOnBehalfOfUser(...) implements the decline path. The legacy 4-arg overload still exists. Token-exchange failure now throws IOException instead of silently returning, so callers can react.

Intentionally NOT in this PR

Email one-click approve / decline links (Phase 3). The plan lays out a token table (token_hash is already on DeploymentApprovalRequest, ready for use), but no Notification payload or template yet.
Team-type reviewer expansion (Phase 4). Today's "impersonate deployer, try anyway" fallback runs when team-type reviewers are present; this preserves existing setups rather than silently no-opping. Phase 4 will resolve team membership via GET /orgs/{org}/teams/{slug}/members (cached).
Pre-deploy confirmation modal. Mentioned in the brief; deferred. Phase 2's pending-approvals queue plus the source-tagged audit comments already cover the core requirement ("don't make me leave Helios").
deployment_review webhook sync. Approvals that happen outside Helios in the GitHub UI won't sync back to the pending list until Phase 4. The list is correct for Helios-driven deployments, which is the bulk of usage.
Admin bypass. GitHub allows repo admins to bypass reviewer rules in its UI; Helios does not replicate this in code.

Testing instructions

Prerequisites

Helios dev stack running locally (server on :8080, client on :4200, Postgres, Keycloak, NATS, GitHub-app webhook reachable).
A connected GitHub repository in Helios with at least one environment that has a required-reviewer protection rule of the User type (Settings → Environments → … → Deployment protection rules → Required reviewers).
Add your GitHub user as a required reviewer of that environment. For the cross-user tests, also add a second user (or use a colleague's account in a second browser profile).
Make sure your Helios login has gone through the GitHub IDP at least once so Keycloak holds a GitHub token for you (Keycloak Admin Console → realm → Users → you → Identity Provider Links → github should show a stored token).
Have a trivial deployment workflow on the repo's default branch — anything that targets the protected environment (a sleep 5 job is fine) — so you don't need to wait for a long build between attempts.

Quick DB peek commands used below (adjust connection params for your dev stack):

# Latest deployment-approval audit rows
psql -h localhost -U helios -d helios -c \
  "select id, helios_deployment_id, state, via, reviewer_login, responded_at, failure_reason
   from deployment_approval_request order by id desc limit 5;"

# Latest helios_deployment auto-approval decisions
psql -h localhost -U helios -d helios -c \
  "select id, status, auto_approval_decision, auto_approval_at
   from helios_deployment order by id desc limit 5;"

Server unit tests

cd server && ./gradlew :application-server:test \
  --tests "*ApprovalServiceTest" \
  --tests "*DeploymentReviewActionServiceTest" \
  --tests "*GitHubServiceTest"

These exercise: deployer-is-reviewer auto-approve, defer when preventSelfReview, defer when deployer not reviewer, no-rule no-op, team-fallback path, GitHub failure → FAILED_AT_GITHUB, persist/webhook race retry, missing workflowRunId skip, idempotency on replayed webhook, in-app happy-approve / happy-decline / not-a-reviewer-403 / preventSelfReview-403 / already-resolved-409 / no-runId-409 / no-rule-409 / GitHub-fail-502 / sibling-CONSUMED_BY_OTHER. (208 server tests total; all green.)

End-to-end test matrix

The matrix below is what a reviewer should walk through to confirm the feature actually does what the PR promises. Each test takes ~1 min once the prerequisites are in place.

Test 1 — Auto-approval when the deployer is a required reviewer (the headline)

Setup: the protected env's required reviewers include your account; preventSelfReview is off.

In Helios, open Repo → Environments and click Deploy on the protected environment.
Wait for the workflow run to reach the protected job (a few seconds with a trivial workflow).

Expected:

The GitHub workflow run resumes automatically within ~1 s of hitting the waiting state — no clicks needed.
On GitHub: open the workflow run → "Review pending deployments" log shows comment Auto-approved by Helios (deployer is required reviewer).
DB: helios_deployment.auto_approval_decision = AUTO_APPROVED, auto_approval_at set; one new deployment_approval_request row with state=APPROVED, via=AUTO, reviewer_login=<you>.
Server log: Auto-approved deployment {id} on behalf of @<you> (AUTO_APPROVED).

Test 2 — Deferral + in-app Approve (the core Phase 2 path)

Setup: the protected env's required reviewer is someone else (not you). Easiest: temporarily replace yourself as required reviewer with a colleague's account, or use a second test account in a second browser profile.

As you (not a reviewer), click Deploy.
Wait until the workflow reaches the protected job.
Switch to the reviewer's browser. Click the profile avatar (top-right).

Expected at the reviewer's side:

A warning badge appears on the profile button (within ~30 s — that's the poll interval).
The dropdown shows a Pending approvals entry with the same badge count.
Clicking it lands on /pending-approvals and the deployment is in the table with env, repo (clickable), the deployer's login, branch, commit SHA, and a link to the GitHub run.

As the reviewer, click Approve on that row → confirm in the dialog.

Expected:

The mutation returns 200; a green toast says "Approved".
On GitHub the run resumes; the "Review pending deployments" log shows Approved by @<reviewer> via Helios (in-app).
DB: the audit row state=APPROVED, via=IN_APP, responded_at set; helios_deployment.auto_approval_decision=DEFERRED_TO_REVIEWERS (the original deferral decision is preserved as the audit anchor).
The badge on the reviewer's profile clears within ~30 s.

Test 3 — In-app Decline rejects the workflow for everyone

Setup: same as Test 2 — a pending approval the reviewer can act on.

As the reviewer, click Decline on a row.
The dialog should warn explicitly: "Declining will reject the workflow run on GitHub — the deployment will not happen and the run is marked as failed."
Optionally type a reason ("wrong branch"). Click Decline deployment.

Expected:

The dialog closes; a warn-severity toast says "Declined".
The GitHub workflow run goes to failure / rejected state (not just paused).
GitHub's "Review pending deployments" log shows Declined by @<reviewer> via Helios (in-app): wrong branch (or without : … if no reason).
DB: state=DECLINED, via=IN_APP, responded_at set.

Test 4 — `preventSelfReview` honored

Setup: the env's required reviewers include your account; preventSelfReview is on.

As yourself, click Deploy.
Wait for the waiting event.

Expected:

GitHub run does not resume — no auto-approval.
DB: auto_approval_decision = DEFERRED_TO_REVIEWERS.
/pending-approvals for you does not show this row (you can't self-review).
If a second required reviewer exists, they can resolve it through the same Test-2 / Test-3 paths.

Test 5 — Not-a-required-reviewer can't approve

As a logged-in user who is not in the env's required-reviewer list, attempt the API directly:

curl -i -X POST http://localhost:8080/api/deployments/{deployment-id}/approve \
  -H "Authorization: Bearer $KEYCLOAK_BEARER"

Expected: HTTP 403 Forbidden, body contains You are not a required reviewer for this environment. The UI doesn't expose the button to non-reviewers, so this guards direct API misuse.

Test 6 — Already-resolved → 409 Conflict

After Test 2 succeeds, hit the same deployment id again:

curl -i -X POST http://localhost:8080/api/deployments/{same-id}/approve \
  -H "Authorization: Bearer $KEYCLOAK_BEARER"

Expected: HTTP 409 Conflict, body contains This deployment has already been resolved. Also confirms the first-responder race protection: if two reviewers had clicked at the same time, the slower one would see this exact response.

Test 7 — Idempotency on webhook redelivery (the subtle one)

Setup: a deployment from Test 1 (already AUTO_APPROVED).

GitHub → repo Settings → Webhooks → (Helios webhook) → Recent Deliveries → find the deployment_status event with state=waiting for your run → Redeliver.

Expected:

The Helios server log shows Skipping reviewDeployment for {id}; already decided as AUTO_APPROVED.
No new deployment_approval_request row is created (still exactly one APPROVED row).
helios_deployment.auto_approval_decision is not overwritten — it stays AUTO_APPROVED.
No new GitHub API call (look for absence of an approveDeploymentOnBehalfOfUser log line).

Test 8 — Env without required reviewers is left alone

Setup: deploy to an env that has no required-reviewer rule (a wait-timer-only env, or no protection rules at all).

Click Deploy.

Expected:

GitHub handles the deployment per its own rules (wait-timer expires, or no gate).
Helios stamps auto_approval_decision = NOT_APPLICABLE.
No deployment_approval_request rows are created.
Server log: No required-reviewer rule for environment {name}; leaving deployment {id} to GitHub.

Visual / UX smoke

The /pending-approvals page handles empty list ("Nothing waiting on you right now.") and loading skeleton correctly.
The Approve / Decline buttons disable while a mutation is in flight (no double-submits possible by spam-clicking).
The decline dialog blocks until the user types a reason or clicks "Decline deployment" with no reason — both should work; the GitHub-side comment changes accordingly.
The badge on the profile button updates within ~30 s of the underlying list changing (refetch interval).

Operator notes

Mirror the realm session lifespans on prod Keycloak: Admin Console → Realm Settings → Sessions → SSO Session Idle: 7 d, SSO Session Max: 30 d. The realm-export json in this PR is only the dev/staging seed.
Confirm the GitHub IDP on the prod realm still has Store Tokens + Stored Tokens Readable enabled (these are baked into the example realm; verify on prod).
helios.clientBaseUrl per profile is used today by the email pipeline; no change here, but Phase 3 will rely on it.

Phase 1 of the in-Helios deployment approval flow. When GitHub fires deployment_status: waiting for a Helios-originated deployment, decide what to do based on the env's protection rules instead of blindly impersonating the creator (which silently 403'd today when the creator wasn't actually a required reviewer). The new branching in ApprovalService.reviewDeployment: - No required-reviewer rule on the env → NOT_APPLICABLE (GitHub handles any wait-timer / branch-policy gating itself). - Team-type reviewers present → TEAM_REVIEWER_FALLBACK: legacy "impersonate deployer, try anyway" path (Helios doesn't yet expand teams to members; punting silently would regress working setups). - preventSelfReview is set, or the deployer isn't in the User-type reviewer list → DEFERRED_TO_REVIEWERS: deployment stays WAITING on GitHub; Phase 2 will surface it in an in-app pending-approvals list, Phase 3 will email reviewers. - Otherwise → AUTO_APPROVED on behalf of the deployer with a source-tagged comment ("Auto-approved by Helios (deployer is required reviewer)") so the audit trail in GitHub distinguishes auto vs in-app vs email-link. Companion changes: - DeploymentApprovalRequest entity + Flyway V53: one row per (deployment, reviewer) tracking the decision, with state machine including FAILED_AT_GITHUB (clicked-but-not-yet-accepted-by-GitHub is distinguishable from "missed click") and Via in {AUTO, IN_APP, EMAIL_LINK}. Indices on (deployment, state), (reviewer, state), and a partial unique on token_hash for Phase 3. - HeliosDeployment gains PENDING_DISPATCH status (reserved for a future refactor; today's persist-before-dispatch is already in place) plus autoApprovalDecision + autoApprovalAt for at-a-glance audit on the deployment row. - EnvironmentService.resolveReviewers / isUserRequiredReviewer: leaner hot-path companions to the DTO-returning getEnvironmentReviewers. - GitHubService.approveDeploymentOnBehalfOfUser gains a comment parameter (legacy 4-arg overload preserved for callers passing the default). New rejectDeploymentOnBehalfOfUser(...) for Phase 2's decline path. Token-exchange failure now throws IOException instead of silently returning, so callers can surface FAILED_AT_GITHUB. - A bounded retry against the persist-vs-webhook race: three lookups for the HeliosDeployment with 100ms/500ms/2s backoff before treating the event as a non-Helios deployment. - Keycloak realm: ssoSessionMaxLifespan 8h→30d, ssoSessionIdleTimeout 8h→7d (the actual root cause of users getting bounced back through the GitHub login roughly daily). Prod Keycloak must mirror these. Tests: new ApprovalServiceTest covering each branch + retry + GitHub failure recovery; updated GitHubServiceTest to reflect the new "throws on token-exchange failure" semantics and exercise both the comment parameter and the reject path.

Phase 2 of the in-Helios deployment approval flow. Adds a cross-repo pending-approvals page where a logged-in required reviewer can resolve deployments waiting for them without ever opening the GitHub UI. Server: - DeploymentReviewActionService.approveAsCurrentUser / declineAsCurrentUser take a pessimistic write lock on the HeliosDeployment row, validate the caller is a User-type required reviewer for the env (and not the deployer when preventSelfReview is on), then impersonate the caller via the Keycloak token-exchange broker to POST state=approved/rejected to GitHub's /actions/runs/{id}/pending_deployments. On GitHub failure, persist FAILED_AT_GITHUB on the audit row and return 502 so the UI can offer a retry without conflating "user clicked Approve" with "GitHub accepted Approve". - First-responder race: every successful click flips sibling PENDING rows for the same deployment to CONSUMED_BY_OTHER inside the same transaction, so the loser sees a clean "already handled" message. - HTTP surface: GET /api/deployments/pending-approvals (caller's outstanding requests, joined with deployment + env + repo for the UI table), POST /api/deployments/{id}/approve, POST /api/deployments/{id}/decline {comment?}. - HeliosDeploymentRepository.findByIdForUpdate (pessimistic write lock) for the lock target. DeploymentApprovalRequestRepository gains findPendingForReviewer (join-fetched for the list) and lockByDeploymentAndReviewerLogin for transactional updates. Client: - New /pending-approvals route + page. Table with Approve / Decline per row, decline opens a dialog with an optional reason and an explicit "this will reject the workflow run on GitHub for everyone" warning. Polls every 30 s so a reviewer who keeps the tab open sees new pending items. - Badge on the profile button + a "Pending approvals" entry in the profile popover, both driven by the same query — visible across every page (the queue is cross-repo). Tests: DeploymentReviewActionServiceTest covering the eight code paths (happy approve / happy decline / not-a-reviewer 403 / preventSelfReview 403 / already-resolved 409 / no-runId 409 / no-rule 409 / GitHub failure 502 / sibling consumption); spec for ProfileNavSectionComponent updated to provide a QueryClient for the new badge query.

codacy-production · 2026-05-28T09:21:14Z

Not up to standards ⛔

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🔴 Metrics 191 complexity

Metric Results

Complexity ⚠️ 191 (≤ 20 complexity)

View in Codacy

🟢 Coverage 65.07% diff coverage · +0.51% coverage variation

Metric Results

Coverage variation ✅ +0.51% coverage variation (-1.00%)

Diff coverage ✅ 65.07% diff coverage

View coverage diff in Codacy

Coverage variation details

Coverable lines Covered lines Coverage

Common ancestor commit (23c77b6) 15036 7250 48.22%

Head commit (dfbe636) 15393 (+357) 7500 (+250) 48.72% (+0.51%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

Coverable lines Covered lines Diff coverage

Pull request (#1098) 375 244 65.07%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

_{NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer}
_{TIP This summary will be updated as you push new changes.}

Copilot

Pull request overview

This PR introduces Phase 1 + Phase 2 of an in-Helios deployment-approval flow for protected GitHub environments. It replaces today's blind "impersonate the deployer" auto-approval with a branch that respects the env's required-reviewer rule, and adds an in-app approve/decline UI so reviewers no longer need to bounce to the GitHub UI. It also extends the Keycloak example realm's SSO session lifespans (8h → 7d idle / 30d max).

Changes:

Adds DeploymentApprovalRequest entity + Flyway V53 (with auto_approval_decision / auto_approval_at on helios_deployment); ApprovalService now branches on ReviewerResolution with a bounded retry for the webhook-vs-dispatch race.
Adds DeploymentReviewActionService + DeploymentApprovalController (/api/deployments/{id}/approve|decline, /pending-approvals) with SELECT FOR UPDATE row locking; GitHubService.approveDeploymentOnBehalfOfUser gains a comment parameter and a new rejectDeploymentOnBehalfOfUser; token-exchange null now throws IOException.
Adds the /pending-approvals Angular page, regenerated openapi client, profile-nav badge polling, and bumps helios-example-realm.json SSO session lifespans.

Reviewed changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
server/.../approval/ApprovalService.java	Reviewer-aware auto-approval branching + 3-attempt persist retry
server/.../approval/DeploymentReviewActionService.java	In-app approve/decline with pessimistic deployment-row lock + sibling consume
server/.../approval/DeploymentApprovalController.java	REST endpoints for approve / decline / pending-approvals
server/.../approval/DeploymentApprovalRequest{,Repository}.java	New entity + repo with locked-by-reviewer and pending-for-reviewer queries
server/.../approval/PendingApprovalDto.java, ReviewDeploymentRequest.java	DTOs for list + decline body
server/.../environment/EnvironmentService.java	`resolveReviewers` / `isUserRequiredReviewer` helpers + `ReviewerResolution` record
server/.../github/GitHubService.java	New `comment` overload, `rejectDeploymentOnBehalfOfUser`, null token now throws
server/.../heliosdeployment/HeliosDeployment.java	`autoApprovalDecision` field, `AutoApprovalDecision` enum, `PENDING_DISPATCH` status
server/.../heliosdeployment/HeliosDeploymentRepository.java	`findByIdForUpdate` with `PESSIMISTIC_WRITE` lock
server/.../deployment/LatestDeploymentUnion.java	Maps new `PENDING_DISPATCH` status to `REQUESTED`
server/.../db/migration/V53__deployment_approval_request.sql	Creates table, indexes, and helios_deployment columns
server/.../openapi.yaml	Adds approval endpoints and DTO schemas
server/.../GitHubServiceTest.java	Updates token-exchange-null assertion, adds comment-passthrough and reject tests
server/.../ApprovalServiceTest.java, DeploymentReviewActionServiceTest.java	Net-new coverage for auto-approval branches and in-app actions
client/src/app/pages/pending-approvals/*	New cross-repo pending-approvals page with approve/decline
client/src/app/components/profile-nav-section/*	Pending-approvals badge driven by polled query
client/src/app/app.routes.ts	Adds `/pending-approvals` route
client/src/app/core/modules/openapi/*	Regenerated client (types, sdk, schemas, tanstack helpers, index)
helios-example-realm.json	Bumps SSO idle (8h→7d) and max lifespan (8h→30d)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@transactional

Fixes uncovered by a careful second-pass review of the approval flow. Reliability-grade issues; no functional addition. - **Idempotency against webhook redelivery (P0).** ApprovalService now short-circuits at the top when HeliosDeployment.autoApprovalDecision is already set. Without this guard, a replayed deployment_status webhook would re-enter the auto-approve branch, append a spurious FAILED_AT_GITHUB audit row (because GitHub 422s an already-approved deployment), and *overwrite* AUTO_APPROVED with DEFERRED_TO_REVIEWERS. - **Audit row survived the BAD_GATEWAY throw (P0).** Annotated DeploymentReviewActionService's @transactional methods with noRollbackFor = ResponseStatusException.class. Spring's default rollback-on-RuntimeException silently discarded the FAILED_AT_GITHUB row that the in-app path persists just before re-throwing as 502 — the UI had no audit trail to retry against. - **Atomic audit-row + deployment-stamp (P2).** Extracted the finalisation pair into a new ApprovalDecisionWriter @service with one transaction per outcome (recordSuccess / recordFailure / recordDecisionOnly). A crash between the two writes can no longer leave the database half-stamped. Done as a separate bean because @transactional via self-invocation in ApprovalService would be bypassed by Spring's proxy. - **JPQL nested-enum literal → parameter (P1).** Repository's findPendingForReviewer used the non-portable `$` separator (`...DeploymentApprovalRequest$State.PENDING`); switched to a typed State parameter and renamed to findByReviewerAndState. Also added `left join fetch d.creator` so PendingApprovalDto.from doesn't trip LazyInitialization if creator is ever flipped to LAZY or OSIV is turned off in prod hardening. - **Controller readOnly transaction (P1).** DeploymentApprovalController .myPendingApprovals is now @transactional(readOnly = true) so the Hibernate session stays open across the DTO mapping (defensive against OSIV=false). - **Removed unused PENDING_DISPATCH (P1).** The enum value was declared but never assigned, and several callers (EnvironmentService .isActiveDeploymentStatus, DeploymentService.canRedeploy, the stuck- deployment recovery queries) didn't handle it — silent reliability gaps the moment anything started producing it. Reintroduce together with its producer when persist-before-dispatch is actually wired. - **expires_at nullable (P3).** AUTO and IN_APP rows have no TTL; the old NOT NULL forced us to seed `now()`, which made them look already- expired to any future sweep filtering on expires_at. Migration now leaves the column nullable; entity matches. - **Shorter retry budget (P2).** Persist-vs-webhook race retry tightened from {100ms, 500ms, 2000ms} (2.6s worst case) to {100ms, 300ms} (400ms worst case) — easily enough for the in-process transaction commit race, and far less of a NATS dispatcher latency tail. - **Client consistency (P4).** Converted the four leftover `*ngIf` uses in pending-approvals.component.html to the `@if` control flow that the rest of the codebase uses. ApprovalServiceTest reworked to mock the new ApprovalDecisionWriter and to cover the idempotency-skip path. The DeploymentReviewActionServiceTest still covers FAILED_AT_GITHUB via its 502-throw assertion, but the true rollback-vs-audit-survival behavior is only verifiable in a real DB integration test — flagged for follow-up; the noRollbackFor annotation plus a code comment ensures the production behavior is correct.

The PR's V52__add_workflow_job_and_runner_inventory.sql collides with V52__add_deployment_workflow_run_id_index.sql that landed on staging later. Flyway rejects duplicate version numbers (CompositeMigrationResolver line 93), which is what tanked server-tests and validate-migrations on the last CI run. Bumped to V54 (V53 is reserved for the in-flight #1098 approval-flow migration, so this avoids a second collision when that lands). No SQL changes — pure file rename. Migration content is identical.

When a deployment is deferred to required reviewers (DEFERRED_TO_REVIEWERS branch), email every User-type reviewer with a "Review in Helios" button that lands on /pending-approvals?focus=<deploymentId>. The link is just a deep-link — auth is the existing Keycloak login, not a one-shot token — so the same endpoints used by the in-app flow (Phase 2) handle the approve/decline POST. No new attack surface; no token TTL to manage. Server - New notification type DEPLOYMENT_APPROVAL_REQUEST with Flyway V54 that widens the chk_notification_type CHECK and backfills enabled=true rows in notification_preference (default-on for existing users; the per-user toggle works from day one). - DeploymentApprovalRequestPayload record carries username, env, repo, creator, branch, sha, workflow URL, deploymentId — the template resolves ${heliosBaseUrl}/pending-approvals?focus=${deploymentId} from the notification service's configured client_url. - ApprovalService.notifyReviewers() runs inside DEFERRED_TO_REVIEWERS: for each User-type reviewer (skipping creator when preventSelfReview), upserts a PENDING DeploymentApprovalRequest row with emailSentAt and publishes via NatsNotificationPublisherService. Reviewers without a Helios User row are logged and skipped — Team-type expansion is the Phase 4 gap and is acknowledged in code. Client - /pending-approvals reads ?focus=<id> via ActivatedRoute → toSignal, scrolls the matching row into view, and pulses a 2.4s highlight. The query-param signal handles both pre- and post-login arrivals cleanly (login redirect interleaves before mount). - alreadyFocusedId guards against the 30s polling refetch re-scrolling the user mid-read. Tests - ApprovalServiceTest: persistsPendingRowAndPublishesEmailForEachReviewer, skipsCreatorWhenPreventSelfReviewBlocksThem, skipsReviewersWithoutHeliosUserRow. - NotificationPreferenceServiceTest: replaced the hard-coded times(2) with Type.values().length - 1 so future enum additions don't break it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

krusche · 2026-05-29T00:17:56Z

@Claudia-Anthropica review

Claudia-Anthropica · 2026-05-29T00:17:59Z

@krusche on it! Taking a look now 👀

Claudia-Anthropica

@krusche The feature direction looks solid, but the approval flow still has a couple of correctness gaps around retries and webhook redelivery. CI is also still failing because the generated OpenAPI spec differs from openapi.yaml for the new notification enum.

Carefully re-reviewed PR #1098 across concurrency, security, data model, transactions/error-handling, GitHub integration, reviewer resolution, frontend, notifications, and tests. This commit lands the CI-blocking fixes, the resolved checkstyle review threads, and the confirmed defects. CI unblock - Regenerate openapi.yaml + the generated TS client: Phase 3 added DEPLOYMENT_APPROVAL_REQUEST to NotificationPreference.Type but the spec and client union were stale, failing the "Validate OpenAPI Spec" check. Checkstyle (the 5 open reviewdog threads) - Rename rejectsCallWhenUserIsNotARequiredReviewer / defersWhenDeployerIsNotARequiredReviewer to drop the "AR" consecutive caps (AbbreviationAsWordInName). - Add the missing summary sentence to DeploymentReviewActionService.review. - Make bob/carol/row final (VariableDeclarationUsageDistance). Confirmed defects - notifyReviewers: HTML-escape the GitHub-/user-sourced free-text fields (login, env, repo, branch, sha) before they reach the email template — the notification module substitutes placeholders into HTML with no escaping, so a crafted branch name could break or inject into the mail. - Auto-approve failure on the happy path now also notifies reviewers: a transient GitHub/token failure stamped DEFERRED_TO_REVIEWERS but never created pending rows or sent email, stranding the deployment. It now mirrors the explicit defer branch. - resolveReviewers: an unparseable reviewers blob no longer collapses to a present-but-empty resolution (which silently deferred to zero reviewers and notified nobody); it now returns Optional.empty so the deployment is left for manual review and the failure is logged loudly. - user-settings: add the DEPLOYMENT_APPROVAL_REQUEST label so the backfilled, default-on preference toggle no longer renders blank. - privacy policy: disclose the new DEPLOYMENT_APPROVAL_REQUEST event. Docs / tests - Document the mixed User+Team reviewer precedence (team-fallback wins), the null prevent_self_review default, and emailSentAt's enqueue (not delivery) semantics. - Strengthen the deferral test to assert the email payload's deploymentId, environmentName and creatorLogin (not any()); assert the decline path returns a DECLINED/IN_APP row. Verified: server tests (Approval/DeploymentReviewAction/GitHub/ NotificationPreference/Environment) green, checkstyle 10.21.0 clean on all touched files, client build + lint + 198 unit tests green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Claudia-Anthropica

@krusche The previous blockers are still open in the current code, so I am keeping this at changes requested. The approval decision still is not serialized before the GitHub call, deferred reviewer rows are still not persisted atomically with the decision stamp, and failed GitHub attempts still disappear from the pending-approvals retry queue.

Addresses three substantive review comments on the approval flow: - FAILED_AT_GITHUB rows are now surfaced in the in-app pending-approvals list (and therefore the profile badge, which counts the list). The PR promises these rows let the UI offer a clean retry, but the query only returned PENDING, so after a transient GitHub failure the reviewer's queue looked empty while the deployment was still waiting. The list now includes PENDING + FAILED_AT_GITHUB; clicking Approve/Decline reuses the same row to retry. New findByReviewerAndStateIn join-fetch query. - ApprovalDecisionWriter now re-loads the deployment under a SELECT FOR UPDATE lock and re-checks that no decision was recorded before writing. The webhook auto path reads the row unlocked then calls GitHub before persisting, so a concurrent in-app approve/decline (which holds the row lock for its whole transaction) could otherwise overwrite a recorded decision or write a duplicate audit row. The first decision now wins; a losing writer becomes a no-op. - The deferral branch now persists the per-reviewer pending rows (and publishes the emails) BEFORE stamping DEFERRED_TO_REVIEWERS. The stamp is the idempotency gate: stamping first and then failing in notifyReviewers would let a redelivered webhook short-circuit, stranding the deployment with no in-app rows and nobody notified. Stamping last leaves autoApprovalDecision null on failure so redelivery re-runs the deferral. Reverted the earlier notify-on-auto-fallback change: on the happy path the deployer is themselves a required reviewer, so the FAILED_AT_GITHUB row now surfaces in their own pending list for retry — and the notify-on-fallback conflicted with the lock-recheck above (it would email reviewers for a deployment a concurrent in-app click had already resolved). Verified: Approval/DeploymentReviewAction/GitHub/NotificationPreference/ Environment tests green; checkstyle 10.21.0 clean; no OpenAPI drift. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

krusche added 2 commits May 28, 2026 10:56

Copilot AI review requested due to automatic review settings May 28, 2026 09:20

krusche requested a review from a team as a code owner May 28, 2026 09:20

github-actions Bot assigned krusche May 28, 2026

Copilot started reviewing on behalf of krusche May 28, 2026 09:20 View session

github-actions Bot added client application-server feature migration-script size:XXL labels May 28, 2026

Copilot AI reviewed May 28, 2026

View reviewed changes

Merge branch 'staging' into feat/deployment-approval-in-helios

95c03eb

krusche mentioned this pull request May 28, 2026

chore(client): temporarily relax pnpm minimumReleaseAge to 0 (unblock staging deploy) #1107

Merged

krusche and others added 2 commits May 28, 2026 14:56

Merge branch 'staging' into feat/deployment-approval-in-helios

8dc9d54

github-actions Bot added the notification-server label May 28, 2026

Claudia-Anthropica requested changes May 29, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(approval): reviewer-aware auto-approval + in-app approve/decline for protected envs#1098

feat(approval): reviewer-aware auto-approval + in-app approve/decline for protected envs#1098
krusche wants to merge 8 commits into
stagingfrom
feat/deployment-approval-in-helios

krusche commented May 28, 2026 •

edited

Loading

Uh oh!

codacy-production Bot commented May 28, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

krusche commented May 29, 2026

Uh oh!

Claudia-Anthropica commented May 29, 2026

Uh oh!

Claudia-Anthropica left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Claudia-Anthropica left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

krusche commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

What changes

Reviewer-aware auto-approval (Phase 1)

In-app approve / decline (Phase 2)

DeploymentApprovalRequest entity (Phase 1 → 3)

UI

Realm session lifespans

GitHubService surface

Intentionally NOT in this PR

Testing instructions

Prerequisites

Server unit tests

End-to-end test matrix

Test 1 — Auto-approval when the deployer is a required reviewer (the headline)

Test 2 — Deferral + in-app Approve (the core Phase 2 path)

Test 3 — In-app Decline rejects the workflow for everyone

Test 4 — preventSelfReview honored

Test 5 — Not-a-required-reviewer can't approve

Test 6 — Already-resolved → 409 Conflict

Test 7 — Idempotency on webhook redelivery (the subtle one)

Test 8 — Env without required reviewers is left alone

Visual / UX smoke

Operator notes

Uh oh!

codacy-production Bot commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Not up to standards ⛔

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

krusche commented May 29, 2026

Uh oh!

Claudia-Anthropica commented May 29, 2026

Uh oh!

Claudia-Anthropica left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Claudia-Anthropica left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

krusche commented May 28, 2026 •

edited

Loading

`DeploymentApprovalRequest` entity (Phase 1 → 3)

Test 4 — `preventSelfReview` honored

codacy-production Bot commented May 28, 2026 •

edited

Loading