Skip to content

Fix cert-generator argocd upgrade loop#4904

Draft
ciarams87 wants to merge 1 commit intomainfrom
fix/helm-upgrade-sync-loop
Draft

Fix cert-generator argocd upgrade loop#4904
ciarams87 wants to merge 1 commit intomainfrom
fix/helm-upgrade-sync-loop

Conversation

@ciarams87
Copy link
Copy Markdown
Contributor

Proposed changes

Problem:
When deploying nginx-gateway-fabric helm chart (v2.4.2, for example), everything is deployed successfully initially (cert generator job enabled, by default).

Then, if something is changed in helm values or argocd application and argo detects it, it shows that argo wants to delete cert-generator ServiceAccount, Role & RoleBinding.

  • Bug 1: certGenerator.annotations not applied to ServiceAccount/Role/RoleBinding. In certs-job.yaml, the ServiceAccount, Role, and RoleBinding have hardcoded annotations with only "helm.sh/hook": pre-install. The certGenerator.annotations value is only used on the Job and pod template, but not on the ServiceAccount, Role, and RoleBinding.
    This prevents users from adding ArgoCD-specific annotations (like argocd.argoproj.io/hook: PreSync) as a workaround.

  • Bug 2: Hook resources without pre-upgrade
    The ServiceAccount, Role, and RoleBinding only have "helm.sh/hook": pre-install, while the Job itself has "helm.sh/hook": pre-install, pre-upgrade. This causes the ArgoCD sync loop:

    • The RBAC resources (pre-install only) are not lifecycle-managed by Helm after install, so ArgoCD treats them as orphaned/drift and wants to delete them.
    • On sync, they get deleted, which causes the next upgrade to fail since the Job's RBAC resources no longer exist.
    • The Job also lacks "helm.sh/hook-delete-policy" directives on the supporting RBAC resources, so ArgoCD keeps fighting Helm over ownership.

Solution: The RBAC resources (ServiceAccount, Role, RoleBinding) should match the Job's hook annotation - pre-install, pre-upgrade - and all hook resources should have a "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded annotation. This is the standard pattern for Helm hooks used with ArgoCD.

Also, certGenerator.annotations needs to be wired into the metadata annotations of the ServiceAccount, Role, and RoleBinding, not just the Job.

Testing: Describe any testing that you did.

Please focus on (optional): If you any specific areas where you would like reviewers to focus their attention or provide
specific feedback, add them here.

Closes #4896

Checklist

Before creating a PR, run through this checklist and mark each as complete.

  • I have read the CONTRIBUTING doc
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked that all unit tests pass after adding my changes
  • I have updated necessary documentation
  • I have rebased my branch onto main
  • I will ensure my PR is targeting the main branch and pulling from my branch from my own fork

Release notes

If this PR introduces a change that affects users and needs to be mentioned in the release notes,
please add a brief note that summarizes the change.

Fixed cert-generator Helm hook lifecycle to prevent ArgoCD sync loops.

@github-actions github-actions bot added bug Something isn't working helm-chart Relates to helm chart labels Mar 10, 2026
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.83%. Comparing base (d10ab2d) to head (0edf869).
⚠️ Report is 51 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4904   +/-   ##
=======================================
  Coverage   85.83%   85.83%           
=======================================
  Files         143      143           
  Lines       16719    16719           
  Branches       35       35           
=======================================
+ Hits        14350    14351    +1     
+ Misses       2118     2117    -1     
  Partials      251      251           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

{{- toYaml . | nindent 4 }}
{{- end }}
"helm.sh/hook": pre-install, pre-upgrade
"helm.sh/hook-delete-policy": before-hook-creation
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the helm docs, this is supposed to be the default behavior

@github-actions
Copy link
Copy Markdown
Contributor

This PR is stale because it has been open 14 days with no activity. Remove stale label or comment or this will be closed in 14 days.

@github-actions github-actions bot added stale Pull requests/issues with no activity and removed stale Pull requests/issues with no activity labels Mar 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working helm-chart Relates to helm chart release-notes

Projects

Status: 🆕 New

Development

Successfully merging this pull request may close these issues.

Cert-generator job - argocd sync loop

2 participants