Skip to content

Verify resource deletion before removing finalizer#229

Open
a-hilaly wants to merge 1 commit intoaws-controllers-k8s:mainfrom
a-hilaly:deletion
Open

Verify resource deletion before removing finalizer#229
a-hilaly wants to merge 1 commit intoaws-controllers-k8s:mainfrom
a-hilaly:deletion

Conversation

@a-hilaly
Copy link
Member

@a-hilaly a-hilaly commented Mar 3, 2026

Some AWS APIs return success from Delete immediately while the resource
transitions through a deleting state. This change ensures the reconciler
verifies the resource is fully deleted (ReadOne returns NotFound) before
removing the finalizer and allowing the Kubernetes CR to be deleted.

When a resource still exists after Delete, the reconciler sets
ACK.ResourceSynced to False with reason "Deleting" and requeues after
15 seconds. This prevents orphaned AWS resources when the CR is deleted
before AWS finishes cleaning up.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Some AWS APIs return success from Delete immediately while the resource
transitions through a deleting state. This change ensures the reconciler
verifies the resource is fully deleted (ReadOne returns NotFound) before
removing the finalizer and allowing the Kubernetes CR to be deleted.

When a resource still exists after Delete, the reconciler sets
ACK.ResourceSynced to False with reason "Deleting" and requeues after
15 seconds. This prevents orphaned AWS resources when the CR is deleted
before AWS finishes cleaning up.
@ack-prow ack-prow bot requested review from jlbutler and michaelhtm March 3, 2026 23:51
@ack-prow
Copy link

ack-prow bot commented Mar 3, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: a-hilaly

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ack-prow ack-prow bot added the approved label Mar 3, 2026
Comment on lines +1138 to +1139
// is fully gone (ReadOne returns NotFound) before allowing the CR to be
// deleted from Kubernetes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be good enough for most resources, but there could be cases where the AWS resource is still returned by ReadOne, but with a state indicating that deletion was successful. One example that comes to mind is JobRuns from emr-containers. We likely need to added a check for a "deleted" state similar to our sycned:when check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll also need to consider how this is rolled out since we'll need to have those checks in place before applying this to avoid these kinds of resources becoming stuck during deletion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm aware of those edge cases and willing to make a code-gen patch to support that. I just wanted to get a feel on this change, and alignment that this is something we want to support in ACK.

_, verifyErr := rm.ReadOne(ctx, current)
rlog.Exit("rm.ReadOne (verify deletion)", verifyErr)
if verifyErr == nil {
// Resource still exists, requeue to wait for deletion to complete
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the second reconcile we may need to protect against sdkDelete returning an error if the AWS API returns a validation error when called on a deleting resource.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we'll probably need a both a IsDeleted and IsDeleting

@ack-prow
Copy link

ack-prow bot commented Mar 4, 2026

@a-hilaly: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
verify-attribution d29005b link false /test verify-attribution
ecr-controller-test d29005b link true /test ecr-controller-test
s3-controller-test d29005b link true /test s3-controller-test
iam-controller-test d29005b link true /test iam-controller-test
ec2-controller-test d29005b link true /test ec2-controller-test
sagemaker-controller-test d29005b link true /test sagemaker-controller-test

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants