[P0] RabbitMQ AsyncActor with scaling.enabled=false can stay at 0 replicas and stall queued tasks #381
Description
Summary
In the RabbitMQ + Crossplane path, an AsyncActor with `spec.scaling.enabled=false` and `spec.replicas=1` can reconcile without a ScaledObject, yet the rendered Deployment can still sit at `spec.replicas=0`. The gateway accepts work and RabbitMQ receives it, but nothing consumes the queue until the deployment is manually scaled.
Why this matters
This breaks the fixed-replica contract for actors with autoscaling disabled and makes a fresh install look healthy while real work stalls.
Expected behavior
- With `scaling.enabled=false`, no `ScaledObject` or KEDA HPA should exist for the actor.
- The deployment replica count should follow `spec.replicas`.
- Status should not report `Napping` unless KEDA is actually present and has scaled the workload to zero.
- A task submitted through the gateway should be consumed without any manual `kubectl scale` step.
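The expected contract above can be sketched as two pure functions. This is a hedged illustration only: the names `desired_replicas` and `scaledobject_expected` are invented for this sketch and are not identifiers from the asya codebase.

```python
# Sketch of the fixed-replica contract for scaling-disabled actors.
# Function names are illustrative, not actual controller code.
from typing import Optional


def scaledobject_expected(scaling_enabled: bool) -> bool:
    """A ScaledObject (and its KEDA-managed HPA) should exist only
    when spec.scaling.enabled is true."""
    return scaling_enabled


def desired_replicas(scaling_enabled: bool, spec_replicas: int) -> Optional[int]:
    """Replica count the composition should render on the Deployment.

    With scaling disabled, the Deployment must follow spec.replicas
    exactly; with scaling enabled, KEDA owns the count, so no fixed
    value is rendered (None).
    """
    if not scaling_enabled:
        return spec_replicas
    return None
```

Under this sketch, the reported state (`scaling.enabled=false`, `spec.replicas=1`, rendered `spec.replicas=0`) is unreachable: the only way to get zero replicas with scaling disabled is for the user to set zero explicitly.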
Observed behavior on a real GKE install
Actor under test:
- transport: RabbitMQ
- actor: `test-echo` (example actor from the smoke test)
- `spec.scaling.enabled=false`
- `spec.replicas=1`
What was observed before manual intervention:
- `AsyncActor/test-echo` existed and showed `Ready=True` / `Synced=True`
- `status.phase` was `Napping`
- `status.infrastructure.workload.replicas=0`
- `status.infrastructure.workload.readyReplicas=0`
- no `ScaledObject` existed for `test-echo`
- no HPA existed for `test-echo`
- the rendered `Deployment/test-echo` had `spec.replicas=0`
- the gateway accepted a real task and RabbitMQ queued it, but no worker pod existed to consume it
Concrete task evidence:
- gateway created a task successfully
- nothing processed it while the actor remained at `0` replicas
- after `kubectl scale deployment/test-echo --replicas=1`, the actor immediately consumed the queued message, forwarded to `x-sink`, and mesh updated the final status to `succeeded`
Why this looks like a bug / contract mismatch
The intended contract appears clear from code, docs, and tests:
- `deploy/helm-charts/asya-crossplane/templates/composition-rabbitmq.yaml`
  - `render-deployment` sets deployment replicas from `xrSpec.replicas` when `scaling.enabled=false`
  - `patch-status-and-derive-phase` treats `Napping` as the case where replicas are `0`, but that phase is documented as the KEDA scale-to-zero state
- `docs/reference/components/core-crossplane.md`
  - says `Ready` means the workload has replicas `> 0`
  - says `Napping` means KEDA scaled the workload to zero
- `testing/e2e/tests/test_keda_scaling.py::test_scaledobject_not_created_when_scaling_disabled`
  - expects no `ScaledObject` when scaling is disabled
- `testing/e2e/tests/test_crossplane_e2e.py::test_asyncactor_replicas_update_scaling_disabled`
  - expects deployment replicas to follow the XR when scaling is disabled
So the live RabbitMQ behavior did not match the documented or tested contract.
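The documented phase semantics can be written down as a pure derivation. A minimal sketch, assuming the composition can tell whether a ScaledObject exists for the actor (`keda_present` below); `derive_phase` is a hypothetical helper, and `Pending` is a placeholder name for the non-ready fallback phase, not a phase confirmed by the docs:

```python
# Hypothetical phase derivation honoring the documented contract:
# "Napping" is reserved for KEDA scale-to-zero; zero replicas without
# KEDA is an anomaly and must not be reported as Napping.
def derive_phase(replicas: int, ready_replicas: int, keda_present: bool) -> str:
    if replicas == 0:
        # Only KEDA scaling the workload to zero counts as Napping.
        return "Napping" if keda_present else "Pending"
    if ready_replicas > 0:
        # Docs: Ready means the workload has replicas > 0.
        return "Ready"
    return "Pending"
```

With a derivation like this, the observed state (`replicas=0`, no ScaledObject) could never surface as `Napping`, so a fresh install with the bug would at least report an unhealthy phase instead of looking intentional.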
Suspected scope
At least one of these needs attention:
- RabbitMQ composition reconciliation for `scaling.enabled=false`
- drift restoration of the composed deployment replica count
- status/phase derivation allowing `Napping` when KEDA is absent
- missing end-to-end coverage that proves a RabbitMQ actor with scaling disabled actually consumes work after install
Acceptance criteria
- a RabbitMQ-backed `AsyncActor` with `scaling.enabled=false` and `replicas=1` results in `Deployment.spec.replicas=1`
- no `ScaledObject` or HPA is created for that actor
- a task submitted through the gateway is consumed without manual scaling
- status never reports `Napping` when KEDA is absent for that actor
- CI contains a regression test for RabbitMQ + Crossplane + `scaling.enabled=false` that proves the actor really processes a queued message
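The criteria above could be encoded as assertions over an observed-state snapshot. A pytest-style sketch, assuming a future e2e helper populates `observed` from the cluster and gateway; the helper name and every dict key here are hypothetical, not existing test fixtures:

```python
# Sketch of the acceptance criteria as pure assertions. All names are
# illustrative; a real e2e test would gather this state from the cluster.
def check_scaling_disabled_contract(observed: dict) -> None:
    # Deployment follows spec.replicas when scaling is disabled.
    assert observed["deployment_replicas"] == observed["spec_replicas"]
    # No autoscaling objects exist for this actor.
    assert not observed["scaledobject_exists"]
    assert not observed["hpa_exists"]
    # Napping is only legal when KEDA is actually present.
    if not observed["keda_present"]:
        assert observed["phase"] != "Napping"
    # The submitted task was consumed without any manual kubectl scale.
    assert observed["task_status"] == "succeeded"


# The state reported in this issue, which such a check would catch:
broken = {
    "spec_replicas": 1,
    "deployment_replicas": 0,
    "scaledobject_exists": False,
    "hpa_exists": False,
    "keda_present": False,
    "phase": "Napping",
    "task_status": "queued",
}
```

Running `check_scaling_disabled_contract(broken)` fails on the very first assertion, which is the point: the regression test should fail loudly on the state observed in this report.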
Related
- Related to [P0] RabbitMQ queue lifecycle and AsyncActor readiness must reflect reality #372, but distinct.
- #372 is about RabbitMQ queue lifecycle/readiness being reported as healthy when queues may not exist.
- This issue is about a disabled-scaling actor remaining at `0` replicas even when the queue exists and a task is queued.