Skip to content

fix(hook): keep counter add metrics ungrouped by default#876

Closed
aspr0d wants to merge 1 commit into
flant:mainfrom
aspr0d:fix/868-counter-metrics-accumulate
Closed

fix(hook): keep counter add metrics ungrouped by default#876
aspr0d wants to merge 1 commit into
flant:mainfrom
aspr0d:fix/868-counter-metrics-accumulate

Conversation

@aspr0d
Copy link
Copy Markdown

@aspr0d aspr0d commented May 16, 2026

What

Fixes #868

When a hook emits a counter with action: add and no explicit group, the parser assigned the hook name as the default group. Grouped metric batches expire all series in the group before applying new values, so counters were reset on every run.

Changes

  • Do not assign defaultGroup for action: add when group is omitted
  • Gauges and histograms still get the hook name as the default group for expire-and-refresh semantics
  • Tests cover parsing behavior and cumulative counter updates via ApplyBatchOperations

Verification

go test ./pkg/hook/ -run 'TestMetricOperations|TestCounterAdd' -count=1

When a hook emits {"action":"add"} without an explicit group, the parser
assigned the hook name as the default group. Grouped batches expire all
metrics in the group before applying new values, so counters were reset on
every run and stayed at the last increment only.

Leave group empty for counter add operations without an explicit group so
ApplyBatchOperations uses the non-grouped path and values accumulate.
@ldmonster
Copy link
Copy Markdown
Collaborator

The correct way to use hook metrics with expire previous runs metrics.

@aspr0d
Copy link
Copy Markdown
Author

aspr0d commented May 19, 2026

Thanks for the note. Could you clarify or share an example?

My understanding: counters with action: add are intended to accumulate across runs, so applying ExpireGroupMetrics before each batch resets them - which is the regression reported in #868.

If there is a preferred pattern (e.g. always requiring an explicit group for counters, or a separate expiry path for add vs set actions), I am happy to rework the fix to match it.

@ldmonster
Copy link
Copy Markdown
Collaborator

Thanks for the note. Could you clarify or share an example?

My understanding: counters with action: add are intended to accumulate across runs, so applying ExpireGroupMetrics before each batch resets them - which is the regression reported in #868.

If there is a preferred pattern (e.g. always requiring an explicit group for counters, or a separate expiry path for add vs set actions), I am happy to rework the fix to match it.

Thanks for the follow-up.

In the current design, accumulating metrics across hook runs is outside the scope of what hooks are meant to do. Shell-operator is a tool for running hooks in Kubernetes; hook metrics exist to track what happens inside a single hook execution, not to carry state from one run to the next.

The scenario described in #868 is effectively trying to use hook metrics as cross-run storage. That is not a supported use case, so resetting counters via ExpireGroupMetrics before each batch is consistent with the intended model rather than a regression.

If you need values to persist across runs, that should be handled outside the hook metrics layer—for example via Kubernetes objects, external storage, or another mechanism appropriate to your workflow—not by relying on action: add counters to accumulate between invocations.

Happy to discuss further if a concrete example would help clarify the boundary.

@aspr0d
Copy link
Copy Markdown
Author

aspr0d commented May 19, 2026

Thanks for the detailed explanation - that makes the design intent clear.

Closing this PR. The behavior is correct as designed.

@aspr0d aspr0d closed this May 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

counter metrics do not count up

2 participants