Skip to content

Comments

[automatic failover] Refactor Circuit Breaker Command Tracking#3670

Open
atakavci wants to merge 3 commits intoredis:mainfrom
atakavci:failover/captureMetricsEarly
Open

[automatic failover] Refactor Circuit Breaker Command Tracking#3670
atakavci wants to merge 3 commits intoredis:mainfrom
atakavci:failover/captureMetricsEarly

Conversation

@atakavci
Copy link
Collaborator

@atakavci atakavci commented Feb 18, 2026

Closes #3567

Summary

Refactored circuit breaker command tracking to use a wrapper-based approach instead of callback-based tracking, improving consistency and reducing code duplication.

Changes

Core Improvements

  • Introduced CircuitBreakerAwareCommand: New CommandWrapper subclass that tracks circuit breaker results through lifecycle hooks
  • Simplified tracking logic: Replaced callback-based tracking with wrapper-based approach using doBeforeComplete() and doBeforeError() hooks
  • Unified error handling: All command results (success/failure) are now recorded consistently through the wrapper

Modified Components

  • DatabaseCommandTracker:

    • Wraps commands in CircuitBreakerAwareCommand before writing
    • Removed attachRecorder() method and callback-based tracking
    • Records exceptions for each command in batch operations
  • MultiDbOutboundHandler:

    • Simplified to only attach listeners for CircuitBreakerAwareCommand instances
    • Removed recorder() and recordOnCommandComplete() methods
    • Changed from List to Collection for better type flexibility
  • CommandWrapper:

    • Added doBeforeComplete() and doBeforeError() lifecycle hooks
    • Hooks are called before delegating to wrapped command and notifying consumers

Benefits

  • Cleaner separation of concerns
  • Reduced code complexity and duplication
  • More maintainable circuit breaker tracking mechanism
  • Consistent error recording across all command execution paths

- early capture circuit breaker metrics with CircuitBreakerAwareCommand
@atakavci atakavci self-assigned this Feb 18, 2026
@atakavci atakavci added the type: feature A new feature label Feb 18, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the circuit breaker command tracking mechanism from a callback-based approach to a wrapper-based approach. The refactoring introduces a new CircuitBreakerAwareCommand wrapper class that uses lifecycle hooks (doBeforeComplete and doBeforeError) to track command results. This change aims to address a sporadic test failure in CircuitBreakerMetricsIntegrationTests where circuit breaker failures were being tracked in command future completion handlers instead of on actual command completion.

Changes:

  • Introduced lifecycle hooks in CommandWrapper for tracking before command completion and error handling
  • Created CircuitBreakerAwareCommand wrapper that records all command results through these hooks
  • Simplified MultiDbOutboundHandler to only attach the wrapper as a listener to write promises

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

File Description
src/main/java/io/lettuce/core/protocol/CommandWrapper.java Added doBeforeComplete() and doBeforeError() lifecycle hooks to enable tracking before command completion
src/main/java/io/lettuce/core/failover/DatabaseCommandTracker.java Introduced CircuitBreakerAwareCommand wrapper and modified write methods to wrap commands before delegation
src/main/java/io/lettuce/core/failover/MultiDbOutboundHandler.java Simplified to attach CircuitBreakerAwareCommand as listener; removed old callback-based tracking logic

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +154 to 162
@Override
protected void doBeforeComplete() {
generation.recordResult(null);
}

@Override
protected void doBeforeError(Throwable throwable) {
generation.recordResult(throwable);
}
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical issue: The new implementation changes the tracking behavior in a way that conflicts with existing tests. The CircuitBreakerAwareCommand records ALL successes (line 156) and ALL errors (line 161), but the existing tests expect that only timeout exceptions are recorded by DatabaseCommandTracker while successes and non-timeout exceptions are recorded by MultiDbOutboundHandler. This change appears to cause double-recording when combined with MultiDbOutboundHandler, or the tests need to be updated to reflect the new behavior. The PR description does not mention this behavioral change, and the tests are not included in the diff, suggesting they may not have been updated or verified to pass with these changes.

Copilot uses AI. Check for mistakes.
Comment on lines 136 to 138
} catch (Exception e) {
// TODO: here not sure we should record exception for each command or just once for the batch
circuitBreaker.getGeneration().recordResult(e);
commands.forEach(c -> circuitBreaker.getGeneration().recordResult(e));
throw e;
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical bug: Potential triple recording of failures for batch write exceptions. When writeMany throws an exception, line 137 records the exception once for each command. Then, each command will also be completed exceptionally (presumably), which will trigger doBeforeError (line 160-161) recording the exception again. Additionally, if the write promise fails, operationComplete (line 165-168) will record it a third time. This results in each batch write failure being recorded 2-3 times per command.

Copilot uses AI. Check for mistakes.
Comment on lines 102 to 104
} catch (Exception e) {
circuitBreaker.getGeneration().recordResult(e);
throw e;
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical bug: Potential triple recording of write exceptions. When writeOne throws an exception, line 103 records it. Then, the command will also be completed exceptionally (presumably), which will trigger doBeforeError (line 160-161) recording the exception again. Additionally, if the write promise fails, operationComplete (line 165-168) will record it a third time. This results in each write failure being recorded 2-3 times.

Copilot uses AI. Check for mistakes.
}

/**
* Callback method called after successful completion and before notifying downstream consumers.
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JavaDoc comment is incorrect. It says "after successful completion and before notifying downstream consumers", but the method is called BEFORE the wrapped command is completed (line 69), not after. The comment should say "before successful completion of the wrapped command and before notifying downstream consumers".

Suggested change
* Callback method called after successful completion and before notifying downstream consumers.
* Callback method called before successful completion of the wrapped command and before notifying downstream
* consumers.

Copilot uses AI. Check for mistakes.
}

/**
* Callback method called after error completion and before notifying downstream consumers.
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JavaDoc comment is incorrect. It says "after error completion and before notifying downstream consumers", but the method is called BEFORE the wrapped command is completed exceptionally (line 145 for completeExceptionally, line 129 for cancel), not after. The comment should say "before error completion of the wrapped command and before notifying downstream consumers".

Suggested change
* Callback method called after error completion and before notifying downstream consumers.
* Callback method called before error completion of the wrapped command and before notifying downstream consumers.

Copilot uses AI. Check for mistakes.
Comment on lines +164 to +170
@Override
public void operationComplete(Future<Void> future) throws Exception {
if (!future.isSuccess()) {
generation.recordResult(future.cause());
}
}

Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical bug: This implementation can lead to double recording of failures. When a write operation fails, both operationComplete() (line 165-168) and doBeforeError() (line 160-161) will record the same failure to the circuit breaker generation. The operationComplete() method should be removed, or the doBeforeError() should check if the error is being recorded due to a write failure to avoid duplicate recording.

Suggested change
@Override
public void operationComplete(Future<Void> future) throws Exception {
if (!future.isSuccess()) {
generation.recordResult(future.cause());
}
}

Copilot uses AI. Check for mistakes.
for (Object cmd : collection) {
if (cmd instanceof CircuitBreakerAwareCommand<?, ?, ?>) {
// we want to use the generation that is active at the time of write.
// Using a lamda which acces directly to circuitBreaker.getGeneration() here would delay capturing the
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling error: "acces" should be "access", and "lamda" should be "lambda".

Suggested change
// Using a lamda which acces directly to circuitBreaker.getGeneration() here would delay capturing the
// Using a lambda which access directly to circuitBreaker.getGeneration() here would delay capturing the

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@ggivo ggivo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@atakavci
I see it as a simplified and more predictable way of tracking command completion.

From my point of view, this approach is better, and guaranitee command result is tracked in CB before notifying of command completion

@atakavci
Copy link
Collaborator Author

@atakavci I see it as a simplified and more predictable way of tracking command completion.

From my point of view, this approach is better, and guaranitee command result is tracked in CB before notifying of command completion

@ggivo there is a downside of this approach that is contradicting with fundamentals of async API; the wrapper command is not the one API returns, it remains completely hidden/internal. In case the async command itself is somehow directly managed, CircuitBreakerAware is left out of equation. And there is no elegant way to improve that,, at least i couldn't see one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

type: feature A new feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[automatic failover] Sporadic test failure observed in CircuitBreakerMetricsIntegrationTests

2 participants