Feature: RedisBroker connection loss handling #2797

@liudyna

Description

Connection loss handling: is this intentional behavior or missing feature?

Problem Description

When the Redis connection is lost for a short period (e.g., a network hiccup or a Redis restart), RedisBroker immediately raises ConnectionError, and the message being published is lost without any retry attempt.

Current Behavior

broker = RedisBroker("redis://localhost:6379")
await broker.connect()

# If Redis goes down here, the next publish will fail immediately
await broker.publish("message", channel="test")  # ❌ ConnectionError - message lost

What happens:

  1. Connection established
  2. Redis becomes unavailable (network issue, restart, etc.)
  3. publish() raises ConnectionError immediately
  4. Message is lost - no retry attempts

Desired Behavior

Similar to how retry_on_timeout=True handles timeout errors, it would be useful to have a retry_on_connection_lost=True option to handle transient connection failures.

When connection is lost temporarily, the broker should:

  • Attempt to reconnect with exponential backoff
  • Retry the failed operation within a reasonable retry policy
  • Only raise an error after exhausting all retry attempts
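The desired loop above can be sketched with the standard library alone. This is a minimal illustration of the behavior being requested, not FastStream API: `retry_with_backoff` is a hypothetical helper, and `op` stands in for the failing publish call.

```python
import asyncio


async def retry_with_backoff(op, *, attempts=3, base_delay=0.01):
    """Retry an async operation on ConnectionError with exponential backoff.

    Sketch only: models the proposed broker-internal behavior.
    """
    for attempt in range(attempts):
        try:
            return await op()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # retries exhausted: surface the error to the caller
            await asyncio.sleep(base_delay * 2 ** attempt)


# Demo: an operation that fails twice, then succeeds on the third attempt.
calls = {"n": 0}


async def flaky_publish():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("redis unavailable")
    return "published"


result = asyncio.run(retry_with_backoff(flaky_publish))
```

With this policy, a transient outage shorter than the total backoff window is absorbed silently, and only a sustained outage surfaces as an error.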

Proposed Solution

Add connection loss handling parameters to RedisBroker.__init__(), similar to existing retry_on_timeout:

broker = RedisBroker(
    "redis://localhost:6379",
    retry_on_timeout=True,              # ✅ Already exists
    retry_on_connection_lost=True,      # ❌ Proposed: handle ConnectionError
    connection_retry_attempts=3,        # ❌ Proposed: max retry attempts
    connection_retry_delay=1.0,         # ❌ Proposed: base delay between retries
)

This approach:

  • Follows the existing FastStream pattern (retry_on_timeout)
  • Doesn't expose the full redis-py retry API
  • Provides sensible defaults for the common case
  • Won't break existing pub/sub logic

Current Workaround Consideration

I understand that error handling could be implemented in middleware, but this feels like the wrong abstraction level for connection-level issues. Middleware should handle business logic errors, not infrastructure failures at the connection pool level.

Question

Is the current behavior intentional (expecting users to handle connection errors in application code), or is this a missing feature that should be added?

If this is intentional, what is the recommended approach for handling transient Redis connection failures?

Environment

  • FastStream version: 0.6.7
  • redis-py version: 5.0.4
  • Python version: 3.13

Related Code

  • faststream/redis/configs/state.py:30 - ConnectionPool is created without retry configuration
  • faststream/redis/broker/broker.py:97 - retry_on_timeout parameter exists but no equivalent for connection errors
  • redis/asyncio/connection.py:127 - Underlying AbstractConnection supports retry parameter
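For reference, redis-py itself already ships the retry machinery the proposal would wrap. A configuration sketch of using it directly (assuming redis-py 5.x; to my knowledge RedisBroker does not currently forward these kwargs, so this is what the proposed parameters could map to internally):

```python
from redis.asyncio import Redis
from redis.asyncio.retry import Retry
from redis.backoff import ExponentialBackoff

# Sketch only: configure redis-py's built-in retry support directly.
# ExponentialBackoff(cap, base) bounds the delay between attempts;
# retry_on_error tells the client which exceptions should trigger a retry.
client = Redis.from_url(
    "redis://localhost:6379",
    retry=Retry(ExponentialBackoff(cap=10.0, base=1.0), retries=3),
    retry_on_error=[ConnectionError],
)
```

This suggests the feature could be implemented largely by plumbing the proposed RedisBroker parameters through to the underlying connection pool.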

Labels: enhancement (New feature or request)