Question: Using a stretched etcd cluster for active–passive application failover across regions #21053

prateekkohli21 · 2025-12-27T05:16:31Z


Overview

I’m designing an active–passive deployment across multiple geographic regions and would like to validate whether my approach using a single stretched etcd cluster is correct.

Architecture

Regions

Region A: Application + etcd member
Region B: Application + etcd member
Region C: etcd member only (tie-breaker / quorum)

The application runs only in two regions (active + passive).
The third region exists only to provide Raft quorum.
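
A minimal sketch of the quorum arithmetic behind this three-region layout. The helper names and the region labels are illustrative only and are not part of etcd. It assumes each region hosts exactly one voting member.

```python
def quorum(members: int) -> int:
    """Smallest majority of a Raft cluster with `members` voting members."""
    return members // 2 + 1


def has_quorum(reachable: set, all_members: set) -> bool:
    """True if the reachable members still form a majority of the cluster."""
    return len(reachable & all_members) >= quorum(len(all_members))


# Illustrative labels: one etcd member per region (assumption).
REGIONS = {"A", "B", "C"}

# Losing any single region leaves two of three members, which is a majority.
for failed in sorted(REGIONS):
    survivors = REGIONS - {failed}
    print(f"region {failed} down, quorum kept: {has_quorum(survivors, REGIONS)}")

# A region that is cut off from both others cannot form a majority by itself.
print(f"region A isolated, quorum kept: {has_quorum({'A'}, REGIONS)}")
```

With three members the cluster tolerates the loss of one region, and a region isolated from the other two cannot commit writes, so it cannot grant or renew a lease.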

Goal

At any point in time, only one application instance must be active
On failure of the active site, leadership should move automatically and the passive site should become active
Split-brain must never occur

Proposed Approach

Run a single etcd cluster stretched across 3 regions
Use etcd leases for leader election
Application behavior:
Acquire lease → become active
Continuously renew lease
If lease renewal fails → stop immediately
Once the lease expires, another site may acquire it
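
The "stop immediately if renewal fails" rule above can be sketched as a local deadline check. This is a simulation of the timing logic only, under stated assumptions: it does not talk to etcd, `LeaseGuard` is a hypothetical helper, and the TTL and safety margin values are placeholders. The deadline is measured from the moment the renewal request was sent, which is conservative because the server cannot have extended the lease before it received the request.

```python
import time


class LeaseGuard:
    """Tracks a conservative local deadline for a lease held by this instance.

    Hypothetical helper for illustration; not an etcd client API.
    """

    def __init__(self, ttl_s: float, margin_s: float, clock=time.monotonic):
        if margin_s >= ttl_s:
            raise ValueError("margin must be smaller than the TTL")
        self.ttl_s = ttl_s
        self.margin_s = margin_s
        self.clock = clock
        self.deadline = None  # None means this instance is not active

    def renewed(self, sent_at: float) -> None:
        """Record a successful grant or renewal whose request was sent at `sent_at`."""
        candidate = sent_at + self.ttl_s - self.margin_s
        # Never move the deadline backwards if responses arrive out of order.
        if self.deadline is None or candidate > self.deadline:
            self.deadline = candidate

    def stop(self) -> None:
        """Give up the active role locally."""
        self.deadline = None

    def may_act(self) -> bool:
        """True only while the local deadline has not been reached."""
        return self.deadline is not None and self.clock() < self.deadline


# Demo with a fake clock so the behavior is deterministic.
now = [0.0]
guard = LeaseGuard(ttl_s=30, margin_s=5, clock=lambda: now[0])
guard.renewed(sent_at=now[0])
now[0] = 10.0
print(guard.may_act())  # True: 10 s is before the 25 s deadline
now[0] = 25.0
print(guard.may_act())  # False: no renewal succeeded before the deadline
```

The application would call `may_act()` before each unit of work and call `renewed()` only after a keep-alive response arrives. A check like this does not cover a process pause between the check and the action that follows it, so it narrows the window for two active instances rather than closing it.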

Assumptions

Inter-region latency: ~150–300 ms
Failover time requirement: ~30–60 seconds is acceptable
etcd is used only for coordination, not for application data
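
A rough timing budget for these numbers, as a sketch under assumptions. It assumes the quoted latency is round-trip time between members, and it applies the rule of thumb from the etcd tuning guide (heartbeat interval around 0.5 to 1.5 times the RTT, election timeout at least 10 times the RTT). The additive failover estimate and the factor of two for the election are simplifying assumptions, not measured values, and the function names are illustrative.

```python
def etcd_timing(rtt_ms: float) -> dict:
    """Suggest etcd timing values from the worst observed RTT (rule of thumb)."""
    return {
        # Candidate value for --heartbeat-interval: about 1x RTT.
        "heartbeat_interval_ms": rtt_ms,
        # Candidate value for --election-timeout: at least 10x RTT.
        "election_timeout_ms": 10 * rtt_ms,
    }


def worst_case_failover_s(lease_ttl_s: float, election_timeout_ms: float) -> float:
    """Rough upper estimate of failover time (assumption, not a guarantee).

    Budgets the full lease TTL plus two election timeouts, for the case where
    the failed region also hosted the etcd leader.
    """
    return lease_ttl_s + 2 * election_timeout_ms / 1000


timing = etcd_timing(rtt_ms=300)  # upper end of the stated latency range
print(timing)

for ttl_s in (15, 30):  # placeholder TTL candidates
    estimate = worst_case_failover_s(ttl_s, timing["election_timeout_ms"])
    print(f"lease TTL {ttl_s} s, estimated worst-case failover {estimate} s")
```

Under these assumptions, a lease TTL of 30 s stays inside the stated 30–60 second failover window with room to spare for retries.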

Questions

1. Is using a single stretched etcd cluster the correct approach for this type of active–passive setup?
2. Are there known pitfalls or operational concerns with this design (e.g., WAN latency, quorum stability)?
3. Are there recommended best practices for:
Election timeouts
Lease TTLs
Network latency thresholds

Goal
To achieve strong safety guarantees (no split-brain) rather than fast failover, while keeping the system operationally simple.

Replies: 0 comments