fix: preserve wl_output across connector disconnect/reconnect #2116
modelmiser wants to merge 1 commit into pop-os:master
When HDMI monitors power-cycle, the kernel HPD de-assert produces the same udev event as a cable unplug. Previously cosmic-comp destroyed the Output (wl_output global, workspace state, client bindings) on disconnect and rebuilt everything on reconnect, causing browsers' D-Bus sessions to get stuck in Auth state with unrecoverable rendering failures.

Track disconnected connectors as "suspended" instead of removing their Output. The Surface (render thread, DRM resources) is still destroyed since the hardware is gone, but the wl_output global, workspace assignments, and client bindings survive. On reconnect, connector_added() finds the existing Output and creates a fresh Surface for it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
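The suspend-and-reuse behavior described in the commit message can be sketched with plain standard-library types. This is an illustrative model only, not the code in this PR: `ConnectorId`, `Output`, `Surface`, `Device`, and `power_cycle_demo` are hypothetical stand-ins, and the real implementation works on cosmic-comp's own device and DRM types.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a DRM connector handle.
type ConnectorId = u32;

/// Hypothetical stand-in for the long-lived output
/// (wl_output global, workspace state, client bindings).
#[derive(Debug, Clone, PartialEq)]
struct Output {
    name: String,
    global_id: u64,
}

/// Hypothetical stand-in for per-connection hardware state
/// (render thread, DRM resources).
#[derive(Debug)]
struct Surface {
    generation: u32,
}

#[derive(Default)]
struct Device {
    outputs: HashMap<ConnectorId, Output>,
    surfaces: HashMap<ConnectorId, Surface>,
    suspended: HashSet<ConnectorId>,
    next_global: u64,
    next_generation: u32,
}

impl Device {
    /// Reuse the existing Output if one survived a disconnect; always build a fresh Surface.
    fn connector_added(&mut self, conn: ConnectorId, name: &str) -> Output {
        let output = match self.outputs.get(&conn).cloned() {
            Some(existing) => existing,
            None => {
                self.next_global += 1;
                let created = Output { name: name.to_string(), global_id: self.next_global };
                self.outputs.insert(conn, created.clone());
                created
            }
        };
        self.next_generation += 1;
        self.surfaces.insert(conn, Surface { generation: self.next_generation });
        self.suspended.remove(&conn); // clear the suspended flag on reconnect
        output
    }

    /// Drop the Surface (the hardware is gone) but keep the Output and mark it suspended.
    fn connector_removed(&mut self, conn: ConnectorId) {
        self.surfaces.remove(&conn);
        if self.outputs.contains_key(&conn) {
            self.suspended.insert(conn);
        }
    }
}

/// Simulates one monitor power cycle and returns the Output seen before and after,
/// plus whether the connector was marked suspended in between.
fn power_cycle_demo() -> (Output, Output, bool) {
    let mut dev = Device::default();
    let before = dev.connector_added(1, "HDMI-A-1");
    dev.connector_removed(1);
    let was_suspended = dev.suspended.contains(&1);
    let after = dev.connector_added(1, "HDMI-A-1");
    (before, after, was_suspended)
}
```

In this model the Output returned after the power cycle is identical to the one before it, while the Surface is rebuilt with a new generation counter, which mirrors the split between surviving and destroyed state described above.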
|
Minisforum MS-01 mini workstation, 13th-gen Intel i9-13900H (Iris Xe, Raptor Lake-P), displays on mainboard HDMI. It also has an NVIDIA RTX 4000 SFF Ada that is compute-only and drives no displays. GPU topology (no cross-GPU interaction): The NVIDIA GPU has no display role; all rendering goes through i915/HDMI-A-1. Two reproduction paths:
If your monitors recover cleanly, it could be connection type (DP vs HDMI), GPU driver, or how the monitor handles the power cycle. Happy to gather logs or test further if that would help narrow it down. |
|
So this just indefinitely keeps wl_outputs alive? I am sorry this is no solution, clients need to be able to handle wl_outputs appearing and disappearing, that is part of the core wayland protocol. |
|
Thanks for the feedback. Understood that preserving

Wanted to close the loop on the client side: I've investigated the Firefox/Gecko failure and filed it upstream with full diagnostic evidence: https://bugzilla.mozilla.org/show_bug.cgi?id=2018866

The failure mode on my hardware (i915/HDMI-A-1, details in the bug): after the

Note that @hojjatabdollahi doesn't reproduce this, so it may be hardware- or connection-dependent. Linking here in case other COSMIC users report similar symptoms; the Bugzilla bug has the details. |
Summary
connector_added() finds the existing Output (via outputs.get(&conn).cloned() at device.rs:547) and creates a fresh Surface for it.
What changes
Implementation
A suspended_connectors: HashSet<connector::Handle> field is added to InnerDevice. When a connector disappears:
When the connector reappears, it's reported as "added", connector_added() reuses the existing Output, a new Surface is created, and the suspended flag is cleared.
Suspend-without-timeout was chosen over a debounce approach (wait N seconds, then destroy). A timeout adds complexity and race conditions for minimal benefit: permanently removed connectors are cleaned up when device_removed() handles full GPU removal, which is the only case where the hardware is truly gone rather than power-cycled.
Two files changed, ~28 lines added (~18 net).
Edge cases handled:
device_removed() cleans up suspended outputs)
enumerate_surfaces() skips suspended connectors (avoids bail!("Missing crtc"))
Test plan
cargo build --release: clean, zero warnings
Supersedes #2115, which was filed as an issue with the patch linked externally because GitHub forking was restricted at the time. This PR contains the same fix submitted as a proper PR.
Related: #1463, #906. Previous fixes made the destroy/recreate path crash-free; this PR avoids the destroy path entirely for transient disconnects.
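The two edge cases named in this description (skipping suspended connectors during surface enumeration, and cleaning up suspended outputs on full GPU removal) can be sketched roughly as follows. This is a simplified model under stated assumptions: the `Device` struct, its `crtcs` map, the `String` error type, and `sample_device` are hypothetical, and only the function names `enumerate_surfaces()` and `device_removed()` and the "Missing crtc" message come from the PR text.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-ins for DRM handles.
type ConnectorId = u32;
type CrtcId = u32;

struct Device {
    /// Outputs keyed by connector; suspended ones stay in this map.
    outputs: HashMap<ConnectorId, String>,
    /// CRTC assignment, present only for connectors that currently have a Surface.
    crtcs: HashMap<ConnectorId, CrtcId>,
    suspended: HashSet<ConnectorId>,
}

impl Device {
    /// Lists (connector, crtc) pairs for active outputs.
    /// Suspended connectors have no CRTC, so they are skipped instead of
    /// triggering the "Missing crtc" error.
    fn enumerate_surfaces(&self) -> Result<Vec<(ConnectorId, CrtcId)>, String> {
        let mut found = Vec::new();
        for conn in self.outputs.keys() {
            if self.suspended.contains(conn) {
                continue;
            }
            let crtc = self
                .crtcs
                .get(conn)
                .ok_or_else(|| format!("Missing crtc for connector {conn}"))?;
            found.push((*conn, *crtc));
        }
        found.sort();
        Ok(found)
    }

    /// Full GPU removal: every output goes away, suspended or not.
    /// Returns the names of the outputs that were torn down.
    fn device_removed(self) -> Vec<String> {
        let mut names: Vec<String> = self.outputs.into_values().collect();
        names.sort();
        names
    }
}

/// One active output (connector 1 on CRTC 40) and one suspended output (connector 2).
fn sample_device() -> Device {
    Device {
        outputs: HashMap::from([(1, "HDMI-A-1".to_string()), (2, "HDMI-A-2".to_string())]),
        crtcs: HashMap::from([(1, 40)]),
        suspended: HashSet::from([2]),
    }
}
```

Without the `suspended` check, the lookup for connector 2 would fail because it has no CRTC, which is the failure the skip is meant to avoid.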