
fix: preserve wl_output across connector disconnect/reconnect #2116

Closed
modelmiser wants to merge 1 commit into pop-os:master from modelmiser:master


@modelmiser commented Feb 18, 2026

  • I have disclosed use of any AI generated code in my commit messages.
    • If you are using an LLM, and do not fully understand the changes it is making to the code base, do not create a PR.
    • In our experience, AI generated code often results in overly complex code that lacks enough context for a proper fix or feature inclusion. This results in considerably longer code reviews. Due to this, AI authored or partially authored PRs may be closed without comment.
  • I understand these changes in full and will be able to respond to review comments.
  • My change is accurately described in the commit message.
  • My contribution is tested and working as described.
  • I have read the Developer Certificate of Origin and certify my contribution under its conditions.

Summary

  • When an HDMI monitor power-cycles, the kernel HPD de-assert produces the same udev event as a cable unplug. Previously cosmic-comp destroyed the Output (wl_output global, workspace state, client bindings) on disconnect and rebuilt everything on reconnect, causing browsers' D-Bus sessions to get stuck in Auth state with unrecoverable rendering failures.
  • Track disconnected connectors as "suspended" instead of removing their Output. The Surface (render thread, DRM resources) is still destroyed since the hardware is gone, but the wl_output global, workspace assignments, and client bindings all survive.
  • On reconnect, connector_added() finds the existing Output (via outputs.get(&conn).cloned() at device.rs:547) and creates a fresh Surface for it.

What changes

| Component | Before | After |
| --- | --- | --- |
| wl_output global | Destroyed + recreated | Persists |
| Client surface bindings | Severed | Intact |
| Workspace assignments | Lost + migrated | Preserved |
| Layer surfaces (panels) | Closed + respawned | Preserved |
| Window positions | Rearranged | Preserved |
| Surface (render thread) | Destroyed + recreated | Destroyed + recreated (unchanged) |

Implementation

A suspended_connectors: HashSet<connector::Handle> field is added to InnerDevice. When a connector disappears:

  • Surface is destroyed (DRM resources are invalid) — unchanged
  • Output is kept alive and connector is marked suspended — new

When the connector reappears, it is reported as "added": connector_added() reuses the existing Output, creates a new Surface for it, and clears the suspended flag.
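
For illustration only, the suspend/resume flow described above can be modelled with plain standard-library types. This is a minimal sketch, not the code in this PR: `InnerDeviceSketch`, `Output`, `Surface`, `ConnectorHandle` and the id counters are hypothetical stand-ins, and only the `suspended_connectors` field name and the `outputs.get(&conn).cloned()` lookup mirror the description.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-in for connector::Handle.
type ConnectorHandle = u32;

// Hypothetical stand-in for the Output (wl_output global, workspace state).
#[derive(Clone, Debug, PartialEq)]
struct Output {
    global_id: u32,
}

// Hypothetical stand-in for the Surface (render thread, DRM resources).
#[derive(Debug)]
struct Surface {
    generation: u32,
}

#[derive(Default)]
struct InnerDeviceSketch {
    outputs: HashMap<ConnectorHandle, Output>,
    surfaces: HashMap<ConnectorHandle, Surface>,
    suspended_connectors: HashSet<ConnectorHandle>,
    next_global_id: u32,
    next_generation: u32,
}

impl InnerDeviceSketch {
    // Connector appeared: reuse a surviving Output, always build a fresh Surface.
    fn connector_added(&mut self, conn: ConnectorHandle) -> Output {
        let output = match self.outputs.get(&conn).cloned() {
            Some(existing) => existing,
            None => {
                self.next_global_id += 1;
                let created = Output { global_id: self.next_global_id };
                self.outputs.insert(conn, created.clone());
                created
            }
        };
        self.next_generation += 1;
        self.surfaces
            .insert(conn, Surface { generation: self.next_generation });
        // The connector is live again, so it is no longer suspended.
        self.suspended_connectors.remove(&conn);
        output
    }

    // Connector disappeared: drop the Surface, keep the Output, mark suspended.
    fn connector_removed(&mut self, conn: ConnectorHandle) {
        self.surfaces.remove(&conn);
        if self.outputs.contains_key(&conn) {
            self.suspended_connectors.insert(conn);
        }
    }
}
```

In this model a remove followed by an add on the same connector returns an Output equal to the original one, while the Surface generation increases, which matches the "Persists" and "Destroyed + recreated" rows in the table.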

Suspend-without-timeout was chosen over a debounce approach (wait N seconds, then destroy). A timeout adds complexity and race conditions for minimal benefit — permanently-removed connectors are cleaned up when device_removed() handles full GPU removal, which is the only case where the hardware is truly gone rather than power-cycled.

Two files changed, ~28 lines added (~18 net). Edge cases handled:

  • Repeated udev events while disconnected (skipped)
  • Full GPU device removal (device_removed() cleans up suspended outputs)
  • enumerate_surfaces() skips suspended connectors (avoids bail!("Missing crtc"))
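
The three edge cases can be sketched the same way. Again this is a simplified model under stated assumptions rather than the actual patch: `DeviceModel`, its fields, the `String` output names and the return types are hypothetical, and the real functions take different arguments.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical stand-in for connector::Handle.
type ConnectorHandle = u32;

#[derive(Default)]
struct DeviceModel {
    // Connector -> output name; entries survive a transient disconnect.
    outputs: BTreeMap<ConnectorHandle, String>,
    // Connector -> CRTC id; present only while a Surface exists.
    crtcs: BTreeMap<ConnectorHandle, u32>,
    suspended: BTreeSet<ConnectorHandle>,
}

impl DeviceModel {
    // Returns false when the event is a repeat for an already-suspended
    // (or unknown) connector, so the caller can skip it.
    fn connector_removed(&mut self, conn: ConnectorHandle) -> bool {
        if self.suspended.contains(&conn) || !self.outputs.contains_key(&conn) {
            return false;
        }
        self.crtcs.remove(&conn);
        self.suspended.insert(conn);
        true
    }

    // Suspended connectors have no CRTC by design, so they are skipped
    // instead of being reported as a "Missing crtc" error.
    fn enumerate_surfaces(&self) -> Result<Vec<(ConnectorHandle, u32)>, String> {
        let mut found = Vec::new();
        for conn in self.outputs.keys() {
            if self.suspended.contains(conn) {
                continue;
            }
            let crtc = self
                .crtcs
                .get(conn)
                .ok_or_else(|| format!("Missing crtc for connector {conn}"))?;
            found.push((*conn, *crtc));
        }
        Ok(found)
    }

    // Full GPU removal: every Output goes away, suspended or not.
    fn device_removed(&mut self) -> Vec<String> {
        self.suspended.clear();
        self.crtcs.clear();
        std::mem::take(&mut self.outputs).into_values().collect()
    }
}
```

The point of the model is that the suspended set is the only extra state: it gates repeated events, gates enumeration, and is emptied on device removal.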

Test plan

  • cargo build --release — clean, zero warnings
  • Power-cycle HDMI monitor via external power button — Firefox/Librewolf tabs continue rendering
  • Unplug/replug HDMI cable — Firefox/Librewolf tabs continue rendering
  • VT switch while monitor is disconnected
  • Different monitor plugged into same connector

Supersedes #2115, which was filed as an issue with the patch linked externally because GitHub forking was restricted at the time. This PR contains the same fix submitted as a proper PR. Related: #1463, #906 — previous fixes made the destroy/recreate path crash-free; this PR avoids the destroy path entirely for transient disconnects.

When HDMI monitors power-cycle, the kernel HPD de-assert produces the
same udev event as a cable unplug. Previously cosmic-comp destroyed the
Output (wl_output global, workspace state, client bindings) on disconnect
and rebuilt everything on reconnect, causing browsers' D-Bus sessions to
get stuck in Auth state with unrecoverable rendering failures.

Track disconnected connectors as "suspended" instead of removing their
Output. The Surface (render thread, DRM resources) is still destroyed
since the hardware is gone, but the wl_output global, workspace
assignments, and client bindings survive. On reconnect, connector_added()
finds the existing Output and creates a fresh Surface for it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@modelmiser
Author

Hardware: Minisforum MS-01 mini workstation, Intel 13th gen i9-13900H (Iris Xe, Raptor Lake-P), using the mainboard HDMI port. An NVIDIA RTX 4000 SFF Ada is also installed, but it is compute-only and does not drive any displays.

GPU topology (no cross-GPU interaction):

card1 (i915, Intel Iris Xe):  HDMI-A-1 connected/enabled (only connector)
card2 (nvidia, RTX 4000 SFF): DP-5..8 all disconnected, Display Active: Disabled

The NVIDIA GPU has no display role — all rendering goes through i915/HDMI-A-1.

Two reproduction paths:

  1. Physical power-cycle: monitor has a physical power button — off, wait 30 seconds to a minute, back on. After reconnect, new Gecko browser tabs (Firefox, Librewolf) render blank. Existing tabs are fine.
  2. COSMIC power settings: Settings → Power & Battery → Power Saving Options → "Turn off the screen after" set to anything other than Never. When the screen turns off, wait 30 seconds to a minute. After it turns back on, same result — new tabs render blank.

If your monitors recover cleanly, it could be connection type (DP vs HDMI), GPU driver, or how the monitor handles the power cycle. Happy to gather logs or test further if that would help narrow it down.

@Drakulix
Member

So this just keeps wl_outputs alive indefinitely? I am sorry, but this is no solution. Clients need to be able to handle wl_outputs appearing and disappearing; that is part of the core Wayland protocol.

@modelmiser
Author

Thanks for the feedback — understood that preserving wl_output indefinitely isn't the right compositor-side fix, and that clients are expected to handle output removal and re-addition per the Wayland protocol. I'll close this PR.

Wanted to close the loop on the client side: I've investigated the Firefox/Gecko failure and filed it upstream with full diagnostic evidence:

https://bugzilla.mozilla.org/show_bug.cgi?id=2018866

The failure mode on my hardware (i915/HDMI-A-1, details in the bug): after the wl_output destroy/recreate sequence, Firefox permanently disables hardware compositing for the session (layers.acceleration.disabled=true), and the new wl_output's geometry/scale events appear not to be applied — Display0 ends up recorded as 0x0@60Hz scales:0.000000|0.000000. New tabs render blank; only a restart recovers. The GPU process also appears to crash and restart at the moment of reconnect.

Note that @hojjatabdollahi doesn't reproduce this, so it may be hardware- or connection-dependent. Linking here in case other COSMIC users report similar symptoms — the Bugzilla bug has the details.

@modelmiser closed this Feb 24, 2026