fix: pace low-power GPU presentation with vsync by oz-for-oss[bot] · Pull Request #13119 · warpdotdev/warp

oz-for-oss · 2026-06-26T23:46:24Z

Closes #2319

Summary

Thread the GPU power preference into WGPU surface configuration.
Use PresentMode::AutoVsync for low-power rendering so typing and scrolling are paced to display refresh instead of using the non-vsync presentation path.
Preserve PresentMode::AutoNoVsync for the high-performance rendering path to keep the existing low-latency behavior when users opt out of low-power rendering.
Add unit tests for both present-mode mappings.
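
The mapping described in this summary could look roughly like the sketch below. This is not the code from the PR: `present_mode_for` is a hypothetical helper name, and the two enums are local stand-ins for `wgpu::PowerPreference` and `wgpu::PresentMode` so the snippet compiles without the wgpu crate. The real `wgpu::PowerPreference` also has a `None` variant, which this sketch leaves out because the summary does not say how it is handled.

```rust
// Local stand-ins for `wgpu::PowerPreference` and `wgpu::PresentMode`
// (assumption: real code would use the wgpu types directly).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PowerPreference {
    LowPower,
    HighPerformance,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PresentMode {
    AutoVsync,
    AutoNoVsync,
}

/// Hypothetical helper: pick the surface present mode from the GPU power
/// preference. Low-power rendering is paced to display refresh; the
/// high-performance path keeps the non-vsync, low-latency behavior.
fn present_mode_for(power: PowerPreference) -> PresentMode {
    match power {
        PowerPreference::LowPower => PresentMode::AutoVsync,
        PowerPreference::HighPerformance => PresentMode::AutoNoVsync,
    }
}
```

The returned value would then be assigned to the `present_mode` field of the surface configuration before the surface is configured.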

Validation

./script/format --check
cargo test -p warpui presentation_mode --no-fail-fast
cargo clippy --workspace --exclude warp_completer --all-targets --tests -- -D warnings
cargo clippy -p warp_completer --all-targets --tests -- -D warnings

Notes

No approved spec context was available for this issue, so this implements the smallest rendering-layer mitigation aligned with recent reports of GPU spikes while typing or scrolling.
A direct cargo clippy --workspace --all-targets --all-features --tests -- -D warnings run is not the current presubmit path and fails on unrelated all-features dead-code/unused-item warnings across terminal modules.

Co-Authored-By: Elijah Lynn <ElijahLynn@users.noreply.github.com>
Co-Authored-By: Oz <oz-agent@warp.dev>

ElijahLynn · 2026-06-27T05:08:09Z

Manual A/B test results (native Wayland, Intel Iris Xe)

I reproduced the high GPU usage from #2319 on a local dev build of this PR branch and ran an automated A/B comparison between the two present-mode paths. The PR change did not meaningfully reduce GPU load under a heavy typing/scroll workload.

Environment

Hardware: Intel Iris Xe Graphics (ADL GT2), single integrated GPU (no discrete GPU)
OS: Arch Linux, GNOME on Wayland
Build: local dev build via cargo (warp-oss binary), PR branch checked out locally
Windowing: native Wayland (system.force_x11 = false, launched with WARP_ENABLE_WAYLAND=1)
Measurement: whole-GPU i915 PMU rcs0-busy via perf_event_open (same source btop / intel_gpu_top use), not per-process fdinfo

Workload

Automated, repeatable input via ydotool (kernel uinput):

rapid typing into the prompt + clear (Ctrl+U)
seq 1 4000 output flood + autoscroll
PageUp/PageDown scrollback churn

~18s workload, ~20s GPU sampling at 200ms intervals.

Results

| Mode | `prefer_low_power_gpu` | Present mode | Render avg | Render peak |
| --- | --- | --- | --- | --- |
| vsync (PR path) | `true` | `AutoVsync` | 49.9% | 76.8% |
| novsync (old path) | `false` | `AutoNoVsync` | 51.0% | 82.1% |

Both runs used the same binary, same native Wayland config, and the same automated workload. The difference is within noise (about 1 percentage point on average, about 5 on peak).

Observations

On a single-iGPU Linux machine, enabling "prefer low power GPU" was already possible (integrated GPU detected), but stable builds did not change present mode — only adapter selection. This PR adds the missing vsync pacing for that path, but it did not fix the reported symptom for me.
AutoVsync caps frame rate, not per-frame render cost. Under sustained typing/scrolling Warp still drives the render engine hard (~50% avg, ~77–82% peak whole-GPU).
On Wayland, a meaningful fraction of compositing cost may land outside Warp's process (GNOME/mutter), so whole-GPU PMU is the right metric for user-visible impact.

Suggestion

This looks like a reasonable small mitigation for uncapped over-presentation, but for cases like mine (single Iris Xe, native Wayland, heavy terminal redraw) a deeper fix may be needed — e.g. frame coalescing/throttling on input, dirty-region rendering, or reducing full-grid redraws during typing/scroll.

Happy to re-run with different configs (XWayland vs Wayland, longer scrollback, etc.) if useful.

ElijahLynn · 2026-06-27T05:09:12Z

Reproducibility: methodology + scripts

Follow-up to the A/B results above — full reproduction steps and the scripts used.

Prerequisites

# Build the PR branch locally
gh pr checkout 13119
cargo build

# GPU measurement: i915 PMU via perf (same source as btop/intel_gpu_top)
# btop ships with cap_perfmon; for a script, one option is:
sudo sysctl kernel.perf_event_paranoid=0   # revert: =2

# Input automation under native GNOME Wayland (kernel uinput)
sudo pacman -S --needed ydotool   # or apt equivalent
# ydotoold runs as your user if /dev/uinput ACL grants access

btop reference: cloned https://github.com/aristocratos/btop and traced GPU measurement to vendored intel_gpu_top (src/linux/intel_gpu_top/), which reads i915 PMU *-busy events (rcs0-busy = render) via perf_event_open with PERF_FORMAT_TOTAL_TIME_ENABLED. Utilization = Δbusy_ns / Δtime_enabled_ns. Whole-GPU (pid=-1), not per-process fdinfo.
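
For reference, the utilization formula above amounts to the following. This is a minimal sketch of the arithmetic only, not the code from the gist or from intel_gpu_top; `PmuSample` and `utilization_percent` are names I made up for illustration, and the sketch assumes both counters are cumulative nanosecond values read from the same perf event.

```rust
/// One reading of a perf counter opened with PERF_FORMAT_TOTAL_TIME_ENABLED:
/// the accumulated engine-busy time and the time the counter has been enabled.
/// (Hypothetical type, for illustration only.)
#[derive(Clone, Copy, Debug)]
struct PmuSample {
    busy_ns: u64,
    time_enabled_ns: u64,
}

/// Utilization in percent between two consecutive samples:
/// 100 * delta(busy_ns) / delta(time_enabled_ns).
/// Returns None if a counter went backwards or no time elapsed.
fn utilization_percent(prev: PmuSample, cur: PmuSample) -> Option<f64> {
    let busy = cur.busy_ns.checked_sub(prev.busy_ns)?;
    let enabled = cur.time_enabled_ns.checked_sub(prev.time_enabled_ns)?;
    if enabled == 0 {
        return None;
    }
    Some(100.0 * busy as f64 / enabled as f64)
}
```

With a 200ms sampling interval, an engine that was busy for 100ms of that window comes out at 50%.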

Per-process /proc/<pid>/fdinfo drm-engine-render under-counts on Wayland because compositor work lands in mutter/gnome-shell.

Config (dev build `~/.config/warp-oss/settings.toml`)

[system]
force_x11 = false
prefer_low_power_gpu = true   # vsync / AutoVsync path (PR)
# prefer_low_power_gpu = false  # novsync / AutoNoVsync path (baseline)

Launch with native Wayland (also hides the Wayland settings toggle in UI by design):

WARP_ENABLE_WAYLAND=1 /path/to/target/debug/warp-oss

Scripts

Public gist with the three scripts used:

https://gist.github.com/ElijahLynn/dc18971e77101b32a823215b8fa67a98

| File | Role |
| --- | --- |
| `gpu_pmu_sampler.py` | Samples whole-GPU `rcs0-busy` (render) via i915 PMU, writes CSV |
| `workload.sh` | Fixed typing / `seq 1 4000` / PageUp-Down workload via ydotool |
| `run_mode_auto.sh` | Writes config, relaunches warp-oss, countdown, runs sampler + workload |

Run commands (what I actually ran)

mkdir -p ~/warp-gpu-test
# copy scripts from gist into ~/warp-gpu-test, chmod +x

ydotoold --socket-path "$XDG_RUNTIME_DIR/.ydotool_socket" --socket-own "$(id -u):$(id -g)" &

# AutoVsync (PR path)
./run_mode_auto.sh vsync true 10 20 18
# -> /tmp/gpu_vsync.csv

# AutoNoVsync (baseline) — re-focus Warp during countdown
./run_mode_auto.sh novsync false 10 20 18
# -> /tmp/gpu_novsync.csv

Important: ydotool sends input to the focused window. Click the Warp terminal pane during the countdown and don't touch keyboard/mouse during the ~18s workload.

Summarize CSVs

for f in /tmp/gpu_vsync.csv /tmp/gpu_novsync.csv; do
  echo -n "$f: "
  awk -F, 'NR>1{r+=$2; n++; if($2>p)p=$2} END{printf "render avg=%.1f%% peak=%.1f%%\n", r/n, p}' "$f"
done
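
If awk is not handy, the same summary can be sketched in Rust. This assumes the CSV layout implied by the awk one-liner above (one header row, render busy percent in the second column); `summarize` is a hypothetical name, not something from the gist. Unlike the awk version, it skips rows whose second field does not parse as a number and returns `None` for an empty file instead of dividing by zero.

```rust
/// Returns (average, peak) of the second CSV column, skipping the header row
/// and any row whose second field is missing or not a number.
/// Returns None if no data rows could be parsed.
fn summarize(csv: &str) -> Option<(f64, f64)> {
    let mut sum = 0.0;
    let mut peak = f64::MIN;
    let mut n = 0u32;
    for line in csv.lines().skip(1) {
        let Some(field) = line.split(',').nth(1) else { continue };
        let Ok(value) = field.trim().parse::<f64>() else { continue };
        sum += value;
        peak = peak.max(value);
        n += 1;
    }
    if n == 0 {
        None
    } else {
        Some((sum / f64::from(n), peak))
    }
}
```

Reading a file with `std::fs::read_to_string` and passing the contents to `summarize` gives the same two numbers the awk loop prints for well-formed input.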

fix: pace low-power GPU presentation with vsync

1138877

Co-Authored-By: Elijah Lynn <ElijahLynn@users.noreply.github.com>
Co-Authored-By: Oz <oz-agent@warp.dev>

cla-bot Bot added the cla-signed label Jun 26, 2026

oz-for-oss Bot mentioned this pull request Jun 26, 2026

Too high GPU memory usage #2319

Open

oz-for-oss[bot] wants to merge 1 commit into master from oz-agent/implement-issue-2319
