Fix H264 hardware video decoding failure #2042

Open
kdxcxs wants to merge 2 commits into novnc:master from kdxcxs:fix/h264-hw-decode

@kdxcxs kdxcxs commented Feb 9, 2026

Summary

H264 decoding via WebCodecs VideoDecoder silently fails when hardware acceleration is used (e.g. Intel iGPU with D3D11 on Windows). The screen stays blank with no errors in the console. This PR fixes two issues:

  1. self vs this typo in H264Context.decode(): self refers to window in a browser context, so the SPS parameters (profile, constraint set, level) were set on the global object instead of the H264Context instance, and the decoder was never configured.

  2. Render queue deadlock with hardware video decoder — The display render queue blocks on unready video frames (ready = false), which triggers the _flushing backpressure mechanism in rfb.js, stopping all VNC message processing. This starves the VideoDecoder pipeline of input. Hardware decoders (unlike software decoders) may buffer frames internally for reordering (e.g. H264 High profile allows B-frames), requiring continued input before producing output. The result is a deadlock:

    Decoder waiting for more input → render queue waiting for decoder output
    → _flushing stops VNC messages → no more input fed to decoder → deadlock
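
The loop above can be reproduced with a toy model. This is only a sketch under stated assumptions: `BufferingDecoder` and `feed` are hypothetical names, the decoder is modelled as holding a fixed number of chunks before emitting anything, and none of this is noVNC or WebCodecs code.

```javascript
// Hypothetical model of a decoder that holds `delay` chunks before
// it emits any output (roughly what the hardware decoder appears to do).
class BufferingDecoder {
    constructor(delay, onOutput) {
        this._delay = delay;
        this._pending = [];
        this._onOutput = onOutput;
    }

    decode(chunk) {
        this._pending.push(chunk);
        // Output only appears once more than `delay` chunks are buffered.
        while (this._pending.length > this._delay) {
            this._onOutput(this._pending.shift());
        }
    }
}

// Feed chunks to the decoder. With `blockOnUnready` set, feeding stops
// as long as the frame at the head of the queue has not been output,
// which mimics the blocking render queue plus the _flushing backpressure.
function feed(chunks, delay, blockOnUnready) {
    const ready = new Set();
    const decoder = new BufferingDecoder(delay, c => ready.add(c));
    const queue = [];
    let submitted = 0;

    for (const chunk of chunks) {
        // Frames leave the queue once the decoder has produced them.
        while (queue.length > 0 && ready.has(queue[0])) {
            queue.shift();
        }
        if (blockOnUnready && queue.length > 0) {
            break; // head never becomes ready, so input stops for good
        }
        decoder.decode(chunk);
        queue.push(chunk);
        submitted++;
    }
    return { submitted: submitted, output: ready.size };
}
```

In this model, `feed([0, 1, 2, 3, 4], 2, true)` submits one chunk and gets no output, while the same call with `blockOnUnready` set to `false` submits all five and gets three frames back. A decoder with `delay` 0 never stalls either way, which matches the observation that only buffering decoders are affected.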
    

Changes

  • core/decoders/h264.js: Replace self._profileIdc / self._constraintSet / self._levelIdc with this.*
  • core/display.js:
    • Video frames that aren't ready no longer block the render queue. Instead, they register an async callback to draw when the decoder produces output, allowing subsequent frames to continue being fed to the decoder.
    • The flip operation waits for all pending video frames to resolve before executing (Promise.all), preserving visual correctness.
    • Extract _drawVideoFrame() helper to deduplicate the frame drawing logic.
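
The display change can be sketched roughly as follows. This is a minimal illustration, not the code in core/display.js: `MiniDisplay`, `queueVideoFrame` and the `frame.decoded` promise are hypothetical names chosen for the example.

```javascript
// Hypothetical, stripped-down display; not noVNC's actual Display class.
class MiniDisplay {
    constructor() {
        this._pendingVideo = [];   // promises for video frames not yet drawn
        this.log = [];             // records the order of draw/flip operations
    }

    // `frame.decoded` is an assumed promise that the decoder's output
    // callback resolves once the frame is available.
    queueVideoFrame(frame) {
        const drawn = frame.decoded.then(() => {
            this.log.push('draw ' + frame.id);
        });
        this._pendingVideo.push(drawn);
    }

    // Flip only after every video frame queued so far has been drawn.
    async flip() {
        const pending = this._pendingVideo;
        this._pendingVideo = [];
        await Promise.all(pending);
        this.log.push('flip');
    }
}
```

Queuing a frame returns immediately, so further input can reach the decoder, and `flip()` resolves only after all outstanding draws have run. Note that nothing in this sketch orders the individual draws relative to each other or to other queued operations.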

Root cause analysis

Confirmed via chrome://tracing on Windows 10 with Intel UHD Graphics:

  • The D3D11 hardware decoder accepted 3 H264 frames (1 keyframe + 2 delta) with kOk status
  • DoDecode completed successfully, CreatePictureBuffers was called
  • BeginScopedWriteAccess was called 3 times but EndScopedWriteAccess was never called — the decoder held frames in its internal buffer
  • Zero OutputResult events, zero JS output callbacks
  • After ~6 seconds of silence, VideoDecoder::Shutdown was called

The render queue's blocking pattern was originally designed for Image objects (Tight/TightPNG encoding), where each image decodes independently. The same pattern was applied to video frames when H264 support was added, but VideoDecoder is a pipeline that requires continuous input flow — blocking the queue cuts off the input and creates a deadlock.

Tests

  • Verify H264 decoding works with hardware acceleration on Intel iGPU (Windows)
  • Verify H264 decoding still works with software decoding fallback
  • Verify no visual tearing or frame ordering issues during H264 playback

keradoxchen added 2 commits February 9, 2026 02:44
`self` refers to `window` in browser context, causing SPS parameters
to be set on the global object instead of the H264Context instance.
This prevents the decoder from ever being properly configured.

The render queue blocks on unready video frames and triggers the
`_flushing` mechanism in rfb.js, which stops all VNC message
processing. This starves the VideoDecoder of input, preventing it
from producing output — creating a deadlock where the queue waits
for decoder output that can never arrive.

Video frames now skip the queue blocking and draw asynchronously
via the decoder's output callback. The flip operation waits for all
pending frames to resolve before executing, preserving visual
correctness without blocking the decoder input pipeline.
Member

@CendioOssman CendioOssman left a comment

Thanks for your contribution!

Comment on lines 198 to 202
     if (parser.profileIdc !== null) {
-        self._profileIdc = parser.profileIdc;
-        self._constraintSet = parser.constraintSet;
-        self._levelIdc = parser.levelIdc;
+        this._profileIdc = parser.profileIdc;
+        this._constraintSet = parser.constraintSet;
+        this._levelIdc = parser.levelIdc;
     }
Member

A bit embarrassing that this was overlooked.

It suggests that we are lacking one or more unit tests for this code. Is it something you could have a look at?
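
A regression check for this could look something like the sketch below. It is only an illustration: `FixedContext`, `BuggyContext`, `fakeGlobal` and `spsStoredOnInstance` are hypothetical stand-ins that model just the SPS bookkeeping, and a local `self` bound to `fakeGlobal` emulates the browser, where `self` is the global object. A real test would exercise H264Context through the project's own test setup.

```javascript
// Stand-in for the browser global object (`self === window` in a page).
const fakeGlobal = {};

class FixedContext {
    constructor() {
        this._profileIdc = null;
        this._constraintSet = null;
        this._levelIdc = null;
    }

    decode(parser) {
        if (parser.profileIdc !== null) {
            this._profileIdc = parser.profileIdc;
            this._constraintSet = parser.constraintSet;
            this._levelIdc = parser.levelIdc;
        }
    }
}

class BuggyContext extends FixedContext {
    decode(parser) {
        const self = fakeGlobal; // emulates the browser's global `self`
        if (parser.profileIdc !== null) {
            // Same mistake as the original code: writes go to the global.
            self._profileIdc = parser.profileIdc;
            self._constraintSet = parser.constraintSet;
            self._levelIdc = parser.levelIdc;
        }
    }
}

// Returns true when the SPS fields end up on the instance itself.
function spsStoredOnInstance(Context) {
    const ctx = new Context();
    ctx.decode({ profileIdc: 100, constraintSet: 0, levelIdc: 31 });
    return ctx._profileIdc === 100 &&
           ctx._constraintSet === 0 &&
           ctx._levelIdc === 31;
}
```

`spsStoredOnInstance(FixedContext)` returns `true`, while `spsStoredOnInstance(BuggyContext)` returns `false` because the values land on `fakeGlobal` instead.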

Comment on lines +567 to +571
if (!a.frame.ready) {
// Don't block the queue — the video decoder
// pipeline needs continued input to produce
// output. Register a callback to draw later,
// and let the queue keep feeding the decoder.
Member

This gives us tearing, so I'm afraid it's not an approach we want to use.

B-frames should ideally be rare when used with VNC, as they will never be rendered and hence pointless to waste resources on.

But I can accept that you might not always have that control and we need to be prepared to deal with them.

Ideally, we queue them in the decoder rather than the display. Hopefully, we can detect these there?

Author

To clarify, this issue is not caused by B-frames. The stream I tested uses H264 High profile with only IDR and P-frames, yet Intel's hardware decoder still doesn't output frames immediately: it buffers a few frames of input before producing any output. This seems to be Intel-specific; NVIDIA and AMD GPUs output frames without delay. Since Intel iGPUs are very common (laptops, office machines), I think it's worth handling.

I agree that handling this in the display layer is not ideal. I'll look into addressing it on the decoder side (h264.js) instead. I'll update the PR once I have a revised approach.

Member

That would be a violation of the protocol, and could cause all kinds of weird rendering effects. Frames need to be displayed immediately after they are received. It's incorrect to buffer them and display them later.

So we need to find some way to get the Intel decoder to spit out frames right away. Perhaps there is some "low latency mode"?

Failing that, we'd need to blacklist it. Can we detect it and fall back to software decoding?

Author

I've looked into what's available from the WebCodecs API:

  • optimizeForLatency: true is already set in our configure() call, but Intel's decoder seems to ignore it.
  • VideoDecoder.flush() forces all pending outputs, but per spec it sets [[key chunk required]] = true afterwards, meaning the next decode() must provide a key frame. Not suitable for calling after every frame in a continuous stream with P-frames.
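
The flush() constraint can be illustrated with a toy state machine. This is a simplified model of the spec behaviour described above, not the WebCodecs API itself: `KeyChunkModel` is a hypothetical class, and a plain `Error` stands in for the `DataError` a real decoder would raise.

```javascript
// Toy model of the [[key chunk required]] flag; not a real VideoDecoder.
class KeyChunkModel {
    constructor() {
        this._keyChunkRequired = true; // a fresh decoder needs a key frame
        this._pending = [];
        this.outputs = [];
    }

    decode(chunk) {
        if (this._keyChunkRequired) {
            if (chunk.type !== 'key') {
                throw new Error('DataError: key chunk required');
            }
            this._keyChunkRequired = false;
        }
        this._pending.push(chunk);
    }

    // Emits everything that is pending, then demands a key frame again.
    flush() {
        this.outputs.push(...this._pending);
        this._pending = [];
        this._keyChunkRequired = true;
    }
}
```

In this model, decoding a key chunk and flushing yields one output, but a following `decode({ type: 'delta' })` throws. That is why flushing after every frame would break a stream made mostly of P-frames.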

As far as I can tell, there's no other way from JavaScript to change Intel's buffering behavior. The only practical option is detection and fallback to software decoding via hardwareAcceleration: "prefer-software". This does come with a performance cost, but at least it produces correct output.

For detection, we can query the GPU vendor via WebGL's WEBGL_debug_renderer_info extension (widely available, though blocked in some privacy-hardened browsers).
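
A rough sketch of that detection and fallback, with some assumptions: the helper names are hypothetical, and matching any renderer string that contains "Intel" is a crude heuristic chosen for illustration rather than a vetted list of affected hardware.

```javascript
// Read the unmasked renderer string from a WebGL context, or null when
// the context or the extension is unavailable.
function getUnmaskedRenderer(gl) {
    if (!gl) {
        return null;
    }
    const ext = gl.getExtension('WEBGL_debug_renderer_info');
    if (!ext) {
        return null;
    }
    return gl.getParameter(ext.UNMASKED_RENDERER_WEBGL);
}

// Assumption: any renderer string mentioning "Intel" is treated as affected.
function pickHardwareAcceleration(renderer) {
    if (typeof renderer === 'string' && /intel/i.test(renderer)) {
        return 'prefer-software';
    }
    return 'no-preference';
}

// Build a VideoDecoder configuration for the given codec string.
function buildDecoderConfig(codec, gl) {
    return {
        codec: codec,
        optimizeForLatency: true,
        hardwareAcceleration: pickHardwareAcceleration(getUnmaskedRenderer(gl)),
    };
}
```

In a browser, `gl` would come from something like `document.createElement('canvas').getContext('webgl')`, and the result would be passed to `configure()`. When the extension is blocked, this sketch falls back to `'no-preference'`, so affected machines would still hit the problem; defaulting to software decoding in that case is the more conservative alternative.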
