Race condition: program invocation can occur before trampoline table is populated #4983

@Alan-Jowett

Summary

A race condition exists in _ebpf_program_type_specific_program_information_attach_provider() where program->extension_program_data is set to non-NULL before the trampoline table is updated via _ebpf_program_update_helpers(). This allows program invocation to occur with a stale or uninitialized trampoline table, leading to calls into invalid helper function addresses.

Root Cause Analysis

The Race Window

In libs/execution_context/ebpf_program.c, the attach provider function has this sequence:

// Line 532-533: Comment acknowledges the issue!
// "This should be done after the call to NmrClientAttachProvider, but _ebpf_program_update_helpers requires
// the program information to be set."

// Line 536: Unblock calls to use the program information - ENABLES INVOCATION
program->extension_program_data = extension_program_data;

// Line 538: Initialize rundown protection
ExInitializeRundownProtection(&program->program_information_rundown_reference);

// Line 543: Update helpers - POPULATES TRAMPOLINE TABLE (too late!)
if (_ebpf_program_update_helpers(program) != EBPF_SUCCESS) {
    ...
}

The Invocation Check

In ebpf_program_invoke() (line 1536):

// Check if extension_program_data is non-NULL (allows invocation)
if (ReadPointerNoFence((void* const volatile*)(&program->extension_program_data)) == NULL) {
    *result = 0;
    return EBPF_EXTENSION_FAILED_TO_LOAD;
}

As soon as extension_program_data is set to non-NULL, invocations are permitted. The JIT code will use the trampoline table to call helper functions.
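The ordering problem can be reproduced deterministically in a small single-threaded model. The sketch below is not the eBPF for Windows code: `model_program_t`, `model_attach_buggy`, and the probe callback are hypothetical stand-ins that only mirror the order of operations quoted above. The probe runs at the point where a concurrent invoker could run, and records what that invoker would see.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*helper_fn_t)(uint64_t);

/* Hypothetical stand-in for the program object; not the real ebpf_program_t. */
typedef struct {
    const void* extension_program_data; /* non-NULL means "invocation allowed" */
    helper_fn_t trampoline[1];          /* helper addresses used by JIT code */
} model_program_t;

typedef void (*window_probe_t)(const model_program_t* program, void* context);

uint64_t
model_helper(uint64_t value)
{
    return value + 1;
}

/* Mirrors the order described in the issue: publish the pointer, then fill the table. */
void
model_attach_buggy(model_program_t* program, const void* data, window_probe_t probe, void* context)
{
    program->extension_program_data = data; /* the invocation gate opens here */
    if (probe != NULL) {
        probe(program, context);            /* what a concurrent invoker could observe */
    }
    program->trampoline[0] = model_helper;  /* table populated only afterwards */
}

/* Records whether the gate was open while the helper slot was still empty. */
void
record_window(const model_program_t* program, void* context)
{
    int* gate_open_table_empty = (int*)context;
    *gate_open_table_empty =
        program->extension_program_data != NULL && program->trampoline[0] == NULL;
}

/* Returns 1 if the probe saw the gate open with an unpopulated table. */
int
model_window_is_observable(void)
{
    static const int provider_data = 0;
    model_program_t program = {0};
    int gate_open_table_empty = 0;

    model_attach_buggy(&program, &provider_data, record_window, &gate_open_table_empty);
    return gate_open_table_empty;
}
```

In this model `model_window_is_observable()` returns 1: between the pointer store and the table update there is a state in which the NULL check passes but the helper slot is still empty. In the real driver the slot may instead hold a stale address, which is harder to detect.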

Race Sequence

| Time | Attach Thread | Invoke Thread (e.g., WFP callback) |
|------|---------------|------------------------------------|
| T1 | Set extension_program_data (non-NULL) | |
| T2 | | Check extension_program_data != NULL → TRUE |
| T3 | | Get context_descriptor from extension_program_data |
| T4 | | Invoke JIT code |
| T5 | | JIT calls helper via trampoline table |
| T6 | _ebpf_program_update_helpers() runs | |
| T7 | | CRASH: Trampoline has stale/uninitialized addresses! |

At T5, the JIT code uses the trampoline table, but _ebpf_program_update_helpers() has not populated it yet (or is still running concurrently). The trampoline may therefore contain any of the following:

  • Old addresses from a previous provider instance
  • Uninitialized/garbage addresses
  • Addresses from a driver that has since unloaded
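One possible way to close the window, given the constraint that _ebpf_program_update_helpers needs the program information to be set first, is to stop using extension_program_data as the invocation gate and publish a separate readiness flag only after the table is populated. The following is a minimal sketch of that idea using C11 atomics, not a proposed patch: `gated_program_t`, `invocation_ready`, and the `gated_*` functions are hypothetical names, and the kernel code would use its own interlocked primitives rather than `<stdatomic.h>`.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*helper_fn_t)(uint64_t);

/* Hypothetical model; field names other than extension_program_data are invented. */
typedef struct {
    const void* extension_program_data; /* set first so helper setup can read it */
    helper_fn_t trampoline[1];          /* helper addresses used by JIT code */
    atomic_bool invocation_ready;       /* hypothetical gate, not in the source */
} gated_program_t;

uint64_t
gated_helper(uint64_t value)
{
    return value * 2;
}

void
gated_init(gated_program_t* program)
{
    program->extension_program_data = NULL;
    program->trampoline[0] = NULL;
    atomic_init(&program->invocation_ready, false);
}

void
gated_attach(gated_program_t* program, const void* data)
{
    program->extension_program_data = data; /* available to helper setup */
    program->trampoline[0] = gated_helper;  /* populate the table */
    /* Release store: the writes above are visible to any thread that
       observes invocation_ready == true through an acquire load. */
    atomic_store_explicit(&program->invocation_ready, true, memory_order_release);
}

/* Returns 0 and writes *result on success; returns -1 if the gate is closed. */
int
gated_invoke(gated_program_t* program, uint64_t argument, uint64_t* result)
{
    if (!atomic_load_explicit(&program->invocation_ready, memory_order_acquire)) {
        *result = 0;
        return -1;
    }
    *result = program->trampoline[0](argument);
    return 0;
}
```

With this ordering, an invoker that arrives before `gated_attach` completes is rejected, and one that passes the check is guaranteed to see the populated slot. The sketch covers attach only; detach would still need to close the gate and wait for in-flight invocations, which is presumably the role of the rundown protection already present in the attach path.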

Evidence

Crash Observed

During concurrent attach/detach testing with driver stop/start between test runs, a crash was observed:

`
Child-SP RetAddr Call Site
ffffce05

Metadata

Labels: P3, triaged (discussed in a triage meeting)

Status: In Progress