process cache: replace single refcnt with structured ProcessLifecycle for safe cache eviction #4665

@sglushko

Description

The process cache uses a single atomic.Uint32 reference counter (refcnt) to track process lifecycle and determine when a process can be safely evicted. This design has several correctness issues:

  1. Double decrement on exit + cleanup. When a process is replaced by execve, both an exit event and a cleanup event fire for the old process. Each handler independently calls proc.RefDec("process") and parent.RefDec("parent"), leading to refcnt underflow:

     process++: 1, process--: 2  →  refcnt wraps to 4294967295
     parent++: 1, parent--: 2    →  parent refcnt wraps to 4294967295

  2. Premature cache eviction. The garbage collector uses refcnt.Load() != 0 to decide whether a process can be deleted. However, refcnt == 0 does not mean the process lifecycle is complete: a process can reach refcnt == 0 before its exit event arrives, causing premature eviction and loss of process context for subsequent events.

  3. No separation of concerns. A single counter mixes different lifecycle aspects (process start/exit, parent/child relationships, ancestor tracking). This makes it impossible to determine why a process is still alive or which component is holding a reference.

  4. No idempotency guarantees. Multiple code paths can decrement the same logical reference (e.g., both exit and cleanup decrement "parent"), with no mechanism to ensure each logical operation happens exactly once.
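
To make the points above concrete, here is a minimal, self-contained sketch. The first function reproduces the wraparound arithmetic with a plain atomic.Uint32. The rest is one possible shape for an idempotent lifecycle, written only for illustration: the lifecycle type, start, markExited and canEvict are hypothetical names and are not the ProcessLifecycle design or any existing code in the process cache.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// underflowDemo mirrors the log lines above: one increment followed by
// two decrements (exit handler + cleanup handler) on an atomic.Uint32.
func underflowDemo() uint32 {
	var refcnt atomic.Uint32
	refcnt.Add(1)          // process++
	refcnt.Add(^uint32(0)) // process-- from the exit handler
	refcnt.Add(^uint32(0)) // process-- again from the cleanup handler
	return refcnt.Load()
}

// lifecycle is a hypothetical stand-in that keeps "has this process
// exited" separate from "who still references it".
type lifecycle struct {
	children atomic.Uint32 // children that have not exited yet
	exited   atomic.Bool   // set exactly once, by exit or cleanup
	parent   *lifecycle
}

// start registers a new process and takes one reference on its parent.
func start(parent *lifecycle) *lifecycle {
	if parent != nil {
		parent.children.Add(1)
	}
	return &lifecycle{parent: parent}
}

// markExited may be called by both the exit and the cleanup path.
// Only the first caller wins the CompareAndSwap and releases the
// parent reference; later calls are no-ops and return false.
func (l *lifecycle) markExited() bool {
	if !l.exited.CompareAndSwap(false, true) {
		return false
	}
	if l.parent != nil {
		l.parent.children.Add(^uint32(0))
	}
	return true
}

// canEvict requires an observed exit in addition to a zero count, so
// a process with no references but no exit event yet is kept.
func (l *lifecycle) canEvict() bool {
	return l.exited.Load() && l.children.Load() == 0
}

func main() {
	fmt.Println(underflowDemo()) // 4294967295

	parent := start(nil)
	child := start(parent)
	fmt.Println(child.canEvict())       // false: exit not seen yet
	fmt.Println(child.markExited())     // true: exit handler
	fmt.Println(child.markExited())     // false: cleanup handler, no-op
	fmt.Println(parent.children.Load()) // 0, not 4294967295
}
```

The sketch only covers the exit/cleanup and parent/child cases. Ancestor tracking and the real event ordering would need their own state, which is not modeled here.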
