
perf: parallel fingerprint computation via multiprocessing fork #364

@sjawhar

Description


Context

After module loading via runpy.run_path("pipeline.py"), fingerprinting runs sequentially over all stages in status.py:293-294:

for stage_name in execution_order:
    stage_registry.ensure_fingerprint(stage_name)

Fingerprinting is CPU-bound (AST parsing, normalization, closure walking). Python's GIL means threading doesn't help. In a 110-stage pipeline, this takes ~1300ms wall time.
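For context, a minimal sketch of why AST-based fingerprinting is pure CPU work (illustrative only; the real implementation also normalizes more aggressively and walks closures):

```python
import ast
import hashlib

def fingerprint_source(source: str) -> str:
    """Hash a normalized AST dump: pure parsing + hashing, no I/O."""
    tree = ast.parse(source)
    # Dropping attributes strips line/column info, so pure layout
    # changes do not alter the fingerprint.
    normalized = ast.dump(tree, include_attributes=False)
    return hashlib.sha256(normalized.encode()).hexdigest()

a = fingerprint_source("def f(x):\n    return x * 2\n")
b = fingerprint_source("def f(x):\n\n    return (x * 2)\n")  # same AST, new layout
print(a == b)  # → True
```

Work like this holds the GIL the whole time, which is why threads can't parallelize it but forked processes can.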

Proposal

After module loading, use multiprocessing with fork to parallelize fingerprint computation across cores.

  1. All stage functions are already in memory after runpy.run_path()
  2. Fork N worker processes (inherit address space via copy-on-write)
  3. Each worker fingerprints a partition of stages
  4. Merge results back to parent via pipe/queue
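The steps above can be sketched with a fork-based multiprocessing.Pool; STAGES and fingerprint here are stand-ins for the real stage registry and AST fingerprinting:

```python
import hashlib
import multiprocessing as mp

# Stand-in for the registry populated by runpy.run_path(); after fork,
# children inherit it via copy-on-write without re-importing anything.
STAGES = {f"stage_{i}": f"def stage_{i}(): ...\n" for i in range(110)}

def fingerprint(stage_name: str) -> tuple[str, str]:
    # Placeholder for the real CPU-bound AST parse/normalize/hash.
    source = STAGES[stage_name]
    return stage_name, hashlib.sha256(source.encode()).hexdigest()

def parallel_fingerprints(n_workers: int = 4) -> dict[str, str]:
    ctx = mp.get_context("fork")  # explicit fork, even where spawn is the default
    with ctx.Pool(n_workers) as pool:
        # chunksize partitions the stages across workers; the pool's
        # result queue merges the pairs back into the parent.
        pairs = pool.map(fingerprint, STAGES, chunksize=len(STAGES) // n_workers)
    return dict(pairs)

fps = parallel_fingerprints()
print(len(fps))  # → 110
```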

Expected impact

With 4 cores: ~1300ms → ~400ms (sub-linear due to fork and result-merge overhead)

Challenges

  • LMDB handles: Can't survive fork — children must reopen their own readonly StateDB
  • _pending_ast_writes: Each child accumulates its own list; parent must collect and flush all
  • WeakKeyDictionary caches: Per-process (redundant computation on shared helpers, but correct)
  • Fork safety: LMDB, logging, and thread state need care after fork; macOS discourages fork (CPython defaults to spawn there since 3.8)
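A hedged sketch of how the first two challenges might be handled: a per-worker initializer reopens its own readonly handle, and each task returns its pending writes so the parent can collect and flush them. StateDB and all names here are hypothetical stand-ins; the real classes and APIs may differ:

```python
import multiprocessing as mp

class StateDB:
    """Stub standing in for the real LMDB-backed StateDB (hypothetical API)."""
    def __init__(self, readonly: bool = False):
        self.readonly = readonly

_worker_db = None

def _init_worker():
    # LMDB handles can't survive fork, so each child opens its own
    # readonly handle instead of inheriting the parent's.
    global _worker_db
    _worker_db = StateDB(readonly=True)

def _fingerprint_stage(stage_name: str):
    # Placeholder for the real CPU-bound fingerprinting; the child also
    # records the AST writes it would have flushed and hands them back.
    fp = f"fp::{stage_name}"
    pending_writes = [(stage_name, "ast-blob")]
    return stage_name, fp, pending_writes

def fingerprint_all(stage_names, n_workers: int = 4):
    ctx = mp.get_context("fork")
    fingerprints, all_writes = {}, []
    with ctx.Pool(n_workers, initializer=_init_worker) as pool:
        for name, fp, writes in pool.map(_fingerprint_stage, stage_names):
            fingerprints[name] = fp
            all_writes.extend(writes)  # parent collects...
    # ...and flushes everything in one place once the workers are done.
    return fingerprints, all_writes

fps, writes = fingerprint_all(["stage_a", "stage_b", "stage_c"])
print(len(writes))  # → 3
```

Returning writes with each result keeps the merge on the pool's existing result queue rather than a separate pipe.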

Priority

Lower priority than #363 (stage manifest caching), which eliminates the need for recomputation in the common case. This issue addresses the cold/recompute path — useful when many files change simultaneously or on first run.
