Streaming debug_traceBlock with inline tracer to prevent OOM#9844

Open
qu0b wants to merge 4 commits into hyperledger:bal-devnet-2-with-prefetch from qu0b:qu0b/fix/streaming-debug-trace

@qu0b qu0b commented Feb 18, 2026

Problem

debug_traceBlockByHash accumulates all TraceFrame objects in an ArrayList for every transaction in a block before serializing. Each frame holds a full copy of EVM memory (up to 256KB). On blocks with 200K-600K+ opcode steps, this exhausts the heap and causes OutOfMemoryError, particularly under continuous load from tracing services like tracoor.

Solution

New StreamingDebugOperationTracer writes each struct log entry directly to a JsonGenerator during EVM execution via tracePostExecution(), then discards it immediately. No frames are accumulated. Combined with lazy memory capture (only 22 of ~140 opcodes actually touch memory), peak memory drops from O(all_frames_in_block) to O(1).
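
To make the "write, then discard" idea concrete, here is a minimal sketch and not the code in this PR. The class name StreamingTracerSketch, the onStep signature, and the use of a plain java.io.Writer in place of Jackson's JsonGenerator are all assumptions made for illustration. Each step is serialized as soon as it is reported and nothing is kept in a list.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical sketch: serializes each step immediately and retains no frames. */
class StreamingTracerSketch {
  private final Writer out;      // assumption: stands in for a JsonGenerator
  private boolean first = true;  // tracks whether a comma separator is needed
  private long written = 0;      // a counter is the only state that grows

  StreamingTracerSketch(Writer out) {
    this.out = out;
  }

  void begin() throws IOException {
    out.write("[");
  }

  /** Called once per executed opcode, in the spirit of tracePostExecution(). */
  void onStep(int pc, String op, long gas, long gasCost, int depth) throws IOException {
    if (!first) {
      out.write(",");
    }
    first = false;
    out.write("{\"pc\":" + pc + ",\"op\":\"" + op + "\",\"gas\":" + gas
        + ",\"gasCost\":" + gasCost + ",\"depth\":" + depth + "}");
    written++;
    // Nothing about this step is stored, so it is garbage once the method returns.
  }

  void end() throws IOException {
    out.write("]");
  }

  long written() {
    return written;
  }

  /** Streams two made-up steps into a StringWriter and returns the JSON text. */
  static String demo() {
    StringWriter sink = new StringWriter();
    StreamingTracerSketch tracer = new StreamingTracerSketch(sink);
    try {
      tracer.begin();
      tracer.onStep(0, "PUSH1", 100, 3, 1);
      tracer.onStep(2, "MSTORE", 97, 6, 1);
      tracer.end();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sink.toString();
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

In a real server the Writer would be backed by the HTTP response, so retained memory stays bounded by the current step and any output buffering, regardless of how many steps the block executes.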

The pipeline architecture (PipelineBuilder / ExecuteTransactionStep / DebugTraceTransactionStepFactory) is removed in favor of direct sequential iteration with inline JSON serialization.

Error handling

  • If the tracing lambda fails mid-stream (connection reset, worldstate unavailable), the response is abandoned cleanly instead of writing broken JSON
  • JsonRpcObjectExecutor handles generator.close() failures on connection reset to prevent Self-suppression not permitted exceptions
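
For context on the second bullet: Throwable.addSuppressed rejects an exception being added to itself with "Self-suppression not permitted", which can happen when try-with-resources sees close() throw the very same exception instance that already escaped the body. The sketch below is an illustration under assumptions, not the code in JsonRpcObjectExecutor. BrokenGenerator is a hypothetical stand-in for a generator whose connection has been reset.

```java
import java.io.Closeable;
import java.io.IOException;

class CloseGuardSketch {
  /** Hypothetical generator: write() and close() both fail with the same instance. */
  static final class BrokenGenerator implements Closeable {
    private final IOException failure = new IOException("Connection reset");

    void write() throws IOException {
      throw failure;
    }

    @Override
    public void close() throws IOException {
      throw failure;
    }
  }

  /** try-with-resources: the close failure is added as suppressed to the primary one. */
  static String naive() {
    try (BrokenGenerator g = new BrokenGenerator()) {
      g.write();
      return "ok";
    } catch (IllegalArgumentException e) {
      return "illegal argument: " + e.getMessage();
    } catch (IOException e) {
      return "io: " + e.getMessage();
    }
  }

  /** Close in a finally block and swallow the close failure on its own. */
  static String guarded() {
    BrokenGenerator g = new BrokenGenerator();
    try {
      g.write();
      return "ok";
    } catch (IOException primary) {
      return "io: " + primary.getMessage();
    } finally {
      try {
        g.close();
      } catch (IOException ignored) {
        // The connection is already gone; the primary failure has been reported.
      }
    }
  }

  public static void main(String[] args) {
    System.out.println(naive());
    System.out.println(guarded());
  }
}
```

The guarded variant reports only the original I/O failure, which is the behavior the bullet describes.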

Changes

| File | Description |
| --- | --- |
| StreamingDebugOperationTracer.java | New: inline streaming tracer with lazy memory capture |
| StreamingJsonRpcSuccessResponse.java | New: JsonRpcResponse that streams result via JsonGenerator |
| AbstractDebugTraceBlock.java | Removed pipeline, new getStreamingTraces() with mid-stream error handling |
| DebugTraceBlockByNumber.java | Simplified to direct iteration with DebugOperationTracer (per-tx reset) |
| JsonRpcObjectExecutor.java | Handle streaming response type + fix generator close error path |
| StructLogWithError.java | Constructor visibility: package-private → public |

Verification

  • Tested on live bal-devnet-2: 210K-567K+ struct logs per block, zero OOM, JVM heap stable at 1.76/8 GiB
  • Field-by-field output matches geth (pc, op, gas, gasCost, depth, memory, storage)
  • 93-94% of struct logs omit memory (lazy capture working)

🤖 Generated with Claude Code

qu0b and others added 2 commits February 18, 2026 17:17
debug_traceBlockByHash/ByNumber/Block accumulated all transaction
traces in a synchronized ArrayList before serializing the entire
response. A single block trace produces 350-600 MB of JSON (with
disableMemory=false), causing OOM under concurrent requests on nodes
targeted by tracing services like tracoor.

Introduce StreamingJsonRpcSuccessResponse which writes each transaction
trace directly to the HTTP response via JsonGenerator as it is produced.
Peak memory drops from O(all_traces_in_block) to O(single_trace).
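
A rough sketch of the response shape this implies, under stated assumptions: the response carries a callback that writes the result, not a fully built result object. StreamingResponseSketch, ResultWriter, and the txHash/result entry layout are hypothetical names used only for illustration, and a plain Writer stands in for JsonGenerator.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a response that streams its result when serialized. */
class StreamingResponseSketch {
  /** Assumed callback type: writes the "result" value straight to the output. */
  interface ResultWriter {
    void writeTo(Writer out) throws IOException;
  }

  private final int id;
  private final ResultWriter result;

  StreamingResponseSketch(int id, ResultWriter result) {
    this.id = id;
    this.result = result;
  }

  /** Writes the envelope, then lets the callback produce the result in place. */
  void serialize(Writer out) throws IOException {
    out.write("{\"jsonrpc\":\"2.0\",\"id\":" + id + ",\"result\":");
    result.writeTo(out);
    out.write("}");
  }

  /** Streams one entry per made-up transaction hash; only one is built at a time. */
  static String demo() {
    List<String> txHashes = Arrays.asList("0xaa", "0xbb");
    ResultWriter perTransaction = out -> {
      out.write("[");
      for (int i = 0; i < txHashes.size(); i++) {
        if (i > 0) {
          out.write(",");
        }
        out.write("{\"txHash\":\"" + txHashes.get(i) + "\",\"result\":{}}");
      }
      out.write("]");
    };
    StringWriter sink = new StringWriter();
    try {
      new StreamingResponseSketch(1, perTransaction).serialize(sink);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sink.toString();
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Because the callback runs during serialization, at most one transaction trace needs to exist at a time, which is where the O(single_trace) bound comes from.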

Tested on bal-devnet-2: 26+ blocks traced continuously under 12 GiB
memory limit, 12+ GB total data streamed, memory stable at 2-4 GiB.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Instead of accumulating all TraceFrames in memory per transaction and
then converting them to StructLog objects, write each StructLog directly
to the JsonGenerator as it is produced during execution. After each
transaction's frames are written, reset the tracer to release memory.

This reduces peak memory from O(all_frames_in_block) to O(one_tx_frames),
preventing OOM when tracoor sends concurrent debug_traceBlockByHash
requests with memory capture enabled.

Stress tested with 10 concurrent block traces: peak 4.1 GiB vs 7.6 GiB
before, with full recovery after GC.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@qu0b qu0b changed the title from "Stream debug_traceBlock responses to prevent OOM" to "Stream debug_traceBlock struct logs frame-by-frame to prevent OOM" Feb 19, 2026
qu0b added a commit to ethpandaops/bal-devnets that referenced this pull request Feb 19, 2026
Limit `--rpc-http-max-active-connections` to 20 (default 80) and
increase `--Xhttp-timeout-seconds` to 600 (default 30) on all besu
nodes.

tracoor floods 250+ concurrent `debug_traceBlockByHash` requests with
full memory/storage tracing enabled. Even with the streaming fix
(PR hyperledger/besu#9844), the default 80-connection limit allows
enough concurrent traces to exhaust JVM heap. Lowering to 20
prevents OOM while still serving normal RPC traffic. The 600s timeout
prevents heavy block traces from being killed mid-flight.

Already deployed manually to both live besu nodes — stable with zero
OOM under continuous tracoor load.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
qu0b and others added 2 commits February 19, 2026 04:33
Replace frame-by-frame iteration with true inline streaming: StructLog
entries are written directly to the JsonGenerator during EVM execution
via StreamingDebugOperationTracer, achieving O(1) frame memory instead
of O(all_frames_per_tx).

Additionally implements lazy memory capture: EVM memory is only copied
for the 24 opcodes that actually read/write memory (MLOAD, MSTORE,
CALL, CREATE, LOG*, etc). All other opcodes (~90%) omit the memory
field entirely, reducing per-frame overhead from ~256KB to ~1-2KB.
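
The lazy capture decision could look roughly like the sketch below. It is an illustration only: the set contains just the opcodes named above (with LOG* expanded to LOG0 through LOG4), not the full list used by the tracer, and LazyMemorySketch and captureMemory are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

class LazyMemorySketch {
  // Partial, illustrative set: only the opcodes named in the commit message.
  static final Set<String> TOUCHES_MEMORY = new HashSet<>(Arrays.asList(
      "MLOAD", "MSTORE", "CALL", "CREATE",
      "LOG0", "LOG1", "LOG2", "LOG3", "LOG4"));

  /** Copies memory only for opcodes in the set; otherwise the field is omitted. */
  static Optional<byte[]> captureMemory(String opcode, byte[] liveMemory) {
    if (!TOUCHES_MEMORY.contains(opcode)) {
      return Optional.empty();  // no copy, and no memory field in the struct log
    }
    return Optional.of(liveMemory.clone());  // snapshot, since live memory keeps changing
  }

  public static void main(String[] args) {
    byte[] memory = new byte[64];
    System.out.println(captureMemory("ADD", memory).isPresent());
    System.out.println(captureMemory("MSTORE", memory).isPresent());
  }
}
```

Skipping the copy for most opcodes is what shrinks the per-frame overhead, since the copy of EVM memory dominates the frame size.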

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

When debug_traceBlockByHash fails mid-stream (connection reset or
worldstate unavailable), the JSON generator may be in an unknown
nested context. Previously this caused JsonGenerationException and
Self-suppression errors.

- Move writeStartArray before processTracing so worldstate-not-found
  produces valid empty array [] instead of crashing
- Track whether the tracing lambda ran to detect mid-stream failures
- Swallow generator.close() IOException in handleJsonObjectResponse
  to prevent Self-suppression not permitted errors
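
The first two points can be sketched as follows. This is a simplified model under assumptions, not the code in AbstractDebugTraceBlock: processTracing here is a hypothetical helper that reports "world state missing" and "lambda threw" the same way, which is why the caller needs its own flag to tell them apart. A StringBuilder stands in for the JSON generator.

```java
class MidStreamFailureSketch {
  /** Assumed callback type for the per-block tracing work. */
  interface Tracing {
    void run(StringBuilder out) throws Exception;
  }

  /** Hypothetical helper: returns false both when no world state exists and when tracing fails. */
  static boolean processTracing(boolean worldStateAvailable, Tracing tracing, StringBuilder out) {
    if (!worldStateAvailable) {
      return false;  // the lambda is never invoked
    }
    try {
      tracing.run(out);
      return true;
    } catch (Exception e) {
      return false;  // the lambda started and may have written partial output
    }
  }

  /** Returns the JSON text, or null when the response has to be abandoned. */
  static String stream(boolean worldStateAvailable, Tracing tracing) {
    StringBuilder out = new StringBuilder("[");  // array opened before any tracing work
    boolean[] lambdaRan = {false};
    boolean ok = processTracing(worldStateAvailable, o -> {
      lambdaRan[0] = true;
      tracing.run(o);
    }, out);
    if (!ok && lambdaRan[0]) {
      return null;  // mid-stream failure: do not try to finish the JSON
    }
    return out.append("]").toString();  // success, or a valid empty array
  }

  public static void main(String[] args) {
    System.out.println(stream(false, o -> o.append("{}")));
    System.out.println(stream(true, o -> o.append("{}")));
  }
}
```

Opening the array first means the "world state not found" path can still close it and return a well-formed empty array, while the flag keeps a half-written entry from being wrapped up as if it were complete.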

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@qu0b qu0b changed the title from "Stream debug_traceBlock struct logs frame-by-frame to prevent OOM" to "Streaming debug_traceBlock with inline tracer to prevent OOM" Feb 19, 2026