Backport: fix auto deletion of wal files by purusang · Pull Request #1454 · alpenlabs/alpen

purusang · 2026-03-03T13:11:36Z

Description

This is a backported fix that has already been tested in staging.

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature/Enhancement (non-breaking change which adds functionality or enhances an existing one)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Refactor
New or updated tests
Dependency Update

Notes to Reviewers

Is this PR addressing any specification, design doc or external reference document?

Yes
No

If yes, please add relevant links:

Checklist

I have performed a self-review of my code.
I have commented my code where necessary.
I have updated the documentation if needed.
My changes do not introduce new warnings.
I have added (where necessary) tests that prove my changes are effective or that my feature works.
New and existing tests pass with my changes.
I have disclosed my use of AI in the body of this PR.

Related Issues

codecov · 2026-03-03T13:34:37Z

Codecov Report

❌ Patch coverage is 0% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 65.34%. Comparing base (e160667) to head (977333a).
⚠️ Report is 5 commits behind head on main.

Files with missing lines	Patch %	Lines
crates/reth/exex/src/prover_exex.rs	0.00%	3 Missing ⚠️
crates/eectl/src/worker.rs	0.00%	1 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (e160667) and HEAD (977333a).

HEAD has 1 upload less than BASE

Flag	BASE (e160667)	HEAD (977333a)
functional	1	0

@@             Coverage Diff             @@
##             main    #1454       +/-   ##
===========================================
- Coverage   75.58%   65.34%   -10.24%     
===========================================
  Files         802      800        -2     
  Lines       75676    76102      +426     
===========================================
- Hits        57200    49730     -7470     
- Misses      18476    26372     +7896

Flag	Coverage Δ
functional	`?`
unit	`65.34% <0.00%> (-0.31%)`	⬇️

Files with missing lines	Coverage Δ
crates/eectl/src/worker.rs	`13.72% <0.00%> (ø)`
crates/reth/exex/src/prover_exex.rs	`0.00% <0.00%> (ø)`

... and 256 files with indirect coverage changes

github-actions · 2026-03-03T13:41:18Z

Commit: ada79b1

SP1 Execution Results

program	cycles	success
EVM EE STF	1,321,280	✅
Checkpoint	5,226	✅
Checkpoint New	883,623	✅

storopoli · 2026-03-03T14:00:09Z

Note to self: we need to implement wait_for_genesis here, as was already done in releases/0.2.0.

storopoli

ACK 489fd98

barakshani · 2026-03-09T15:19:52Z

new test_exex_wal_pruning constantly fails. Does it pass locally?

storopoli · 2026-03-09T16:12:14Z

> new test_exex_wal_pruning constantly fails. Does it pass locally?

Yes it does. That's why I'm stalled in this PR :/

barakshani · 2026-03-09T16:32:14Z

Something is a bit odd in the tests (namely, the fact that checkpoints become final). Closing and reopening to make sure CI uses the correct code.

barakshani

I believe this PR mixes two notions of finalisation (and actually admits it in one of the comments):

reth has its own notion of block finalisation, which I believe is 100 blocks of confirmation.
OL has a different notion of (bitcoin) finalisation.

I believe that the ExEx kicks in when the block is final (ExecCommand::NewFinalizedTip) according to reth. This PR instead uses checkpoint (OL) finalisation as the triggering point in time. I may be wrong, but this needs further investigation. In any case, the test constantly fails right now.

I also left a comment about the correctness of our method for checking whether files were pruned.

barakshani · 2026-03-09T16:17:56Z

functional-tests-new/tests/alpen_client/test_exex_wal_pruning.py

+
+        # Mine L1 blocks while polling Strata status until epoch 1 is confirmed.
+        #
+        # In functional-tests-new `el_ol`, epoch finalization may not always be available


This, ideally, would be fixed in https://alpenlabs.atlassian.net/browse/STR-2424.

barakshani · 2026-03-09T16:44:05Z

functional-tests-new/tests/alpen_client/test_exex_wal_pruning.py

+        #
+        # In functional-tests-new `el_ol`, epoch finalization may not always be available
+        # (for example when no proving/finalization pipeline is wired in the environment),
+        # but confirmation still advances with L1 progress and is sufficient to exercise


If so, we should not check for finalization, because the test may then fail for reasons unrelated to WAL pruning.

barakshani · 2026-03-11T06:31:31Z

PR #1479 adds seal config to el_ol, which is used by the test in the current PR. How is it that here we are able to get checkpoints final, and in particular sealed?

purusang · 2026-03-11T11:04:07Z

functional-tests-new/tests/alpen_client/test_exex_wal_pruning.py

+            status_after_finalization = wait_until_with_value(
+                lambda: _mine_and_get_sync_status(strata_seq, btc_rpc, mine_address),
+                lambda status: status["finalized"] is not None
+                and status["finalized"]["epoch"] >= 1,
+                error_with="Epoch 1 was not finalized in time",
+                timeout=120,
+                step=2,
+            )


@storopoli To prune WAL files, the ideal requirement would be that the proofs have been generated and stored in the DB (or perhaps only that the witness is stored in the DB 🤔). In releases/0.2.0 I waited for OL checkpoint finalization because that was the safest signal for cleaning up WAL files. In fast mode an epoch could be finalized within seconds there, so we simply waited for a finalized checkpoint and then checked whether reth had cleaned up the WAL files.

In the main branch, I found that OL checkpoints never get finalized (which @bewakes confirmed as well), so this condition never passes. I think this whole functional test can therefore be simplified to just two waits:

First, we wait until we have some number n of WAL files.

After that, we wait until some of those preexisting WAL files get deleted; we don't need to wait for all of them to be cleaned up.

In this test we do not care whether the OL checkpoint finalizes. We just want to make sure that the WAL files are cleared by reth after some time.

See c92fffe
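
The two waits proposed above could be sketched roughly as follows. This is only an illustration, not the code from the commit: `wait_until` here is a local stand-in written for this example rather than the repository's helper, `wait_for_wal_pruning` and `list_wal_files` are hypothetical names, and the `*.wal` file pattern is an assumption about how the WAL directory is laid out.

```python
import time
from pathlib import Path
from typing import Callable, Set


def wait_until(cond: Callable[[], bool], timeout: float = 30.0,
               step: float = 0.5, error_with: str = "timed out") -> None:
    """Poll `cond` until it returns True, or fail after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        if cond():
            return
        if time.monotonic() >= deadline:
            raise AssertionError(error_with)
        time.sleep(step)


def list_wal_files(wal_dir: Path) -> Set[str]:
    """Names of the WAL files currently on disk (assumed to match `*.wal`)."""
    return {p.name for p in wal_dir.glob("*.wal")}


def wait_for_wal_pruning(wal_dir: Path, min_files: int = 3,
                         timeout: float = 120.0, step: float = 2.0) -> Set[str]:
    """Return the preexisting WAL files that were observed to disappear."""
    state = {"before": set(), "pruned": set()}

    def enough_files() -> bool:
        # Wait 1: remember the snapshot that first reaches `min_files`.
        state["before"] = list_wal_files(wal_dir)
        return len(state["before"]) >= min_files

    def some_pruned() -> bool:
        # Wait 2: only files from the snapshot count; newly created WAL
        # files are ignored, so ongoing block production does not matter.
        state["pruned"] = state["before"] - list_wal_files(wal_dir)
        return bool(state["pruned"])

    wait_until(enough_files, timeout, step, "not enough WAL files were created")
    wait_until(some_pruned, timeout, step, "no preexisting WAL file was pruned")
    return state["pruned"]
```

Comparing against a snapshot of file names, rather than a file count, keeps the check meaningful while new WAL files are still being written.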

storopoli · 2026-03-18T14:53:07Z

@sapinb do you think this fix is relevant for the Reth issues we experienced in testnet I?

- Fix ProverWitnessGenerator emitting FinishedHeight with block number 0 (from outcome.first_block()) instead of the real block number, which prevented WAL files from ever being pruned
- Add functional test verifying WAL files are pruned after epoch finalization
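
Why a FinishedHeight stuck at block 0 keeps WAL files around can be shown with a toy model. This is not reth's pruning logic: the rule that an entry becomes prunable once the reported height reaches its block number, and the names `prunable_blocks`, `stuck`, and `freed`, are assumptions made purely for illustration.

```python
from typing import List


def prunable_blocks(wal_blocks: List[int], finished_height: int) -> List[int]:
    """Toy rule (assumed): a WAL entry may be dropped once the consumer
    has reported a finished height at or above the entry's block number."""
    return [b for b in wal_blocks if b <= finished_height]


# Block numbers of the entries currently held in the (imaginary) WAL.
wal_blocks = [1, 2, 3, 4, 5]

# Buggy behaviour: the reported height is always 0, so nothing qualifies.
stuck = prunable_blocks(wal_blocks, finished_height=0)

# Fixed behaviour: the real number of the last processed block is reported.
freed = prunable_blocks(wal_blocks, finished_height=4)
```

Under this simplified rule, `stuck` stays empty no matter how many blocks are processed, which matches the symptom of WAL files never being pruned.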

delbonis

The wait calls for WAL files can stay as-is; they are unusual and specific, so they don't make sense in a general-purpose wait helper.

functional-tests/tests/el_exex_wal_pruning.py

The el_ol environment does not reliably expose checkpoint finalization, so the functional test now only checks that preexisting WAL files eventually disappear. This keeps the pruning assertion stable even while new WAL files are created during block production.

storopoli

ACK 977333a

storopoli · 2026-03-20T17:09:33Z

@purusang whenever you have time please take a look.

purusang self-assigned this Mar 3, 2026

storopoli self-assigned this Mar 3, 2026

storopoli force-pushed the backport/wal-pruning-fix branch 3 times, most recently from fcc620b to 489fd98 Compare March 6, 2026 16:18

storopoli marked this pull request as ready for review March 6, 2026 16:32

storopoli requested review from a team as code owners March 6, 2026 16:32

storopoli previously approved these changes Mar 6, 2026

delbonis previously approved these changes Mar 6, 2026

storopoli added this pull request to the merge queue Mar 9, 2026

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 9, 2026

alexhui01 added this pull request to the merge queue Mar 9, 2026

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 9, 2026

storopoli added this pull request to the merge queue Mar 9, 2026

github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 9, 2026

storopoli dismissed stale reviews from delbonis and themself via f97ebd5 March 9, 2026 12:47

storopoli force-pushed the backport/wal-pruning-fix branch from f97ebd5 to 489fd98 Compare March 9, 2026 13:48

barakshani closed this Mar 9, 2026

barakshani reopened this Mar 9, 2026

barakshani requested changes Mar 9, 2026

purusang commented Mar 11, 2026

purusang and others added 5 commits March 18, 2026 11:53

refactor: use wait_until instead of sleep

7c37a04

test(functional-tests): fix el_exex_wal pruning

3898da4

test(functional-tests-new): exex WAL pruning

d463f1c

fix(eectl): send finalized tip updates to reth

6f3f775

storopoli force-pushed the backport/wal-pruning-fix branch from 489fd98 to c92fffe Compare March 18, 2026 17:33

storopoli requested review from barakshani, delbonis and storopoli March 18, 2026 17:33

delbonis requested changes Mar 19, 2026

storopoli force-pushed the backport/wal-pruning-fix branch from c92fffe to 977333a Compare March 20, 2026 16:39

storopoli requested a review from delbonis March 20, 2026 16:39

delbonis approved these changes Mar 20, 2026

storopoli approved these changes Mar 20, 2026
