STR 1935 Do startup checks in the new strata binary#1515
STR 1935 Do startup checks in the new strata binary#1515krsnapaudel wants to merge 4 commits intomainfrom
Conversation
|
@delbonis had to rebase. |
|
Commit: 3461523 SP1 Execution Results
|
Codecov Report❌ Patch coverage is
@@ Coverage Diff @@
## main #1515 +/- ##
===========================================
- Coverage 84.34% 65.61% -18.73%
===========================================
Files 802 801 -1
Lines 75697 76559 +862
===========================================
- Hits 63846 50234 -13612
- Misses 11851 26325 +14474
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 401 files with indirect coverage changes 🚀 New features to boost your workflow:
|
bewakes
left a comment
There was a problem hiding this comment.
nit: the _on_restart suffix on almost every function feels unnecessary and indicates that the function is making assumptions about what context it is being called from (during restart) which is not necessary.
| fn resolve_tip_ol_block(storage: &NodeStorage) -> Result<OLBlockCommitment> { | ||
| storage | ||
| .ol_block() | ||
| .get_canonical_tip_blocking() |
There was a problem hiding this comment.
I think this is validating the wrong thing. get_canonical_tip_blocking() is still just "the first block at the highest slot", so on a forked tip this can pick an Unchecked/Invalid block and fail startup even when a valid tip exists. The storage helper currently does exactly that here:
alpen/crates/storage/src/managers/ol.rs
Lines 126 to 149 in 19b8d1f
Can we either resolve a BlockStatus::Valid tip here, or avoid treating this helper as canonical until canonical selection is implemented?
| ctx.config().bitcoind.network, | ||
| ))?; | ||
|
|
||
| let has_persisted_client_state = ctx |
There was a problem hiding this comment.
This persisted-state gate looks too narrow. init_ol_genesis() writes OL data before put_update_blocking() writes client state, so a crash between those two calls leaves OL state on disk but makes has_persisted_client_state false on restart. In that case these checks get skipped and the node goes back through genesis init, which seems likely to fail later with duplicate inserts instead of producing the intended startup-consistency error.
The write order is here:
alpen/bin/strata/src/context.rs
Lines 288 to 299 in 19b8d1f
Should this be keyed off the presence of persisted OL data as well, or should those initialization writes be made atomic?
Description
get_blockchain_info()).config.bitcoind.network.OLParams.last_l1_blockhash from bitcoind.This PR was created with help from Claude Code and Codex.
Type of Change
Notes to Reviewers
Not sure if we need all checks. Please suggest how we can test these better.
Is this PR addressing any specification, design doc or external reference document?
If yes, please add relevant links:
Checklist
Related Issues
STR-1935