
1877 da by re execution #1504

Draft

bewakes wants to merge 5 commits into main from 1877-da-by-re-execution

Conversation

@bewakes
Contributor

@bewakes bewakes commented Mar 16, 2026

Description

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature/Enhancement (non-breaking change which adds functionality or enhances an existing one)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactor
  • New or updated tests
  • Dependency Update

Notes to Reviewers

Is this PR addressing any specification, design doc or external reference document?

  • Yes
  • No

If yes, please add relevant links:

Checklist

  • I have performed a self-review of my code.
  • I have commented my code where necessary.
  • I have updated the documentation if needed.
  • My changes do not introduce new warnings.
  • I have added (where necessary) tests that prove my changes are effective or that my feature works.
  • New and existing tests pass with my changes.
  • I have disclosed my use of AI in the body of this PR.

Related Issues

@github-actions
Contributor

github-actions bot commented Mar 16, 2026

Commit: 6216848

SP1 Execution Results

| program        | cycles    | success |
| -------------- | --------- | ------- |
| EVM EE STF     | 1,320,889 |         |
| Checkpoint     | 5,226     |         |
| Checkpoint New | 883,623   |         |

@bewakes force-pushed the 1877-da-by-re-execution branch from 66d124e to 3125c4c on March 19, 2026 11:03
Comment on lines +150 to +158
    accumulated_da: AccumulatedDaData,
}

impl BlockTemplateResult {
    /// Create a new block template result.
    pub(crate) fn new(
        template: FullBlockTemplate,
        failed_txs: Vec<FailedMempoolTx>,
        accumulated_da: AccumulatedDaData,
Contributor
This doesn't make sense to include here, it's like it's threading it through block assembly the whole way.

Contributor Author

I think it does, but I will still look more into it.

Comment on lines +51 to +56
#[derive(Clone, Debug)]
pub(crate) struct AccumulatedDaData {
    epoch: Epoch,
    accumulator: EpochDaAccumulator,
    logs: Vec<OLLog>,
}
Contributor

This type hierarchy isn't well thought out. Between AccumulatedDaData and EpochDaAccumulator, it's not clear what each one represents.

Contributor Author

I can see how the names can be confusing. I think it's because of what "da" represents: in EpochDaAccumulator it is the state diff, while in AccumulatedDaData it is both the state diff and the logs.
I'm interested to hear what exactly about the hierarchy could be made better.
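To illustrate the distinction described above, here is a minimal sketch using stand-in types (field names and the `total_da_size` helper are hypothetical, not the real API): the accumulator covers only the state diff, while the outer type bundles it with the logs.

```rust
/// Stand-in for the epoch-level accumulator: state diff only.
#[derive(Clone, Debug, Default)]
pub struct EpochDaAccumulator {
    pub diff_bytes: usize,
}

/// Stand-in for an emitted log entry.
#[derive(Clone, Debug)]
pub struct OLLog(pub Vec<u8>);

/// Bundles the state-diff accumulator together with the epoch's logs,
/// i.e. the full DA surface rather than the diff alone.
#[derive(Clone, Debug, Default)]
pub struct AccumulatedDaData {
    pub accumulator: EpochDaAccumulator,
    pub logs: Vec<OLLog>,
}

impl AccumulatedDaData {
    /// Total DA footprint: state diff plus all log payloads.
    pub fn total_da_size(&self) -> usize {
        self.accumulator.diff_bytes + self.logs.iter().map(|l| l.0.len()).sum::<usize>()
    }
}
```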

Comment on lines +37 to +48
/// Inserts the entry for the given block id and removes the parent's entry if it
/// exists. This optimizes memory usage: the next assembly needs the accumulation
/// up to the current block, not the parent block.
pub(crate) fn set_accumulated_da_and_remove_parent_entry(
    &mut self,
    blkid: OLBlockId,
    parent: OLBlockId,
    da: AccumulatedDaData,
) {
    self.set_accumulated_da(blkid, da);
    self.block_da_map.remove(&parent);
}
Contributor

What's the point of having the map then? If the intention is that in normal circumstances we should always be consuming the old state and creating a new state, then we should explicitly track "the old state" and directly replace it with the new state. The way this is written suggests that it could get stuck in a dirty intermediate state with dangling refs that only get cleaned up when the epoch finishes.

Contributor Author

@bewakes Mar 20, 2026

The point of having a map is that block assembly can't expect template generation requests to come from a single canonical chain; it might be asked to build blocks on other forks as well. The map keeps track of the accumulated DA of the different tips. When a block is built on top of a tip, that tip's entry is removed and replaced with the new proposed tip.
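A minimal sketch of the per-tip behavior being described, with stand-in types (the `u64` block id and byte-count DA are illustrative, not the real representations):

```rust
use std::collections::HashMap;

/// Stand-in block id.
type OLBlockId = u64;

/// Stand-in for the accumulated DA attached to a tip.
#[derive(Clone, Debug, PartialEq)]
struct AccumulatedDaData(usize);

/// Tracks accumulated DA per fork tip, not per canonical chain:
/// template requests may arrive for several competing tips at once.
#[derive(Default)]
struct EpochDaTracker {
    block_da_map: HashMap<OLBlockId, AccumulatedDaData>,
}

impl EpochDaTracker {
    /// Record the new proposed tip's DA and drop its parent, which is
    /// no longer a tip once a child has been built on it.
    fn set_accumulated_da_and_remove_parent_entry(
        &mut self,
        blkid: OLBlockId,
        parent: OLBlockId,
        da: AccumulatedDaData,
    ) {
        self.block_da_map.insert(blkid, da);
        self.block_da_map.remove(&parent);
    }
}
```

Note that entries for other forks' tips are untouched when one tip is replaced; only the parent of the newly built block is removed.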

Contributor Author

> The way this is written suggests that it could get stuck in a dirty intermediate state with dangling refs that only get cleaned up when the epoch finishes.

We can just expose this particular method and not the other one which doesn't remove the parent.

Comment on lines +514 to +517
/// Takes the epoch accumulator, leaving a default in its place.
pub fn take_accumulator(&mut self) -> EpochDaAccumulator {
    take(&mut self.epoch_acc)
}
Contributor

This is a C++-ism. If the point of the type is to be about epoch DA, then it should be consumed at the end of the epoch and a new one created at the beginning of the next one.
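One way to express the consume-at-epoch-end shape the reviewer suggests is to hold the accumulator in an `Option`, so "no epoch in progress" is an explicit state rather than a silently empty default left behind by `mem::take`. This is an illustrative sketch with stand-in names (`EpochDaState`, `start_epoch`, `seal_epoch` are hypothetical):

```rust
/// Stand-in epoch accumulator.
#[derive(Debug, Default, PartialEq)]
struct EpochDaAccumulator {
    diff_bytes: usize,
}

/// Holds the current epoch's accumulator. Using Option makes
/// "no epoch in progress" explicit instead of leaving a default behind.
struct EpochDaState {
    epoch_acc: Option<EpochDaAccumulator>,
}

impl EpochDaState {
    /// Begin a new epoch with a fresh accumulator.
    fn start_epoch(&mut self) {
        self.epoch_acc = Some(EpochDaAccumulator::default());
    }

    /// Consume the accumulator at epoch end. Panics if no epoch is in
    /// progress, instead of quietly yielding an empty default.
    fn seal_epoch(&mut self) -> EpochDaAccumulator {
        self.epoch_acc.take().expect("no epoch in progress")
    }
}
```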

Comment on lines +360 to +365
/// Executes a batch of blocks sequentially, verifying each produced header
/// matches the input block's header.
///
/// Generic over `S: IStateAccessor` so callers can pass `OLState` directly
/// or wrap it (e.g. `DaAccumulatingState<OLState>`) to intercept mutations.
pub fn execute_block_batch<S: IStateAccessor>(
Contributor

What is the purpose of this function?

It looks like it verifies that the blocks were constructed correctly by re-executing them as if we were constructing them for the first time and then checking that each re-constructed header matches the header we were given. There is already a verify_block function that does block verification the expected way.

Member

@storopoli left a comment

The design this PR uses is combination 4 from the Notion page, right? Accumulate in block assembly, persist by block, and enforce DA limits.

            failed_txs.push((txid, stf_exec_error_to_mempool_reason(&e)));
        }
    }
    // TODO: Check for DA limits here and somehow signal caller to seal epoch.
Member

This is still the core missing behavior for design combination 4. We accumulate DA here, but block sealing is still driven only by should_seal_epoch(block_slot), and this function never returns a signal when the accumulated blob crosses the DA budget. As written, block assembly can still assemble an epoch whose checkpoint sidecar is too large, so the PR does not yet enforce the DA cap that STR-1877 and the design doc require.
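The missing signal could take the shape of a return value from the accumulation step that tells the caller to seal. A minimal sketch under the assumption that DA can be compared as byte counts (the enum and function names are hypothetical, not from the PR):

```rust
/// Signal from block assembly to its caller about the epoch DA budget.
#[derive(Debug, PartialEq)]
enum DaBudgetSignal {
    /// Accumulated DA is still within budget; keep assembling.
    WithinBudget,
    /// Adding more would cross the budget; caller should seal the epoch.
    SealEpoch,
}

/// Check whether adding the next block's DA would cross the epoch budget.
/// Sizes here are illustrative byte counts.
fn check_da_budget(accumulated: usize, next_block_da: usize, budget: usize) -> DaBudgetSignal {
    if accumulated + next_block_da > budget {
        DaBudgetSignal::SealEpoch
    } else {
        DaBudgetSignal::WithinBudget
    }
}
```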

Some(da) => da.clone(),
None => {
    // TODO: rebuild by re-executing epoch blocks
    AccumulatedDaData::new_empty(0)
Member

This fallback makes the persisted-accumulation story incorrect. EpochDaTracker is only an in-memory map and is initialized empty on service startup:

impl<M: MempoolProvider, E: EpochSealingPolicy, S> BlockasmServiceState<M, E, S> {
    /// Create new block assembly service state.
    pub(crate) fn new(
        params: Arc<Params>,
        blockasm_config: Arc<BlockAssemblyConfig>,
        sequencer_config: SequencerConfig,
        ctx: Arc<BlockAssemblyContext<M, S>>,
        epoch_sealing_policy: E,
    ) -> Self {
        let ttl = Duration::from_secs(sequencer_config.block_template_ttl_secs);
        Self {
            params,
            blockasm_config,
            sequencer_config,
            ctx,
            epoch_sealing_policy,
            state: BlockAssemblyState::new(ttl),
            epoch_da_tracker: EpochDaTracker::new_empty(),
        }

On a tracker miss we silently restart from AccumulatedDaData::new_empty(0), and the parent entry is removed as soon as a template is generated:

let template_id = full_template.get_blockid();
// Store accumulated DA for the new block, removing parent entry.
state
    .epoch_da_tracker_mut()
    .set_accumulated_da_and_remove_parent_entry(template_id, parent_blkid, accumulated_da);

After a restart, or after regenerating from the same parent, DA accumulation can resume from an empty state instead of the actual parent epoch state.
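A sketch of the rebuild-on-miss fallback the TODO gestures at: on a tracker miss, replay the epoch's blocks up to the parent instead of restarting from empty. The per-block byte counts stand in for re-executing each block under a DA-accumulating state; the function name and types are hypothetical.

```rust
/// Stand-in accumulated DA: just a byte count.
#[derive(Clone, Debug, PartialEq)]
struct AccumulatedDaData {
    bytes: usize,
}

/// Resolve the parent's accumulated DA: use the cached entry when present,
/// otherwise rebuild it by replaying the epoch's blocks so far.
fn accumulated_da_for_parent(
    cached: Option<&AccumulatedDaData>,
    epoch_block_da: &[usize], // DA contribution of each block since epoch start
) -> AccumulatedDaData {
    match cached {
        Some(da) => da.clone(),
        // Tracker miss (e.g. after a restart): replay rather than reset to zero.
        None => AccumulatedDaData {
            bytes: epoch_block_da.iter().sum(),
        },
    }
}
```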

// We work directly on this state and only clone for backup before each tx.
// On success: backup is discarded. On failure: restore from backup.
let mut staging_state = WriteTrackingState::new(parent_state, accumulated_batch);
let mut staging_state = DaAccumulatingState::new_with_accumulator(
Member

Because execute_block_initialization() above still runs under WriteTrackingState, block assembly and checkpoint replay are accumulating different DA surfaces. Checkpoint generation replays the whole epoch under DaAccumulatingState

let mut da_state = DaAccumulatingState::new(ol_state);
let batch_outputs = execute_block_batch(&mut da_state, &epoch_blocks, &prev_terminal_header)
.map_err(|e| anyhow::anyhow!("epoch block replay failed: {e}"))?;

and EpochDaAccumulator explicitly records slot changes as DA-relevant

impl EpochDaAccumulator {
    /// Records a slot change event.
    fn record_slot_change(&mut self, prior: u64, new: u64) -> Result<(), DaAccumulationError> {
        let builder = self
            .slot_builder
            .get_or_insert_with(|| DaCounterBuilder::<CtrU64ByU16>::from_source(prior));
        builder.set(new)?;
        Ok(())
So any DA-size decision made from the block-assembly accumulator will undercount relative to the checkpoint blob, because it misses at least block-start and epoch-start writes.
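The invariant being violated can be demonstrated with a toy wrapper (all names here are illustrative stand-ins, not the PR's types): every write in the epoch, including block-start initialization writes, must pass through the same DA-accumulating wrapper, or the assembly-side accumulator undercounts relative to the checkpoint replay.

```rust
/// Minimal state-writer interface.
trait StateWriter {
    fn set_slot(&mut self, slot: u64);
}

/// Plain state that just stores the slot.
#[derive(Default)]
struct PlainState {
    slot: u64,
}

impl StateWriter for PlainState {
    fn set_slot(&mut self, slot: u64) {
        self.slot = slot;
    }
}

/// Wrapper that counts every write as DA-relevant before forwarding it.
struct DaAccumulatingState<S> {
    inner: S,
    da_writes: usize,
}

impl<S: StateWriter> StateWriter for DaAccumulatingState<S> {
    fn set_slot(&mut self, slot: u64) {
        self.da_writes += 1; // slot changes count toward DA
        self.inner.set_slot(slot);
    }
}

/// If initialization writes to the bare state while tx execution goes
/// through the wrapper, the wrapper undercounts the block's DA.
fn run_block_undercounting(state: &mut DaAccumulatingState<PlainState>, slot: u64) {
    // Initialization bypasses the wrapper: this write is not counted.
    state.inner.set_slot(slot);
    // Tx-phase write goes through the wrapper and is counted.
    state.set_slot(slot);
}
```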
