x/bor: add target producer id in producer planned downtime by kamuikatsurgi · Pull Request #567 · 0xPolygon/heimdall-v2

kamuikatsurgi · 2026-03-24T11:12:30Z

Description

Allows a producer declaring planned downtime to optionally nominate a specific replacement via target_producer_id on MsgSetProducerDowntime
If the target is invalid, inactive, or down for the range, selection falls through to round-robin - the feature is best-effort, not a guarantee

claude

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review.

_{Tip: disable this comment in your organization's Code Review settings.}

kamuikatsurgi · 2026-03-24T11:14:40Z

@claude review for the PR diff, and check for the broader consensus issues across the codebase.

claude · 2026-03-24T11:14:55Z

Claude finished @kamuikatsurgi's task in 1m 48s —— View job

Review: x/bor target producer id in producer planned downtime

Gather context: read PR diff and changed files
Review proto and message type changes
Review keeper and side message server logic
Review vote extension (veblop) changes
Review tests for coverage and correctness
Check broader consensus implications across the codebase
Post final review

Overall Assessment

Clean, well-structured change. The feature is properly height-gated via targetProducerOverrideHeight, the fallthrough-to-round-robin design is the right call for safety, and test coverage is thorough. A few items below ranging from consensus-safety observations to minor nits.

Consensus-Critical Observations

1. SelectNextSpanProducer does not reject self-targeting — veblop.go:270

The side handler (side_msg_server.go:289) and msg_server (msg_server.go:330) both reject targetProducerId == currentProducer upstream, but SelectNextSpanProducer itself does not. The test at veblop_test.go:627 explicitly documents this: "the side handler and msg_server reject self-targeting upstream."

This is fine as defense-in-depth at the calling layers, but worth noting that if SelectNextSpanProducer is ever called from a new code path that skips those checks, a producer could nominate itself. Consider adding a guard in SelectNextSpanProducer as well — it's a one-line check and would make the function self-contained.

Fix this →

2. Side handler votes VOTE_NO on CalculateProducerSet failure for target validation — side_msg_server.go:297-299

If CalculateProducerSet fails in the target validation block, the side handler returns VOTE_NO. This is a strict response — if the failure is transient (e.g., state iteration hiccup), it could cause honest validators to disagree on votes for an otherwise-valid downtime message, since the existing non-target validation path doesn't call CalculateProducerSet.

This is mitigated by the fact that this only fires when targetProducerId != 0, so old-style messages (target=0) are unaffected. But for target-bearing messages, consider whether a softer fallback (ignoring the target and voting YES anyway) would be safer for consensus agreement. Right now the design is: "if I can't verify the target, reject the whole message." This is reasonable but more aggressive than the execution path (which falls through to round-robin on any target failure).

3. Determinism is maintained — confirmed

The targetProducerID flows through deterministic paths only:

PostHandleSetProducerDowntime → AddNewVeBlopSpan → SelectNextSpanProducer (all in PreBlocker, deterministic)
No map iteration issues — the target check uses a linear scan of activeCandidates (a slice)
The height gate uses helper.GetTargetProducerOverrideHeight() which is chain-specific and deterministic

All good here.

Height Gate Values

4. All targetProducerOverrideHeight values are 0 (TBD) — helper/config.go:489,497,509

All chains (mainnet, mumbai, amoy) have targetProducerOverrideHeight = 0 // TBD. Since the height gate check is ctx.BlockHeight() >= 0, this means the feature is immediately active on all chains once deployed.

If the intent is to coordinate activation at a specific block height (as is typical for consensus-affecting changes), the TBD values should be set before merging. If immediate activation is intentional, the // TBD comments are misleading and should be removed.

Code Quality

5. Duplicated "is target in candidates" logic across three files

The same linear search pattern appears in:

msg_server.go:333-339
side_msg_server.go:304-310
veblop.go:272-278

Consider extracting a small helper like containsID(candidates []uint64, id uint64) bool to reduce duplication. Low priority — the logic is simple and correct in all three places.

6. Nit: blank line removal in config.go:97

The diff removes a blank line between the comment // Self-healing defaults and DefaultEnableSH. This is cosmetic but unrelated to the feature — could be a separate cleanup.

Test Coverage

Tests are comprehensive and well-structured:

msg_server_test.go: covers target=0 (round-robin), valid target, self-targeting rejection, and target-not-in-set rejection
veblop_test.go (TestSelectNextSpanProducerWithTarget): 8 sub-tests covering happy path, downtime overlap, inactive target, zero target, non-existent target, self-target passthrough, partial overlap, and fallthrough determinism
veblop_test.go (TestAddNewVeBlopSpanWithTarget, TestAddNewVeBlopSpanTargetFallthrough): integration tests for the full span creation flow

Missing test: SideHandleSetProducerDowntime with a non-zero targetProducerId. The existing side handler tests in side_msg_server_test.go all use newMsg(addr, start, end) which produces TargetProducerId: 0. Adding a test case with a valid target (and one with an invalid target) in the SideHandleSetProducerDowntime test suite would verify the new VOTE_NO paths at side_msg_server.go:281-315.

Fix this →

Summary

The change is well-designed with proper height-gating and graceful fallthrough. The main items to address before merge:

Priority	Item
High	Set actual `targetProducerOverrideHeight` values (or confirm 0 is intentional and remove TBD comments)
Medium	Add side handler test coverage for non-zero `targetProducerId`
Low	Consider self-target guard in `SelectNextSpanProducer` for defense-in-depth
Low	Extract `containsID` helper to reduce duplication

|

Copilot

Pull request overview

This PR extends the Bor planned-downtime flow to allow a producer to optionally request a specific replacement producer via a new target_producer_id field on MsgSetProducerDowntime, and threads that preference through span creation/producer selection (with fallthrough to the existing round-robin behavior when not eligible).

Changes:

Add target_producer_id to MsgSetProducerDowntime protobuf/API types and propagate it through constructors and CLI wiring.
Update SelectNextSpanProducer / AddNewVeBlopSpan to optionally honor a requested target producer (height-gated).
Add/extend keeper and type tests to cover target selection and fallthrough paths.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
`proto/heimdallv2/bor/tx.proto`	Adds `target_producer_id` to the downtime message schema.
`x/bor/types/tx.pb.go`	Regenerated protobuf Go types/codec for the new field.
`api/heimdallv2/bor/tx.pulsar.go`	Regenerated Pulsar fast-reflection code to include the new field.
`x/bor/types/msg.go`	Extends `NewMsgSetProducerDowntime` to accept and set `targetProducerID`.
`x/bor/types/msg_test.go`	Adds coverage for the new constructor parameter.
`x/bor/client/cli/flags.go`	Introduces the `--target-producer-id` CLI flag name constant.
`x/bor/client/cli/tx.go`	Reads the new CLI flag and populates `MsgSetProducerDowntime`.
`x/bor/keeper/veblop.go`	Threads `targetProducerID` into producer selection and tries to honor it when eligible.
`x/bor/keeper/veblop_test.go`	Adds selection tests for honoring/falling-through from a target producer.
`x/bor/keeper/side_msg_server.go`	Side-handler validation for target producer (height-gated) + passes target into span creation.
`x/bor/keeper/side_msg_server_test.go`	Adds tests around downtime validation boundaries and target-producer handling.
`x/bor/keeper/msg_server.go`	Adds msg-server validation for `TargetProducerId` (height-gated).
`x/bor/keeper/msg_server_test.go`	Adds msg-server tests for target producer accept/reject scenarios.
`helper/config.go`	Adds `targetProducerOverrideHeight` config knob and getter.
`app/abci.go`	Updates `AddNewVeBlopSpan` callsites for the new function signature (passes `0`).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

helper/config.go

x/bor/keeper/msg_server.go

codecov · 2026-03-24T11:17:17Z

Codecov Report

❌ Patch coverage is 57.89474% with 48 lines in your changes missing coverage. Please review.
✅ Project coverage is 41.8%. Comparing base (2166896) to head (b9b8614).

Files with missing lines	Patch %	Lines
app/abci.go	15.3%	9 Missing and 2 partials ⚠️
x/bor/keeper/keeper.go	10.0%	8 Missing and 1 partial ⚠️
x/bor/keeper/side_msg_server.go	70.9%	5 Missing and 4 partials ⚠️
app/app.go	0.0%	8 Missing ⚠️
helper/config.go	50.0%	4 Missing ⚠️
x/bor/keeper/veblop.go	85.7%	2 Missing and 2 partials ⚠️
x/bor/client/cli/tx.go	0.0%	3 Missing ⚠️

❌ Your patch check has failed because the patch coverage (57.8%) is below the target coverage (90.0%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff            @@
##           develop    #567     +/-   ##
=========================================
+ Coverage     41.6%   41.8%   +0.2%     
=========================================
  Files          172     172             
  Lines        17499   17601    +102     
=========================================
+ Hits          7296    7374     +78     
- Misses        9124    9148     +24     
  Partials      1079    1079

Files with missing lines	Coverage Δ
x/bor/keeper/msg_server.go	`81.8% <100.0%> (+1.1%)`	⬆️
x/bor/types/msg.go	`100.0% <100.0%> (ø)`
x/bor/types/params.go	`90.0% <ø> (ø)`
x/bor/types/util.go	`93.4% <ø> (ø)`
x/bor/client/cli/tx.go	`0.0% <0.0%> (ø)`
helper/config.go	`20.0% <50.0%> (+0.4%)`	⬆️
x/bor/keeper/veblop.go	`64.0% <85.7%> (+4.8%)`	⬆️
app/app.go	`56.0% <0.0%> (-0.7%)`	⬇️
x/bor/keeper/keeper.go	`65.2% <10.0%> (-1.6%)`	⬇️
x/bor/keeper/side_msg_server.go	`60.3% <70.9%> (+3.4%)`	⬆️
... and 1 more

... and 2 files with indirect coverage changes

Files with missing lines	Coverage Δ
x/bor/keeper/msg_server.go	`81.8% <100.0%> (+1.1%)`	⬆️
x/bor/types/msg.go	`100.0% <100.0%> (ø)`
x/bor/types/params.go	`90.0% <ø> (ø)`
x/bor/types/util.go	`93.4% <ø> (ø)`
x/bor/client/cli/tx.go	`0.0% <0.0%> (ø)`
helper/config.go	`20.0% <50.0%> (+0.4%)`	⬆️
x/bor/keeper/veblop.go	`64.0% <85.7%> (+4.8%)`	⬆️
app/app.go	`56.0% <0.0%> (-0.7%)`	⬇️
x/bor/keeper/keeper.go	`65.2% <10.0%> (-1.6%)`	⬇️
x/bor/keeper/side_msg_server.go	`60.3% <70.9%> (+3.4%)`	⬆️
... and 1 more

... and 2 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

claude · 2026-03-24T11:21:31Z

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

x/bor/keeper/msg_server.go

x/bor/keeper/veblop.go

x/bor/keeper/veblop_test.go

app/abci.go

x/bor/keeper/side_msg_server.go

kamuikatsurgi · 2026-03-25T16:52:58Z

@claude review

claude · 2026-03-25T16:53:17Z

Claude finished @kamuikatsurgi's task in 3m 36s —— View job

Review: x/bor target producer id in producer planned downtime

Read PR diff and changed files
Review core logic (keeper, side handler, veblop)
Review consensus safety and determinism
Check for remaining issues from prior reviews
Post final review

Overall Assessment

Well-structured change since the latest commit (17abc78). The height-gating, defense-in-depth self-target guard in SelectNextSpanProducer, RoundRobinDefault constant, side handler tests, and pre-fork rejection have all been properly addressed. A few items remain.

Issues

1. Redundant condition checks — msg_server.go:328-348, side_msg_server.go:281-327

Both files have two sequential if blocks that can be collapsed:

// Block 1: rejects if target != 0 AND height < fork
if msg.TargetProducerId != types.RoundRobinDefault && height < forkHeight {
    return error
}
// Block 2: validates if target != 0 AND height >= fork  
if msg.TargetProducerId != types.RoundRobinDefault && height >= forkHeight {
    // validation...
}

After block 1 returns on < forkHeight, block 2's >= forkHeight check is always true. This should be a single if with an early return:

if msg.TargetProducerId != types.RoundRobinDefault {
    if height < forkHeight {
        return error // or VOTE_NO
    }
    // validation...
}

Not a bug, but cleaner and avoids evaluating GetTargetProducerOverrideHeight() twice.

Fix this →

2. Validation strictness asymmetry between msg_server and side_msg_server

msg_server.go:334 validates the target against the full candidates list (from CalculateProducerSet), while side_msg_server.go:308 does the same. However, SelectNextSpanProducer at veblop.go:278 checks against activeCandidates — a subset filtered by active validators. This means a target could pass validation in both handlers but still fall through to round-robin at execution time because the target isn't in activeCandidates.

This is consistent with the "best effort" design, but it's worth noting that a user could submit a message that appears valid but whose target is silently ignored. Consider adding a comment near the candidates check in msg_server.go explaining this intentional gap:

// NOTE: we validate against the full candidate set here; at execution time
// SelectNextSpanProducer further filters by active validators and falls
// through to round-robin if the target is no longer in the active set.

3. Height gate values still TBD — helper/config.go:489,497,509

All targetProducerOverrideHeight values are 0 // TBD. With 0, the feature activates immediately on deployment (since blockHeight >= 0 is always true). This was flagged in the prior review and acknowledged — just confirming it's still outstanding and should be set before merge to avoid mixed-version network issues during rollout.

Looks Good

Determinism: Target selection flows through deterministic paths only — slice iteration (no maps), height-gated checks, and state reads in PreBlocker context.
Defense-in-depth: Self-target guard in SelectNextSpanProducer (veblop.go:272) prevents future callers from bypassing upstream checks.
Graceful fallthrough: Every failure mode in SelectNextSpanProducer falls through to round-robin rather than blocking span creation.
RoundRobinDefault constant: Clean self-documenting replacement for magic 0 at all callsites including abci.go.
Side handler test coverage: TestSideHandleSetProducerDowntimeTargetProducer covers zero target, valid target, self-targeting, not-in-set, pre-fork rejection, and pre-fork round-robin acceptance. TestSideHandleSetProducerDowntimeGetValIdError covers the error path.
Pre-fork rejection: Both msg_server and side_msg_server properly reject TargetProducerId != 0 before the fork height.
expected_keepers.go: GetValIdFromAddress is properly declared in the bor module's expected keepers interface.

Summary

Priority	Item	Status
High	Set actual `targetProducerOverrideHeight` values before merge	Pending (acknowledged)
Low	Collapse redundant condition checks in msg_server and side_msg_server	Suggested
Nit	Add comment about validation gap between handlers and execution	Suggested

View job

cffls · 2026-03-25T23:51:12Z

Looks good overall. Could you test a case where a cluster is half upgraded before the HF, a heimdall node with new version send out a rotation message request with target candidate, and make sure the consensus won't halt? Last time when the propose down time message was introduced, we had to add a check here.

Copilot

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

x/bor/keeper/side_msg_server.go

x/bor/keeper/veblop_test.go

helper/config.go

sonarqubecloud · 2026-03-27T07:59:35Z

Quality Gate passed

Issues
4 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

kamuikatsurgi · 2026-03-27T08:00:02Z

Looks good overall. Could you test a case where a cluster is half upgraded before the HF, a heimdall node with new version send out a rotation message request with target candidate, and make sure the consensus won't halt? Last time when the propose down time message was introduced, we had to add a check here.

Good catch, I added checks there as well similar to what we had for MsgSetProducerDowntime.

kamuikatsurgi added 2 commits March 23, 2026 14:16

x/bor: add target producer id in producer planned downtime

a275f3b

chore: nit

d99eecd

Copilot AI review requested due to automatic review settings March 24, 2026 11:12

claude bot reviewed Mar 24, 2026

View reviewed changes

Copilot started reviewing on behalf of kamuikatsurgi March 24, 2026 11:12 View session

Copilot AI reviewed Mar 24, 2026

View reviewed changes

helper/config.go Show resolved Hide resolved

x/bor/keeper/msg_server.go Outdated Show resolved Hide resolved

marcello33 reviewed Mar 24, 2026

View reviewed changes

chore: address comments and add tests

17abc78

marcello33 previously approved these changes Mar 25, 2026

View reviewed changes

Merge remote-tracking branch 'origin' into kamui/target-producer-id

ead1a4f

Copilot AI review requested due to automatic review settings March 27, 2026 07:39

Copilot started reviewing on behalf of kamuikatsurgi March 27, 2026 07:39 View session

Copilot AI reviewed Mar 27, 2026

View reviewed changes

x/bor/keeper/side_msg_server.go Outdated Show resolved Hide resolved

x/bor/keeper/veblop_test.go Show resolved Hide resolved

helper/config.go Show resolved Hide resolved

chore: address comment

b9b8614

kamuikatsurgi dismissed marcello33’s stale review via b9b8614 April 8, 2026 19:52

Conversation

kamuikatsurgi commented Mar 24, 2026

Description

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

Claude Code Review

Uh oh!

kamuikatsurgi commented Mar 24, 2026

Uh oh!

claude bot commented Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review: x/bor target producer id in producer planned downtime

Overall Assessment

Consensus-Critical Observations

Height Gate Values

Code Quality

Test Coverage

Summary

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

codecov bot commented Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

claude bot commented Mar 24, 2026

Code review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

kamuikatsurgi commented Mar 25, 2026

Uh oh!

claude bot commented Mar 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review: x/bor target producer id in producer planned downtime

Overall Assessment

Issues

Looks Good

Summary

Uh oh!

cffls commented Mar 25, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sonarqubecloud bot commented Mar 27, 2026

Quality Gate passed

Uh oh!

kamuikatsurgi commented Mar 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

claude bot commented Mar 24, 2026 •

edited

Loading

codecov bot commented Mar 24, 2026 •

edited

Loading

claude bot commented Mar 25, 2026 •

edited

Loading