[fix](editlog) Fix BDBJEJournal.write() burning journal IDs on failure#62063
[fix](editlog) Fix BDBJEJournal.write() burning journal IDs on failure#62063dataroaring wants to merge 1 commit intoapache:masterfrom
Conversation
…uccessful write Both single-item write() and batch write() call nextJournalId.getAndIncrement() / getAndAdd() before the actual BDB put/commit. If the write fails (e.g. ReplicaWriteException, InsufficientAcksException), the journal IDs are permanently consumed but no journal entry exists for them. On retry, the write uses a new ID, creating gaps in the journal ID sequence. These gaps can confuse the replayer which expects contiguous journal IDs, potentially causing it to skip entries or fail to find expected journals. Fix: read the current ID with nextJournalId.get() before writing, and only advance the ID with incrementAndGet()/addAndGet() after the write succeeds. Both write methods are synchronized, so the get-then-advance pattern is safe.
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
There was a problem hiding this comment.
Pull request overview
Fixes journal ID gaps by only advancing nextJournalId after a successful BDBJE put/commit, preventing “burned” IDs on transient write failures that confuse the replayer.
Changes:
- Batch
write(JournalBatch)now readsnextJournalIdwithget()and advances viaaddAndGet()only after a successful transaction commit. - Single-item
write(short, Writable)now readsnextJournalIdwithget()and advances viaincrementAndGet()only after a successful put.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| long dataSize = 0; | ||
| long firstId = nextJournalId.getAndAdd(entitySize); | ||
| // Reserve IDs only after successful commit to avoid burning IDs on write failure. | ||
| long firstId = nextJournalId.get(); |
There was a problem hiding this comment.
nextJournalId.addAndGet(entitySize) is performed inside the retry loop. This is safe only if the method exits/breaks immediately after a successful commit; otherwise a future refactor (or a subtle control-flow change) could advance the counter multiple times for the same batch. To make this robust, consider structuring the success path to return/break immediately after txn.commit() (and then advance once), or perform the addAndGet in a single place that is executed exactly once when writeSucceed is finalized.
|
|
||
| txn.commit(); | ||
| txn = null; | ||
| nextJournalId.addAndGet(entitySize); |
There was a problem hiding this comment.
nextJournalId.addAndGet(entitySize) is performed inside the retry loop. This is safe only if the method exits/breaks immediately after a successful commit; otherwise a future refactor (or a subtle control-flow change) could advance the counter multiple times for the same batch. To make this robust, consider structuring the success path to return/break immediately after txn.commit() (and then advance once), or perform the addAndGet in a single place that is executed exactly once when writeSucceed is finalized.
| // id is the key. Reserve ID only after successful write to avoid burning IDs on failure. | ||
| // This is safe because the method is synchronized. |
There was a problem hiding this comment.
The comment says “Reserve ID only after successful write”, but the code still selects the ID before the write; what changes is when nextJournalId is advanced. Consider rewording to something like “Advance nextJournalId only after successful write/commit” to match the actual behavior and avoid confusion for future maintainers.
| // id is the key. Reserve ID only after successful write to avoid burning IDs on failure. | |
| // This is safe because the method is synchronized. | |
| // id is the key. We advance nextJournalId only after successful write/commit | |
| // to avoid burning IDs on failure. This is safe because the method is synchronized. |
Summary
write()and batchwrite()callnextJournalId.getAndIncrement()/getAndAdd()before the actual BDB put/commitReplicaWriteException,InsufficientAcksException), the journal IDs are permanently consumed but no journal entry existsget()before writing, advance withincrementAndGet()/addAndGet()only after successful commit. Both write methods aresynchronized, so the pattern is safe.Test plan
🤖 Generated with Claude Code