[fix](editlog) Fix concurrency bugs and robustness issues in editlog subsystem#62051
[fix](editlog) Fix concurrency bugs and robustness issues in editlog subsystem#62051dataroaring wants to merge 1 commit intoapache:masterfrom
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
There was a problem hiding this comment.
Pull request overview
This PR hardens the FE editlog/journal subsystem by addressing concurrency hazards, NPE risks, and robustness issues around batching, rolling, and BDB environment lifecycle.
Changes:
- Fix editlog rolling/counting behavior across code paths (queue-based flushing, direct writes, and list/batch writes).
- Improve failure handling and thread-safety in the editlog flusher and BDB environment/database management.
- Prevent NPEs and reduce ID “burn” on failed BDB journal writes.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| fe/fe-core/src/main/java/org/apache/doris/qe/JournalObservable.java | Replaces custom binary search with a standard upper_bound-style implementation for observer notification. |
| fe/fe-core/src/main/java/org/apache/doris/persist/EditLog.java | Converts txId to AtomicLong, adds roll checks for batch-list logging, and notifies queued producers on fatal flush errors. |
| fe/fe-core/src/main/java/org/apache/doris/journal/bdbje/BDBJEJournal.java | Defers advancing nextJournalId until after successful commit/write to avoid burning IDs on failures. |
| fe/fe-core/src/main/java/org/apache/doris/journal/bdbje/BDBEnvironment.java | Fixes removeDatabase() removal logic and makes close() thread-safe by holding the write lock. |
| fe/fe-core/src/main/java/org/apache/doris/ha/BDBHA.java | Adds null-check for getReplicationGroupAdmin() in getLeader() to avoid NPE during partitions. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Notify all pending items so producer threads don't hang if System.exit is delayed | ||
| for (EditLogItem req : batch) { | ||
| synchronized (req.lock) { | ||
| req.finished = true; | ||
| req.logId = -1; | ||
| req.lock.notifyAll(); | ||
| } | ||
| } |
There was a problem hiding this comment.
In the error path you notify waiters by setting finished=true and logId=-1, but the caller side (e.g. logEditWithQueue() / EditLogItem.await()) will treat this as a normal completion and continue executing follow-up logic with an invalid logId until the JVM actually halts. Consider propagating a fatal signal instead (e.g., store the throwable on the request and have await/logEditWithQueue throw an Error when completion is due to failure), so callers don’t proceed with partially persisted state during shutdown hooks / delayed exit.
| // Notify all pending items so producer threads don't hang if System.exit is delayed | |
| for (EditLogItem req : batch) { | |
| synchronized (req.lock) { | |
| req.finished = true; | |
| req.logId = -1; | |
| req.lock.notifyAll(); | |
| } | |
| } |
| public InetSocketAddress getLeader() { | ||
| ReplicationGroupAdmin replicationGroupAdmin = environment.getReplicationGroupAdmin(); | ||
| if (replicationGroupAdmin == null) { | ||
| return null; |
There was a problem hiding this comment.
getLeader() now returns null when getReplicationGroupAdmin() is null, but several call sites assume a non-null return and dereference it (e.g. StatisticsUtil.isMaster() calls getLeader().equals(...)). To avoid introducing new NPEs, either keep the previous “throw” behavior but with a clearer exception (so existing try/catch paths handle it), or update the contract/callers to null-check consistently before use.
| return null; | |
| throw new IllegalStateException("ReplicationGroupAdmin is not initialized"); |
…subsystem 1. EditLog.txId data race: Convert `txId` from plain `long` to `AtomicLong`. The flusher thread and DDL caller threads both read/write `txId` without synchronization, causing torn reads on 64-bit values. 2. EditLog.logEdit(op, List<T>) missing roll check: This bulk-write path incremented txId but never checked against `edit_log_roll_num`, so the edit log would never roll when this path was used. 3. BDBEnvironment.close() not thread-safe: `close()` iterated `openedDatabases` without holding `lock.writeLock()`, while `openDatabase()` concurrently modifies the same list under the write lock. This causes ConcurrentModificationException when a RollbackException triggers close() on the replayer thread while HTTP/heartbeat/RPC threads call openDatabase() via getMaxJournalId(). 4. EditLog.flushEditLog() silent hang on fatal error: When `System.exit(-1)` is called after a write failure, producer threads blocked on `req.lock.wait()` are never notified. If `System.exit` is delayed by shutdown hooks, those threads hang indefinitely. 5. BDBJEJournal.write() burns journal IDs on failure: Both single and batch write methods called `nextJournalId.getAndIncrement/getAndAdd` before the actual write, so a failed write permanently skips journal IDs. Now IDs are only advanced after successful commit. 6. BDBEnvironment.removeDatabase() index corruption: Manual index tracking with `openedDatabases.remove(index)` computed wrong index when the target was not the first element. Replaced with `iterator.remove()`. 7. JournalObservable.upperBound() off-by-one: The binary search used `right = size - 1` and `right = middle - 1`, which could miss notifying observers whose targetJournalVersion exactly equals the journalId. Fixed to standard upper_bound pattern. 8. Remove dead `editStream` field and `getEditLogSize()` method from EditLog. The field was never assigned (always null), so `getEditLogSize()` would always NPE.
c935ad4 to
1ec3092
Compare
|
Splitting into individual PRs for each fix. |
Summary
Fix 8 bugs in the editlog subsystem (concurrency, data loss, correctness):
EditLog.txId data race — Convert
txIdfromlongtoAtomicLong. Flusher and DDL threads read/write without synchronization, risking torn reads.EditLog.logEdit(op, List) missing roll check — Bulk-write path never checked
edit_log_roll_num, so edit log wouldn't roll.BDBEnvironment.close() not thread-safe —
close()iteratedopenedDatabaseswithoutwriteLock, whileopenDatabase()modifies the list concurrently. CausesConcurrentModificationExceptionwhen replayer handlesRollbackExceptionwhile HTTP/heartbeat threads callgetMaxJournalId().EditLog.flushEditLog() silent hang — Producer threads blocked on
req.lock.wait()never notified whenSystem.exit(-1)is delayed by shutdown hooks. Now notifies all pending items before exit.BDBJEJournal.write() burns journal IDs on failure —
nextJournalIdwas advanced before the write, so failures skip IDs permanently. Now advanced only after successful commit.BDBEnvironment.removeDatabase() index corruption — Manual index tracking computed wrong removal index. Replaced with
iterator.remove().JournalObservable.upperBound() off-by-one — Binary search with
right = size - 1could miss notifying observers whosetargetJournalVersionexactly matchesjournalId. Fixed to standard upper_bound.Dead
editStreamfield — Never assigned (always null), sogetEditLogSize()always NPEs. Removed both.Test plan
SHOW FRONTENDSworks concurrently with journal operationsedit_log_roll_numtransactions via both single and bulk write paths🤖 Generated with Claude Code