Description
Summary
A single IndexWriter doing multiple commit() calls fails with PermissionDenied (OS error 5) on Windows. The race is between Tantivy's own internal merge thread (spawned by consider_merge_options after commit) and the user's next commit() call — both call register_file_as_managed → save_managed_paths → atomic_write(".managed.json") → tempfile::persist() → MoveFileExW. On Windows, concurrent MoveFileExW calls targeting the same destination file cause ERROR_ACCESS_DENIED.
On Linux, rename(2) is atomic and non-blocking, so both calls succeed (last writer wins, potentially losing file registrations in .managed.json — a silent correctness issue). On Windows, the same race causes a hard error.
Version
- tantivy 0.25.0
- Windows 10/11 (NTFS)
The Internal Race
There is only one IndexWriter, but Tantivy creates internal concurrency:
- User calls commit() on the main thread
- commit() calls consider_merge_options() (segment_updater.rs:452)
- consider_merge_options() calls start_merge(), which does self.merge_thread_pool.spawn(...) (segment_updater.rs:515), i.e. fire-and-forget background work
- commit() returns to the user while the merge runs in the background
- User calls commit() again → flush creates new segment files via ManagedDirectory::open_write → register_file_as_managed → atomic_write(".managed.json")
- Meanwhile, the merge thread is still running, also calling open_write → register_file_as_managed → atomic_write(".managed.json")
- Both threads call MoveFileExW(.tmp → .managed.json) concurrently → one gets ERROR_ACCESS_DENIED
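The final steps of this sequence can be exercised without Tantivy. The following is a minimal sketch, not Tantivy's code: it assumes plain `std::fs::rename` in place of `tempfile::persist()`, and the names `atomic_write_model`, `count_rename_failures`, and `manifest.json` are hypothetical. Each writer creates its own temp file and renames it over one shared destination, and the function counts how many writes fail. On Linux the count is expected to be zero; on Windows a nonzero count would be consistent with this report, but it is not guaranteed to occur on any given run.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::thread;

/// Writes `body` to a per-writer temp file, then renames it over the shared
/// destination. File names are hypothetical, chosen for this sketch only.
fn atomic_write_model(dir: &Path, tag: usize, body: &str) -> io::Result<()> {
    let tmp = dir.join(format!("manifest.{tag}.tmp"));
    fs::write(&tmp, body)?;
    fs::rename(&tmp, dir.join("manifest.json"))
}

/// Runs `threads` concurrent writers, each performing `rounds` writes, and
/// returns how many writes returned an error.
fn count_rename_failures(threads: usize, rounds: usize) -> io::Result<usize> {
    let dir = std::env::temp_dir().join(format!("rename_race_{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    let failures: usize = thread::scope(|scope| {
        let workers: Vec<_> = (0..threads)
            .map(|tag| {
                let dir = &dir;
                scope.spawn(move || {
                    (0..rounds)
                        .filter(|round| {
                            atomic_write_model(dir, tag, &format!("{tag}:{round}")).is_err()
                        })
                        .count()
                })
            })
            .collect();
        workers
            .into_iter()
            .map(|worker| worker.join().expect("worker panicked"))
            .sum::<usize>()
    });
    fs::remove_dir_all(&dir)?;
    Ok(failures)
}

fn main() -> io::Result<()> {
    let failures = count_rename_failures(2, 200)?;
    println!("{failures} of 400 writes failed");
    Ok(())
}
```

Because the outcome depends on thread timing, a run with zero failures on Windows does not rule out the race.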
The meta_informations RwLock in register_file_as_managed (managed_directory.rs:213) protects the in-memory HashSet, but the save_managed_paths → atomic_write → persist chain runs outside the lock scope.
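To make the lock-scope problem concrete, here is a standalone model of the pattern as this report describes it; it is not Tantivy's implementation. `Registry`, `insert_and_snapshot`, and `persist` are hypothetical names, and an in-memory `Vec` stands in for the contents of `.managed.json`. The interleaving is replayed on a single thread so the result is deterministic: the insert happens under the lock, but the write happens after the lock is released, so a stale snapshot can overwrite a newer one.

```rust
use std::collections::BTreeSet;
use std::sync::RwLock;

/// Hypothetical stand-in for the managed-path registry (not Tantivy's type).
struct Registry {
    paths: RwLock<BTreeSet<String>>,
    /// Stands in for the contents of `.managed.json` on disk.
    persisted: RwLock<Vec<String>>,
}

impl Registry {
    fn new() -> Self {
        Registry {
            paths: RwLock::new(BTreeSet::new()),
            persisted: RwLock::new(Vec::new()),
        }
    }

    /// Inserts under the write lock and returns a snapshot of the set.
    /// The lock is released when this function returns.
    fn insert_and_snapshot(&self, path: &str) -> Vec<String> {
        let mut guard = self.paths.write().unwrap();
        guard.insert(path.to_string());
        guard.iter().cloned().collect()
    }

    /// Persists a snapshot while no lock on `paths` is held: last writer wins.
    fn persist(&self, snapshot: Vec<String>) {
        *self.persisted.write().unwrap() = snapshot;
    }
}

/// Replays one problematic interleaving of a commit and a merge.
fn replay_lost_update() -> Vec<String> {
    let registry = Registry::new();
    let commit_snapshot = registry.insert_and_snapshot("a.store"); // commit thread
    let merge_snapshot = registry.insert_and_snapshot("b.term"); // merge thread
    registry.persist(merge_snapshot); // merge writes both paths
    registry.persist(commit_snapshot); // commit overwrites with its stale snapshot
    let on_disk = registry.persisted.read().unwrap().clone();
    on_disk
}

fn main() {
    // The in-memory set holds both paths, but the persisted copy does not.
    println!("{:?}", replay_lost_update());
}
```

In this model the persisted list ends up containing only `a.store`, even though both paths were registered in memory.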
Misleading error message
The IO error from .managed.json's failed rename is wrapped with the segment file path at managed_directory.rs:287:
    self.register_file_as_managed(path)
        .map_err(|io_error| OpenWriteError::wrap_io_error(io_error, path.to_path_buf()))?;

So the user sees "Failed to open file for write: 'XXX.store'" but the actual failing operation is renaming a tempfile to .managed.json.
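One way the message could point at the real culprit is to attach the manifest path, rather than the segment path, when the registration step fails. The sketch below only illustrates that idea: `WriteError`, `register_model`, and `open_write_model` are hypothetical and do not correspond to Tantivy's `OpenWriteError` API, and the failure is simulated.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical error type for this sketch. It only mirrors the shape of the
/// message quoted in this report: an I/O error plus a file path.
#[derive(Debug)]
struct WriteError {
    io_error: io::Error,
    filepath: PathBuf,
}

/// Simulated registration step that always fails with PermissionDenied.
fn register_model(_segment: &Path) -> io::Result<()> {
    Err(io::Error::from(io::ErrorKind::PermissionDenied))
}

/// Attributes a registration failure to the manifest, not to the segment file
/// that happened to trigger the registration.
fn open_write_model(segment: &Path) -> Result<(), WriteError> {
    register_model(segment).map_err(|io_error| WriteError {
        io_error,
        filepath: PathBuf::from(".managed.json"),
    })
}

fn main() {
    let err = open_write_model(Path::new("XXX.store")).unwrap_err();
    println!(
        "{:?} while updating {}",
        err.io_error.kind(),
        err.filepath.display()
    );
}
```

With the path attributed this way, a user hitting the race would see `.managed.json` in the error instead of an unrelated segment file name.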
Cross-platform note
On Linux, rename(2) doesn't fail here — both renames succeed. But this means register_file_as_managed calls race silently: one thread's .managed.json content overwrites the other's, potentially losing file registrations. This could cause garbage_collect to delete files that are still needed.
Minimal Reproduction
Single IndexWriter, multiple commits — reproduces the internal merge vs. commit race:
    use tantivy::schema::{Schema, TEXT};
    use tantivy::directory::ManagedDirectory;
    use tantivy::{Index, IndexWriter};

    #[test]
    fn single_writer_commit_race_on_windows() {
        let dir = tempfile::TempDir::new().unwrap();
        let mut schema_builder = Schema::builder();
        let body = schema_builder.add_text_field("body", TEXT);
        let schema = schema_builder.build();

        let mmap = tantivy::directory::MmapDirectory::open(dir.path()).unwrap();
        let managed = ManagedDirectory::wrap(Box::new(mmap)).unwrap();
        let index = Index::open_or_create(managed, schema).unwrap();
        let mut writer: IndexWriter = index.writer(50_000_000).unwrap();

        // Add enough documents to create multiple segments and trigger merges.
        // Then do rapid commits — the background merge from commit N races with commit N+1.
        let mut failures = 0;
        for round in 0..20 {
            for i in 0..500 {
                let doc = tantivy::doc!(body => format!("document {round} {i} with some text for indexing"));
                writer.add_document(doc).unwrap();
            }
            if let Err(e) = writer.commit() {
                eprintln!("commit {round} failed: {e}");
                failures += 1;
            }
        }
        assert_eq!(failures, 0, "{failures}/20 commits failed");
    }

Expected: All 20 commits succeed.
Actual on Windows: A commit fails with PermissionDenied, e.g. commit 4 failed: "Failed to open file for write: 'IoError { io_error: Os { code: 5, kind: PermissionDenied, message: "Access is denied." }, filepath: "d55dab7a...term" }'". Tested on Windows 10 with tantivy 0.25.0, using a plain %TEMP% path on the C: drive; 1 of 20 commits failed.
Suggested Fixes
1. Extend the RwLock scope: Hold the meta_informations write lock across both the in-memory insert and the save_managed_paths → atomic_write call. This serializes all .managed.json updates and also fixes the silent last-writer-wins data loss on Linux.
2. Retry persist on PermissionDenied: Add a retry loop (e.g., 3 attempts with a short sleep) around tempfile.persist() in atomic_write. This addresses the Windows symptom but not the Linux correctness issue.
3. Use file locking: Acquire LockFileEx on .managed.json before MoveFileExW.
Option 1 is the most correct — it fixes both the Windows crash and the Linux silent data loss with minimal code change.
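As a rough illustration of options 1 and 2, here is a self-contained sketch under stated assumptions rather than a patch for Tantivy. `ManagedPaths`, `register`, `retry_on_permission_denied`, and the file names `managed.tmp` and `managed.txt` are hypothetical; it uses `std::sync::Mutex` and `std::fs::rename` instead of Tantivy's RwLock and `tempfile::persist()`, and writes newline-separated text instead of JSON. The guard stays alive until the rename has finished, so writers are serialized and the last write always contains every registered path.

```rust
use std::collections::BTreeSet;
use std::fs;
use std::io;
use std::path::PathBuf;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical registry: one lock guards both the set and the on-disk manifest.
struct ManagedPaths {
    dir: PathBuf,
    paths: Mutex<BTreeSet<String>>,
}

impl ManagedPaths {
    /// Option 1: hold the guard across the insert, the write, and the rename.
    fn register(&self, path: &str) -> io::Result<()> {
        let mut guard = self.paths.lock().unwrap();
        if !guard.insert(path.to_string()) {
            return Ok(()); // already registered, nothing to persist
        }
        let body = guard.iter().cloned().collect::<Vec<_>>().join("\n");
        let tmp = self.dir.join("managed.tmp");
        let dst = self.dir.join("managed.txt");
        fs::write(&tmp, body)?;
        // Option 2, used here only as a fallback around the rename.
        retry_on_permission_denied(3, Duration::from_millis(10), || fs::rename(&tmp, &dst))
        // `guard` is dropped here, after the manifest is in place.
    }
}

/// Retries `op` while it fails with PermissionDenied, up to `attempts` tries.
fn retry_on_permission_denied<T>(
    attempts: u32,
    delay: Duration,
    mut op: impl FnMut() -> io::Result<T>,
) -> io::Result<T> {
    let mut remaining = attempts.max(1);
    loop {
        match op() {
            Err(e) if e.kind() == io::ErrorKind::PermissionDenied && remaining > 1 => {
                remaining -= 1;
                thread::sleep(delay);
            }
            other => return other,
        }
    }
}

/// Registers eight paths from eight threads and returns the persisted lines.
fn run_demo() -> io::Result<Vec<String>> {
    let dir = std::env::temp_dir().join(format!("managed_paths_demo_{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    let registry = Arc::new(ManagedPaths {
        dir: dir.clone(),
        paths: Mutex::new(BTreeSet::new()),
    });
    let handles: Vec<_> = (0..8)
        .map(|i| {
            let registry = Arc::clone(&registry);
            thread::spawn(move || registry.register(&format!("segment_{i}.store")))
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker panicked")?;
    }
    let on_disk = fs::read_to_string(dir.join("managed.txt"))?;
    fs::remove_dir_all(&dir)?;
    Ok(on_disk.lines().map(str::to_string).collect())
}

fn main() -> io::Result<()> {
    println!("{} paths persisted", run_demo()?.len());
    Ok(())
}
```

The trade-off in this design is that file I/O happens while the lock is held, so concurrent registrations wait on disk writes; with serialization in place the retry should rarely be needed, and on its own it would not prevent lost registrations.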