perf(storage): fetch versioning config once in put_opts#20
Merged
Conversation
`put_opts` called `BucketVersioningSys::prefix_enabled` and `BucketVersioningSys::prefix_suspended` sequentially. Each is a thin wrapper around `BucketVersioningSys::get`, which acquires the bucket metadata write lock and reads versioning config from disk — two separate lock+read cycles per PUT request (~4ms each, ~8ms total). Replace both calls with a single `BucketVersioningSys::get` and evaluate both predicates from the result. A/B result on 8disk cluster (PUT 100t 750KB EC:1): before: 50ms avgOpMs / 1434 MB/s after: 45ms avgOpMs / 1557 MB/s (–5ms, –10%, +123 MB/s) See reference/perf-baselines.md for the full baseline table. Co-authored-by: Cursor <cursoragent@cursor.com>
Three bugs in stop_monitors() allowed remote monitor processes to accumulate across runs: - PID file race: the launch SSH job is backgrounded, so echo $! may not have written the file by the time stop_monitors fires on a fast error path. Replace the two-round-trip test-f/cat with a single cat retried up to 3 times (1 s apart). - Silent SSH failure: the outer || true on the kill command swallowed every SSH transport error with no log message. Wrap the SSH kill in if ! ssh …; then log WARNING so failures are visible in run.log. - Subtree not killed: kill $PID only signals monitor.sh; sar/mpstat/ iostat (nohup+disown children) are never reaped if monitor.sh's SIGTERM handler races. Add setsid to the remote launch so the monitor becomes a session leader (PGID == PID), then kill -- -PGID in stop_monitors to signal the entire process group atomically, followed by SIGKILL after 2 s for any stragglers. Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
Pull request overview
This PR improves PUT-path performance by avoiding redundant bucket versioning configuration reads, and hardens perf-monitor teardown on peer nodes to prevent orphaned system-monitoring processes.
Changes:
- Optimize
put_optsto fetch bucket versioning configuration once and derive bothversionedandversion_suspendedfrom it. - Update perf test runner to terminate the entire remote monitor process group (sar/mpstat/iostat subtree) and retry PID-file retrieval on fast-failure paths.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| scripts/perf/run-perf-test.sh | Uses setsid + process-group killing and PID-file retry to avoid orphaned monitor subprocesses on peers. |
| rustfs/src/storage/options.rs | Reduces lock+disk reads per PUT by replacing two versioning sys calls with a single BucketVersioningSys::get(). |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Type of Change
Related Issues
Summary of Changes
put_opts in rustfs/src/storage/options.rs called BucketVersioningSys::prefix_enabled and BucketVersioningSys::prefix_suspended sequentially. Both are thin wrappers around BucketVersioningSys::get, which acquires the bucket metadata write lock and reads versioning config from disk — two separate lock+read cycles per PUT request (~4ms each, ~8ms total wasted per request).
This change replaces the two calls with a single BucketVersioningSys::get and evaluates both predicates from the result. Error handling matches the existing warn! + VersioningConfiguration::default() fallback pattern used elsewhere in versioning_sys.rs.
Also, fix perf testing script to avoid orphaned mpstat and other data collection processes on remote nodes.
Checklist
make pre-commitImpact
Additional Notes
Only rustfs/src/storage/options.rs is changed. No other call sites of prefix_enabled/prefix_suspended are modified — each has different surrounding context and is a separate fix candidate.
make pre-commit passed (1051+ tests, clean clippy and format).
Thank you for your contribution! Please ensure your PR follows the community standards (CODE_OF_CONDUCT.md). If this is your first contribution, review the CLA document and sign it by commenting
I have read and agree to the CLA.on the PR.