Made papaya ~9% faster using parking_lot #88

Open
ppmpreetham wants to merge 1 commit into ibraheemdev:master from ppmpreetham:master
@ppmpreetham

Made papaya ~9% faster on single-threaded reads and reduced p99 tail latency in incremental mode by ~85% by replacing Mutex with the parking_lot implementation.

Latency Bench

| Variant | Before (p99 values) | After (p99 values) |
| --- | --- | --- |
| papaya (incremental) | 4 ms (single); 4 ms, 5 ms, 9 ms, 19 ms, 14 ms, 5 ms, 7 ms, 4 ms (concurrent) | 4 ms (single); 1 ms, 0 ms, 0 ms, 2 ms, 12 ms, 0 ms, 5 ms, 0 ms (concurrent) |
| papaya (blocking) | 4 ms (single); 67 ms, 185 ms, 189 ms, 192 ms, 186 ms, 185 ms, 185 ms, 185 ms (concurrent) | 2 ms (single); 32 ms, 86 ms, 87 ms, 86 ms, 90 ms, 86 ms, 87 ms, 86 ms (concurrent) |
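
How much the p99 latency dropped depends on how the eight concurrent samples are aggregated. A minimal sketch, using the incremental-mode values listed above, that compares the mean and the median; the helper names `mean`, `median`, and `reduction` are hypothetical and not part of papaya or its benchmarks:

```rust
/// Arithmetic mean of a slice of samples (milliseconds).
fn mean(samples: &[f64]) -> f64 {
    samples.iter().sum::<f64>() / samples.len() as f64
}

/// Median of a slice of samples; averages the two middle values for even lengths.
fn median(samples: &[f64]) -> f64 {
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    }
}

/// Fractional reduction from `before` to `after` (0.5 means 50% lower).
fn reduction(before: f64, after: f64) -> f64 {
    (before - after) / before
}

fn main() {
    // p99 concurrent insert values for papaya (incremental), in milliseconds.
    let before = [4.0, 5.0, 9.0, 19.0, 14.0, 5.0, 7.0, 4.0];
    let after = [1.0, 0.0, 0.0, 2.0, 12.0, 0.0, 5.0, 0.0];

    println!("mean:   {:.1}% lower", 100.0 * reduction(mean(&before), mean(&after)));
    println!("median: {:.1}% lower", 100.0 * reduction(median(&before), median(&after)));
}
```

For these samples the mean falls by about 70% and the median by about 92%, so the headline figure is sensitive to the choice of aggregate and to the millisecond rounding of the raw values.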

Single-threaded Read Bench

| Map | Before | After | Relative Change |
| --- | --- | --- | --- |
| papaya | [161.52 µs 164.23 µs 167.35 µs] | [149.58 µs 150.64 µs 151.87 µs] | −7.7% to −10.0% |
| std | [93.431 µs 95.598 µs 97.980 µs] | [81.907 µs 82.839 µs 83.976 µs] | −13.2% to −18.5% |
| dashmap | [195.79 µs 199.30 µs 203.44 µs] | [184.78 µs 186.77 µs 189.01 µs] | −3.6% to −7.8% |
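
The Relative Change column lists the bounds of Criterion's reported change interval. As a rough cross-check, the change can also be computed directly from the middle (point) estimates of each run. A small sketch, where `relative_change_percent` is a hypothetical helper and not part of Criterion:

```rust
/// Percentage change from `before` to `after`; negative means faster.
fn relative_change_percent(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // (map, before point estimate in µs, after point estimate in µs)
    let rows = [
        ("papaya", 164.23, 150.64),
        ("std", 95.598, 82.839),
        ("dashmap", 199.30, 186.77),
    ];
    for (name, before, after) in rows {
        println!("{name}: {:+.2}%", relative_change_percent(before, after));
    }
}
```

For papaya this gives about −8.3%, close to but not identical to the −8.83% in Criterion's `change` line, since Criterion derives that value from its own statistical comparison of the two runs rather than from the ratio of the printed point estimates.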

BEFORE:

Running benches\latency.rs (target\release\deps\latency-febcbe066c63fc01.exe)
=== papaya (incremental) ===
p99 insert: 4ms
p99 concurrent insert: 4ms
p99 concurrent insert: 5ms
p99 concurrent insert: 9ms
p99 concurrent insert: 19ms
p99 concurrent insert: 14ms
p99 concurrent insert: 5ms
p99 concurrent insert: 7ms
p99 concurrent insert: 4ms
=== papaya (blocking) ===
p99 insert: 4ms
p99 concurrent insert: 67ms
p99 concurrent insert: 185ms
p99 concurrent insert: 189ms
p99 concurrent insert: 192ms
p99 concurrent insert: 186ms
p99 concurrent insert: 185ms
p99 concurrent insert: 185ms
p99 concurrent insert: 185ms
=== dashmap ===
p99 insert: 3ms
p99 concurrent insert: 3ms
p99 concurrent insert: 3ms
p99 concurrent insert: 3ms
p99 concurrent insert: 3ms
p99 concurrent insert: 3ms
p99 concurrent insert: 3ms
p99 concurrent insert: 3ms
p99 concurrent insert: 3ms
Running benches\single_thread.rs (target\release\deps\single_thread-c96b88cabe8f6003.exe)
Gnuplot not found, using plotters backend
read/papaya time: [161.52 µs 164.23 µs 167.35 µs]
Found 10 outliers among 100 measurements (10.00%)
5 (5.00%) high mild
5 (5.00%) high severe
read/std time: [93.431 µs 95.598 µs 97.980 µs]
Found 8 outliers among 100 measurements (8.00%)
7 (7.00%) high mild
1 (1.00%) high severe
read/dashmap time: [195.79 µs 199.30 µs 203.44 µs]
Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) high mild
9 (9.00%) high severe

AFTER:

=== papaya (incremental) ===
p99 insert: 4ms
p99 concurrent insert: 1ms
p99 concurrent insert: 0ms
p99 concurrent insert: 0ms
p99 concurrent insert: 2ms
p99 concurrent insert: 12ms
p99 concurrent insert: 0ms
p99 concurrent insert: 5ms
p99 concurrent insert: 0ms
=== papaya (blocking) ===
p99 insert: 2ms
p99 concurrent insert: 32ms
p99 concurrent insert: 86ms
p99 concurrent insert: 87ms
p99 concurrent insert: 86ms
p99 concurrent insert: 90ms
p99 concurrent insert: 86ms
p99 concurrent insert: 87ms
p99 concurrent insert: 86ms
=== dashmap ===
p99 insert: 2ms
p99 concurrent insert: 2ms
p99 concurrent insert: 2ms
p99 concurrent insert: 2ms
p99 concurrent insert: 2ms
p99 concurrent insert: 2ms
p99 concurrent insert: 2ms
p99 concurrent insert: 2ms
p99 concurrent insert: 2ms
Running benches\single_thread.rs (target\release\deps\single_thread-c96b88cabe8f6003.exe)
Gnuplot not found, using plotters backend
read/papaya time: [149.58 µs 150.64 µs 151.87 µs]
change: [-10.011% -8.8298% -7.7247%] (p = 0.00 < 0.05)
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
4 (4.00%) high mild
3 (3.00%) high severe
read/std time: [81.907 µs 82.839 µs 83.976 µs]
change: [-18.498% -15.797% -13.173%] (p = 0.00 < 0.05)
Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
8 (8.00%) high mild
5 (5.00%) high severe
read/dashmap time: [184.78 µs 186.77 µs 189.01 µs]
change: [-7.7925% -5.6063% -3.5561%] (p = 0.00 < 0.05)
Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
7 (7.00%) high mild
2 (2.00%) high severe

@ppmpreetham
Author

The existing source code has a lot of unused functions; consider removing them and running it again.

@ibraheemdev
Owner

I haven't looked at this closely, but I'm a little confused about your before and after numbers. Why are numbers other than papaya also changing?

I'm not sure if the primary impact of your change is caused by the allocation lock or the parker lock, but I would be interested in removing the parker entirely and using parking_lot_core::{park, unpark} instead, considering we're already adding the dependency.

@ibraheemdev
Owner

ibraheemdev commented Mar 22, 2026

The CI warnings should be gone after you rebase on top of #90.

@ppmpreetham
Author

I'm also confused about why the timings for the other maps are lower in the second run. I'll try running the benchmarks hundreds of times and averaging the results.

Also, if time permits, I'll try to implement the parking using parking_lot_core::{park, unpark}.
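
For the repeated runs mentioned above, a minimal sketch of timing a workload many times and reporting the mean together with the spread. The helpers `time_runs` and `mean_and_std_dev` and the `HashMap` read loop are hypothetical stand-ins, not the actual papaya benchmarks:

```rust
use std::collections::HashMap;
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `work` `runs` times and returns the wall-clock duration of each run.
fn time_runs<F: FnMut()>(runs: usize, mut work: F) -> Vec<Duration> {
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            work();
            start.elapsed()
        })
        .collect()
}

/// Mean and sample standard deviation in microseconds (needs at least two samples).
fn mean_and_std_dev(samples: &[Duration]) -> (f64, f64) {
    let micros: Vec<f64> = samples.iter().map(|d| d.as_secs_f64() * 1e6).collect();
    let n = micros.len() as f64;
    let mean = micros.iter().sum::<f64>() / n;
    let variance = micros.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    (mean, variance.sqrt())
}

fn main() {
    // Stand-in workload: sequential reads from a std HashMap.
    let map: HashMap<u64, u64> = (0..10_000).map(|i| (i, i)).collect();
    let samples = time_runs(200, || {
        for key in 0..10_000u64 {
            black_box(map.get(&key));
        }
    });

    let (mean, std_dev) = mean_and_std_dev(&samples);
    println!("mean {mean:.2} µs, std dev {std_dev:.2} µs over {} runs", samples.len());
}
```

Reporting the standard deviation alongside the mean makes it easier to tell whether a shift in the std and dashmap numbers is larger than run-to-run noise.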

@ibraheemdev ibraheemdev added the performance Performance related improvements. label Mar 29, 2026