Increase valkey-benchmark max latency bucket to 60 seconds#3157
Increase valkey-benchmark max latency bucket to 60 seconds#3157gabiganam wants to merge 1 commit intovalkey-io:unstablefrom
Conversation
0ae0655 to
65a7b1e
Compare
Signed-off-by: Gabi Ganam <ggabi@amazon.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #3157 +/- ##
============================================
+ Coverage 74.75% 74.83% +0.07%
============================================
Files 129 129
Lines 71145 71209 +64
============================================
+ Hits 53184 53286 +102
+ Misses 17961 17923 -38
🚀 New features to boost your workflow:
|
|
I think this is a good idea if the only outcome is that we will have more precision. |
| #define CONFIG_LATENCY_HISTOGRAM_INSTANT_MAX_VALUE 3000000L /* <= 3 secs(us precision) */ | ||
| #define SHOW_THROUGHPUT_INTERVAL 250 /* 250ms */ | ||
| #define CONFIG_LATENCY_HISTOGRAM_MIN_VALUE 10L /* >= 10 usecs */ | ||
| #define CONFIG_LATENCY_HISTOGRAM_MAX_VALUE 60000000L /* <= 60 secs(us precision) */ |
There was a problem hiding this comment.
The PR description says requests upto 30s (which is also quite large). Do you see requests that can take upto 60s?
There was a problem hiding this comment.
For normal commands and/or in steady state probably not.
I'm using it to test commands that I expect to take a while (like scripts reading huge keys) and in edge-case states (like during load test or slot migrations).
There was a problem hiding this comment.
Are there any side affects of increasing this limit? I am thinking what was the reason that it was kept at 3s for so long.
There was a problem hiding this comment.
I guess the "normal" use case of the benchmark tool is to test fast commands and reach high throughput, and my use case is not a classic usage and thus wasn't necessarily supported properly.
As far as side effects, the max value increase means there will be more buckets in the histogram, but I think it's negligible.
There was a problem hiding this comment.
How many more bucket? What's the size of the buckets? Power of two?
|
Majority decision is not required for benchmark tool changes. We've decided this earlier and done many changes without it. |
Summary
Increase the maximum latency histogram value in valkey-benchmark from 3 seconds to 60 seconds.
Problem
When requests experience latencies greater than 3 seconds, they are capped at
CONFIG_LATENCY_HISTOGRAM_MAX_VALUE(3,000,000 μsec) before being recorded in the HDR histogram. Due to HDR histogram bucket boundaries, these capped values are reported as ~3,000,319 μsec.This causes two issues:
Loss of precision: All latencies above 3 seconds appear as the same value (~3,000,319 μsec), making it impossible to distinguish between a 4-second and a 30-second latency.
Misleading metrics: When benchmarking under stress conditions, slow queries, or network issues, the reported p99/p100 latencies significantly underrepresent the actual latency experienced.