Skip to content

Use dictGetSomeKeys instead of dictGetRandomKey for gossip node population#3258

Merged
hpatro merged 6 commits intovalkey-io:unstablefrom
hpatro:pick_n_gossip_elements
Mar 10, 2026
Merged

Use dictGetSomeKeys instead of dictGetRandomKey for gossip node population#3258
hpatro merged 6 commits intovalkey-io:unstablefrom
hpatro:pick_n_gossip_elements

Conversation

@hpatro
Copy link
Contributor

@hpatro hpatro commented Feb 26, 2026

Observation around dictGetSomeKeys(N) performance is that it is around 10x faster than randomly picking N elements over N*3 attempts. Further, this avoids chances of not picking N elements due to duplicate element selection with individual random pick. Overall, it should help picking gossip node(s) for cluster bus message faster.

Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
Copy link
Member

@dvkashapov dvkashapov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@zuiderkwast
Copy link
Contributor

dictGetSomeKeys is less randomly distributed.

It picks one random slot in the dict and then iterates forward and takes every element until the the right number of elements are returned.

It means that that a node is picked together with its neighbours in the dict. If a node picks 200 out of 2000 nodes, then the next time, if picks one of these again, it will also include much of the same 200 nodes again.

Additionally, if the nodes are configured to use the same hash seed, then the clustering will look the same on all nodes in the cluster.

I have a strong feeling this can affect the information spreading negatively. I think we need some theoretical at least hand-wavy reasoning about it.

@hpatro
Copy link
Contributor Author

hpatro commented Feb 28, 2026

It would start from a random index though for each pass, so I think it wouldn't be the same set of data each time.

Spinning up a large cluster isn't practical for these type of changes. Let me see if I can unit test the randomness of data across runs.

@zuiderkwast
Copy link
Contributor

zuiderkwast commented Mar 2, 2026

It would start from a random index though for each pass, so I think it wouldn't be the same set of data each time.

It's not the same set every time. But it picks a random range instead of nodes at random. Perhaps gossiping about a range of nodes is fine.

Let's also think about the chaining. Some nodes are more likely to be picked than others depending on chained buckets. Especially in small clusters where we want to pick only three nodes. For example in a 3+3 cluster with nodes A-F, the dict will have size 8. Each node is hashed into any slot, independently of other nodes. It may very well end up like this:

Dict slots: | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
Nodes:        F           D       A   B
                                  E
                                  C

When we pick 3 nodes from these using dictGetSomeKeys, we start from a random slot and pick the next three nodes:

Slot   Next 3 nodes
----   ------------
0      F, D, A
1      F, D, A
2      D, A, E
3      D, A, E
4      D, A, E
5      A, E, C
6      A, E, C
7      B, F, D

Node B is only picked if we start at slot 7. The other nodes are picked more often. Some are picked in 5 start-slots. So, clearly this method isn't very fair in a small cluster.

Maybe we can use this method only in larger clusters? In a large cluster, the bucket-chains might be longer so this effect can be observed there too?

Spinning up a large cluster isn't practical for these type of changes. Let me see if I can unit test the randomness of data across runs.

That'd be good!

You can look at the hashtable unit tests for inspiration. When I developed the random selection there, I experimented with different sample sizes and tried to measure the randomness.

@hpatro
Copy link
Contributor Author

hpatro commented Mar 4, 2026

@zuiderkwast Thanks for the pointers. Will take a look.

…ior for fairness

Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
@hpatro
Copy link
Contributor Author

hpatro commented Mar 6, 2026

@zuiderkwast Used AI (codex) to generate some unit tests to test the fairness with chaining and evenly spread scenario.

  • With chaining scenario, the fairness is worse with the dictRandomKey unstable approach.
  • With even distribution, the fairness is uniform.

I've added the unit tests to the PR (will delete once we decide to proceed ahead). Have a look at them and the results below.

- `test_dictGossipSamplingSimulationUnstableVsSomeKeys` (chaining/adversarial)
- `test_dictGossipSamplingSimulationUnstableVsSomeKeysNoChainSpread` (no chaining/even spread)

Both run `wanted=3`, `100000` rounds, fixed RNG seed.

`spread` below means: `max(total_picks_per_node) - min(total_picks_per_node)`.

Chaining / adversarial layout (6 nodes, 8 buckets, one 3-node chain) (Scenario explained above)

- **unstable loop** (`dictGetRandomKey` + dedupe + `wanted*3` retries):
  - spread: **39258**
  - rounds with `<3` picks: **605/100000**
  - strong skew (some nodes ~0.69 inclusion/round, others ~0.30)
- **dictGetSomeKeys**:
  - spread: **26464**
  - rounds with `<3` picks: **0/100000**
  - still skewed, but less than unstable in this layout
  
unstable simulation results:
  per-round inclusion: A=0.3038 B=0.6951 C=0.3025 D=0.6941 E=0.3057 F=0.6927
  total picks: A=30382 B=69508 C=30250 D=69408 E=30571 F=69275
  spread max-min: 39258
  rounds with fewer than 3 picks: 605/100000
dictGetSomeKeys simulation results:
  per-round inclusion: A=0.5132 B=0.3745 C=0.5117 D=0.6391 E=0.5125 F=0.4490
  total picks: A=51318 B=37445 C=51174 D=63909 E=51255 F=44899
  spread max-min: 26464
  rounds with fewer than 3 picks: 0/100000

No chaining / evenly spread layout (8 nodes, 8 buckets)

- expected picks per node: **37500**
- **unstable loop**:
  - spread: **249**
  - max deviation from expected: **133**
  - rounds with `<3` picks: **12/100000**
- **dictGetSomeKeys**:
  - spread: **304**
  - max deviation from expected: **179**
  - rounds with `<3` picks: **0/100000**
 
 unstable (no-chain spread) simulation results:
  per-round inclusion: A=0.3746 B=0.3738 C=0.3758 D=0.3753 E=0.3745 F=0.3763 G=0.3750 H=0.3745
  total picks: A=37458 B=37384 C=37581 D=37529 E=37448 F=37633 G=37503 H=37452
  spread max-min: 249
  rounds with fewer than 3 picks: 12/100000

dictGetSomeKeys (no-chain spread) simulation results:
  per-round inclusion: A=0.3753 B=0.3739 C=0.3758 D=0.3768 E=0.3761 F=0.3740 G=0.3738 H=0.3744
  total picks: A=37527 B=37393 C=37581 D=37679 E=37613 F=37395 G=37375 H=37437
  spread max-min: 304
  rounds with fewer than 3 picks: 0/100000

no-chain expected per node=37500
no-chain max deviation: unstable=133 dictGetSomeKeys=179
no-chain spread: unstable=249 dictGetSomeKeys=304

@zuiderkwast
Copy link
Contributor

  • With chaining scenario, the fairness is worse with the dictRandomKey unstable approach.
  • With even distribution, the fairness is uniform.

Wow, nice! Good result.

I guess I was thinking about dictFairRandomKey.

OK, then, go for it! And remove the unit tests. We already moved to googletest in unstable so you will get nasty conflicts if you keep them.

hpatro added 2 commits March 6, 2026 20:39
Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
Copy link
Contributor

@sarthakaggarwal97 sarthakaggarwal97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noice!

@codecov
Copy link

codecov bot commented Mar 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 75.02%. Comparing base (8c397ea) to head (e6436dc).
⚠️ Report is 8 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #3258      +/-   ##
============================================
+ Coverage     74.98%   75.02%   +0.04%     
============================================
  Files           129      129              
  Lines         71551    71552       +1     
============================================
+ Hits          53654    53684      +30     
+ Misses        17897    17868      -29     
Files with missing lines Coverage Δ
src/cluster_legacy.c 87.96% <100.00%> (-0.11%) ⬇️

... and 27 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@hpatro
Copy link
Contributor Author

hpatro commented Mar 9, 2026

@zuiderkwast Have cleaned up the tests. If you're fine with the changes, PTAL and approve it.

@hpatro hpatro merged commit 6fee1aa into valkey-io:unstable Mar 10, 2026
58 checks passed
JimB123 pushed a commit that referenced this pull request Mar 19, 2026
…ation (#3258)

Observation around `dictGetSomeKeys(N)` performance is that it is around
10x faster than randomly picking `N` elements over N*3 attempts.
Further, this avoids chances of not picking N elements due to duplicate
element selection with individual random pick. Overall, it should help
picking gossip node(s) for cluster bus message faster.

---------

Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants