Use dictGetSomeKeys instead of dictGetRandomKey for gossip node population#3258
Use dictGetSomeKeys instead of dictGetRandomKey for gossip node population#3258hpatro merged 6 commits intovalkey-io:unstablefrom
Conversation
Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
|
dictGetSomeKeys is less randomly distributed. It picks one random slot in the dict and then iterates forward and takes every element until the the right number of elements are returned. It means that that a node is picked together with its neighbours in the dict. If a node picks 200 out of 2000 nodes, then the next time, if picks one of these again, it will also include much of the same 200 nodes again. Additionally, if the nodes are configured to use the same hash seed, then the clustering will look the same on all nodes in the cluster. I have a strong feeling this can affect the information spreading negatively. I think we need some theoretical at least hand-wavy reasoning about it. |
|
It would start from a random index though for each pass, so I think it wouldn't be the same set of data each time. Spinning up a large cluster isn't practical for these type of changes. Let me see if I can unit test the randomness of data across runs. |
It's not the same set every time. But it picks a random range instead of nodes at random. Perhaps gossiping about a range of nodes is fine. Let's also think about the chaining. Some nodes are more likely to be picked than others depending on chained buckets. Especially in small clusters where we want to pick only three nodes. For example in a 3+3 cluster with nodes A-F, the dict will have size 8. Each node is hashed into any slot, independently of other nodes. It may very well end up like this: When we pick 3 nodes from these using dictGetSomeKeys, we start from a random slot and pick the next three nodes: Node B is only picked if we start at slot 7. The other nodes are picked more often. Some are picked in 5 start-slots. So, clearly this method isn't very fair in a small cluster. Maybe we can use this method only in larger clusters? In a large cluster, the bucket-chains might be longer so this effect can be observed there too?
That'd be good! You can look at the hashtable unit tests for inspiration. When I developed the random selection there, I experimented with different sample sizes and tried to measure the randomness. |
|
@zuiderkwast Thanks for the pointers. Will take a look. |
…ior for fairness Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
|
@zuiderkwast Used AI (codex) to generate some unit tests to test the fairness with chaining and evenly spread scenario.
I've added the unit tests to the PR (will delete once we decide to proceed ahead). Have a look at them and the results below. Chaining / adversarial layout (6 nodes, 8 buckets, one 3-node chain) (Scenario explained above)No chaining / evenly spread layout (8 nodes, 8 buckets) |
Wow, nice! Good result. I guess I was thinking about dictFairRandomKey. OK, then, go for it! And remove the unit tests. We already moved to googletest in unstable so you will get nasty conflicts if you keep them. |
Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #3258 +/- ##
============================================
+ Coverage 74.98% 75.02% +0.04%
============================================
Files 129 129
Lines 71551 71552 +1
============================================
+ Hits 53654 53684 +30
+ Misses 17897 17868 -29
🚀 New features to boost your workflow:
|
|
@zuiderkwast Have cleaned up the tests. If you're fine with the changes, PTAL and approve it. |
…ation (#3258) Observation around `dictGetSomeKeys(N)` performance is that it is around 10x faster than randomly picking `N` elements over N*3 attempts. Further, this avoids chances of not picking N elements due to duplicate element selection with individual random pick. Overall, it should help picking gossip node(s) for cluster bus message faster. --------- Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
Observation around
dictGetSomeKeys(N)performance is that it is around 10x faster than randomly pickingNelements over N*3 attempts. Further, this avoids chances of not picking N elements due to duplicate element selection with individual random pick. Overall, it should help picking gossip node(s) for cluster bus message faster.