This repository contains implementations of two hashing strategies used in distributed systems: Modulo Hashing and Consistent Hashing. The project demonstrates the impact of adding or removing nodes on key distribution and migration.
modulo_hashing.py: Implements the traditionalhash(key) % Napproach.consistent_hashing.py: Implements Consistent Hashing using a hash ring, virtual nodes, and weighted distribution.report.md: A detailed comparison and analysis of the two algorithms based on the simulations.
Ensure you have Python 3 installed. No external dependencies are required as the scripts use standard libraries (hashlib, bisect, threading).
python modulo_hashing.pyObserve how adding a single node causes a large percentage of keys to move.
python consistent_hashing.pyObserve how adding a node results in minimal key movement, preserving the mapping for most keys.
- Ring Topology: Maps nodes and keys to a common hash space.
- Virtual Nodes: Improves load balancing by assigning multiple positions on the ring to a single physical node.
- Weighted Nodes: Allows nodes with higher capacity to handle more keys (e.g.,
Node-Bhas weight 2). - Replication: Supports finding
Nunique replicas for a given key for fault tolerance. - Thread Safety: Uses
threading.RLockfor concurrent access safety.
For a detailed analysis of the results and theoretical differences, please refer to report.md.
- Distributes keys uniformly across nodes.
- Minimizes key movement during node addition/removal.
- Supports deterministic lookup.
- Handles node rebalancing efficiently.
- Virtual Nodes creation and distribution.