-
Notifications
You must be signed in to change notification settings - Fork 5
Description
Specification
Kademlia has this to say about refreshing buckets;
"Buckets will generally be kept constantly fresh, due to the traffic of requests travelling through nodes. To avoid pathological cases when no traffic exists, each node refreshes a bucket in whose range it has not performed a node lookup within an hour. Refreshing means picking a random ID in the bucket’s range and performing a node search for that ID. To join the network, a node u must have a contact to an already participating node w. u inserts w into the appropriate k-bucket. u then performs a node lookup for its own node ID. Finally, u refreshes all k-buckets further away than its closest neighbour. During the refreshes, u both populates its own k-buckets and inserts itself into other nodes’ k-buckets as necessary." kademlia spec
Given this we need to implement the following
- A refresh bucket method needs to be added that does the following. This may need to be part of an async queue since refreshing a bucket can be expensive.
- Picks a random node inside the range of a given bucket.
- preforms a node lookup on the generated
NodeId
.
- We need to refresh a bucket if it hasn't been used within a certain time. This defaults to 1 hour. The timeout can be implemented as a priority queue and a single timer tracking the shortest time, no need to maintain 255 timers at once.
- track timeouts for each bucket
- refresh timeout whenever a bucket is used.
- when a timeout on a bucket is triggered we need to do a refresh bucket operation on that bucket.
- When entering the network we need to refresh all k-buckets further away than its closest neighbour.
- when entering the network after searching for itself in the network the node then needs to refresh every bucket 'above' it's closest neighbour.
refreshBucket
is pretty simple. It generates a random NodeId
within the targeted bucket and preforms a nodeFind
operation for that nodeId. We generate the desired 'distance' by generating 32 random bytes and then zeroing the bits above the target bucket bit and forcing 1 for the target bucket bit. The random NodeId
is made by XOR-ing the distance with the base NodeId
.
The activity timers for buckets were implemented the following way. We maintain a map of bucketIndex -> deadline
to track when each bucket needs to be refreshed. We then maintain a single timer that triggers when the closest deadline has passed. When the timer is triggered we add any bucket that has passed it's deadline to the queue and reset the timer to the next closest deadline. When resetting timers for buckets that have been access we just update the map entry for that timer. if the updated bucket is the closest deadline then we update the timer again. When a bucket is is the queue, it's deadline is disabled by setting it to 0
and is not considered when updating the timer or adding to queue.
The refreshBucket
queue is implemented using a set of unique bucket indexes so a bucket can only be in the queue once. The queue is asynchronously digested doing a single refreshBucket
at a time. When a refreshBucket
operation is completed for a bucket it's deadline is reinstated. if a bucket is updated while in the queue then it's deadline is reinstated and removed from the queue.
Entering the network has been updated. NodeConnectionManager.syncNodeGraph
has been updated such that when it completes the initial sync it will add any buckets above the closest node's bucket to the refreshBucket
queue.
Additional context
- originally discussed when talking about adding a node to another node's
NodeGraph
in Seed node not adding details of connecting node to itsNodeGraph
#344 (comment)
Tasks
- Implement
refreshBucket
as per the above spec. - Implement refresh timeouts for every node bucket as per the above spec
- Update how the node enters the network as per the above spec.