Replies: 7 comments 8 replies
-
The only problem I can see with this approach, assuming Kademlia does pull from the Swarm cache to bootstrap, is that the peer store will be storing all peers we've connected to, not just the peers that were confirmed to support the Kademlia protocol and that are also in the Kademlia k-buckets. So this isn't as big of a speedup as it could be. If the peers persisted to disk were only the peers that are also in the Kademlia k-buckets, then we'd have the theoretical maximum speedup possible. With this approach only some fraction of the peers will be dialed and have their protocol confirmed, so they will be added to the Kademlia routing tables iff their protocol support is confirmed.
-
It seems like there could be a more straightforward and efficient way to:
-
FWIW the js peer store serializes supported protocols (from identify) along with other metadata, and at startup KAD-DHT only adds peers that have previously claimed to support the KAD-DHT protocol to the routing table; it doesn't add every peer in the peer store. By default, on loading from the peer store, peers that have not been successfully dialled in the last hour have their multiaddrs removed (requiring a peer routing lookup to ensure their addresses are current before dialing), and peers that have been without multiaddrs for six hours are removed entirely. So if it's a quick restart of a node, it'll take peers from the peer store, but if the node has been offline for more than six hours it'll go back to the bootstrappers to build its routing table.
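For illustration, those expiry rules translate to something like the following sketch. The `PersistedPeer` record is hypothetical, and using the last successful dial as the staleness clock is an approximation of what js-libp2p actually tracks:

```rust
use std::time::{Duration, SystemTime};
use libp2p::{Multiaddr, PeerId};

/// Hypothetical on-disk record; js-libp2p keys expiry off dial history,
/// approximated here by the last successful dial per peer.
struct PersistedPeer {
    peer_id: PeerId,
    addrs: Vec<Multiaddr>,
    last_dial_success: SystemTime,
}

fn expire(mut peers: Vec<PersistedPeer>, now: SystemTime) -> Vec<PersistedPeer> {
    const ADDR_TTL: Duration = Duration::from_secs(60 * 60); // 1 hour
    const PEER_TTL: Duration = Duration::from_secs(6 * 60 * 60); // 6 hours
    for p in &mut peers {
        let age = now.duration_since(p.last_dial_success).unwrap_or_default();
        if age > ADDR_TTL {
            // Stale addresses: keep the peer but force a fresh
            // peer-routing lookup before it can be dialed again.
            p.addrs.clear();
        }
    }
    // Peers left without addresses past PEER_TTL are dropped entirely.
    peers.retain(|p| {
        let age = now.duration_since(p.last_dial_success).unwrap_or_default();
        !p.addrs.is_empty() || age <= PEER_TTL
    });
    peers
}
```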
-
I'm starting to get the sense that peer information is fragmented too much in rust-libp2p. The Swarm has a cache. Kademlia has a cache. Gossipsub has a cache. On and on. They all store different things and have different heuristics for adding/removing/updating records. I'm wondering if the time has come to re-think peer information storage and interfacing, at least a little bit.

My main concern is that there doesn't seem to be a direct way to gather all of the peer information from the various caches, store it to disk, and later restore it from disk, emitting events that allow the various caches to restore their state directly. For instance, if I serialize Kademlia's peer and provider records to disk and later reload them, there's no code for getting that data back into the k-buckets/routing table and triggering a "filter and refresh" pass (see the sketch below for a partial workaround). It seems like there should be a way to do that. Operating on the assumption that a time-, network-, and energy-intensive bootstrap process must happen on every startup seems like a bad design.
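For what it's worth, a crude version of the save/restore half seems possible today with public APIs. A minimal sketch, assuming the snapshot is just `(PeerId, Vec<Multiaddr>)` pairs and leaving the on-disk serialization to the caller; `kbuckets()` and `add_address()` are existing `kad::Behaviour` methods, but there is no explicit "filter and refresh" trigger beyond Kademlia's normal maintenance:

```rust
use libp2p::{kad, Multiaddr, PeerId};

/// Walk the live k-buckets and copy out (PeerId, Multiaddr) pairs.
fn snapshot_routing_table(
    kademlia: &mut kad::Behaviour<kad::store::MemoryStore>,
) -> Vec<(PeerId, Vec<Multiaddr>)> {
    kademlia
        .kbuckets()
        .flat_map(|bucket| {
            bucket
                .iter()
                .map(|entry| {
                    (
                        *entry.node.key.preimage(),
                        entry.node.value.iter().cloned().collect::<Vec<_>>(),
                    )
                })
                .collect::<Vec<_>>()
        })
        .collect()
}

/// Feed a snapshot back in. Re-inserted addresses repopulate the
/// k-buckets; stale entries get purged by Kademlia's normal liveness
/// checks once queries start running.
fn restore_routing_table(
    kademlia: &mut kad::Behaviour<kad::store::MemoryStore>,
    snapshot: Vec<(PeerId, Vec<Multiaddr>)>,
) {
    for (peer, addrs) in snapshot {
        for addr in addrs {
            kademlia.add_address(&peer, addr);
        }
    }
}
```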
-
I can see 2 benefits of persisting peers:
I guess the main goal is 1. I don't know whether it is actually easy, but periodically persisting the Kademlia buckets (all PeerIds and associated multiaddrs) to disk should be enough. As @dhuseby said, you must have at least 1 bootstrapper, so on restart try to connect to all of these peers (irrespective of the actual latest connection), and if you can connect to at least 1 (that isn't a hardcoded bootstrapper), it's a win.

On restart, it is also possible to load the buckets as they were during the last snapshot and let them be refreshed/purged automatically. This means that the routing table will contain unreachable peers for a while, which is undesirable. Restarting by loading the last bucket snapshot may not be the fastest/most efficient way to bootstrap, since many timeouts are to be expected. However, trying to connect to all peers from the previous snapshot and using only the ones responding quickly could be a way to speed up the bootstrap while relying less on hardcoded bootstrappers (see the sketch below).
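A rough sketch of that "dial the snapshot, keep the fast responders" idea, assuming a tokio runtime and a hypothetical `MyBehaviour` with a `kademlia` field; the 10-second deadline is arbitrary:

```rust
use std::time::{Duration, Instant};
use libp2p::{futures::StreamExt, swarm::SwarmEvent, Multiaddr, PeerId, Swarm};

// Placeholder behaviour; substitute your own NetworkBehaviour.
#[derive(libp2p::swarm::NetworkBehaviour)]
struct MyBehaviour {
    kademlia: libp2p::kad::Behaviour<libp2p::kad::store::MemoryStore>,
}

/// Dial every snapshotted peer concurrently and only feed the ones that
/// connect before the deadline back into the Kademlia routing table.
async fn warm_start(
    swarm: &mut Swarm<MyBehaviour>,
    snapshot: Vec<(PeerId, Vec<Multiaddr>)>,
) {
    let deadline = Instant::now() + Duration::from_secs(10);
    for (peer, addrs) in &snapshot {
        for addr in addrs {
            swarm.add_peer_address(*peer, addr.clone());
        }
        let _ = swarm.dial(*peer); // fire-and-forget; failures just time out
    }
    // Stop waiting at the deadline; slow or dead peers never make it in.
    while let Ok(event) =
        tokio::time::timeout_at(deadline.into(), swarm.select_next_some()).await
    {
        if let SwarmEvent::ConnectionEstablished { peer_id, endpoint, .. } = event {
            swarm
                .behaviour_mut()
                .kademlia
                .add_address(&peer_id, endpoint.get_remote_address().clone());
        }
    }
}
```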
-
@elenaf9 I was talking to @achingbrain about how js-libp2p accelerates Kademlia startup. As I understand it, they use a peer store that can be persisted to disk and loaded again. It contains PeerIds, their associated multiaddrs, and a timestamp of the last time there was a connection on each multiaddr.
In general, when you load that state from disk, it adds the peer data to the peer store, where it gets filtered by age and some other details irrelevant to this discussion. After the stale PeerIds and multiaddrs have been expired, the remaining ones are announced to the rest of the protocols. The Kademlia protocol implementation picks those up and adds them as discovered peers, which at that point means they go into the protocol's buckets. Then Kademlia tries to dial/ping all of them to refresh that state.
I was looking at the recently added peer store and I think it can be used to do something similar in rust-libp2p.
On first startup with no persistent state, Kademlia will bootstrap, and each discovered peer will cause a `NewExternalAddrOfPeer` event to be emitted, which the peer store receives and records. If, in my peer behavior event loop, I handle `peer_store::RecordUpdated` events, I can use the `MemoryStore::insert_custom_data` and `MemoryStore::get_custom_data_mut` functions to maintain a mapping between Multiaddr and timestamps, mimicking the timestamping that js-libp2p does.

Then on shutdown, I can use `MemoryStore::record_iter` to get the stored PeerIds with their associated PeerRecord and custom data and serialize them to disk.

When running the peer again, this time with cached peer state, I can do the "expiring" of stale Multiaddrs similar to js-libp2p and then add the survivors to the swarm via `Swarm::add_peer_address`.

The only problem is that the Kademlia implementation in rust-libp2p is opaque about how it would pick up the added peers. Essentially, I'm adding peers into the Swarm's address cache, which I think the Kademlia protocol will use when bootstrapping if it knows of no other peers. I do know Kademlia will use the Swarm cache when trying to replace a disconnected peer in its k-buckets, because it tries to re-dial via PeerId only. However, I'm not 100% certain that the bootstrap process will grab peers from the Swarm cache. I'll need to test that.
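Until that's tested, one hedge is to not rely on the Swarm cache at all: hand the restored addresses to Kademlia directly and then kick off a bootstrap query. A sketch, reusing the hypothetical `MyBehaviour` and snapshot shape from the earlier comments; `add_address` and `bootstrap` are real `kad::Behaviour` methods:

```rust
fn seed_and_bootstrap(
    swarm: &mut libp2p::Swarm<MyBehaviour>,
    snapshot: Vec<(libp2p::PeerId, Vec<libp2p::Multiaddr>)>,
) {
    for (peer, addrs) in snapshot {
        for addr in addrs {
            swarm.add_peer_address(peer, addr.clone()); // Swarm-level cache
            swarm.behaviour_mut().kademlia.add_address(&peer, addr); // k-buckets
        }
    }
    // Errors with NoKnownPeers if nothing made it into the routing table.
    if let Err(e) = swarm.behaviour_mut().kademlia.bootstrap() {
        eprintln!("kademlia bootstrap failed: {e}");
    }
}
```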
Anyway, that's the current best answer on using persistent peer data to potentially accelerate Kademlia startup.