Add provider record addresses to peerstore #870

dennis-tra · 2023-08-22T11:59:28Z

fixes #868

guillaumemichel

LGTM

Jorropo · 2023-08-23T16:39:30Z

What is the point of this?
Identify already takes care of adding the addresses of peers we are connected to.

I mean, the fact that we are even passing the addresses in the add endpoint seems wrong, and I doubt we do.
If we do, this is a bug: we shouldn't send all of our addresses multiplied by all of our CIDs, that is a wasteful use of bandwidth.

dennis-tra · 2023-08-23T20:49:58Z

The idea is to avoid the second lookup for network addresses after you've found the provider record.

The routed host will still do the lookup for new addresses if it cannot connect to the peer.

Up until now we relied on identify to keep the addresses around. IIRC @cortze's RFM showed improvement potential if we kept them around longer.

I don't think at all that this is a bug or anything. This is just caching.

Jorropo · 2023-08-23T21:06:04Z

@dennis-tra what does the add provider endpoint have to do with that?

When I add a provider the remote peer already knows my addresses since I'm connected to it.
You could use host.Peerstore() to fetch them instead, and be sure the client does not send them over the wire.
But also, generally you don't need to store addresses on each provider record. You should have two tables, a CID → [PeerID] one and a PeerID → [Address] one, in order to avoid storing lots of duplicated data.

Jorropo · 2023-08-23T21:17:30Z

OK, so the provider store does indeed not store the addresses:

go-libp2p-kad-dht/providers/providers_manager.go

Line 255 in 2cbe38a

pm.pstore.AddAddrs(provInfo.ID, provInfo.Addrs, ProviderAddrTTL)

I did check whether addresses are sent over the wire, and yes, some are:

pi.Addrs: 7
pi.Addrs: 7
pi.Addrs: 7
pi.Addrs: 7
pi.Addrs: 7
pi.Addrs: 7
pi.Addrs: 7
pi.Addrs: 8
pi.Addrs: 7
pi.Addrs: 16
pi.Addrs: 16
pi.Addrs: 16
pi.Addrs: 14
pi.Addrs: 7
pi.Addrs: 11
pi.Addrs: 5
pi.Addrs: 15
pi.Addrs: 12
pi.Addrs: 8
pi.Addrs: 1
pi.Addrs: 7
pi.Addrs: 7
pi.Addrs: 7
pi.Addrs: 7
pi.Addrs: 12
pi.Addrs: 12
pi.Addrs: 14
pi.Addrs: 21
pi.Addrs: 8
pi.Addrs: 26
pi.Addrs: 9
pi.Addrs: 17
pi.Addrs: 12
pi.Addrs: 1
pi.Addrs: 11
pi.Addrs: 4
pi.Addrs: 18
pi.Addrs: 8

However, this code is at least buggy because it fails to run the addresses through the address filter (using maybeAddAddrs).
That means it saves all the private LAN addresses of the nodes doing PUTs in our peerstore, which we want to stop doing because of the smart dialer.
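
To make the filtering concern concrete, here is a minimal sketch of an address filter that drops private, loopback and link-local addresses before they are cached. It is not the maybeAddAddrs helper mentioned above: `isPublicAddr` and `filterPublic` are hypothetical names, and the sketch works on textual multiaddrs with the standard library only, whereas the real code operates on multiaddr values.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// isPublicAddr reports whether a textual multiaddr such as
// "/ip4/203.0.113.7/tcp/4001" starts with an IP that is not private,
// loopback, link-local or unspecified.
// Hypothetical helper: it only understands /ip4 and /ip6 prefixes and
// rejects everything else (for example /dns4).
func isPublicAddr(maddr string) bool {
	parts := strings.Split(maddr, "/")
	// parts[0] is empty because the string starts with "/".
	if len(parts) < 3 || (parts[1] != "ip4" && parts[1] != "ip6") {
		return false
	}
	ip, err := netip.ParseAddr(parts[2])
	if err != nil {
		return false
	}
	return !(ip.IsPrivate() || ip.IsLoopback() || ip.IsLinkLocalUnicast() || ip.IsUnspecified())
}

// filterPublic returns only the addresses that pass isPublicAddr.
func filterPublic(addrs []string) []string {
	var out []string
	for _, a := range addrs {
		if isPublicAddr(a) {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	addrs := []string{
		"/ip4/192.168.1.5/tcp/4001",         // private LAN
		"/ip4/127.0.0.1/tcp/4001",           // loopback
		"/ip4/203.0.113.7/udp/4001/quic-v1", // kept
		"/ip6/fe80::1/tcp/4001",             // link-local
	}
	fmt.Println(filterPublic(addrs))
}
```

Running the addresses from an ADD_PROVIDER message through a filter of this kind before calling AddAddrs would keep LAN addresses out of the peerstore.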

dennis-tra · 2023-08-23T21:38:06Z

> what does the add provider endpoint have to do with that?

The add provider endpoint explicitly allows the client to associate addresses with the record.

> When I add a provider the remote peer already knows my addresses since I'm connected to it.

I agree that, as things currently are, it probably doesn’t matter if we take the addresses from the identify exchange or from the wire message. They probably contain the same set of addresses.

That’s not the point though. The thing that we want to address here is the TTL associated with the multiaddresses of that peer. However, I still think it’s better to take the explicitly passed multiaddresses from the request.

> You could use host.Peerstore() to fetch them instead, and be sure the client does not send them over the wire.
> But also, generally you don't need to store addresses on each provider record. You should have two tables, a CID → [PeerID] one and a PeerID → [Address] one, in order to avoid storing lots of duplicated data.

Not sure if I’m missing something but that is what we’re doing isn’t it? We’re storing the multihash -> PeerID mapping in a datastore and another mapping from PeerID to multiaddresses in the peerstore. We’re not storing any duplicate multiaddresses.

When a peer requests a record, we look up the peer IDs in the datastore and then the multiaddresses in the peerstore. We serve both pieces of information to the client. However, we stop serving multiaddresses after 30 min because the TTL is not properly updated. We want to continue serving the multiaddresses for much longer. This PR attempted to fix that.

Jorropo · 2023-08-23T22:13:03Z

> Not sure if I’m missing something but that is what we’re doing isn’t it? We’re storing the multihash -> PeerID mapping in a datastore and another mapping from PeerID to multiaddresses in the peerstore. We’re not storing any duplicate multiaddresses.

Yeah, I wrote that before tracing the code. Still, over the wire we send these duplicates, which is an inefficient use of resources.

> I agree that, as things currently are, it probably doesn’t matter if we take the addresses from the identify exchange or from the wire message. They probably contain the same set of addresses.
>
> That’s not the point though. The thing that we want to address here is the TTL associated with the multiaddresses of that peer. However, I still think it’s better to take the explicitly passed multiaddresses from the request.

We don't need to send the addresses over kad-dht to update the TTL.

IMO this fix just makes things worse because it incentivizes clients to do the wrong thing and waste bandwidth. Before, it was a bug and we did nothing with these addresses; now you have turned it into a feature ✨.
You could call https://pkg.go.dev/github.com/libp2p/go-libp2p/core/peerstore#AddrBook.UpdateAddrs. Sadly it doesn't have a wildcard update; I think you could submit a PR to libp2p that adds some magic well-known TTL, like 0, that matches all addresses already stored.
If you don't want to submit code to go-libp2p you could do:

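// Pseudocode-ish
// Fetch the addresses we already know for p (for example from identify).
addrs := dht.pstore.Addrs(p)
// Apply the address filter, if one is configured.
if f := dht.addrsFilter; f != nil {
	addrs = f(addrs)
}
// Re-add the filtered addresses with the longer provider TTL.
dht.pstore.AddAddrs(p, addrs, dht.providerTTL)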

They both achieve the same thing, except that this allows us to stop sending maddrs alongside add provider.

Edit: the pseudocode that fetches the addresses, runs them through the filter and adds them back in is better IMO, because it only extends the cache lifetime of the addresses we want to cache.

Jorropo · 2023-08-23T22:25:26Z

if len(pi.Addrs) < 1 {
	logger.Debugw("no valid addresses for provider", "from", p)
	continue
}

Actually, this is on purpose 🤦, forget what I said.
We should eventually do a protocol upgrade with a versioned stream handler so we can skip sending the addresses. That would cut background traffic severalfold.

Edit: new issue here #871

add provider record addresses to peerstore

777160f

fixes issue #868

dennis-tra requested review from guillaumemichel and a team as code owners August 22, 2023 11:59

fixing base32 import

b564493

guillaumemichel approved these changes Aug 22, 2023

guillaumemichel merged commit 2cbe38a into master Aug 22, 2023
16 checks passed

guillaumemichel deleted the issue-868 branch August 22, 2023 13:33

Jorropo mentioned this pull request Aug 23, 2023

fix: correctly apply addrFilters in the dht #872

Merged

dennis-tra added the v2 All issues related to the v2 rewrite label Sep 19, 2023
