-
Notifications
You must be signed in to change notification settings - Fork 991
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Peer connection drop #2726
Comments
@EthanHeilman is probably a good person to look at this. |
Yes, it'd be great to get an expert opinion on this. My thoughts are:
I put some thought into it, and the best approach I can come up with to solve our current peer health problems, without sacrificing network security, would be to:
Obviously, this doesn't have to all be implemented at once, but I'd prefer to work toward a system similar to this. I'd love to hear everyone's thoughts on this. |
When
IIRC, Peer request ( How about enhance the "Ping / Pong" ( or "Hand / Shake" ? I don't remember well) message? and clearly present the "Peer Request" on the "Hand" (for example) message if needed, means if the remote peer is full of connections, it should at least send back a peer list before refusing/dropping the connection request.
Does that mean the new connections/nodes will be more unstable? ( new connection has bigger chance to be dropped) |
Only if it doesn't know any other peers, we have a queue of peers to connect, so we proceed with the next in this queue. So I don't think we should address it connection request code.
Yeah, here is a problem that we don't know anything about this peer and this peer may not have time to request a peer list
I'm not sure, not a big fan of a such design
that's what #2725 introduces
Vice versa, older connections would be dropped first (first in - first out) |
Just want to reiterate that FIFO doesn't solve any of the issues that we're having right now. |
FIFO would work against any kind of network stability right? We'd drop older long-lived connection over newer connections. |
Ideally we'd want (I think) a mix of old and new connections. But we went with the simplest approach - all nodes are equal, all connections are equal. Ideally we'd keep this as simple as possible (and not too simple) without introducing too much differentiation/precedence/complexity, while at the same time maintaining a healthy network. |
@antiochp Yes, FIFO would make the least stable network. I think my proposal[1] handles most of those concerns, while providing a very stable network (minimal peer dropping). Do you disagree? [1] #2726 (comment) |
Long term connection peers (normally that means more stable peer) will be dropped first? always favor new connections? |
My node log is full of the following errors and I wanted to confirm it's related to what is being discussed in this issue. If not, I can open a separate one. |
@Kargakis Yes, that's the same issue. |
In Bitcoin you keep a banscore for each of your peers which you increase if they do stuff like:
etc. If the peer reaches a threshold you ban him. When you need to evict a node (i.e when you are over capacity and receive a connection) you evict the peer with the highest banscore first |
@mcdallas I like that idea for long-term, but something like that would require a lot of work and even more testing. If done incorrectly, we could end up kicking slow peers off the network, or peers with older versions, etc. |
The banscore stuff in Bitcoin sucks and isn't actually really used -- if Bitcoin Core could get rid of it easily, we would. In general it's a binary thing at this point, modulo one or two use cases. Cases where banscore was incremented are being removed (e.g., no banning for any PoW valid block served). I don't think we want to emulate that here. What is good (besides confusing terminology) is the notion of outbound vs inbound peers. I think thinking about this in terms of stable marriage terminology helps more: there are nodes that are suitors and there are nodes that you would like to be a suitor too (the node being suited can be called the 'wooed'). You place much lower value on your suitors than on the nodes that you want to woo. You should also be suiting those nodes, irrespective of the number of suitors you have. Otherwise it's easy to sybil your node. The nodes that your suiting should either be ones that you configure, ones you've been connected to for a long time without fault, or that have provided you with high value information frequently (e.g., they propagate blocks to you you hadn't seen). Thus you can upgrade a suitor to a 'match' and add them to your woo-list if you're getting high quality information from them. From your woo-list, you should select a fixed number (maybe let's say 8) to peer with. You can rotate these around. If a node on your woo-list connects to you as a suitor, you will let them stay on. From your suitor list, you can treat that as a FIFO or whatever. You can also limit the bandwidth from those nodes more heavily as they are less important to you. Perhaps put heavier delays on propagating them blocks or transactions, or don't consider them for dandelion relay at all (probably a good idea!). Treating all nodes equally sounds nice -- but in practice it makes the network easy to attack. Known good peers are very valuable! Something like the above preserves that structure. I also don't think we should expose the peer list necessarily. Maybe expose your suitors but not your woo-list. It seems a bit sensitive given that this helps someone map the network. @EthanHeilman should eventually chime in here (I'll ping him) about addressing Eclipse attacks under this -- pinging random IP addresses helps iirc. |
@DavidBurkett do you think it's necessary for non-seeder nodes to share their peer list? Maybe share a part of it? Or ask an existing peer's list and send that back to the original request instead? Overall, I find your proposal reasonable. 👍 |
Yes, this is in line with the recommendations of the Dandelion++ paper. Section 4.3 explicitly touches on this topic and outlines an attack where an adversary is lying about how many outbound edges they create and could even connect to all honest nodes on the network this way. They mitigate against this by forcing a node Alice to only forward transactions via its outbound edges, ie to nodes Alice has chosen, rather than to nodes that have chosen Alice:
Inbound edges, i.e. from suitors |
We already keep track of inbound and outbound peers and have a fixed preferred quota of outbounds. This seems like a good thing. Beyond this, I think we're going to have a hard time finding a solution that's not gameable and helps significantly. Question is: are we trying to bend over backward to compensate for a lack of open ports on the network? |
@ignopeverell lack of open slot could explain that but 2 facts are worth further check. I rarely see peers connected for more than 30 secs. There is a trend here (peeks are restarts) https://grin.report/d/jeTROiqmz/peers?orgId=1&fullscreen&panelId=2&from=now-30d&to=now&refresh=1m |
Thanks @JeremyRubin for letting me know about this thread. I am ignorant of how Grin currently manages network connections, but I can provide a high level explanation of how Bitcoin does it which may be helpful. In Bitcoin connections are divided into two groups: incoming connections and outgoing connections. Currently the default configuration for a Bitcoin node is:
The reason that there are so many more slots reserved for incoming connections is that not every client can accept incoming connections (clients behind a NAT, firewall etc). We wrote a paper examining Bitcoin's P2P network and how gamable it was in 2015-2016. This work includes a range of countermeasures to make cryptocurrency P2P networks more robust. I have a project page with a link to paper and the PR's we merged into Bitcoin to strengthen BItcoin against these attacks. I'm happy to discuss this further if anyone is interested but it would be very helpful to have a short summary of how Grin currently manages its P2P network. |
Thank you very much for passing by @EthanHeilman. Currently Grin does the following:
Ii looks like our main gap when compared to bitcoin is the low number of connections we allow overall, and the absence of restrictions on outgoing connections. |
Thanks for providing some insight into how Grin works.
Personally I think 25 connections max is probably not enough, although I believe ETH does get by with 25 connections. Increasing this number will probably get you some breathing room with regard to incoming connection slot exhaustion.
Bitcoin always tries to use all outgoing connection slots so BTC will always attempt to make an outgoing connection until they have 8 outgoing connections. I'm not sure what the magic number here is although 8 seems to work. Increasing outgoing connections Pros:
Increasing outgoing connections Cons:
I'm not sure it is still the case but blockchain surveillance/analysis companies used to attempt to connect to every node which allowed incoming connection slots in Bitcoin for the purposes of learning when a transaction was broadcasted and who broadcasted it. This doesn't work so well as a surveillance method in modern Bitcoin, so I'm not sure if they still do it. This behavior takes up 1 incoming connection slot on each node which if the number of incoming connection slots were low would be very problematic. @ignopeverell How does Grin do peer seeding? Are you using DNS seeders? |
See the Erlay paper for additional discussion on how Bitcoin network connectivity works today: https://arxiv.org/abs/1905.10518 |
It's not an issue per se, it was a design decision, however it may be worth to discuss one more time.
Grin node periodically kills the excess of connections https://github.com/mimblewimble/grin/blob/master/p2p/src/peers.rs#L463 which is understandable but annoying for counterparts. From my experience a node connects and almost immediately get dropped. There are couple ideas how to improve it:
Do we have some other ideas?
The text was updated successfully, but these errors were encountered: