Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix peer dropping #2780

Merged
merged 1 commit into from
Apr 26, 2019
Merged

Fix peer dropping #2780

merged 1 commit into from
Apr 26, 2019

Conversation

hashmap
Copy link
Contributor

@hashmap hashmap commented Apr 26, 2019

It turns out that we drop connection if we fail to process a message
because of chain/store/internal error, eg we have a header already, so
we refuse it and drop the peer.
This pr doesn't forward this error to the peer error channel so the
connection will not be dropped.
With this fix my node doesn't drop healthy peers, some peers close connection of course, hopefully I see that Grin++ nodes (which doesn't have this bug) stay forever so now I have 3 grin++ of 25 peers which says something

It turns out that we drop connection if we fail to process a message
because of chain/store/internal error, eg we have a header already, so
we refuse it and drop the peer.
This pr doesn't forward this error to the peer error channel so the
connection will not be dropped.
@hashmap hashmap added the bug label Apr 26, 2019
@hashmap hashmap added this to the 1.1.0 milestone Apr 26, 2019
@antiochp
Copy link
Member

antiochp commented Apr 26, 2019

😱

Let me take a closer look later on but this is a great discovery.

@hashmap
Copy link
Contributor Author

hashmap commented Apr 26, 2019

This how it looks:

0190426 11:14:28.194 DEBUG grin_p2p::peer - Client 47.101.67.222:13414 connection lost: Chain(Error { inner:
20190426 11:14:28.194 DEBUG grin_p2p::peer - Client 207.180.204.245:3414 connection lost: Chain(Error { inner:
20190426 11:14:28.194 DEBUG grin_p2p::peer - Client 35.182.136.241:3414 connection lost: Chain(Error { inner:
20190426 11:14:28.194 DEBUG grin_p2p::peer - Client 45.118.135.254:3414 connection lost: Chain(Error { inner:
20190426 11:14:28.194 DEBUG grin_p2p::peer - Client 88.202.246.74:3414 connection lost: Chain(Error { inner:
20190426 11:14:28.194 DEBUG grin_p2p::peer - Client 47.88.255.168:3414 connection lost: Chain(Error { inner:
20190426 11:14:28.194 DEBUG grin_p2p::peer - Client 149.202.74.156:3414 connection lost: Chain(Error { inner:
20190426 11:14:28.194 DEBUG grin_p2p::peer - Client 47.101.166.239:13414 connection lost: Chain(Error { inner:
20190426 11:14:28.194 DEBUG grin_p2p::peer - Client 106.12.208.167:3414 connection lost: Chain(Error { inner:
20190426 11:14:28.194 DEBUG grin_p2p::peer - Client 35.156.218.180:3414 connection lost: Chain(Error { inner:
20190426 11:14:28.194 DEBUG grin_p2p::peer - Client 89.20.128.218:3414 connection lost: Chain(Error { inner:
20190426 11:14:28.194 DEBUG grin_p2p::peer - Client 35.201.193.24:3414 connection lost: Chain(Error { inner:
20190426 11:14:28.194 DEBUG grin_p2p::peer - Client 81.196.105.198:3414 connection lost: Chain(Error { inner:
20190426 11:14:28.194 DEBUG grin_p2p::peer - Client 18.191.106.216:3414 connection lost: Chain(Error { inner:

@DavidBurkett
Copy link
Contributor

3 grin++ of 25 peers which says something

I wish it said 12% of the network was running Grin++, but I'll settle for a p2p network that doesn't drop connections left and right. Excellent find @hashmap!

@antiochp
Copy link
Member

antiochp commented Apr 26, 2019

I've checked this out locally to do some testing.

Still need to go over it a bit closer but what happens with "ban-able" errors?
Say a peer sends me a bad header (vs a duplicate header) - where and how does this bubble up?

Edit: So we have 3 different responses for the various handlers - adaptor.header_received() for example.
We return Result<boolean, Error> -

  • true processing was successful
  • false "bad" data and peer should be banned
  • error (which this PR addresses)

Are we sure we still handle the false case for all ban-able "bad" data with these changes?
(Not suggesting that we are not, just that we should confirm this).

Related - This makes me wonder if having a ShouldBanPeerError instead a false response might make this code easier to reason about. And match on that one specifically where necessary.

@hashmap
Copy link
Contributor Author

hashmap commented Apr 26, 2019

@antiochp yes, we handle it, eg https://github.com/mimblewimble/grin/blob/master/p2p/src/peers.rs#L557

Not sure, I think it's easier to miss this error kind somewhere, but need a second thought

Copy link
Member

@antiochp antiochp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 🚀

Copy link

@chisa0a chisa0a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixes peer churn somewhat, and Grin++ peers are stable. Awesome work!

@hashmap hashmap merged commit 304ae44 into mimblewimble:master Apr 26, 2019
@antiochp antiochp added the release notes To be included in release notes (of relevant milestone). label Jun 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug release notes To be included in release notes (of relevant milestone).
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants