[GossipSub 1.2] IDONTWANT control message #548
That seems very reasonable; I like @AgeManning any thoughts? |
Actually, this might make some sense to implement with However, this will conflate with heartbeat-generated IHAVEs. We can also apply some positive scoring to this action (if it pertains to messages we have seen or see shortly after). Another consideration: if we are already sending the message to the peer, would that be useful? PS: |
pubsub/gossipsub/gossipsub-v1.2.md
Outdated
| Parameter               | Description                                                      | Reasonable Default |
|-------------------------|------------------------------------------------------------------|--------------------|
| `max_dontsend_messages` | The maximum number of `DONTSEND` messages per heartbeat per peer | ???                |
Is that necessary? We can just downscore a peer if it sends duplicate DONTSENDs, or if we don't eventually see the message id
GossipSub already has too many parameters
> or we don't eventually see the message id

Sounds like a good idea 👍
> or we don't eventually see the message id

Considering `DONTSEND` is allowed to be sent before validation, a peer can be downscored if a message for which `DONTSEND` has been sent turns out to be invalid
Do we have numbers on how much we can gain by allowing `DONTSEND`s to be sent before validation? (i.e., how long validation takes in practice)
I think we can avoid the downscoring by keeping a bounded cache, that gets overfilled to /dev/null.
We probably don't need a parameter for this; each peer can configure it appropriately according to the expected message rate.
If we do want to downscore excessive rates of IDONTWANT, then we should validate first or else we open the door for a spam attack.
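As a rough illustration of the "bounded cache that overflows to /dev/null" idea, here is a minimal Python sketch. The class name `BoundedIdCache` and its methods are hypothetical and do not come from any libp2p implementation; the only behaviour taken from the comment above is that entries beyond the configured capacity are silently discarded rather than penalised.

```python
class BoundedIdCache:
    """Fixed-capacity set of message IDs; new IDs are dropped once it is full."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._ids = set()

    def add(self, message_id: bytes) -> bool:
        """Record an ID. Returns False when the ID was discarded."""
        if message_id in self._ids:
            return True
        if len(self._ids) >= self.capacity:
            # Overflow goes to "/dev/null": no error, no score penalty.
            return False
        self._ids.add(message_id)
        return True

    def __contains__(self, message_id: bytes) -> bool:
        return message_id in self._ids

    def clear(self) -> None:
        """Hypothetical hook for whatever expiry policy the node uses."""
        self._ids.clear()
```

With this shape, a peer that floods IDONTWANTs only wastes its own bandwidth: the receiver's memory use stays bounded by `capacity`, which each node can size to its expected message rate.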
Note that validation in some networks can be slow, so there is real benefit by sending early.
Thinking a bit more about this, I am not entirely convinced we need any of this. We can simply start keeping track of messages we have received from a mesh peer recently; if they have sent us a message, we should suppress it. Otherwise we just send the message and the IDONTWANT is implicit/redundant.
@vyzo the DONTSEND becomes useful when the messages get big enough that it actually takes time to transmit them, i.e., a 0.5mb message over a 25mbps line will take 160ms to be sent to 8 peers, so sending a small DONTSEND before sending the full message gives us 160 ms more to avoid duplicates.
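The arithmetic behind that estimate can be checked with a small Python sketch. The helper name `send_time_ms` is made up for illustration, and the model is deliberately crude: one uplink shared serially by all copies, no protocol overhead. Note that the unit "0.5mb" is ambiguous in the comment; the 160 ms figure matches a reading of 0.5 megabits sent to 8 peers, while 0.5 megabytes would take eight times longer.

```python
def send_time_ms(message_bits: float, link_bits_per_s: float, peers: int) -> float:
    """Time to push one copy of the message to each peer over a shared uplink."""
    return message_bits * peers / link_bits_per_s * 1000.0


LINK_BITS_PER_S = 25e6  # 25 Mbps, as in the comment

# Reading "0.5mb" as 0.5 megabits: roughly 160 ms for 8 peers.
megabit_case_ms = send_time_ms(0.5e6, LINK_BITS_PER_S, peers=8)

# Reading "0.5mb" as 0.5 megabytes (8 bits per byte): roughly 1280 ms for 8 peers.
megabyte_case_ms = send_time_ms(0.5e6 * 8, LINK_BITS_PER_S, peers=8)
```

Either way, the window in which an early notification can prevent a duplicate grows linearly with message size and fan-out, which is the point being made.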
Ok, fair enough. |
Yes, we discussed that option, and came to agreement that those messages have slightly different semantics which might make the difference under some conditions. For example a peer may send
Yes I'm voting for
Yes, the basic strategy would be to broadcast corresponding |
pubsub/gossipsub/gossipsub-v1.2.md
Outdated
}

message ControlDontSend {
    required bytes messageID = 1;
Just for discussion, what would be the pros/cons of also including the topic?
I think it's the first time that we reference messages only by their id on the wire, so needs some consideration
I guess the only con of including the topic is more bandwidth usage
maybe an optional field?
Agreed about bandwidth usage, we should aim to keep this lean.
`IHAVE` also has an optional `topic` field. But it looks like it is utilized by one implementation only (not sure which one exactly). I couldn't find any reasonable usage of topic for `IDONTWANT` tbh.
If no one is opposed or has any ideas on usage scenarios I would keep it without optional topic field to be more explicit
> If no one is opposed or has any ideas on usage scenarios I would keep it without optional topic field to be more explicit

The usage can be #548 (comment)
for context, here are two scenarios we're addressing with this upgrade, for large messages in particular: |
@vyzo Can I also join the episub group? FYI, I am the network person of the EF research and have been working with @Nashatyrev, @Menduist, and others. |
Of course! Reach out on telegram to add you. |
Let's target this to #560 |
# Conflicts: # pubsub/gossipsub/gossipsub-v1.2.md
Looks good to me! |
Just a small comment, otherwise lgtm.
When the peer receives the first message instance it immediately broadcasts
(not queue for later piggybacking) `IDONTWANT` with the `messageId` to all its mesh peers.
This could be performed prior to the message validation to further increase the effectiveness of the approach.
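To make the ordering in that spec excerpt concrete, here is a minimal Python sketch. Everything in it is hypothetical: `EagerIDontWantRouter`, the `send_control` callback, and the `validate` callback are illustrative names, not part of any libp2p API. The only behaviour taken from the excerpt is that the `IDONTWANT` goes out on the first sighting of a message, immediately, and before validation runs.

```python
from typing import Callable, Dict, Set


class EagerIDontWantRouter:
    """Hypothetical router that announces IDONTWANT before validating a message."""

    def __init__(
        self,
        mesh: Dict[str, Set[str]],                   # topic -> mesh peer IDs
        send_control: Callable[[str, bytes], None],  # (peer, message_id), flushed immediately
        validate: Callable[[bytes], bool],           # possibly slow application validation
    ) -> None:
        self.mesh = mesh
        self.send_control = send_control
        self.validate = validate
        self.seen: Set[bytes] = set()

    def on_message(self, topic: str, message_id: bytes, payload: bytes) -> bool:
        """Returns True when the message is new and passes validation."""
        if message_id in self.seen:
            return False  # duplicate: nothing to announce
        self.seen.add(message_id)
        # First instance: notify every mesh peer of the topic right away,
        # without waiting for the next heartbeat or for validation.
        for peer in sorted(self.mesh.get(topic, set())):
            self.send_control(peer, message_id)
        return self.validate(payload)
```

Sending before validation trades a small risk (announcing an ID that later proves invalid) for a longer head start on suppressing duplicates, which is the trade-off debated in this thread.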
Concerns about spam attacks triggering amplified IDONTWANT spam?
This doesn't look like a feasible attack vector to me:
- `IDONTWANT` is primarily intended for larger messages, so the cumulative size of resulting `IDONTWANT` messages is expected to be significantly smaller than the original message
- If an attacker is sending invalid messages to initiate `IDONTWANT` spamming it would be pretty quickly banned due to negative scoring
- And we have the `max_idontwant_messages` limit as the last resort
…VE and IWANT messages Co-authored-by: Pop Chunhapanya <[email protected]>
| Parameter                | Description                                                       | Reasonable Default |
|--------------------------|-------------------------------------------------------------------|--------------------|
| `max_idontwant_messages` | The maximum number of `IDONTWANT` messages per heartbeat per peer | ???                |
I guess this probably needs to be per-topic per heartbeat, rather than a total per heartbeat.
It seems it could be tied in with the scoring for mesh message delivery rate, i.e., the more messages we are expecting per topic, the more IDONTWANT messages we would expect to receive.
One thought would be to add a behaviour penalty, similar to broken promises, if the number of IDONTWANT messages received from a peer exceeds the mesh message delivery rate.
We intend to implement this fairly soon. Perhaps we can leave the scoring penalty here for a future PR if we don't want to specify it now.
Tracking `IDONTWANT` by topics would be perfect, however you don't know a message topic by its ID unless you already received this message.
An `IDONTWANT` message is almost semantically equivalent to an `IHAVE` message, so it should probably have a similar anti-spam protection mechanism?
Probably it could make sense to set the 'reasonable default' to `maxIHaveLength * maxIHaveMessages`?
> however you don't know a message topic by its ID unless you already received this message.

Is it a good reason to include topics to `IDONTWANT` and not care what `IHAVE` and `IWANT` are doing?
> Is it a good reason to include topics to IDONTWANT and not care what IHAVE and IWANT are doing?

Including a topic would increase `IDONTWANT` traffic 2-3 times (messageID is 20 bytes, topic could be up to around 40 bytes). Not sure if it's worth it just to enable more advanced spam protection...
Please also consider that `IHAVE` is sent on heartbeat with a batch of messages which may be grouped by topic, so the topic overhead there may not be that significant.
`IDONTWANT` on the other hand is intended to be flushed immediately and would most likely contain just a single messageID, so the topic overhead would be more significant
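The amortisation argument can be made concrete with a small Python sketch using the sizes quoted in this thread (a 20-byte messageID and a topic of up to around 40 bytes). The helper `bytes_per_id` is a made-up name, the batch size of 50 is purely illustrative, and protobuf framing overhead is ignored.

```python
def bytes_per_id(id_len: int, topic_len: int, ids_per_message: int) -> float:
    """Average payload bytes per announced ID when one topic string covers the batch."""
    return (ids_per_message * id_len + topic_len) / ids_per_message


ID_LEN = 20     # messageID size quoted in the thread
TOPIC_LEN = 40  # upper end of the topic size quoted in the thread

# IDONTWANT-style: flushed immediately, typically a single ID per control message.
single_id_cost = bytes_per_id(ID_LEN, TOPIC_LEN, ids_per_message=1)

# IHAVE-style: batched on heartbeat; 50 IDs per topic is an illustrative figure.
batched_cost = bytes_per_id(ID_LEN, TOPIC_LEN, ids_per_message=50)
```

With a single ID the topic triples the cost per ID (60 bytes instead of 20), whereas in a batch of 50 it adds under one byte per ID, which is why the same optional field is cheap for `IHAVE` and expensive for `IDONTWANT`.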
I suspect it might be valuable to specify it - ie effectively, after mcache expires, the message should no longer exist and implementations should be able to rely on them being "resent" if they resurface after that time - this more faithfully keeps the protocol consistent in this aspect
> I don't see why using a weak hash function for unwanted is a problem.

@ppopth I was meaning this HashDoS attack.
The messageIDs in `IDONTWANT` are not validated: neither upon receipt nor later (as for `IHAVE`). Thus an adversary may generate and send a significant number of messageIDs which yield the same hashCode in the context of the `unwanted` hash set. That would result in a DoS on the receiver's side.
Actually this attack vector could be addressed on the implementation side pretty easily. I just wanted to add a 'warning note' for implementers into the spec
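One possible implementation-side mitigation, sketched here in Python as an assumption rather than anything the spec prescribes, is to store a keyed hash of each messageID under a secret chosen at startup, so an adversary cannot precompute IDs that collide in the receiver's table. The class name `KeyedIdSet` is hypothetical; `hashlib.blake2b` with its `key` and `digest_size` parameters is part of the Python standard library.

```python
import hashlib
import os


class KeyedIdSet:
    """Set of message IDs stored as keyed BLAKE2b digests (illustrative sketch)."""

    def __init__(self) -> None:
        # Per-process secret: without it, an attacker cannot predict digests.
        self._key = os.urandom(16)
        self._digests = set()

    def _digest(self, message_id: bytes) -> bytes:
        # Fixed 16-byte output also bounds memory per entry, whatever the ID length.
        return hashlib.blake2b(message_id, digest_size=16, key=self._key).digest()

    def add(self, message_id: bytes) -> None:
        self._digests.add(self._digest(message_id))

    def __contains__(self, message_id: bytes) -> bool:
        return self._digest(message_id) in self._digests
```

CPython already randomises the hash of `bytes` objects per process by default, so the sketch matters mostly as a pattern for runtimes whose default hash code is deterministic and attacker-predictable.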
@Nashatyrev Got it. Thank you so much.
> not sure it's been mentioned, but because the message id:s are not validated

@arnetheduck Good point!
I believe that since the `messageId` function is application specific, the validation of a `messageId` is left as an application responsibility. I believe Ethereum clients should all have such validation
> I believe Ethereum clients should all have such validation

I'm not convinced this is the case, ie at least in nimbus, we don't validate message id:s (only messages). There's a custom message id generation feature, but this again does not validate message id:s themselves.
This PR represents the first time we receive message id:s that we're expected to store / keep track of - all others are either generated from actual messages or ephemeral.
Co-authored-by: João Oliveira <[email protected]>
Hey all. We are very interested in testing this. What are our thoughts on merging this? We can do a future PR to handle scoring, or leave it up to implementors to handle their own DoS prevention strategies. We want to start testing on live networks using the 1.2 protocol id. My concern is that if we do this without this PR being merged and someone decides to change the protobuf, we will then have polluted the protocol-id space with an incompatible version. We will go with a |
fwiw, nimbus/nim-libp2p has implemented this as an extension to 1.1: https://github.com/vacp2p/nim-libp2p/blob/2b5319622c997ce1c80bc62c863e30f3349ee0d7/libp2p/protocols/pubsub/gossipsub/behavior.nim#L266 - it would indeed be nice to get this merged and properly give it a version number |
Soooo, let's merge it?
Any objections?
lgtm, unless someone wants to edit in the various security recommendations
## GossipSub v1.2 implementation

Specification: libp2p/specs#548

### Work Summary

Sending IDONTWANT
- Implement a smart queue
- Add priorities to the smart queue
- Put IDONTWANT packets into the smart priority queue as soon as the node gets the packets

Handling IDONTWANT
- Use a map to remember the message ids whose IDONTWANT packets have been received
- Implement max_idontwant_messages (ignore the IDONTWANT packets if the max is reached)
- Clear the message IDs from the cache after 3 heartbeats
- Hash the message IDs before putting them into the cache.

More requested features
- Add a feature test to not send IDONTWANT if the other side doesn't support it

### Commit Summary

* Replace sending channel with the smart rpcQueue

  Since we want to implement a priority queue later, we need to replace the normal sending channels with the new smart structures first.
* Implement UrgentPush in the smart rpcQueue

  UrgentPush allows you to push an rpc packet to the front of the queue so that it will be popped out fast.
* Add IDONTWANT to rpc.proto and trace.proto
* Send IDONTWANT right before validation step

  Most importantly, this commit adds a new method called PreValidation to the interface PubSubRouter, which will be called right before validating the gossipsub message. In GossipSubRouter, PreValidation will send the IDONTWANT control messages to all the mesh peers of the topics of the received messages.
* Test GossipSub IDONWANT sending
* Send IDONWANT only for large messages
* Handle IDONTWANT control messages

  When receiving IDONTWANTs, the host should remember the message ids contained in IDONTWANTs using a hash map. When receiving messages with those ids, it shouldn't forward them to the peers who already sent the IDONTWANTs. When the maximum number of IDONTWANTs is reached for any particular peer, the host should ignore any excessive IDONTWANTs from that peer.
* Clear expired message IDs from the IDONTWANT cache

  If the message IDs received from IDONTWANTs are older than 3 heartbeats, they should be removed from the IDONTWANT cache.
* Keep the hashes of IDONTWANT message ids instead

  Rather than keeping the raw message ids, keep their hashes instead to save memory and protect against memory DoS attacks.
* Increase GossipSubMaxIHaveMessages to 1000
* fixup! Clear expired message IDs from the IDONTWANT cache
* Not send IDONTWANT if the receiver doesn't support
* fixup! Replace sending channel with the smart rpcQueue
* Not use pointers in rpcQueue
* Simply rcpQueue by using only one mutex
* Check ctx error in rpc sending worker

  Co-authored-by: Steven Allen <[email protected]>
* fixup! Simply rcpQueue by using only one mutex
* fixup! Keep the hashes of IDONTWANT message ids instead
* Use AfterFunc instead implementing our own
* Fix misc lint errors
* fixup! Fix misc lint errors
* Revert "Increase GossipSubMaxIHaveMessages to 1000"

  This reverts commit 6fabcdd.
* Increase GossipSubMaxIDontWantMessages to 1000
* fixup! Handle IDONTWANT control messages
* Skip TestGossipsubConnTagMessageDeliveries
* Skip FuzzAppendOrMergeRPC
* Revert "Skip FuzzAppendOrMergeRPC"

  This reverts commit f141e13.
* fixup! Send IDONWANT only for large messages
* fixup! fixup! Keep the hashes of IDONTWANT message ids instead
* fixup! Implement UrgentPush in the smart rpcQueue
* fixup! Use AfterFunc instead implementing our own

---------

Co-authored-by: Steven Allen <[email protected]>
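The receive-side handling described in this commit message (remember IDs per peer, ignore IDONTWANTs beyond a per-peer maximum, expire entries after 3 heartbeats, and skip forwarding to peers that announced an ID) can be sketched in Python as follows. This is not the Go implementation: `IDontWantTracker` and its methods are hypothetical names, and resetting the per-peer counter on every heartbeat is an assumption based on the "per heartbeat per peer" wording of `max_idontwant_messages`.

```python
from collections import defaultdict
from typing import Dict, Iterable


class IDontWantTracker:
    """Per-peer record of unwanted message IDs with heartbeat-based expiry."""

    def __init__(self, max_messages: int = 1000, ttl_heartbeats: int = 3) -> None:
        self.max_messages = max_messages
        self.ttl_heartbeats = ttl_heartbeats
        self.heartbeat_no = 0
        # peer -> message ID -> heartbeat number at which it was recorded
        self.unwanted: Dict[str, Dict[bytes, int]] = defaultdict(dict)
        # peer -> IDONTWANT control messages accepted during the current heartbeat
        self.received: Dict[str, int] = defaultdict(int)

    def handle_idontwant(self, peer: str, message_ids: Iterable[bytes]) -> bool:
        """Record the IDs; returns False when the peer is over its limit."""
        if self.received[peer] >= self.max_messages:
            return False  # excessive IDONTWANTs are ignored
        self.received[peer] += 1
        for message_id in message_ids:
            self.unwanted[peer][message_id] = self.heartbeat_no
        return True

    def should_forward(self, peer: str, message_id: bytes) -> bool:
        """False when the peer has told us it does not want this message."""
        return message_id not in self.unwanted.get(peer, {})

    def heartbeat(self) -> None:
        """Advance time, reset per-heartbeat counters, drop expired IDs."""
        self.heartbeat_no += 1
        self.received.clear()
        for peer in list(self.unwanted):
            ids = self.unwanted[peer]
            expired = [m for m, t in ids.items()
                       if self.heartbeat_no - t >= self.ttl_heartbeats]
            for message_id in expired:
                del ids[message_id]
            if not ids:
                del self.unwanted[peer]
```

A production version would likely store hashed IDs instead of raw ones, as the commit summary notes, to bound memory per entry.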
The go implementation in libp2p/go-libp2p-pubsub#553 has merged. |
This PR introduces a new GossipSub version `1.2`
Co-authored with @Menduist
New messages
The new `IDONTWANT` control message is added which notifies mesh peers to suspend sending back the full publish message based on its `messageId`
Simulation results
Various simulations demonstrated ~30% traffic reduction and ~3-5% message delivery latency reduction
choke
there) : https://hackmd.io/X1DoBHtYTtuGqYg0qK4zJw#4---BBR
Related work
Relates to #413