fix: Fix race condition with dandelion_relay peer map and make more semantic #2548
Conversation
This is a nice bit of cleanup. 👍
I think the map was there because originally we were considering nodes with multiple relay peers (as discussed in the Dandelion paper). But we went with a simplified implementation with a single relay peer.
I realized there was another latent bug in this code -- previously we never re-read the dandelion_relay variable after refreshing it, and I didn't fix that in my original patch.
I'm good on the intent and the implementation, simpler and nicer. Did you get a chance to test it "in the wild" with a few transactions? Pondering whether to target this for 1.0.2 or not.
Nope! What's the best way to do that?
@JeremyRubin just run the branch against Floonet. You can likely mine a Floonet block in a few minutes on your CPU. Then send it around. Sounds doable?
Had a chance to try against Floonet?
Yep!
This might be a little controversial.
Currently, we store the relay as a HashMap from timestamp to peer in the dandelion_relay field of the Peers struct.
However, there does not appear to be any intended code path where this map ever holds more than one peer (the only writer is set_dandelion_relay, which first clears the map).
It is possible, though, for two or more calls to set_dandelion_relay to run concurrently. Assume two calls; the following interleaving can then occur:
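(A hedged sketch of what I mean, with simplified names and types rather than the exact grin code; it assumes the clear and the insert each take the write lock separately, which matches the multiple lock acquisitions the current code needs.)

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

struct Peer; // stand-in for the real p2p Peer type

// Assumed shape of the current pattern: each statement takes and releases the
// write lock on its own, so two concurrent callers can interleave between them.
fn set_dandelion_relay(relay: &RwLock<HashMap<i64, Arc<Peer>>>, now: i64, peer: Arc<Peer>) {
    relay.write().unwrap().clear();           // write lock taken and released here
    relay.write().unwrap().insert(now, peer); // write lock taken again, separately
}

// One possible interleaving of two threads A and B:
//   A: clear()           -> map empty
//   B: clear()           -> map still empty
//   A: insert(t_A, p_A)  -> one entry
//   B: insert(t_B, p_B)  -> two entries if t_A != t_B
```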
In that case the map would contain two entries. Because this can't be triggered deterministically, I'm assuming it's unintended behaviour and can be safely removed.
Edit: there is also a minor race condition caused by the possibility that both inserts receive the same timestamp.
Because we only ever want to store one value, I've replaced the map with an Option.
As a bonus, this should be a small performance improvement: we no longer copy the HashMap, and we only need to take the write lock once.
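A minimal sketch of the Option-based shape, again with simplified field and method names that are my assumptions rather than necessarily the exact names in this patch:

```rust
use std::sync::{Arc, RwLock};

struct Peer; // stand-in for the real p2p Peer type

struct Peers {
    // One (timestamp, peer) pair, or None if no relay has been chosen yet.
    dandelion_relay: RwLock<Option<(i64, Arc<Peer>)>>,
}

impl Peers {
    fn set_dandelion_relay(&self, now: i64, peer: Arc<Peer>) {
        // One write-lock acquisition, one assignment: no "cleared but not yet
        // refilled" intermediate state is ever visible to other threads.
        *self.dandelion_relay.write().unwrap() = Some((now, peer));
    }

    fn get_dandelion_relay(&self) -> Option<(i64, Arc<Peer>)> {
        self.dandelion_relay.read().unwrap().clone()
    }
}
```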
Note: if the read-write-locking overhead ever becomes an issue, future work could replace the RwLock<Option<_>> with an AtomicPtr<Arc<(i64, Arc<Peer>)>> and some trivial-ish unsafe code.