fix(ipld): fix races in the Retriever's doRequest method #815
Oh shoot, I just realized I am using a feature from 1.18 here and we are not updated yet, ahhh.
Codecov Report
@@            Coverage Diff             @@
##             main     #815      +/-   ##
==========================================
+ Coverage   53.30%   53.33%   +0.03%
==========================================
  Files         119      119
  Lines        6833     6858      +25
==========================================
+ Hits         3642     3658      +16
- Misses       2821     2826       +5
- Partials      370      374       +4
98957a2 to a61cda6
Ok, so this race-fixing PR uncovered another race on CI. Nice.
Ok, so now this PR fixes two independent races. Will update the description.
8d446a4 to 2ad02cf
Rebased on main with go1.18. Removed the temporary semaphore, as we now have the needed feature from 1.18.
* During which the square continued to be written to while being used outside of the retrievalSession
* Also, unlock on defer to simplify the code a bit
2ad02cf to 6eb883b
P.S. The complexity of the code in Retriever is starting to grow, and it's hard to test. We will likely need to refactor the code (I have a brief picture in my head). However, this should happen after all the API improvements land in rsmt2d.
Can you please write that down (including which changes in rsmt2d are necessary and how they relate to simplifying the retriever code here)?
This is really hard to reason about and I can't really tell if this PR fixes things while introducing a bunch of others.
@liamsi, to reason about the code, it is better to check it out and read the whole thing rather than only the changes on GH.
I also put a lot of effort into the comments. If they don't make sense, pls point that out. Otherwise, I am pretty confident in this code now.
The changes with the comments make sense (besides one case I commented on).
Thank you @liamsi!
Co-authored-by: Ismail Khoffi <[email protected]>
Mainly this PR fixes two races:
The mentioned race is a known one, and we previously made an incorrect attempt to fix it in cca832a.
This fix is correct and is achieved by keeping a mutex per share. Whichever of the two sources delivers a share first locks the slot, which prevents the share from the other source from being written.
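To illustrate the mutex-per-share idea, here is a minimal sketch, not the actual Retriever code. The names `shareSlots`, `newShareSlots`, and `trySet` are hypothetical, as are the "network" and "repair" source labels. It assumes `sync.Mutex.TryLock` (available since Go 1.18) is a reasonable way to let the first source claim a slot. Whether the PR does exactly this is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// shareSlots is a hypothetical stand-in for the square being filled in:
// one slot and one mutex per share.
type shareSlots struct {
	locks  []sync.Mutex
	shares [][]byte
}

func newShareSlots(n int) *shareSlots {
	return &shareSlots{
		locks:  make([]sync.Mutex, n),
		shares: make([][]byte, n),
	}
}

// trySet writes share into slot idx only if no other source claimed it first.
// In this sketch the lock is deliberately never released: a held lock marks
// the slot as taken for the lifetime of the session.
func (s *shareSlots) trySet(idx int, share []byte) bool {
	if !s.locks[idx].TryLock() {
		return false // another source got here first
	}
	s.shares[idx] = share
	return true
}

func main() {
	slots := newShareSlots(4)
	wins := make([]bool, 2)

	var wg sync.WaitGroup
	// Two hypothetical sources race to fill the same slot.
	for i, src := range []string{"network", "repair"} {
		wg.Add(1)
		go func(i int, src string) {
			defer wg.Done()
			wins[i] = slots.trySet(0, []byte(src))
		}(i, src)
	}
	wg.Wait()

	// Exactly one source wins; which one depends on scheduling.
	fmt.Println("exactly one winner:", wins[0] != wins[1])
	fmt.Println("slot 0 filled by:", string(slots.shares[0]))
}
```

Running this under `go run -race` should report no data race on the slot, since only the goroutine that acquires the slot's lock ever writes to it.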
Closes #814
P.S. The complexity of the code in Retriever is starting to grow, and it's hard to test. We will likely need to refactor the code (I have a brief picture in my head). However, this should happen after all the API improvements land in rsmt2d.