share: handle rows concurrently #241
Conversation
@vgonkivs, thank you for working on this. Unfortunately, this is wrong and I'll try to explain why.
@vgonkivs Thank you very much for this PR - would you be able to take a first shot at writing a test for this?
Please revisit this: https://github.com/celestiaorg/celestia-node/pull/241/files#r760551422
I think it would be good to add a benchmark showing that this parallelization is justified (it seems fairly obvious that it should be faster, but it's still worth confirming).
would you be able to take a first shot at writing a test for this?
The benchmark could also serve as a test. The current tests do not yet cover multiple rows (thanks @renaynay for pointing this out).
celestia-node/service/share/share_test.go
Line 33 in 31cdabc
func TestService_GetSharesByNamespace(t *testing.T) {
@liamsi, the optimization is not CPU bound, i.e. we don't parallelize anything computation intensive; we make IO-bound logic concurrent, so "parallelization" is actually the wrong word here and "concurrency" is the right one (as the PR name says). Thus, we might not see any performance improvement, and may even see degradation, when we mock out network responses. Specifically, here we start traversing and requesting each row concurrently instead of doing so in order. Imagine sending 10 network requests sequentially versus spawning 10 concurrent goroutines: sequentially, request n+1 has to wait for response n to land first, while sending concurrently lets requests go out independently. In a mock environment, where responses are immediate, the goroutines may only add overhead, since there is no network latency to hide.
Thanks for the explanation @Wondertan. I was a bit confused after I received the results - the code really does run slower against the mock.
👍🏻
Waiting on review from @liamsi
@liamsi's review is required for the merge
Looks good! Thank you @vgonkivs 🙏🏼
Also thanks @renaynay and @Wondertan for the reviews.
Closes #184