[SPARK-24648][SQL] SqlMetrics should be threadsafe #21634
Use LongAdder to make SQLMetrics threadsafe.
ok to test
add to whitelist
test("writing metrics from multiple threads") {
  implicit val ec: ExecutionContextExecutor = ExecutionContext.global
  val nThreads = 1000
nThreads? This is the number of futures right?
True - I'll rename.
  assert(acc.isZero())
}

test("writing metrics from multiple threads") {
Really dumb question, does this fail without the fix?
Good question. Checking in a separate branch:
org.scalatest.exceptions.TestFailedException: 100000 did not equal 56544
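A failure of this shape is what a lost-update race on a plain `+=` looks like. The following is a minimal sketch, not the actual Spark test: `UnsafeCounter` and `LostUpdateDemo` are hypothetical names, and the split of 1000 futures times 100 increments is an assumption chosen only to reach the expected total of 100000.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical stand-in for a metric that accumulates with a plain `+=`.
final class UnsafeCounter {
  private var _value = 0L
  // Read-modify-write without synchronization: concurrent callers can
  // overwrite each other's updates.
  def add(v: Long): Unit = _value += v
  def value: Long = _value
}

object LostUpdateDemo {
  // Runs `nFutures` futures, each adding 1 `perFuture` times,
  // and returns the final observed count.
  def run(nFutures: Int, perFuture: Int): Long = {
    implicit val ec: ExecutionContext = ExecutionContext.global
    val counter = new UnsafeCounter
    val futures = (1 to nFutures).map { _ =>
      Future { (1 to perFuture).foreach(_ => counter.add(1L)) }
    }
    Await.result(Future.sequence(futures), 60.seconds)
    counter.value
  }

  def main(args: Array[String]): Unit = {
    val expected = 1000L * 100L
    val observed = run(1000, 100)
    // With several writers, `observed` is frequently smaller than `expected`.
    println(s"expected $expected, observed $observed")
  }
}
```

The observed total varies from run to run; it can never exceed the expected value, but nothing guarantees it reaches it.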
Thanks!
FYI, I think this is not a valid test. Spark assumes there is only one writer at a time.
To be clear, we just need AccumulatorV2 to be readable from the heartbeat thread. That's the only place where we need to think about concurrency.
Do you mean it's a one-writer, multi-reader scene?
This was originally raised because an implementation of broadcast join did have multiple writers. Unfortunately we recently determined that the LongAdder is causing a performance regression and we are going to revert this.
@cloud-fan or @hvanhovell can one of you send the rollback PR?
Do you mean it's a one-writer, multi-reader scene?
Yes.
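For illustration, the one-writer, multi-reader case can be sketched as below. This is an assumption-laden example, not Spark's implementation: `SingleWriterCounter` and `SingleWriterDemo` are hypothetical names, the reader thread merely stands in for the heartbeat thread, and `@volatile` is one possible way to make the writer's updates visible to it.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical single-writer counter: only ONE thread may call add().
final class SingleWriterCounter {
  @volatile private var _value = 0L
  // Not atomic, but safe here because no other thread ever writes.
  def add(v: Long): Unit = _value = _value + v
  // Any thread may read; @volatile makes the latest write visible.
  def value: Long = _value
}

object SingleWriterDemo {
  // Returns (final value, whether the reader only saw non-decreasing values).
  def run(increments: Int): (Long, Boolean) = {
    val counter = new SingleWriterCounter
    val done = new AtomicBoolean(false)
    val monotonic = new AtomicBoolean(true)

    // Stand-in for a heartbeat-style reader that polls the counter.
    val reader = new Thread(() => {
      var last = 0L
      while (!done.get()) {
        val seen = counter.value
        if (seen < last) monotonic.set(false)
        last = seen
      }
    })

    // The single writer.
    val writer = new Thread(() => {
      (1 to increments).foreach(_ => counter.add(1L))
      done.set(true)
    })

    reader.start()
    writer.start()
    writer.join()
    reader.join()
    (counter.value, monotonic.get())
  }
}
```

Under this assumption no updates are lost, because the read-modify-write in `add` never races with another write; the sketch stops being correct as soon as a second writer appears.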
hvanhovell
left a comment
LGTM - pending jenkins
Test build #92300 has finished for PR 21634 at commit
Test build #92302 has finished for PR 21634 at commit
HeartSaVioR
left a comment
+1 Nice finding.
Merging to master. Thanks!
FYI I've reverted this patch w.r.t. #21634 (comment)
Use LongAdder to make SQLMetrics thread safe.
What changes were proposed in this pull request?
Replace += with LongAdder.add() for concurrent counting
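As a rough sketch of what such a replacement can look like, the example below wraps `java.util.concurrent.atomic.LongAdder` in a small counter. `AdderCounter` and `AdderDemo` are hypothetical names used only for illustration; this is not the patched `SQLMetric` class.

```scala
import java.util.concurrent.atomic.LongAdder
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical metric-like class backed by a LongAdder.
final class AdderCounter {
  private val adder = new LongAdder
  def add(v: Long): Unit = adder.add(v)   // safe with many concurrent writers
  def value: Long = adder.sum()           // exact once all writers have finished
  def reset(): Unit = adder.reset()
  def isZero: Boolean = adder.sum() == 0L
}

object AdderDemo {
  // Runs `nFutures` futures, each adding 1 `perFuture` times,
  // and returns the final count.
  def run(nFutures: Int, perFuture: Int): Long = {
    implicit val ec: ExecutionContext = ExecutionContext.global
    val counter = new AdderCounter
    val futures = (1 to nFutures).map { _ =>
      Future { (1 to perFuture).foreach(_ => counter.add(1L)) }
    }
    Await.result(Future.sequence(futures), 60.seconds)
    counter.value
  }
}
```

`LongAdder` spreads contended updates across internal cells and sums them on read, so no increments are lost under multiple writers; the price is extra work on every `add` and `sum` compared with a plain field.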
How was this patch tested?
Unit tests with local threads