-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-6685][MLLIB]Use DSYRK to compute AtA in ALS #13891
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Is this actually faster though? |
|
This is a prototype. Actually, it is critical if it will be faster = =! |
|
Test build #61171 has finished for PR 13891 at commit
|
|
@hqzizania This could be tested with benchmarks without ALS. I guess even with a correct implementation, we need a large rank to see improvement. |
|
@mengxr Do you mean only test |
|
Test build #61184 has finished for PR 13891 at commit
|
|
code for testing results: |
|
@mengxr this is a simple imitation of the loop in |
| } | ||
| var i = srcPtrs(j) | ||
| var numExplicits = 0 | ||
| val doStack = if (srcPtrs(j + 1) - srcPtrs(j) > 10) true else false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (...) true else false is redundant
|
It may not matter much but your test code is a little different than in the patch, like for copyToTri(). |
set stack size > 128 and comments added
|
@srowen Ops~ The And The results are basically the same. Actually |
|
I set the threshold size to stack as 128 according to some more tests results, where 128 maybe a conservative size. However, this change will bypassing existing unit tests, as |
|
Test build #61381 has finished for PR 13891 at commit
|
|
Test build #61465 has finished for PR 13891 at commit
|
|
cc @mengxr @yanboliang Was this patch Okay? |
|
@hqzizania Could you share the regression performance test result? I have time to get this in if it's ready. Thanks. |
|
@yanboliang sorry, i'm on a business trip and will upload the test result ASAP. |
|
@yanboliang So sorry for my late response. Some regression performance test results: ALS: rank = 1024 ALS: rank = 129 ALS: rank = 512 The results shows this patch makes it faster very much when rank is large, but we should reset the two threshold values of "doStack" as 1024 rather 128. |
|
@hqzizania Thanks for the performance tests! This matches my guess. I'm not sure how often people use a rank greater than 1000 or even 250. But I think it is good to use BLAS level-3 routines. We can make the threshold a param and set a small threshold and test both code paths. |
|
@mengxr I see. I will add a param for it. :) |
|
Test build #67323 has finished for PR 13891 at commit
|
|
Test build #67422 has finished for PR 13891 at commit
|
|
Test build #67423 has finished for PR 13891 at commit
|
|
Test build #67424 has finished for PR 13891 at commit
|
|
Test build #67432 has finished for PR 13891 at commit
|
|
Test build #67434 has finished for PR 13891 at commit
|
|
Test build #67435 has finished for PR 13891 at commit
|
|
@mengxr @srowen @yanboliang A threshold param is added for unit tests. Does it look okay now? |
project/MimaExcludes.scala
Outdated
| ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.status.api.v1.TaskMetricDistributions.this") | ||
| ) ++ Seq( | ||
| // SPARK-6685 | ||
| ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.ml.recommendation.ALS.train") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this actually remove a method? that shouldn't be necessary, I imagine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I don't know how to write the "mima exclude" exactly. It could be not a proper solution to the failure of mima, which may be caused by the modification to def train
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. It's a developer API, so more reasonable to change, though it's still ideal to not change these APIs unless necessary. Try putting the new param at the end? I don't think that changes the situation but makes it at least source-compatible with any current invocations.
|
Test build #67523 has finished for PR 13891 at commit
|
|
@srowen It seems mima test still fails when putting the new Param at the end of train method. :( |
|
@hqzizania OK thanks for checking that. That may be an issue for this change. |
|
@srowen I am not familiar with MiMa really, so what should I do now? Or just go back to the previous commit, and create a JIRA for the issue? |
|
@hqzizania If you check the log, there are some guides for how to. Should we maybe rebase this and check the logs? |
|
I will propose to close this assuming this is inactive. |
## What changes were proposed in this pull request? This PR proposes to close PRs ... - inactive to the review comments more than a month - WIP and inactive more than a month - with Jenkins build failure but inactive more than a month - suggested to be closed and no comment against that - obviously looking inappropriate (e.g., Branch 0.5) To make sure, I left a comment for each PR about a week ago and I could not have a response back from the author in these PRs below: Closes apache#11129 Closes apache#12085 Closes apache#12162 Closes apache#12419 Closes apache#12420 Closes apache#12491 Closes apache#13762 Closes apache#13837 Closes apache#13851 Closes apache#13881 Closes apache#13891 Closes apache#13959 Closes apache#14091 Closes apache#14481 Closes apache#14547 Closes apache#14557 Closes apache#14686 Closes apache#15594 Closes apache#15652 Closes apache#15850 Closes apache#15914 Closes apache#15918 Closes apache#16285 Closes apache#16389 Closes apache#16652 Closes apache#16743 Closes apache#16893 Closes apache#16975 Closes apache#17001 Closes apache#17088 Closes apache#17119 Closes apache#17272 Closes apache#17971 Added: Closes apache#17778 Closes apache#17303 Closes apache#17872 ## How was this patch tested? N/A Author: hyukjinkwon <[email protected]> Closes apache#18017 from HyukjinKwon/close-inactive-prs.





What changes were proposed in this pull request?
jira: https://issues.apache.org/jira/browse/SPARK-6685
This is to swtich DSPR to DSYRK to use native BLAS to accelerate the computation of AtA in ALS. A buffer is allocated to stack vectors to do Level 3 BLAS routine
How was this patch tested?
java and scala ut