-
Notifications
You must be signed in to change notification settings - Fork 6.8k
Mitigate exponential complexity when running gpu_topology tests #13343
Conversation
@mxnet-label-bot add [pr-awaiting-review] |
4900f7c
to
9f6b9cd
Compare
LGTM, changing the test coverage from 16 to 8 should allow it finish much quicker. |
@@ -67,15 +63,18 @@ bool IsSatisfactory(const std::vector<float>& W, int num_gpus, int depth) { | |||
// Generates random link topology matrix using random number generator | |||
void TestComputeTreesRandomized(int num_gpus, float alpha, int backtrack, | |||
std::mt19937* gen) { | |||
using namespace std; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need it if you are not using it
std::fill(W.begin(), W.end(), 0.f); | ||
GenerateMatrix(&W, num_gpus, k, gen); | ||
satisfied = IsSatisfactory(W, num_gpus, depth); | ||
std::fill(W.begin(), W.end(), 0.f); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to std::vector
's constructor, the initialization is already done at line 69.
@@ -562,7 +561,7 @@ TEST(GpuTopology, TestComputeTrees1) { | |||
float alpha = 0.7; | |||
bool backtrack = true; | |||
// Do 5 randomized tests per GPU count from 2 to 16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change the comment from 16 to 8 as well
- KL never succeeds so it always goes exponential - Too many weight matrices were rejected because of zero weights, simplify generation to not include 0 weight edges
satisfied = IsSatisfactory(W, num_gpus, depth); | ||
GenerateMatrix(&W, num_gpus, gen); | ||
satisfied = IsSatisfactory(W, num_gpus, depth); | ||
if (mxnet::kvstore::kLogTree && !satisfied) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously it was trying multiple times - we want to stop doing it with this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it's of no use. I also changed the distribution, is not relevant anymore and took a long time.
- KL never succeeds so it always goes exponential - Too many weight matrices were rejected because of zero weights, simplify generation to not include 0 weight edges
Description
Mitigates MXNET_KVSTORE_USETREE problems #13341
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.