@mxnet-label-bot add [pr-awaiting-review, Operator]
@marcoabreu @perdasilva any idea why the build fails only on Windows GPU and passes on the rest?
I've had a look and also reached out to Anton. I have no idea what the problem here could be. I wonder if it has something to do with the environment. On Windows we seem to be using CUDA v9.2 and driver v398.75. But I don't see how this could really be an issue...=S
Thank you @perdasilva. @lebeg can you please help out here?
@@ -401,8 +401,6 @@ void TopKImpl(const RunContext &ctx,
                        mxnet::op::SortByKeyWorkspaceSize<int, int, xpu>(src.Size()));
   temp_size = std::max(temp_size,
                        mxnet::op::SortByKeyWorkspaceSize<int, DType, xpu>(src.Size()));
-  temp_size = std::max(temp_size,
-                       mxnet::op::SortByKeyWorkspaceSize<DType, int, xpu>(src.Size()));
This line should be useful because we need to do mxnet::op::SortByKey(dat, ind, is_ascend, &sort_work);.
Can't it be done even now? This change passed all the unit tests, btw.
Removing this line may break some corner cases uncovered by the existing test. Since we need to call SortByKey<DType, int>(dat, ind, ...), SortByKey<int, DType>(batch_id, dat, ...) and SortByKey<int, int>(batch_id, ind, ...), we should make sure that the temporary storage has enough size for all cases.
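The sizing concern in the comment above can be sketched in isolation. The snippet below is only an illustration: `ToyWorkspaceSize` is a hypothetical stand-in, not MXNet's `SortByKeyWorkspaceSize`, and its cost model (two scratch copies of the keys, one of the values, each padded to 8 bytes) is an assumption chosen so that the key type matters. Whether the real workspace query is asymmetric in this way is not established by this thread.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a per-type workspace query. This is NOT MXNet's
// SortByKeyWorkspaceSize. Assumed cost model: two scratch copies of the keys
// plus one scratch copy of the values, each padded to an 8-byte boundary.
template <typename KDType, typename VDType>
std::size_t ToyWorkspaceSize(std::size_t num_elements) {
  auto align8 = [](std::size_t bytes) { return (bytes + 7) / 8 * 8; };
  return 2 * align8(num_elements * sizeof(KDType)) +
         align8(num_elements * sizeof(VDType));
}

// Sizes one shared buffer so it is large enough for all three sort calls:
// <int, int>, <int, DType> and <DType, int>.
template <typename DType>
std::size_t SharedWorkspaceSize(std::size_t n) {
  using Index = std::int32_t;  // assumed 4-byte index type
  std::size_t temp_size = ToyWorkspaceSize<Index, Index>(n);
  temp_size = std::max(temp_size, ToyWorkspaceSize<Index, DType>(n));
  temp_size = std::max(temp_size, ToyWorkspaceSize<DType, Index>(n));
  return temp_size;
}
```

Under this assumed model, a `double` key with 10 elements needs 200 bytes, while the other two combinations need at most 160, so dropping the `<DType, int>` term would undersize the buffer for 8-byte types even though 4-byte types would be unaffected.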
> may break some corner cases uncovered by the existing test

It does not break any existing tests. I checked that locally, and the tests passed on the CI too.

> we should make sure that the temporary storage has enough size for all cases.

I thought about this when making the change and tried to make sure. Is there a use case or unit test you could point to that would break with this change?
@anirudh2290 @sxjscience Could you see if your comments are addressed? Thanks.
@anirudh2290 Bouncing for a review...
Is this PR good to go?
Description
Add fp16 and fp64 support for the topk operator. Required for certain machine translation tasks (and other NLP-related tasks).
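As a rough illustration of what a dtype-generic top-k looks like, here is a minimal CPU-only sketch templated on the element type. It is not the operator's implementation: `TopKIndices` is a hypothetical helper, it uses `std::partial_sort` rather than the sort-by-key path discussed in this PR, and it is only exercised with `float` and `double` because standard C++ before C++23 has no built-in half-precision type.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: returns the indices of the k largest elements of
// `data`, ordered from largest to smallest. Works for any DType that
// supports operator> (e.g. float, double).
template <typename DType>
std::vector<std::size_t> TopKIndices(const std::vector<DType>& data,
                                     std::size_t k) {
  k = std::min(k, data.size());
  std::vector<std::size_t> idx(data.size());
  std::iota(idx.begin(), idx.end(), std::size_t{0});  // 0, 1, 2, ...
  // Only the first k positions need to be in sorted order.
  std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
                    [&data](std::size_t a, std::size_t b) {
                      return data[a] > data[b];
                    });
  idx.resize(k);
  return idx;
}
```

Calling `TopKIndices<double>({0.1, 0.9, 0.5, 0.7}, 2)` selects indices 1 and 3. The order among equal values is unspecified, since `std::partial_sort` is not stable.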
For review - @anirudh2290 @apeforest