This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Add fp16 and fp64 support for topk #14912

Closed
wants to merge 1 commit

Conversation

@anirudhacharya (Member) commented on May 7, 2019

Description

Add fp16 and fp64 support for the topk operator. This is required for certain machine translation tasks (and other NLP-related tasks).
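
For context, here is a minimal, self-contained sketch of what extending an operator's dtype coverage usually amounts to: routing additional runtime dtype flags into the same kernel template. The enum, function names, and the uint16_t stand-in for fp16 storage are illustrative assumptions, not the actual MXNet code touched by this PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Illustrative only: MXNet dispatches on a runtime dtype flag and instantiates
// the kernel template once per supported element type. "Adding fp16/fp64
// support" is essentially routing those flags into the same kernel.
enum class DType { kFloat16, kFloat32, kFloat64 };

template <typename T>
void TopKKernel(std::size_t n) {
  std::printf("topk over %zu elements of a %zu-byte type\n", n, sizeof(T));
}

void DispatchTopK(DType dtype, std::size_t n) {
  switch (dtype) {
    case DType::kFloat32: TopKKernel<float>(n); break;
    case DType::kFloat16: TopKKernel<std::uint16_t>(n); break;  // fp16 storage modeled as 16 bits
    case DType::kFloat64: TopKKernel<double>(n); break;         // the new fp64 path
  }
}

int main() {
  DispatchTopK(DType::kFloat16, 8);
  DispatchTopK(DType::kFloat64, 8);
  return 0;
}
```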

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change or have been fixed to be compatible with it

Changes

For review - @anirudh2290 @apeforest

@anirudhacharya changed the title from "Add fp16 and fp64 support for for topk" to "Add fp16 and fp64 support for topk" on May 7, 2019
@vandanavk (Contributor) commented:

@mxnet-label-bot add [pr-awaiting-review, Operator]

@anirudhacharya (Member, Author) commented:

@marcoabreu @perdasilva any idea why the build fails only for Windows GPU and passes for the rest?

@perdasilva (Contributor) commented:

I've had a look and also reached out to Anton. I have no idea what the problem here could be. I wonder if it has something to do with the environment. On Windows we seem to be using CUDA v9.2 and driver v398.75. But I don't see how this could really be an issue...=S

@anirudhacharya (Member, Author) commented:

Thank you @perdasilva. @lebeg, can you please help out here?

```diff
@@ -401,8 +401,6 @@ void TopKImpl(const RunContext &ctx,
                          mxnet::op::SortByKeyWorkspaceSize<int, int, xpu>(src.Size()));
     temp_size = std::max(temp_size,
                          mxnet::op::SortByKeyWorkspaceSize<int, DType, xpu>(src.Size()));
-    temp_size = std::max(temp_size,
-                         mxnet::op::SortByKeyWorkspaceSize<DType, int, xpu>(src.Size()));
```
Member commented:

This line should be useful because we need to do mxnet::op::SortByKey(dat, ind, is_ascend, &sort_work);.

@anirudhacharya (Member, Author) commented on May 15, 2019:

Can't it be done even now? This change passed all the unit tests, btw.

Member commented:

Removing this line may break some corner cases not covered by the existing tests. Since we need to call SortByKey<DType, int>(dat, ind, ...), SortByKey<int, DType>(batch_id, dat, ...) and SortByKey<int, int>(batch_id, ind, ...), we should make sure that the temporary storage is large enough for all cases.

@anirudhacharya (Member, Author) commented on May 20, 2019:

> may break some corner cases not covered by the existing tests

It does not break any existing tests; I checked that locally, and the tests passed on the CI too.

> we should make sure that the temporary storage is large enough for all cases

I thought about this when making the change and tried to make sure of it. Is there a use case or unit test you could point to that would break with this change?
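
To make the sizing argument in this thread concrete, here is a small self-contained sketch. WorkspaceSize is a simplified stand-in for mxnet::op::SortByKeyWorkspaceSize (whose real values are backend-specific, e.g. reported by CUB on GPU), and DType is fixed to double purely for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>

// Toy stand-in for mxnet::op::SortByKeyWorkspaceSize<K, V, xpu>: the real
// helper asks the backend how much scratch memory a key/value sort of n
// elements needs, and the answer differs per <K, V> instantiation.
template <typename K, typename V>
std::size_t WorkspaceSize(std::size_t n) {
  return n * (sizeof(K) + sizeof(V));  // simplified model
}

// The thread above lists three sorts that share one buffer:
//   SortByKey<DType, int>(dat, ind, ...)
//   SortByKey<int, DType>(batch_id, dat, ...)
//   SortByKey<int, int>(batch_id, ind, ...)
// so the buffer must be sized to the max over all three instantiations.
// (In this toy model the two mixed pairs happen to coincide; the real
// backend-reported sizes need not.)
std::size_t RequiredTempSize(std::size_t n) {
  std::size_t temp_size = 0;
  temp_size = std::max(temp_size, WorkspaceSize<int, int>(n));
  temp_size = std::max(temp_size, WorkspaceSize<int, double>(n));
  temp_size = std::max(temp_size, WorkspaceSize<double, int>(n));
  return temp_size;
}

int main() {
  std::printf("temp bytes needed for n=1024: %zu\n", RequiredTempSize(1024));
  return 0;
}
```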

@abhinavs95 (Contributor) commented:

@anirudh2290 @sxjscience Could you see if your comments are addressed? Thanks.

@piyushghai (Contributor) commented:

@anirudh2290 Bumping for a review...

@vandanavk (Contributor) commented:

Is this PR good to go?

Labels: Operator, pr-awaiting-review (PR is waiting for code review)
Projects: none yet
Linked issues (merging this pull request may close them): FP16 support for topK
8 participants