This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Added launch bounds to the reduce kernels #16397

Merged: 4 commits, Oct 31, 2019

ptrendx (Member) commented Oct 8, 2019

Description

Fixes #16338. Adds launch bounds around the reduce_kernel_M1 kernels.
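For context, CUDA's `__launch_bounds__(maxThreadsPerBlock)` qualifier tells the compiler the largest block size a kernel will be launched with, so it can cap per-thread register use to fit. The host-side sketch below illustrates the arithmetic behind the "too many resources requested for launch" error. It is not code from this PR: the helper names are hypothetical, the 65536-register block budget is an assumed figure that varies by device, and register allocation granularity is ignored.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION for illustration only: 65536 32-bit registers available to one
// thread block. Query the actual device limits before relying on this number.
constexpr std::uint32_t kAssumedRegsPerBlock = 65536;

// Block size the reduce kernels are bounded by (value of kMaxThreadsPerBlock
// in mshadow/cuda/tensor_gpu-inl.cuh, as quoted in the review of this PR).
constexpr std::uint32_t kMaxThreadsPerBlock = 1024;

// Hypothetical helper: the largest per-thread register count that still lets
// a block of `threads_per_block` threads fit in the block's register budget.
constexpr std::uint32_t MaxRegsPerThread(std::uint32_t regs_per_block,
                                         std::uint32_t threads_per_block) {
  return regs_per_block / threads_per_block;
}

// Hypothetical helper: true when the block as a whole would need more
// registers than the budget allows, which is when the launch is rejected.
constexpr bool ExceedsRegisterBudget(std::uint32_t regs_per_thread,
                                     std::uint32_t threads_per_block,
                                     std::uint32_t regs_per_block) {
  return static_cast<std::uint64_t>(regs_per_thread) * threads_per_block >
         regs_per_block;
}

// Under the assumed budget, a 1024-thread block leaves 64 registers per thread.
static_assert(MaxRegsPerThread(kAssumedRegsPerBlock, kMaxThreadsPerBlock) == 64,
              "65536 / 1024 should be 64");

// A made-up unoptimized kernel using 80 registers per thread would not fit...
static_assert(ExceedsRegisterBudget(80, kMaxThreadsPerBlock, kAssumedRegsPerBlock),
              "80 * 1024 exceeds 65536");

// ...while one held to 64 registers per thread would.
static_assert(!ExceedsRegisterBudget(64, kMaxThreadsPerBlock, kAssumedRegsPerBlock),
              "64 * 1024 fits in 65536");
```

Unoptimized DEBUG builds tend to use more registers per thread, which is presumably why the failure showed up there. With the launch bound declared, the compiler is expected to keep register use within what a full-size block can afford, spilling to local memory if needed.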

@reminisce Please verify the fix.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

@ptrendx ptrendx requested a review from reminisce October 8, 2019 20:14
ptrendx (Member, Author) commented Oct 22, 2019

@reminisce ping to verify the fix

ptrendx (Member, Author) commented Oct 24, 2019

@hgt312 Could you help verify that this fixes the "too many resources requested for launch" error when building in DEBUG mode?

hgt312 (Contributor) commented Oct 25, 2019

@ptrendx Yes, it fixed that.

ptrendx (Member, Author) commented Oct 25, 2019

@hgt312 Thanks :-)!

DickJC123 (Contributor) left a comment

For the record, ./mshadow/cuda/tensor_gpu-inl.cuh defines const int kMaxThreadsPerBlock = 1024; this value has been valid since GPU arch 2.0.

LGTM.

@DickJC123 DickJC123 merged commit 979e610 into apache:master Oct 31, 2019
yajiedesign pushed a commit to yajiedesign/mxnet that referenced this pull request Nov 6, 2019
* Added launch bounds to the reduce_kernel_M1

* Trigger CI

* Reretrigger the CI
Development

Successfully merging this pull request may close these issues.

Reduce op throws "too many resources requested for launch"