Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[BUGFIX] Fix workspace of BoxNMS #20212

Merged
merged 1 commit into from
Apr 27, 2021
Merged

Conversation

ptrendx
Copy link
Member

@ptrendx ptrendx commented Apr 23, 2021

Description

This commit fixes 2 problems:

  • previously in the calculation of the workspace size of NMS operator the space for the buffer was not included, resulting in potentially too small allocation
  • because the type requested for uint8_t, the maximum workspace size without setting large tensor support was 2GB, which was too small for some cases, without any error reporting (going over that limit would make the index_t in Shape1 overflow, resulting in negative value, which would later be cast to size_t in the allocator and MXNet would try to allocate a huge amount of memory, resulting in OoM). This PR changes the type of the allocation to double so that the size limit for workspace rises to 16 GB when large tensor support is not enabled.

Checklist

Essentials

  • PR's title starts with a category (e.g. [BUGFIX], [MODEL], [TUTORIAL], [FEATURE], [DOC], etc)
  • Changes are complete (i.e. I finished coding on this PR)

@ptrendx ptrendx requested a review from DickJC123 April 23, 2021 23:40
@mxnet-bot
Copy link

Hey @ptrendx , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [sanity, clang, centos-gpu, centos-cpu, windows-gpu, miscellaneous, unix-cpu, windows-cpu, edge, website, unix-gpu]


Note:
Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@mseth10 mseth10 added pr-awaiting-testing PR is reviewed and waiting CI build and test pr-work-in-progress PR is still work in progress and removed pr-awaiting-testing PR is reviewed and waiting CI build and test labels Apr 23, 2021
@ptrendx
Copy link
Member Author

ptrendx commented Apr 24, 2021

@mxnet-bot run ci [centos-cpu, centos-gpu, unix-gpu]

@mxnet-bot
Copy link

Jenkins CI successfully triggered : [centos-gpu, centos-cpu, unix-gpu]

@mseth10 mseth10 added pr-awaiting-testing PR is reviewed and waiting CI build and test pr-work-in-progress PR is still work in progress and removed pr-work-in-progress PR is still work in progress pr-awaiting-testing PR is reviewed and waiting CI build and test labels Apr 24, 2021
@ptrendx
Copy link
Member Author

ptrendx commented Apr 26, 2021

@mxnet-bot run ci [centos-cpu]

@mxnet-bot
Copy link

Jenkins CI successfully triggered : [centos-cpu]

@mseth10 mseth10 added pr-awaiting-testing PR is reviewed and waiting CI build and test pr-work-in-progress PR is still work in progress and removed pr-work-in-progress PR is still work in progress pr-awaiting-testing PR is reviewed and waiting CI build and test labels Apr 26, 2021
@ptrendx
Copy link
Member Author

ptrendx commented Apr 26, 2021

@mxnet-bot run ci [centos-cpu]

@mxnet-bot
Copy link

Jenkins CI successfully triggered : [centos-cpu]

@mseth10 mseth10 added pr-awaiting-testing PR is reviewed and waiting CI build and test pr-awaiting-review PR is waiting for code review and removed pr-work-in-progress PR is still work in progress pr-awaiting-testing PR is reviewed and waiting CI build and test labels Apr 26, 2021
Copy link
Contributor

@DickJC123 DickJC123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like an appropriate fix for this operator. LGTM.

@DickJC123 DickJC123 merged commit 915aa55 into apache:master Apr 27, 2021
chinakook pushed a commit to chinakook/mxnet that referenced this pull request May 2, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pr-awaiting-review PR is waiting for code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants