Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

set_lower_bound(1) so that stride is not zero #18997

Merged
merged 4 commits into from
Aug 31, 2020
Merged

Conversation

ekdnam
Copy link
Contributor

@ekdnam ekdnam commented Aug 24, 2020

in issue: #18942, an error was occurring where strides became zero. to solve the issue, the lower bounds of stride1 and stride2 have been set to 1.

in issue: #18994, a typo was mentioned. that has also been cleared.

fix #18942
fix #18994

Description

(Brief description on what this PR is about)
When the correlation function is called on two arrays when the stride is zero, it results in a floating point exception as a division by zero is carried out. Thus, the lower bounds of stride1 and stride2 have been set to 1.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

in issue: apache#18942, an error was occurring where strides became zero. to solve the issue, the lower bounds of stride1 and stride2 have been set to 1.
@mxnet-bot
Copy link

Hey @ekdnam , Thanks for submitting the PR
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [windows-gpu, centos-gpu, unix-gpu, sanity, unix-cpu, clang, windows-cpu, edge, miscellaneous, website, centos-cpu]


Note:
Only following 3 categories can trigger CI :PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

'use_unifrom' changed to 'use_uniform'
@ekdnam ekdnam requested a review from szha as a code owner August 24, 2020 13:01
@ekdnam
Copy link
Contributor Author

ekdnam commented Aug 24, 2020

What changes can I make to the source code so that the unix-cpu job succeeds?

@szha
Copy link
Member

szha commented Aug 24, 2020

@ekdnam I think the test failed due to a network issue that's unrelated to the change. You can use the command as described by the mxnet ci bot in this comment: #18997 (comment). for now I triggered the retry for you.

@szha
Copy link
Member

szha commented Aug 24, 2020

Thanks for the fix, @ekdnam! Could you add a test here and assert that proper error will happen when the parameter is out of bound? https://github.com/apache/incubator-mxnet/blob/3c4ac19daa3e645d918692b864ea19640f7e0314/tests/python/unittest/test_operator.py#L3200-L3233

Copy link
Member

@szha szha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. once test is added it should be good to go!

@ekdnam
Copy link
Contributor Author

ekdnam commented Aug 25, 2020

Thanks for triggering the test! I will add the required unit tests.

@ekdnam
Copy link
Contributor Author

ekdnam commented Aug 28, 2020

@szha can you give someone guidelines on the tests? I am sorta new to testing

@szha
Copy link
Member

szha commented Aug 29, 2020

@ekdnam yes happy to help. As documented here, mxnet uses pytest as the python testing tool for development. Here's pytest's guide on how to write tests for expected exception. With that, you can write the statements to check whether an exception (and in this case the expected error should be MXNetError) is raised when input argument doesn't meet the lower bound.

To test this out locally, you need to:

  • follow the build from source guide (we use cmake for this, configs are in /config, choose the one that's right for your platform) and build the change locally.
  • cd into /python folder, and you can use python setup.py develop to link your local copy to your python site packages, which will enable your python to find the mxnet package you built.
  • if you haven't yet, install pytest. you may also choose to install the development and testing dependencies that are exactly the same as mxnet CI. you can achieve this with
python -m pip install -r https://raw.githubusercontent.com/apache/incubator-mxnet/master/ci/docker/install/requirement
  • finally, run the test and make sure it passes. make changes to the code base as needed. the complete test suite of mxnet is quite large so instead you may choose to run a specific test. see pytest's guide on selecting the test to run

@ekdnam
Copy link
Contributor Author

ekdnam commented Aug 29, 2020

Thanks a lot for such an in-depth answer!!

checking that stride1 and stride2 are be greater than zero
@ekdnam
Copy link
Contributor Author

ekdnam commented Aug 30, 2020

@szha I am currently having some issues while building mxnet from source. It may take some more time to resolve them. So I can't test myself the changes made. Hope the changes made to test_operator.py are enough. Thanks

@ekdnam
Copy link
Contributor Author

ekdnam commented Aug 31, 2020

@szha thanks for making the changes! :))

@ekdnam
Copy link
Contributor Author

ekdnam commented Aug 31, 2020

@mxnet-bot run ci [windows-gpu]

@mxnet-bot
Copy link

Jenkins CI successfully triggered : [windows-gpu]

@ekdnam
Copy link
Contributor Author

ekdnam commented Aug 31, 2020

@mxnet-bot run ci [windows-gpu]

@mxnet-bot
Copy link

Jenkins CI successfully triggered : [windows-gpu]

@szha szha merged commit de34763 into apache:master Aug 31, 2020
@szha
Copy link
Member

szha commented Aug 31, 2020

@ekdnam thanks for the fix!

@ekdnam
Copy link
Contributor Author

ekdnam commented Sep 1, 2020

@szha sure! Thank you for the help!

@szha szha mentioned this pull request Sep 7, 2020
4 tasks
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
3 participants