Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Make check_isfinite, check_scale optional in clip_global_norm #12042

Merged
merged 5 commits into from
Aug 27, 2018

Conversation

leezu
Copy link
Contributor

@leezu leezu commented Aug 6, 2018

Description

Make check_isfinite, check_scale optional in clip_global_norm. If both are set to false, clip_global_norm does not force any synchronization and throughput can be increased. Note if check_scale=False, this requires multiplying all arrays with 1 in cases where the multiplication could be skipped by doing a blocking check.

While this PR preserves the old default behavior of using blocking calls, we may want to change the default behavior to improve throughput.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Make check_isfinite, check_scale optional in clip_global_norm

Comments

@leezu leezu requested a review from szha as a code owner August 6, 2018 02:23
@leezu leezu force-pushed the optimizeclipglobalnorm branch 7 times, most recently from 69926ef to 4546948 Compare August 6, 2018 22:10
@nswamy nswamy added Python Gluon pr-awaiting-review PR is waiting for code review labels Aug 8, 2018
@lupesko
Copy link
Contributor

lupesko commented Aug 21, 2018

@leezu can you please follow the community's convention to file a JIRA and update the issue title?
@eric-haibin-lin bouncing for review.

@leezu
Copy link
Contributor Author

leezu commented Aug 21, 2018

@lupesko I considered this to fall into "PRs with tiny changes" which don't require JIRA. Currently tests are failing, I'll need to add some fixes before review can proceed. Sorry for the delay

requires a blocking .asscalar() call.
check_scale : bool, default True
If True, skip array rescaling if max_norm / total_norm >= 1. This
requires a blocking call. If False, rescale arrays with min(1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rescale is not blocking

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. But previously there is a blocking call to check if re-scale is necessary.
If check_scale is False, we always rescale, possibly by 1.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Maybe we can use contrib.cond for this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

contrib.cond would only help when working with the symbolic interface. For ndarray, contrib.cond also uses a blocking .asscalar() call. clip_global_norm always works with the ndarray API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now the PR preserves the old default behavior. I'm happy to remove the check_scale argument and to always rescale, trading off computation against avoiding blocking calls. That would assume that it is always cheaper/better do the rescaling than to wait for a blocking asscalar() and potentially avoid rescaling.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for removing check_scale. It doesn't seem necessary. We can always perform a scaling.

If both are set to false, clip_global_norm does not force any synchronization
and throughput can be increased.
@eric-haibin-lin eric-haibin-lin merged commit 308ada1 into apache:master Aug 27, 2018
@leezu leezu deleted the optimizeclipglobalnorm branch August 27, 2018 11:01
anirudh2290 pushed a commit to anirudh2290/mxnet that referenced this pull request Sep 19, 2018
…#12042)

* Make check_isfinite, check_scale optional in clip_global_norm

If both are set to false, clip_global_norm does not force any synchronization
and throughput can be increased.

* Add tests

* Remove check_scale

* Document return type

* Fix test_gluon_gpu
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Gluon pr-awaiting-review PR is waiting for code review Python
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants