Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Mxnet allclose #14443

Merged
merged 58 commits into from
Oct 15, 2019
Merged

Mxnet allclose #14443

merged 58 commits into from
Oct 15, 2019

Conversation

drivanov
Copy link
Contributor

Description

  • The analog of numpy.allclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False) is implemented as a MxNet operator:
    mx.nd.contrib.allclose(a, b, rtol, atol, equal_nan)
  • For now, besides the unit test test_allclose_function(), this method is used only in
def assert_almost_equal(a, b, rtol=None, atol=None, names=('a', 'b'), equal_nan=False)

where parameter a or/and b could be defined also as mx.nd.array(s):

    Parameters
    ----------
    a : np.ndarray or mx.nd.array
    b : np.ndarray or mx.nd.array
  • When calling assert_almost_equal, no more asnumpy() conversion needed. It will be done automatically (in assert_almost_equal), if
    (a) a or b has no attribute "context" OR
    (b) these attributes are different.

  • The elimination of asnumpy() conversions and the use of mx.nd.contrib.allclose(...) for mx.nd.array's allows to achieve 5-7x speedup for GPU tests that use long arrays.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

Comments

Most of the changed files, were changed because of the eliminations of asnumpy() conversions which were used when assert_almost_equal function is called.

@drivanov drivanov requested a review from szha as a code owner March 15, 2019 22:36
@karan6181
Copy link
Contributor

@mxnet-label-bot add [Operator, Test, pr-awaiting-review]

@marcoabreu marcoabreu added Operator pr-awaiting-review PR is waiting for code review Test labels Mar 15, 2019
@chinakook
Copy link
Contributor

It this op only support float32?

@drivanov
Copy link
Contributor Author

It this op only support float32?

Actually, only the tolerances (rtol, atol) are supposed to be float32. The elements of tensors could be any type.

@drivanov
Copy link
Contributor Author

I am not sure that this fail is related to my changes made to this PR. I will make an empty commit for triggering CI.

@drivanov
Copy link
Contributor Author

Failing tests are not related to the changes of this PR. I will launch CI one more time.

@drivanov drivanov closed this Mar 26, 2019
@drivanov drivanov reopened this Mar 26, 2019
@drivanov drivanov closed this Apr 2, 2019
@drivanov drivanov reopened this Apr 2, 2019
@drivanov drivanov closed this Apr 2, 2019
@drivanov drivanov reopened this Apr 2, 2019
@wkcn
Copy link
Member

wkcn commented Apr 2, 2019

Sometimes the CI may fails and raise an irrelated exception. We can disable the using of mx.nd.contrib.allclose firstly to check it.

@drivanov
Copy link
Contributor Author

drivanov commented Apr 2, 2019

As I could see that _test_bulking() is not there and I need to fix it.

@wkcn
Copy link
Member

wkcn commented Jul 16, 2019

Glad to see the success of CI : )

@chinakook @szha @mseth10
Could you help take a review? Thank you!

@abhinavs95
Copy link
Contributor

@drivanov Could you do a rebase to resolve the merge conflicts?

@drivanov drivanov mentioned this pull request Aug 22, 2019
4 tasks
@ChaiBapchya
Copy link
Contributor

@szha @mseth10 Good to merge? Pl review

@szha
Copy link
Member

szha commented Sep 9, 2019

there're merge conflicts that need to be resolved first.

@drivanov
Copy link
Contributor Author

drivanov commented Sep 9, 2019

Just in case, I rebased. CI is ok and should be no merge conflicts.

@wkcn
Copy link
Member

wkcn commented Sep 26, 2019

Hi @mseth10 and @szha , the conflict has been resolved.

This PR accelerates MXNet allclose, and it accepts mx.nd.NDArray and np.ndarray without any conversion.

Could you please help take a review? Thank you so much!

drivanov added a commit to drivanov/incubator-mxnet that referenced this pull request Sep 26, 2019
drivanov added a commit to drivanov/incubator-mxnet that referenced this pull request Sep 27, 2019
Copy link
Member

@ptrendx ptrendx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ptrendx ptrendx merged commit 06438ab into apache:master Oct 15, 2019
aaronmarkham pushed a commit to aaronmarkham/incubator-mxnet that referenced this pull request Oct 16, 2019
* MXNET version of numpy.allclose operation implemented

* Helper function test_bulking renamed to _test_bulking

* Will use mx.test_utils.assert_allclose and not a numpy version of similar function

* Trigger CI

* Trigger CI

* Will use mx.test_utils.assert_allclose

* Trigger CI

* Trigger CI

* Problem with missed _test_bulking() function fixed

* Fixing minor bug in error reporting

* Trigger CI

* Trigger CI

* retrigger CI

* Fixing problems in discrepancies printout in assert_almost_equal

* Trigger CI

* Trigger CI

* Improved version of MxNet allclose operator

* Fixing minor problem in attribite definition for allclose operator

* retrigger CI

* Minor problem fixed

* Trigger CI

* try to fix find_max_violation

* Trigger CI

* Skip 'test_bulking_gluon_gpu()' test

* Fixing bug in reporting MaxErrors for NaN coordinates.

* use smaller testcase for test_layer_norm

* remove redundant test for test_layer_norm

* reuse old testcase

* retrigger CI

* ci

* ci

* Merge problem fixed

* Fixing Python's lint problem

* Trigger CI

* Trigger CI

* Trigger CI

* Trigger CI

* Trigger CI
@drivanov drivanov deleted the mxnet_allclose branch October 16, 2019 16:38
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Operator pr-awaiting-review PR is waiting for code review Test
Projects
None yet
Development

Successfully merging this pull request may close these issues.