This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Reducing memory footprint of one_hot for Large Array Testing #16136

Merged
merged 2 commits into from
Sep 11, 2019

@access2rohit access2rohit commented Sep 10, 2019

Description

Use a 2-element NDArray of one-hot locations so that the one_hot operator test does not use too much memory.
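
A rough way to see why the number of locations matters: a dense one-hot result holds one row of length depth per input index, so its size grows with the index count times depth. The sketch below is plain Python arithmetic, not MXNet code. The helper name one_hot_bytes and the DEPTH value are illustrative assumptions, and the 4-byte item size assumes a float32 output.

```python
def one_hot_bytes(num_indices, depth, itemsize=4):
    """Estimate the bytes in a dense one-hot output of shape (num_indices, depth)."""
    return num_indices * depth * itemsize


# Hypothetical depth, chosen only for illustration (not taken from the test file).
DEPTH = 100_000_000

two_rows = one_hot_bytes(2, DEPTH)
fifty_rows = one_hot_bytes(50, DEPTH)

print(f"2 indices:  {two_rows / 2**30:.2f} GiB")
print(f"50 indices: {fifty_rows / 2**30:.2f} GiB")
```

Under these assumptions the output shrinks in direct proportion to the number of indices, which is the effect the description refers to.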

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
      • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
      • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
      • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
      • For user-facing API changes, the API doc string has been updated.
      • For new C++ functions in header files, their functionalities and arguments are documented.
      • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
      • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Testing

$ MXNET_TEST_COUNT=1 nosetests --logging-level=DEBUG --verbose -s tests/nightly/test_large_array.py:test_one_hot
/home/ubuntu/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
test_large_array.test_one_hot ... ok

----------------------------------------------------------------------
Ran 1 test in 9.245s

OK

Currently running the full suite of tests in test_large_array.py.

@access2rohit
Contributor Author

@mxnet-label-bot add [pr-awaiting-review]

@marcoabreu marcoabreu added the pr-awaiting-review PR is waiting for code review label Sep 10, 2019
@access2rohit
Contributor Author

@apeforest @ChaiBapchya PR is ready for review

a[0] = 1
a[-1] = 1
#default dtype of ndarray is float32 which cannot index elements over 2^32
a = nd.array([1], dtype=np.int64)
Contributor

@apeforest apeforest Sep 10, 2019

maybe you don't even need this array a?

just do a = nd.one_hot([1], LARGE_X)

Contributor Author

@access2rohit access2rohit Sep 10, 2019

It still expects an NDArray; passing a plain list fails with "AssertionError: Argument indices must have NDArray type", so we cannot avoid creating the array. Also, the default dtype is np.float32, so we have to pass dtype=np.int64 explicitly.
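
The dtype concern can be checked without MXNet. float32 has a 24-bit significand, so not every integer above 2^24 is representable, and a large index stored as float32 may silently round to a neighbouring value. The sketch below round-trips a few integers through single precision using only the standard library. The helper name to_float32 is an illustrative assumption, not part of any MXNet API.

```python
import struct


def to_float32(value):
    """Round-trip a number through IEEE 754 single precision."""
    return struct.unpack("f", struct.pack("f", float(value)))[0]


# Compare each candidate index with what float32 actually stores.
for index in (2**24, 2**24 + 1, 2**32 + 1):
    stored = int(to_float32(index))
    status = "exact" if stored == index else "rounded"
    print(f"{index} -> {stored} ({status})")
```

An int64 index array avoids this rounding, which is why the dtype is given explicitly in the test.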

@apeforest
Contributor

Actually, after checking test_large_array.py, I think this test is redundant. We can remove it from test_large_vector.

@access2rohit access2rohit changed the title from "Reducing memory footprint of one_hot for Large Vector Testing" to "Reducing memory footprint of one_hot for Large Array Testing" Sep 10, 2019
@apeforest apeforest merged commit e87995d into apache:master Sep 11, 2019
access2rohit added a commit to access2rohit/incubator-mxnet that referenced this pull request Sep 25, 2019
…16136)

* reducing memory footprint of one_hot for Large Tensor

* removing one_hot from large_vector and changing large_array test to incorporate large vector part of the test
larroy pushed a commit to larroy/mxnet that referenced this pull request Sep 28, 2019
…16136)

* reducing memory footprint of one_hot for Large Tensor

* removing one_hot from large_vector and changing large_array test to incorporate large vector part of the test