Reducing memory footprint of one_hot for Large Array Testing #16136

access2rohit · 2019-09-10T22:32:19Z

Description

Use a 2-element NDArray of one-hot locations so that the one_hot operator test does not use too much memory.
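A minimal sketch of why fewer one-hot locations help, assuming the output holds one row of `depth` elements per location. The `one_hot` and `output_elements` helpers and the `LARGE_DEPTH` value are hypothetical pure-Python stand-ins for illustration, not the MXNet operator or the constants used in the test file.

```python
def one_hot(indices, depth):
    """Pure-Python stand-in for a one-hot encoder (not the MXNet operator)."""
    rows = []
    for idx in indices:
        row = [0] * depth   # one row of `depth` elements per location
        row[idx] = 1        # mark the "hot" position
        rows.append(row)
    return rows


def output_elements(num_indices, depth):
    """Number of elements in the one-hot output for the given input size."""
    return num_indices * depth


# Small demonstration: two locations, depth 5.
small = one_hot([1, 3], depth=5)
print(small)

# Hypothetical large depth, chosen only to show how the output scales.
LARGE_DEPTH = 100_000_000
print(output_elements(2, LARGE_DEPTH))    # two locations
print(output_elements(50, LARGE_DEPTH))   # fifty locations: 25x more elements
```

Under that assumption, the output grows linearly with the number of locations, so a two-element location array still exercises large-index behaviour while keeping the output as small as possible.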

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage:
Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
Code is well-documented:
For user-facing API changes, API doc string has been updated.
For new C++ functions in header files, their functionalities and arguments are documented.
For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Testing

$ MXNET_TEST_COUNT=1 nosetests --logging-level=DEBUG --verbose -s tests/nightly/test_large_array.py:test_one_hot
/home/ubuntu/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
test_large_array.test_one_hot ... ok

----------------------------------------------------------------------
Ran 1 test in 9.245s

OK

The full suite of tests in test_large_array.py is currently running.

access2rohit · 2019-09-10T22:32:37Z

@mxnet-label-bot add [pr-awaiting-review]

access2rohit · 2019-09-10T22:34:47Z

@apeforest @ChaiBapchya PR is ready for review

apeforest · 2019-09-10T22:42:29Z

tests/nightly/test_large_vector.py

-    a[0] = 1
-    a[-1] = 1
+    #default dtype of ndarray is float32 which cannot index elements over 2^32
+    a = nd.array([1], dtype=np.int64)


maybe you don't even need this array a?

just do a = nd.one_hot([1], LARGE_X)

access2rohit · 2019-09-10

It still expects an NDArray and errors out with `AssertionError: Argument indices must have NDArray type`, so we cannot avoid creating the array. Also, it keeps the default dtype as np.float32, so we have to explicitly give the dtype as int64.
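The dtype point can be illustrated without MXNet. The sketch below round-trips integers through IEEE 754 single precision using only the standard library; the `to_float32` helper is hypothetical and stands in for storing an index in a float32 array. Single precision represents consecutive integers exactly only up to 2^24, so large indices stored as float32 can silently change value, which is why an integer dtype is the safer choice for index arrays.

```python
import struct


def to_float32(value):
    """Round-trip a number through IEEE 754 single precision."""
    return struct.unpack("f", struct.pack("f", float(value)))[0]


exact_limit = 2 ** 24

# Integers up to 2**24 survive the round trip unchanged.
print(to_float32(exact_limit) == exact_limit)

# The next integer is no longer representable and gets rounded.
print(to_float32(exact_limit + 1) == exact_limit + 1)

# A large index (illustrative value) does not survive either.
large_index = 2 ** 32 + 1
print(int(to_float32(large_index)) == large_index)
```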

apeforest · 2019-09-10T22:44:12Z

Actually, after checking test_large_array.py, I think this test is redundant. We can remove it from test_large_vector.

marcoabreu added the pr-awaiting-review label (PR is waiting for code review) Sep 10, 2019

reducing memory footprint of one_hot for Large Tensor

2838fea

access2rohit force-pushed the fix_one_hot branch from 9c4c4a2 to 2838fea Compare September 10, 2019 22:34

apeforest reviewed Sep 10, 2019

access2rohit changed the title ~~Reducing memory footprint of one_hot for Large Vector Testing~~ Reducing memory footprint of one_hot for Large Array Testing Sep 10, 2019

removing one_hot from large_vector and changing large_array test to incorporate large vector part of the test

f9ce945

apeforest approved these changes Sep 11, 2019

apeforest merged commit e87995d into apache:master Sep 11, 2019
