Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Showing proper error message when an attempt is made to create large tensor but MXNet is not built with it #16570

Merged
merged 7 commits into from
Oct 29, 2019

Conversation

access2rohit
Copy link
Contributor

@access2rohit access2rohit commented Oct 21, 2019

Description

Throwing error message to customers that they need to build MXNet with flag USE_INT64_TENSOR_SIZE=1 when user attempts to create large NDarrays for e.g.

  • nd.array()
  • nd.ones()
  • nd.zeros()
  • nd.full()
  • nd.random() etc.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Code is well-documented:
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Testing

ubuntu@ip-172-31-82-110 ~/incubator-mxnet (err_msg_int32) $ MXNET_TEST_COUNT=1 nosetests --logging-level=DEBUG --verbose -s tests/python/unittest/test_operator.py:test_large_tensor_disabled_err_msg
/home/ubuntu/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from 'float' to 'np.floating' is deprecated. In future, it will be treated as 'np.float64 == np.dtype(float).type'.
  from ._conv import register_converters as _register_converters
[INFO] Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=721807427 to reproduce.
test_operator.test_large_tensor_disabled_err_msg ... ok

----------------------------------------------------------------------
Ran 1 test in 0.042s

OK

@access2rohit
Copy link
Contributor Author

@mxnet-label-bot add [pr-work-in-progress]

@access2rohit access2rohit force-pushed the err_msg_int32 branch 5 times, most recently from 05c8ac1 to d035559 Compare October 25, 2019 00:30
Copy link
Contributor

@marcoabreu marcoabreu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to also add some tests?

python/mxnet/ndarray/ndarray.py Show resolved Hide resolved
@access2rohit
Copy link
Contributor Author

Would it be possible to also add some tests?

@marcoabreu I definitely will. But I need to make sure the functionality works fine first.

@marcoabreu
Copy link
Contributor

Sure, sorry for asking prematurely

@@ -272,10 +272,22 @@ inline bool InitShape(const nnvm::NodeAttrs& attrs,
CHECK_EQ(in_attrs->size(), 0U);
CHECK_EQ(out_attrs->size(), 1U);
mxnet::TShape param_shape = param.shape;
if (shape_is_known(param_shape) && !features::is_enabled(features::INT64_TENSOR_SIZE)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not directly check the compiler flag here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally prefer that way - looks cleaner :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in certain cases shape is not know at the beginning, then param_shape.Size() throws a different error, which is not a good user experience. That's why this check is required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@apeforest @marcoabreu do you think this makes sense now ?

@access2rohit access2rohit force-pushed the err_msg_int32 branch 2 times, most recently from bf57e72 to c9c8bb4 Compare October 26, 2019 00:20
@access2rohit
Copy link
Contributor Author

@mxnet-label-bot add [pr-awaiting-review]

@lanking520 lanking520 added the pr-awaiting-review PR is waiting for code review label Oct 26, 2019
@access2rohit
Copy link
Contributor Author

access2rohit commented Oct 26, 2019

@apeforest @ChaiBapchya @zheng-da @marcoabreu This PR is ready for review

@access2rohit access2rohit changed the title [WIP]Showing proper error message when an attempt is made to create large tensor but MXNet is not built with it Showing proper error message when an attempt is made to create large tensor but MXNet is not built with it Oct 26, 2019
@access2rohit access2rohit force-pushed the err_msg_int32 branch 2 times, most recently from bbd0b30 to cf7dc07 Compare October 26, 2019 19:35
@zheng-da zheng-da merged commit 60d74bc into apache:master Oct 29, 2019
yajiedesign pushed a commit to yajiedesign/mxnet that referenced this pull request Nov 6, 2019
…tensor but MXNet is not built with it (apache#16570)

* added error message when creating NDarray and syncing copies to and from CPU

* bug fix for 1>>31 -> uint32_t{1}<<31 and showing error message when user attempts to pass large ndarrays as inputs to operators or large shape of inputs for memory allocation.

* adding error messages to other operators where user can create NDArray indirectly

* bug fix

* removing additional checks

* Revert "removing additional checks"

This reverts commit d035559.

* adding tests and comments. Removed int64 check from linspace
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pr-awaiting-review PR is waiting for code review pr-work-in-progress PR is still work in progress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants