This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

add a compiler flag to use int64 as tensor size #14570

Merged 37 commits into master from perf/large-tensor on Apr 23, 2019.
Commits (37). The diff below shows changes from 2 of them.

41351f3 use a compile flag to use int64 tensor size (apeforest, Mar 29, 2019)
e9bd3cc use personal mshadow repo (apeforest, Mar 29, 2019)
d8d21ed Merge remote-tracking branch 'upstream/master' into perf/large-tensor (apeforest, Apr 2, 2019)
caf8e7f update data type (apeforest, Apr 2, 2019)
0ea2cbc update make config (apeforest, Apr 2, 2019)
3a3c02f change size_t to index_t and add documentation (apeforest, Apr 9, 2019)
b1ca6dd update mshadow submodule to master (apeforest, Apr 15, 2019)
5443fd5 fix compilation warning (apeforest, Apr 15, 2019)
872255f fix compiler warning (apeforest, Apr 15, 2019)
4bd1805 fix compiler warning (apeforest, Apr 15, 2019)
08e9b10 fix compiler warning (apeforest, Apr 15, 2019)
3a4661a Merge remote-tracking branch 'upstream/master' into perf/large-tensor (apeforest, Apr 15, 2019)
d3d6cc6 fix compiler warning (apeforest, Apr 15, 2019)
7e3ed63 fix compiler error (apeforest, Apr 15, 2019)
54735db change nnvm::Tuple to mxnet::Tuple (apeforest, Apr 16, 2019)
5fd9ad1 Merge remote-tracking branch 'upstream/master' into perf/large-tensor (apeforest, Apr 16, 2019)
0758d0c fix compiler warning (apeforest, Apr 16, 2019)
a503ec5 fix compiler warning (apeforest, Apr 16, 2019)
cd9aa53 fix compiler warning (apeforest, Apr 16, 2019)
12559b1 fix compiler warning (apeforest, Apr 16, 2019)
a4e4a0c fix compiler warning (apeforest, Apr 16, 2019)
2399864 fix lint (apeforest, Apr 17, 2019)
334d775 update CI runtime_functions (apeforest, Apr 17, 2019)
826613a update runtime function (apeforest, Apr 17, 2019)
4412b90 correct runtime_functions (apeforest, Apr 17, 2019)
1047eb5 update runtime functions (apeforest, Apr 17, 2019)
97a1c08 add nightly test for large tensor (apeforest, Apr 17, 2019)
861b95e update Jenkins files to test new compiler flag (apeforest, Apr 17, 2019)
5054f8d Merge remote-tracking branch 'upstream/master' into perf/large-tensor (apeforest, Apr 17, 2019)
935389d Merge remote-tracking branch 'upstream/master' into perf/large-tensor (apeforest, Apr 18, 2019)
b86e630 fix CI (apeforest, Apr 18, 2019)
f7540d1 Merge remote-tracking branch 'upstream/master' into perf/large-tensor (apeforest, Apr 18, 2019)
d8b04b3 add runtime feature detect for the compiler flag (apeforest, Apr 19, 2019)
20221d6 change build from make to cmake (apeforest, Apr 19, 2019)
bc95113 fix CI (apeforest, Apr 19, 2019)
9c672b7 move tests to nightly (apeforest, Apr 20, 2019)
27584ea Merge remote-tracking branch 'upstream/master' into perf/large-tensor (apeforest, Apr 20, 2019)

Files changed

2 changes: 1 addition & 1 deletion .gitmodules

```diff
@@ -1,6 +1,6 @@
 [submodule "3rdparty/mshadow"]
   path = 3rdparty/mshadow
-  url = https://github.com/dmlc/mshadow.git
+  url = https://github.com/apeforest/mshadow.git
 [submodule "3rdparty/dmlc-core"]
   path = 3rdparty/dmlc-core
   url = https://github.com/dmlc/dmlc-core.git
```
2 changes: 1 addition & 1 deletion 3rdparty/mshadow (submodule commit pointer updated)
5 changes: 5 additions & 0 deletions Makefile

```diff
@@ -188,6 +188,11 @@ ifeq ($(USE_OPERATOR_TUNING), 1)
 CFLAGS += -DMXNET_USE_OPERATOR_TUNING=1
 endif
 
+ifeq ($(USE_INT64_TENSOR_SIZE), 1)
+CFLAGS += -DMSHADOW_INT64_TENSOR_SIZE=1
+else
+CFLAGS += -DMSHADOW_INT64_TENSOR_SIZE=0
+endif
 # verify existence of separate lapack library when using blas/openblas/atlas
 # switch off lapack support in case it can't be found
 # issue covered with this
```
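Note that the make flag only forwards a preprocessor define to the C++ build; the actual type switch happens inside mshadow's headers. Below is a minimal sketch of that pattern, assuming mshadow gates its index_t typedef on MSHADOW_INT64_TENSOR_SIZE (the exact header and typedef placement are assumptions here, not verified against the submodule):

```cpp
#include <cstdint>
#include <iostream>

#ifndef MSHADOW_INT64_TENSOR_SIZE
#define MSHADOW_INT64_TENSOR_SIZE 0  // default matches USE_INT64_TENSOR_SIZE = 0
#endif

namespace mshadow {
#if MSHADOW_INT64_TENSOR_SIZE == 1
typedef int64_t index_t;  // tensors may exceed 2^31 - 1 elements
#else
typedef int32_t index_t;  // legacy 32-bit behavior
#endif
}  // namespace mshadow

int main() {
  // Built with -DMSHADOW_INT64_TENSOR_SIZE=1 this prints 8, otherwise 4.
  std::cout << sizeof(mshadow::index_t) << std::endl;
  return 0;
}
```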
4 changes: 4 additions & 0 deletions make/config.mk

```diff
@@ -214,6 +214,10 @@ EXTRA_OPERATORS =
 # Create C++ interface package
 USE_CPP_PACKAGE = 0
 
+# Use int64_t type for tensor size.
+# This will cause performance degradation reported in issue #14496
+USE_INT64_TENSOR_SIZE = 0
+
 #----------------------------
 # plugins
 #----------------------------
```
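For readers wondering why the flag exists at all: with a 32-bit index type, the flattened element count of a sufficiently large tensor overflows. A self-contained illustration in plain C++, using hypothetical dimensions and independent of MXNet:

```cpp
#include <cstdint>
#include <iostream>

int main() {
  // A 50000 x 50000 tensor holds 2.5e9 elements, above INT32_MAX (2147483647).
  int64_t wide = 50000LL * 50000LL;             // 2500000000, correct
  int32_t narrow = static_cast<int32_t>(wide);  // truncates to -1794967296
  std::cout << "int64 count: " << wide << ", int32 count: " << narrow << std::endl;
  return 0;
}
```

The flag defaults to 0 because 64-bit index arithmetic slows some operators, which is the performance degradation tracked in issue #14496.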
4 changes: 4 additions & 0 deletions make/crosscompile.jetson.mk

```diff
@@ -191,6 +191,10 @@ EXTRA_OPERATORS =
 # Create C++ interface package
 USE_CPP_PACKAGE = 0
 
+# Use int64_t type for tensor size.
+# This will cause performance degradation reported in issue #14496
+USE_INT64_TENSOR_SIZE = 0
+
 #----------------------------
 # plugins
 #----------------------------
```
4 changes: 4 additions & 0 deletions make/osx.mk

```diff
@@ -135,6 +135,10 @@ EXTRA_OPERATORS =
 # Create C++ interface package
 USE_CPP_PACKAGE = 0
 
+# Use int64_t type for tensor size.
+# This will cause performance degradation reported in issue #14496
+USE_INT64_TENSOR_SIZE = 0
+
 #----------------------------
 # plugins
 #----------------------------
```
2 changes: 1 addition & 1 deletion src/operator/convolution_v1-inl.h

```diff
@@ -336,7 +336,7 @@ class ConvolutionV1Op : public Operator {
     // param_.workspace is in elements of sizeof(DType)
     // if param_.workspace is set to zero the nstep_ equals ishape[0] (batch)
     nstep_ = std::max<index_t>(
-        std::min(static_cast<index_t>(param_.workspace) /
+        std::min<index_t>(param_.workspace /
                  (shape_colunit_.Size() + shape_dstunit_.Size()), ishape[0]),
         1);
 
```
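The change is subtle but deliberate: supplying the template argument explicitly makes std::min convert both operands to index_t, so the manual cast goes away and the comparison stays in the configured index type even when the operands' types differ. A toy demonstration of the difference, with hypothetical variable names standing in for the operator's members:

```cpp
#include <algorithm>
#include <cstdint>

using index_t = int64_t;  // what the tensor index type becomes with the new flag

int main() {
  uint64_t workspace = 1ULL << 20;  // stand-in for param_.workspace
  index_t batch = 32;               // stand-in for ishape[0]

  // std::min(workspace, batch);  // would not compile: conflicting deduced types
  index_t a = std::min<index_t>(workspace, batch);               // new style
  index_t b = std::min(static_cast<index_t>(workspace), batch);  // old style
  return (a == b) ? 0 : 1;
}
```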
2 changes: 1 addition & 1 deletion src/operator/nn/deconvolution-inl.h

```diff
@@ -460,7 +460,7 @@ class DeconvolutionOp {
                                oshape[2] * oshape[3]);
     // See convolution for workspace calculations. nstep_ will be the effective batch size
     nstep_ = std::max<index_t>(
-        std::min(static_cast<index_t>(param_.workspace) /
+        std::min<index_t>(param_.workspace /
                  (shape_colunit_.Size() + shape_dstunit_.Size()), ishape[0]),
         1);
 
```
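Commit d8b04b3 also adds runtime feature detection for the flag, letting tests and user code ask whether the library was built with 64-bit tensor sizes. A minimal sketch of the idea in plain C++; the real MXNet feature-detection API is not shown here, and the names below are illustrative only:

```cpp
// Expose the compile-time flag as a runtime-queryable feature,
// in the spirit of commit d8b04b3. Illustrative names only.
#include <iostream>

bool int64_tensor_size_enabled() {
#if defined(MSHADOW_INT64_TENSOR_SIZE) && MSHADOW_INT64_TENSOR_SIZE == 1
  return true;
#else
  return false;
#endif
}

int main() {
  std::cout << "INT64_TENSOR_SIZE: "
            << (int64_tensor_size_enabled() ? "enabled" : "disabled") << std::endl;
  return 0;
}
```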