This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

CPP examples training acc does not increase #13243

Closed
roywei opened this issue Nov 13, 2018 · 4 comments
Labels
C++ Related to C++ Example

Comments

@roywei
Member

roywei commented Nov 13, 2018

Description

I found this problem in PR: #13185
Following examples here:
https://github.com/apache/incubator-mxnet/tree/master/cpp-package/example

During training, the accuracy does not change at all.

Models affected so far:

  1. inception-bn
  2. resnet

I will continue to check the other models.

Error Message:

[02:17:51] src/io/iter_mnist.cc:110: MNISTIter: load 60000 images, shuffle=1, shape=(50,784)
[02:17:52] src/io/iter_mnist.cc:110: MNISTIter: load 10000 images, shuffle=1, shape=(50,784)
[02:17:52] resnet.cpp:199: Epoch: 0
[02:18:06] resnet.cpp:219: Train Accuracy: 0.0987167
[02:18:08] resnet.cpp:230: Validation Accuracy: 0.098
[02:18:08] resnet.cpp:199: Epoch: 1
[02:18:22] resnet.cpp:219: Train Accuracy: 0.0987167
[02:18:24] resnet.cpp:230: Validation Accuracy: 0.098
[02:18:24] resnet.cpp:199: Epoch: 2
[02:18:38] resnet.cpp:219: Train Accuracy: 0.0987167
[02:18:40] resnet.cpp:230: Validation Accuracy: 0.098
[02:18:40] resnet.cpp:199: Epoch: 3
[02:18:54] resnet.cpp:219: Train Accuracy: 0.0987167
[02:18:56] resnet.cpp:230: Validation Accuracy: 0.098
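
For context: MNIST has 10 classes, so a classifier that guesses at random scores about 0.1. The logged 0.0987167 and 0.098 sit at that level and repeat unchanged across epochs. Below is a minimal sketch of a check that flags this pattern, assuming the per-epoch accuracies have been collected in a vector. `IsStuckAtChance` and its tolerance are hypothetical names for illustration, not part of the MXNet cpp-package.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not an MXNet API): returns true when every recorded
// accuracy stays within `tol` of the chance level 1 / num_classes.
bool IsStuckAtChance(const std::vector<double>& acc_per_epoch,
                     int num_classes, double tol = 0.01) {
  assert(num_classes > 0);
  if (acc_per_epoch.empty()) return false;  // nothing recorded yet
  const double chance = 1.0 / num_classes;
  for (double acc : acc_per_epoch) {
    if (std::fabs(acc - chance) > tol) return false;  // moved away from chance
  }
  return true;
}
```

Called with the four train accuracies from the log above and `num_classes = 10`, the helper reports that training is stuck at chance level.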

Minimum reproducible example

(If you are using your own code, please provide a short script that reproduces the error. Otherwise, please provide link to the existing example.)

Steps to reproduce

Follow the instructions at:
https://github.com/apache/incubator-mxnet/tree/master/cpp-package/example
then run:

./resnet

What have you tried to solve it?

  1. Changing the learning rate and weight decay does not help.
  2. Switching from the deprecated ccsgd optimizer to sgd does not help.
  3. With the same setup, the Python version works fine and its accuracy increases every epoch; the C++ version does not.
@roywei
Member Author

roywei commented Nov 13, 2018

@mxnet-label-bot add[C++, Example]

@zachgk
Contributor

zachgk commented Nov 13, 2018

@mxnet-label-bot add [C++, Example]

@roywei
Member Author

roywei commented Dec 4, 2018

Fixed in #13284.

@roywei roywei closed this as completed Dec 4, 2018
@ZHEQIUSHUI

Hey, did you get this fixed? I have the same issue: the accuracy and loss change, but only slightly, and the accuracy does not increase.

[14:41:36] F:\BaiduNetdiskDownload\incubator-mxnet\src\io\iter_mnist.cc:110: MNISTIter: load 60000 images, shuffle=1, shape=(16,784)
[14:41:36] F:\BaiduNetdiskDownload\incubator-mxnet\src\io\iter_mnist.cc:110: MNISTIter: load 10000 images, shuffle=1, shape=(16,784)
[14:41:36] F:\BaiduNetdiskDownload\incubator-mxnet\src\executor\graph_executor.cc:2061: Subgraph backend MKLDNN is activated.
[14:41:36] F:\Code\mxnet_classify_train\train.cpp:126: Epoch: 0
[14:41:36] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 1 Train Accuracy: 0.125 Train Loss: 2.30259
[14:41:37] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 2 Train Accuracy: 0.1875 Train Loss: 2.30146
[14:41:37] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 3 Train Accuracy: 0.1875 Train Loss: 2.30162
[14:41:37] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 4 Train Accuracy: 0.171875 Train Loss: 2.30361
[14:41:37] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 5 Train Accuracy: 0.15 Train Loss: 2.30592
[14:41:37] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 6 Train Accuracy: 0.135417 Train Loss: 2.30718
[14:41:38] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 7 Train Accuracy: 0.142857 Train Loss: 2.30081
[14:41:38] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 8 Train Accuracy: 0.132813 Train Loss: 2.30486
[14:41:38] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 9 Train Accuracy: 0.145833 Train Loss: 2.3007
[14:41:38] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 10 Train Accuracy: 0.13125 Train Loss: 2.31614
[14:41:39] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 11 Train Accuracy: 0.119318 Train Loss: 2.3079
[14:41:39] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 12 Train Accuracy: 0.114583 Train Loss: 2.30001
[14:41:39] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 13 Train Accuracy: 0.105769 Train Loss: 2.30863
[14:41:39] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 14 Train Accuracy: 0.102679 Train Loss: 2.3078
[14:41:39] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 15 Train Accuracy: 0.1125 Train Loss: 2.29704
[14:41:40] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 16 Train Accuracy: 0.109375 Train Loss: 2.30283
[14:41:40] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 17 Train Accuracy: 0.117647 Train Loss: 2.2879
[14:41:40] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 18 Train Accuracy: 0.111111 Train Loss: 2.30765
[14:41:40] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 19 Train Accuracy: 0.108553 Train Loss: 2.29667
[14:41:41] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 20 Train Accuracy: 0.10625 Train Loss: 2.30143
[14:41:41] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 21 Train Accuracy: 0.104167 Train Loss: 2.31742
[14:41:41] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 22 Train Accuracy: 0.107955 Train Loss: 2.3004
[14:41:41] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 23 Train Accuracy: 0.105978 Train Loss: 2.30668
[14:41:41] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 24 Train Accuracy: 0.104167 Train Loss: 2.3081
[14:41:42] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 25 Train Accuracy: 0.105 Train Loss: 2.31609
[14:41:42] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 26 Train Accuracy: 0.103365 Train Loss: 2.2924
[14:41:42] F:\Code\mxnet_classify_train\train.cpp:148: EPOCH: 0 ITER: 27 Train Accuracy: 0.101852 Train Loss: 2.29345
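
One observation on the log above: the first loss value, 2.30259, equals ln(10). That is the softmax cross-entropy of a 10-class model that gives every class the same probability, and the later values stay close to it. The sketch below illustrates this, assuming a standard softmax cross-entropy loss, since the loss actually used in `train.cpp` is not shown in the log. `SoftmaxCrossEntropy` is a hypothetical helper, not an MXNet function.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an MXNet API): softmax cross-entropy for a single
// sample, computed from raw logits and the index of the true class.
double SoftmaxCrossEntropy(const std::vector<double>& logits,
                           std::size_t label) {
  assert(label < logits.size());
  // Subtract the largest logit before exponentiating for numerical stability.
  const double max_logit = *std::max_element(logits.begin(), logits.end());
  double sum_exp = 0.0;
  for (double z : logits) sum_exp += std::exp(z - max_logit);
  // -log(softmax[label]) = log(sum_exp) - (logits[label] - max_logit)
  return std::log(sum_exp) - (logits[label] - max_logit);
}
```

With ten equal logits the function returns ln(10), about 2.30259, whichever label is chosen. A loss that hovers around this value therefore suggests the outputs are still close to uniform.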
