Error in sst2 mission with BERT #691
Comments
@GeorgieJi thank you for reporting the issue. Since the error comes from the MXNet library, I think it is best handled there. Would you mind following the issue template and reporting the error there? I will help you there.
We found a similar regression in apache/mxnet#14872, where an invalid index access was recently introduced into MXNet. Would you mind installing an MXNet build from early April? For example: BTW, the issue will be fixed in apache/mxnet#14873.
@eric-haibin-lin Thanks for your suggestion. I downgraded MXNet to 1.5.0b20190412, and now it seems to be working well. I have another question: is the training process limited to a single CPU core? I would like all cores to work on the task in parallel.
@GeorgieJi sounds good. We will note the anecdote here to make sure the bug can be fixed and verified in MXNet.
cpu(0) uses all available cores on the CPU :)
@eric-haibin-lin Hmm, that's interesting. I use htop to monitor CPU usage on Ubuntu 18.04. When Gluon with BERT is running, only some of the cores are busy.
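One thing that may be worth checking when only some cores show up as busy is the OpenMP thread count. The stack trace in this issue goes through libgomp, so the sketch below assumes the CPU kernels are parallelized with OpenMP and that the standard `OMP_NUM_THREADS` variable controls how many threads they use. The helper name `env_with_omp_threads` is hypothetical, and nothing in this thread confirms that the setting changes what htop reports.

```python
import os


def env_with_omp_threads(num_threads=None, base_env=None):
    """Return a copy of the environment with OMP_NUM_THREADS set.

    Hypothetical helper: num_threads defaults to the number of logical
    cores reported by the OS. The input mapping is never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    if num_threads is None:
        num_threads = os.cpu_count() or 1  # cpu_count() may return None
    env["OMP_NUM_THREADS"] = str(num_threads)
    return env


if __name__ == "__main__":
    env = env_with_omp_threads()
    print("OMP_NUM_THREADS =", env["OMP_NUM_THREADS"])
    # To launch the fine-tuning script from this issue with that setting
    # (assumption: the script is in the current directory):
    # import subprocess
    # subprocess.run(
    #     ["python", "finetune_classifier.py", "--task_name", "SST"],
    #     env=env,
    # )
```

The variable is set in the environment of the launched process rather than inside the training script, because OpenMP runtimes typically read it once at startup.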
Hi,
Thanks for sharing.
When I run the command below in a terminal, it crashes and the run is interrupted.
The command is "python finetune_classifier.py --task_name SST --epochs 4 --batch_size 16 --accumulate 1 --optimizer bertadam --lr 2e-5 --log_interval 500".
The error is "Segmentation fault: 11
Stack trace:
[bt] (0) /home/bit0427/anaconda3/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x2981500) [0x7f16104ba500]
[bt] (1) /lib/x86_64-linux-gnu/libc.so.6(+0x3ef20) [0x7f16478cff20]
[bt] (2) /home/bit0427/anaconda3/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x285a260) [0x7f1610393260]
[bt] (3) /home/bit0427/anaconda3/bin/../lib/libgomp.so.1(+0x11bef) [0x7f1642d50bef]
[bt] (4) /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f1647c896db]
[bt] (5) /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7f16479b288f]
"
My Python is 3.6.8, MXNet is 1.5, and Gluon is the most recent version.
Thanks for your support.