getting segfault while running train_cifar10.py program in example directory #12800
Comments
Thanks for reporting this issue, @Vikas89.
I think a fix was made in the relevant area of the code: dmlc/dmlc-core@e3377de#diff-855ba648d1f4003608aa37ba3d060043. I will retry this on the latest code and check whether I can still reproduce it. I will keep this thread updated.
I am no longer seeing this issue, so I am closing it.
I encountered the issue again; sorry for the confusion. If I revert the changes made in this commit in the dmlc-core submodule, the issue is gone. My changes in dmlc-core/src/io/input_split_base.cc, which fix the problem:
Adding newlines caused apache/mxnet#12800. TODO: Add a test case
@Vikas89 I was not able to reproduce the bug with dmlc-core hash 0a0e8addf92e1287fd7a25c6314016b8c0138dee. On the other hand, I was able to reproduce it when the dmlc-core submodule was brought up to the latest master. Which commit hash for dmlc-core were you using when running the example?
@Vikas89 @piyushghai I created a minimal reproduction of the bug:

#include <dmlc/io.h>
#include <memory>  // std::unique_ptr
#include <string>
#include <utility>
#include <vector>

int main(int argc, char** argv) {
  // Open the RecordIO file as a single partition (part 0 of 1).
  std::unique_ptr<dmlc::InputSplit> source(
      dmlc::InputSplit::Create("./cifar10_val.rec", 0, 1, "recordio"));
  source->BeforeFirst();
  dmlc::InputSplit::Blob rec;
  size_t sum = 0;
  // Read every record; the crash occurs while iterating.
  while (source->NextRecord(&rec)) {
    sum += rec.size;
  }
  return 0;
}

It shows that the bug was introduced when dmlc/dmlc-core#452 was merged. For a running example, see https://github.com/hcho3/mxnet-issue12800-repro.
Submitted dmlc/dmlc-core#471, along with a test case.
* Do not add newline for RecordIO InputSplit
  Adding newlines caused apache/mxnet#12800. TODO: Add a test case
* Add a test case
* Cache value of IsTextParser() in a tight loop to avoid virtual function call
* Address reviewer comment: use tiny example
* Fix typo
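The idea behind the first and third items above can be sketched roughly as follows. This is only an illustration, not the code from dmlc/dmlc-core#471: the class names SplitterSketch, TextSplitterSketch and BinarySplitterSketch and the Join() helper are hypothetical, and only the name IsTextParser() comes from the commit message. The sketch assumes that chunks of input are stitched into one buffer, that a separator newline is wanted only for text formats, and that the virtual call is evaluated once before the loop.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an input-split base class; NOT the dmlc-core API.
class SplitterSketch {
 public:
  virtual ~SplitterSketch() = default;

  // True when the data consists of newline-delimited text records.
  virtual bool IsTextParser() const = 0;

  // Concatenate chunks into one buffer. A separator newline is appended
  // only for text formats; binary data is copied byte for byte.
  std::string Join(const std::vector<std::string>& chunks) const {
    // Evaluate the virtual function once, outside the loop.
    const bool is_text = IsTextParser();
    std::string out;
    for (const std::string& chunk : chunks) {
      out += chunk;
      if (is_text && !chunk.empty() && chunk.back() != '\n') {
        out += '\n';
      }
    }
    return out;
  }
};

// Hypothetical text-format splitter.
class TextSplitterSketch : public SplitterSketch {
 public:
  bool IsTextParser() const override { return true; }
};

// Hypothetical binary-format splitter (standing in for a RecordIO-like format).
class BinarySplitterSketch : public SplitterSketch {
 public:
  bool IsTextParser() const override { return false; }
};

// Small usage check: text chunks gain separators, binary chunks do not.
void CheckSketch() {
  const std::vector<std::string> chunks = {"ab", "cd\n", "ef"};
  assert(TextSplitterSketch().Join(chunks) == "ab\ncd\nef\n");
  assert(BinarySplitterSketch().Join(chunks) == "abcd\nef");
}
```

In the binary case the output is exactly the concatenation of the input bytes, which is what a length-framed reader needs; any inserted byte would shift every offset that follows it.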
@hcho3 Is this still an issue, or has it been fixed by your PR?
This is fixed.
I am trying to run this command:
python example/image-classification/train_cifar10.py
and I get a segfault. It reproduces very consistently.
Environment info
build command: make -j8 USE_DIST_KVSTORE=1
OS: macOS
I am trying to build and reproduce this on a Linux machine.