Memory usage for multi-label classification problem #2113
Comments
@tqchen We should tune down the default prefetching buffer size.
@piiswrong I should have noted that I have already changed the number of prefetched minibatches from 4 -> 1 on this line: https://github.com/dmlc/mxnet/blob/master/src/io/iter_image_recordio.cc#L325 and recompiled. This did not fix the out-of-memory error.
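For reference, `prefetch_buffer` is also exposed as a Python-side keyword of `ImageRecordIter`, so the prefetch depth can be lowered without recompiling. A minimal sketch, with placeholder paths and shapes:

```python
import mxnet as mx

# Lower the prefetch depth at iterator creation time instead of editing
# the C++ default; the path and shapes here are placeholders.
data_iter = mx.io.ImageRecordIter(
    path_imgrec="train.rec",
    data_shape=(3, 224, 224),
    batch_size=32,
    prefetch_buffer=1,  # default is 4
)
```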
@piiswrong @tqchen It seems the problem occurs when passing the labels in directly. Would it be better to use a data iter for the labels?
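One way to realize the "data iter for labels" suggestion would be to stream the labels from disk with `CSVIter` rather than embedding them in the image list. A sketch under that assumption; the file name and shapes are hypothetical:

```python
import mxnet as mx

# Hypothetical sketch: stream one row of 8600 label values per example
# from a CSV file instead of holding the full label matrix in RAM.
label_iter = mx.io.CSVIter(
    data_csv="labels.csv",
    data_shape=(8600,),
    batch_size=32,
)
```

Combining this with the image iterator batch by batch would still need a small custom iterator wrapper, but it avoids materializing the whole label matrix in memory.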
I'm getting similar issues on an EC2 instance, even with prefetch_buffer=1.
@martinbel, my experience is that the labels are loaded into RAM all at once (I have >100 GB of labels to load). I'm using a g2.8xlarge, which only has 60 GB of RAM.
@EasonD3 You mean your training set has 100 GB of images? I'm not sure what you mean by "labels". I've converted the images to the RecordIO format.
@martinbel, sorry about the confusion. My problem is very similar to the one stated in the original post, where each image is associated with hundreds of thousands of labels/classes (i.e., a multi-label problem). If you feed all the labels in through the list file, they get loaded into RAM at once.
@EasonD3 In my case the label file isn't huge, but I get the same error. I guess something isn't working well with the ImageRecordIter.
#4299 Maybe related; I have the same problem.
Same problem.
This issue is closed due to lack of activity in the last 90 days. Feel free to reopen if this is still an active issue. Thanks!
Hi,
I am trying to train with 1M training examples and 8600 labels per example on a g2.2xlarge (15 GB of system memory). The code crashes with an out-of-memory error (std::bad_alloc) when creating the mx.io.ImageRecordIter for the training data. I am initializing mx.io.ImageRecordIter with both the RecordIO and image list files as follows:
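The original snippet was not preserved in this thread; the following is a minimal reconstruction, where only `label_width` reflects the report (8600 labels per example) and the paths, shapes, and batch size are placeholders:

```python
import mxnet as mx

# Reconstruction of the described setup; path_imgrec, path_imglist,
# data_shape, and batch_size are placeholders, not the actual values.
train_iter = mx.io.ImageRecordIter(
    path_imgrec="train.rec",   # RecordIO file
    path_imglist="train.lst",  # image list carrying the per-image labels
    label_width=8600,          # 8600 labels per example, as in the report
    data_shape=(3, 224, 224),
    batch_size=32,
)
```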
Does MXNet store the training labels in memory, or if not, how do the CPU memory requirements scale with number of labels and number of training examples? (My code works ok with 100k training examples.)
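A back-of-envelope estimate, assuming the labels are held in memory as 32-bit floats, suggests why 1M examples fail where 100k succeed:

```python
# Rough size of the full label matrix, assuming float32 storage.
examples = 1_000_000
labels_per_example = 8600
bytes_per_value = 4  # float32

print(examples * labels_per_example * bytes_per_value / 1024**3)
# ~32.0 GB: far above the g2.2xlarge's 15 GB. At 100k examples the
# same matrix is ~3.2 GB, which fits.
```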
Thanks,
G.