[MXNET-737]Add last batch handle for imageiter #12131

stu1130 · 2018-08-10T22:55:42Z

Description

Add last_batch_handle parameter to ImageIter based on #11883

last_batch_handle support pad(default), discard, roll_over

Note that reading record files(.rec) without shuffle(i.e. sequential read) didn't support 'discard'

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
Changes are complete (i.e. I finished coding on this PR)
All changes have test coverage:
Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
Code is well-documented:
For user-facing API changes, API doc string has been updated.
For new C++ functions in header files, their functionalities and arguments are documented.
For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

Add last_batch_handle feature to ImageIter and adjust test_imageiter to test last_batch_handle parameter
Change the shuffle behavior to the same as NDArrayIter where shuffling the data only happen during the iterator initialization

Comments

N/A
@zhreshold

zhreshold · 2018-08-13T17:50:34Z

python/mxnet/image/image.py

@@ -1059,16 +1059,21 @@ class ImageIter(io.DataIter):
        Label name for provided symbols.
    dtype : str
        Label data type. Default: float32. Other options: int32, int64, float64
+    last_batch_hanle :  str, optional


use last_batch to be consistent with https://mxnet.incubator.apache.org/api/python/gluon/data.html?highlight=batchify_fn#mxnet.gluon.data.BatchSampler

The original NDArrayIter call it last_batch_handle but I think the name last_batch already give much information of what it does. So I would go with your suggestion.

zhreshold · 2018-08-13T17:52:12Z

python/mxnet/image/image.py

@@ -1059,16 +1059,21 @@ class ImageIter(io.DataIter):
        Label name for provided symbols.
    dtype : str
        Label data type. Default: float32. Other options: int32, int64, float64
+    last_batch_hanle :  str, optional
+        How to handle the last batch. This parameter can be ‘pad’, ‘discard’ or ‘roll_over’.
+        'discard' is not support when reading from record file(.rec) withouting shuffle(=False)


why is discard not supported explicitly? We can add a warning to this behavior

and please add explanation what each option stands for

The reason why discard is not supported is that when we read the rec file sequentially we don't know how many images in the file. Therefore, there is no way we can precalculate the number of images we need to discard. The only two solutions that I came up with is

iterate the file during the initialization of data iterator

allow users to input the number of the images

The first solution would take lots of time if the file is large during the initialization. The second one is not user-friendly. So I decided to give up this option.

I think you can check if valid batch.shape[0] is smaller than batch_size without checking how many images are there.

zhreshold · 2018-08-13T17:55:13Z

python/mxnet/image/image.py

@@ -1149,22 +1157,53 @@ def __init__(self, batch_size, data_shape, label_width=1,
        else:
            self.auglist = aug_list
        self.cur = 0
+        self.is_iterated_over = False


please make internal state variables private with leading underscore

zhreshold · 2018-08-13T18:11:59Z

python/mxnet/image/image.py

+            if pad != 0:
+                _ = self.iterate(batch_data, batch_label, i, self._imgrec)
+
+            if self.last_batch_handle != 'roll_over':


is it possible to cache last batch in memory if roll_over? I think open/closing recordio is problematic

…_batch' 2. remove the extra RecordIO and refactor the code accordingly 3. support 'discard' for sequentially reading

…r both sequential read and random access

sandeep-krishnamurthy

Nice work!
Welcome to MXNet community. Thanks for your first contributions!

Few changes requested.

sandeep-krishnamurthy · 2018-08-15T05:41:58Z

python/mxnet/image/image.py

+            if self.imgrec is not None:
+                self.imgrec.reset()
+            self.cur = 0
+        self._is_allowed_reading = True


why do we need to set this again?

We need to reset RecordIO, current cursor as long as the last_batch is not roll_over. As for self._is_allowed_reading, it needs to be set to True after reset. Consider the situation where the data iter with 'pad' or 'discard' have iterated till the last_batch, the self._is_allowed_reading is set to False(After did the padding, the value would set to False in case users call the next()), we need to set it back to True to prepare for next() function call.

sandeep-krishnamurthy · 2018-08-15T05:43:27Z

python/mxnet/image/image.py


    def next_sample(self):
        """Helper function for reading in next sample."""
+        if self._is_allowed_reading is False:
+            raise StopIteration


Please add user comprehensible message when raising errors.

StopIteration is normal behavior, not for users

Yup my bad. It is built in exception for iterators. Please ignore my comment.

sandeep-krishnamurthy · 2018-08-15T05:46:48Z

python/mxnet/image/image.py

+            if self.cur < self.num_image:
+                idx = self.seq[self.cur]
+            else:
+                if self.last_batch != 'discard':


should this be != discard and == rollover?

No matter it is pad or roll_over, we need to "reset the cursor"(assign 0 to the self.cur) to do the padding

sandeep-krishnamurthy · 2018-08-15T05:48:35Z

python/mxnet/image/image.py

+    def next(self):
+        """Returns the next batch of data."""
+        batch_size = self.batch_size
+        c, h, w = self.data_shape


Are we assuming always channels_first?

sandeep-krishnamurthy · 2018-08-15T05:51:33Z

tests/python/unittest/test_image.py

+                    path_imglist=path_imglist, path_root='', dtype=dtype)
+                for _ in range(3):
+                    for batch in test_iter:
+                        pass


can we assert the returned batch size and shape here?

zhreshold · 2018-08-15T06:44:25Z

python/mxnet/image/image.py

+        # handle the last batch
+        if self.seq and last_batch == 'discard':
+            new_seq_n = len(self.seq) - len(self.seq) % batch_size
+            self.seq = self.seq[:new_seq_n]


this is not correct, you now change the behavior from initial shuffle to per epoch shuffle, so self.seq must be shuffled in reset, and the slice can not be fixed.

According to NDArrayIter, it shuffle only during the initialization. But you are right. We should shuffle for each epoch.

zhreshold · 2018-08-15T06:44:54Z

python/mxnet/image/image.py

                raise StopIteration
            header, img = recordio.unpack(s)
            return header.label, img

-    def next(self):
-        """Returns the next batch of data."""
+    def iterate(self, batch_data, batch_label, start=0):


use _collect_batch as name instead

actually _batchify could be better IMO

zhreshold · 2018-08-15T06:45:26Z

python/mxnet/image/image.py

+        pad = batch_size - i
+        # handle padding for sequential read
+        if pad != 0:
+            if self.seq is not None:


the logic of last_batch here is kind weird

…on. Now we check it in next() function 2. move shuffle back to reset function

…ta are not correct

stu1130 · 2018-08-15T23:58:53Z

python/mxnet/image/image.py

+        # calculate the padding
+        pad = batch_size - i
+        # handle padding for 'pad' and 'roll_over' for the last batch
+        if pad != 0:


refactor the padding logic here

zhreshold

Essentially LGTM now, please address remaining comments and we are good to merge.

zhreshold · 2018-08-17T17:32:37Z

python/mxnet/image/image.py

            random.shuffle(self.seq)
        if self.imgrec is not None:
            self.imgrec.reset()
        self.cur = 0
+        self._is_allowed_reading = True


self._allow_read for shorter but same meaning

zhreshold · 2018-08-17T17:34:02Z

tests/python/unittest/test_image.py

+                first_batch_roll_over_twice = test_iter.next()
+                assert np.array_equal(
+                    first_batch_roll_over_twice.data[0][2].asnumpy(), first_image.asnumpy())
+                assert first_batch_roll_over_twice.pad == 1


assert second epoch with size 6?
Also can you add test for shuffle=True, just for sanity test, no value assert is required for shuffle mode.

stu1130

All the code change is completed.
Thanks, @zhreshold and @sandeep-krishnamurthy

stu1130 · 2018-08-17T18:27:59Z

tests/python/unittest/test_image.py

+                # test the third epoch with size 6
+                assert i == 6
+                # test shuffle option for sanity test
+                test_iter = mx.image.ImageIter(3, (3, 224, 224), label_width=1, imglist=imglist, shuffle=True,


add shuffle test case

stu1130 · 2018-08-17T18:28:30Z

tests/python/unittest/test_image.py

+                for _ in test_iter:
+                    i += 1
+                # test the third epoch with size 6
+                assert i == 6


test epoch size

sandeep-krishnamurthy

LGTM.
Thank you @stu1130 @zhreshold

sandeep-krishnamurthy · 2018-08-17T20:29:35Z

python/mxnet/image/image.py

+            if self.imgrec is not None:
+                self.imgrec.reset()
+            self.cur = 0
+            if self._allow_read is False:


nit: just self._allow_read = True would work here?

yes, it works as well. Only iter with 'pad' would change this flag though.

[MXNET-737]Add last batch handle for imageiter

stu1130 added 3 commits August 10, 2018 14:41

Add last_batch_handle on ImageIter

9263a56

Test the last_batch_handle for both of imglist and path_imglist

c3a7dc5

fix the filename typo

79f3002

stu1130 requested a review from szha as a code owner August 10, 2018 22:55

sandeep-krishnamurthy added Gluon pr-work-in-progress PR is still work in progress labels Aug 12, 2018

zhreshold suggested changes Aug 13, 2018

View reviewed changes

stu1130 added 7 commits August 13, 2018 17:27

change the name of parameter from 'last_batch_handle' to 'last_batch'

0eb4573

1. change the name of the parameter from 'last_batch_handle' to 'last…

c214f5b

…_batch' 2. remove the extra RecordIO and refactor the code accordingly 3. support 'discard' for sequentially reading

fix character '\xe2' encoding issue

a66452c

fix roll_over bug happened when calling reset() several times

0f47661

unify the logic of how to deal with 'discard', 'pad', ''roll_over' fo…

10aed35

…r both sequential read and random access

remove the test case used locally

94c96a6

fix the bad indentation

3ffbb06

sandeep-krishnamurthy suggested changes Aug 15, 2018

View reviewed changes

zhreshold suggested changes Aug 15, 2018

View reviewed changes

stu1130 added 4 commits August 15, 2018 09:16

assert batch data shape

fcd7310

1. delete the piece of code that handle the discard when initializati…

7dc8150

…on. Now we check it in next() function 2. move shuffle back to reset function

update the test case when we call reset several times, the pad and da…

0802ae7

…ta are not correct

delete logs we don't need

4ed6756

stu1130 commented Aug 15, 2018

View reviewed changes

stu1130 added 4 commits August 15, 2018 23:00

refine some code comment

a4b207d

change the roll_over behavior

c384e9a

change the unit test according to the latest roll_over behavior

bc48833

fix hard_reset bug which misses to clear the cache data

d7adf64

szha requested review from zhreshold and removed request for szha August 16, 2018 23:05

zhreshold suggested changes Aug 17, 2018

View reviewed changes

stu1130 added 2 commits August 17, 2018 10:49

refine minor variable name

e4a16db

assert second epoch size and add shuffle test case for sanity check

ab6e601

check the third epoch instead of second epoch

5f8a659

stu1130 commented Aug 17, 2018

View reviewed changes

sandeep-krishnamurthy approved these changes Aug 17, 2018

View reviewed changes

zhreshold approved these changes Aug 17, 2018

View reviewed changes

sandeep-krishnamurthy changed the title ~~[MXNET-737][WIP] Add last batch handle for imageiter~~ [MXNET-737]Add last batch handle for imageiter Aug 18, 2018

sandeep-krishnamurthy merged commit afb77f8 into apache:master Aug 18, 2018

XinYao1994 pushed a commit to XinYao1994/incubator-mxnet that referenced this pull request Aug 29, 2018

[MXNET-737]Add last batch handle for imageiter (apache#12131)

356be33

[MXNET-737]Add last batch handle for imageiter

zhreshold mentioned this pull request Dec 1, 2018

ImageDetIter looping forever in MXNet-1.3.0 #13037

Closed

stu1130 deleted the add_last_batch_handle_on_imageiter branch January 10, 2020 21:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[MXNET-737]Add last batch handle for imageiter #12131

[MXNET-737]Add last batch handle for imageiter #12131

stu1130 commented Aug 10, 2018 •

edited

Loading

zhreshold Aug 13, 2018

stu1130 Aug 13, 2018

zhreshold Aug 13, 2018

zhreshold Aug 13, 2018

stu1130 Aug 13, 2018 •

edited

Loading

zhreshold Aug 13, 2018

zhreshold Aug 13, 2018

zhreshold Aug 13, 2018

sandeep-krishnamurthy left a comment

sandeep-krishnamurthy Aug 15, 2018

stu1130 Aug 15, 2018

sandeep-krishnamurthy Aug 15, 2018

zhreshold Aug 15, 2018

sandeep-krishnamurthy Aug 15, 2018

sandeep-krishnamurthy Aug 15, 2018

stu1130 Aug 15, 2018

sandeep-krishnamurthy Aug 15, 2018

stu1130 Aug 15, 2018

sandeep-krishnamurthy Aug 15, 2018

zhreshold Aug 15, 2018

stu1130 Aug 15, 2018 •

edited

Loading

zhreshold Aug 15, 2018

zhreshold Aug 15, 2018

zhreshold Aug 15, 2018

stu1130 Aug 15, 2018

zhreshold left a comment

zhreshold Aug 17, 2018

zhreshold Aug 17, 2018

stu1130 left a comment

stu1130 Aug 17, 2018

stu1130 Aug 17, 2018

sandeep-krishnamurthy left a comment

sandeep-krishnamurthy Aug 17, 2018

stu1130 Aug 17, 2018

[MXNET-737]Add last batch handle for imageiter #12131

[MXNET-737]Add last batch handle for imageiter #12131

Conversation

stu1130 commented Aug 10, 2018 • edited Loading

Description

Checklist

Essentials

Changes

Comments

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

stu1130 Aug 13, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sandeep-krishnamurthy left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

stu1130 Aug 15, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

zhreshold left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

stu1130 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sandeep-krishnamurthy left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

stu1130 commented Aug 10, 2018 •

edited

Loading

stu1130 Aug 13, 2018 •

edited

Loading

stu1130 Aug 15, 2018 •

edited

Loading