Image Classification API: Fix processing incomplete batch(<batchSize), images processed per epoch , enable EarlyStopping without Validation Set. Fixes #4274 and #4286 #4289

Merged: ashbhandare merged 6 commits into dotnet:master from ashbhandare:Issue4274 on Oct 4, 2019

@ashbhandare ashbhandare commented Oct 3, 2019

1) Previously, if the remaining images were too few to fill a batch of batchSize, that batch was not processed. Training and validation now process the incomplete batch.
2) batchIndex was not reset when the last batch was incomplete (< batchSize). It is now reset.
3) EarlyStopping did not trigger when no validation set was provided. Fixed.

fixes #4274 #4286
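
The first two fixes can be illustrated with a minimal sketch. This is not the code from ImageClassificationTransform.cs; the class and method names (BatchingSketch, Train) are hypothetical, and the "processing" step only records the size of each batch. It assumes batchIndex is declared outside the epoch loop, which is why it has to be reset after a partial batch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch, not the ML.NET implementation.
public static class BatchingSketch
{
    // Returns the size of every batch that gets processed across all epochs.
    public static List<int> Train(float[] items, int batchSize, int epochs)
    {
        var processedSizes = new List<int>();
        var buffer = new float[batchSize];
        int batchIndex = 0; // lives across epochs, so it must be reset explicitly

        for (int epoch = 0; epoch < epochs; epoch++)
        {
            foreach (float item in items)
            {
                buffer[batchIndex++] = item;
                if (batchIndex == batchSize)
                {
                    processedSizes.Add(batchIndex); // full batch
                    batchIndex = 0;
                }
            }

            // Leftover items that did not fill a whole batch are still processed.
            if (batchIndex > 0)
            {
                processedSizes.Add(batchIndex); // incomplete batch (< batchSize)
                batchIndex = 0;                 // reset so the next epoch starts clean
            }
        }

        return processedSizes;
    }

    public static void Main()
    {
        // 7 items, batchSize 3, 2 epochs: each epoch yields batches of 3, 3 and 1.
        var sizes = Train(new float[7], 3, 2);
        Console.WriteLine(string.Join(",", sizes)); // prints 3,3,1,3,3,1
    }
}
```

Without the final reset, the second epoch in this sketch would start with batchIndex already at 1 and its batch boundaries would drift.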

@ashbhandare ashbhandare requested a review from a team as a code owner October 3, 2019 19:59
…,detected edge case where early stopping not supported.
if (_session.graph.OperationByName(_labelTensor.name.Split(':')[0]) == null)
throw Host.ExceptParam(nameof(options.TensorFlowLabel), $"'{options.TensorFlowLabel}' does not exist in the model");
if (options.EarlyStoppingCriteria != null && options.ValidationSet == null && options.TestOnTrainSet == false)
throw Host.ExceptParam(nameof(options.EarlyStoppingCriteria), $"No Validation dataset provided and testing on Train disabled, Early Stopping not supported.");
"No Validation dataset provided and testing on Train disabled, Early Stopping not supported."

Let's improve the exception message: "Early stopping enabled but unable to find a validation set and/or train set testing disabled. Please disable early stopping or either provide a validation set or enable train set training."
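
As a rough illustration of the check under discussion, the sketch below rejects the one configuration in which early stopping has nothing to measure. SketchOptions is a hypothetical stand-in for the trainer options (only the three property names come from the diff), and ArgumentException stands in for Host.ExceptParam; the message text is the wording proposed in the comment above.

```csharp
using System;

// Hypothetical stand-in for the trainer options; only the property names
// EarlyStoppingCriteria, ValidationSet and TestOnTrainSet come from the diff.
public sealed class SketchOptions
{
    public object EarlyStoppingCriteria { get; set; }
    public object ValidationSet { get; set; }
    public bool TestOnTrainSet { get; set; }
}

public static class EarlyStoppingCheck
{
    // Throws when early stopping is requested but no metric source exists.
    public static void Validate(SketchOptions options)
    {
        bool earlyStopping = options.EarlyStoppingCriteria != null;
        bool hasMetricSource = options.ValidationSet != null || options.TestOnTrainSet;

        if (earlyStopping && !hasMetricSource)
        {
            throw new ArgumentException(
                "Early stopping enabled but unable to find a validation set and/or " +
                "train set testing disabled. Please disable early stopping or either " +
                "provide a validation set or enable train set training.",
                nameof(options));
        }
    }

    public static void Main()
    {
        // Early stopping with train-set testing enabled is accepted.
        Validate(new SketchOptions { EarlyStoppingCriteria = new object(), TestOnTrainSet = true });

        try
        {
            // No validation set and train-set testing disabled: rejected.
            Validate(new SketchOptions { EarlyStoppingCriteria = new object(), TestOnTrainSet = false });
        }
        catch (ArgumentException)
        {
            Console.WriteLine("rejected");
        }
    }
}
```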

if (batchIndex > 0)
{
featureTensorShape[0] = batchIndex;
featureBatchSizeInBytes = sizeof(float) * featureBatch.Length * batchIndex / batchSize;
@codemzs codemzs Oct 4, 2019

sizeof(float) * featureBatch.Length * batchIndex / batchSize;

Suggestion: sizeof(float) * featureLength * batchIndex

if(batchIndex > 0)
{
featureTensorShape[0] = batchIndex;
featureBatchSizeInBytes = sizeof(float) * featureBatch.Length * batchIndex / batchSize;
sizeof(float) * featureBatch.Length * batchIndex / batchSize;

Suggestion: sizeof(float) * featureLength * batchIndex
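
A small sketch of why the two expressions agree, assuming featureBatch holds featureLength * batchSize floats (that layout is an assumption here, not something this page states). The class name, method names and the example dimensions are hypothetical. The sketch uses long arithmetic to keep the intermediate product well within range; the variable types in the actual source are not shown in this conversation.

```csharp
using System;

// Hypothetical sketch comparing the two byte-size expressions from the review.
public static class BatchByteSize
{
    // Expression from the diff: scale the full buffer size by batchIndex / batchSize.
    public static long Scaled(long featureBatchLength, long batchIndex, long batchSize)
        => sizeof(float) * featureBatchLength * batchIndex / batchSize;

    // Reviewer's suggestion: size of one sample times the number of samples present.
    public static long Direct(long featureLength, long batchIndex)
        => sizeof(float) * featureLength * batchIndex;

    public static void Main()
    {
        long featureLength = 224L * 224 * 3; // assumed image size, for illustration only
        long batchSize = 10;
        long batchIndex = 3;                 // samples in the incomplete batch
        long featureBatchLength = featureLength * batchSize;

        Console.WriteLine(Scaled(featureBatchLength, batchIndex, batchSize)); // prints 1806336
        Console.WriteLine(Direct(featureLength, batchIndex));                 // prints 1806336
    }
}
```

The direct form avoids both the division and the larger intermediate product, which is presumably why it was suggested.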

@codemzs codemzs left a comment

:shipit:

codemzs commented Oct 4, 2019

@ashbhandare Please also remove the WIP tag.

@ashbhandare ashbhandare changed the title from "(WIP) Image Classification API: Fix processing incomplete batch(<batchSize), images processed per epoch , enable EarlyStopping without Validation Set. Fixes #4274 and #4286" to "Image Classification API: Fix processing incomplete batch(<batchSize), images processed per epoch , enable EarlyStopping without Validation Set. Fixes #4274 and #4286" on Oct 4, 2019
codecov bot commented Oct 4, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@82f83a6).
The diff coverage is 62.71%.

@@            Coverage Diff            @@
##             master    #4289   +/-   ##
=========================================
  Coverage          ?   74.59%           
=========================================
  Files             ?      878           
  Lines             ?   154277           
  Branches          ?    16874           
=========================================
  Hits              ?   115082           
  Misses            ?    34448           
  Partials          ?     4747

Flag          Coverage Δ
#Debug        74.59% <62.71%> (?)
#production   70.18% <62.71%> (?)
#test         89.54% <ø> (?)

Impacted Files                                         Coverage Δ
...c/Microsoft.ML.Dnn/ImageClassificationTransform.cs  87.44% <62.71%> (ø)

@ashbhandare ashbhandare merged commit ffc9e9b into dotnet:master Oct 4, 2019
@ashbhandare ashbhandare deleted the Issue4274 branch October 4, 2019 23:16
codemzs pushed a commit that referenced this pull request Oct 8, 2019
…, images processed per epoch , enable EarlyStopping without Validation Set. Fixes #4274 and #4286     (#4289)

* In ImageClassification, process incomplete batch where number of samples < batchSize.

* fixed batchIndex not resetting in train loop, enabled EarlyStopping when validationSet is not given for ImageClassificationAPI

* fixed changing shape of feature and label tensor for incomplete batch, detected edge case where early stopping not supported.

* Improved featureBatchSizeInBytes calculation, improved exception message.
LittleLittleCloud added a commit that referenced this pull request Aug 28, 2020
* Buffer re-use using ArrayPool and a few more checks (#4293)

* commit b468adb
Author: Harshitha Parnandi Venkata <havenka@microsoft.com>
Date:   Tue Oct 1 21:19:57 2019 -0700

    Fixed a bug in the unit test for image classification

commit 30aa4d1
Author: Harshitha Parnandi Venkata <havenka@microsoft.com>
Date:   Tue Oct 1 20:43:17 2019 -0700

    addressed Zeeshan's comments

commit 3d4f5fe
Merge: 0fbd3d2 718a238
Author: Harshitha Parnandi Venkata <havenka@microsoft.com>
Date:   Tue Oct 1 20:41:21 2019 -0700

    Merge branch 'master' of https://github.com/dotnet/machinelearning into ImageClassificationVBuf

commit 0fbd3d2
Author: Harshitha Parnandi Venkata <havenka@microsoft.com>
Date:   Tue Oct 1 17:10:49 2019 -0700

    Changed type to useImageType in LoadImages(). Changed appropriate variable names in ImageClassificationTransform.cs

commit 2417888
Merge: 3ad26b4 4944be7
Author: Harshitha Parnandi Venkata <havenka@microsoft.com>
Date:   Tue Oct 1 16:55:25 2019 -0700

    Merge branch 'master' of https://github.com/dotnet/machinelearning into ImageClassificationVBuf

commit 3ad26b4
Author: Harshitha Parnandi Venkata <havenka@microsoft.com>
Date:   Tue Oct 1 15:59:06 2019 -0700

    Added buffer re-use while reading the image in netstandard 2.0. Addressed Eric's comments. Changed ImageLoadingTransformer to take a bool type instead of a DataViewType to make it user friendly. (type = true means we are using VBuffer<byte> , type = false means we are using ImageDataViewType)

commit c67dd08
Author: Harshitha Parnandi Venkata <havenka@microsoft.com>
Date:   Tue Oct 1 09:50:52 2019 -0700

    Added functionality to load images as VBuffer<byte> in ImageLoader. If no DataViewType options are provided it defaults to loading images as ImageDataViewType. Made LoadImages a part of the sample in ResnetV2101TransferLearningTrainTestSplit.cs. Addressed some of the comments from Zeeshan and Yael. Added a unit test for testing the API. Added TargetFrameworks to get cross platform functionality for System.IO.Stream.Read(Span<Byte>) which doesn't work for netstandard2.0.

commit ae2ac0d
Author: Harshitha Parnandi Venkata <havenka@microsoft.com>
Date:   Wed Sep 25 14:49:41 2019 -0700

    Added some edits to address Yael's comments

commit b1e5739
Author: Harshitha Parnandi Venkata <havenka@microsoft.com>
Date:   Wed Sep 25 13:24:03 2019 -0700

    Added unit test for the change

commit acf985d
Author: Harshitha Parnandi Venkata <havenka@microsoft.com>
Date:   Mon Sep 23 10:39:07 2019 -0700

    Changed the calling function back to how it was in master

commit b80f7ad
Author: Harshitha Parnandi Venkata <havenka@microsoft.com>
Date:   Mon Sep 23 10:20:31 2019 -0700

    Added a few optimizations to re-use buffers and thereby improving performance.

commit b106ae0
Author: Harshitha Parnandi Venkata <havenka@microsoft.com>
Date:   Thu Sep 19 14:07:15 2019 -0700

    Changed Image Classification API to take in a VBuffer<byte> type instead of ImagePath.

* fixed merge conflicts

* Fixed some unit tests that were failing after the merge. Addressed a few comments.

* Fixed TensorFlow unit tests

* Changed the buffer re-use logic for ReadToEnd

* Changed ReadToEnd function to read using span instead of unsafe blocks

* removed unnecessary commits

* Added version check with backward compatibility. Addressed Zeeshan's comments.

* Fixed tab and synced to master

* Addressed comments. Checkpoint commit

* changed the solution files and version check in ImageLoader.cs

* Added changes for StableApi.csproj

* Added ArrayPool for buffer re-use

* Handled the case when MakeGetter src is empty we need to send an empty VBuffer. Another check for handling empty images.

* Addressed comments

* Image Classification API: Fix processing incomplete batch(<batchSize), images processed per epoch , enable EarlyStopping without Validation Set. Fixes #4274 and #4286     (#4289)

* In ImageClassification, process incomplete batch where number of samples < batchSize.

* fixed batchIndex not resetting in train loop, enabled EarlyStopping when validationSet is not given for ImageClassificationAPI

* fixed changing shape of feature and label tensor for incomplete batch, detected edge case where early stopping not supported.

* Improved featureBatchSizeInBytes calculation, improved exception message.

* upgrade to 3.1

* write inline data using invariantCulture

* Update: ModelBuilder codegen for Object Detection

* updated testing files to give better test results

* Refactored test code

* trying to test performance

* fix commit

* Adding changes from prior commit

* small changes to finalize codegen

* Made changes based on csproj

* minor changes to make final build

* updated onnxruntime to 1.3

* targeting older automl

* taking out dependency on automl taskkind for OD

* final build got predictions working for OD

* took out old test paths to generalize tests

* cleaning up outdated comments

* for the build packaging

* rebuild

* fix tests

* fix build error

* fix e2e bug

* fix e2e bug

* Update Microsoft.ML.CodeGenerator.nupkgproj

* Update Microsoft.ML.CodeGenerator.csproj

* remove .approved.txt that not used

Co-authored-by: harshithapv <54084812+harshithapv@users.noreply.github.com>
Co-authored-by: ashbhandare <aibhanda@microsoft.com>
Co-authored-by: LittleLittleCloud <bigmiao.zhang@gmail.com>
Co-authored-by: Tevin <t-testan@microsoft.com>
@ghost ghost locked as resolved and limited conversation to collaborators Mar 20, 2022
Development

Successfully merging this pull request may close these issues.

[Image Classification API] No evaluation when batchSize parameter > # of instances in dataset

2 participants