Merged
Commits
31 commits
8d07401
Attempt on Issue 4169
mstfbl Sep 18, 2019
d80e23d
Further work on Issue_4169
mstfbl Sep 20, 2019
7b239b0
Temporary change for inquiry
mstfbl Sep 24, 2019
33e71ae
Pushing changes for inquiry
mstfbl Sep 26, 2019
2b1ab69
Implemented PredictedLabel as Categorical value (String/int). Now wor…
mstfbl Sep 26, 2019
7254a7e
Merge branch 'Issue_4169' of https://github.com/mstfbl/machinelearnin…
mstfbl Sep 26, 2019
1a43cc7
Added tests for Prediction Engine
mstfbl Sep 26, 2019
d39f98b
Removed the forwarding of DataViewSchema to the TrySinglePrediction f…
mstfbl Sep 26, 2019
8cf2b7c
Minor performance upgrade to avoid the array bounds check
mstfbl Sep 27, 2019
c4182fb
Update ImageClassificationTransform.cs
mstfbl Sep 30, 2019
d490383
Updated tests
mstfbl Oct 2, 2019
01c77c9
Merge branch 'Issue_4169' of https://github.com/mstfbl/machinelearnin…
mstfbl Oct 2, 2019
380d8bf
Revert "Merge branch 'Issue_4169' of https://github.com/mstfbl/machin…
mstfbl Oct 2, 2019
2af03ca
Revert "Updated tests"
mstfbl Oct 2, 2019
2c05ba8
Updated test files and corrected variable spellings
mstfbl Oct 2, 2019
6356665
Merge branch 'master' of https://github.com/dotnet/machinelearning in…
mstfbl Oct 2, 2019
ed928c6
Update ImageClassificationTransform.cs
mstfbl Oct 2, 2019
f7f8253
Merge branch 'master' of https://github.com/dotnet/machinelearning in…
mstfbl Oct 2, 2019
bd42f0a
Revert "Merge branch 'master' of https://github.com/dotnet/machinelea…
mstfbl Oct 2, 2019
f851791
Merge branch 'master' into Issue_4169
mstfbl Oct 2, 2019
643fb58
Deleted unused _outputTypes
mstfbl Oct 2, 2019
b6cfeda
Update ImageClassificationTransform.cs
mstfbl Oct 2, 2019
cca4fb8
Added test case to check the matching of predicted labels
mstfbl Oct 2, 2019
1c4c5dc
Update ImageClassificationTransform.cs
mstfbl Oct 2, 2019
2301a96
Update TensorflowTests.cs
mstfbl Oct 2, 2019
26e2ae1
Update TensorflowTests.cs
mstfbl Oct 2, 2019
94c0423
Update TensorflowTests.cs
mstfbl Oct 3, 2019
109879b
Fixed test case and off-by-one predictedLabel error
mstfbl Oct 3, 2019
6ff4c64
Merge remote-tracking branch 'upstream/master' into Issue_4169
mstfbl Oct 3, 2019
15c4654
Removed comments
mstfbl Oct 3, 2019
0573ef3
Replacing Path.Join with Path.Combine due to build error with Path.Join
mstfbl Oct 3, 2019
@@ -46,9 +46,13 @@ public static void Example()
  IDataView shuffledFullImagesDataset = mlContext.Data.ShuffleRows(
      mlContext.Data.LoadFromEnumerable(images));

- shuffledFullImagesDataset = mlContext.Transforms.Conversion
-     .MapValueToKey("Label")
-     .Fit(shuffledFullImagesDataset)
+ var estimator = mlContext.Transforms.Conversion
+     .MapValueToKey("Label");
+ var estimatorWithKeyType = estimator.Append(
+     mlContext.Transforms.Conversion.MapKeyToValue(
+         outputColumnName: "LabelAsKey", inputColumnName: "Label"));
+
+ shuffledFullImagesDataset = estimatorWithKeyType.Fit(shuffledFullImagesDataset)
[Comment thread, outdated: mstfbl marked this conversation as resolved.]
      .Transform(shuffledFullImagesDataset);

  // Split the data 90:10 into train and test sets, train and evaluate.
@@ -93,15 +97,16 @@ public static void Example()
  DataViewSchema schema;
  using (var file = File.OpenRead("model.zip"))
      loadedModel = mlContext.Model.Load(file, out schema);
+ // the schema in line 99 and the shuffledFullImagesDataset.Schema in line 93 don't have
+ // the same annotations.
[Comment thread, outdated: mstfbl marked this conversation as resolved.]

  EvaluateModel(mlContext, testDataset, loadedModel);

  VBuffer<ReadOnlyMemory<char>> keys = default;
  loadedModel.GetOutputSchema(schema)["Label"].GetKeyValues(ref keys);

@codemzs (Member), Sep 18, 2019:

You shouldn't need to do all this if you convert the predicted label column from key to value. It will output a string type containing the class names that correspond to the key indices. #Resolved
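
To illustrate the suggestion above, here is a minimal sketch (not code from this PR) of the key-to-value round trip in ML.NET. The row classes `LabeledRow` and `RoundTrippedRow`, the column names `LabelAsKey` and `LabelAsValue`, and the `KeyToValueSketch` class are hypothetical names chosen for the example, and it assumes the Microsoft.ML NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;

// Hypothetical input row: one string label per example.
public class LabeledRow
{
    public string Label { get; set; }
}

// Hypothetical output row: only the recovered string label is read back.
public class RoundTrippedRow
{
    public string LabelAsValue { get; set; }
}

public static class KeyToValueSketch
{
    // Maps string labels to keys, then maps the keys back to strings.
    public static string[] RoundTrip(IEnumerable<string> labels)
    {
        var mlContext = new MLContext(seed: 0);
        IDataView data = mlContext.Data.LoadFromEnumerable(
            labels.Select(l => new LabeledRow { Label = l }).ToList());

        // MapValueToKey stores the original strings as key-value annotations
        // on the key column; MapKeyToValue uses them to restore the strings.
        var pipeline = mlContext.Transforms.Conversion
            .MapValueToKey(outputColumnName: "LabelAsKey", inputColumnName: "Label")
            .Append(mlContext.Transforms.Conversion.MapKeyToValue(
                outputColumnName: "LabelAsValue", inputColumnName: "LabelAsKey"));

        IDataView transformed = pipeline.Fit(data).Transform(data);

        return mlContext.Data
            .CreateEnumerable<RoundTrippedRow>(transformed, reuseRowObject: false)
            .Select(r => r.LabelAsValue)
            .ToArray();
    }

    public static void Main()
    {
        var recovered = RoundTrip(new[] { "daisy", "rose", "daisy", "tulip" });
        Console.WriteLine(string.Join(", ", recovered));
    }
}
```

In the sample under review, the same `MapKeyToValue` step would presumably be applied to the `PredictedLabel` column, which only works once that column is a key type carrying the label's key-value annotations.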


  watch = System.Diagnostics.Stopwatch.StartNew();
- TrySinglePrediction(fullImagesetFolderPath, mlContext, loadedModel,
-     keys.DenseValues().ToArray());
+ TrySinglePrediction(fullImagesetFolderPath, mlContext, loadedModel, keys.DenseValues().ToArray(), shuffledFullImagesDataset.Schema);

  watch.Stop();
  elapsedMs = watch.ElapsedMilliseconds;
@@ -119,8 +124,7 @@ public static void Example()
  }

  private static void TrySinglePrediction(string imagesForPredictions,
-     MLContext mlContext, ITransformer trainedModel,
-     ReadOnlyMemory<char>[] originalLabels)
+     MLContext mlContext, ITransformer trainedModel, ReadOnlyMemory<char>[] originalLabels, DataViewSchema schema)
  {
      // Create prediction function to try one prediction
      var predictionEngine = mlContext.Model
@@ -135,6 +139,7 @@ private static void TrySinglePrediction(string imagesForPredictions,
  };

  var prediction = predictionEngine.Predict(imageToPredict);
+ var predictedLabelsKeyType = ((DataViewSchema.Column)schema.GetColumnOrNull("Label")).Annotations;
  var index = prediction.PredictedLabel;
[Comment thread, outdated: mstfbl marked this conversation as resolved.]

  Console.WriteLine($"ImageFile : " +
4 changes: 2 additions & 2 deletions src/Microsoft.ML.Dnn/ImageClassificationTransform.cs
@@ -878,7 +878,7 @@ protected override DataViewSchema.DetachedColumn[] GetOutputColumnsCore()
      {
          var info = new DataViewSchema.DetachedColumn[_parent._outputs.Length];
          info[0] = new DataViewSchema.DetachedColumn(_parent._outputs[0], new VectorDataViewType(NumberDataViewType.Single, _parent._classCount), null);
-         info[1] = new DataViewSchema.DetachedColumn(_parent._outputs[1], NumberDataViewType.UInt32, null);
+         info[1] = new DataViewSchema.DetachedColumn(_parent._outputs[1], new KeyDataViewType(typeof(uint), _parent._classCount), ((DataViewSchema.Column)InputSchema.GetColumnOrNull("Label")).Annotations);
[Comment thread, outdated: mstfbl marked this conversation as resolved.]
          return info;
      }
  }
@@ -1166,7 +1166,7 @@ internal ImageClassificationEstimator(IHostEnvironment env, Options options, Dnn
  _options = options;
  _dnnModel = dnnModel;
  _tfInputTypes = new[] { TF_DataType.TF_STRING };
- _outputTypes = new[] { new VectorDataViewType(NumberDataViewType.Single), NumberDataViewType.UInt32.GetItemType() };
+ _outputTypes = new DataViewType[] { new VectorDataViewType(NumberDataViewType.Single), new KeyDataViewType(typeof(uint), 5) };
[Comment thread, outdated: mstfbl marked this conversation as resolved.]

@yaeldekel, Sep 19, 2019:

_outputTypes doesn't need to be an array of DataViewType. All the estimator needs to know is whether the length of the score vector is variable or not (seems to me that it is typically not variable, or at least this information can be inferred from the DnnModel).
If you get rid of this field, you will not have to guess the size of the key (which is also not needed by the estimator). #Resolved

@mstfbl (Contributor Author), Sep 19, 2019:

This is a good catch, thank you! Quick question: the Image Classification example ResnetV2101TransferLearningTrainTestSplit.cs runs perfectly well when I don't further define _outputTypes, i.e., delete line 69. Does this also mean that the estimator isn't using this field? #Resolved

@mstfbl (Contributor Author), Sep 19, 2019:

Actually, I've found that the DataViewType field is needed for the pipeline in ResnetV2101TransferLearningTrainTestSplit.cs to fit properly. #Resolved

Reply (commenter not shown):

Can you elaborate? The only place _outputTypes is used is in GetOutputSchema (and it is actually being used incorrectly there - see my comments there). What is not working properly in the example you mentioned?

In reply to: 326329163

@mstfbl (Contributor Author), Sep 20, 2019:

It was giving a different error when I removed _outputTypes, but as I now see, that was not the main problem I was having while implementing this KeyType solution. #Resolved

Reply (commenter not shown):

The field _outputTypes can be deleted. The only place where it is used is in line 1281 where you have _outputTypes[0].GetItemType(). Instead of that you can write NumberDataViewType.Single, and then the field is not needed.

In reply to: 326229209
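
As a small sanity check of the point above, the following sketch (not code from this PR) shows that the item type of a `Single` vector type is just `NumberDataViewType.Single`, and that a key type can take its cardinality from a class count rather than a hard-coded `5`. `OutputTypeSketch`, `ScoreType`, `PredictedLabelType`, and the `classCount` value are hypothetical; the sketch reads the public `ItemType` property instead of the `GetItemType()` helper seen in the diff, and assumes the Microsoft.ML NuGet package is referenced.

```csharp
using System;
using Microsoft.ML.Data;

public static class OutputTypeSketch
{
    // Score column type: a fixed-size vector of floats, one slot per class.
    public static VectorDataViewType ScoreType(int classCount) =>
        new VectorDataViewType(NumberDataViewType.Single, classCount);

    // Predicted-label column type: a key whose cardinality is the class count.
    public static KeyDataViewType PredictedLabelType(int classCount) =>
        new KeyDataViewType(typeof(uint), classCount);

    public static void Main()
    {
        int classCount = 5; // hypothetical; in the PR this comes from the model
        VectorDataViewType score = ScoreType(classCount);

        // The item type is the NumberDataViewType.Single singleton, so it can
        // be written directly without keeping a separate array of types.
        Console.WriteLine(ReferenceEquals(score.ItemType, NumberDataViewType.Single));
        Console.WriteLine(score.IsKnownSize);
        Console.WriteLine(PredictedLabelType(classCount).Count);
    }
}
```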

@mstfbl (Contributor Author), Oct 2, 2019:

I missed this, thank you for the catch! #Resolved

}

private static Options CreateArguments(DnnModel tensorFlowModel, string[] outputColumnNames, string[] inputColumnName, bool addBatchDimensionInput)