Conversation
Codecov Report
@@ Coverage Diff @@
## master #5313 +/- ##
===========================================
+ Coverage 73.80% 87.68% +13.87%
===========================================
Files 1022 244 -778
Lines 190617 44759 -145858
Branches 20488 1900 -18588
===========================================
- Hits 140686 39247 -101439
+ Misses 44411 5167 -39244
+ Partials 5520 345 -5175
Flags with carried forward coverage won't be shown. Click here to find out more. |
|
Marked as ready for review, though still looking at how to add a test for this. |
| while (_toConsumeChannel.Reader.WaitToReadAsync().GetAwaiter().GetResult()) | ||
| var waitToReadAwaiter = _toConsumeChannel.Reader.WaitToReadAsync().GetAwaiter(); | ||
|
|
||
| while (waitToReadAwaiter.IsCompleted && waitToReadAwaiter.GetResult()) |
There was a problem hiding this comment.
Can you add a comment here stating that the check to verify the operation is complete is necessary before calling GetResult()?
There was a problem hiding this comment.
Dear sir, I have same error with :
var option = new SdcaMaximumEntropyMulticlassTrainer.Options()
{
FeatureColumnName = "SearchModel",
LabelColumnName = "Label",
NumberOfThreads = 1
};
var maximumEntropy = ctx.MulticlassClassification.Trainers.SdcaMaximumEntropy(option);
var pipeline =ctx.Transforms.Conversion.MapValueToKey("Label", "BestDictMatchModelText")
.Append(ctx.Transforms.Text.FeaturizeText(inputColumnName: "VEHICLE_MODEL",
outputColumnName: "SearchModel"))
.Append(maximumEntropy)
.Append(ctx.Transforms.Conversion.MapKeyToValue("PredictedLabel"));
var part1 = decoratedWithFuzzyList;//.Take(1000);
var dataReader = ctx.Data.LoadFromEnumerable(part1);
ThisModel = pipeline.Fit(dataReader);
So, when part1 contains more than a 1000 items, Fit throws:
at System.Threading.Channels.AsyncOperation.ThrowIncompleteOperationException()
at System.Threading.Channels.AsyncOperation1.GetResult(Int16 token) at Microsoft.ML.Transforms.RowShufflingTransformer.Cursor.MoveNextCore() at Microsoft.ML.Data.RootCursorBase.MoveNext() at Microsoft.ML.Trainers.TrainingCursorBase.MoveNext() at Microsoft.ML.Trainers.SdcaTrainerBase3.TrainCore(IChannel ch, RoleMappedData data, LinearModelParameters predictor, Int32 weightSetCount)
at Microsoft.ML.Trainers.StochasticTrainerBase2.TrainModelCore(TrainContext context) at Microsoft.ML.Trainers.TrainerEstimatorBase2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor)
at Microsoft.ML.Trainers.TrainerEstimatorBase2.Fit(IDataView input) at Microsoft.ML.Data.EstimatorChain1.Fit(IDataView input)
Can you please consider to fix this?
There was a problem hiding this comment.
I have this same issue with version 1.5.0. I can fit 1046 rows in the shuffled set. 1047 rows will give the The asynchronous operation has not completed exception.
Tested with the daily 1.5.2 package, and now the error is gone.
It sounds like the samples was able to repro the issue: dotnet/machinelearning-samples#833 |
| while (_toConsumeChannel.Reader.WaitToReadAsync().GetAwaiter().GetResult()) | ||
| var waitToReadAwaiter = _toConsumeChannel.Reader.WaitToReadAsync().GetAwaiter(); | ||
|
|
||
| while (waitToReadAwaiter.IsCompleted && waitToReadAwaiter.GetResult()) |
There was a problem hiding this comment.
This is undo-ing what we were trying to avoid with #5123 (comment), which is just spinning a thread seeing if there is something to read.
I was mistaken that you could block synchronously on a Channel. Sorry about that.
What do you think about going back to what you had previously with just calling TryRead, and if that returns false, the only thing I can think of is to Thread.Sleep for some amount of time, even 1 ms, just to stop the thread from consuming a whole core spinning waiting for the producer. It would probably be a good idea to try it multiple ways and see what perf results you get.
There was a problem hiding this comment.
Thanks @eerhardt! Made an update for this. Just let me know if it's still not correct.
|
Since this has broken projects in the past, it would be really good to get a test for this to make sure it doesn't break in the future. |
|
@eerhardt Added a small scenario based on the taxi fare sample. Verified in a separate solution that it throws the error and running the test in this branch runs successfully. Definitely let me know if any changes are needed. Thanks for the help with this! |
|
|
||
| public static string DownloadTaxiFareData() | ||
| { | ||
| string githubPath = "https://raw.githubusercontent.com/dotnet/machinelearning-samples/master/samples/csharp/getting-started/Regression_TaxiFarePrediction/TaxiFarePrediction/Data/taxi-fare-train.csv"; |
There was a problem hiding this comment.
We already have this file in the repo. See
machinelearning/test/Microsoft.ML.Tests/FeatureContributionTests.cs
Lines 388 to 417 in 37edde9
We shouldn't need to download the file again.
| @@ -0,0 +1,40 @@ | |||
| using Xunit; | |||
There was a problem hiding this comment.
All code files should have a copyright at the top.
|
Thanks again for the help @eerhardt! Sorry for breaking it, but glad I can help with the fix. |
Fix for #5312
Setting as a draft to add a test.