Skip to content

Add public generic methods to TextLoader catalog that accept Options objects#5134

Merged
antoniovs1029 merged 2 commits intodotnet:masterfrom
antoniovs1029:textLoaderAPI
May 16, 2020
Merged

Add public generic methods to TextLoader catalog that accept Options objects#5134
antoniovs1029 merged 2 commits intodotnet:masterfrom
antoniovs1029:textLoaderAPI

Conversation

@antoniovs1029
Copy link
Contributor

The TextLoaderSaverCatalog contains 2 overloads of non-generic CreateTextLoader and LoadFromTextFile; one overload accepts a given set of a parameters and the other overload accepts a TexLoader.Options object.

The generic methods for CreateTextLoader<TInput> and LoadFromTextFile<TInput> only have 1 method that accepts a given set of parameters. So in this PR I add another overload for those 2 methods that also accepts an Options object.

The reason to do this is that if we add new parameters to TextLoader (for example, as done in #5125) then they're only accessible through a TextLoader.Options (so without this PR CreateTextLoader<TInput> and LoadFromTextFile<TInput> won't be able to use the new options). Notice that we can't simply add a new parameter to the existing CreateTextLoader<TInput> and LoadFromTextFile<TInput> methods, since that is considered a breaking API change, and we can't do those until ML.NET version 2.

@antoniovs1029 antoniovs1029 requested a review from a team as a code owner May 15, 2020 10:34
@antoniovs1029 antoniovs1029 requested a review from harishsk May 15, 2020 10:34
@codecov
Copy link

codecov bot commented May 15, 2020

Codecov Report

Merging #5134 into master will decrease coverage by 0.23%.
The diff coverage is 56.52%.

@@            Coverage Diff             @@
##           master    #5134      +/-   ##
==========================================
- Coverage   75.83%   75.60%   -0.24%     
==========================================
  Files         951      990      +39     
  Lines      172613   179227    +6614     
  Branches    18632    19292     +660     
==========================================
+ Hits       130904   135497    +4593     
- Misses      36529    38462    +1933     
- Partials     5180     5268      +88     
Flag Coverage Δ
#Debug 75.60% <56.52%> (-0.24%) ⬇️
#production 71.50% <56.52%> (+0.05%) ⬆️
#test 88.85% <ø> (-1.64%) ⬇️
Impacted Files Coverage Δ
....AutoML/API/RunDetails/CrossValidationRunDetail.cs 100.00% <ø> (ø)
...rc/Microsoft.ML.AutoML/API/RunDetails/RunDetail.cs 95.45% <ø> (ø)
....AutoML/ColumnInference/ColumnGroupingInference.cs 85.00% <ø> (ø)
...t.ML.AutoML/ColumnInference/ColumnTypeInference.cs 85.46% <ø> (+1.76%) ⬆️
...rosoft.ML.AutoML/ColumnInference/TextFileSample.cs 62.25% <0.00%> (ø)
...icrosoft.ML.AutoML/Experiment/SuggestedPipeline.cs 88.65% <ø> (ø)
....ML.AutoML/PipelineSuggesters/PipelineSuggester.cs 83.19% <ø> (ø)
src/Microsoft.ML.AutoML/Sweepers/Parameters.cs 84.32% <ø> (ø)
src/Microsoft.ML.AutoML/Sweepers/SmacSweeper.cs 93.47% <ø> (ø)
...plates/Azure/Model/AzureAttachImageConsumeModel.cs 0.00% <0.00%> (ø)
... and 444 more

IMultiStreamSource dataSample = null)
{
Options options = new Options
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the plan to add readMultiLines as an option in a later PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readMultilines is already added to the TextLoader.Options class in my other PR, and it will always default as false, if its value isn't provided.

The method here, that you've pointed to, is used by this public method in the catalog. I can't add a new readMultilines parameter there precisely because it would be a breaking API change. So it doesn't make much sense to me to add a readMultilines option here, and so TextLoader.Options will simply use the default for it (which is false).

In the future, if a user (or we internally) want to use the new options (such as readMultiline), we would simply use a TextLoader.Options object and the overloads for creating TextLoaders that uses the option object. I think it's better and easier to maintain and update only the options object than an overload with several parameters that can't be changed in the public API.

Copy link
Contributor

@harishsk harishsk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@antoniovs1029 antoniovs1029 merged commit b4bff87 into dotnet:master May 16, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Mar 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants