Issue 3234: use model schema type instead of class definition schema#5228
Issue 3234: use model schema type instead of class definition schema#5228frank-dong-ms-zz merged 3 commits intodotnet:masterfrom
Conversation
| public string[] CategoricalFeatures; | ||
| public float[] NumericalFeatures; | ||
| #pragma warning restore SA1401 | ||
| public float Label; |
There was a problem hiding this comment.
Label [](start = 25, length = 5)
add extra column Label as the model modelwithoptionalcolumntransform's schema contains an extra Label column
Codecov Report
@@ Coverage Diff @@
## master #5228 +/- ##
==========================================
- Coverage 73.47% 73.45% -0.02%
==========================================
Files 1010 1010
Lines 187988 187984 -4
Branches 20262 20261 -1
==========================================
- Hits 138118 138092 -26
- Misses 44385 44404 +19
- Partials 5485 5488 +3
|
|
|
||
| //Always use column type from model as this type can be more specific. | ||
| //This can be corner case: | ||
| //For example, we can load an model whose schema contains Vector<Single, 38> |
There was a problem hiding this comment.
Will there be a check or at least logging for this corner case? Maybe we should put in a check for arrays of size 0, which is what float[] would be. Also is there a check for the size of provided arrays matching the size of the corresponding Vector?
There was a problem hiding this comment.
test LoadModelWithOptionalColumnTransform is to test this case.
for length check, it is applied at below:
https://github.com/dotnet/machinelearning/blob/master/src/Microsoft.ML.Data/Scorers/SchemaBindablePredictorWrapper.cs#L129-L147
In reply to: 439585282 [](ancestors = 439585282)
| @@ -230,15 +231,17 @@ public class ModelOutput | |||
| [Fact] | |||
| public void LoadModelWithOptionalColumnTransform() | |||
| { | |||
There was a problem hiding this comment.
Does this fix invalidate the old test?
It appears that the the previous test was testing this using SchemaDefinition and to reproduce this issue you are using the one DataViewSchema.
Does it make sense to test both code paths not replace the previous one?
There was a problem hiding this comment.
No, the old test is still valid. I'm ok with test both code path not just replace as I don't find a test is test the previous case directly.
This test was originally created at below PR:
#3700
From the conversation you can see the original intention was to use the modified way to write test(load model with schema output then use the schema to create prediction engine), but with the type bug this test was written in the way with extra schema definition.
I will made the change to test both ways of create prediction engine, thanks
In reply to: 440394844 [](ancestors = 440394844)
address issue: #3234