Adding basic support for handling vectors in the database loader.#4138
Adding basic support for handling vectors in the database loader.#4138eerhardt merged 15 commits intodotnet:masterfrom tannergooding:DatabaseLoader
Conversation
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoader.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoaderCursor.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoaderCursor.cs
Outdated
Show resolved
Hide resolved
… the TestModel folder
…le, and removing some dead code.
|
This should be ready for review now. |
|
|
||
| var mockProviderFactory = new MockProviderFactory(mlContext, loader); | ||
| var databaseSource = new DatabaseSource(mockProviderFactory, connectionString, commandText); | ||
| var providerFactory = DbProviderFactories.GetFactory("System.Data.SqlClient"); |
There was a problem hiding this comment.
Why not use SqlClientFactory.Instance instead?
There was a problem hiding this comment.
I was just using DbProviderFactories as that seemed to be the normal way to get registered factories and more closely matched what users wanted.
| get => throw new NotImplementedException(); | ||
| set => throw new NotImplementedException(); | ||
| var databaseFile = GetTestDatabasePath(databaseName); | ||
| return $@"Data Source=(LocalDB)\MSSQLLocalDB;AttachDbFilename={databaseFile};Database={databaseName};Integrated Security=True"; |
There was a problem hiding this comment.
Will this work on non-Windows machines?
There was a problem hiding this comment.
I didn't test, but I believe so.
Linux supports SQL as well.
There was a problem hiding this comment.
Your CI is failing with
System.PlatformNotSupportedException : LocalDB is not supported on this platform.
There was a problem hiding this comment.
I'll have to disable for Linux for the time being and can log an issue for adding a mysql database as well?
There was a problem hiding this comment.
It is also failing for MacOS. It would be unfortunate to not have any unix tests for the DatabaseLoader....
In reply to: 319267186 [](ancestors = 319267186)
There was a problem hiding this comment.
Right, but I'm not sure I'll get a MySQL database together before this needs to be in for the next preview.
When is that cutoff?
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoader.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoader.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoaderCursor.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoader.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoader.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoader.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoader.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoader.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoader.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoader.cs
Outdated
Show resolved
Hide resolved
| ctx.Writer.WriteBoolByte(info.SourceIndex.HasValue); | ||
| if (info.SourceIndex.HasValue) | ||
| ctx.Writer.Write(info.SourceIndex.GetValueOrDefault()); | ||
| ctx.Writer.Write(info.Segments.Length); |
There was a problem hiding this comment.
info.Segments can be null, right? (According to ColInfo's constructor, it can be. It has an AssertValueOrNull line in it).
There was a problem hiding this comment.
This is the same as TextLoader, so if there is a bug here, there is a bug there as well.
There was a problem hiding this comment.
there is a bug there as well.
That is not true.
machinelearning/src/Microsoft.ML.Data/DataLoadSave/Text/TextLoader.cs
Lines 576 to 579 in 429f8cc
TextLoader guarantees that Segment[] segs is a non empty array. DatabaseLoader allows it to be null.
There was a problem hiding this comment.
Changed to be ctx.Writer.Write((info.Segments?.Length).GetValueOrDefault());
There was a problem hiding this comment.
Isn't the very next line going to NullRef?
There was a problem hiding this comment.
Maybe we should have a test that saves and loads a DatabaseLoader into a model.
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoader.cs
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| /// <summary>Maps a <see cref="DataKind"/> to the associated <see cref="DbType"/>.</summary> | ||
| public static DbType ToDbType(this DataKind dataKind) |
There was a problem hiding this comment.
This isn't used anymore, is it?
There was a problem hiding this comment.
It is used in CreateDatabaseLoader<TInput> when processing the memberInfo kind.
src/Microsoft.ML.Experimental/DataLoadSave/Database/DatabaseLoader.cs
Outdated
Show resolved
Hide resolved
eerhardt
left a comment
There was a problem hiding this comment.
Looks great. Thanks for the nice work. It will make customers lives easier to consume many columns from a Database.
|
@tannergooding - half of the Windows legs are failing on your new tests with an exception of the form: |
|
Added a connection timeout to try and give localdb a chance to initialize if it hadn't been previously |
|
Merging. The failing leg is failing due to an AutoML test, which is unrelated to this work. |
No description provided.