Is your feature request related to a problem or challenge?
DataFusion has a neat ListingTable abstraction that offers the ability to read (and now write) multiple files in a directory (among other features).
DataFusion comes with built-in support for Avro, Parquet, Arrow, CSV, and JSON files.
However, with the introduction of the ability to write to such files, we have inadvertently made it impossible for users to add support for their own formats, as identified in several reports.
I think we lost this ability, as pointed out on #8637, because the FileFormat::file_type trait method now returns a FileType, which is an enum and hence cannot be extended.
I also have a longer-term goal of extracting ListingTable out of the core of DataFusion (as it is just a very specialized TableProvider).
Describe the solution you'd like
I suggest we use traits to extend FileType, as we have done in other areas of the code.
When this is done, we should also add an end-to-end test case / example showing how a user can add support for their own custom file formats in ListingTable, so that we don't cause a regression in functionality like this again in the future.
Describe alternatives you've considered
One potential design is to make FileType a trait rather than an enum.
I looked briefly into this, and it will likely require:
converting other structures like FileTypeWriterOptions into traits (or incorporating them into the FileType trait).
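To make the trait idea concrete, here is a minimal sketch, not DataFusion's actual API. The trait shape, the `extension` method, and the `CsvFileType` / `TsvFileType` / `matches_file` names are all hypothetical and chosen only for illustration. It shows that once FileType is a trait, a format defined outside the library can be used wherever a built-in one is accepted.

```rust
use std::fmt::Debug;

/// Hypothetical trait standing in for the current `FileType` enum.
pub trait FileType: Debug + Send + Sync {
    /// File extension including the leading dot, e.g. ".csv".
    fn extension(&self) -> String;
}

/// Stand-in for a built-in format shipped with the library.
#[derive(Debug)]
pub struct CsvFileType;

impl FileType for CsvFileType {
    fn extension(&self) -> String {
        ".csv".to_string()
    }
}

/// A user-defined format; with an enum this could not be added
/// without changing the library itself.
#[derive(Debug)]
pub struct TsvFileType;

impl FileType for TsvFileType {
    fn extension(&self) -> String {
        ".tsv".to_string()
    }
}

/// Code that only depends on the trait works for built-in and
/// user-defined formats alike.
pub fn matches_file(file_type: &dyn FileType, path: &str) -> bool {
    path.ends_with(&file_type.extension())
}
```

The cost noted above still applies: anything that currently matches on the enum (such as FileTypeWriterOptions) would need an equivalent trait method or its own trait.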
Another, slightly different alternative design would be to incorporate all the functionality of FileType into the existing FileFormat, as suggested by @devinjdangelo on #8345 (comment).
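A rough sketch of what folding that functionality into FileFormat might look like follows. The trait here is a heavily simplified stand-in, not the real DataFusion trait, and `extension`, `supports_write`, `MyCustomFormat`, and `FormatRegistry` are hypothetical names introduced for illustration. The point is that the format object itself carries everything the listing code needs, so there is no separate enum to extend.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical, simplified stand-in for `FileFormat` that also carries
/// the information the `FileType` enum provides today.
pub trait FileFormat: Send + Sync {
    /// File extension without the leading dot, e.g. "json".
    fn extension(&self) -> &str;

    /// Whether the write path (e.g. insert into) is supported.
    /// Defaults to read-only so custom formats can opt in later.
    fn supports_write(&self) -> bool {
        false
    }
}

/// Stand-in for a built-in format.
pub struct JsonFormat;

impl FileFormat for JsonFormat {
    fn extension(&self) -> &str {
        "json"
    }
    fn supports_write(&self) -> bool {
        true
    }
}

/// A user-defined, read-only format.
pub struct MyCustomFormat;

impl FileFormat for MyCustomFormat {
    fn extension(&self) -> &str {
        "custom"
    }
}

/// Hypothetical lookup table from file extension to format.
#[derive(Default)]
pub struct FormatRegistry {
    formats: HashMap<String, Arc<dyn FileFormat>>,
}

impl FormatRegistry {
    /// Register a format under its own extension.
    pub fn register(&mut self, format: Arc<dyn FileFormat>) {
        self.formats.insert(format.extension().to_string(), format);
    }

    /// Find the format for a path by the text after its last dot.
    pub fn for_path(&self, path: &str) -> Option<Arc<dyn FileFormat>> {
        let ext = path.rsplit_once('.')?.1;
        self.formats.get(ext).cloned()
    }
}
```

Compared with a separate FileType trait, this keeps a single extension point, at the price of a larger FileFormat trait.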
Additional context
No response
@alamb @tychoish I took a stab at resolving this in #8667. I'm also attempting to support the write path (e.g. a listing table backed by a custom FileFormat which supports insert into).
Thank you @devinjdangelo -- I plan a whirlwind PR review blitz this afternoon but may not get to this until tomorrow