Data: Add GenericFileWriterFactory #9267
    Row binaryCol =
        Row.of(
            59L,
The column sizes changed because the new writer picks up all table properties, which the previous writer did not. I verified with parquet-tools that the new values are correct.
I probably should have read these values from the file itself originally, instead of hard-coding them, but that's a suggestion for another PR.
Yep, we should do that eventually.
@nastra @Fokko @flyrain @amogh-jahagirdar, could you check this one?
    super(TABLE_FORMAT_VERSION);
    this.fileFormat = fileFormat;
    this.partitioned = partitioned;
    this.dataRows =
Had to move the initialization after the table creation.
szehon-ho left a comment
It looks good to me. Who's the audience for this, is it generics? I'm still in the early stages of ramping back up, so I may be missing some context.
@szehon-ho, I faced some test failures in other PRs because of issues in
Thanks for reviewing, @szehon-ho!
This PR adds GenericFileWriterFactory, similar to FlinkFileWriterFactory and SparkFileWriterFactory. This is a new API that should be used in favor of the methods in FileAppenderFactory for creating data, equality delete, and position delete writers. Spark, for instance, migrated to FileWriterFactory a long time ago.

This PR migrates all tests to use this API for writing test files and adds a new test suite.