Support native parquet writer in hive module and Misc fixes #3400
dain merged 12 commits into trinodb:master
dain left a comment
Looking good. Just some minor comments.
Consider caching this value if it is expensive to calculate. In the ORC writer, we update the cached value after each write operation is processed.
I resolved this. Also, I don't quite understand the following in OrcWriter:
    @Override
    public long getWrittenBytes()
    {
        return orcWriter.getWrittenBytes() + orcWriter.getBufferedBytes();
    }
Why is bufferedBytes counted as part of the written bytes?
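The caching suggestion in this thread could look roughly like the sketch below. It is a minimal illustration, not the code from this PR: the class `CachedWrittenBytesSketch`, its chunk list, and the `computeSize()` helper are hypothetical stand-ins for a writer whose size is costly to compute. The size is recomputed once per write, and the getter only returns the cached value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical writer used only for illustration; it is not ParquetFileWriter or OrcWriter.
final class CachedWrittenBytesSketch
{
    private final List<byte[]> chunks = new ArrayList<>();
    private long cachedWrittenBytes;
    private int sizeComputations;

    public void write(byte[] chunk)
    {
        chunks.add(chunk.clone());
        // Refresh the cached size once per write operation.
        cachedWrittenBytes = computeSize();
    }

    public long getWrittenBytes()
    {
        // Cheap: returns the cached value without walking the chunks again.
        return cachedWrittenBytes;
    }

    public int getSizeComputations()
    {
        return sizeComputations;
    }

    // Stand-in for a calculation that is assumed to be expensive.
    private long computeSize()
    {
        sizeComputations++;
        long total = 0;
        for (byte[] chunk : chunks) {
            total += chunk.length;
        }
        return total;
    }

    public static void main(String[] args)
    {
        CachedWrittenBytesSketch writer = new CachedWrittenBytesSketch();
        writer.write(new byte[10]);
        writer.write(new byte[5]);
        System.out.println(writer.getWrittenBytes());     // 15
        System.out.println(writer.getWrittenBytes());     // 15, served from the cache
        System.out.println(writer.getSizeComputations()); // 2, one per write
    }
}
```

With this shape, callers that poll the written size frequently do not trigger the size calculation; only writes do.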
presto-hive/src/main/java/io/prestosql/plugin/hive/HiveModule.java
presto-hive/src/main/java/io/prestosql/plugin/hive/parquet/ParquetFileWriter.java
presto-hive/src/main/java/io/prestosql/plugin/hive/parquet/ParquetFileWriterFactory.java
presto-hive/src/main/java/io/prestosql/plugin/hive/parquet/ParquetWriterConfig.java
presto-parquet/src/main/java/io/prestosql/parquet/writer/PrimitiveColumnWriter.java
Are there statistics for non-primitive types?
No. This aims to fill in the statistics for primitive columns (https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L740). There is no support for this on non-primitive types.
Force-pushed from 673eead to 5e46b04.
resetDictionary in reset() cleared the dictionary regardless of whether the dictionary page was null. resetDictionary should be called only after the dictionary page has been retrieved and only if that page is not null.
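The intended ordering can be sketched as follows. This is a simplified illustration under stated assumptions, not the PR's PrimitiveColumnWriter: `ValuesWriterStub`, `toDictionaryPage()`, and `flush()` are hypothetical names, and the dictionary page is modeled as a plain list that is null when there is nothing to emit.

```java
import java.util.ArrayList;
import java.util.List;

final class DictionaryResetSketch
{
    // Hypothetical stand-in for a dictionary-encoding values writer.
    static final class ValuesWriterStub
    {
        private final List<String> dictionary = new ArrayList<>();
        private int resetCount;

        void add(String value)
        {
            if (!dictionary.contains(value)) {
                dictionary.add(value);
            }
        }

        // Returns null when there is no dictionary page to emit.
        List<String> toDictionaryPage()
        {
            return dictionary.isEmpty() ? null : new ArrayList<>(dictionary);
        }

        void resetDictionary()
        {
            dictionary.clear();
            resetCount++;
        }

        int getResetCount()
        {
            return resetCount;
        }
    }

    // Fetch the dictionary page first; reset the dictionary only if a page was produced.
    static List<String> flush(ValuesWriterStub writer)
    {
        List<String> page = writer.toDictionaryPage();
        if (page != null) {
            writer.resetDictionary();
        }
        return page;
    }

    public static void main(String[] args)
    {
        ValuesWriterStub empty = new ValuesWriterStub();
        System.out.println(flush(empty));          // null
        System.out.println(empty.getResetCount()); // 0, nothing was reset

        ValuesWriterStub writer = new ValuesWriterStub();
        writer.add("a");
        writer.add("b");
        writer.add("a");
        System.out.println(flush(writer));          // [a, b]
        System.out.println(writer.getResetCount()); // 1
    }
}
```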
Setting both settings to smaller values forces the writer to write multiple row groups, and multiple pages in each row group, which better simulates real-world files.
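The effect of a smaller limit can be illustrated with a rough model. The sketch assumes the two settings are size thresholds (for example, a maximum row group size and a maximum page size) and that the writer flushes a unit once the buffered bytes reach the threshold. The class `FlushCountSketch` and its parameters are hypothetical and do not reflect the real writer's accounting.

```java
final class FlushCountSketch
{
    // Counts how many units (row groups or pages) a writer would emit if it flushes
    // whenever the buffered bytes reach maxBytes. Purely illustrative.
    static int countFlushes(int valueCount, int bytesPerValue, int maxBytes)
    {
        int flushes = 0;
        int buffered = 0;
        for (int i = 0; i < valueCount; i++) {
            buffered += bytesPerValue;
            if (buffered >= maxBytes) {
                flushes++;
                buffered = 0;
            }
        }
        if (buffered > 0) {
            // Remaining buffered values form a final, partially filled unit.
            flushes++;
        }
        return flushes;
    }

    public static void main(String[] args)
    {
        // 1000 values of 8 bytes each is 8000 bytes in total.
        System.out.println(countFlushes(1000, 8, 128 * 1024 * 1024)); // 1
        System.out.println(countFlushes(1000, 8, 1024));              // 8
    }
}
```

Under this model, a small test data set produces a single unit with a large threshold but several units with a small one, so the multi-unit code paths are exercised.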
Force-pushed from 5e46b04 to a5f7df3.