Commit decd761

Fix reporting of row group size by parquet writer
Output of parquet-tools

Before
Row group 25: count: 2240892 26.32 B records start: 1479039171 total(compressed): 56.243 MB total(uncompressed):56.243 MB

After
Row group 25: count: 2244256 26.34 B records start: 1479178837 total(compressed): 56.370 MB total(uncompressed):167.418 MB

1 parent a212bf1 commit decd761

1 file changed: +3 -2 lines changed

lib/trino-parquet/src/main/java/io/trino/parquet/writer/ParquetWriter.java

Lines changed: 3 additions & 2 deletions
@@ -374,9 +374,10 @@ Slice getFooter(List<RowGroup> rowGroups, MessageType messageType)
 
     private void updateRowGroups(List<ColumnMetaData> columnMetaData)
     {
-        long totalBytes = columnMetaData.stream().mapToLong(ColumnMetaData::getTotal_compressed_size).sum();
+        long totalCompressedBytes = columnMetaData.stream().mapToLong(ColumnMetaData::getTotal_compressed_size).sum();
+        long totalBytes = columnMetaData.stream().mapToLong(ColumnMetaData::getTotal_uncompressed_size).sum();
         ImmutableList<org.apache.parquet.format.ColumnChunk> columnChunks = columnMetaData.stream().map(ParquetWriter::toColumnChunk).collect(toImmutableList());
-        rowGroupBuilder.add(new RowGroup(columnChunks, totalBytes, rows));
+        rowGroupBuilder.add(new RowGroup(columnChunks, totalBytes, rows).setTotal_compressed_size(totalCompressedBytes));
     }
 
     private static org.apache.parquet.format.ColumnChunk toColumnChunk(ColumnMetaData metaData)
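
The change above can be illustrated in isolation. The following is a minimal sketch, not the Trino implementation: `RowGroupSizeSketch`, `ColumnSizes`, and `Totals` are hypothetical stand-ins for `ParquetWriter`, the Thrift-generated `ColumnMetaData`, and `RowGroup`, and only the two size fields are modeled. It assumes, as the diff suggests, that the row group's total byte size should be the sum of the uncompressed column sizes and that the compressed sum is recorded separately.

```java
import java.util.List;

// Hypothetical sketch: none of these types exist in Trino or parquet-format.
final class RowGroupSizeSketch
{
    private RowGroupSizeSketch() {}

    // Stand-in for ColumnMetaData, modeling only the two size fields.
    static final class ColumnSizes
    {
        final long totalCompressedSize;
        final long totalUncompressedSize;

        ColumnSizes(long totalCompressedSize, long totalUncompressedSize)
        {
            this.totalCompressedSize = totalCompressedSize;
            this.totalUncompressedSize = totalUncompressedSize;
        }
    }

    // Stand-in for the two row group totals written to the footer.
    static final class Totals
    {
        final long totalByteSize; // sum of uncompressed column sizes
        final long totalCompressedSize; // sum of compressed column sizes

        Totals(long totalByteSize, long totalCompressedSize)
        {
            this.totalByteSize = totalByteSize;
            this.totalCompressedSize = totalCompressedSize;
        }
    }

    // Sum each size separately instead of reporting the compressed sum as the total byte size.
    static Totals summarize(List<ColumnSizes> columns)
    {
        long compressed = columns.stream().mapToLong(c -> c.totalCompressedSize).sum();
        long uncompressed = columns.stream().mapToLong(c -> c.totalUncompressedSize).sum();
        return new Totals(uncompressed, compressed);
    }

    public static void main(String[] args)
    {
        // Made-up sizes for two column chunks.
        List<ColumnSizes> columns = List.of(new ColumnSizes(100, 300), new ColumnSizes(50, 120));
        Totals totals = summarize(columns);
        System.out.println("uncompressed=" + totals.totalByteSize + " compressed=" + totals.totalCompressedSize);
    }
}
```

With the pre-fix logic both totals would carry the compressed sum, which matches the "Before" parquet-tools output where the compressed and uncompressed figures are identical.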
