@raunaqmorarka raunaqmorarka commented Feb 2, 2023

Description

Optimize the decoders for INT96 and boolean columns in the Parquet reader.

Additional context and related issues

This completes the migration from the parquet-mr based decoders to optimized decoders in the optimized Parquet reader.
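
For readers unfamiliar with INT96, here is a minimal sketch of what decoding one value involves. It assumes the conventional layout used for Parquet INT96 timestamps: 12 bytes, little-endian, with nanoseconds-of-day in the first 8 bytes and the Julian day number in the last 4. The class and method names are hypothetical and this is not the decoder added in this PR, which works on batches of values.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; not Trino's implementation.
final class Int96TimestampSketch {
    // Julian day number of the Unix epoch (1970-01-01).
    private static final long JULIAN_DAY_OF_EPOCH = 2_440_588L;
    private static final long SECONDS_PER_DAY = 86_400L;
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    /** Returns {epochSeconds, nanosOfSecond} for the 12-byte INT96 value starting at offset. */
    static long[] decode(byte[] data, int offset) {
        ByteBuffer buffer = ByteBuffer.wrap(data, offset, 12).order(ByteOrder.LITTLE_ENDIAN);
        long nanosOfDay = buffer.getLong(); // first 8 bytes
        int julianDay = buffer.getInt();    // last 4 bytes
        long epochSeconds = (julianDay - JULIAN_DAY_OF_EPOCH) * SECONDS_PER_DAY
                + nanosOfDay / NANOS_PER_SECOND;
        long nanosOfSecond = nanosOfDay % NANOS_PER_SECOND;
        return new long[] {epochSeconds, nanosOfSecond};
    }

    public static void main(String[] args) {
        // One day after the epoch, 1.5 seconds into the day.
        byte[] value = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN)
                .putLong(1_500_000_000L)
                .putInt(2_440_589)
                .array();
        long[] decoded = decode(value, 0);
        System.out.println(decoded[0] + "s " + decoded[1] + "ns"); // prints "86401s 500000000ns"
    }
}
```

A batch decoder would typically avoid allocating a `ByteBuffer` and a result array per value; the sketch keeps them for clarity.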

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Hive, Hudi, Iceberg, Delta
* Improve performance of reading timestamp and boolean columns from Parquet files. ({issue}`15954`)

raunaqmorarka and others added 3 commits February 3, 2023 00:32
Benchmark                          (encoding)   Mode  Cnt  Before           After            Units
BenchmarkBooleanColumnReader.read         RLE  thrpt   20  245.353 ± 6.927  899.504 ± 7.844  ops/s

Benchmark                         Mode  Cnt  Before          After            Units
BenchmarkInt96ColumnReader.read  thrpt   20  22.513 ± 2.030  119.971 ± 2.384  ops/s
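
The boolean benchmark above uses the RLE encoding. As a rough illustration of what such a decoder has to do, the following sketch decodes Parquet's RLE/bit-packed hybrid runs with a bit width of 1. It assumes the 4-byte length prefix that precedes the runs in a data page has already been consumed and that the input is well formed. The class and method names are hypothetical, and this is not the code from this PR.

```java
import java.util.Arrays;

// Hypothetical illustration only; not Trino's implementation.
final class BooleanRleSketch {
    /** Decodes count booleans from RLE/bit-packed hybrid runs with bit width 1. */
    static boolean[] decode(byte[] input, int count) {
        boolean[] out = new boolean[count];
        int pos = 0; // read position in input
        int n = 0;   // number of values written so far
        while (n < count) {
            // Each run starts with an unsigned LEB128 varint header.
            int header = 0;
            int shift = 0;
            int b;
            do {
                b = input[pos++] & 0xFF;
                header |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);

            if ((header & 1) == 0) {
                // RLE run: a single value repeated (header >>> 1) times.
                int runLength = header >>> 1;
                boolean value = (input[pos++] & 1) != 0;
                int end = Math.min(count, n + runLength);
                Arrays.fill(out, n, end, value);
                n = end;
            } else {
                // Bit-packed run: (header >>> 1) groups of 8 values,
                // one byte per group at bit width 1, least significant bit first.
                int groups = header >>> 1;
                for (int g = 0; g < groups; g++) {
                    int packed = input[pos++] & 0xFF;
                    for (int bit = 0; bit < 8 && n < count; bit++) {
                        out[n++] = ((packed >>> bit) & 1) != 0;
                    }
                }
            }
        }
        return out;
    }
}
```

For example, the bytes `0x0A 0x01 0x03 0x05` describe an RLE run of five `true` values followed by one bit-packed group whose first three values are `true, false, true`.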
@raunaqmorarka raunaqmorarka merged commit fa3ec99 into trinodb:master Feb 6, 2023
@raunaqmorarka raunaqmorarka deleted the pqr-v2-final branch February 6, 2023 09:54
@github-actions github-actions bot added this to the 407 milestone Feb 6, 2023