Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of FixedLengthBinary decoding #6220

Merged
merged 3 commits into from
Aug 13, 2024

Conversation

etseidl
Copy link
Contributor

@etseidl etseidl commented Aug 9, 2024

Which issue does this PR close?

Tangentially related to #6219.

Rationale for this change

Following up on #6159 (comment), this adds set_from_bytes to the private ParquetValueType trait. This avoids the need for downcasting for byte array types.

What changes are included in this PR?

In addition to the above, it also changes the formatting in the encoding benchmark to distinguish between FixedLenByteArray(2) and FixedLenByteArray(16). Without this critcmp overwrites the FLBA(2) results.

Are there any user-facing changes?

No, changes are to private interfaces.

@github-actions github-actions bot added the parquet Changes to the parquet crate label Aug 9, 2024
@etseidl
Copy link
Contributor Author

etseidl commented Aug 9, 2024

While performance for decoding is still abominable, oddly this change improves it somewhat.

On my laptop (old macbook pro w/ 2.2GHz core i7)

% critcmp master set_from
group                                                                                master                                 set_from
-----                                                                                ------                                 --------
decoding: dtype=FixedLenByteArray(16), encoding=BYTE_STREAM_SPLIT                    1.09   564.8±15.15µs        ? ?/sec    1.00   517.2±10.01µs        ? ?/sec
decoding: dtype=FixedLenByteArray(2), encoding=BYTE_STREAM_SPLIT                     1.05    384.5±6.32µs        ? ?/sec    1.00    365.4±8.88µs        ? ?/sec
decoding: dtype=f32, encoding=BYTE_STREAM_SPLIT                                      1.01     38.8±0.86µs        ? ?/sec    1.00     38.6±0.85µs        ? ?/sec
decoding: dtype=f64, encoding=BYTE_STREAM_SPLIT                                      1.00     82.1±1.48µs        ? ?/sec    1.01     82.6±2.22µs        ? ?/sec
encoding: dtype=FixedLenByteArray(16), encoding=BYTE_STREAM_SPLIT                    1.02   282.4±15.78µs        ? ?/sec    1.00   277.1±14.91µs        ? ?/sec
encoding: dtype=FixedLenByteArray(2), encoding=BYTE_STREAM_SPLIT                     1.05     50.6±4.60µs        ? ?/sec    1.00     48.2±1.91µs        ? ?/sec
encoding: dtype=f32, encoding=BYTE_STREAM_SPLIT                                      1.00     44.4±1.42µs        ? ?/sec    1.01     44.9±1.69µs        ? ?/sec
encoding: dtype=f64, encoding=BYTE_STREAM_SPLIT                                      1.00    108.3±2.07µs        ? ?/sec    1.00    108.6±2.72µs        ? ?/sec

On my workstation (3.6 GHz Core i7-12700K)

group                                                                                master                                 set_from
-----                                                                                ------                                 --------
decoding: dtype=FixedLenByteArray(16), encoding=BYTE_STREAM_SPLIT                    1.19    357.0±2.69µs        ? ?/sec    1.00    299.7±1.98µs        ? ?/sec
decoding: dtype=FixedLenByteArray(2), encoding=BYTE_STREAM_SPLIT                     1.23    300.2±1.84µs        ? ?/sec    1.00    244.4±1.98µs        ? ?/sec
decoding: dtype=f32, encoding=BYTE_STREAM_SPLIT                                      1.00     15.1±0.31µs        ? ?/sec    1.00     15.1±0.13µs        ? ?/sec
decoding: dtype=f64, encoding=BYTE_STREAM_SPLIT                                      1.02     33.5±0.64µs        ? ?/sec    1.00     32.9±0.57µs        ? ?/sec
encoding: dtype=FixedLenByteArray(16), encoding=BYTE_STREAM_SPLIT                    1.01    120.8±2.49µs        ? ?/sec    1.00    119.6±0.52µs        ? ?/sec
encoding: dtype=FixedLenByteArray(2), encoding=BYTE_STREAM_SPLIT                     1.00     24.1±0.24µs        ? ?/sec    1.01     24.2±0.17µs        ? ?/sec
encoding: dtype=f32, encoding=BYTE_STREAM_SPLIT                                      1.01     16.5±0.20µs        ? ?/sec    1.00     16.4±0.12µs        ? ?/sec
encoding: dtype=f64, encoding=BYTE_STREAM_SPLIT                                      1.00     36.1±0.26µs        ? ?/sec    1.01     36.3±0.39µs        ? ?/sec

@alamb alamb changed the title Add set_from_bytes to ParquetValueType trait Improve performance of FixedLengthBinary decoding Aug 13, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @etseidl -- even if this didn't improve performance in my opinion it would still be a valuable change simply for readability.

@alamb alamb merged commit 9c4a7e3 into apache:master Aug 13, 2024
18 checks passed
@etseidl etseidl deleted the add_set_from_bytes branch August 21, 2024 21:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parquet Changes to the parquet crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants