Skip to content

Conversation

@mapleFU
Copy link
Member

@mapleFU mapleFU commented Dec 15, 2022

When flush more than one block, DELTA_BINARY_PACKED will be corrupt, because it didn't reset the context after flush.

I'll add some tests this weekend.

@github-actions
Copy link

@github-actions
Copy link

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@mapleFU
Copy link
Member Author

mapleFU commented Dec 17, 2022

@wjones127 @pitrou @rok Mind take a took? I don't want to modify testing for a lot, so a just add a ROUND_TRIP_TIME to make it build multiple times. /cc @wgtmac

Copy link
Member

@rok rok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this @mapleFU, I didn't realize encoders get reused like this.
I'm adding a couple of naming suggestions.

@mapleFU
Copy link
Member Author

mapleFU commented Dec 19, 2022

Resolved now, thanks @rok

@mapleFU
Copy link
Member Author

mapleFU commented Dec 20, 2022

@pitrou Mind take a look?

@mapleFU mapleFU force-pushed the fixing-delta-bit-pack branch from ee0d29f to aca32f1 Compare December 22, 2022 04:54
@mapleFU
Copy link
Member Author

mapleFU commented Dec 22, 2022

Retrigger CI, seems the macos error is not caused by me

@mapleFU mapleFU requested a review from rok December 22, 2022 08:39
@mapleFU
Copy link
Member Author

mapleFU commented Dec 22, 2022

@pitrou can we merge this patch?

@pitrou
Copy link
Member

pitrou commented Dec 22, 2022

@rok Can you give this a final look and merge if ok?

@rok rok changed the title ARROW-18437: [C++] Parquet Fixing DELTA_BINARY_PACKED for flush more than one block ARROW-18437: [C++] Parquet fix encoder for DELTA_BINARY_PACKED when flushing more than once Dec 22, 2022
@rok rok changed the title ARROW-18437: [C++] Parquet fix encoder for DELTA_BINARY_PACKED when flushing more than once ARROW-18437: [C++][Parquet] Fix encoder for DELTA_BINARY_PACKED when flushing more than once Dec 22, 2022
@rok rok merged commit ff10020 into apache:master Dec 22, 2022
@rok
Copy link
Member

rok commented Dec 22, 2022

Merged. Thanks for noticing and fixing this @mapleFU !

@mapleFU mapleFU deleted the fixing-delta-bit-pack branch December 22, 2022 15:37
@ursabot
Copy link

ursabot commented Dec 23, 2022

Benchmark runs are scheduled for baseline = 053080d and contender = ff10020. ff10020 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed] test-mac-arm
[Finished ⬇️0.0% ⬆️2.81%] ursa-i9-9960x
[Finished ⬇️0.58% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] ff100205 ec2-t3-xlarge-us-east-2
[Failed] ff100205 test-mac-arm
[Finished] ff100205 ursa-i9-9960x
[Finished] ff100205 ursa-thinkcentre-m75q
[Finished] 053080dc ec2-t3-xlarge-us-east-2
[Failed] 053080dc test-mac-arm
[Finished] 053080dc ursa-i9-9960x
[Finished] 053080dc ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@ursabot
Copy link

ursabot commented Dec 23, 2022

['Python', 'R'] benchmarks have high level of regressions.
ursa-i9-9960x

EpsilonPrime pushed a commit to EpsilonPrime/arrow that referenced this pull request Jan 5, 2023
…flushing more than once (apache#14959)

When flush more than one block, `DELTA_BINARY_PACKED` will be corrupt, because it didn't reset the context after flush.

I'll add some tests this weekend.

Authored-by: mwish <[email protected]>
Signed-off-by: Rok Mihevc <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants