Skip to content

Prestissimo: Fix logic deserializing LongDecimal-typed Data Encoded in INT128_ARRAY in bytestream read#18838

Merged
mbasmanova merged 1 commit intoprestodb:masterfrom
markjin1990:zhongjun/fix-bytestream-read-int128
Feb 3, 2023
Merged

Prestissimo: Fix logic deserializing LongDecimal-typed Data Encoded in INT128_ARRAY in bytestream read#18838
mbasmanova merged 1 commit intoprestodb:masterfrom
markjin1990:zhongjun/fix-bytestream-read-int128

Conversation

@markjin1990
Copy link
Contributor

@markjin1990 markjin1990 commented Dec 22, 2022

== RELEASE NOTES ==

General Changes

  • On the Velox side, reading int128 in Bytestream is not yet supported, and instead, it reads two 64bit integers and combine them for form an int128. However, on Prestissimo, the loophole seems to still exist. Without this fix, in the release mode, Prestissimo will crash with EXC_BAD_ACCESS error when deserializing LongDecimal data (unit tests) since it is trying to directly parse a 128-bit block.

This is the backtrace information of the core dump we encounter.
image

Test plan

  • Added multiple unit tests for deserializing base64 encoded long Decimals for Prestissimo

@markjin1990 markjin1990 requested a review from a team as a code owner December 22, 2022 22:57
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Dec 22, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: markjin1990 / name: Zhongjun Jin (Mark) (e134a9259cb3d37c36ed01b96b9553540eff4d16)

@markjin1990 markjin1990 force-pushed the zhongjun/fix-bytestream-read-int128 branch from 832bc54 to e134a92 Compare December 22, 2022 23:02
@markjin1990
Copy link
Contributor Author

@majetideepak Appreciate if you could help review this PR when you have bandwidth.

@ethanyzhang
Copy link
Contributor

Hi @aditi-pandit @karteekmurthys can you help take a look at this since Deepak is still on parental leave?

@karteekmurthys
Copy link
Contributor

@markjin1990 Thanks for taking this up. I am not sure why we didn't face this issue. We have e2e Decimal query tests that use 128 bit numbers: https://github.com/prestodb/presto/pull/18320/files#diff-efd1daf1b9c66dbc28260d458ec2ddc109f06cdbf9db2aa0696f70c785636401R452

We are able to successfully parse incoming 128-bit stream from Presto co-ordinator to Velox Expressions in PrestoCPP. We don't parse 128-bit data stream as is because the 128-bit representation uses signed magnitude representation. Please refer this issue: #18312

Just to make sure, I copied the unit test from this PR to existing code base and the tests pass without your changes. So, not sure if your fix has any impact. Lmk if I am missing something.

Screen Shot 2022-12-26 at 9 31 40 PM

@markjin1990
Copy link
Contributor Author

Just to make sure, I copied the unit test from this PR to existing code base and the tests pass without your changes. So, not sure if your fix has any impact. Lmk if I am missing something.

@karteekmurthys Thanks for your reply! Did you test in debug mode or release mode? Debug mode should be fine, but release mode gives me the "sigsegv - [EXC_BAD_ACCESS (code=EXC_I386_GPFLT)]" error. Clion crashed at this line in disassembly.
(https://stackoverflow.com/questions/19651788/whats-the-meaning-of-exception-code-exc-i386-gpflt)
image

@karteekmurthys
Copy link
Contributor

karteekmurthys commented Dec 27, 2022

Debug mode should be fine, but release mode gives me the "sigsegv - [EXC_BAD_ACCESS (code=EXC_I386_GPFLT)]" error. Clion crashed at this line in disassembly.

I had run both, but shared the debug mode results. Here are the release mode results as well:

Screen Shot 2022-12-26 at 9 54 05 PM

Edit: Updated the snapshot.

@markjin1990
Copy link
Contributor Author

markjin1990 commented Dec 28, 2022

Debug mode should be fine, but release mode gives me the "sigsegv - [EXC_BAD_ACCESS (code=EXC_I386_GPFLT)]" error. Clion crashed at this line in disassembly.

I had run both, but shared the debug mode results. Here are the release mode results as well:

Screen Shot 2022-12-26 at 9 54 05 PM

Edit: Updated the snapshot.

Thanks for checking this issue, @karteekmurthys. Weird. I just tested on the latest presto repo in my local machine and the cloud environment and the seg fault issue still exists. Would you mind double checking you followed the following steps in reproducing the problem?
image

This is my local machine info output by velox/script/info.sh.

Velox System Info v0.0.2
Commit: 4a0b62e4aa6051a79b2247c50c3eb15cfeb2baef
CMake Version: 3.25.1
System: Darwin-21.6.0
Arch: x86_64
C++ Compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++
C++ Compiler Version: 13.0.0.13000029
C Compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc
C Compiler Version: 13.0.0.13000029
CMake Prefix Path: /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX12.1.sdk/usr;/usr/local;/usr;/;/usr/local/Cellar/cmake/3.25.1;/usr/local;/usr/X11R6;/usr/pkg;/opt;/sw;/opt/local

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add bigger values (> 64 bit) to the tests with positive and negative values.

Copy link
Collaborator

@majetideepak majetideepak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

int128_t pointer casts can cause segfaults on unaligned memory blocks. This change looks good to me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These string values seem cryptic. Can we directly cast the unscaled values?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess there is metadata involved. Can we construct the payload explicitly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markjin1990 Did you push changes to address this comment? It would be easier to verify what exact value been passed in the encoded string if you construct the payload.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markjin1990 how are these payloads even built now? Can you add a comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markjin1990 how are these payloads even built now? Can you add a comment?

Just added. I basically synthesized some records of the corresponding decimal values in the table and print the value of the 2nd parameter of base64Encoded of readBlock to the screen in online running service of Prestissimo when reading these synthesized values. Hope that is clear.

Sorry that I haven't found an easy approach automatically synthesizing these values within the unit test and nor the other unit tests have demonstrated their approaches getting these values in the same test file.

@karteekmurthys
Copy link
Contributor

@markjin1990 would you please address the review comments and help close this PR?

@markjin1990
Copy link
Contributor Author

@karteekmurthys Absolutely! Will do tonight.

@markjin1990
Copy link
Contributor Author

Hi, @karteekmurthys! I added new unit tests of large values (>64bit) per your suggestion. Can we merge it?

Copy link
Contributor

@karteekmurthys karteekmurthys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@majetideepak
Copy link
Collaborator

@markjin1990 can you also rebase with the master branch?

@markjin1990 markjin1990 force-pushed the zhongjun/fix-bytestream-read-int128 branch from dd58fa9 to 4df7886 Compare February 1, 2023 00:35
@markjin1990
Copy link
Contributor Author

@markjin1990 can you also rebase with the master branch?

Just did. Thanks for noticing

@majetideepak
Copy link
Collaborator

@markjin1990 It still says this branch is out-of-date. Can you pull the latest master and rebase?
Also, please squash all the commits into a single commit. Please prefix [native] to the commit and title as described here https://github.com/prestodb/presto/tree/master/presto-native-execution#creating-prs-for-prestopresto-native-execution.

@markjin1990 markjin1990 force-pushed the zhongjun/fix-bytestream-read-int128 branch from 2df9dd4 to 6dcdd18 Compare February 1, 2023 21:20
@markjin1990
Copy link
Contributor Author

@markjin1990 It still says this branch is out-of-date. Can you pull the latest master and rebase? Also, please squash all the commits into a single commit. Please prefix [native] to the commit and title as described here https://github.com/prestodb/presto/tree/master/presto-native-execution#creating-prs-for-prestopresto-native-execution.

@majetideepak Thanks for the heads up! It's resolved.

@majetideepak
Copy link
Collaborator

@markjin1990 it still says out-of-date to me. Do you see the same?
And can you fix the commit message? Probably make the commit title

[native] Fix deserializing of encoded INT128_ARRAY in byte-stream read

and make the commit message

Prestissimo crashes with EXC_BAD_ACCESS error when deserializing LongDecimal data
as it tries to parse a 128-bit value. This is now fixed by constructing the 128-bit value after
reading two 64-bit values.

@markjin1990 markjin1990 force-pushed the zhongjun/fix-bytestream-read-int128 branch from 6dcdd18 to 971497c Compare February 2, 2023 00:32
@markjin1990
Copy link
Contributor Author

@markjin1990 it still says out-of-date to me. Do you see the same? And can you fix the commit message? Probably make the commit title

[native] Fix deserializing of encoded INT128_ARRAY in byte-stream read

and make the commit message

Prestissimo crashes with EXC_BAD_ACCESS error when deserializing LongDecimal data
as it tries to parse a 128-bit value. This is now fixed by constructing the 128-bit value after
reading two 64-bit values.

@majetideepak fixed the commit msg and rebased again, thanks!

@markjin1990 markjin1990 force-pushed the zhongjun/fix-bytestream-read-int128 branch from 971497c to 9151357 Compare February 2, 2023 08:26
@markjin1990
Copy link
Contributor Author

Hi! @mbasmanova Would you kindly help merging this PR to the master branch? @majetideepak and @karteekmurthys already reviewed it. Thanks!

@mbasmanova
Copy link
Contributor

Should we add an e2e test that would crash/fail without the fix?

@majetideepak
Copy link
Collaborator

@mbasmanova: @markjin1990 is seeing this with existing tests on MacOS.

Velox System Info v0.0.2
Commit: 4a0b62e4aa6051a79b2247c50c3eb15cfeb2baef
CMake Version: 3.25.1
System: Darwin-21.6.0
Arch: x86_64
C++ Compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++
C++ Compiler Version: 13.0.0.13000029
C Compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc
C Compiler Version: 13.0.0.13000029
CMake Prefix Path: /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX12.1.sdk/usr;/usr/local;/usr;/;/usr/local/Cellar/cmake/3.25.1;/usr/local;/usr/X11R6;/usr/pkg;/opt;/sw;/opt/local

@majetideepak
Copy link
Collaborator

See comment here #18838 (comment)

@mbasmanova
Copy link
Contributor

@majetideepak Got it. Thank you for clarifying,

@markjin1990 I'm seeing "This branch is out-of-date with the base branch" message. Please, rebase and ping when tests are green.

@markjin1990 markjin1990 force-pushed the zhongjun/fix-bytestream-read-int128 branch from 9151357 to 282aeb8 Compare February 2, 2023 20:56
@majetideepak
Copy link
Collaborator

@markjin1990 It still says the branch is out-of-date

Prestissimo crashes with EXC_BAD_ACCESS error when deserializing LongDecimal data
as it tries to parse a 128-bit value. This is now fixed by constructing the 128-bit value after
reading two 64-bit values.
@markjin1990 markjin1990 force-pushed the zhongjun/fix-bytestream-read-int128 branch from 282aeb8 to ba34958 Compare February 2, 2023 22:05
@markjin1990
Copy link
Contributor Author

@markjin1990 I'm seeing "This branch is out-of-date with the base branch" message. Please, rebase and ping when tests are green.

@mbasmanova rebased and tests are now complete except for "continuous-integration/jenkins/pr-merge", which I think has nothing to do with my PR.

@mbasmanova mbasmanova merged commit b21715e into prestodb:master Feb 3, 2023
@mbasmanova
Copy link
Contributor

@markjin1990 Thanks.

@wanglinsong wanglinsong mentioned this pull request Feb 25, 2023
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants