Prestissimo: Fix logic deserializing LongDecimal-typed Data Encoded in INT128_ARRAY in bytestream read #18838
832bc54 to e134a92
|
@majetideepak I'd appreciate it if you could help review this PR when you have bandwidth. |
|
Hi @aditi-pandit @karteekmurthys can you help take a look at this since Deepak is still on parental leave? |
|
@markjin1990 Thanks for taking this up. I am not sure why we didn't hit this issue. We have e2e Decimal query tests that use 128-bit numbers: https://github.com/prestodb/presto/pull/18320/files#diff-efd1daf1b9c66dbc28260d458ec2ddc109f06cdbf9db2aa0696f70c785636401R452 We are able to successfully parse the incoming 128-bit stream from the Presto coordinator into Velox expressions in PrestoCPP. We don't parse the 128-bit data stream as is, because the 128-bit representation uses signed-magnitude representation. Please refer to this issue: #18312 Just to make sure, I copied the unit test from this PR into the existing code base, and the tests pass without your changes. So I'm not sure your fix has any impact. Lmk if I am missing something. |
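For readers following along, here is a minimal sketch of what decoding a signed-magnitude 128-bit value could look like. The layout is an assumption made for illustration and is not confirmed in this thread: two 64-bit words, low word first, with the top bit of the high word acting as the sign flag. `fromSignMagnitude` and `kSignBit` are hypothetical names, not Prestissimo or Velox functions.

```cpp
#include <cstdint>

using int128_t = __int128;

// Assumed layout: bit 63 of the high word is the sign flag and the
// remaining 127 bits hold the magnitude.
constexpr uint64_t kSignBit = 1ULL << 63;

// Convert a signed-magnitude pair of words into a two's complement value.
int128_t fromSignMagnitude(uint64_t low, uint64_t high) {
  const bool negative = (high & kSignBit) != 0;
  const uint64_t magnitudeHigh = high & ~kSignBit;
  // The magnitude fits in 127 bits, so the signed cast below cannot overflow.
  const auto magnitude =
      (static_cast<unsigned __int128>(magnitudeHigh) << 64) | low;
  const auto value = static_cast<int128_t>(magnitude);
  return negative ? -value : value;
}
```

Under this assumed layout, reading the 16 bytes directly as a two's complement `int128_t` would give the wrong result for negative decimals, which matches the point about not parsing the stream as is.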
@karteekmurthys Thanks for your reply! Did you test in debug mode or release mode? Debug mode should be fine, but release mode gives me the "sigsegv - [EXC_BAD_ACCESS (code=EXC_I386_GPFLT)]" error. CLion crashed at this line in disassembly. |
Thanks for checking this issue, @karteekmurthys. Weird. I just tested on the latest presto repo on my local machine and in the cloud environment, and the segfault still occurs. Would you mind double-checking that you followed these steps to reproduce the problem? This is my local machine info output by |
Add larger values (> 64 bits), both positive and negative, to the tests.
majetideepak
left a comment
int128_t pointer casts can cause segfaults on unaligned memory blocks. This change looks good to me.
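To illustrate the alignment point, here is a hedged sketch rather than the actual change in this PR. Dereferencing a cast `int128_t*` lets the compiler assume 16-byte alignment, and an optimized build may then emit an aligned 16-byte load that faults on a misaligned address, which would be consistent with the crash appearing only in release mode. Copying the bytes with `std::memcpy` avoids that assumption. `loadInt128` is a hypothetical helper name.

```cpp
#include <cstdint>
#include <cstring>

using int128_t = __int128;

// Hazardous pattern (undefined behavior when `data` is not 16-byte aligned):
//   int128_t value = *reinterpret_cast<const int128_t*>(data);

// Alignment-safe alternative: copy the 16 bytes into a properly aligned local.
int128_t loadInt128(const char* data) {
  int128_t value;
  std::memcpy(&value, data, sizeof(value));
  return value;
}
```

Compilers typically lower the fixed-size `memcpy` to an unaligned load, so this form should not cost more than the pointer cast.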
These string values seem cryptic. Can we directly cast the unscaled values?
I guess there is metadata involved. Can we construct the payload explicitly?
@markjin1990 Did you push changes to address this comment? It would be easier to verify exactly what value is being passed in the encoded string if you construct the payload.
@markjin1990 how are these payloads even built now? Can you add a comment?
Just added. I synthesized some records with the corresponding decimal values in a table, then printed the value of the second parameter of readBlock (base64Encoded) in a running Prestissimo service while it read these synthesized values. Hope that is clear.
Sorry that I haven't found an easy way to synthesize these values automatically within the unit test, and the other unit tests in the same file don't show how their values were obtained either.
|
@markjin1990 would you please address the review comments and help close this PR? |
|
@karteekmurthys Absolutely! Will do tonight. |
|
Hi, @karteekmurthys! I added new unit tests for large values (> 64 bits) per your suggestion. Can we merge it? |
|
@markjin1990 can you also rebase with the master branch? |
dd58fa9 to 4df7886
Just did. Thanks for noticing |
|
@markjin1990 It still says this branch is out-of-date. Can you pull the latest master and rebase? |
2df9dd4 to 6dcdd18
@majetideepak Thanks for the heads up! It's resolved. |
|
@markjin1990 it still says out-of-date to me. Do you see the same? and make the commit message |
6dcdd18 to 971497c
@majetideepak fixed the commit msg and rebased again, thanks! |
971497c to 9151357
|
Hi! @mbasmanova Would you kindly help merge this PR into the master branch? @majetideepak and @karteekmurthys have already reviewed it. Thanks! |
|
Should we add an e2e test that would crash/fail without the fix? |
|
@mbasmanova: @markjin1990 is seeing this with existing tests on macOS. |
|
See comment here #18838 (comment) |
|
@majetideepak Got it. Thank you for clarifying. @markjin1990 I'm seeing the "This branch is out-of-date with the base branch" message. Please rebase and ping when tests are green. |
9151357 to 282aeb8
|
@markjin1990 It still says the branch is out-of-date |
Prestissimo crashes with EXC_BAD_ACCESS error when deserializing LongDecimal data as it tries to parse a 128-bit value. This is now fixed by constructing the 128-bit value after reading two 64-bit values.
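A rough sketch of the approach described in the commit message, assuming a little-endian host and a stream that stores the low 64-bit word before the high one. `ByteCursor`, `readUint64`, and `readInt128` are hypothetical names used only for illustration and are not the Velox ByteStream API. Any sign handling the encoding requires is left out here.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

using int128_t = __int128;

// Hypothetical stand-in for a byte stream over a contiguous buffer.
struct ByteCursor {
  const char* data;
  std::size_t offset{0};

  // Read one 64-bit word with memcpy, so no alignment is required.
  uint64_t readUint64() {
    uint64_t word;
    std::memcpy(&word, data + offset, sizeof(word));
    offset += sizeof(word);
    return word;
  }
};

// Build the 128-bit value from two 64-bit reads instead of one 128-bit load.
int128_t readInt128(ByteCursor& cursor) {
  const uint64_t low = cursor.readUint64();
  const uint64_t high = cursor.readUint64();
  return static_cast<int128_t>(
      (static_cast<unsigned __int128>(high) << 64) | low);
}
```

Because each word is read separately, the source buffer can start at any byte offset.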
282aeb8 to ba34958
@mbasmanova rebased and tests are now complete except for "continuous-integration/jenkins/pr-merge", which I think has nothing to do with my PR. |
|
@markjin1990 Thanks. |




== RELEASE NOTES ==
General Changes
This is the backtrace information of the core dump we encountered.

Test plan