@jlfsdtc jlfsdtc commented Jul 16, 2021

[BACKPORT]
[SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow
apache#27627
[SPARK-28067][SPARK-32018] Fix decimal overflow issues
apache#29026

@jlfsdtc jlfsdtc changed the title Ke43-24858 Decimal precision problems cause the build to fail Ke-24858 Decimal precision problems cause the build to fail Jul 16, 2021
@jlfsdtc jlfsdtc changed the title Ke-24858 Decimal precision problems cause the build to fail KE-24858 Decimal precision problems cause the build to fail Jul 16, 2021
jlfsdtc commented Jul 19, 2021

retest this, please

jlfsdtc commented Jul 21, 2021

retest this, please

jlfsdtc commented Jul 22, 2021

retest this, please

skambha and others added 5 commits July 22, 2021 15:59
[SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow
### What changes were proposed in this pull request?

This is a follow-up of apache#27627 to fix the remaining issues. This PR fixes two issues:
1. `UnsafeRow.setDecimal` can store an overflowed decimal, which causes an error when the value is read back. The expected behavior is to return null.
2. The update/merge expression for the decimal type in `Sum` is wrong: the `sum` value should not be turned back to 0 after it has become null due to overflow. This issue was hidden because:
2.1 For hash aggregate, the buffer is an unsafe row. Because of the first bug, the query fails when overflow happens, so there is no chance to mistakenly turn null back to 0.
2.2 For sort-based aggregate, the buffer is a generic row. The decimal can exceed its declared precision (the `Decimal` class has unlimited precision), so the sum never becomes null and the null problem does not arise.

If we fix only the first bug, the second bug is exposed and tests fail. If we fix only the second bug, there is no way to test it. This PR therefore fixes both bugs together.
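
The interaction between the two bugs can be illustrated with a small model. The following is a minimal Python sketch, not Spark's actual code: `PRECISION`, `SCALE`, `set_decimal`, `update_buggy`, `update_fixed`, and `sum_decimals` are hypothetical names invented for illustration, and the buffer is reduced to a single value that starts at 0. It assumes that a fixed buffer write stores null (`None`) for an overflowed value, and contrasts an update step that turns null back to 0 with one that keeps null.

```python
from decimal import Decimal
from typing import Callable, Iterable, Optional

# Hypothetical stand-in for a decimal buffer slot of type DECIMAL(5, 2).
PRECISION = 5  # total number of digits (assumed for illustration)
SCALE = 2      # digits after the decimal point (assumed for illustration)


def fits(value: Decimal) -> bool:
    """Return True if value has at most PRECISION - SCALE integer digits."""
    return abs(value) < Decimal(10) ** (PRECISION - SCALE)


def set_decimal(value: Optional[Decimal]) -> Optional[Decimal]:
    """Model of the fixed buffer write: an overflowed value is stored as null."""
    if value is None or not fits(value):
        return None
    return value


def update_buggy(buffer: Optional[Decimal], x: Decimal) -> Optional[Decimal]:
    """Second bug: a null buffer is turned back to 0 before adding."""
    base = Decimal(0) if buffer is None else buffer
    return set_decimal(base + x)


def update_fixed(buffer: Optional[Decimal], x: Decimal) -> Optional[Decimal]:
    """Fixed behavior: once the buffer is null, it stays null."""
    if buffer is None:
        return None
    return set_decimal(buffer + x)


def sum_decimals(
    values: Iterable[Decimal],
    update: Callable[[Optional[Decimal], Decimal], Optional[Decimal]],
) -> Optional[Decimal]:
    """Fold the values into a buffer that starts at 0 (a simplification)."""
    buffer: Optional[Decimal] = Decimal(0)
    for v in values:
        buffer = update(buffer, v)
    return buffer


# 900.00 + 900.00 = 1800.00 does not fit DECIMAL(5, 2), so the buffer becomes null.
values = [Decimal("900.00"), Decimal("900.00"), Decimal("1.00")]
print(sum_decimals(values, update_buggy))  # 1.00 (silently wrong)
print(sum_decimals(values, update_fixed))  # None
```

In this model, fixing only the buffer write exposes the wrong result `1.00` from the buggy update step, which matches the reasoning above for fixing both bugs in one change.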

### Why are the changes needed?

Fix issues during decimal sum when overflow happens

### Does this PR introduce _any_ user-facing change?

Yes. Decimal sum now correctly returns null on overflow in non-ANSI mode.
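
As a rough sketch of the user-facing difference, the Python model below returns null (`None`) on overflow when a hypothetical `ansi_enabled` flag is off. The source only states the non-ANSI behavior; raising an error when the flag is on is an assumption about how ANSI mode treats overflow and may not apply to this branch. `finish_sum` and its parameters are illustrative names, not Spark APIs.

```python
from decimal import Decimal
from typing import Optional


def finish_sum(
    total: Decimal, precision: int, scale: int, ansi_enabled: bool
) -> Optional[Decimal]:
    """Return the sum if it fits DECIMAL(precision, scale), else handle overflow."""
    if abs(total) < Decimal(10) ** (precision - scale):
        return total
    if ansi_enabled:
        # Assumption: ANSI mode reports overflow as an error instead of null.
        raise ArithmeticError("decimal sum overflow")
    # Non-ANSI mode, as described in the PR: overflow yields null.
    return None


print(finish_sum(Decimal("1800.00"), 5, 2, ansi_enabled=False))  # None
```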

### How was this patch tested?

New and updated tests.

Closes apache#29026 from cloud-fan/decimal.

Authored-by: Wenchen Fan
Signed-off-by: HyukjinKwon
fix error: java.lang.IllegalArgumentException: Can not interpolate java.lang.Boolean into code block.
fix ci error
@zheniantoushipashi zheniantoushipashi left a comment

LGTM

@jlfsdtc jlfsdtc merged commit a2ddc4b into Kyligence:kyspark-2.4.1.x-4.x Jul 23, 2021

5 participants