Fix an edge case with Round function when the scaled number exceeds Double.MAX_VALUE #16876
Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Chenren Shao.
We probably need to use Math.toIntExact (and catch ArithmeticException, converting it to TrinoException if decimals > Integer.MAX_VALUE)
Nevermind, the parameter is annotated @SqlType(StandardTypes.INTEGER), so really we should just change decimals to an int parameter instead of long. Nevermind again, it looks like these are always passed as long on purpose, maybe for code generation simplicity (cc: @martint, who might know). I would still suggest using toIntExact so that we fail instead of overflowing if someone ever passes a value larger than Integer.MAX_VALUE.
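To make the toIntExact suggestion concrete, here is a minimal sketch. The class and method names are hypothetical, and IllegalArgumentException stands in for TrinoException so the snippet compiles without the Trino SPI; it is not the code from this PR.

```java
final class DecimalsArgument
{
    private DecimalsArgument() {}

    // Hypothetical helper: narrows the long `decimals` argument to int and
    // fails loudly instead of silently wrapping around on overflow.
    static int checkedDecimals(long decimals)
    {
        try {
            return Math.toIntExact(decimals);
        }
        catch (ArithmeticException e) {
            // Assumption: real code would throw TrinoException here;
            // IllegalArgumentException is only a stand-in for this sketch.
            throw new IllegalArgumentException("decimals is out of int range: " + decimals, e);
        }
    }

    public static void main(String[] args)
    {
        // In range: passes through unchanged.
        System.out.println(checkedDecimals(2));
        // Out of range: an unchecked (int) cast would wrap silently,
        // whereas the checked helper throws.
        try {
            checkedDecimals(Integer.MAX_VALUE + 1L);
        }
        catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```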
core/trino-main/src/main/java/io/trino/operator/scalar/MathFunctions.java
Having tested this locally, I can say that the performance of this approach is terrible. The unit test examples added take over 5 seconds to perform each rounding operation.
I wonder if there's something better we can do when handling high precision rounding. Note that BigDecimal.valueOf(double) is implemented as:
    public static BigDecimal valueOf(double val) {
        // <irrelevant comment elided>
        return new BigDecimal(Double.toString(val));
    }
where Double.toString internally is already doing a lot of the same work we're trying to accomplish in our rounding routine in order to produce the string output which is then just parsed back in. There might be a more efficient approach we can take from the implementation there to get a correct result without such a large performance hit.
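For readers following along, a BigDecimal-based rounding of the kind discussed here might look roughly like the sketch below. The class and method names are hypothetical and this is not the code from the PR. It also shows why a huge scale is expensive: setScale has to materialize an unscaled value with roughly as many digits as the requested scale.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class BigDecimalRoundSketch
{
    private BigDecimalRoundSketch() {}

    // Hypothetical helper: rounds through BigDecimal. Correct for any finite
    // input, but the cost grows with `decimals`, since setScale builds an
    // unscaled integer with about that many digits.
    static double roundViaBigDecimal(double value, int decimals)
    {
        if (Double.isNaN(value) || Double.isInfinite(value)) {
            return value;
        }
        return BigDecimal.valueOf(value)
                .setScale(decimals, RoundingMode.HALF_UP)
                .doubleValue();
    }

    // Illustrates the point about valueOf: it yields the same BigDecimal as
    // parsing the Double.toString output.
    static boolean valueOfMatchesToString(double value)
    {
        return BigDecimal.valueOf(value).equals(new BigDecimal(Double.toString(value)));
    }

    public static void main(String[] args)
    {
        System.out.println(roundViaBigDecimal(123.456, 2));
        System.out.println(valueOfMatchesToString(123.4));
    }
}
```

Calling roundViaBigDecimal with a scale in the millions is where the multi-second timings would be expected to show up, so the usage above sticks to small scales.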
Yes, I totally agree. I tested the performance locally as well. I think the performance cost is the main reason we use the rescaling approach for the other cases, even though the BigDecimal approach should work for any input. I am open to other ideas on how to handle this edge case with better performance than the BigDecimal approach.
martint
left a comment
10000000 is not a reasonable input for this function. A double can't have more than ~17 significant digits, so it doesn't make sense to be able to round to anything smaller than that. We should fix it either by short-circuiting the computation if the number of digits is larger than what would produce any visible effect, or by failing on such input.
True, but this issue exists for a more reasonable case too. For round(a, b), if a is in the neighborhood of Double.MAX_VALUE, say Double.MAX_VALUE-1, and b = 2, which is a reasonable input, then a*10^b exceeds Double.MAX_VALUE and it will break.
Double.MAX_VALUE is a large integer number with 17 significant digits and many zeros after, so it doesn't matter what the value of b is. The round operation should be a no-op. This is one of the cases where we could short-circuit the computation.
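One way the short-circuit idea could be sketched is shown below. This is an illustration under stated assumptions, not the merged implementation: the class name, the 2^52 threshold, and the half-away-from-zero rounding are choices made for this example. The reasoning is that once the scaled magnitude reaches 2^52, a double can no longer hold a fractional part, so there is nothing left to round.

```java
final class RoundShortCircuitSketch
{
    // Assumption for this sketch: at or above 2^52 a double has no fractional
    // bits, so the scaled value is already an integer (or infinite).
    private static final double NO_FRACTION_THRESHOLD = (double) (1L << 52);

    private RoundShortCircuitSketch() {}

    static double round(double value, long decimals)
    {
        if (Double.isNaN(value) || Double.isInfinite(value) || value == 0) {
            return value;
        }
        double factor = Math.pow(10, decimals);
        if (factor == 0) {
            // decimals is so negative that every digit is rounded away
            return 0.0;
        }
        double scaled = Math.abs(value) * factor;
        if (scaled >= NO_FRACTION_THRESHOLD) {
            // Covers the overflow to infinity as well: rounding is a no-op.
            return value;
        }
        // scaled < 2^52, so Math.round cannot saturate at Long.MAX_VALUE.
        return Math.copySign(Math.round(scaled) / factor, value);
    }

    public static void main(String[] args)
    {
        System.out.println(round(123.4, 10_000_000));
        System.out.println(round(Double.MAX_VALUE, 2) == Double.MAX_VALUE);
        System.out.println(round(123.456, 2));
    }
}
```

The single threshold check handles both cases raised in this thread: an absurdly large decimals argument and a value near Double.MAX_VALUE with a small one.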
That makes sense. I have updated it to return the original value directly and added more reasonable unit tests.
cc: @martint - is this what you had in mind instead?
@martint Hi, Martin. Can you merge if you don't have other concerns?
@pettyjamesm @findepi can you guys take a look as well?
Thanks for the fix, @cshao239 ! |
Description
Fix an edge case with Round function when the scaled number exceeds Double.MAX_VALUE
This issue was caused by PR #14620.
Before PR #14620, select round(123.4, 10000000) incorrectly returned 0.0.
After PR #14620, select round(123.4, 10000000) fails with the error "input is infinite or NaN".
With this fix, it correctly returns 123.4.
In general, this PR fixes an edge case for round(a, b) when a*10^b exceeds Double.MAX_VALUE.
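The overflow itself is easy to reproduce in isolation. The snippet below is a small illustration with hypothetical names; it only shows the scaling step, not Trino's rounding code.

```java
final class ScaleOverflowDemo
{
    private ScaleOverflowDemo() {}

    // The scaling step a * 10^b, computed in double arithmetic.
    static double scale(double value, long decimals)
    {
        return value * Math.pow(10, decimals);
    }

    public static void main(String[] args)
    {
        // 10^10000000 is not representable, so the product is infinite.
        System.out.println(Double.isInfinite(scale(123.4, 10_000_000)));
        // A value near Double.MAX_VALUE overflows even for a small scale.
        System.out.println(Double.isInfinite(scale(Double.MAX_VALUE, 2)));
        // An ordinary input stays finite.
        System.out.println(Double.isInfinite(scale(123.4, 2)));
    }
}
```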
Additional context and related issues
Release notes
(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: