Skip to content

Conversation

@cshao239
Copy link
Contributor

@cshao239 cshao239 commented Apr 4, 2023

Description

Fix an edge case with Round function when the scaled number exceeds Double.MAX_VALUE

This issue was caused by PR #14620.

Before this PR, select round(123.4, 10000000) would return 0.0 incorrectly.

After this PR, select round(123.4, 10000000) would cause issue "input is infinite or NaN"

After this fix, it would return correctly 123.4.

In general this PR fixes an edge case for round(a, b) if a*b^10 is exceeds Double.MAX_VALUE

Additional context and related issues

Release notes

(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

@cla-bot
Copy link

cla-bot bot commented Apr 4, 2023

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Chenren Shao.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@cshao239 cshao239 requested a review from findepi April 4, 2023 14:57
@cshao239 cshao239 marked this pull request as draft April 4, 2023 15:05
@cla-bot cla-bot bot added the cla-signed label Apr 4, 2023
@cshao239 cshao239 marked this pull request as ready for review April 4, 2023 15:51
@cshao239 cshao239 requested a review from martint April 10, 2023 13:49
Copy link
Member

@pettyjamesm pettyjamesm Apr 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably need to use Math.toIntExact (and catch ArithmeticException, converting it to TrinoException if decimals > Integer.MAX_VALUE)

Nevermind, the parameter is annotated @SqlType(StandardTypes.INTEGER) so really we should just change decimals to an int parameter instead of long. Nevermind again, it looks like these are always passed as long on purpose, maybe for code generation simplicity (cc: @martint who might know). I would still suggesting using toIntExact so that we fail instead of overflowing if someone ever passes an value larger than Integer.MAX_VALUE.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having tested this locally, I can say that the performance of this approach is terrible. The unit test examples added take over 5 seconds to perform each rounding operation.

I wonder if there's something better we can do when handling high precision rounding. Note that BigDecimal.valueOf(double) is implemented as:

public static BigDecimal valueOf(double val) {
        // <irrelevant comment ellided>
        return new BigDecimal(Double.toString(val));
    }

where Double.toString internally is already doing a lot of the same work we're trying to accomplish in our rounding routine in order to produce the string output which is then just parsed back in. There might be a more efficient approach we can take from the implementation there to get a correct result without such a large performance hit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I totally agree. I tested the performance locally as well! I think the performance implication here is the main reason why we use rescaled approach for other cases, given that BigDecimal approach should work for any input. I am open to other ideas how we can handle this edge case with better performance than BigDecimal approach.

Copy link
Member

@martint martint left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

10000000 is not a reasonable input for this function. A double number can't have more that ~17 significant digits, so it doesn't make sense to be able to round to anything smaller than that. We should fix it either by short-circuiting the computation if the number of digits is larger what would produce any visible effect or by failing on such input.

@cshao239
Copy link
Contributor Author

cshao239 commented Apr 12, 2023

10000000 is not a reasonable input for this function.

True, but this issue exists for a more reasonable case too. For round(a, b), if a is somehow at the neighborhood of Double.MAX_VALUE, let's say Double.MAX_VALUE-1, b = 2, which is a reasonable input, then a*b^10 exceeds Double.MAX_VALUE, and it will break.

@martint
Copy link
Member

martint commented Apr 12, 2023

Double.MAX_VALUE is a large integer number with 17 significant digits and many zeros after, so it doesn't matter what the value of b is. The round operation should be a no-op. This is one of the cases where we could short-circuit the computation.

@cshao239
Copy link
Contributor Author

That makes sense. I have update it with direct return the original value and more reasonable unit tests

@cshao239
Copy link
Contributor Author

cc: @martint - is this what you had in mind instead?

@cshao239
Copy link
Contributor Author

@martint Hi, Martin. Can you merge if you don't have other concerns?

@cshao239
Copy link
Contributor Author

@pettyjamesm @findepi can you guys take a look as well?

@martint martint merged commit fc26d6d into trinodb:master May 1, 2023
@martint
Copy link
Member

martint commented May 1, 2023

Thanks for the fix, @cshao239 !

@github-actions github-actions bot added this to the 416 milestone May 1, 2023
@cshao239 cshao239 deleted the round-max branch December 28, 2023 18:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants