
Fix precision loss in from_unixtime(double) function#21899

Merged
spershin merged 1 commit into prestodb:master from
hainenber:gh-21891-fix-precision-loss-for-fromunix-timestamp-result
Mar 12, 2024

Conversation

@hainenber
Contributor

@hainenber hainenber commented Feb 11, 2024

Description

Apply spershin's proposed change to fix precision loss in timestamps returned by the FROM_UNIXTIME() function.

Motivation and Context

Fixes #21891

Impact

Test Plan

Contributor checklist

  • Please make sure your submission complies with our development, formatting, commit message, and attribution guidelines.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

General Changes
* Fix precision loss in timestamps returned by the `from_unixtime(double)` function.

@hainenber hainenber requested a review from a team as a code owner February 11, 2024 09:34
@hainenber hainenber requested a review from presto-oss February 11, 2024 09:34
@tdcmeehan tdcmeehan requested a review from spershin February 11, 2024 19:23
@tdcmeehan tdcmeehan self-assigned this Feb 11, 2024
Contributor

@mbasmanova mbasmanova left a comment


@hainenber Thank you for the fix.

Please, review Contributing guidelines at https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md and update the PR to comply.

Contributor


Notice that the function name is from_unixtime, not from_unix. Make sure to use the right name in the PR title/description, commit message, Release Notes, etc.

Is TestExpressionCompiler.java the right place for this test? Shouldn't it go somewhere in com/facebook/presto/operator/scalar/TestDateTimeFunctionsBase.java?

Contributor Author


Thanks, I've moved them to com/facebook/presto/operator/scalar/TestDateTimeFunctionsBase.java in the latest commit.

@hainenber hainenber changed the title fix(main/op/scalar): correct timestamp precision for FROM_UNIX() result fix(main/op/scalar): correct timestamp precision for FROM_UNIXTIME() result Feb 12, 2024
@hainenber hainenber force-pushed the gh-21891-fix-precision-loss-for-fromunix-timestamp-result branch from 3ef4ef7 to 0484117 Compare February 12, 2024 11:14
@hainenber
Contributor Author

Commits squashed and now adhering to the commit message guidelines.

Contributor


Would you confirm that this test fails without the change?

Contributor


Maybe add a comment that this particular double was causing loss of precision in the function before the fix, and hence we are testing it here, in case anyone wonders why this number was chosen.

Contributor Author


I can confirm the test fails without the changes.

(screenshot of the failing test)

I reused much of the data from your GH issue to provide the details on why the change is needed :D

Collaborator

@sdruzkin sdruzkin Feb 15, 2024


Is this a fix for when the timestamp presented as a double has nanos? I cannot repro this issue when working with millis directly.

`assertFunction("from_unixtime(1.7041507095805E9)"`

My not-so-unit-test code:

    // Mirror of to_unixtime: converts epoch millis to fractional seconds.
    public static double toUnixTimeTest(long timestamp)
    {
        return timestamp / 1000.0;
    }

    // Old from_unixtime rounding.
    public static long fromUnixTimeTestOld(double unixTime)
    {
        return Math.round(unixTime * 1000);
    }

    // New from_unixtime rounding from this PR.
    public static long fromUnixTimeTestNew(double unixTime)
    {
        return Math.round(Math.floor(unixTime) * 1000 + Math.round((unixTime - Math.floor(unixTime)) * 1000));
    }

    public static void main(String[] args)
    {
        SqlTimestamp baselineTimestamp = sqlTimestampOf(LocalDateTime.of(2024, 1, 1, 23, 11, 49, millisToNanos(580)));
        long baselineMillis = baselineTimestamp.getMillis();

        for (int i = -1_000_000_000; i < 1_000_000_000; i++) {
            long expectedMillis = baselineMillis + i;
            double doubleValue = toUnixTimeTest(expectedMillis);
            long oldMillis = fromUnixTimeTestOld(doubleValue);
            long newMillis = fromUnixTimeTestNew(doubleValue);

            if (expectedMillis != oldMillis || oldMillis != newMillis) {
                SqlTimestamp tsOld = new SqlTimestamp(oldMillis, MILLISECONDS);
                SqlTimestamp tsNew = new SqlTimestamp(newMillis, MILLISECONDS);

                System.out.println(expectedMillis);
                System.out.println(baselineTimestamp);
                System.out.println(tsOld);
                System.out.println(tsNew);
                System.out.println();
            }
        }
    }

Collaborator


@mbasmanova In my understanding, this change might cause checksum mismatches when running the verifier on Velox-written DWRF partitions, because Velox uses nanosecond precision in timestamps.

Contributor


This is a tricky change. Future readers won't know why this is done in this particular way, and 'git log' won't help much because the commit message just says 'loses precision in some corner cases' without providing any details.

Would you add a comment here to explain what's going on and why the computation is done this way, and also update the commit message to explain the problem and solution clearly?

Contributor


@hainenber
We could reuse some sentences from the issue and put them in comment.

Contributor Author


I've amended the change with more details in the form of comments and an updated commit message.

Contributor

@spershin spershin left a comment


@hainenber

Thank you for tackling this change.
It is good to go.
Please add some comments to the new code, as reviewers suggested.

@hainenber hainenber force-pushed the gh-21891-fix-precision-loss-for-fromunix-timestamp-result branch from 0484117 to 3f0b9ef Compare February 13, 2024 09:57
@spershin
Contributor

@hainenber
You might want to rebase your PR on the fresh master and resubmit - there was an update to prevent e2e tests from failing all the time.

Contributor

@mbasmanova mbasmanova left a comment


@hainenber Thank you for iterating on this PR. Looks good modulo some nits, and the commit message needs updating.

There are typos in the commit message, the title is too long, and some lines in the body are too long. I suggest using the following as the commit message.

Fix precision loss in from_unixtime(double) function

from_unixtime(1.7041507095805E9) used to return 1704150709 seconds and 581
milliseconds. It should return 580 milliseconds.

Before this change, the function used Math.round(unixTime * 1000), which loses
precision in some cases.

In the above case, it pushes the resulting number to 1704150709580.500000000,
which after rounding becomes 1704150709581, i.e. 581 milliseconds.

Current commit message: (screenshot)
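For future readers, the before/after rounding formulas quoted in the suggested commit message can be reproduced as a standalone Java sketch. The class and method names here are hypothetical, chosen only for illustration; the actual change lives in Presto's scalar date/time functions.

```java
public class FromUnixtimePrecision {
    // Old behavior: multiply first, then round. For the example input the
    // true product is 1704150709580.49988..., but the nearest double to it
    // is exactly 1704150709580.5, which Math.round then bumps up to ...581.
    public static long fromUnixTimeOld(double unixTime) {
        return Math.round(unixTime * 1000);
    }

    // Fixed behavior per this PR: round the whole seconds and the fractional
    // milliseconds separately, so the rounding error introduced by the
    // multiplication can no longer flip the final millisecond.
    public static long fromUnixTimeNew(double unixTime) {
        return Math.round(Math.floor(unixTime) * 1000
                + Math.round((unixTime - Math.floor(unixTime)) * 1000));
    }

    public static void main(String[] args) {
        double unixTime = 1.7041507095805E9;
        System.out.println(fromUnixTimeOld(unixTime)); // 1704150709581 (wrong: 581 ms)
        System.out.println(fromUnixTimeNew(unixTime)); // 1704150709580 (correct: 580 ms)
    }
}
```

The fix works because `Math.floor(unixTime) * 1000` is an exact integer in double arithmetic (well below 2^53), so only the sub-second part is subject to rounding, and its error is far smaller than half a millisecond.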

Contributor


> and hence we are testing it here

Drop this phrase as it is redundant.

Contributor


Let's move this comment inside the function.

Contributor


> Machine-representable double for the 1.7041507095805E9 is 1704150709.58049988746643066406.

My understanding is that 1.7041507095805E9 cannot be represented exactly as a double. Are you saying that 1704150709.58049988746643066406 is the closest value that can be represented?

Contributor Author


Yes, that's the original sentence from the author :D
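For what it's worth, this can be checked directly (a side sketch, not part of the PR): the `BigDecimal(double)` constructor preserves the exact binary value of a double, so printing it reveals the closest machine-representable number to the decimal literal.

```java
import java.math.BigDecimal;

public class NearestDouble {
    public static void main(String[] args) {
        // Prints the exact value of the double nearest to 1.7041507095805E9,
        // which begins 1704150709.58049988... (slightly below .5805).
        System.out.println(new BigDecimal(1.7041507095805E9));
    }
}
```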

@mbasmanova mbasmanova changed the title fix(main/op/scalar): correct timestamp precision for FROM_UNIXTIME() result Fix precision loss in from_unixtime(double) function Feb 14, 2024
@mbasmanova
Contributor

CC: @kaikalur @aditi-pandit

@github-actions

github-actions bot commented Feb 15, 2024

Codenotify: Notifying subscribers in CODENOTIFY files for diff 75ebaf1...5acdf21.

No notifications.

@hainenber hainenber force-pushed the gh-21891-fix-precision-loss-for-fromunix-timestamp-result branch from 934f6c9 to b1c9942 Compare February 15, 2024 09:06
Contributor

@spershin spershin left a comment


Looks good, thank you for working on this and addressing tons of comments!

@spershin
Contributor

@hainenber

Let's rebase and merge this change.
Thanks!

@mbasmanova mbasmanova requested a review from rschlussel March 7, 2024 17:56
@spershin
Contributor

spershin commented Mar 7, 2024

@hainenber

Let's rebase and merge this change.
Let us know, please, if you for some reason are unable to do this.
Thanks!

@hainenber
Contributor Author

Hi there, other folks can take over this change. Sorry for the belated info!

from_unixtime(1.7041507095805E9) used to return 1704150709 seconds and 581
milliseconds. It should return 580 milliseconds.

Before this change, the function used Math.round(unixTime * 1000), which loses
precision in some cases.

In the above case, it pushes the resulting number to 1704150709580.500000000,
which after rounding becomes 1704150709581, i.e. 581 milliseconds.
@tdcmeehan tdcmeehan force-pushed the gh-21891-fix-precision-loss-for-fromunix-timestamp-result branch from b1c9942 to 5acdf21 Compare March 8, 2024 16:21
@tdcmeehan
Contributor

@spershin rebased, could you please re-review?

Contributor

@spershin spershin left a comment


Thanks for the fix!

@spershin
Contributor

@tdcmeehan

Accepted, however, I'm not a committer, so a committer needs to review this too.

cc @mbasmanova ?

Contributor

@mbasmanova mbasmanova left a comment


Thanks.

@spershin spershin merged commit 6a9f3cd into prestodb:master Mar 12, 2024
@wanglinsong wanglinsong mentioned this pull request May 1, 2024
Successfully merging this pull request may close these issues.

FROM_UNIXTIME(double) implementation harbors double precision bug.
