Skip to content

Conversation

@Sudhar287
Copy link
Contributor

@Sudhar287 Sudhar287 commented Jul 6, 2020

What changes were proposed in this pull request?

Replaced floorDiv to just / in localRebaseGregorianToJulianDays() in spark/sql/catalyst/util/RebaseDateTime.scala

Why are the changes needed?

Easier to understand the logic/code and a little more efficiency.

Does this PR introduce any user-facing change?

NO

How was this patch tested?

Proof of concept here. The operation utcCal.getTimeInMillis / MILLIS_PER_DAY results in an interger value already.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the division?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the comma needs to become a slash.
I think this change would then be OK as the sign of these two args are always the same.
However, does it make much difference? it's going to be a few extra math ops. It might be useful to just keep this consistent

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops Sorry, silly me.
And yes @srowen, this does make things faster. I can do some benchmarking if you would like me to...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'd be surprised if the operations were meaningfully faster, given how much else is happening. I'm sure a microbenchmark of a trillion calls would show a difference, but that's not quite the use case.
Nevertheless are there other instances where both args are definitely nonnegative?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, the question is not about performance but about our understanding of the logic.
If we set only date fields:

.setDate(localDate.getYear, localDate.getMonthValue - 1, localDate.getDayOfMonth)

the time part MUST be zero. If it is not, something is wrong going on.

I would add the assert before the division:

assert(utcCal.getTimeInMillis % MILLIS_PER_DAY == 0)

.setDate(localDate.getYear, localDate.getMonthValue - 1, localDate.getDayOfMonth)
.build()
Math.toIntExact(Math.floorDiv(utcCal.getTimeInMillis, MILLIS_PER_DAY))
Math.toIntExact(utcCal.getTimeInMillis/MILLIS_PER_DAY)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Math.toIntExact(utcCal.getTimeInMillis/MILLIS_PER_DAY)
Math.toIntExact(utcCal.getTimeInMillis / MILLIS_PER_DAY)

Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're OK with it @MaxGekk I think I am.
Are there other places that could use a similar change?

Copy link
Member

@MaxGekk MaxGekk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@MaxGekk
Copy link
Member

MaxGekk commented Jul 11, 2020

@Sudhar287 Please, clean up PR's description, and the title (it is so generic).

@Sudhar287
Copy link
Contributor Author

PTAL @MaxGekk :)

@srowen
Copy link
Member

srowen commented Jul 15, 2020

Jenkins test this please

@SparkQA
Copy link

SparkQA commented Jul 15, 2020

Test build #125894 has started for PR 29008 at commit 02cfa89.

@Sudhar287
Copy link
Contributor Author

Hmm, it says no failures here. What am I suppose to do?

@srowen
Copy link
Member

srowen commented Jul 16, 2020

Jenkins retest this please

@SparkQA
Copy link

SparkQA commented Jul 16, 2020

Test build #125977 has finished for PR 29008 at commit 02cfa89.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Jul 17, 2020

Jenkins retest this please

@SparkQA
Copy link

SparkQA commented Jul 17, 2020

Test build #126066 has finished for PR 29008 at commit 02cfa89.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen srowen closed this in f9f9309 Jul 18, 2020
@srowen
Copy link
Member

srowen commented Jul 18, 2020

Merged to master

@Sudhar287
Copy link
Contributor Author

Awesome. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants