@gengliangwang
Member

What changes were proposed in this pull request?

This is a follow-up to #26907.
It changes the for loop for (element <- array.asScala) to a while loop.
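
For readers without the diff at hand, the shape of the change can be sketched roughly as follows. This is an illustration only, not the code from this PR: the object name LoopSketch, both helper names, and the Collection[Integer] element type are assumptions made for the example. It uses scala.collection.JavaConverters, which is available on Scala 2.12 (on 2.13+ scala.jdk.CollectionConverters is the preferred import).

```scala
import java.util.Collection
import scala.collection.JavaConverters._

// Hypothetical example object; names and element type are illustrative only.
object LoopSketch {
  // Before: for-comprehension over the Java collection wrapped via asScala.
  def sumWithFor(array: Collection[Integer]): Long = {
    var total = 0L
    for (element <- array.asScala) {
      total += element.longValue()
    }
    total
  }

  // After: explicit while loop driving the Java iterator directly,
  // with no Scala wrapper and no closure for the loop body.
  def sumWithWhile(array: Collection[Integer]): Long = {
    var total = 0L
    val iterator = array.iterator()
    while (iterator.hasNext) {
      total += iterator.next().longValue()
    }
    total
  }
}
```

Both helpers visit the elements in the same order and return the same result; only the traversal mechanism differs.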

Why are the changes needed?

As per https://github.com/databricks/scala-style-guide#traversal-and-zipwithindex, we should use a while loop in performance-sensitive code.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing tests.

@gengliangwang
Member Author

cc @HyukjinKwon @steven-aerts

@SparkQA

SparkQA commented Jan 8, 2020

Test build #116281 has finished for PR 27127 at commit 4a96604.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@steven-aerts left a comment

I wrote a small benchmark for this.

As expected, zipWithIndex is very slow.
This change gains a little, but not significantly.
Another nice idiom, I think, is:

for (element <- collection.iterator().asScala)

Which is also very fast.
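
In compilable Scala, the idiom suggested above might look like the sketch below. The object name IteratorIdiomSketch, the helper copyToArray, and the String element type are hypothetical and chosen only for illustration; the index is tracked with a manual counter rather than zipWithIndex. No timings are implied here beyond what the benchmark mentioned in this comment reports.

```scala
import java.util.Collection
import scala.collection.JavaConverters._

// Hypothetical example object; not taken from the PR or the benchmark.
object IteratorIdiomSketch {
  // Copies a Java collection into a Scala array by looping over the
  // Java iterator wrapped as a Scala Iterator.
  def copyToArray(collection: Collection[String]): Array[String] = {
    val result = new Array[String](collection.size())
    var i = 0
    for (element <- collection.iterator().asScala) {
      result(i) = element
      i += 1
    }
    result
  }
}
```

Wrapping the iterator instead of the whole collection keeps the for syntax while still walking the underlying Java iterator one element at a time.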

@gengliangwang
Member Author

@steven-aerts Thanks for the benchmark!

@gengliangwang
Member Author

Merging to master.

Member

@HyukjinKwon left a comment

LGTM
