Skip to content

Conversation

@jaceklaskowski
Copy link
Contributor

What changes were proposed in this pull request?

Example of monotonically_increasing_id standard function (with how it works internally) in scaladoc

How was this patch tested?

Local build. Waiting for Jenkins

@SparkQA
Copy link

SparkQA commented Jul 24, 2018

Test build #93494 has finished for PR 21858 at commit 29def00.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

*
* // Make sure that every partition has the same number of rows
* q.mapPartitions(rows => Iterator(rows.size)).foreachPartition(rows => assert(rows.next == 2))
* q.select(monotonically_increasing_id).show
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eh @jaceklaskowski, wouldn't this one be enough as an example?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about explaining the "internals" of the operator through a more involved example and actually thought about removing the line 1166 (but forgot). I think the following lines make for a very in-depth explanation and use other operators in use.

In other words, I'm in favour of removing the line 1166 and leaving the others with no changes. Possible?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you're exploring the internals but .. to be honest I was wondering if users are usually interested in such in-deep explanation since I guess most of them wouldn't care about the details.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO It' enough to add that rows are consecutive in each partition, but not between partitions and that values are shifted left by 33 - written in words, not code, will be much shorter and concise

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally would simplify the example to not focus on the particular shift; yeah that behavior ought not change but it's not really something a caller would ever rely on. And I think you don't need to make a new variable to subtract 1 from row number, etc. Something simply showing the two properties -- increasing within partition, not between partitions -- is enough.


override def prettyName: String = "monotonically_increasing_id"

override def sql: String = s"$prettyName()"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the default and no need for the override, isn't it?

*
* // Make sure that every partition has the same number of rows
* q.mapPartitions(rows => Iterator(rows.size)).foreachPartition(rows => assert(rows.next == 2))
* q.select(monotonically_increasing_id).show
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally would simplify the example to not focus on the particular shift; yeah that behavior ought not change but it's not really something a caller would ever rely on. And I think you don't need to make a new variable to subtract 1 from row number, etc. Something simply showing the two properties -- increasing within partition, not between partitions -- is enough.

@srowen
Copy link
Member

srowen commented Aug 19, 2018

I think this is mergeable if the examples are simplified a bit per comments above @jaceklaskowski

@srowen
Copy link
Member

srowen commented Sep 24, 2018

@jaceklaskowski would you like to update this?

@srowen
Copy link
Member

srowen commented Oct 9, 2018

Ping @jaceklaskowski to update or close

@SparkQA
Copy link

SparkQA commented Oct 22, 2018

Test build #97772 has finished for PR 21858 at commit 29def00.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen srowen mentioned this pull request Oct 24, 2018
@asfgit asfgit closed this in 65c653f Oct 25, 2018
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 22, 2025
Closes apache#22567
Closes apache#18457
Closes apache#21517
Closes apache#21858
Closes apache#22383
Closes apache#19219
Closes apache#22401
Closes apache#22811
Closes apache#20405
Closes apache#21933

Closes apache#22819 from srowen/ClosePRs.

Authored-by: Sean Owen <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants