[SPARK-24899][SQL][DOC] Add example of monotonically_increasing_id standard function to scaladoc #21858
…andard function to scaladoc
Test build #93494 has finished for PR 21858 at commit
 *
 * // Make sure that every partition has the same number of rows
 * q.mapPartitions(rows => Iterator(rows.size)).foreachPartition(rows => assert(rows.next == 2))
 * q.select(monotonically_increasing_id).show
eh @jaceklaskowski, wouldn't this one be enough as an example?
I thought about explaining the "internals" of the operator through a more involved example and actually meant to remove line 1166 (but forgot). I think the following lines make for a very in-depth explanation and show other operators in use.
In other words, I'm in favour of removing line 1166 and leaving the others unchanged. Possible?
I know you're exploring the internals, but to be honest I was wondering whether users are usually interested in such an in-depth explanation, since I guess most of them wouldn't care about the details.
IMHO it's enough to add that rows are consecutive within each partition but not between partitions, and that values are shifted left by 33 bits. Written in words rather than code, it will be much shorter and more concise.
I personally would simplify the example to not focus on the particular shift; yeah that behavior ought not change but it's not really something a caller would ever rely on. And I think you don't need to make a new variable to subtract 1 from row number, etc. Something simply showing the two properties -- increasing within partition, not between partitions -- is enough.
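For reference, the two properties discussed in this thread can be shown without a Spark session. The following is a minimal sketch in plain Scala, assuming the layout described in the Spark API docs (partition index in the upper 31 bits, per-partition record number in the lower 33 bits). `MonotonicIdSketch`, `idFor`, and `idsFor` are hypothetical names used only for illustration; they are not Spark APIs and this is not Spark's implementation.

```scala
// Plain-Scala model of the documented ID layout; not Spark code.
// All names here are hypothetical and exist only for this illustration.
object MonotonicIdSketch {
  // Number of low-order bits assumed to hold the per-partition row counter.
  private val RowBits = 33

  // Combine a partition index and a zero-based row position into one Long.
  def idFor(partitionIndex: Int, rowInPartition: Long): Long =
    (partitionIndex.toLong << RowBits) + rowInPartition

  // IDs for a dataset with the given number of rows in each partition.
  def idsFor(rowsPerPartition: Seq[Int]): Seq[Seq[Long]] =
    rowsPerPartition.zipWithIndex.map { case (rows, partition) =>
      (0 until rows).map(row => idFor(partition, row.toLong))
    }

  def main(args: Array[String]): Unit =
    idsFor(Seq(2, 2, 2)).zipWithIndex.foreach { case (ids, partition) =>
      println(s"partition $partition: ${ids.mkString(", ")}")
    }
}
```

With three partitions of two rows each, the IDs come out as 0 and 1, then 8589934592 and 8589934593, then 17179869184 and 17179869185: consecutive inside a partition, increasing overall, and with large gaps between partitions.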
override def prettyName: String = "monotonically_increasing_id"

override def sql: String = s"$prettyName()"
Why this change?
It's the default and no need for the override, isn't it?
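To illustrate why such an override can be redundant, here is a small sketch. It assumes the base class builds `sql` from `prettyName` and the children's SQL, which is how I understand Catalyst's `Expression` to behave. `ExpressionSketch` and `MonotonicallyIncreasingIdSketch` are hypothetical stand-ins, not Spark classes.

```scala
// Hypothetical stand-in for Catalyst's Expression; not the real class.
trait ExpressionSketch {
  def children: Seq[ExpressionSketch]
  def prettyName: String

  // Assumed default: function-call syntax built from prettyName and children.
  def sql: String = s"$prettyName(${children.map(_.sql).mkString(", ")})"
}

// Hypothetical leaf expression: no children, so the default prints "name()".
object MonotonicallyIncreasingIdSketch extends ExpressionSketch {
  def children: Seq[ExpressionSketch] = Nil
  def prettyName: String = "monotonically_increasing_id"
  // No `sql` override needed under the assumption above.
}
```

Under that assumption, `MonotonicallyIncreasingIdSketch.sql` evaluates to `monotonically_increasing_id()`, the same string the explicit override would produce.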
I think this is mergeable if the examples are simplified a bit per comments above @jaceklaskowski
@jaceklaskowski would you like to update this?
Ping @jaceklaskowski to update or close
Test build #97772 has finished for PR 21858 at commit
Closes apache#22567 Closes apache#18457 Closes apache#21517 Closes apache#21858 Closes apache#22383 Closes apache#19219 Closes apache#22401 Closes apache#22811 Closes apache#20405 Closes apache#21933 Closes apache#22819 from srowen/ClosePRs. Authored-by: Sean Owen <[email protected]> Signed-off-by: Sean Owen <[email protected]>
What changes were proposed in this pull request?
Example of monotonically_increasing_id standard function (with how it works internally) in scaladoc
How was this patch tested?
Local build. Waiting for Jenkins