[MINOR][DOC] Add note regarding proper usage of QueryExecution.toRdd #23822
Test build #102462 has finished for PR 23822 at commit
Retest this, please
cc. @cloud-fan
LGTM
Test build #102469 has finished for PR 23822 at commit
retest this please
 * accessing after iteration. (Calling `collect()` is one of known bad usage.)
 * If you want to store these rows into collection, please apply some converter or copy row
 * which produces new object per iteration.
 */
@HeartSaVioR Should we point users to the `Dataset.rdd` method, where the conversion is already applied?
Yeah, that's a good suggestion for end users (not Spark developers). Will add.
BTW, I don't think it's an API though .. technically we don't have to worry about end users.
Test build #102484 has finished for PR 23822 at commit
 * If you want to store these rows into collection, please apply some converter or copy row
 * which produces new object per iteration.
 * Given QueryExecution is not a public class, end users are discouraged to use this: please
 * user `Dataset.rdd` instead which conversion will be applied.
user -> use
which -> in which
or
which -> where ?
Nice finding. Applied.
LGTM with a very minor comment.
Test build #102490 has finished for PR 23822 at commit
Merged to master.
What changes were proposed in this pull request?
This proposes adding a note on `QueryExecution.toRdd` regarding an internal Spark optimization that callers need to be aware of.
How was this patch tested?
This patch is a documentation change.
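To illustrate the kind of pitfall the note describes, here is a minimal sketch in plain Scala with no Spark dependency. It assumes the optimization in question is that the iterator hands back the same mutable row object on every step, which is what the wording "copy row which produces new object per iteration" suggests. `MutableRow`, `RowReuseDemo`, and `reusedRows` are hypothetical names made up for this example; they are not Spark classes or APIs.

```scala
// Hypothetical stand-in for a mutable internal row; NOT a Spark class.
final class MutableRow(var value: Int) {
  // Produces an independent snapshot of the current contents.
  def copy(): MutableRow = new MutableRow(value)
}

object RowReuseDemo {
  // Returns an iterator that reuses ONE MutableRow instance for every element,
  // mimicking (under the assumption stated above) an object-reuse optimization.
  def reusedRows(values: Seq[Int]): Iterator[MutableRow] = {
    val shared = new MutableRow(0)
    values.iterator.map { v =>
      shared.value = v // overwrite the shared row in place
      shared
    }
  }

  def main(args: Array[String]): Unit = {
    val input = Seq(1, 2, 3)

    // Bad usage: the list holds three references to the same object,
    // so every entry shows the last value written.
    val storedWithoutCopy = reusedRows(input).toList.map(_.value)

    // Safer usage: copy each row before storing it in a collection.
    val storedWithCopy = reusedRows(input).map(_.copy()).toList.map(_.value)

    println(s"stored without copy: $storedWithoutCopy") // List(3, 3, 3)
    println(s"stored with copy:    $storedWithCopy")    // List(1, 2, 3)
  }
}
```

The same reasoning is why the review thread points end users at `Dataset.rdd`: per the discussion above, the conversion is already applied there, so callers do not have to copy rows themselves.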