[SPARK-1701] Clarify slice vs partition in the programming guide #2305

Closed
mattf wants to merge 3 commits into apache:master from mattf:SPARK-1701

mattf commented Sep 6, 2014

This is a partial solution to SPARK-1701, only addressing the
documentation confusion.

Further work could rename the numSlices parameter itself across
languages, taking care in Scala and Python to maintain backward
compatibility for callers that pass it as a named parameter.
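
To make the named-parameter concern concrete, here is a minimal Python sketch of one way a rename could keep old call sites working. It is an illustration only, not Spark's code: the function `parallelize_sketch`, the new name `numPartitions`, and the default of 1 are all assumptions made for this example.

```python
import warnings


def parallelize_sketch(data, numPartitions=None, numSlices=None):
    """Hypothetical stand-in for a renamed API; not Spark's implementation.

    Accepts the new keyword (numPartitions) and still honors the old one
    (numSlices) so existing callers using the named parameter keep working.
    """
    if numSlices is not None:
        if numPartitions is not None:
            raise TypeError("pass either numPartitions or numSlices, not both")
        # Old keyword: accept it, but tell the caller it is deprecated.
        warnings.warn(
            "numSlices is deprecated; use numPartitions",
            DeprecationWarning,
            stacklevel=2,
        )
        numPartitions = numSlices
    if numPartitions is None:
        numPartitions = 1  # assumed default, for illustration only
    return list(data), numPartitions


# Old and new call styles resolve to the same partition count.
old_style = parallelize_sketch([1, 2, 3], numSlices=4)
new_style = parallelize_sketch([1, 2, 3], numPartitions=4)
```

Positional callers are unaffected by a rename; only callers that spell out `numSlices=` need the shim.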

SparkQA commented Sep 6, 2014

QA tests have started for PR 2305 at commit 7b045e0.

  • This patch merges cleanly.

SparkQA commented Sep 6, 2014

QA tests have finished for PR 2305 at commit 7b045e0.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

mattf commented Sep 11, 2014

@JoshRosen will you take a look at this?

@JoshRosen

Sorry for not reviewing this until now; it sort of fell off my radar.

Comment thread on docs/programming-guide.md (outdated)

Maybe the "Note:" should mention that in some places we still say numSlices (for backwards compatibility with earlier versions of Spark) and that "slices" should be considered as a synonym for "partitions"; there are a lot of places that use numPartitions, etc, so we may want to emphasize that this discrepancy only occurs in a few places.
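
As a rough illustration of why "slices" and "partitions" can be read as synonyms, the sketch below cuts a local collection into a requested number of contiguous pieces. It assumes an even, position-based split; the helper name `slice_collection` is hypothetical and Spark's own splitting logic may differ in detail.

```python
def slice_collection(data, num_slices):
    """Split data into num_slices contiguous chunks of near-equal size.

    Each chunk plays the role of one partition (or "slice") of the dataset.
    Illustrative only; not taken from Spark's source.
    """
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    items = list(data)
    n = len(items)
    # Chunk i covers positions [i*n/num_slices, (i+1)*n/num_slices).
    return [
        items[i * n // num_slices:(i + 1) * n // num_slices]
        for i in range(num_slices)
    ]


parts = slice_collection(range(10), 3)
```

Whatever the parameter is called, the number passed in is simply the number of chunks the data ends up in.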

mattf commented Sep 19, 2014

Thanks for the feedback. I've changed the language to be more in line with your suggestion.

SparkQA commented Sep 19, 2014

QA tests have started for PR 2305 at commit c0af05d.

  • This patch merges cleanly.

SparkQA commented Sep 19, 2014

QA tests have finished for PR 2305 at commit c0af05d.

  • This patch fails unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

mattf commented Sep 19, 2014

> This patch fails unit tests.

I'm getting HTTP 503 from Jenkins, but I'm gonna go out on a limb and say this doc change didn't break the unit tests.

@JoshRosen

I think that Jenkins might have crashed or restarted overnight, but it seems to be working now.

This looks good to me, so I'm going to merge it. Feel free to open similar PRs for other documentation improvements / clarifications, since these types of edits are really helpful.

asfgit closed this in be0c756 on Sep 19, 2014
mattf deleted the SPARK-1701 branch on September 19, 2014 at 21:41
ghost pushed a commit to dbtsai/spark that referenced this pull request on Apr 9, 2017
…nd code?)

## What changes were proposed in this pull request?

I came across the term "slice" when running some Spark Scala code. A Google search indicated that "slices" and "partitions" refer to the same thing; see:

- [This issue](https://issues.apache.org/jira/browse/SPARK-1701)
- [This pull request](apache#2305)
- [This StackOverflow answer](http://stackoverflow.com/questions/23436640/what-is-the-difference-between-an-rdd-partition-and-a-slice) and [this one](http://stackoverflow.com/questions/24269495/what-are-the-differences-between-slices-and-partitions-of-rdds)

Thus this pull request fixes the occurrence of "slice" I came across. Nonetheless, [it would appear](https://github.com/apache/spark/search?utf8=%E2%9C%93&q=slice&type=) there are still many references to "slice/slices", so I thought I'd raise this pull request to address the issue (sorry if this is the wrong place, I'm not too familiar with raising Apache issues).

## How was this patch tested?

(Not tested locally - only a minor exception message change.)

Please review http://spark.apache.org/contributing.html before opening a pull request.

Author: asmith26 <asmith26@users.noreply.github.com>

Closes apache#17565 from asmith26/master.