Contributor

@davies davies commented Jul 24, 2014

Add several default configs for PySpark, related to serialization in JVM.

spark.serializer = org.apache.spark.serializer.KryoSerializer
spark.serializer.objectStreamReset = 100
spark.rdd.compress = True

This should help reduce memory usage during RDD.partitionBy().

SparkQA commented Jul 24, 2014

QA tests have started for PR 1568. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17107/consoleFull

SparkQA commented Jul 24, 2014

QA results for PR 1568:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17107/consoleFull

SparkQA commented Jul 24, 2014

QA tests have started for PR 1568. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17127/consoleFull

SparkQA commented Jul 24, 2014

QA results for PR 1568:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17127/consoleFull

Contributor

I've now merged #1051, so please update this to use _conf.setIfMissing().

Contributor

Also, you may want to move the "spark.rdd.compress" setting that that PR added into your map above.

SparkQA commented Jul 25, 2014

QA tests have started for PR 1568. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17189/consoleFull

SparkQA commented Jul 25, 2014

QA results for PR 1568:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17189/consoleFull

Contributor

@davies, you also need to remove the self._conf.setIfMissing("spark.rdd.compress", "true") line above. Otherwise it looks good.
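
The review thread above asks for the defaults to live in one map and be applied with setIfMissing(), so that a value the user set explicitly is never overwritten. A minimal sketch of that pattern follows. `MiniConf`, `DEFAULT_CONFIGS`, and `apply_defaults` are hypothetical names used only for illustration. `MiniConf` is a stand-in that imitates set-if-missing semantics so the example runs without Spark; it is not PySpark's `SparkConf`, and this is not the code from the patch. The config values are written as strings, which is an assumption about how they would be passed.

```python
class MiniConf:
    """Hypothetical stand-in for a Spark configuration object (not SparkConf)."""

    def __init__(self):
        self._settings = {}

    def set(self, key, value):
        # Unconditionally store the value, overwriting any earlier one.
        self._settings[key] = value
        return self

    def setIfMissing(self, key, value):
        # Store the value only when the key has not been set yet.
        if key not in self._settings:
            self._settings[key] = value
        return self

    def get(self, key, default=None):
        return self._settings.get(key, default)


# The three defaults named in the PR description, kept together in one map.
DEFAULT_CONFIGS = {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.serializer.objectStreamReset": "100",
    "spark.rdd.compress": "true",
}


def apply_defaults(conf, defaults=DEFAULT_CONFIGS):
    # Apply each default without clobbering values the user already chose.
    for key, value in defaults.items():
        conf.setIfMissing(key, value)
    return conf


# The user opts out of RDD compression before the defaults are applied.
user_conf = MiniConf().set("spark.rdd.compress", "false")
apply_defaults(user_conf)
```

In this sketch the user's "false" for spark.rdd.compress survives, while the two keys the user did not set pick up their defaults. Keeping every default in the map also avoids a separate one-off setIfMissing line for spark.rdd.compress, which is the duplicate the reviewer asks to remove.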

SparkQA commented Jul 26, 2014

QA tests have started for PR 1568. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17212/consoleFull

SparkQA commented Jul 26, 2014

QA results for PR 1568:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17212/consoleFull

Contributor

mateiz commented Jul 26, 2014

Merged this, thanks.

@asfgit asfgit closed this in 75663b5 Jul 26, 2014
@davies davies deleted the conf branch July 29, 2014 00:42
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
Add several default configs for PySpark, related to serialization in JVM.

spark.serializer = org.apache.spark.serializer.KryoSerializer
spark.serializer.objectStreamReset = 100
spark.rdd.compress = True

This will help to reduce the memory usage during RDD.partitionBy()

Author: Davies Liu

Closes apache#1568 from davies/conf and squashes the following commits:

cd316f1 [Davies Liu] remove duplicated line
f71a355 [Davies Liu] rebase to master, add spark.rdd.compress = True
8f63f45 [Davies Liu] Merge branch 'master' into conf
8bc9f08 [Davies Liu] fix unittest
c04a83d [Davies Liu] some default configs for PySpark