@minahlee
Member

@minahlee minahlee commented Jan 7, 2016

What is this PR for?

Set spark.yarn.isPython to be true to distribute pyspark libraries to workers when master is yarn-client

What type of PR is it?

Bug Fix

Is there a relevant Jira issue?

ZEPPELIN-572

How should this be tested?

You need a YARN cluster to test this PR.
A simple way to test it is to run the code below in a paragraph and check that it does not throw an error.

%pyspark
print(sc.parallelize([1, 2]).count())

And you should be able to see that spark.yarn.isPython is set to true in Spark UI > Environment > Spark Properties only when you set spark.master as yarn-client.
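The intended behaviour can be sketched in plain Python as follows. This is only an illustration of the condition described above, not the actual Zeppelin interpreter code; the function name `apply_pyspark_yarn_setting` and the dictionary of properties are hypothetical.

```python
def apply_pyspark_yarn_setting(spark_properties, master):
    """Return a copy of the properties, adding spark.yarn.isPython
    only when the master is yarn-client (hypothetical helper)."""
    props = dict(spark_properties)
    if master == "yarn-client":
        # Marks the application as a Python app so the pyspark
        # libraries get distributed to the workers.
        props["spark.yarn.isPython"] = "true"
    return props


yarn_props = apply_pyspark_yarn_setting({"spark.app.name": "Zeppelin"}, "yarn-client")
local_props = apply_pyspark_yarn_setting({"spark.app.name": "Zeppelin"}, "local[*]")

print(yarn_props.get("spark.yarn.isPython"))
print(local_props.get("spark.yarn.isPython"))
```

With any master other than `yarn-client`, the property is left unset, which matches what should be visible in the Spark UI.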

Questions:

  • Do the license files need an update? No
  • Are there breaking changes for older versions? No
  • Does this need documentation? No

@cloverhearts
Member

+1

@prabhjyotsingh
Contributor

Tested, +1

@Leemoonsoo
Member

LGTM

@jongyoul
Member

jongyoul commented Jan 8, 2016

LGTM

@Leemoonsoo
Member

Although the release branch 'branch-0.5.6' has already been created by https://issues.apache.org/jira/browse/ZEPPELIN-567, I think this change is worth applying to 0.5.6. Shall we merge it into both master and 'branch-0.5.6'?

@Leemoonsoo
Member

If there is no further discussion, I'm merging it into 'branch-0.5.6' and 'master'.

@jongyoul
Member

jongyoul commented Jan 9, 2016

Sure.

@Leemoonsoo
Member

Since 0.5.6-incubating rc1 is already in vote, I'm merging it into master only. We can merge it into branch-0.5.6 anytime we want.

@asfgit asfgit closed this in 2ee234a Jan 10, 2016
@bzz
Member

bzz commented Jan 11, 2016

Looks great, thank you. It definitely belongs to 0.5.6, will merge it there.

asfgit pushed a commit that referenced this pull request Jan 11, 2016
### What is this PR for?
Set `spark.yarn.isPython` to be `true` to distribute pyspark libraries to workers when master is `yarn-client`

### What type of PR is it?
Bug Fix

### Is there a relevant Jira issue?
[ZEPPELIN-572](https://issues.apache.org/jira/browse/ZEPPELIN-572)

### How should this be tested?
You need a YARN cluster to test this PR.
A simple way to test it is to run the code below in a paragraph and check that it does not throw an error.
```
%pyspark
print(sc.parallelize([1, 2]).count())
```
And you should be able to see that `spark.yarn.isPython` is set to `true` in **Spark UI > Environment > Spark Properties** only when you set spark.master as `yarn-client`.

### Questions:
* Does the licenses files need update? No
* Is there breaking changes for older versions? No
* Does this needs documentation? No

Author: Mina Lee <[email protected]>

Closes #605 from minahlee/ZEPPELIN-572 and squashes the following commits:

8c99de5 [Mina Lee] Set spark.yarn.isPython to be true to distribute needed pyspark libraries to workers when master is yarn-client
@darionyaphet

Has this PR been applied to 0.5.6? spark.yarn.isPython=true must be set in zeppelin-env.sh.

@Leemoonsoo
Member

@darionyaphet 0.5.6 includes this patch

@darionyaphet

Hi @Leemoonsoo, it does not seem to take effect.

I built Zeppelin with this command:
mvn package -Pspark-1.5 -Dspark.version=1.5.0 -Pyarn -Ppyspark -DskipTests

When I use yarn-client to start a pyspark job, it does not work.
I have to add ZEPPELIN_INTP_JAVA_OPTS="-Dspark.yarn.isPython=true" to conf/zeppelin-env.sh.
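
A likely reason this workaround helps is that Spark's `SparkConf` picks up JVM system properties whose names start with `spark.`, so a `-D` option passed to the interpreter process ends up as a Spark property. The sketch below only simulates that mapping in plain Python; the helper name `spark_properties_from_java_opts` is hypothetical and is not part of Zeppelin or Spark.

```python
import shlex


def spark_properties_from_java_opts(java_opts):
    """Collect -Dkey=value options whose key starts with 'spark.'
    (hypothetical helper that mimics how such options become Spark properties)."""
    props = {}
    for token in shlex.split(java_opts):
        if token.startswith("-D") and "=" in token:
            key, value = token[2:].split("=", 1)
            if key.startswith("spark."):
                props[key] = value
    return props


# The option string from the workaround, plus an unrelated JVM flag for contrast.
workaround_props = spark_properties_from_java_opts("-Dspark.yarn.isPython=true -Xmx1g")
print(workaround_props)
```

Only the `spark.`-prefixed option is kept; other JVM flags such as `-Xmx1g` are ignored by this mapping.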

@felixcheung
Member

felixcheung commented Apr 8, 2016

Are you sure you are setting master to yarn-client?

@darionyaphet

@felixcheung Yep, I'm using yarn-client.

@meniluca
Contributor

Hi all,
I can confirm: using release 0.5.6 with an external Spark 1.5.1 (CDH 5.5) and master yarn-client, I must add ZEPPELIN_INTP_JAVA_OPTS="-Dspark.yarn.isPython=true" to zeppelin-env.sh.

Maybe it's better to add it in the documentation?

Cheers,
Luca

@minahlee
Member Author

@H4ml3t Sorry for the confusion, I also confirmed that this commit is not included in 0.5.6 binary package by checking source code in http://www.apache.org/dyn/closer.cgi/incubator/zeppelin/0.5.6-incubating/zeppelin-0.5.6-incubating.tgz. Thanks for reporting, I will update the doc

@minahlee minahlee deleted the ZEPPELIN-572 branch August 8, 2016 02:14