@prabhjyotsingh
Contributor

@prabhjyotsingh prabhjyotsingh commented Apr 11, 2016

What is this PR for?

As Zeppelin's notebook evolves toward large-scale data analysis, multiple Zeppelin users are expected to connect to the same set of data repositories within an enterprise. Since Zeppelin notebooks can affect data, state, and lineage, it is important to separate users and give each an appropriate sandbox, in addition to capturing the right audit details. Further, an organization's IT team would prefer to run fewer Zeppelin instances (preferably one) to serve its customers. Therefore, the objectives of creating a multi-tenant Zeppelin are:
● Supporting workloads of multiple customers
● Supporting multiple LOBs (lines of business) on a single data system
● Supporting fine-grained audits
As a natural evolution of the Zeppelin authentication and authorization design, user awareness in downstream data systems such as Spark, Hive, and others is essential to achieve these objectives.

What type of PR is it?

Feature

Todos

  • - Test case
  • - Review Comments
  • - Documentation

What is the Jira issue?

ZEPPELIN-773

How should this be tested?

Screenshots (if appropriate)

screen shot 2016-04-11 at 12 41 35 pm

screen shot 2016-04-11 at 12 41 59 pm

screen shot 2016-04-11 at 12 48 13 pm

Questions:

  • Do the license files need an update? n/a
  • Are there breaking changes for older versions? n/a
  • Does this need documentation? yes

prabhjyotsingh and others added 30 commits February 8, 2016 21:59
if fromMessage.principal.equals("anonymous") then set user as null
# Conflicts:
#	testing/startSparkCluster.sh
# Conflicts:
#	conf/zeppelin-site.xml.template
@prabhjyotsingh
Contributor Author

Thank you @Leemoonsoo for the review. While debugging, I found that the Livy server from https://github.com/cloudera/livy and the one from https://github.com/cloudera/hue/tree/master/apps/spark/java work differently.

https://github.com/cloudera/livy accepts the request as curl -X POST --data '{"kind": "pyspark", "conf":{"spark.master": "local[*]"}}' -H "Content-Type: application/json" localhost:8998/sessions

and https://github.com/cloudera/hue/tree/master/apps/spark/java accepts the same request as curl -X POST --data '{"kind": "pyspark", "master": "local[*]"}' -H "Content-Type: application/json" localhost:8998/sessions

I have added an extra check (44c5e82#diff-6590a16671e62be8da95e793c364d739R72) to handle this difference.
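
For readers comparing the two request shapes described above, here is a minimal sketch of how the same session request body differs between the two servers. The function name `build_session_payload` and the `flavor` labels `"livy"` and `"hue"` are hypothetical names chosen for illustration; this is not the check added in the referenced commit.

```python
import json


def build_session_payload(kind, master, flavor):
    """Build the JSON body for POST /sessions.

    `flavor` is a hypothetical label, not part of either API:
      "livy" -> master is nested under conf["spark.master"]
      "hue"  -> master is a top-level field
    """
    if flavor == "livy":
        return {"kind": kind, "conf": {"spark.master": master}}
    if flavor == "hue":
        return {"kind": kind, "master": master}
    raise ValueError(f"unknown flavor: {flavor!r}")


# Print both bodies so they can be compared with the curl commands above.
for flavor in ("livy", "hue"):
    print(flavor, json.dumps(build_session_payload("pyspark", "local[*]", flavor)))
```

Keeping the difference in one small function means the rest of a client does not need to know which server it is talking to.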

@prabhjyotsingh prabhjyotsingh force-pushed the livyInterperter branch 2 times, most recently from add6e5e to 52ca4e0 Compare May 17, 2016 13:31
@Leemoonsoo
Member

@prabhjyotsingh Thanks for the explanation. I have tried it with https://github.com/cloudera/hue/tree/master/apps/spark/java and LivyInterpreter works great.
However, I still have no luck with https://github.com/cloudera/livy. I'm getting the same error.

@prabhjyotsingh
Contributor Author

prabhjyotsingh commented May 18, 2016

@Leemoonsoo can you try one of these?

curl -X POST --data '{"kind": "spark", "master": "local[*]", "proxyUser": "null"}' -H "Content-Type: application/json" localhost:8998/sessions if you are using https://github.com/cloudera/hue/tree/master/apps/spark/java

or

curl -X POST --data '{"kind": "spark", "conf":{"spark.master": "local[*]"},"proxyUser": "null"}' -H "Content-Type: application/json" localhost:8998/sessions if you are using https://github.com/cloudera/livy.

Most likely this is the known issue described in 3deca71#diff-427ba78b970ccdf0a95f5176f693a377R88, and your server is running from https://github.com/cloudera/livy.

If it is something else, can you paste the output so that I can reproduce, debug, and fix it?

@Leemoonsoo
Member

@prabhjyotsingh Thanks for the help. Commenting out spark.master from https://github.com/cloudera/livy/conf/spark-blacklist.conf made it work.
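
To illustrate why commenting out an entry could change the outcome, here is a rough sketch of a blacklist check. It assumes the file lists one configuration key per line with `#` starting a comment, and that a request setting a listed key is rejected; neither the file format nor the server's actual check is shown in this thread, and the file contents below are hypothetical.

```python
def parse_blacklist(text):
    """Return the set of active (non-commented) keys in a blacklist file."""
    keys = set()
    for line in text.splitlines():
        # Drop anything after '#', then surrounding whitespace.
        entry = line.split("#", 1)[0].strip()
        if entry:
            keys.add(entry)
    return keys


def rejected_keys(conf, blacklist):
    """Return the requested conf keys that the blacklist would block."""
    return sorted(key for key in conf if key in blacklist)


# Hypothetical file contents, before and after commenting out the entry.
BEFORE = "# keys users may not override\nspark.master\n"
AFTER = "# keys users may not override\n# spark.master\n"

requested = {"spark.master": "local[*]"}
print(rejected_keys(requested, parse_blacklist(BEFORE)))
print(rejected_keys(requested, parse_blacklist(AFTER)))
```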

Looks good to me!

@prabhjyotsingh
Contributor Author

@Leemoonsoo thanks a lot for reviewing it.

CI fails because of an unrelated issue in job #4237.1.

print "${group_by=product_id,product_id|product_name|customer_id|store_id}"
```

## Know Issue
Member

"Known Issue"?

@prabhjyotsingh
Contributor Author

Thank you @AhyoungRyu for the review. I have addressed those comments.

@prabhjyotsingh
Contributor Author

Shall I merge this if there is no further discussion?

@Leemoonsoo
Member

+1 for merge!

@felixcheung
Member

+1

@asfgit asfgit closed this in 6acd0ae May 24, 2016
@ghost

ghost commented May 24, 2016

@prabhjyotsingh @Leemoonsoo @felixcheung awesome!

shijinkui pushed a commit to shijinkui/incubator-zeppelin that referenced this pull request May 26, 2016
### How should this be tested?
 - Install Livy by following steps on https://github.com/cloudera/livy
 - Start the Livy server
 - Now by using Zeppelin-Livy interpreter, run any of the spark, pyspark or R commands.
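
Outside Zeppelin, the last step can be approximated by posting a statement to a running session. The sketch below only builds the HTTP request and does not send it. The base URL comes from the curl examples in this conversation, while the `/sessions/{id}/statements` path, the `{"code": ...}` body, and the session id `0` are assumptions about the Livy REST API that should be checked against the server version in use.

```python
import json
import urllib.request

# Assumption: same host and port as the curl examples in this conversation.
BASE_URL = "http://localhost:8998"


def statement_request(session_id, code, base_url=BASE_URL):
    """Build (but do not send) a POST that runs `code` in an existing session."""
    body = json.dumps({"code": code}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/sessions/{session_id}/statements",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = statement_request(0, "1 + 1")
print(req.get_method(), req.full_url)
```

With a Livy server running, passing `req` to `urllib.request.urlopen` would submit the statement.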

### Screenshots (if appropriate)
<img width="1436" alt="screen shot 2016-04-11 at 12 41 35 pm" src="https://cloud.githubusercontent.com/assets/674497/14419479/b514979c-ffe3-11e5-9dea-df9854d8409c.png">

<img width="1434" alt="screen shot 2016-04-11 at 12 41 59 pm" src="https://cloud.githubusercontent.com/assets/674497/14419478/b514922e-ffe3-11e5-9c98-93c5b99de106.png">

<img width="1440" alt="screen shot 2016-04-11 at 12 48 13 pm" src="https://cloud.githubusercontent.com/assets/674497/14419480/b515d8c8-ffe3-11e5-8c20-4c988c621f51.png">

Author: Prabhjyot Singh <[email protected]>
Author: Rohit Choudhary <[email protected]>

Closes apache#827 from prabhjyotsingh/livyInterperter and squashes the following commits:

9689da0 [Prabhjyot Singh] check for more session not found error
aeb5a73 [Prabhjyot Singh] update doc with review comment and add FAQ
5c2bf13 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
9833c59 [Prabhjyot Singh] ZEPPELIN-773: log error in all other status code
3deca71 [Prabhjyot Singh] ZEPPELIN-773 update doc for know issue, and more error logging
6f1503f [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
44c5e82 [Prabhjyot Singh] ZEPPELIN-773: fail check if API allows master or conf in parameter
f2ea724 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
53f2804 [Prabhjyot Singh] ZEPPELIN-773: add doc for livy impersonation
8095b3b [Prabhjyot Singh] ZZEPPELIN-773: add doc for spark version
fef1081 [Prabhjyot Singh] ZEPPELIN-773 add documentation for configuring Spark master uri.
23b7811 [Prabhjyot Singh] ZEPPELIN-773 livy to have conf for configuring yarn master
134923d [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
7a12336 [Prabhjyot Singh] missing exception handeling in LivySparkInterpreter
200e715 [Prabhjyot Singh] more exception handeling
1e18465 [Prabhjyot Singh] LOGGER.error
8116b72 [Prabhjyot Singh] remove redundant getResultCode
93708cd [Prabhjyot Singh] setting the right condition
c3e74f2 [Prabhjyot Singh] return error
ad26d0b [Prabhjyot Singh] replace info with error
7f6fa24 [Prabhjyot Singh] check/incorporate recent changes of this code in the Spark interpreter
6c19e35 [Prabhjyot Singh] retry a for a minute and fail, instead of looping forever
45e3d48 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
97d0663 [Prabhjyot Singh] add the doc to docs/_includes/themes/zeppelin/_navigation.html
7ff9744 [Prabhjyot Singh] update default value to 1000
a6e7d0b [Prabhjyot Singh] prefix zeppelin. to property zeppelin.livy.sql.maxResult
9be64e0 [Prabhjyot Singh] doc
5f9be73 [Prabhjyot Singh] adding more mock test
8c4b983 [Prabhjyot Singh] remove class name as it will be avail with the stack trace
a3b0a06 [Prabhjyot Singh] add test for pyspark
f0e3c20 [Prabhjyot Singh] change variable name
bbe2a7c [Prabhjyot Singh] rename lspark to livy
9bfbe47 [Prabhjyot Singh] close livy session on connection close/restart interpreter
84bd755 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
cb65c86 [Prabhjyot Singh] rename r to sparkr for consistent naming
eb8706f [Prabhjyot Singh] fix paragraph abort
1b79c07 [Prabhjyot Singh] more loging and exception handeling
9a84e11 [Prabhjyot Singh] revert merge conflict
ff3c4ed [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
40ce7cd [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
3863682 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
7cec9ba [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
a28d674 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
557d1e1 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
e43385e [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
dc0a3dc [Prabhjyot Singh] fix append "%html "
756558e [Prabhjyot Singh] check for html tags and append "%html "
c58eae7 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
6c6b164 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
01ec474 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
32fbc1a [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
18468a0 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
ca06e91 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
5bb5775 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
5eb4eff [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
78eca1e [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
ee1c9f4 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
ea05fe9 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
0fbb74b [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
dadc257 [Prabhjyot Singh] reverting zeppelinConfiguration
b8e1779 [Prabhjyot Singh] adding back LivySparkRInterpreter.java , pyspark
9d89b0d [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
948615a [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
4f513a5 [Prabhjyot Singh] site.xml
426bbe8 [Rohit Choudhary] Merge branch 'livyInterperter' of https://github.com/prabhjyotsingh/incubator-zeppelin into livyInterperter
68f438d [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into livyInterperter
ee2dceb [Prabhjyot Singh] working spark sql
1f9a111 [Rohit Choudhary] remove references to LivyInterpreters
b53fd8b [Rohit Choudhary] Don't need so many interpreters
de2fd3c [Prabhjyot Singh] have spark streaming
9cb0819 [Prabhjyot Singh] Fix for 1st request failing
8f4ec47 [Prabhjyot Singh] removing unrequired logs
07f0846 [Prabhjyot Singh] with lspark sql
10311d3 [Prabhjyot Singh] This works in all cases
d0519d5 [Prabhjyot Singh] Working livy interpreter with spark, pyspark, R
ace28a8 [Prabhjyot Singh] working livy
4053497 [Prabhjyot Singh] test livy
0709b9c [Prabhjyot Singh] moving AuthenticationInfo to org.apache.zeppelin.display.AuthenticationInfo
95e7c13 [Prabhjyot Singh] test for selenium
a5a991d [Prabhjyot Singh] check for selenium
34dcc32 [Prabhjyot Singh] instead of null pass "new AuthenticationInfo()"
ba91da4 [Prabhjyot Singh] Merge remote-tracking branch 'origin/master' into UserInInterpreterContext
57ca577 [Prabhjyot Singh] review change create such class AuthenticationInfo, and pass it into InterpreterContext
320790c [Prabhjyot Singh] fix for CI, missing change signature
d928203 [Prabhjyot Singh] revert shiri.ini if fromMessage.principal.equals("anonymous") then set user as null
fadc6d9 [Prabhjyot Singh] userName to be present in InterpreterContext/RemoteInterpreterContext
@prabhjyotsingh prabhjyotsingh deleted the livyInterperter branch August 15, 2017 19:36