Skip to content

Conversation

@prasadwagle
Copy link

@prasadwagle prasadwagle commented Feb 1, 2016

What is this PR for?

The goal of the PR is to add authorization for notebooks according to the design document here.
The PR uses Shiro authentication.

What type of PR is it?

Feature

Todos

  • - Find way to get roles for a user in SecurityUtils (Not possible at the moment, see SHIRO-492)
  • - Investigate how to use Shiro authorization
  • - Use groups associated with user to determine if operation is permitted
  • - Check if user has permissions to modify note permissions
  • - Add checks in more NotebookServer operations
  • - Improve UI (explain permissions, error messages)
  • - Add unit tests
  • - Documentation

Is there a relevant Jira issue?

ZEPPELIN-549

How should this be tested?

  1. Enable Basic Auth Security by changing conf/shiro.ini.
  2. Create a note. By default all operations are allowed by any authenticated user.
  3. Update readers, writers and owners by clicking on the lock icon in the top right area.
  4. Check if users can or cannot perform operations according to the permissions.

Screenshots (if appropriate)

Screenshot

Questions:

  • Does the licenses files need update? No
  • Is there breaking changes for older versions? No
  • Does this needs documentation? No

@Leemoonsoo
Copy link
Member

Thanks for great contribution. Tested working nicely.

Can 'Note Permissions' have white background? like 'interpreter binding' ? If you need help, me or maybe Damien can help on styling.

Another one is - actually i should ask earlier - recently there were discussion about notebook file portability regarding interpreter selection.

Current design save permission information into notebook file. But notebook can be exported and imported into the Zeppelin instance that has different set of user/group exists. (either using import/export feature or copying notebook file into file system)

So, do you have good idea to make note permission feature gracefully aware notebook portability?

data-ng-model="permissions">
<p>Owners : <input ng-list ng-model="permissions.owners"> Owners can change permissions, read and write the note. </p>
<p>Readers : <input ng-list ng-model="permissions.readers"> Readers can only read the note.</p>
<p>Writers : <input ng-list ng-model="permissions.writers"> Writers can read and write the note.</p>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When each 'Owners', 'Readers', 'Writers' is empty, default behavior is allowing all.
But some people may confuse default behavior is deny all. So how about make user quickly catch the default behavior?

For example adding whildcard(*) at placeholder or any other way.

<input ng-model ..... placeholder="*"> ....

@sourav-mazumder
Copy link

Hi Prasad,

I also have similar opinion like Moon on the storing the permission. However your design so far looks reasonably good though.

Here are my few comments/suggestions -

  1. I do agree with you that Interpreter Authorization is complex. But the examples you have provided, if I'm not missing anything, those are related to Authentication of executing the target technology not related to the interpreter itself. The authorization considerations for Interpreters should be who can change an interpreter setting, who can execute an interpreter, who can create a new interpreter from the existing interpreter set. So essentially you need to plugin here the Role of a user which you'll get from enterprise authentication/authorization system like LDAP. So as an example at the enterprise level someone will typically setup say 2 types roles for users using Zeppelin. Role type 1, say Zeppelin Administrator. Administrator will have access to the setting of access control for all interpreters, can control the users who can schedule running of his/her notebook (though I may be a notebook author I need not have the ability to schedule it to run it any time as I may be hogging enterprise cluster resource and impact some other higher priority job), who can import/export, etc. Then Role type 2, say Regular User. This Regular User will create the notebook, change the permission settings to his/her notebook. Based on this Role the user will even have access to access control setup for interpreters, notebook, scheduler.
  2. Coming to Moon's point where you are storing the access control information for every notebook and interpreters is very important. In my opinion it should not be in the Notebook itself as it is a plain text information and can be seen/exported by anyone whoever has access to the Notebook folder (or can export the same). There are typically three ways one can handle this. Firstly, you can store it as a separate encrypted file in a secured folder in the file system. Only the Zeppelin process ( essentially the super user who started the Zeppelin server) has access to that folder and can do any change. Secondly to a target database. Or Thirdly even in LDAP . The configurations for option 2 and 3 can come from a configuration again stored in an encrypted file in a folder which only the super user (who typically starts the zeppelin process) can access. You may start with any one implementation but having a generic interface/abstraction (so that later on other implementations can be plugged in) is recommended.
  3. I see you have assumed that there would be a separate authentication component. To model the same with something which will typically be used I may suggest you to refer to design/interfaces of Apache Knox. People would be using Zeppelin a lot in their Hadoop Cluster and there would be good chance that they will use Knox (Knox comes in most of the hadoop distribution) as a gateway (which will in turn authenticate the users with enterprise level LDAP) to authenticate users accessing Zeppelin. Basically Knox will do the authentication and will pass the necessary roles to your module and then your model can use them to differentiate between who is Zeppelin Administrator and who is Regular User.

Hope this helps.

Regards,
Sourav

@prasadwagle
Copy link
Author

@Leemoonsoo and @sourav-mazumder, I really appreciate your feedback.

@Leemoonsoo,
I fixed the note permission background and wildcard placeholder issue.

So, do you have good idea to make note permission feature gracefully aware notebook portability?

I am not sure how to gracefully deal with permissions when notebooks are ported to a different Zeppelin server with different user and groups. One option is to delete permission information when we port so notes are accessible to anyone (the default case). We could save permissions information in a separate file. I have created https://issues.apache.org/jira/browse/ZEPPELIN-666 to track this issue.

@sourav-mazumder,
Regarding comment 1 and 3, I will clarify the interpreter vs target data source authorization distinction in docs/security/interpreter_authorization.md and research Apache Knox.

Regarding comment 2, it is not clear to me why user and group names allowed to access notebooks should be encrypted. We are not storing any credentials and only the zeppelin process can modify note permissions when a request is made by one of the note's owners.

Btw, we have deployed Zeppelin with authentication and notebook authorization at Twitter and it is working well. We are using it mainly to access Vertica, Presto and Mysql databases. The top priority now is to authenticate users to these databases. We are implementing this by asking users for their database credentials and storing them in a secure way on the Zeppelin server.

</li>
<li>
<a href="#" data-toggle="dropdown" class="dropdown-toggle">Security<b class="caret"></b></a>
<ul class="dropdown-menu">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @prasadwagle : )
There is a typo : overviwe.html -> overview.html

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @AhyoungRyu! I fixed the typo. I am having trouble installing jekyll to test the docs. Please let me know if you see any other issues.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prasadwagle Thx for your quick response!
I left some comments again in this PR just a moment ago.

<li>
<a href="#" data-toggle="dropdown" class="dropdown-toggle">Security<b class="caret"></b></a>
<ul class="dropdown-menu">
<li><a href="{{BASE_PATH}}/security/overviwe.html">Overview</a></li>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "#html" next to the {{BASE_PATH}}/security/authentication.html is not necessary.

@AhyoungRyu
Copy link
Contributor

Hi @prasadwagle : )
Finally Zeppelin notebook authorization feature is implemented! It's really awesome.

After applying this PR, I wanted to leave some comments about the documentation.

  • Authentication section in authentication.md and notebook_authorization.md is duplicated. So, in my opinion, notebook_authorization.md will be enough instead of having both. Let's remove authentication.md file since there is no other particular contents except this section. Then, notebook_authorization.md file name have to be changed, since this file will cover both notebook authentication and notebook authorization.
  • Security Overview section in overview.md and notebook_authorization.md is also duplicated. So, how about having only one file for this section? I think notebook_authorization.md will be enough.
  • Since you add a new drop-down menu "Security" to _navigation.html, the navbar is broken like below screenshot image.
    screen shot 2016-02-11 at 2 20 22 pm
    I think you have to modify style.css file for this.
    Or, so far, we don't need to have individual section for Zeppelin Security. So, the fastest way for solving this problem, you can just add your *.md files under the More section.

The above comments are just my personal opinion. So, please let me know if it's different with yours : )

@hayssams
Copy link
Contributor

Thanks for this great work.

May I suggest a different design making Zeppelin Authorization much more fine grained and independent of the notebook structure.

This design is described here https://gist.github.com/hayssams/0360288c715878aa43c2

@sourav-mazumder
Copy link

Hi Prasad,

This id to clarify little more on my original comment #2 to address your question.

If you are storing authorization data in a plain text anyone (who has access to the file storing the note permission) can access that permission file through the OS and change it. Now in your design approach you can assume that the file containing the permissions for all notebooks would be stored in a folder that can be accessed only by an administrator (essentially the same person who has the permission to start/stop the zeppelin process). That approach is fine too and you don't need encryption in that case. However, key point is you need to have note permission stored in a separate file not in the actual notebook data. In either approach this is the prerequisite.

Hope this helps.

Regards,
Sourav

@sourav-mazumder
Copy link

Hi Hayssams,

I liked your approach. That is something very similar to what my initial comment was to Prasad's design.

The points you have mentioned at the end of your document are the key considerations. Especially separating the permission of a notebook from actual content of the notebooks.

My only suggestion would be that along with the option of storing the permissions in a database (using JDBC) one should also have the options of storing the same in a PAM file or LDAP. User should be able to pick and choose. This PR does not need to implement all of those options right now. May be starting with PAM file based approach would be good enough as long as that file is either encrypyted or stored in a folder which can be only accessed by Zeppelin admin (the person who can start and stop zeppelin).

Regards,
Sourav

@hayssams
Copy link
Contributor

@sourav-mazumder
Hi Sourav,
Thanks for your comments.
My mistake when I wrote JDBC database, but actually Shiro already support transparently many different backend for authorization including LDAP / Cas / JDBC.

@Leemoonsoo
Copy link
Member

Regarding portability of notebook authorization permission,
Thanks for creating https://issues.apache.org/jira/browse/ZEPPELIN-666. I think it make sense to handle as a separate issue. but while we want to minimize incompatibility of notebook format and configuration file format every release, I think ZEPPELIN-666 must be resolved before the next release.

Regarding authorizing interpreter, it is a good idea.
But to do that, Zeppelin should provide per user interpreter-binding, as a precondition. Therefore it'll bring much more codebase change and incompatibility of conf/interpreter.json. So, I think it's nice subject to have discussion in a separate issue.

Regarding independent notebook structure per user,
I think it's conflict with this Authorization implementation. Because this design allows multiple owner, and it's not clear how multiple owner can be handled in 'independent notebook structure per user'.
And this subject is also related to API of NotebookRepo interface, as well as migration of old directory structure. I think this subject can be discussed in a separate issue.

@hayssams
Copy link
Contributor

@prasadwagle
Not sure if your implementation currently protect from discovering the notes through the search service.

@hayssams
Copy link
Contributor

I just pushed the code that implement support for shiro authorization on websocket messages at hayssams@df43fb7

Please let me know if it makes sense before I push more code.

@prasadwagle
Copy link
Author

@AhyoungRyu - I have fixed the documentation issues you reported. Thanks!

@hayssams - Your design allows for fine-grained controls and is elegant. I am trying to figure out how I would make it work in my company where we have a homegrown authentication scheme described below. The LDAP server is locked down tight and I am not sure if the security team would allow Zeppelin to write to it.

If an incoming request to the Zeppelin server does not have a cookie with user information encrypted with the authentication server public key, the user is redirected to the LDAP authentication server. Once the user is verified, the authentication server redirects the browser to a specific URL in the Zeppelin server which sets the authentication cookie in the browser. The end result is that all requests to the Zeppelin web server have the authentication cookie which contains user and groups information.

We have made the implementation in this pull request work in my company environment by setting userAndRoles in the NotebookSocket constructor using the information in the authentication cookie.

Not sure if your implementation currently protect from discovering the notes through the search service.

It does not. We are mostly concerned with preventing users from viewing results in notes for which they don't have read permissions. We can create a separate issue to prevent users from viewing queries in notes for which they don't have read permissions.

@Leemoonsoo's - I understand your concern regarding notebook portability and agree we should resolve ZEPPELIN-666 before the next release.

@sourav-mazumder
Copy link

Hi Prasad,

In your organization if writing to enterprise level LDAP is not possible (though that is the right approach from Enterprise Architecture stand point), you can always use other options supported by Shiro (as described by hayssams) like storing that info in JDBC sink (to store the permission in a dedicated RDBMS) or in protected PAM file based.

Regards,
Sourav

@hayssams
Copy link
Contributor

@prasadwagle
I understand your concern regarding LDAP. However, you can still implement your own Authorisation realm using shiro and this would be a file holding all of your permissions. This would be something like

myOwnRealm = com.company.ZeppelinRealm
securityManager.realms = $myOwnRealm

Shiro already support permissions stored in an Ini file format and Properties file format on top on which you can base you implementation.

@prasadwagle
Copy link
Author

Got it. Your suggestion to save notebook authorization information in a .ini file makes sense.

In our company, we have an LDAP server with users and groups and we get user and group information for a connection in an authentication cookie. Let's say, we have a note "RevenueNote" with the following authorization information (finance-dev, finance and executives are groups with many users):

  • Owners: finance-dev
  • Readers: finance, executives
  • Writers: finance-dev

What is the best way to translate these permissions to the ini file format and make Shiro permission checks work with the users and group information I get from the authentication cookie?

@Leemoonsoo
Copy link
Member

Okay, If i summarize my understanding,
Shiro can not only handle authentication for user but also can be extended to save/load permission information for each notebook. And the information can be saved in file, db, etc. Is it correct?

Then the next step would be change location of permission information from notebook to Shiro and it'll resolve https://issues.apache.org/jira/browse/ZEPPELIN-666. Right?

And ci test failure need to be investigated and fixed to merge this PR.
Let me know if you need any help on making ci green.

@hayssams
Copy link
Contributor

@Leemoonsoo

... and the information can be saved in file, db, etc. Is it correct?
Yes!
666 resolved by the use of shiro autz ?
Yes!

@prasadwagle prasadwagle reopened this Feb 26, 2016
@Leemoonsoo
Copy link
Member

ZEPPELIN-666 is going to handle portability of notebook authorization permissions stored in note file.
And about the searchservice @hayssams mentions, i'd like to create a new issue and handle it there.

While we agree to resolve ZEPPELIN-666 and new issue related to search service before the next release, Looks good to me and merge if there're no more discussions.

Thanks @prasadwagle for great work.

@asfgit asfgit closed this in 738c10e Feb 28, 2016
@qinzl1
Copy link

qinzl1 commented Aug 9, 2016

i just want config someone can create/delete notebook ,someone can not see the interpreter page,how config the shiro.ini file

@AhyoungRyu
Copy link
Contributor

AhyoungRyu commented Aug 9, 2016

someone can create/delete notebook

Regarding this, please refer to Notebook Authorization setting document. But AFAIK, currently you can't control create notebook.

someone can not see the interpreter page

Regarding this one, please see Shiro authentication#Secure your Zeppelin information section.

I attached related docs links what you wondered. But please note that the docs link is for 0.7.0-SNAPSHOT ver. If you're using 0.6.0 version, regarding someone can not see the interpreter page, you might need to add the below lines manually to conf/shiro.ini.

/api/interpreter/** = authc, roles[admin]
/api/configurations/** = authc, roles[admin]
/api/credential/** = authc, roles[admin]

Hope this helps :)

@qinzl1
Copy link

qinzl1 commented Aug 11, 2016

thanks @AhyoungRyu ,
But I note that ,if user didn't login will can not load the some div ,look like,<div class="col-md-4 ng-scope" ng-if="ticket">,.if I can controll the div did not show when some roles login
this my shiro.ini config

 [roles]
admin = *
dev = *
analyst = *

[urls]
# anon means the access is anonymous.
# authcBasic means Basic Auth Security
# authc means Form based Auth Security
# To enfore security, comment the line below and uncomment the next one
/api/version = anon
#/** = anon
#/** = authc
/api/login = authc
/api/interpreter/** = authc,roles[admin]
/api/credential/** = authc,roles[admin]
/api/configuration/** = authc,roles[admin]
/api/notebook/interpreter/** = authc,roles[admin]

/** = authc

I don't want analyst role can create/delete notebook

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants