Identity API #11

Open
saulshanabrook opened this issue Aug 23, 2019 · 11 comments

Comments

@saulshanabrook
Member

For the JupyterLab commenting work we need a way to identify who is commenting, to show their name and photo (jupyterlab/jupyterlab-commenting#22, jupyterlab/jupyterlab-commenting#35).

We were thinking about using JupyterHub to let us know who is active, but we don't want commenting to depend directly on that.

So I propose that we create a repo jupyterlab-identity to expose a global identity API in JupyterLab.

Design Notes

We have also been working on a metadata service for JupyterLab, so we thought the identity API could be responsible only for giving us a unique ID for who you are. Information about you, like your name and photo, would then be looked up through the metadata provider. Here is a sample class we could expose that does this, using properties of the Schema.org Person type:

import {LinkedDataRegistry} from '@jupyterlab/jupyterlab-metadata';

class Identity {
    constructor(private linkedDataRegistry: LinkedDataRegistry) {}

    /**
     * The current user ID.
     */
    public id: URL | null = null;

    /**
     * Get metadata about a person, retrieved from the metadata registry.
     */
    async getPerson(id: URL): Promise<{name?: string, image?: URL}> {
        const person = this.linkedDataRegistry.get(id);
        const name = person['http://schema.org/name'];
        const image = person['http://schema.org/image'];
        return {
            name: name || undefined,
            image: image ? new URL(image) : undefined,
        };
    }
}

To create a plugin for JupyterHub, we could have it set the id to something like jupyterhub:///saul when it starts up and, if it can fetch information about the different IDs, register that metadata in the metadata store.
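A rough sketch of what such a JupyterHub plugin might do on startup. The real LinkedDataRegistry API is not spelled out in this issue, so `InMemoryRegistry` and `activateHubIdentity` below are hypothetical stand-ins, and the `Identity` class is a trimmed copy of the proposal above:

```typescript
// Hypothetical stand-in for the metadata registry; the real
// LinkedDataRegistry API is an assumption and may look different.
type LinkedData = { [property: string]: string };

class InMemoryRegistry {
  private entries = new Map<string, LinkedData>();

  set(id: URL, data: LinkedData): void {
    this.entries.set(id.href, data);
  }

  get(id: URL): LinkedData {
    // Unknown IDs yield an empty record rather than an error.
    return this.entries.get(id.href) || {};
  }
}

class Identity {
  public id: URL | null = null;

  constructor(private registry: InMemoryRegistry) {}

  async getPerson(id: URL): Promise<{ name?: string; image?: URL }> {
    const person = this.registry.get(id);
    const name = person['http://schema.org/name'];
    const image = person['http://schema.org/image'];
    return {
      name: name || undefined,
      image: image ? new URL(image) : undefined
    };
  }
}

// Hypothetical startup hook: set the current ID and, when a display
// name is known, register it under the Schema.org `name` property.
function activateHubIdentity(
  identity: Identity,
  registry: InMemoryRegistry,
  username: string,
  displayName?: string
): void {
  const id = new URL(`jupyterhub:///${encodeURIComponent(username)}`);
  identity.id = id;
  if (displayName) {
    registry.set(id, { 'http://schema.org/name': displayName });
  }
}
```

Other extensions would then read `identity.id` and call `getPerson` without knowing that JupyterHub supplied the data.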

cc @ellisonbg @Zsailer @ktaletsk @hoo761

@ellisonbg
Contributor

ellisonbg commented Aug 23, 2019 via email

@mckev-amazon

Very cool, thanks for taking this on!

This will definitely be a useful addition to JupyterLab. One thought I have is that you might want to look at identity standards in addition to, or instead of, schema.org for inspiration. schema.org is great for data modeling purposes, but perhaps not necessarily for identity and authentication purposes (for example, it's unlikely that it would ever make sense to include "netWorth", but it might make sense to include an OpenID subscriber ID in this API).

Two prevailing identity schemas are OpenID Connect (OIDC) ID Token and SAML Response Assertion.

@saulshanabrook
Member Author

Hey @mckev-amazon thank you for the suggestions!

We took a look at OpenID Connect, but I think this is actually a slightly different scope. I could see an OpenID Connect extension in JupyterLab that depends on this plugin and sets the identity of the user based on how they authenticate. So the hope here is to intentionally punt on how JupyterLab learns who is using it. Instead, we just focus on being able to share this identity, once it has been established by some other plugin, with other parts of the application that want to know who is using JupyterLab.
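To make that split concrete, here is a minimal sketch in which one plugin establishes the identity and other parts of the application only observe it. `IdentityStore`, `setId`, and `onChange` are hypothetical names for illustration, not an existing JupyterLab API:

```typescript
type IdentityListener = (id: URL | null) => void;

// Hypothetical shared store: it knows nothing about how the user
// authenticated, only which ID is currently active.
class IdentityStore {
  private current: URL | null = null;
  private listeners: IdentityListener[] = [];

  get id(): URL | null {
    return this.current;
  }

  // Called by whichever plugin established the identity
  // (for example an OIDC or JupyterHub extension).
  setId(id: URL | null): void {
    this.current = id;
    for (const listener of this.listeners) {
      listener(id);
    }
  }

  // Called by consumers such as commenting; returns an unsubscribe function.
  onChange(listener: IdentityListener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter(l => l !== listener);
    };
  }
}
```

A consumer would call `store.onChange(...)` once and react whenever the provider plugin calls `store.setId(...)`.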

@mckev-amazon

Hey @saulshanabrook !

Yes, completely agreed, I would definitely try to keep this API as neutral as possible from the type of authentication or identity that is being used. Simply suggesting that we look at these standards to determine what the lowest common denominator should be for this API, given that it's a generic one. After all, at some point the internal JupyterLab API might need to be traced back to one of the external identities. The open standards might also give some hints on what to include which might not be immediately apparent (Full Name, email, and avatar all make sense to me from a UI standpoint, for example). No worries if that investigation doesn't turn up much :-)
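As one illustration of what such a lowest common denominator could look like, the sketch below maps a few OIDC ID Token claims (`sub`, `name`, `picture`, `email`) onto Schema.org Person properties. The choice of target properties and the `claimsToPerson` helper are assumptions made for this example, not something decided in this thread:

```typescript
// Subset of OIDC ID Token claims; `iss` and `sub` are always present
// in an ID Token, the profile and email claims are optional.
interface OidcClaims {
  iss: string;
  sub: string;
  name?: string;
  picture?: string;
  email?: string;
}

type LinkedData = { [property: string]: string };

// Hypothetical mapping onto Schema.org properties. `iss` is not
// mapped here; a real design would need it, since `sub` is only
// unique per issuer.
function claimsToPerson(claims: OidcClaims): LinkedData {
  const person: LinkedData = {
    'http://schema.org/identifier': claims.sub
  };
  if (claims.name) {
    person['http://schema.org/name'] = claims.name;
  }
  if (claims.picture) {
    person['http://schema.org/image'] = claims.picture;
  }
  if (claims.email) {
    person['http://schema.org/email'] = claims.email;
  }
  return person;
}
```

Keeping the external subject identifier on the record is what would let an internal JupyterLab identity be traced back to the external one later.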

@saulshanabrook
Member Author

That's a good point! Yeah thanks for these references. 👍

> After all, at some point the internal JupyterLab API might need to be traced back to one of the external identities.

If you have any use cases/ideas around this, I would love to hear them. I am not that familiar with these identity schemas.

@jaipreet-s
Member

jaipreet-s commented Aug 23, 2019

Hi @saulshanabrook , this is great!
This would be useful for jupyterlab/pull-requests and jupyterlab-git to represent the identity of the person from the perspective of the VCS provider (GitHub, BitBucket, GitLab, ...).

To further adoption, it would help if the Person interface were extensible, so that additional fields can be added to the base type. This could also be accomplished with generics, e.g. <T extends Person>.

Something like:

class Person {
  private String id;
  private String name;
}

class CompanyPerson extends Person {
  private String jobTitle;
  private String division;
}
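Since the rest of the proposal is in TypeScript, the same idea could be written there with interfaces and a generic bound. This is only a sketch: `PersonDirectory` is a hypothetical container used to show `<T extends Person>` in action.

```typescript
interface Person {
  id: string;
  name?: string;
}

// Deployment-specific extension of the base type.
interface CompanyPerson extends Person {
  jobTitle?: string;
  division?: string;
}

// Hypothetical directory that works for any Person subtype.
class PersonDirectory<T extends Person = Person> {
  private people = new Map<string, T>();

  add(person: T): void {
    this.people.set(person.id, person);
  }

  get(id: string): T | undefined {
    return this.people.get(id);
  }
}

const directory = new PersonDirectory<CompanyPerson>();
directory.add({ id: 'saul', name: 'Saul', jobTitle: 'Engineer' });
```

Code that only needs the base fields can accept `PersonDirectory<Person>`-style values, while a deployment with richer data keeps the extra fields typed.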

From the UI perspective, the JupyterLab status bar could be used to display the currently determined identity of the user. Prior art here is VSCode's status bar, which displays the GitHub user:
https://github.com/microsoft/vscode-pull-request-github/blob/master/.readme/demo.gif

@ellisonbg
Contributor

The reason we are thinking about adopting the schema.org Person object is that it is flexible enough to handle all the complexities of people, so it already has all of those things:

https://schema.org/Person

(note, this also inherits from Thing, which has a base set of attributes)

@ellisonbg
Contributor

To address @mckev-amazon's comments on OIDC and SAML: our expectation is that people deploying Jupyter will use a range of different auth/identity providers, and certainly many of them will be using OIDC/SAML in a manner that would work for identity. At the same time, we know of other deployments that are using more novel systems for identity. For example, LSST is using GitHub orgs/teams as their directory service and identity provider (and even maps GitHub teams to local POSIX ACLs). Also, there is a range of different deployment targets, from JupyterHub to standalone single servers. I think the approach we are thinking of here will enable all of those use cases and providers to get identity information into Lab in a flexible way. That being said, I think it is reasonable for JupyterHub to standardize on OIDC or SAML from a protocol perspective to get this information, but those would be implementations of the more abstract interface.
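A sketch of what "implementations of the more abstract interface" might mean in practice. `IdentityProvider`, `FixedIdentityProvider`, and `resolveIdentity` are hypothetical names; a real OIDC, SAML, or GitHub-teams provider would replace the fixed one used here for illustration:

```typescript
// Hypothetical abstract interface: each deployment supplies its own
// implementation (OIDC, SAML, GitHub teams, a single local user, ...).
interface IdentityProvider {
  readonly label: string;
  currentUser(): Promise<URL | null>;
}

// Trivial implementation used for illustration only.
class FixedIdentityProvider implements IdentityProvider {
  constructor(readonly label: string, private id: URL | null) {}

  async currentUser(): Promise<URL | null> {
    return this.id;
  }
}

// Ask each provider in turn and use the first identity that is known.
async function resolveIdentity(
  providers: IdentityProvider[]
): Promise<URL | null> {
  for (const provider of providers) {
    const id = await provider.currentUser();
    if (id !== null) {
      return id;
    }
  }
  return null;
}
```

JupyterLab itself would only depend on the interface, so a standalone server with no provider simply ends up with a null identity.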

@echarles
Member

echarles commented Oct 5, 2019

What about IAM API instead of Identity API?

With Identity and Access Management we cover more, and Access (a.k.a. authorization, based on e.g. roles) will come into the picture very soon. Access is highly coupled to Identity, so I believe it makes sense to look at them at the same time and in the same place.

@ellisonbg
Contributor

The difficulty with this is that Jupyter deployments are so diverse on the auth side of things that I think standardizing will be impossible. Also, JupyterLab doesn't need to know the details of how someone is authorized; it only needs to know who they are once they arrive. I do agree that access and identity are coupled, but the goal here is to surface only the piece (identity) that JupyterLab needs to know about. In practice, it will be auth systems that pass the identity information to JupyterLab.

@echarles
Member

echarles commented Oct 7, 2019

@ellisonbg If needed, Authorization can be added later, in the same repo or in a separate one. BTW, a pluggable user token creation/validation for jupyter_server could use this Identity API.

5 participants