OAuth Private Key support #733
Co-authored-by: Tine Kleivane <[email protected]>
db94c12 to 99ad3f5
jwt_header = {"alg": self.algorithm, "kid": self.key_id}
jwt_claims = {
    "aud": self.issuer,
    "iss": self.client_id,
@paalvibe how do you intend to use this client? Specifically, do you plan to use this change with Delta Sharing over OIDC M2M?
From the PR, it sounds like you are sending a self-signed token to your configured token endpoint, which then returns a JWT.
I am curious what the resulting token from your token endpoint looks like.
Could you share a sample of the JWT that your token endpoint returns (and that the client ultimately sends to the Delta Sharing server), along with the corresponding OIDC federation policy configuration the server uses to authenticate the request?
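For readers following the thread, the flow described above (a self-signed assertion is sent to the token endpoint, which returns the access token) can be sketched roughly as below. This is not the PR's code: the function names, the 60-second lifetime, the `iat`/`exp`/`jti` claims, and the use of the RFC 7523 `jwt-bearer` grant are assumptions for illustration. The signing step itself is left to a JOSE library and is not shown.

```python
import base64
import json
import time
import uuid
from urllib.parse import urlencode


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWS compact serialization."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_signing_input(algorithm, key_id, client_id, issuer, lifetime_s=60, now=None):
    """Return the '<header>.<claims>' string that the private key would sign.

    The header and the aud/iss claims mirror the diff above; iat, exp and jti
    are assumed additions that token endpoints commonly require.
    """
    now = int(time.time()) if now is None else now
    header = {"alg": algorithm, "kid": key_id}
    claims = {
        "aud": issuer,
        "iss": client_id,
        "iat": now,
        "exp": now + lifetime_s,
        "jti": str(uuid.uuid4()),
    }

    def encode(obj):
        return b64url(json.dumps(obj, separators=(",", ":")).encode("utf-8"))

    return encode(header) + "." + encode(claims)


def token_request_body(signed_jwt: str) -> str:
    """Form-encode the assertion for the token endpoint (assumed RFC 7523 grant)."""
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "assertion": signed_jwt,
    })
```

The complete assertion would be the signing input, a dot, and the base64url-encoded signature over it; the resulting string is what `token_request_body` expects.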
Hi, Moe,
Yes, we are planning to use Delta Sharing over OIDC M2M.
Here is an example of the decoded JWT returned by the token endpoint:
{
"aud": "https://acme.org",
"sub": "0192:315300649;acme:customerdata.gold",
"scope": "acme:customerdata.gold",
"iss": "https://sky.maskinporten.no",
"client_amr": "private_key_jwt",
"token_type": "Bearer",
"exp": 1694222211,
"iat": 1694333311,
"client_id": "abcd1234-1234-abcd-abcd-12341234abcd",
"jti": "lwlwlwlw4lwlwlwlwl4lwlw4-lw-lwl4lwl4lwl4lwl4",
"consumer": {
"authority": "iso6523-actorid-upis",
"ID": "0192:315300649"
}
}
Here are the policy details needed in the Databricks OIDC server configuration:
Issuer URL: https://sky.maskinporten.no/
Subject claim: sub
Subject: 0192:315300649;acme:customerdata.gold
Audiences: https://acme.org/
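As a rough illustration of how these policy fields line up with the token claims, here is a minimal sketch. It is not the server's actual logic: `decode_claims` and `matches_policy` are hypothetical helpers, the payload is decoded without signature verification, and ignoring a trailing slash on the issuer and audience URLs is an assumption made only so that the sample values above compare equal.

```python
import base64
import json


def decode_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


def _norm(url: str) -> str:
    # Assumption: a trailing slash is not significant when comparing URLs.
    return url.rstrip("/")


def matches_policy(claims, issuer, subject_claim, subject, audiences) -> bool:
    """Check issuer, subject and audience against the configured policy."""
    aud = claims.get("aud", [])
    token_auds = [aud] if isinstance(aud, str) else list(aud)
    allowed = {_norm(a) for a in audiences}
    return (
        _norm(claims.get("iss", "")) == _norm(issuer)
        and claims.get(subject_claim) == subject  # exact string match
        and any(_norm(a) in allowed for a in token_auds)
    )
```

Because the subject comparison is an exact string match, any stray whitespace in the `sub` value would cause the check to fail.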
sky.maskinporten.no is the national company identity provider in Norway: https://docs-digdir-no.translate.goog/docs/Maskinporten/maskinporten_skyporten.html?_x_tr_sl=no&_x_tr_tl=en&_x_tr_hl=en&_x_tr_pto=wapp
It is a machine-to-machine identity service, where access is granted in the Maskinporten web interface or API according to organization number, e.g. "0192:315300649", and scope, e.g. "acme:customerdata.gold".
We have already implemented this for sharing cloud resources on AWS, Azure and GCP, as explained in the documentation above.
We have already tested it with our branch and it works. It would enable any data consumer identified with Skyporten to read a Delta Share wherever they are, inside or outside Databricks, e.g. in a local notebook.
More info, if of interest:
At Samferdselsdata.no (the Public Transport Sector Data Sharing initiative), we have been working with the Norwegian Digitalisation Agency (Digdir) to implement Delta Shares that authenticate directly against Maskinporten, the national organization-number OAuth2 service, as we have already done for IAM-based cloud access with Skyporten (https://docs.digdir.no/docs/Maskinporten/maskinporten_skyporten). This would mean that one could simply declare which organization number should have access and avoid any credential exchange at all. Hopefully, a similar pattern will soon be usable across Europe. In practice, this enables Delta Sharing based on country code plus organization number instead of Entra or email-based access.
def _signed_jwt(self, jwt_header, jwt_claims):
    """Generate a signed JWT token using the private key"""
    jwt_token = jwt.JWT(header=jwt_header, claims=jwt_claims)
    with open(self.private_key, "rb") as key_file:
This assumes the key_file is available on disk and that the client can read it directly. How would this work in a Spark cluster environment, where the key isn’t necessarily stored on local disk? Where is the key expected to be stored in that case?
We have not tested it inside a Spark cluster environment, since we envisioned that data scientists would want to read the data in their development context. However, we are happy to get feedback from you on how to support it. In Spark, one could get the key from a secrets service and write the contents to a local path that is passed to DeltaSharingProfile(). An alternative or secondary option would be to pass the secret value into DeltaSharingProfile() as a parameter.
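A minimal sketch of the first option, assuming the PEM text has already been fetched from whatever secrets service the cluster provides. The helper name `write_key_to_private_file` is hypothetical, and how the resulting path is handed to DeltaSharingProfile() is deliberately not shown, since the parameter name is not given in this thread.

```python
import os
import tempfile


def write_key_to_private_file(pem_text: str) -> str:
    """Write a private key to a fresh temp file and return its path.

    tempfile.mkstemp creates the file readable and writable only by the
    current user on POSIX systems, so the key is not exposed to other users.
    """
    fd, path = tempfile.mkstemp(prefix="delta-sharing-key-", suffix=".pem")
    try:
        with os.fdopen(fd, "w") as key_file:
            key_file.write(pem_text)
    except BaseException:
        os.unlink(path)  # do not leave a partial key file behind
        raise
    return path


# Hypothetical usage: pem_text comes from a secrets service, and the returned
# path is then passed to the profile as the private key location.
```

The caller remains responsible for deleting the file once the client no longer needs it.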
@paalvibe
Thank you for filing this feature request and raising the PR. It’s a great effort to extend the Delta Sharing client.
Apologies for the delay in reviewing. I do have a few questions and am adding them on the PR.
Awesome. Will get back to you asap.
Hi! Any updates on the PR?
Private key authentication is a commonly used pattern in OAuth and is used by the national identity infrastructure in Norway (Skyporten). Supporting it enables integration with more OAuth / OIDC IdPs.