@paalvibe paalvibe commented May 22, 2025

Private key JWT client authentication (private_key_jwt) is a commonly used pattern in OAuth, and is used by national identity infrastructure in Norway (Skyporten). Supporting it enables integration with more OAuth / OIDC IdPs.

Co-authored-by: Tine Kleivane [email protected]

@paalvibe paalvibe force-pushed the oauth-private-key branch from db94c12 to 99ad3f5 Compare May 22, 2025 10:45
@roeap roeap self-requested a review May 30, 2025 10:13
jwt_header = {"alg": self.algorithm, "kid": self.key_id}
jwt_claims = {
    "aud": self.issuer,
    "iss": self.client_id,
Collaborator

@moderakh moderakh Aug 29, 2025

@paalvibe how do you intend to use this client? Specifically, do you plan to use this change with Delta Sharing over OIDC M2M?

From the PR, it sounds like you are sending a self-signed token to your configured token endpoint, which then returns a JWT.

I am curious what the resultant token from your token endpoint looks like?

Could you share a sample of the JWT that your token endpoint returns (and that the client ultimately sends to the Delta Sharing server), along with the corresponding OIDC federation policy configuration the server uses to authenticate the request?

Author

Hi, Moe,

Yes, we are planning to use Delta Sharing over OIDC M2M.
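
As a rough sketch of the exchange discussed above (a self-signed assertion is sent to the token endpoint, which returns a JWT), the claims and request body could look as follows. The `aud` and `iss` mapping follows the snippet under review. The extra claims (`scope`, `iat`, `exp`, `jti`), the helper names, and the use of the RFC 7523 JWT bearer grant are assumptions for illustration; the PR may build the request differently. Signing is left out to keep the sketch standard-library only.

```python
import time
import uuid
from urllib.parse import urlencode

# Grant type defined by RFC 7523 for exchanging a signed JWT for an access token.
JWT_BEARER_GRANT = "urn:ietf:params:oauth:grant-type:jwt-bearer"


def build_assertion_claims(client_id, issuer, scope, lifetime_seconds=120, now=None):
    """Claims for the self-signed assertion (hypothetical helper).

    "aud" and "iss" follow the PR snippet; the other claims are assumptions.
    """
    issued_at = int(time.time()) if now is None else int(now)
    return {
        "aud": issuer,        # the identity provider that will validate the assertion
        "iss": client_id,     # the client that signs the assertion
        "scope": scope,       # assumption: requested scope travels in the assertion
        "iat": issued_at,
        "exp": issued_at + lifetime_seconds,
        "jti": str(uuid.uuid4()),  # assumption: unique ID to guard against replay
    }


def build_token_request_body(signed_assertion):
    """Form-encoded body to POST to the token endpoint (hypothetical helper)."""
    return urlencode({"grant_type": JWT_BEARER_GRANT, "assertion": signed_assertion})
```

The signed, compact form of these claims would be passed to `build_token_request_body` and posted with content type `application/x-www-form-urlencoded`.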

Here is an example of the decoded JWT returned by the token endpoint:

{
  "aud": "https://acme.org",
  "sub": "0192:315300649;acme:customerdata.gold",
  "scope": "acme:customerdata.gold",
  "iss": "https://sky.maskinporten.no",
  "client_amr": "private_key_jwt",
  "token_type": "Bearer",
  "exp": 1694222211,
  "iat": 1694333311,
  "client_id": "abcd1234-1234-abcd-abcd-12341234abcd",
  "jti": "lwlwlwlw4lwlwlwlwl4lwlw4-lw-lwl4lwl4lwl4lwl4",
  "consumer": {
    "authority": "iso6523-actorid-upis",
    "ID": "0192:315300649"
  }
}
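
For anyone who wants to inspect such a token locally, a decoded view like the one above can be produced with the standard library alone. This is a minimal sketch: `decode_jwt_payload` is a hypothetical helper name, not part of the client, and it deliberately does not verify the signature, so it is only suitable for debugging.

```python
import base64
import json


def decode_jwt_payload(token):
    """Return the claims of a compact JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    # base64url segments drop their padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```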

Here are the policy details needed in the Databricks OIDC server configuration:

Issuer URL: https://sky.maskinporten.no/
Subject claim: sub
Subject: 0192:315300649;acme:customerdata.gold
Audiences: https://acme.org/

sky.maskinporten.no is the national company identity provider in Norway: https://docs-digdir-no.translate.goog/docs/Maskinporten/maskinporten_skyporten.html?_x_tr_sl=no&_x_tr_tl=en&_x_tr_hl=en&_x_tr_pto=wapp

It is a machine-to-machine identity service, where access is granted in a Maskinporten web interface or API according to organization number, e.g. "0192:315300649", and scope, e.g. "acme:customerdata.gold".
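
To make the mapping between the policy fields and the token claims concrete, here is a small illustrative check. It is not how the Databricks server validates tokens; the helper names are hypothetical, and ignoring trailing slashes is an assumption made only because the sample token and the policy values above differ by one.

```python
def expected_subject(org_id, scope):
    """Join organization ID and scope the way the sample token's "sub" does."""
    return f"{org_id};{scope}"


def _norm(url):
    # Assumption: a trailing slash is not significant when comparing URLs.
    return url.rstrip("/")


def matches_policy(claims, issuer, subject, audience):
    """Check the three policy fields against the token claims (illustrative only)."""
    aud = claims.get("aud", [])
    # "aud" may be a single string or a list of strings.
    audiences = [aud] if isinstance(aud, str) else list(aud)
    return (
        _norm(claims.get("iss", "")) == _norm(issuer)
        and claims.get("sub") == subject
        and _norm(audience) in {_norm(a) for a in audiences}
    )
```

With the sample values, `expected_subject("0192:315300649", "acme:customerdata.gold")` gives the Subject string from the policy, so granting access comes down to naming an org number and a scope.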

We have already implemented this for sharing cloud resources on AWS, Azure and GCP, as explained in the documentation above.

We have already tested it with our branch and it works. It would enable any data consumer identified with Skyporten to read a Delta Share wherever they are, inside or outside Databricks, e.g. in a local notebook.

More info, if of interest:

At Samferdselsdata.no (Public Transport Sector Data Sharing initiative) we have been working with the Norwegian Digitalisation Agency (Digdir) to implement Delta Shares directly with the Maskinporten national org-number OAuth2 service for authentication, as we have achieved for IAM-based cloud access with Skyporten (https://docs.digdir.no/docs/Maskinporten/maskinporten_skyporten). This would mean that one could simply declare which org number should have access, and avoid any credentials exchange at all. Hopefully, a similar pattern will also be possible to use across Europe soon. In practice, this enables Delta Sharing based on country code + org number instead of Entra or email-based access.

def _signed_jwt(self, jwt_header, jwt_claims):
    """Generate a signed JWT token using the private key"""
    jwt_token = jwt.JWT(header=jwt_header, claims=jwt_claims)
    with open(self.private_key, "rb") as key_file:
Collaborator

This assumes the key_file is available on disk and that the client can read it directly. How would this work in a Spark cluster environment, where the key isn’t necessarily stored on local disk? Where is the key expected to be stored in that case?

Author

We have not tested it inside a Spark cluster environment, since we envisioned that data scientists would want to read the data in their development context. However, we are happy to get feedback from you on how to support that. In Spark, one could get the key from a secrets service and write the contents to a local path, which is then passed to DeltaSharingProfile(). An alternative or secondary option could be to pass the secret value into DeltaSharingProfile() as a parameter.
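
A minimal sketch of the first option, assuming the PEM text has already been fetched from whatever secrets service is in use (that call is omitted because it depends on the platform). The helper name `write_key_to_private_file` is hypothetical; the returned path is what would be handed to the profile as the private key location.

```python
import os
import tempfile


def write_key_to_private_file(pem_text):
    """Write PEM text to a new temporary file and return its path."""
    # mkstemp creates the file with owner-only permissions on POSIX systems.
    fd, path = tempfile.mkstemp(suffix=".pem")
    with os.fdopen(fd, "w") as key_file:
        key_file.write(pem_text)
    return path
```

The caller is responsible for deleting the file (for example with `os.remove(path)`) once the token has been obtained, so that the key does not linger on the node's local disk.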

@moderakh moderakh self-requested a review August 29, 2025 21:01
Collaborator

@moderakh moderakh left a comment

@paalvibe
Thank you for filing this feature request and raising the PR, it’s a great effort to extend the Delta Sharing client.

Apologies for the delay in reviewing. I do have a few questions, adding them on the PR.

Author

paalvibe commented Aug 29, 2025

Awesome. Will get back to you asap.

Author

paalvibe commented Sep 8, 2025

> @paalvibe Thank you for filing this feature request and raising the PR, it’s a great effort to extend the Delta Sharing client.
>
> Apologies for the delay in reviewing. I do have a few questions, adding them on the PR.

Hi! Any updates on the PR?
