Iceberg attach rework #142

Merged
Tishj merged 69 commits into duckdb:main from Tishj:iceberg_attach_rework
Apr 12, 2025
Conversation


@Tishj Tishj commented Apr 2, 2025

This PR changes the way secret and attach options function for Iceberg.

Previously, the secret in the attach options was expected to be an S3 secret. This makes sense for s3tables/glue, but not for more traditional Iceberg REST Catalogs, which instead rely on OAuth2 to authorize access to the catalog.

In that case, the key_id and secret from the S3 secret were used in the request to the authorization server in the client_credentials flow, which was wrong and confusing.

Attach Options

Options handled by the root:

  • endpoint
    The main endpoint of the Iceberg REST Catalog.
  • endpoint_type
    Streamlined way of attaching recognized catalog types.
  • authorization_type
    The method used to authorize access to the catalog.

AUTHORIZATION_TYPE defaults to oauth2.

Options handled by the endpoint type:

Based on the endpoint type, additional options may be passed; see "Endpoint Types".

Options handled by the authorization type:

Based on the authorization type, additional options may be passed; see "Authorization Types".
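
Taken together, a minimal attach using only the root options might look like the following sketch (the catalog name and endpoint are placeholders, not real values):

```sql
-- Sketch only: warehouse, catalog name, and endpoint are placeholders.
-- AUTHORIZATION_TYPE defaults to 'oauth2' and could be omitted here.
ATTACH 'my_warehouse' AS my_catalog (
    TYPE ICEBERG,
    ENDPOINT 'https://rest-catalog.example.com/api/catalog',
    AUTHORIZATION_TYPE 'oauth2'
);
```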

ICEBERG Secret Options

The ICEBERG secret uses the same options as the OAuth2 AUTHORIZATION_TYPE, with the addition of endpoint, which is used to infer OAUTH2_SERVER_URI when it is not given explicitly.
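
As a sketch (all values are placeholders), creating such a secret could look like this; OAUTH2_SERVER_URI would then be inferred from ENDPOINT if omitted:

```sql
-- Placeholder values; ENDPOINT is used to infer OAUTH2_SERVER_URI.
CREATE SECRET my_iceberg_secret (
    TYPE ICEBERG,
    CLIENT_ID '<client_id>',
    CLIENT_SECRET '<client_secret>',
    ENDPOINT 'https://rest-catalog.example.com/api/catalog'
);
```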

Endpoint Types

Glue (glue)

Attach Options:

  • secret
    The explicit name of a recognized storage secret to use.

NOTE:
The found secret (either through discovery or by explicitly naming it) needs to have region set.

endpoint will be set to glue.{region}.amazonaws.com/iceberg.
authorization_type will be set to sigv4.
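
A hedged sketch of a Glue attach; the warehouse value and secret name are placeholders, and per the note above the storage secret must have region set:

```sql
-- 'my_s3_secret' is assumed to be an existing S3 secret with REGION set.
ATTACH '<aws_account_id>' AS glue_catalog (
    TYPE ICEBERG,
    ENDPOINT_TYPE 'glue',
    SECRET my_s3_secret
);
```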

S3Tables (s3_tables)

The region is discovered from the provided warehouse (the path of the attached catalog).

endpoint will be set to s3tables.{region}.amazonaws.com/iceberg.
authorization_type will be set to sigv4.
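
A sketch of an s3_tables attach; the table bucket ARN is illustrative, and the region would be discovered from it as described above:

```sql
-- Illustrative ARN; the region ('us-east-2' here) is parsed from the
-- warehouse path, so no explicit region option is needed.
ATTACH 'arn:aws:s3tables:us-east-2:111122223333:bucket/my-table-bucket' AS s3t_catalog (
    TYPE ICEBERG,
    ENDPOINT_TYPE 's3_tables'
);
```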

Authorization Types

OAuth2

Attach Options:

  • secret
    The explicit name of an ICEBERG secret to use.
  • oauth2_scope
    The scope for the authorization request.
  • oauth2_server_uri
    The endpoint of the authorization server to send the request to.
  • client_id
    The client_id for the authorization request.
  • client_secret
    The client_secret for the authorization request.
  • oauth2_grant_type
    The grant_type for the authorization request.
  • token
    The bearer token received from the authorization server directly.

NOTE:
secret cannot be combined with any of the other options.
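
As a sketch, these OAuth2 options can also be passed inline on ATTACH rather than through a secret (all values are placeholders; per the note above, they are not combined with secret):

```sql
-- Placeholder endpoint, credentials, and authorization server URI.
ATTACH 'my_warehouse' AS my_catalog (
    TYPE ICEBERG,
    ENDPOINT 'https://rest-catalog.example.com/api/catalog',
    AUTHORIZATION_TYPE 'oauth2',
    CLIENT_ID '<client_id>',
    CLIENT_SECRET '<client_secret>',
    OAUTH2_SERVER_URI 'https://auth.example.com/oauth/tokens'
);
```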

SigV4

Attach Options:

  • secret
    The explicit name of a recognized storage secret to use.

TL;DR

The main changes are:

  • Addition of the AUTHORIZATION_TYPE.
  • Fixing the ICEBERG secret type so it can be created by the user.
  • Supporting token in the oauth2 authorization type options.

@Tishj Tishj marked this pull request as draft April 2, 2025 11:18

corleyma commented Apr 3, 2025

I think this is headed in the right direction! One question:

if I read this correctly, can storage_secret be e.g. a GCS secret, if someone is running a rest catalog for iceberg data on GCS, and hasn't enabled credential vending?


Tishj commented Apr 3, 2025

I think this is headed in the right direction! One question:

if I read this correctly, can storage_secret be e.g. a GCS secret, if someone is running a rest catalog for iceberg data on GCS, and hasn't enabled credential vending?

Hmm, I'm not sure; I think we'll need further changes to support storage secret types other than S3.
In this scenario, does the catalog expect authorization through the oauth endpoint, or is it more like glue/s3tables?


corleyma commented Apr 3, 2025

@Tishj If by oauth endpoint you mean the catalog-hosted oauth endpoint, that's going away in the REST spec. The catalog does still expect authentication and will need its own secret. And while in practice the way to auth against REST catalog servers has been the oauth client credentials flow (because that's what the major Iceberg client reference implementations in Java implemented), that is also changing now with the introduction of the AuthManager API. Similar changes are planned for pyiceberg to support more flexible auth.

I think instead of thinking of it as an "Iceberg Secret", you could think of a few secret types:

  • sigv4 (this is the kind of secret that s3tables/glue/etc use, what you've called "s3" secret so far -- standard auth mechanism for AWS services)
  • oauth client credentials flow (currently, what you're calling Iceberg secret type)
  • oauth device code flow (human user flow, would require user to auth in browser)
  • token (just give me a token however you got it, I will submit this as Bearer token to catalog)

If you consider a catalog like Lakekeeper that implements OIDC with external identity providers, all that matters is that it gets a signed jwt with the correct claims (like audience, etc).

So I think it might be smart to have an interface that works like:

  • catalog secret (always required, at least if a catalog has auth enabled)
  • storage secret (optional, only required if catalog does not vend credentials or you don't want to use the catalog vended credential for some reason)

And catalog secret can be one of:

  • sigv4
  • Oauth2Credentials (initially, just support client credentials flow as you do now, could be updated in future to make client_secret optional and add more optional fields to support e.g. oauth2 device code flow, etc; I would also advocate for dropping the legacy {endpoint}/v1/oauth/tokens support here so it really is just generic oauth2 client credentials, since this endpoint has been deprecated in the spec for awhile now and I am not aware of production catalogs that require it.)
  • Token

By letting users configure a token secret type that you will pass directly as bearer token when making requests to catalog, you provide the opportunity for others to implement their own custom client-side auth flows, including if needed e.g. device code flow (though i do think that's something that would be good to be duckdb native in the future), and just give you a ready to use token. This will also help people who have weird requirements like accessing their catalogs behind authenticating proxies, etc (it's a thing).


Tishj commented Apr 3, 2025

Our way of thinking about these was to have the ICEBERG secret type, and when we introduce support for additional authorization methods, we'll add an AUTHORIZATION_TYPE option to the iceberg secret, which defaults to oauth2.

Based on the authorization type, we can handle any other parameter that is relevant for that authorization type, and also provide defaults for those parameters if they aren't given (right now we have defaults for scope and oauth2_server_uri for the oauth2 authorization type).

Interesting that you mention sigv4; we handle those through the CATALOG_SECRET currently, but the ENDPOINT_TYPE is required to indicate glue/s3tables (the sigv4-based authorization). I have considered adding that as an authorization type instead, with a parameter to refer to the IAM (s3) secret for signing.

Something along the lines of

CREATE SECRET (
	TYPE ICEBERG,
	AUTHORIZATION_TYPE 'sigv4',
	SIGNER_SECRET 'my_s3_secret'
)

But that would introduce a bunch of extra steps to the glue/s3tables catalog configuration, which I'm not sure we want to do.

To paint a more complete picture, this is how a token would be provided directly (this PR adds support for this):

CREATE SECRET (
	TYPE ICEBERG,
	AUTHORIZATION_TYPE 'oauth2', --- optional, 'oauth2' is the default
	TOKEN '<bearer_token>'
)

I would also advocate for dropping the legacy {endpoint}/v1/oauth/tokens support here

There is already support for the OAUTH2_SERVER_URI parameter, the default for which is the deprecated endpoint - but a warning is logged (when logging is enabled). This is similar to Spark's behavior.

oauth device code flow

This is also something we're thinking about; I haven't looked into it yet, however, so I can't say what parameters would be required.

@Tishj Tishj marked this pull request as ready for review April 8, 2025 13:40

corleyma commented Apr 8, 2025

This is shaping up nicely to me, but wanted to bump again the discussion of supporting storage secrets for object stores other than s3 -- GCS, Azure, etc? Is that planned in this round of work or future work?


ediril commented Apr 11, 2025

No, it's not, this isn't Polaris.
I'll test it with Polaris/Lakekeeper, but the S3 secret is never referenced by the catalog

Hmm ok. I'm saying that with Polaris, specifying the S3 credentials is not necessary and people won't provide them; they won't even know what they are. S3 creds are provided at the time the catalog is created, and Polaris vends them as needed under the hood thereafter.

So ICEBERG secret has to work without S3 creds being provided


ediril commented Apr 11, 2025

@Tmonster please don't merge yet, I'm getting HTTP 400 error when I try to query my data lake even though I'm not getting any errors during CREATE SECRET and ATTACH. Let me look into it


ediril commented Apr 11, 2025

This is what I'm getting:

D CREATE SECRET my_secret (
     TYPE ICEBERG,
     CLIENT_ID '...',
     CLIENT_SECRET '...',
     OAUTH2_SERVER_URI 'https://polaris.fivetran.com/api/catalog/v1/oauth/tokens'
  );
┌─────────┐
│ Success │
│ boolean │
├─────────┤
│ true    │
└─────────┘
D ATTACH 'land_inappropriate' AS my_catalog (
      TYPE ICEBERG, SECRET my_secret,
      ENDPOINT 'https://polaris.fivetran.com/api/catalog'
  );
D select * from my_catalog.fivetran_metadata_land_inappropriate.account limit 1;
HTTP Error:
HTTP GET error on 'https://george-fivetran-data-lake-test.s3.amazonaws.com/my_data_lake/fivetran_metadata_land_inappropriate/account/metadata/00002-3637a3f6-dc84-4df1-bfd8-978dcff9b477.metadata.json' (HTTP 400)

(This used to work.)


ediril commented Apr 11, 2025

Btw, there was an AWS_REGION parameter in the ICEBERG secret that needs to be put back somewhere; Polaris needs it. It could go into the ATTACH command, it doesn't need to be in the SECRET.


ediril commented Apr 11, 2025

It looks like REGION moved to GLUE and S3_TABLES but Polaris needs it as well.


ediril commented Apr 11, 2025

Here is a patch to give you inspiration that fixes the region issue above:

fix-region-for-polaris.patch


Tishj commented Apr 11, 2025

Here is a patch to give you inspiration that fixes the region issue above:

fix-region-for-polaris.patch

There is a client.region (see https://editor-next.swagger.io/?url=https://raw.githubusercontent.com/apache/iceberg/main/open-api/rest-catalog-open-api.yaml) listed in the recognized properties, which we don't currently support.

Would adding support for that fix your problem?
I really don't see why the "region" would need to be added by us in this manner; the catalog has many opportunities to provide it in a sensible fashion, e.g. through /v1/config, or as part of the config or storage-credentials in the /v1/{prefix}/namespaces/{namespace}/tables/{table} response.


ediril commented Apr 11, 2025

I really don't see why the "region" would need to be added by us in this manner

Without it, it doesn't work for data lakes outside us-east-1. Might be a Polaris bug, I'm not sure. The following is the key fix. How you get the region there is up to you; my patch is just one way, and if you don't like it, that's fine. But we do need it, so please make it possible to specify the region for the "else" case. The region is handled indirectly for GLUE and S3_TABLES, from the warehouse or the endpoint URI. We need to provide it explicitly for Polaris when it differs from the default us-east-1.

diff --git a/src/storage/irc_table_entry.cpp b/src/storage/irc_table_entry.cpp
index f716ec3..82950b6 100644
--- a/src/storage/irc_table_entry.cpp
+++ b/src/storage/irc_table_entry.cpp
@@ -97,6 +97,8 @@ TableFunction ICTableEntry::GetScanFunction(ClientContext &context, unique_ptr<F
 			                                      : kv_secret.TryGetValue("session_token").ToString()},
 			                {"region", region},
 			                {"endpoint", endpoint}};
+		} else {
+			info.options["region"] = ic_catalog.region;
 		}
 
 		(void)secret_manager.CreateSecret(context, info);


Tishj commented Apr 11, 2025

I really don't see why the "region" would need to be added by us in this manner

Without it, it doesn't work for data lakes outside us-east-1. Might be a Polaris bug, I'm not sure. The following is the key fix. How you get the region there is up to you; my patch is just one way, and if you don't like it, that's fine. But we do need it, so please make it possible to specify the region for the "else" case. The region is handled indirectly for GLUE and S3_TABLES. We need to provide it explicitly for Polaris when it differs from the default us-east-1.

diff --git a/src/storage/irc_table_entry.cpp b/src/storage/irc_table_entry.cpp
index f716ec3..82950b6 100644
--- a/src/storage/irc_table_entry.cpp
+++ b/src/storage/irc_table_entry.cpp
@@ -97,6 +97,8 @@ TableFunction ICTableEntry::GetScanFunction(ClientContext &context, unique_ptr<F
 			                                      : kv_secret.TryGetValue("session_token").ToString()},
 			                {"region", region},
 			                {"endpoint", endpoint}};
+		} else {
+			info.options["region"] = ic_catalog.region;
 		}
 
 		(void)secret_manager.CreateSecret(context, info);

Can you please verify that the /v1/config result doesn't contain the region somehow?
And the LoadTableResult (the other endpoint I mentioned) can also have it in the config property, or in storage-credentials


Tishj commented Apr 11, 2025

@ediril can the region be a default, or does it have to be an override?
I.e. if the catalog returns credentials with a certain region, does that take precedence, or should the user-supplied region override it?

Because what you have given me in your patch is an override for any non-glue, non-s3tables default credential (created when storage-credentials is empty but the config isn't).

And you have added a default value for the region.

I don't want to add a default value, especially not if it's an override.
And I'd prefer not to set an override if it can be a default.

I can add an override_region / default_region to the oauth2 authorization_type.
This would not have a default value (empty if not supplied)

Let me know if this should be override_region or default_region.
If this should be override_region, should this be applied to every created secret, or only the secret created from the config? (this is where the glue/s3tables override is currently applied).

Try the latest version of the branch
53268d7 is the relevant commit
This implements the default_region, adding the region as if it were a default from the /v1/config response.


ediril commented Apr 11, 2025

I can add an override_region / default_region to the oauth2 authorization_type.
This would not have a default value (empty if not supplied)

Yes I just tested this, it works, thanks!

It doesn't need to have a default value, and you can even call the parameter just REGION. You added a "not empty" check for it, that's perfect.

The only issue I see is that if OAUTH goes away, we'll be back to the same problem. REGION is really not related to OAUTH; it has to do with the bucket that holds the data lake files. The data lake can be anywhere, so it is not an AWS/S3 parameter either.


ediril commented Apr 11, 2025

the catalog has many opportunities to give it in a sensible fashion, i.e through /v1/config or returned as part of the config or storage-credentials in the /v1/{prefix}/namespaces/{namespace}/tables/{table} response

No, the region is not returned via either of these, unfortunately. I agree /v1/config would be a good place to return it. Might be a bug or oversight, I'm not sure.


Tishj commented Apr 11, 2025

the catalog has many opportunities to give it in a sensible fashion, i.e through /v1/config or returned as part of the config or storage-credentials in the /v1/{prefix}/namespaces/{namespace}/tables/{table} response

No, the region is not returned via either of these, unfortunately. I agree /v1/config would be a good place to return it. Might be a bug or oversight, I'm not sure.

I am confident this is a bug in Polaris, because this just doesn't respect vended credentials; there shouldn't be any intervention on the user side. There are official channels for this (config, storage-credentials...)

@Tishj Tishj merged commit afb64ca into duckdb:main Apr 12, 2025
20 checks passed
@PowerPlop

Is authorization_type NONE supported?

Nessie catalog can run with authentication disabled: https://projectnessie.org/nessie-latest/authentication/

@Tmonster

@PowerPlop, that should be fixed with #288

