Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,7 @@ request adding CHANGELOG notes for breaking (!) changes and possibly other secti
- Generic Table is no longer in beta and is generally-available.
- Added Windows support for Python client.
- (Before/After)UpdateTableEvent is emitted for all table updates within a transaction.
- Added KMS options to Polaris CLI

### Deprecations

Expand Down
2 changes: 2 additions & 0 deletions client/python/apache_polaris/cli/command/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,8 @@ def options_get(key, f=lambda x: x):
sts_endpoint=options_get(Arguments.STS_ENDPOINT),
sts_unavailable=options_get(Arguments.STS_UNAVAILABLE),
path_style_access=options_get(Arguments.PATH_STYLE_ACCESS),
current_kms_key=options_get(Arguments.KMS_KEY_CURRENT),
allowed_kms_keys=options_get(Arguments.KMS_KEY_ALLOWED),
Comment on lines +75 to +76
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does just setting current_kms_key automatically adds the allowed_kms_keys ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently "current" is the same as "allowed"... However, I believe only "current" needs write access... but that's in Polaris java code... it does not affect CLI.

catalog_connection_type=options_get(Arguments.CATALOG_CONNECTION_TYPE),
catalog_authentication_type=options_get(Arguments.CATALOG_AUTHENTICATION_TYPE),
catalog_service_identity_type=options_get(Arguments.CATALOG_SERVICE_IDENTITY_TYPE),
Expand Down
18 changes: 17 additions & 1 deletion client/python/apache_polaris/cli/command/catalogs.py
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,8 @@ class CatalogsCommand(Command):
sts_endpoint: str
sts_unavailable: bool
path_style_access: bool
current_kms_key: str
allowed_kms_keys: List[str]
catalog_connection_type: str
catalog_authentication_type: str
catalog_service_identity_type: str
Expand Down Expand Up @@ -145,6 +147,8 @@ def validate(self):
f" {Argument.to_flag_name(Arguments.USER_ARN)},"
f" {Argument.to_flag_name(Arguments.ENDPOINT)},"
f" {Argument.to_flag_name(Arguments.ENDPOINT_INTERNAL)},"
f" {Argument.to_flag_name(Arguments.KMS_KEY_CURRENT)},"
f" {Argument.to_flag_name(Arguments.KMS_KEY_ALLOWED)},"
f" {Argument.to_flag_name(Arguments.STS_ENDPOINT)},"
f" {Argument.to_flag_name(Arguments.STS_UNAVAILABLE)}, and"
f" {Argument.to_flag_name(Arguments.PATH_STYLE_ACCESS)}"
Expand Down Expand Up @@ -179,7 +183,17 @@ def validate(self):
)

def _has_aws_storage_info(self):
return self.role_arn or self.external_id or self.user_arn or self.region or self.endpoint or self.endpoint_internal or self.sts_endpoint or self.path_style_access
return (self.role_arn
or self.external_id
or self.user_arn
or self.region
or self.endpoint
or self.endpoint_internal
or self.sts_endpoint
or self.current_kms_key
or self.allowed_kms_keys
or self.path_style_access
)

def _has_azure_storage_info(self):
return self.tenant_id or self.multi_tenant_app_name or self.consent_url
Expand All @@ -202,6 +216,8 @@ def _build_storage_config_info(self):
sts_endpoint=self.sts_endpoint,
sts_unavailable=self.sts_unavailable,
path_style_access=self.path_style_access,
current_kms_key=self.current_kms_key,
allowed_kms_keys=self.allowed_kms_keys,
)
elif self.storage_type == StorageType.AZURE.value:
config = AzureStorageConfigInfo(
Expand Down
8 changes: 8 additions & 0 deletions client/python/apache_polaris/cli/constants.py
Original file line number Diff line number Diff line change
Expand Up @@ -179,6 +179,8 @@ class Arguments:
ENDPOINT_INTERNAL = "endpoint_internal"
STS_ENDPOINT = "sts_endpoint"
STS_UNAVAILABLE = "no_sts"
KMS_KEY_CURRENT = "current_kms_key"
KMS_KEY_ALLOWED = "allowed_kms_key"
PATH_STYLE_ACCESS = "path_style_access"
CATALOG_CONNECTION_TYPE = "catalog_connection_type"
CATALOG_AUTHENTICATION_TYPE = "catalog_authentication_type"
Expand Down Expand Up @@ -258,6 +260,12 @@ class Create:
"(Only for S3) Indicates that Polaris should not use STS (e.g. if STS is not available)"
)
PATH_STYLE_ACCESS = "(Only for S3) Whether to use path-style-access for S3"
KMS_KEY_CURRENT = (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As these are optional and only for AWS, we may want to update client/python/apache_polaris/cli/command/catalogs.py as well for the function _has_aws_storage_info(). Here is a reference: https://github.com/apache/polaris/pull/3305/files#diff-a3e865c2a57514f7f505c706a3af70a5ac90b712f96656b513cdbfcee20c031eL181

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point - updated

"(Only for AWS S3) The AWS KMS key ARN to be used for encrypting new S3 data"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am assuming this is required because we need to use this key to encrypt metadata.json ? as when we are vending creds we don't know which snapshot the client will be reading so we vend creds for all or we just give decrypt creds for allowed key and encrypt | decrypt creds for current keys ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Polaris does not use KMS keys directly. It only generates AWS policies that allow those keys to be used on the AWS side when S3 requests are made. But, yes, the current key is used for writing new data. Zero or more additional keys are also allowed to be used because they might be required for dealing with old files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Polaris does not use KMS keys directly.

wouldn't we be needing this for encrypting / decrypting metadata.json ?

additional keys are also allowed to be used because they might be required for dealing with old files

I agree with additional keys but my question was why would Polaris vends creds for old kms keys for encrypting, files are immutable, so old keys should be vended for decrypt, similarly new key should have encrypt / decrypt.
Do we vend creds for encryption and decryting for all key in our sts policy ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my question was why would Polaris vends creds for old kms keys for encrypting, [...]

Currently it does. However, this is beyond the scope of current PR (CLI). It's about the actual java code from #2802 :)

Normally, I'd think "additional" keys should get only decryption rights, but this may be tricky from the manual key rotation perspective.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed #3338 for follow-up

)
KMS_KEY_ALLOWED = (
"(Only for AWS S3) AWS KMS key ARN(s) that this catalog and its clients are allowed to use for reading S3 data (zero or more)"
)

TENANT_ID = "(Required for Azure) A tenant ID to use when connecting to Azure Storage"
MULTI_TENANT_APP_NAME = (
Expand Down
2 changes: 2 additions & 0 deletions client/python/apache_polaris/cli/options/option_tree.py
Original file line number Diff line number Diff line change
Expand Up @@ -124,6 +124,8 @@ def get_tree() -> List[Option]:
Argument(Arguments.STS_ENDPOINT, str, Hints.Catalogs.Create.STS_ENDPOINT),
Argument(Arguments.STS_UNAVAILABLE, bool, Hints.Catalogs.Create.STS_UNAVAILABLE),
Argument(Arguments.PATH_STYLE_ACCESS, bool, Hints.Catalogs.Create.PATH_STYLE_ACCESS),
Argument(Arguments.KMS_KEY_CURRENT, str, Hints.Catalogs.Create.KMS_KEY_CURRENT),
Argument(Arguments.KMS_KEY_ALLOWED, str, Hints.Catalogs.Create.KMS_KEY_ALLOWED, allow_repeats=True),
Argument(Arguments.ALLOWED_LOCATION, str, Hints.Catalogs.Create.ALLOWED_LOCATION,
allow_repeats=True),
Argument(Arguments.ROLE_ARN, str, Hints.Catalogs.Create.ROLE_ARN),
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,8 @@ there are a few s3-only options:
```text
--storage-type s3
--role-arn (Only for AWS S3) A role ARN to use when connecting to S3
--current-kms-key (Only for AWS S3) The AWS KMS key ARN to be used for encrypting new S3 data
--allowed-kms-key (Only for AWS S3) AWS KMS key ARN(s) that this catalog and its clients are allowed to use for reading S3 data (zero or more)
--no-sts (Only for S3) Indicates that Polaris should not use STS (e.g. if STS is not available)
--region (Only for S3) The region to use when connecting to S3
--external-id (Only for S3) The external ID to use when connecting to S3
Expand Down