Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Improvement] Enable setting of additional configuration parameters through environment variables #6458

Merged
merged 4 commits into from
Feb 19, 2025

Conversation

this-user
Copy link
Contributor

@this-user this-user commented Feb 14, 2025

What changes were proposed in this pull request?

Add the environment variables GRAVITINO_JDBC_USER, GRAVITINO_JDBC_PASSWORD, and GRAVITINO_S3_ENDPOINT to the rewrite_config.py, so database user and password can be set by the user in a container environment as well as a custom S3 endpoint.

Why are the changes needed?

Currently, JDBC user and PW are hardcoded to iceberg and cannot be changed when using the container images. One workaround is including the values in the JDBC connection string, but this neither elegant nor in line with how containers commonly handle this issue.

Additionally, it wasn't possible to set the value of gravitino.iceberg-rest.s3-endpoint in a container environment, which can be useful in a variety of situations when using a provider other than AWS.

Does this PR introduce any user-facing change?

Added the three aforementioned two environment variables that can be used by the user.

How was this patch tested?

Was tested locally only, since this is a minimal change.

@this-user this-user changed the title [Improvement] Enable setting JDBC user and password through environment variables [Improvement] Enable setting of additional configuration parameters through environment variables Feb 14, 2025
@tengqm
Copy link
Contributor

tengqm commented Feb 17, 2025

I'm not sure if passing credentials via env variables is a good practice. It has some security flaws. A more secure way would be mount them via a volume.

@jerryshao
Copy link
Contributor

@FANNG1 can you please help to review?

@this-user
Copy link
Contributor Author

I'm not sure if passing credentials via env variables is a good practice. It has some security flaws. A more secure way would be mount them via a volume.

That's a fair point, but this is how it is commonly done, including by other Iceberg REST catalogue implementations and Spark as well if you running them inside containers. And most users would expect it to work this way. While it is certainly not ideal, I don't think it's substantially worse than other options, because if someone could gain access to the container's environment, they could surely gain access to the file system too.

If you wanted to be more secure, you would have to avoid storing the credentials in clear text by deploying a solution like HashiCorp's Vault. But then you again run into the same issue: If someone gains access to the container, they can gain access to the credentials necessary for querying the Vault, and you end up in the same situation.

"GRAVITINO_WAREHOUSE" : "warehouse",
"GRAVITINO_CREDENTIAL_PROVIDER_TYPE" : "credential-providers",
"GRAVITINO_CREDENTIAL_PROVIDERS" : "credential-providers",
"GRAVITINO_GCS_CREDENTIAL_FILE_PATH" : "gcs-service-account-file",
"GRAVITINO_GCS_SERVICE_ACCOUNT_FILE" : "gcs-service-account-file",
"GRAVITINO_S3_ACCESS_KEY" : "s3-access-key-id",
"GRAVITINO_S3_SECRET_KEY" : "s3-secret-access-key",
"GRAVITINO_S3_ENDPOINT" : "s3-endpoint",
"GRAVITINO_S3_REGION" : "s3-region",
"GRAVITINO_S3_ROLE_ARN" : "s3-role-arn",
"GRAVITINO_S3_EXTERNAL_ID" : "s3-external-id",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add the mapping key in the document? please refer to https://gravitino.apache.org/docs/0.8.0-incubating/iceberg-rest-service#docker-instructions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, though I have put "0.9.0-incubating" into the "since version" field, not sure if that is correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's correct because the since version here means the environment variable added version.

@FANNG1 FANNG1 merged commit e2947e1 into apache:main Feb 19, 2025
28 checks passed
@FANNG1
Copy link
Contributor

FANNG1 commented Feb 19, 2025

merged to main, thanks @this-user for the contribution and @tengqm for the review!

Abyss-lord pushed a commit to Abyss-lord/gravitino that referenced this pull request Feb 19, 2025
…on parameters through environment variables (apache#6458)

### What changes were proposed in this pull request?

Add the environment variables `GRAVITINO_JDBC_USER`,
`GRAVITINO_JDBC_PASSWORD`, and `GRAVITINO_S3_ENDPOINT` to the
`rewrite_config.py`, so database user and password can be set by the
user in a container environment as well as a custom S3 endpoint.

### Why are the changes needed?

Currently, JDBC user and PW are hardcoded to `iceberg` and cannot be
changed when using the container images. One workaround is including the
values in the JDBC connection string, but this neither elegant nor in
line with how containers commonly handle this issue.

Additionally, it wasn't possible to set the value of
`gravitino.iceberg-rest.s3-endpoint` in a container environment, which
can be useful in a variety of situations when using a provider other
than AWS.

### Does this PR introduce _any_ user-facing change?

Added the three aforementioned two environment variables that can be
used by the user.

### How was this patch tested?

Was tested locally only, since this is a minimal change.
diqiu50 pushed a commit to diqiu50/gravitino that referenced this pull request Feb 19, 2025
…on parameters through environment variables (apache#6458)

### What changes were proposed in this pull request?

Add the environment variables `GRAVITINO_JDBC_USER`,
`GRAVITINO_JDBC_PASSWORD`, and `GRAVITINO_S3_ENDPOINT` to the
`rewrite_config.py`, so database user and password can be set by the
user in a container environment as well as a custom S3 endpoint.

### Why are the changes needed?

Currently, JDBC user and PW are hardcoded to `iceberg` and cannot be
changed when using the container images. One workaround is including the
values in the JDBC connection string, but this neither elegant nor in
line with how containers commonly handle this issue.

Additionally, it wasn't possible to set the value of
`gravitino.iceberg-rest.s3-endpoint` in a container environment, which
can be useful in a variety of situations when using a provider other
than AWS.

### Does this PR introduce _any_ user-facing change?

Added the three aforementioned two environment variables that can be
used by the user.

### How was this patch tested?

Was tested locally only, since this is a minimal change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants