Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring S3 config #51

Closed
wants to merge 9 commits into from
Closed

Refactoring S3 config #51

wants to merge 9 commits into from

Conversation

fhennig
Copy link
Contributor

@fhennig fhennig commented Apr 11, 2022

Description

fixes #49

Review Checklist

  • Code contains useful comments
  • (Integration-)Test cases added (or not applicable)
  • Documentation added (or not applicable)
  • Changelog updated (or not applicable)
  • Cargo.toml only contains references to git tags (not specific commits or branches)
  • Helm chart can be installed and deployed operator works (or not applicable)

Once the review is done, comment bors r+ (or bors merge) to merge. Further information

@fhennig fhennig marked this pull request as ready for review April 12, 2022 11:48
Copy link
Member

@razvan razvan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comments

submit_cmd.push(format!("--conf spark.hadoop.fs.s3a.endpoint={}", endpoint));
if self.spec.s3.is_some() {
submit_cmd.push(
"--conf spark.hadoop.fs.s3a.endpoint=http://${BUCKET_HOST}:${BUCKET_PORT}/"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

http:// as a hard-coded protocol doesn't look right. Have you tried s3a:// instead ?

Also, since we are adding this, why not add all other s3a related properties as documented here https://spark.apache.org/docs/latest/running-on-kubernetes.html#dependency-management

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it works with http, but I'll try with s3a.

why not add the other properties?

That wasn't really in the scope of the ticket ;) But we could add more. This was mostly about making things more unified among operators.

@@ -365,6 +382,21 @@ fn env_var_from_secret(var_name: &str, secret: &str, secret_key: &str) -> EnvVar
}
}

fn env_var_from_config_map(var_name: &str, config_map: &str, config_map_key: &str) -> EnvVar {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another candidate for the operator-rs :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already added something like this recently: stackabletech/operator-rs#370
But it's a function on the ContainerBuilder. And in the end I decided against using the functions like this because it would make the code worse. And Sebastian and I didn't want two ways to do the same thing to keep the operators more "the same". In the end I decided to use the function here with the thought "this operator is different from all the others"

@sbernauer
Copy link
Member

sbernauer commented Apr 13, 2022

Hi ho, I'm not that into the whole S3 refactoring going on here. I'm just dropping some ideas with no requirement to fulfill them ;)
Why is the S3 connection a ConfigMap and not a CRD? The reason I'm asking is that i see that we need more advanced settings like Tls verification.
I would love to have something like this

apiVersion: v1
kind: S3Connection
metadata:
  name: my-minio
data:
  host: test-minio
  port: 9000 # optional
  pathStyleAccess: true # optional
  region: us-east-1 # optional
  multiPartUpload: true # optional
  # maybe sts in the future
  tls:  # optional common TLS struct => also allows to differentiate http/https. See https://docs.stackable.tech/commons-operator/tls.html
    verification:
      none: {}
  authenticationClass: minio-credentials # optional
# We need a AuthenticationClass as we want to support login via anonymous, basicAuth, via Session Token, via Iam Role etc... -> same concept as with LDAP bind credentials
---
apiVersion: v1
kind: S3Bucket
metadata:
  name: my-bucket
data:
  bucketName: my-bucket
  s3Connection: my-minio

The S3Connection should contain all the information needed to talk to a S3 service. The bucket is not included as you don't want to repeat the whole connection 100 times if you have 100 buckets.

@fhennig
Copy link
Contributor Author

fhennig commented Apr 13, 2022

I think we can close this without merging

@razvan razvan closed this Apr 19, 2022
@razvan razvan deleted the s3-rework branch November 8, 2022 09:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor S3 configuration to be compabitle with ADR and operator-rs implementation
3 participants