
[backend] Query Strings in the Pipeline Root are ignored #10318

Closed
TobiasGoerke opened this issue Dec 14, 2023 · 7 comments

Comments

@TobiasGoerke
Contributor

TobiasGoerke commented Dec 14, 2023

To my understanding, using custom artifact stores in pipelines v2 requires configuring the pipelineRoot.
According to the blob documentation, an endpoint needs to be configured by using a queryString that gets appended to the pipelineRoot.

However, the KFP driver ignores query strings and assembles incorrect bucket names.

This issue effectively blocks using custom S3 artifact stores for pipelines v2.

Environment

KFP 2.0.5.

Steps to reproduce

Set a custom Pipeline Root, e.g. by setting the kfp-launcher's ConfigMap key defaultPipelineRoot to s3://my-bucket/my-dir?endpoint=my.endpoint&s3ForcePathStyle=true.
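For reference, the repro configuration might look like the sketch below. The namespace and ConfigMap name follow a typical Kubeflow install; only the defaultPipelineRoot key matters here.

```yaml
# Hypothetical kfp-launcher ConfigMap (typical Kubeflow layout assumed).
apiVersion: v1
kind: ConfigMap
metadata:
  name: kfp-launcher
  namespace: kubeflow
data:
  defaultPipelineRoot: "s3://my-bucket/my-dir?endpoint=my.endpoint&s3ForcePathStyle=true"
```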

Actual result

We see errors like:

failed to execute component: Failed to open bucket "my-bucket": open bucket s3//my-bucket/my-dir?endpoint=my.endpoint&s3ForcePathStyle=true%2Fmy-task-name%2Fmy-artifact-name: invalid value for query parameter "s3ForcePathStyle": strconv.ParseBool: parsing true/my-task-name/my-artifact-name: invalid syntax

Also, after the driver has run, the wrongly concatenated pipelineRoot is visible in the workflow YAML.
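The failure mode can be reproduced outside KFP in a few lines of Python: naively concatenating the artifact sub-path onto the pipeline root puts it after the query string, which is exactly what the blob URL parser then rejects. A query-aware join (the approach of #10319's AppendToPipelineRoot, sketched here with urllib rather than the actual Go code) keeps the query intact:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs
import posixpath

root = "s3://my-bucket/my-dir?endpoint=my.endpoint&s3ForcePathStyle=true"
subpath = "my-task-name/my-artifact-name"

# Naive concatenation: the sub-path lands inside the last query value, so
# "s3ForcePathStyle" becomes "true/my-task-name/..." and bool parsing fails.
broken = root + "/" + subpath
print(parse_qs(urlsplit(broken).query)["s3ForcePathStyle"])
# ['true/my-task-name/my-artifact-name']

# Query-aware join: extend only the path component, leave the query as-is.
def append_to_pipeline_root(pipeline_root: str, sub: str) -> str:
    parts = urlsplit(pipeline_root)
    return urlunsplit(parts._replace(path=posixpath.join(parts.path, sub)))

print(append_to_pipeline_root(root, subpath))
# s3://my-bucket/my-dir/my-task-name/my-artifact-name?endpoint=my.endpoint&s3ForcePathStyle=true
```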


Impacted by this bug? Give it a 👍.

@TobiasGoerke
Contributor Author

TobiasGoerke commented Dec 14, 2023

Using custom s3 endpoints now works for me (given #10319):

  • Set the defaultPipelineRoot in the kfp-launcher ConfigMap, e.g. to s3://my-bucket/my-dir?endpoint=my.endpoint&s3ForcePathStyle=true
  • Set the AWS_REGION environment variable
  • Mount mlpipeline-minio-artifact (or another secret containing your S3 credentials) as the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables
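Since a typo in the query string only surfaces at run time, a quick local sanity check (plain Python, not part of KFP) can catch a malformed pipeline root before it goes into the ConfigMap. Note this check is stricter than Go's strconv.ParseBool (which also accepts "1", "t", "TRUE", etc.):

```python
from urllib.parse import urlsplit, parse_qs

def check_pipeline_root(root: str) -> dict:
    """Parse a pipeline root and fail early if the query string is malformed."""
    parts = urlsplit(root)
    if parts.scheme != "s3":
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    params = {k: v[0] for k, v in parse_qs(parts.query).items()}
    # The driver's blob library parses this flag as a boolean, so a stray
    # suffix (like an appended sub-path) makes the whole URL invalid.
    if params.get("s3ForcePathStyle") not in (None, "true", "false"):
        raise ValueError(f"s3ForcePathStyle is not a boolean: {params['s3ForcePathStyle']!r}")
    return params

print(check_pipeline_root("s3://my-bucket/my-dir?endpoint=my.endpoint&s3ForcePathStyle=true"))
# {'endpoint': 'my.endpoint', 's3ForcePathStyle': 'true'}
```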

stijntratsaertit pushed a commit to stijntratsaertit/kfp that referenced this issue Feb 16, 2024
…10318) (kubeflow#10319)

* feat: preserve querystring in pipeline root

* refactor: create AppendToPipelineRoot
Also apply to client.go

* feat: remove query string from URIs (kubeflow#1)

* feat: remove query string from URIs

* refactor(GenerateOutputURI): move and preserve comments
@RonaldFletcher

RonaldFletcher commented Apr 1, 2024

> Using custom s3 endpoints now works for me (given #10319): […]

I used MinIO and configured it this way, but it didn't work. @TobiasGoerke
[screenshot: MinIO configuration]

@TobiasGoerke
Contributor Author

> I used minio, I configured it this way, but it didn't work

Hard to say what's going wrong from the screenshot alone.
I'd assume your container is misconfigured and lacks permissions to create directories under `/tmp/outputs/`; see the last two lines.

@RonaldFletcher

> Hard to say what's going wrong from the screenshot alone. I'd assume your container is misconfigured and lacks permissions to create directories under `/tmp/outputs/`; see the last two lines.

This is the example I use:
[screenshot: pipeline example]
I don't know why there is such an error. Also, how do I inject AWS_SECRET_ACCESS_KEY and the other environment variables into that service? @TobiasGoerke

@TobiasGoerke
Contributor Author

You'll either need to specify these vars manually or use tools like OPA / Kyverno.
However, it does not look like the message is related to auth.

@RonaldFletcher

> You'll either need to specify these vars manually or use tools like OPA / Kyverno. However, it does not look like the message is related to auth.

defaultPipelineRoot: s3://kubeflow-pipelines/v2?endpoint=minio.examplecom:443&s3ForcePathStyle=true
I also added the env vars to the ml-pipeline-apiserver deployment:
[screenshot: deployment env vars]
Is this all the configuration? Are the changes I made correct? @TobiasGoerke

@TobiasGoerke
Contributor Author

Seems correct, although you will also need to mount the env variables into your pipeline pods, like this (better to use OPA / Kyverno, though):

task.set_env_variable("AWS_REGION", "eu-central-1")
task.set_env_variable("AWS_ACCESS_KEY_ID", "minio")
task.set_env_variable("AWS_SECRET_ACCESS_KEY", "minio123")
