[backend] Unable to disable cache for pipeline steps #10966

Open
rmoesbergen opened this issue Jun 27, 2024 · 3 comments

@rmoesbergen

Environment

  • How did you deploy Kubeflow Pipelines (KFP)?
    Using Kustomize, in GKE

  • KFP version:
    Pipelines version 2.2.0

  • KFP SDK version:
    SDK version 1.8.22 and 2.7.0 (both have this issue)

Steps to reproduce

I have a pipeline like this:

        train_op = (
            train_loader.create_op(
                job_name=job_name,
                account=account,
            )
            .set_caching_options(False)
        )
        train_op.execution_options.caching_strategy.max_cache_staleness = "P0D"

When I compile and run this pipeline, the Pod still gets these labels and annotations:

labels:
  app: kubeflow-job
  pipeline/runid: d2173b95-f465-47ac-a38a-470769c2064b
  pipelines.kubeflow.org/cache_enabled: 'true'
  pipelines.kubeflow.org/cache_id: ''
  pipelines.kubeflow.org/enable_caching: 'false'
annotations:
  pipelines.kubeflow.org/execution_cache_key: 852f1ec5f95c01d9c0e62b85072fa8092f5f7933e73a08ec96e6ebb74229391e

No matter what I try, Kubeflow keeps caching the steps. That makes no sense for us, because our underlying data changes while the parameters stay the same. I also tried all of the suggestions below, including modifying the mutating admission webhook, but nothing works:

#4857
https://www.kubeflow.org/docs/components/pipelines/v1/overview/caching/
https://www.kubeflow.org/docs/components/pipelines/v2/caching/

The only thing that sort of works is reverting Kubeflow Pipelines to 2.0.5. The annotations are still there in that version, but Kubeflow somehow ignores them and doesn't cache.

Expected result

Kubeflow stops caching when I ask it to.


Impacted by this bug? Give it a 👍.

@tjhorner

tjhorner commented Jul 9, 2024

There's a discrepancy between the SDK and the backend about what label to use to control the caching behavior.

The backend uses pipelines.kubeflow.org/cache_enabled:

KFPCacheEnabledLabelKey string = "pipelines.kubeflow.org/cache_enabled"

But the SDK uses pipelines.kubeflow.org/enable_caching:

# Caching option
op.add_pod_label('pipelines.kubeflow.org/enable_caching',
                 str(op.enable_caching).lower())

(here and in a few other locations.)
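To check whether a given pod is affected by this mismatch, something like the following might help. It is only a sketch: the two label keys are the ones quoted above, the sample labels are the ones from the original report, and the function name plus the choice to treat a missing label as "enabled" are assumptions made for illustration.

```python
# Label keys quoted in this thread.
BACKEND_LABEL = "pipelines.kubeflow.org/cache_enabled"  # read by the backend
SDK_LABEL = "pipelines.kubeflow.org/enable_caching"     # written by the SDK


def caching_labels_disagree(labels: dict) -> bool:
    """Return True when the SDK label disables caching but the backend label does not.

    `labels` is the pod's metadata.labels mapping. Treating a missing label
    as "enabled" is an assumption of this sketch, not documented behavior.
    """
    sdk_disabled = labels.get(SDK_LABEL, "true").lower() == "false"
    backend_disabled = labels.get(BACKEND_LABEL, "true").lower() == "false"
    return sdk_disabled and not backend_disabled


# Labels copied from the pod in the original report.
pod_labels = {
    "app": "kubeflow-job",
    "pipelines.kubeflow.org/cache_enabled": "true",
    "pipelines.kubeflow.org/cache_id": "",
    "pipelines.kubeflow.org/enable_caching": "false",
}
print(caching_labels_disagree(pod_labels))  # prints True
```

A result of True means the step asked for no caching through the SDK, but the label the backend reads still says caching is on.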

As a workaround, you can manually add the label pipelines.kubeflow.org/cache_enabled: 'false' to your pods, for example:

train_op.add_pod_label("pipelines.kubeflow.org/cache_enabled", "false")

The SDK should be updated to use the correct label.
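For pipelines with many steps, the workaround above could be wrapped in a small helper. This is a rough sketch, not a confirmed fix: `disable_backend_cache` and `FakeOp` are hypothetical names, `FakeOp` exists only so the example runs without kfp or a cluster, and the only real API assumed is `add_pod_label(name, value)` as used on `train_op` above.

```python
BACKEND_LABEL = "pipelines.kubeflow.org/cache_enabled"  # label the backend reads


def disable_backend_cache(ops):
    """Add the backend's cache label with value 'false' to every task in `ops`.

    Each task only needs an `add_pod_label(name, value)` method, as in the
    one-line workaround above. Returns the tasks as a list for convenience.
    """
    ops = list(ops)
    for op in ops:
        op.add_pod_label(BACKEND_LABEL, "false")
    return ops


class FakeOp:
    """Hypothetical stand-in for a pipeline task, used only to run this sketch."""

    def __init__(self):
        self.pod_labels = {}

    def add_pod_label(self, name, value):
        self.pod_labels[name] = value
        return self


train_op, eval_op = disable_backend_cache([FakeOp(), FakeOp()])
print(train_op.pod_labels)
```

In a real pipeline the list would hold the actual tasks (for example `train_op` from the report) instead of `FakeOp` instances. Whether the 2.2.0 backend honors the label in every case has not been confirmed in this thread.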

@gregsheremeta
Contributor

Hm, I'll look into this. I can verify that the annotations are not used in 2.0.5, which is coincidentally the version that I tend to run. In 2.0.5, the only thing that matters is the enableCache option in the proto. The check is here.

I'll take a look at 2.2.0 and report back.

@gregsheremeta
Contributor

/assign @gregsheremeta
