[sdk] Error running pipelines with pod labels or annotation in pipeline steps added using kfp-kubernetes #10868

Closed
blee-gl opened this issue Jun 5, 2024 · 2 comments

Comments

blee-gl commented Jun 5, 2024

Environment

  • KFP version: 2.0.5
  • KFP SDK version: 2.7.0
  • All dependencies version:
    • kfp 2.7.0
    • kfp-kubernetes 1.2.0
    • kfp-pipeline-spec 0.3.0
    • kfp-server-api 2.0.5

Steps to reproduce

Take the hello world v2 pipelines example script and call kubernetes.add_pod_label, as described in the kfp-kubernetes documentation, to add a label to the pod created for the hello_world step (code shown below). Then compile the pipeline.

from kfp import dsl, compiler, kubernetes


@dsl.component(base_image="python:3.9")
def hello_world(text: str) -> str:
    print(text)
    return text


@dsl.pipeline(name="hello-world", description="A simple intro pipeline")
def pipeline_hello_world(text: str = "hi there"):
    """Pipeline that passes small pipeline parameter string to consumer op."""

    consume_task = hello_world(
        text=text
    )  # Passing pipeline parameter as argument to consumer op

    kubernetes.add_pod_label(
        consume_task,
        label_key="test-label",
        label_value="test-value",
    )


if __name__ == "__main__":
    # execute only if run as a script
    compiler.Compiler().compile(
        pipeline_func=pipeline_hello_world, package_path="hello_world_pipeline.yaml"
    )

The pipeline spec YAML generated:

# PIPELINE DEFINITION
# Name: hello-world
# Description: A simple intro pipeline
# Inputs:
#    text: str [Default: 'hi there']
components:
  comp-hello-world:
    executorLabel: exec-hello-world
    inputDefinitions:
      parameters:
        text:
          parameterType: STRING
    outputDefinitions:
      parameters:
        Output:
          parameterType: STRING
deploymentSpec:
  executors:
    exec-hello-world:
      container:
        args:
        - --executor_input
        - '{{$}}'
        - --function_to_execute
        - hello_world
        command:
        - sh
        - -c
        - "\nif ! [ -x \"$(command -v pip)\" ]; then\n    python3 -m ensurepip ||\
          \ python3 -m ensurepip --user || apt-get install python3-pip\nfi\n\nPIP_DISABLE_PIP_VERSION_CHECK=1\
          \ python3 -m pip install --quiet --no-warn-script-location 'kfp==2.7.0'\
          \ '--no-deps' 'typing-extensions>=3.7.4,<5; python_version<\"3.9\"' && \"\
          $0\" \"$@\"\n"
        - sh
        - -ec
        - 'program_path=$(mktemp -d)


          printf "%s" "$0" > "$program_path/ephemeral_component.py"

          _KFP_RUNTIME=true python3 -m kfp.dsl.executor_main                         --component_module_path                         "$program_path/ephemeral_component.py"                         "$@"

          '
        - "\nimport kfp\nfrom kfp import dsl\nfrom kfp.dsl import *\nfrom typing import\
          \ *\n\ndef hello_world(text: str) -> str:\n    print(text)\n    return text\n\
          \n"
        image: python:3.9
pipelineInfo:
  description: A simple intro pipeline
  name: hello-world
root:
  dag:
    tasks:
      hello-world:
        cachingOptions:
          enableCache: true
        componentRef:
          name: comp-hello-world
        inputs:
          parameters:
            text:
              componentInputParameter: text
        taskInfo:
          name: hello-world
  inputDefinitions:
    parameters:
      text:
        defaultValue: hi there
        isOptional: true
        parameterType: STRING
schemaVersion: 2.1.0
sdkVersion: kfp-2.7.0
---
platforms:
  kubernetes:
    deploymentSpec:
      executors:
        exec-hello-world:
          podMetadata:
            labels:
              test-label: test-value

Upload the pipeline and start a run. The run fails with the error "Resource failed to execute".

Expected result

The pipeline should be able to successfully execute and the hello-world task pod should have the label "test-label" with value "test-value" attached to the pod.

Materials and Reference

The logs of the failed pod show this error: F0605 12:20:44.976828 20 main.go:76] KFP driver: failed to unmarshal Kubernetes config, error: unknown field "podMetadata" in kfp_kubernetes.KubernetesExecutorConfig KubernetesConfig: 0xc0002fb3f0

Full log stack below:

time="2024-06-05T12:20:44.953Z" level=info msg="capturing logs" argo=true
I0605 12:20:44.975730      20 main.go:105] input ComponentSpec:{
  "executorLabel": "exec-hello-world",
  "inputDefinitions": {
    "parameters": {
      "text": {
        "parameterType": "STRING"
      }
    }
  },
  "outputDefinitions": {
    "parameters": {
      "Output": {
        "parameterType": "STRING"
      }
    }
  }
}
I0605 12:20:44.976385      20 main.go:112] input TaskSpec:{
  "cachingOptions": {
    "enableCache": true
  },
  "componentRef": {
    "name": "comp-hello-world"
  },
  "inputs": {
    "parameters": {
      "text": {
        "componentInputParameter": "text"
      }
    }
  },
  "taskInfo": {
    "name": "hello-world"
  }
}
I0605 12:20:44.976597      20 main.go:118] input ContainerSpec:{
  "args": [
    "--executor_input",
    "{{$}}",
    "--function_to_execute",
    "hello_world"
  ],
  "command": [
    "sh",
    "-c",
    "\nif ! [ -x \"$(command -v pip)\" ]; then\n    python3 -m ensurepip || python3 -m ensurepip --user || apt-get install python3-pip\nfi\n\nPIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location 'kfp==2.7.0' '--no-deps' 'typing-extensions\u003e=3.7.4,\u003c5; python_version\u003c\"3.9\"' \u0026\u0026 \"$0\" \"$@\"\n",
    "sh",
    "-ec",
    "program_path=$(mktemp -d)\n\nprintf \"%s\" \"$0\" \u003e \"$program_path/ephemeral_component.py\"\n_KFP_RUNTIME=true python3 -m kfp.dsl.executor_main                         --component_module_path                         \"$program_path/ephemeral_component.py\"                         \"$@\"\n",
    "\nimport kfp\nfrom kfp import dsl\nfrom kfp.dsl import *\nfrom typing import *\n\ndef hello_world(text: str) -\u003e str:\n    print(text)\n    return text\n\n"
  ],
  "image": "python:3.9"
}
I0605 12:20:44.976721      20 main.go:133] input kubernetesConfig:{
  "podMetadata": {
    "labels": {
      "test-label": "test-value"
    }
  }
}
F0605 12:20:44.976828      20 main.go:76] KFP driver: failed to unmarshal Kubernetes config, error: unknown field "podMetadata" in kfp_kubernetes.KubernetesExecutorConfig
KubernetesConfig: 0xc0002fb3f0
time="2024-06-05T12:20:45.957Z" level=info msg="sub-process exited" argo=true error="<nil>"
time="2024-06-05T12:20:45.957Z" level=error msg="cannot save parameter /tmp/outputs/pod-spec-patch" argo=true error="open /tmp/outputs/pod-spec-patch: no such file or directory"
time="2024-06-05T12:20:45.958Z" level=error msg="cannot save parameter /tmp/outputs/cached-decision" argo=true error="open /tmp/outputs/cached-decision: no such file or directory"
time="2024-06-05T12:20:45.958Z" level=error msg="cannot save parameter /tmp/outputs/condition" argo=true error="open /tmp/outputs/condition: no such file or directory"
Error: exit status 1

Note that both kubernetes.add_pod_label and kubernetes.add_pod_annotation write to the podMetadata field, so using either one results in the error above.
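
The failure mode can be illustrated with a small sketch: a strict decoder rejects any field that is missing from the schema it was built with. The code below is not the KFP driver; `ASSUMED_BACKEND_FIELDS` and `unknown_fields` are hypothetical names, and the field set is an illustrative guess rather than something copied from the KFP source.

```python
import json

# Hypothetical field set for illustration only; NOT copied from the KFP source.
# It stands in for "the fields the deployed backend's schema knows about".
ASSUMED_BACKEND_FIELDS = {"secretAsVolume", "secretAsEnv", "pvcMount", "nodeSelector"}


def unknown_fields(kubernetes_config_json: str, known_fields: set) -> list:
    """Return the top-level config keys that are absent from known_fields."""
    config = json.loads(kubernetes_config_json)
    return sorted(key for key in config if key not in known_fields)


# The kubernetesConfig that the driver logged just before the fatal error.
logged_config = '{"podMetadata": {"labels": {"test-label": "test-value"}}}'
print(unknown_fields(logged_config, ASSUMED_BACKEND_FIELDS))  # ['podMetadata']
```

Any SDK helper that writes a field the backend schema does not know about would be flagged the same way, which matches the behaviour seen in the log.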


Impacted by this bug? Give it a 👍.

diankasileymane commented Jun 6, 2024

I am facing the same issue with kubernetes.set_image_pull_policy and kubernetes.set_image_pull_secrets.

Environment

  • KFP version: 2.0.5
  • KFP SDK version: 2.7.0
  • All dependencies version:
    • kfp 2.7.0
    • kfp-kubernetes 1.2.0
    • kfp-pipeline-spec 0.3.0
    • kfp-server-api 2.0.3

The pipeline spec YAML generated:

components:
  comp-simple-task:
    executorLabel: exec-simple-task
deploymentSpec:
  executors:
    exec-simple-task:
      container:
        args:
        - --executor_input
        - '{{$}}'
        - --function_to_execute
        - simple_task
        command:
        - sh
        - -c
        - "\nif ! [ -x \"$(command -v pip)\" ]; then\n    python3 -m ensurepip ||\
          \ python3 -m ensurepip --user || apt-get install python3-pip\nfi\n\nPIP_DISABLE_PIP_VERSION_CHECK=1\
          \ python3 -m pip install --quiet --no-warn-script-location 'kfp==2.7.0'\
          \ '--no-deps' 'typing-extensions>=3.7.4,<5; python_version<\"3.9\"' && \"\
          $0\" \"$@\"\n"
        - sh
        - -ec
        - 'program_path=$(mktemp -d)


          printf "%s" "$0" > "$program_path/ephemeral_component.py"

          _KFP_RUNTIME=true python3 -m kfp.dsl.executor_main                         --component_module_path                         "$program_path/ephemeral_component.py"                         "$@"

          '
        - "\nimport kfp\nfrom kfp import dsl\nfrom kfp.dsl import *\nfrom typing import\
          \ *\n\ndef simple_task():\n    print(\"hello-world\")\n\n"
        image: python:3.7
pipelineInfo:
  name: pipeline
root:
  dag:
    tasks:
      simple-task:
        cachingOptions:
          enableCache: true
        componentRef:
          name: comp-simple-task
        taskInfo:
          name: simple-task
schemaVersion: 2.1.0
sdkVersion: kfp-2.7.0
---
platforms:
  kubernetes:
    deploymentSpec:
      executors:
        exec-simple-task:
          imagePullPolicy: Always

Pod error

time="2024-06-06T07:30:23.403Z" level=info msg="capturing logs" argo=true
I0606 07:30:23.474031 19 main.go:105] input ComponentSpec:{ "executorLabel": "exec-simple-task" }
I0606 07:30:23.474488 19 main.go:112] input TaskSpec:{ "cachingOptions": { "enableCache": true }, "componentRef": { "name": "comp-simple-task" }, "taskInfo": { "name": "simple-task" } }
I0606 07:30:23.474637 19 main.go:118] input ContainerSpec:{ "args": [ "--executor_input", "{{$}}", "--function_to_execute", "simple_task" ], "command": [ "sh", "-c", "\nif ! [ -x \"$(command -v pip)\" ]; then\n python3 -m ensurepip || python3 -m ensurepip --user || apt-get install python3-pip\nfi\n\nPIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location 'kfp==2.7.0' '--no-deps' 'typing-extensions\u003e=3.7.4,\u003c5; python_version\u003c\"3.9\"' \u0026\u0026 \"$0\" \"$@\"\n", "sh", "-ec", "program_path=$(mktemp -d)\n\nprintf \"%s\" \"$0\" \u003e \"$program_path/ephemeral_component.py\"\n_KFP_RUNTIME=true python3 -m kfp.dsl.executor_main --component_module_path \"$program_path/ephemeral_component.py\" \"$@\"\n", "\nimport kfp\nfrom kfp import dsl\nfrom kfp.dsl import *\nfrom typing import *\n\ndef simple_task():\n print(\"hello-world\")\n\n" ], "image": "python:3.7" }
I0606 07:30:23.474760 19 main.go:133] input kubernetesConfig:{ "imagePullPolicy": "Always" }
F0606 07:30:23.474840 19 main.go:76] KFP driver: failed to unmarshal Kubernetes config, error: unknown field "imagePullPolicy" in kfp_kubernetes.KubernetesExecutorConfig
KubernetesConfig: 0xc000457210
time="2024-06-06T07:30:24.406Z" level=info msg="sub-process exited" argo=true error="<nil>"
time="2024-06-06T07:30:24.407Z" level=error msg="cannot save parameter /tmp/outputs/pod-spec-patch" argo=true error="open /tmp/outputs/pod-spec-patch: no such file or directory"
time="2024-06-06T07:30:24.407Z" level=error msg="cannot save parameter /tmp/outputs/cached-decision" argo=true error="open /tmp/outputs/cached-decision: no such file or directory"
time="2024-06-06T07:30:24.407Z" level=error msg="cannot save parameter /tmp/outputs/condition" argo=true error="open /tmp/outputs/condition: no such file or directory"
Error: exit status 1
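
When triaging driver logs like the ones in this thread, the rejected field can be pulled out of the fatal line with a regular expression. This is only a sketch: `parse_unknown_field` is a hypothetical helper, and the pattern assumes the message wording shown in these logs, which may differ in other KFP versions.

```python
import re
from typing import Optional, Tuple

# Matches the wording seen in this thread: unknown field "<name>" in <message type>
UNKNOWN_FIELD = re.compile(r'unknown field "([^"]+)" in ([\w.]+)')


def parse_unknown_field(log_text: str) -> Optional[Tuple[str, str]]:
    """Return (field name, message type) from a driver log, or None if absent."""
    match = UNKNOWN_FIELD.search(log_text)
    return (match.group(1), match.group(2)) if match else None


fatal_line = (
    'F0606 07:30:23.474840 19 main.go:76] KFP driver: failed to unmarshal '
    'Kubernetes config, error: unknown field "imagePullPolicy" in '
    'kfp_kubernetes.KubernetesExecutorConfig KubernetesConfig: 0xc000457210'
)
print(parse_unknown_field(fatal_line))
```

Running the same helper over the log in the original report yields "podMetadata" instead, which points at the same root cause for both reports.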


blee-gl commented Jun 6, 2024

I spoke with @rimolive on the #kubeflow-pipelines channel on CNCF Slack. The conclusion was that this is a compatibility issue between the kfp-kubernetes package version and the Kubeflow Pipelines version.

kfp-kubernetes v1.2.0 introduces kubernetes.add_pod_label(), kubernetes.add_pod_annotation(), kubernetes.set_image_pull_policy(), and kubernetes.set_image_pull_secrets(), among other functionality, and was released as part of the KFP v2.2.0 release.

I'm running Kubeflow v1.8 with KFP v2.0.5, so my KFP backend cannot parse the new fields in pipeline specs generated with kfp-kubernetes v1.2.0.

The solution is to upgrade KFP to v2.2.0. Kubeflow 1.9 is planned for release in July 2024 and is planned to come with KFP v2.2.0.
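
As a rough pre-flight check before using the kfp-kubernetes 1.2.0 helpers, the backend version can be compared against the release mentioned above. This is a minimal sketch with hypothetical helper names: it only handles plain numeric versions such as "2.0.5", and the backend version has to be supplied by hand because no KFP API call is assumed here.

```python
def parse_version(version: str) -> tuple:
    """Turn '2.0.5' or 'v2.2.0' into a comparable tuple such as (2, 0, 5)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


# From this thread: kfp-kubernetes 1.2.0 was released as part of KFP v2.2.0.
MIN_BACKEND_FOR_KFP_KUBERNETES_1_2 = "2.2.0"


def backend_needs_upgrade(backend_version: str) -> bool:
    """True if the backend predates the release that shipped kfp-kubernetes 1.2.0."""
    minimum = parse_version(MIN_BACKEND_FOR_KFP_KUBERNETES_1_2)
    return parse_version(backend_version) < minimum


print(backend_needs_upgrade("2.0.5"))  # True
print(backend_needs_upgrade("2.2.0"))  # False
```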

Marking this issue as closed.

CC @diankasileymane.
