# Can't configure Executor args. Is it intended? #12315

Closed · 3 tasks done · zamonia500 opened this issue Dec 3, 2023 · 10 comments · Fixed by #12609

Assignees: agilgur5
Labels: area/executor · P3 Low priority · solution/suggested (a solution to the bug has been suggested; someone needs to implement it) · solution/workaround (there's a workaround, might not be great, but exists) · type/bug · type/regression (regression from previous behavior, a specific type of bug)


@zamonia500 (Contributor):

Pre-requisites

- [x] I have double-checked my configuration
- [x] I can confirm the issue exists when I tested with `:latest`
- [x] I'd like to contribute the fix myself (see contributing guide)

What happened/what did you expect to happen?

I'd like to customize the args of the executor containers to append extra arguments that turn on some flags in the argoexec image.

When I referred to the docs (workflow-controller-configmap.yaml), it seemed that customization of the executor container was supported up until some point:

```yaml
executor: |
  imagePullPolicy: IfNotPresent
  resources:
    requests:
      cpu: 0.1
      memory: 64Mi
    limits:
      cpu: 0.5
      memory: 512Mi
  # args & env allow command line arguments and environment variables to be
  # appended to the executor container; mainly used for development/debugging.
  args:
    - --loglevel
    - debug
    - --gloglevel
    - "6"
  env:
    # ARGO_TRACE enables some tracing information for debugging purposes.
    # Currently it enables logging of S3 request/response payloads
    # (including auth headers).
    - name: ARGO_TRACE
      value: "1"
```

I tried to customize the executor containers' args by updating the workflow-controller-configmap according to the latest schema, but it didn't work: my customization is ignored by the workflow controller.

So I inspected why it was ignored and found that there is no handling code to apply the customization to the executor containers (the init and wait containers):

```go
func (woc *wfOperationCtx) newInitContainer(tmpl *wfv1.Template) apiv1.Container {
	ctr := woc.newExecContainer(common.InitContainerName, tmpl)
	ctr.Command = []string{"argoexec", "init", "--loglevel", getExecutorLogLevel(), "--log-format", woc.controller.executorLogFormat()}
	return *ctr
}

func (woc *wfOperationCtx) newWaitContainer(tmpl *wfv1.Template) *apiv1.Container {
	ctr := woc.newExecContainer(common.WaitContainerName, tmpl)
	ctr.Command = []string{"argoexec", "wait", "--loglevel", getExecutorLogLevel(), "--log-format", woc.controller.executorLogFormat()}
	return ctr
}
```
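For illustration, a minimal sketch of the kind of handling that appears to be missing (hypothetical, and not necessarily how the eventual fix in #12609 does it): append the configured args after the generated command.

```go
// Hypothetical sketch: honor the `args` configured under `executor:` in the
// workflow-controller-configmap (exposed via Config.GetExecutor(), the same
// accessor already used for Resources and SecurityContext in newExecContainer).
func (woc *wfOperationCtx) newWaitContainer(tmpl *wfv1.Template) *apiv1.Container {
	ctr := woc.newExecContainer(common.WaitContainerName, tmpl)
	ctr.Command = []string{"argoexec", "wait", "--loglevel", getExecutorLogLevel(), "--log-format", woc.controller.executorLogFormat()}
	// Append user-configured executor args (e.g. --gloglevel "6") so they
	// actually reach the argoexec CLI.
	if ex := woc.controller.Config.GetExecutor(); ex != nil {
		ctr.Command = append(ctr.Command, ex.Args...)
	}
	return ctr
}
```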

Version

v3.5.1

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

### Updated `workflow-controller-configmap`

```yaml
apiVersion: v1
data:
  config: |
    artifactRepository:
      s3:
        accessKeySecret:
          key: accesskey
          name: mlpipeline-minio-artifact
        secretKeySecret:
          key: secretkey
          name: mlpipeline-minio-artifact
        bucket: mlpipeline
        endpoint: minio.minio:443
        insecure:
    executor:
      args:
      - --insecure-skip-tls-verify
      env:
      - name: Hello
        value: Customize
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: argo
    meta.helm.sh/release-namespace: argo
  creationTimestamp: "2023-11-24T10:38:09Z"
  labels:
    app.kubernetes.io/component: workflow-controller
    app.kubernetes.io/instance: argo
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: argo-workflows-cm
    app.kubernetes.io/part-of: argo-workflows
    helm.sh/chart: argo-workflows-0.39.0
  name: argo-argo-workflows-workflow-controller-configmap
  namespace: argo
```


### Testing workflow
[Hello world workflow](https://raw.githubusercontent.com/argoproj/argo-workflows/main/examples/hello-world.yaml)
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-
  labels:
    workflows.argoproj.io/archive-strategy: "false"
  annotations:
    workflows.argoproj.io/description: |
      This is a simple hello world example.
spec:
  entrypoint: whalesay
  templates:
  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [cowsay]
      args: ["hello world"]

After i create the hello-world workflow the workflow pod is customized as follows.

You can find the environment variable is injected well.

```text
Name:         hello-world-r2rwl
Namespace:    runway-project-22b4
Priority:     0
Node:         orange-cow-1/192.168.134.55
Start Time:   Sun, 03 Dec 2023 15:31:53 +0900
Labels:       workflows.argoproj.io/completed=true
              workflows.argoproj.io/workflow=hello-world-r2rwl
Annotations:  kubectl.kubernetes.io/default-container: main
              workflows.argoproj.io/node-id: hello-world-r2rwl
              workflows.argoproj.io/node-name: hello-world-r2rwl
Status:       Succeeded
IP:           10.244.1.105
IPs:
  IP:           10.244.1.105
Controlled By:  Workflow/hello-world-r2rwl
Init Containers:
  init:
    <...>
    Command:
      argoexec
      init
      --loglevel
      info
      --log-format
      text
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Sun, 03 Dec 2023 15:31:54 +0900
      Finished:     Sun, 03 Dec 2023 15:31:54 +0900
    Ready:          True
    Restart Count:  0
    Environment:
      ARGO_POD_NAME:                      hello-world-r2rwl (v1:metadata.name)
      ARGO_POD_UID:                        (v1:metadata.uid)
      GODEBUG:                            x509ignoreCN=0
      ARGO_WORKFLOW_NAME:                 hello-world-r2rwl
      Hello:                              Customize
      ARGO_CONTAINER_NAME:                init
      ARGO_TEMPLATE:                      {"name":"whalesay","inputs":{},"outputs":{},"metadata":{},"container":{"name":"","image":"docker/whalesay:latest","command":["cowsay"],"args":["hello world"],"resources":{}}}
      ARGO_NODE_ID:                       hello-world-r2rwl
      ARGO_INCLUDE_SCRIPT_OUTPUT:         false
      ARGO_DEADLINE:                      0001-01-01T00:00:00Z
      ARGO_PROGRESS_FILE:                 /var/run/argo/progress
      ARGO_PROGRESS_PATCH_TICK_DURATION:  1m0s
      ARGO_PROGRESS_FILE_TICK_DURATION:   3s
    Mounts:
      /var/run/argo from var-run-argo (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-c6msv (ro)
Containers:
  wait:
    <...>
    Command:
      argoexec
      wait
      --loglevel
      info
      --log-format
      text
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Sun, 03 Dec 2023 15:31:55 +0900
      Finished:     Sun, 03 Dec 2023 15:32:04 +0900
    Ready:          False
    Restart Count:  0
    Environment:
      ARGO_POD_NAME:                      hello-world-r2rwl (v1:metadata.name)
      ARGO_POD_UID:                        (v1:metadata.uid)
      GODEBUG:                            x509ignoreCN=0
      ARGO_WORKFLOW_NAME:                 hello-world-r2rwl
      Hello:                              Customize
      ARGO_CONTAINER_NAME:                wait
      ARGO_TEMPLATE:                      {"name":"whalesay","inputs":{},"outputs":{},"metadata":{},"container":{"name":"","image":"docker/whalesay:latest","command":["cowsay"],"args":["hello world"],"resources":{}}}
      ARGO_NODE_ID:                       hello-world-r2rwl
      ARGO_INCLUDE_SCRIPT_OUTPUT:         false
      ARGO_DEADLINE:                      0001-01-01T00:00:00Z
      ARGO_PROGRESS_FILE:                 /var/run/argo/progress
      ARGO_PROGRESS_PATCH_TICK_DURATION:  1m0s
      ARGO_PROGRESS_FILE_TICK_DURATION:   3s
    Mounts:
      /tmp from tmp-dir-argo (rw,path="0")
      /var/run/argo from var-run-argo (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-c6msv (ro)
  main:
    Container ID:  docker://8364799a17556a20be5f2e2ac70a57fba839c9afbc01b5f9bd9172af35ed6e5f
    Image:         docker/whalesay:latest
    Image ID:      docker-pullable://docker/whalesay@sha256:178598e51a26abbc958b8a2e48825c90bc22e641de3d31e18aaf55f3258ba93b
    Port:          <none>
    Host Port:     <none>
    Command:
      /var/run/argo/argoexec
      emissary
      --loglevel
      info
      --log-format
      text
      --
      cowsay
    Args:
      hello world
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Sun, 03 Dec 2023 15:32:02 +0900
      Finished:     Sun, 03 Dec 2023 15:32:03 +0900
    Ready:          False
    Restart Count:  0
    Environment:
      ARGO_CONTAINER_NAME:                main
      ARGO_TEMPLATE:                      {"name":"whalesay","inputs":{},"outputs":{},"metadata":{},"container":{"name":"","image":"docker/whalesay:latest","command":["cowsay"],"args":["hello world"],"resources":{}}}
      ARGO_NODE_ID:                       hello-world-r2rwl
      ARGO_INCLUDE_SCRIPT_OUTPUT:         false
      ARGO_DEADLINE:                      0001-01-01T00:00:00Z
      ARGO_PROGRESS_FILE:                 /var/run/argo/progress
      ARGO_PROGRESS_PATCH_TICK_DURATION:  1m0s
      ARGO_PROGRESS_FILE_TICK_DURATION:   3s
    Mounts:
      /var/run/argo from var-run-argo (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-c6msv (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  var-run-argo:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  tmp-dir-argo:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kube-api-access-c6msv:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  106s  default-scheduler  Successfully assigned runway-project-22b4/hello-world-r2rwl to orange-cow-1
  Normal  Pulled     105s  kubelet            Container image "cr.makina.rocks/library/argoproj/argoexec:v3.5.1-customized1" already present on machine
  Normal  Created    105s  kubelet            Created container init
  Normal  Started    105s  kubelet            Started container init
  Normal  Pulled     104s  kubelet            Container image "cr.makina.rocks/library/argoproj/argoexec:v3.5.1-customized1" already present on machine
  Normal  Created    104s  kubelet            Created container wait
  Normal  Started    104s  kubelet            Started container wait
  Normal  Pulling    104s  kubelet            Pulling image "docker/whalesay:latest"
  Normal  Pulled     98s   kubelet            Successfully pulled image "docker/whalesay:latest" in 6.708859082s (6.708862441s including waiting)
  Normal  Created    97s   kubelet            Created container main
  Normal  Started    97s   kubelet            Started container main
```


### Logs from the workflow controller

```text
time="2023-12-03T06:31:53.700Z" level=info msg="Processing workflow" namespace=runway-project-22b4 workflow=hello-world-r2rwl
time="2023-12-03T06:31:53.704Z" level=info msg="Updated phase  -> Running" namespace=runway-project-22b4 workflow=hello-world-r2rwl
time="2023-12-03T06:31:53.704Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=runway-project-22b4 workflow=hello-world-r2rwl
time="2023-12-03T06:31:53.704Z" level=info msg="was unable to obtain node for , letting display name to be nodeName" namespace=runway-project-22b4 workflow=hello-world-r2rwl
time="2023-12-03T06:31:53.704Z" level=info msg="Pod node hello-world-r2rwl initialized Pending" namespace=runway-project-22b4 workflow=hello-world-r2rwl
time="2023-12-03T06:31:53.716Z" level=info msg="Created pod: hello-world-r2rwl (hello-world-r2rwl)" namespace=runway-project-22b4 workflow=hello-world-r2rwl
time="2023-12-03T06:31:53.716Z" level=info msg="TaskSet Reconciliation" namespace=runway-project-22b4 workflow=hello-world-r2rwl
time="2023-12-03T06:31:53.716Z" level=info msg=reconcileAgentPod namespace=runway-project-22b4 workflow=hello-world-r2rwl
time="2023-12-03T06:31:53.722Z" level=info msg="Workflow update successful" namespace=runway-project-22b4 phase=Running resourceVersion=5692708 workflow=hello-world-r2rwl
time="2023-12-03T06:32:03.719Z" level=info msg="Processing workflow" namespace=runway-project-22b4 workflow=hello-world-r2rwl
time="2023-12-03T06:32:03.719Z" level=info msg="Task-result reconciliation" namespace=runway-project-22b4 numObjs=0 workflow=hello-world-r2rwl
time="2023-12-03T06:32:03.719Z" level=info msg="node changed" namespace=runway-project-22b4 new.message=PodInitializing new.phase=Pending new.progress=0/1 nodeID=hello-world-r2rwl old.message= old.phase=Pending old.progress=0/1 workflow=hello-world-r2rwl
time="2023-12-03T06:32:03.720Z" level=info msg="TaskSet Reconciliation" namespace=runway-project-22b4 workflow=hello-world-r2rwl
time="2023-12-03T06:32:03.720Z" level=info msg=reconcileAgentPod namespace=runway-project-22b4 workflow=hello-world-r2rwl
time="2023-12-03T06:32:03.729Z" level=info msg="Workflow update successful" namespace=runway-project-22b4 phase=Running resourceVersion=5692893 workflow=hello-world-r2rwl
time="2023-12-03T06:32:13.730Z" level=info msg="Processing workflow" namespace=runway-project-22b4 workflow=hello-world-r2rwl
time="2023-12-03T06:32:13.730Z" level=info msg="Task-result reconciliation" namespace=runway-project-22b4 numObjs=0 workflow=hello-world-r2rwl
time="2023-12-03T06:32:13.730Z" level=info msg="node changed" namespace=runway-project-22b4 new.message= new.phase=Succeeded new.progress=0/1 nodeID=hello-world-r2rwl old.message=PodInitializing old.phase=Pending old.progress=0/1 workflow=hello-world-r2rwl
time="2023-12-03T06:32:13.730Z" level=info msg="TaskSet Reconciliation" namespace=runway-project-22b4 workflow=hello-world-r2rwl
time="2023-12-03T06:32:13.730Z" level=info msg=reconcileAgentPod namespace=runway-project-22b4 workflow=hello-world-r2rwl
time="2023-12-03T06:32:13.730Z" level=info msg="Updated phase Running -> Succeeded" namespace=runway-project-22b4 workflow=hello-world-r2rwl
time="2023-12-03T06:32:13.730Z" level=info msg="Marking workflow completed" namespace=runway-project-22b4 workflow=hello-world-r2rwl
time="2023-12-03T06:32:13.735Z" level=info msg="cleaning up pod" action=deletePod key=runway-project-22b4/hello-world-r2rwl-1340600742-agent/deletePod
time="2023-12-03T06:32:13.738Z" level=info msg="Workflow update successful" namespace=runway-project-22b4 phase=Succeeded resourceVersion=5692977 workflow=hello-world-r2rwl
time="2023-12-03T06:32:13.746Z" level=info msg="cleaning up pod" action=labelPodCompleted key=runway-project-22b4/hello-world-r2rwl/labelPodCompleted

```

### Logs from your workflow's wait container

```text
time="2023-12-03T06:31:55.190Z" level=info msg="Starting Workflow Executor" version=v3.5.1+877c552.dirty
time="2023-12-03T06:31:55.192Z" level=info msg="Using executor retry strategy" Duration=1s Factor=1.6 Jitter=0.5 Steps=5
time="2023-12-03T06:31:55.193Z" level=info msg="Executor initialized" deadline="0001-01-01 00:00:00 +0000 UTC" includeScriptOutput=false namespace=runway-project-22b4 podName=hello-world-r2rwl template="{\"name\":\"whalesay\",\"inputs\":{},\"outputs\":{},\"metadata\":{},\"container\":{\"name\":\"\",\"image\":\"docker/whalesay:latest\",\"command\":[\"cowsay\"],\"args\":[\"hello world\"],\"resources\":{}}}" version="&Version{Version:v3.5.1+877c552.dirty,BuildDate:2023-11-30T05:32:42Z,GitCommit:877c5523066e17687856fe3484c9b2d398e986f5,GitTag:v3.5.1,GitTreeState:dirty,GoVersion:go1.21.4,Compiler:gc,Platform:linux/amd64,}"
time="2023-12-03T06:31:55.193Z" level=info msg="Starting deadline monitor"
time="2023-12-03T06:32:04.195Z" level=info msg="Main container completed" error="<nil>"
time="2023-12-03T06:32:04.196Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2023-12-03T06:32:04.196Z" level=info msg="No output parameters"
time="2023-12-03T06:32:04.196Z" level=info msg="No output artifacts"
time="2023-12-03T06:32:04.196Z" level=info msg="Alloc=9134 TotalAlloc=12915 Sys=24421 NumGC=3 Goroutines=7"
time="2023-12-03T06:32:04.196Z" level=info msg="Deadline monitor stopped"
time="2023-12-03T06:32:04.196Z" level=info msg="stopping progress monitor (context done)" error="context canceled"
@sarabala1979 (Member):

@zamonia500 Executor container customization only supports a few elements; we need to improve the docs:

```go
exec := apiv1.Container{
	Name:            name,
	Image:           woc.controller.executorImage(),
	ImagePullPolicy: woc.controller.executorImagePullPolicy(),
	Env:             woc.createEnvVars(),
	Resources:       woc.controller.Config.GetExecutor().Resources,
	SecurityContext: woc.controller.Config.GetExecutor().SecurityContext,
	// ...
}
```

Can you provide your use case for using args and env on the executor?

@zamonia500 (Contributor, Author):

@sarabala1979,

I have deployed a MinIO tenant into the same Kubernetes cluster as the argo-workflow-controller, and I have to communicate with MinIO over HTTPS. But the certificate created by the MinIO operator is self-signed!
So I'd like to pass a verify_tls=false option to the Argo executor.

@sarabala1979 (Member):

You can configure this on your workflow:

```go
Insecure *bool `json:"insecure,omitempty" protobuf:"varint,4,opt,name=insecure"`
```
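For example (an illustrative sketch, not from this thread; the endpoint, bucket, key, and artifact names are placeholders), on a workflow's s3 artifact:

```yaml
# Illustrative sketch: setting `insecure` per artifact in a workflow.
spec:
  templates:
    - name: whalesay
      outputs:
        artifacts:
          - name: result
            path: /tmp/result.txt
            s3:
              endpoint: minio.minio:443
              bucket: mlpipeline
              key: result.txt
              insecure: true
```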

@philBrown (Contributor):

I found the same issue trying to set the log level. It didn't matter what the executor ConfigMap setting was; the init and wait containers did not change.

Config map:

```yaml
data:
  executor: |
    args:
      - --loglevel
      - debug
      - --gloglevel
      - "6"
```

`kubectl describe pod ...`:

```text
Init Containers:
  init:
    # ...
    Command:
      argoexec
      init
      --loglevel
      info
      --log-format
      text
```

I can see from the workflow controller logs that it picked up the supplied ConfigMap settings:

```text
level=info msg="Configuration:\nartifactRepository: {}\nexecutor:\n  args:\n  - --loglevel\n  - debug\n  - --gloglevel\n  - \"6\"\n
```

The log levels are specifically called out in the documentation, so it's very disappointing that the executor settings don't actually work.

@zamonia500 (Contributor, Author):

> You can configure this on your workflow:
>
> ```go
> Insecure *bool `json:"insecure,omitempty" protobuf:"varint,4,opt,name=insecure"`
> ```

@sarabala1979, is there any way to apply the insecure option globally, rather than for each workflow one by one?

@zamonia500 (Contributor, Author) commented Dec 3, 2023:

> I found the same issue trying to set the log level. It didn't matter what the executor ConfigMap setting was; the init and wait containers did not change.
>
> [...]
>
> The log levels are specifically called out in the documentation, so it's very disappointing that the executor settings don't actually work.

@philBrown, AFAIK the loglevel configuration follows the workflow-controller's config.
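A sketch of what that means in practice (assuming the standard workflow-controller Deployment layout): set the log level on the controller binary, and the executor containers inherit it via getExecutorLogLevel().

```yaml
# Sketch: the executor's --loglevel follows the controller's, so set it on
# the workflow-controller container itself (Deployment layout assumed).
spec:
  template:
    spec:
      containers:
        - name: workflow-controller
          args:
            - --loglevel
            - debug
```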

agilgur5 added the area/docs (incorrect, missing, or mistakes in docs) and P3 Low priority labels on Jan 27, 2024
@agilgur5 (Contributor):

> @sarabala1979, is there any way to apply the insecure option globally, rather than for each workflow one by one?

You should be able to set this in your artifactRepository config in the workflow-controller-configmap.
I see you had an empty insecure: in your config in the opening comment; did insecure: true not work?
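That is, based on the configmap from the opening comment, something like this fragment:

```yaml
# Sketch: the artifactRepository fragment from the opening comment, with the
# empty `insecure:` key set explicitly.
artifactRepository:
  s3:
    endpoint: minio.minio:443
    bucket: mlpipeline
    insecure: true
```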

agilgur5 added the solution/suggested and area/executor labels on Jan 27, 2024
@agilgur5 (Contributor) commented Jan 27, 2024:

> When I referred to the docs (workflow-controller-configmap.yaml), it seemed that customization of the executor container was supported up until some point.

I double-checked, and it looks like args were added with the PNS executor in #1214 (this line).
It seems they were unintentionally removed along with some deprecation removals in #8009 (released in v3.3): this line removed args, but that was not actually part of the intent of the PR or the issue, nor documented in the PR's upgrade notes, so it looks to have been an accident rather than intentional. And of course the docs still list args as available.

@sarabala1979 do you have any thoughts on adding this back? I'm inclined to do so since it seems to have been accidentally removed and therefore was a regression, rather than an explicit feature removal.

As a workaround, I believe you could use podSpecPatch to override args directly.
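An untested sketch of that workaround (podSpecPatch takes a strategic merge patch of the pod spec as a string; containers are matched by name, and init/wait are the executor containers):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-
spec:
  entrypoint: whalesay
  # Untested: args are appended after the generated executor command,
  # so later flags should win when argoexec parses them.
  podSpecPatch: |
    initContainers:
      - name: init
        args: ["--loglevel", "debug"]
    containers:
      - name: wait
        args: ["--loglevel", "debug"]
  templates:
    - name: whalesay
      container:
        image: docker/whalesay:latest
        command: [cowsay]
        args: ["hello world"]
```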

agilgur5 added the type/regression label and removed the area/docs label on Jan 27, 2024
agilgur5 self-assigned this on Jan 27, 2024
agilgur5 added the solution/workaround label on Jan 27, 2024
@zamonia500 (Contributor, Author):

> > @sarabala1979, is there any way to apply the insecure option globally, rather than for each workflow one by one?
>
> You should be able to set this in your artifactRepository config in the workflow-controller-configmap. I see you had an empty insecure: in your config in the opening comment; did insecure: true not work?

@agilgur5, oh! I missed that configmap field :(
Thank you for your guidance!

I have already worked around this issue by customizing the exec container, so it doesn't matter much to me whether this issue gets closed right now or not, but it seems the root cause of the args issue still exists. I agree with reverting the unintentionally removed line!

@agilgur5 (Contributor) commented Feb 2, 2024:

> @philBrown, AFAIK the loglevel configuration follows the workflow-controller's config.

This is correct.

> ...but it seems the root cause of the args issue still exists. I agree with reverting the unintentionally removed line!

Wrote up a PR with a regression test for this: #12609

agilgur5 changed the title from "Customization of executor containers is limited now. Is it intended?" to "Can't configure Executor args. Is it intended?" on Feb 5, 2024
agilgur5 added this to the v3.5.x patches milestone on Apr 3, 2024