Volume-based artifact passing system #1349

Open
Ark-kun opened this issue Apr 30, 2019 · 14 comments
Labels
area/artifacts S3/GCP/OSS/Git/HDFS etc type/feature Feature request

Comments


Ark-kun commented Apr 30, 2019

Is this a BUG REPORT or FEATURE REQUEST?:
FEATURE REQUEST

I'd like to implement a feature that automatically mounts a single volume to the workflow pods so that data passing happens passively through that volume.

I'm working on implementing this feature and will submit a PR once it's ready.

It's possible to implement this on top of Argo today, but it might be nice to have it built in.
I've previously illustrated the proposal in #1227 (comment)

The main idea is to replace Argo's "active" way of passing artifacts (copying, packing, uploading/downloading/unpacking) with a passive system that has several advantages:

  • Much faster artifact storage I/O. No packing/unpacking. No copying files.
  • Artifact size is not limited by main/wait container disk sizes.

Syntax (unresolved):

# New syntax
artifactStorage: 
  volume: # Will automatically mount this volume to all Pods in a particular way
    persistentVolumeClaim:
      claimName: vol01

# The rest of the code is the usual artifact-passing syntax
templates:
  - name: producer
    outputs:
      artifacts:
      - name: out-art1
        path: /argo/outputs/out-art1/data
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["cowsay hello world > /argo/outputs/out-art1/data"]

  - name: consumer
    inputs:
      artifacts:
      - name: in-art1
        path: /argo/inputs/in-art1/data
    container:
      image: docker/whalesay:latest
      command: [cat, '/argo/inputs/in-art1/data']

  - name: main
    dag:
      tasks:
      - name: producer-task
        template: producer
      - name: consumer-task
        template: consumer
        arguments:
          artifacts:
          - name: in-art1
            from: "{{tasks.producer-task.outputs.artifacts.out-art1}}"

Transformed spec:

volumes:
  - name: argo-storage
    persistentVolumeClaim:
      claimName: vol01
templates:
  - name: producer
    outputs:
      parameters:
      - name: out-art1-subpath
        value: "{{workflow.uid}}/{{pod.name}}/out-art1/"
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["cowsay hello world > /argo/outputs/out-art1/data"]
      volumeMounts:
        - name: argo-storage
          mountPath: /argo/outputs/out-art1/
          subPath: "{{workflow.uid}}/{{pod.name}}/out-art1/"

  - name: consumer
    inputs:
      parameters:
      - name: in-art1-subpath
    container:
      image: docker/whalesay:latest
      command: [cat, '/argo/inputs/in-art1/data']
      volumeMounts:
        - name: argo-storage
          mountPath: /argo/inputs/in-art1/
          subPath: "{{input.parameters.in-art1-subpath}}"
          readOnly: true

  - name: main
    dag:
      tasks:
      - name: producer-task
        template: producer
      - name: consumer-task
        template: consumer
        arguments:
          parameters:
          - name: in-art1-subpath
            value: "{{tasks.producer-task.outputs.parameters.out-art1-subpath}}"

This system becomes even better when combined with the #1329, #1348 and #1300 features.

@JoshRagem

There is a serious issue with this approach on AWS EBS volumes: the volumes will fail to attach and/or mount once you have two or more pods on different nodes. If your proposal could be extended with an option to prefer scheduling pods onto nodes that already have the volume attached (when allowed by resource requests), that might reduce the errors.
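
One crude way to approximate that suggestion (a sketch only, not part of the proposal; the node label is hypothetical) is to pin every pod of the workflow to a single pre-labelled node via the workflow-level nodeSelector, so that a ReadWriteOnce EBS-backed volume only ever needs to attach to one node:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: volume-passing-
spec:
  entrypoint: main
  nodeSelector:
    workflow-storage-node: "true"  # hypothetical label applied to the node holding the volume
  volumes:
    - name: argo-storage
      persistentVolumeClaim:
        claimName: vol01           # ReadWriteOnce EBS-backed PVC
  # templates as in the transformed spec above

This trades cross-node parallelism for attach reliability; a ReadWriteMany-capable volume would avoid the trade-off entirely.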


Ark-kun commented Sep 3, 2019

Is it true that AWS does not support any multi-write volume types that work for any set of pods?


Ark-kun commented Oct 16, 2019

Here is a draft rewriter script: https://github.com/Ark-kun/pipelines/blob/SDK---Compiler---Added-support-for-volume-based-data-passing/sdk/python/kfp/compiler/_data_passing_using_volume.py
It can be run as a command-line program to rewrite an Argo Workflow from artifact-based to volume-based data passing.

What does everyone think?

@danxmoran

Hi @Ark-kun, I'm exploring Argo for a use-case where I want to:

  1. Query metadata about a file in external storage (e.g. FTP), outputting its size
  2. Dynamically generate a volume big enough to store the downloaded file
  3. Download the file to the volume
  4. Mount the volume in a separate task, for processing

Would this proposal support the step-level (or template-level) dynamic volume sizing that I'd need to implement this flow?
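
For step 2, one possible shape (a minimal sketch, assuming the metadata step emits the required size as an output parameter; all names are hypothetical) is a resource template that creates the PVC from a parameterized manifest:

  - name: create-scratch-volume
    inputs:
      parameters:
      - name: size             # e.g. "50Gi", taken from the metadata-query step's output
    resource:
      action: create
      manifest: |
        apiVersion: v1
        kind: PersistentVolumeClaim
        metadata:
          name: scratch-{{workflow.uid}}
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: "{{inputs.parameters.size}}"

The download and processing templates would then mount that claim; deleting it afterwards is discussed later in the thread.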


Ark-kun commented Nov 2, 2019

Per-step or per-artifact volumes could technically be implemented as another rewriting layer on top of the one in this issue. (My rewriter script would make that easier: you would just need to change subPaths to volume names; see the sketch below.)

This issue is more geared towards centralized data storage though.
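
For concreteness, a rough sketch of what such an extra layer might emit (all names hypothetical): each artifact gets its own PVC, and the consumer mount references that volume by name instead of using a subPath on a shared volume:

volumes:
  - name: out-art1-volume
    persistentVolumeClaim:
      claimName: vol-producer-out-art1   # one PVC per artifact (hypothetical)
templates:
  - name: consumer
    container:
      image: docker/whalesay:latest
      command: [cat, '/argo/inputs/in-art1/data']
      volumeMounts:
        - name: out-art1-volume          # volume name instead of a subPath
          mountPath: /argo/inputs/in-art1/
          readOnly: true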


alexec commented May 12, 2020

Could you use PVCs for this?


Ark-kun commented May 13, 2020

Could you use PVCs for this?

If this is a question for me, then yes: the proposed feature and the implementation script are volume-agnostic. Any volume can be used, and most users will probably specify a PVC, even if only as a layer of indirection.
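
To illustrate the volume-agnostic point (a sketch against the unresolved syntax above; the NFS server and export path are hypothetical), any Kubernetes volume source could be dropped in where the PVC was:

artifactStorage:
  volume:
    nfs:
      server: nfs.example.internal    # hypothetical
      path: /exports/argo-artifacts   # hypothetical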

@BlackRider97

Is there any issue if I use Azure Files as the persistent volume for Argo? It provides concurrent access to the volume, which is the limitation with EBS.
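
For reference, the kind of claim that comment describes might look like this (a sketch; the storage class name depends on the cluster's Azure Files provisioner and should be treated as an assumption):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vol01
spec:
  accessModes: ["ReadWriteMany"]    # Azure Files allows concurrent access from multiple nodes
  storageClassName: azurefile       # or azurefile-csi, depending on the cluster
  resources:
    requests:
      storage: 100Gi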


hadim commented Jun 8, 2020

@Ark-kun are you still planning to implement this feature? The lifecycle of the artifacts in Argo could be an issue for us as it involves a lot of copying/downloading/uploading.


hadim commented Jun 8, 2020

Also, how would you automatically remove the PVC at the end of the workflow? A typical workflow for us would be:

  • set up a PVC
  • get some data from S3
  • step1: use data from S3 and generate new data on the PVC
  • step2: use data from step1 and generate new data on the PVC
  • etc...
  • upload the data generated by the last step to S3
  • delete the PVC
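
One way to get that create-then-delete lifecycle without managing the PVC by hand (a sketch of one option, not necessarily what the author has in mind) is Argo's volumeClaimTemplates, which create a claim per workflow and garbage-collect it when the workflow finishes:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: s3-pvc-s3-
spec:
  entrypoint: main
  volumeClaimTemplates:
    - metadata:
        name: workdir               # referenced by this name in each step's volumeMounts
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 20Gi
  # templates: download from S3 -> step1 -> step2 -> ... -> upload to S3,
  # each mounting the "workdir" claim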


rmgogogo commented Jun 15, 2020

Any reason not to read/write S3 directly?
Is it because the library doesn't support the S3 interface?


hadim commented Jun 15, 2020

We don't want to upload/download our data at each step, for performance reasons. Using a PVC solves this. We only use artifacts for the first and last steps.
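
As a sketch of that pattern (bucket, key, and secret names are hypothetical), the first step can pull its input through a regular S3 input artifact while writing its output onto the shared volume, so the intermediate steps never touch S3:

  - name: ingest
    inputs:
      artifacts:
      - name: raw-data
        path: /tmp/raw-data             # fetched from S3 by Argo before the container starts
        s3:
          endpoint: s3.amazonaws.com
          bucket: my-bucket             # hypothetical
          key: datasets/input.csv       # hypothetical
          accessKeySecret:
            name: s3-creds              # hypothetical secret
            key: accessKey
          secretKeySecret:
            name: s3-creds
            key: secretKey
    container:
      image: alpine:3.12
      command: [sh, -c]
      args: ["cp /tmp/raw-data /argo/outputs/data"]   # output lands on the mounted volume
      volumeMounts:
        - name: argo-storage
          mountPath: /argo/outputs/
          subPath: "{{workflow.uid}}/{{pod.name}}/"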


Ark-kun commented Jun 16, 2020

@Ark-kun are you still planning to implement this feature?

I've implemented this feature back in October 2019 as a separate script which can transform a subset of DAG-based workflows:

Here is a draft rewriter script: https://github.com/Ark-kun/pipelines/blob/SDK---Compiler---Added-support-for-volume-based-data-passing/sdk/python/kfp/compiler/_data_passing_using_volume.py
It can be run as a command-line program to rewrite an Argo Workflow from artifact-based to volume-based data passing.

I wonder whether we need to add it to the Argo controller itself (as it can just be used as a preprocessor). WDYT?

Also, how would you automatically remove the PVC at the end of the workflow?

It should be possible to do this using an exit handler and resource templates.
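
A minimal sketch of that exit-handler approach (the claim name is hypothetical): the onExit template runs whether the workflow succeeds or fails, and a resource template deletes the claim:

spec:
  entrypoint: main
  onExit: delete-volume
templates:
  # ... main templates ...
  - name: delete-volume
    resource:
      action: delete
      manifest: |
        apiVersion: v1
        kind: PersistentVolumeClaim
        metadata:
          name: vol01               # the claim created earlier in the workflow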

My main scenario requires the volume to persist between restarts. That allows implementing intermediate data caching, so that when you run a modified pipeline it can skip already-computed parts instead of re-running every step. (There probably needs to be some garbage-collection system that deletes expired data.)


alexec commented Jan 18, 2021

Relates to #4130 and #2551
