[Discussion] Use Couler or not #2996

@typhoonzero

In the refactored code, we use Couler to generate an Argo workflow YAML file, which is then submitted to the Kubernetes cluster to run. A generated Couler program should look like this:

def step_entry_0():
    import runtime
    runtime.local.tensorflow.train(....)

couler.run_container(step_entry_0, ...)

def step_entry_1():
    import runtime
    runtime.db.exec(....)

couler.run_container(step_entry_1, ...)

Executing the above Couler program should generate a YAML file that embeds the Python code of each step shown above.
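To make "a YAML file with the step Python code in it" concrete, here is a minimal sketch that renders step sources into a sequential Argo Workflow manifest using only the standard library. The helper `render_workflow`, the step names, the `sqlflow-` name prefix, and the `python:3.9` image are all hypothetical; this is not Couler's actual output, only an illustration of the general shape of an Argo script template.

```python
import textwrap


def render_workflow(steps, image="python:3.9"):
    """Render (name, python_source) pairs as a sequential Argo Workflow manifest.

    Hypothetical helper: the image and the "sqlflow-" prefix are assumptions.
    """
    lines = [
        "apiVersion: argoproj.io/v1alpha1",
        "kind: Workflow",
        "metadata:",
        "  generateName: sqlflow-",
        "spec:",
        "  entrypoint: main",
        "  templates:",
        "  - name: main",
        "    steps:",
    ]
    # Each "- -" entry is its own group, so the steps run one after another.
    for name, _ in steps:
        lines.append(f"    - - name: {name}")
        lines.append(f"        template: {name}")
    # One script template per step, with the Python source embedded verbatim.
    for name, source in steps:
        body = textwrap.dedent(source).strip("\n")
        lines += [
            f"  - name: {name}",
            "    script:",
            f"      image: {image}",
            "      command: [python]",
            "      source: |",
            textwrap.indent(body, " " * 8),
        ]
    return "\n".join(lines) + "\n"


# Stand-in step bodies; real steps would call into the SQLFlow runtime.
steps = [
    ("step-0", "print('train a model')"),
    ("step-1", "print('run a SQL statement')"),
]
print(render_workflow(steps))
```

The sketch takes step code as source text because a code generator already produces text, so no function introspection is needed.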

  1. SQLFlow is a compiler that compiles a SQL program into a workflow YAML file. Currently, a code generator emits the above Couler program from the SQLFlow IR, and executing that Couler program produces the YAML. We could instead rewrite the code generator to emit the YAML directly, which would make the procedure simpler.

  2. Using Couler to generate YAML is hard to maintain. We use Couler to generate and submit the workflow, yet we still use Go to fetch the workflow status periodically. The SQLFlow compiler needs to maintain a Go-side workflow struct in order to do dependency analysis and other optimizations (e.g. Katib?), so there is no need to translate it to the Python side and implement YAML generation again with Python Couler.

  3. We need a local mode to simplify development and debugging. As a compiler, SQLFlow in local mode can directly generate a Python program with several step functions and call them one by one; in "workflow mode", SQLFlow can generate the YAML with the step functions directly.
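The local mode described in the last point could be sketched roughly as follows: generate one Python program containing a `step_entry_N` function per step, then call the functions in order within the current process. The helpers `generate_local_program` and `run_local` are hypothetical names, and the step bodies are stand-ins that only record a trace, since the real `runtime` calls are elided above.

```python
import textwrap


def generate_local_program(step_bodies):
    """Return Python source that defines one function per step and calls them in order."""
    parts = []
    for i, body in enumerate(step_bodies):
        # An empty body would be a syntax error, so fall back to "pass".
        body = textwrap.dedent(body).strip("\n") or "pass"
        parts.append(f"def step_entry_{i}():\n" + textwrap.indent(body, "    ") + "\n")
    calls = "\n".join(f"step_entry_{i}()" for i in range(len(step_bodies)))
    return "\n".join(parts) + "\n" + calls + "\n"


def run_local(step_bodies, env=None):
    """Execute the generated program in-process and return its namespace."""
    namespace = {} if env is None else env
    program = generate_local_program(step_bodies)
    exec(compile(program, "<sqlflow-local>", "exec"), namespace)
    return namespace


# Stand-in steps: each appends to a shared list instead of training or querying.
env = run_local(
    ["trace.append('train')", "trace.append('exec')"],
    {"trace": []},
)
print(env["trace"])
```

Because the step bodies are plain source text, the same inputs could be handed to a YAML emitter in "workflow mode" without changing the code generator's step output.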
