Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add remote client that directly utilizes the flyteidl-rust Python bindings #2536

Draft
wants to merge 9 commits into
base: flyrs-v2
Choose a base branch
from

Conversation

austin362667
Copy link
Contributor

@austin362667 austin362667 commented Jun 27, 2024

Tracking issue

As a result of the initiative, issues, discussions and serial trials and errors in flyrs branch 1 2 this PR aims to create another FlyteRemote inremote/remote_rs.py which wrapping flyteidl-rust1 as its core. Python package, flyteidl-rust, was built directly from Rust. With the helps of works in flyte/flyteidl/src/lib.rs, this PR eliminates the protobuf objects serialization overhead2 while calling gRPC services. Unlike the original approach in flyrs, this method optimizes performance and reduces the maintenance cost.

  • Use case ( can work ):
    from flytekit.remote.remote_rs import FlyteRemote
    
    def test_fetch_execute_sync_string_task(register):
        remote = FlyteRemote(Config.auto(config_file=CONFIG), PROJECT, DOMAIN)
        flyte_task = remote.fetch_task(name="basics.echo_string_task.test_echo_string_task", version=VERSION)
        # `wait` to sync execution
        execution = remote.execute(flyte_task, inputs={"s": "hey"}, wait=True)
        assert execution.is_done
        assert execution.outputs["o0"] == "hey"

Why are the changes needed?

With flyteidl-rust Python bindings, Flytekit can get rid of not only grpc but also protobuf Python dependencies, ultimately removing the model layer between protobuf and remote entities. This approach leverages Rust structures directly.

For Rust side, developers still can wrap flyte/flytidl as their remote gRPC clients or use its API directly.

For Python, developers can build another remote client by wrapping flyteidl-rust as its core.

What changes were proposed in this pull request?

To this end, we have to replace all types import from model files to use flyteidl-rust package instead. Including,

  1. Modify remote entities to inherit from flyteidl-rust classes instead of model files.

  2. Refactor type transoformers in core/type_engine.py and get_serializable() in translator.py to adapt with flyteidl-rust classes.

    • get_serializable() in tools/translator.py
      • ReferenceEntity
      • PythonTask
      • WorkflowBase
      • Node
      • LaunchPlan
      • BranchNode
      • FlyteTask
    • SimpleTransformer in core/type_engine.py
      • int
      • float
      • bool
      • str
      • datetime
      • timedelta
      • date
      • none
    • Other types' Transformer in core/type_engine.py

How was this patch tested?

Warning

For now, we havn't migrate the test cases to use flyteidl-rust and other type transformation still works in progress, so the CI must fail, don't run.

FlyteRemote only works in Task register / fetch / execute strictly with primitive python types, Workflow and LaunchPlan still works in progress.

Setup process

  1. Install flyteidl-rust
    pip install -i https://test.pypi.org/simple/ flyteidl-rust
    Screenshot 2024-06-27 at 7 11 54 PM

  2. Run the tasks which only using string, integer, float as input or output. For example,

    • pyflyte register ./echo_string_task.py and run on web console, or
    • pyflyte run --remote ./echo_string_task.py test_echo_string_task --s "hey"

Screenshots

  • pyflyte register echo_int and echo_float task.
    Screenshot 2024-06-27 at 7 16 38 PM
  • re-run echo_string task from console.
    Screenshot 2024-06-27 at 7 15 25 PM
  • pyflyte run --remote
      from flytekit import task
      
      @task()
      def test_square_task(n: float) -> float:
          return n**2
    Screenshot 2024-06-27 at 7 24 05 PM Screenshot 2024-06-27 at 7 24 40 PM

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

Cross References

  1. https://github.com/pydantic/pydantic-core
  2. https://github.com/pola-rs/pyo3-polars/tree/main/pyo3-polars

Footnotes

  1. Temporarily registered in PyPI test index as test package that will not affect the real PyPI registry in production.

  2. In previous PRs, it serialize the python types into bytes string and pass into PyO3-exposed Rust functions.

Signed-off-by: Austin Liu <[email protected]>

refactor remote entities into `flyteidl-rust`

Signed-off-by: Austin Liu <[email protected]>
Signed-off-by: Austin Liu <[email protected]>

refactor `pyflyte` interaction into `flyteidl-rust`

Signed-off-by: Austin Liu <[email protected]>
Signed-off-by: Austin Liu <[email protected]>
@austin362667
Copy link
Contributor Author

austin362667 commented Jun 27, 2024

Thanks, Ketan, for reminding me not to compromise on the earlier serialization approach, and thanks to Kevin and Troy for sharing helpful experiences as they are developing the rust-fs project. I also appreciate others’ insightful discussions.

@austin362667
Copy link
Contributor Author

austin362667 commented Jun 27, 2024

There are several implementation details need to be discussed (in regards to the developer experience):

  1. Enum comparison as integer or as string or as enum itself via different magic function.

    Since we've add __repr__ magic function for rust enum type to show as int string, we can use int(str(literal_type.type)) to compare literal types. Or we can just leverage Rust Prost's as_str_name() function by exposing it to python binding through PyO3. Or we can add compare operator mentioned in https://pyo3.rs/v0.22.0/class/protocols .

  2. The positional arguments of tuple variants in complex enum.

    Some Rust code generated enum variants comes in tuple form, like

    pub struct Literal {
          pub value: ::core::option::Option<literal::Value>, 
    }
    
    /// Nested message and enum types in `Literal`.
    pub mod literal {
        #[::pyo3_macro::with_pyclass]
        #[derive(::pyo3_macro::WithNew)]
        #[allow(clippy::derive_partial_eq_without_eq)]
        #[derive(Clone, PartialEq, ::prost::Oneof)]
        pub enum Value {
            /// A simple value.
            #[prost(message, tag = "1")]
            Scalar(::prost::alloc::boxed::Box<super::Scalar>),
            /// A collection of literals to allow nesting.
            #[prost(message, tag = "2")]
            Collection(super::LiteralCollection),
            /// A map of strings to literals.
            #[prost(message, tag = "3")]
            Map(super::LiteralMap),
        }
    }

    So the usage in Python will be like,

    lit = flyteidl.core.Literal(value=flyteidl.literal.Value.Scalar(flyteidl.core.Scalar(flyteidl.scalar.Value.Primitive(flyteidl.core.Primitive(flyteidl.primitive.Value.StringValue("x"))))), hash="", metadata={})

    , which has no named arguments. It's not that trivia to use:

     lit.value[0].value[0].value[0]
    

    image

@austin362667 austin362667 changed the title Add another Rust remote client that directly utilizes the flyteidl-rust Python bindings Add remote client that directly utilizes the flyteidl-rust Python bindings Jun 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant