Arc-Lang: Version 1 #471

Open
wants to merge 1 commit into
base: master

segeljakt
Member

This is a WIP PR for version 1 of Arc-Lang. It includes a rewrite of Arc-Lang's compiler and runtime system. The system architecture now looks like this:

Orchestration  Client <--> Coordinator <--> Worker
    Layer      (Rust)        (Rust)         (Rust)
                 ^             ^              ^
                 |             |              |
                 v             v              v
  Execution   Arc-Lang      Arc-MLIR       Dataflow
    Layer     Compiler      Compiler       Executor
              (OCaml)        (C++)          (Rust)

The workflow is as follows:

Users write Arc-Lang programs on their local machine and execute them using a client program. The client communicates with a local Arc-Lang process using pipes to compile and interpret the program. During interpretation, the interpreter may construct dataflow graphs that should be executed on a distributed system. These logical dataflow graphs and their UDFs are holistically represented as Arc-MLIR code and sent to the client process using standard output. The client process forwards the code to a coordinator process using TCP. The coordinator process uses an Arc-MLIR compiler to compile the MLIR source into a Rust module, representing the UDFs, and a JSON object, representing an optimised logical dataflow graph. The coordinator then compiles the logical graph into a physical graph mapped to specific hardware threads and sockets in a cluster of workers.
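
The last step, turning a logical graph into a physical graph mapped to worker threads and sockets, could look roughly like the sketch below. Everything in it is an assumption made for illustration: the types `LogicalOperator`, `Worker` and `PhysicalInstance`, the function `place`, and the round-robin policy are hypothetical and are not taken from the Arc-Lang coordinator.

```rust
// Hypothetical sketch: none of these types or names come from the Arc-Lang code base.

/// One operator of the logical dataflow graph with its desired parallelism.
#[derive(Debug, Clone)]
struct LogicalOperator {
    name: String,
    parallelism: usize,
}

/// A worker process, identified by a socket address and a hardware thread count.
#[derive(Debug, Clone)]
struct Worker {
    addr: String,
    threads: usize,
}

/// One instance of an operator pinned to a specific thread on a specific worker.
#[derive(Debug, Clone, PartialEq)]
struct PhysicalInstance {
    operator: String,
    instance: usize,
    worker_addr: String,
    thread: usize,
}

/// Expands every logical operator into `parallelism` instances and assigns them
/// round-robin over all (worker, thread) slots. Round-robin is an assumed policy.
fn place(logical: &[LogicalOperator], workers: &[Worker]) -> Vec<PhysicalInstance> {
    // Enumerate every available (worker address, thread index) slot.
    let slots: Vec<(&str, usize)> = workers
        .iter()
        .flat_map(|w| (0..w.threads).map(move |t| (w.addr.as_str(), t)))
        .collect();
    let mut physical = Vec::new();
    if slots.is_empty() {
        // Nothing can be placed without at least one hardware thread.
        return physical;
    }
    let mut next = 0;
    for op in logical {
        for instance in 0..op.parallelism {
            let (addr, thread) = slots[next % slots.len()];
            physical.push(PhysicalInstance {
                operator: op.name.clone(),
                instance,
                worker_addr: addr.to_string(),
                thread,
            });
            next += 1;
        }
    }
    physical
}

fn main() {
    let logical = vec![
        LogicalOperator { name: "source".into(), parallelism: 1 },
        LogicalOperator { name: "map".into(), parallelism: 2 },
    ];
    let workers = vec![
        Worker { addr: "10.0.0.1:9000".into(), threads: 2 },
        Worker { addr: "10.0.0.2:9000".into(), threads: 1 },
    ];
    for p in place(&logical, &workers) {
        println!("{:?}", p);
    }
}
```

A real placement would presumably also consider data locality and operator chaining; the sketch only shows the shape of the logical-to-physical mapping.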

@segeljakt
Member Author

segeljakt commented Aug 16, 2023

Changes needed in MLIR:

  • Changed the following in the Rust dialect:

    • Renamed i32 to I32, bool to Bool, etc. This is so we don't need to worry about being compatible with Rust's function signatures.
    • Replaced literals with constructor calls: 1 becomes I32::new(1), a bool value b becomes Bool::new(b), etc. These functions are still const, so they can be called in const contexts.
    • Removed declare. This was previously needed for passing functions as data. I removed it since it resulted in a lot of complexity inside the runtime system without too much value.
    • Removed task. Instead of allowing tasks to be generated by the runtime, I think it is best to solve this problem in a different way, for example by introducing a new dataflow operator that can take a task as input.
    • Deprecated async, spawn, receive, send. I think it will become easier if we only generate synchronous code. I think this code was needed for "nonpersistent tasks".
    • Renamed #[rewrite] to #[data]. #[data] is only needed on struct and enum declarations to derive boilerplate. The macro no longer inserts things like Rc<T>, which caused some complexity.
    • Removed the new!(..), access!(..), enwrap!(..) and val!(..) macros; the generated code now uses Rust's syntax directly. What these operations do is now less complicated since we represent data as it appears, without rewriting it in strange ways.
    • Replaced is!(x::y, z) with Rust's standard matches!(z, x::y(_)).
    • Introduced more explicit syntax for operations, e.g. a + b can be written I32::add(a, b). I think it could be good to use this syntax but it is not crucial since I overloaded the operators. It is for example possible to do I32::new(1) + I32::new(2).
  • Other:

    • Cleaned up some code in the arc-mlir directory. Primarily:
      • Renamed *_SRC_DIR to *_DIR
      • Renamed ARC_MLIR_SOURCE_DIR to ARC_MLIR_SRC_DIR
      • Inlined the arc-lang/etc/Cargo.toml.template into the *.in files since it was very small.
      • Added some small comments in the CMake files.
  • Unsolved problems:

    • The arc.in file is outdated. I am unsure if we should keep it or not. Personally I would prefer if you could run arc-lang by just running the arc-lang binary.
    • It was not possible to overload && and || since these have special short-circuiting behaviour, so I used & and | instead, e.g., Bool::new(true) & Bool::new(false). They inline directly to && and ||, so it should not affect performance, but we might need to update the codegen output.
    • CI is not working correctly. I can fix it once tests pass locally.
  • Thoughts:

    • Arc-MLIR generates a Rust-crate module while Arc-Lang produces a Rust-crate with a main function. I think this is fine if we assume Arc-MLIR will pass a JSON of a dataflow graph to Arc-Lang, which can be turned into a main function.
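
As a rough illustration of the Rust dialect changes listed above, the wrapper types could be shaped like the sketch below. This is a minimal sketch under assumptions, not the actual runtime: the newtype layout of `I32` and `Bool`, the `Shape` enum, and the `is_circle` helper are hypothetical, and plain `#[derive(..)]` attributes stand in for `#[data]`, whose expansion is not shown in this PR.

```rust
use std::ops::{Add, BitAnd, BitOr};

// Hypothetical sketch of the wrapper types; the real runtime may differ.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct I32(i32);

impl I32 {
    /// `const fn`, so it can be called in const contexts.
    const fn new(v: i32) -> Self {
        I32(v)
    }
    /// Explicit form of `a + b`.
    fn add(a: I32, b: I32) -> I32 {
        I32(a.0 + b.0)
    }
}

/// Operator overload, so `I32::new(1) + I32::new(2)` also works.
impl Add for I32 {
    type Output = I32;
    fn add(self, rhs: I32) -> I32 {
        I32(self.0 + rhs.0)
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Bool(bool);

impl Bool {
    const fn new(v: bool) -> Self {
        Bool(v)
    }
}

/// `&&` cannot be overloaded, so `&` is used instead. Both operands are
/// already evaluated when this is called, so nothing is short-circuited here.
impl BitAnd for Bool {
    type Output = Bool;
    #[inline]
    fn bitand(self, rhs: Bool) -> Bool {
        Bool(self.0 && rhs.0)
    }
}

/// `||` cannot be overloaded either, so `|` is used instead.
impl BitOr for Bool {
    type Output = Bool;
    #[inline]
    fn bitor(self, rhs: Bool) -> Bool {
        Bool(self.0 || rhs.0)
    }
}

/// The constructor is usable in a const context.
const ONE: I32 = I32::new(1);

/// Plain derives stand in for `#[data]` here (assumption).
#[derive(Debug, Clone, PartialEq)]
enum Shape {
    Circle(I32),
    Square(I32),
}

/// Variant test written with `matches!` rather than a custom `is!` macro.
fn is_circle(s: &Shape) -> bool {
    matches!(s, Shape::Circle(_))
}

fn main() {
    println!("{:?}", ONE + I32::new(2));
    println!("{:?}", I32::add(ONE, I32::new(2)));
    println!("{:?}", Bool::new(true) & Bool::new(false));
    println!("{:?}", Bool::new(true) | Bool::new(false));
    println!("{}", is_circle(&Shape::Circle(ONE)));
    println!("{}", is_circle(&Shape::Square(ONE)));
}
```

Because the right-hand operand of `&` and `|` is always evaluated, generated code that relied on short-circuiting to skip a side effect would behave differently, which may be relevant to the codegen update mentioned under the unsolved problems.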

segeljakt force-pushed the klas/v1 branch 2 times, most recently from 67d8bf8 to 855c1de (August 18, 2023 09:23)
os << ", ";
os << structFields[i].first.getValue();
os << " : ";
os << structFields[i].first.getValue() << " : ";
Member Author

Oops, accidental bug here. Forgot ","
