ci: add a github action ci to test Rust std::autodiff #2430
| run: | | ||
| curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y | ||
| source ~/.cargo/env | ||
| rustup toolchain install nightly |
until Rust adds CI for Enzyme, I think this should be fixed to a specific commit
.github/workflows/enzyme-rust.yml
Outdated
| jobs: | ||
| rust-autodiff: | ||
| name: Rust Autodiff Tests LLVM ${{ matrix.llvm }} | ||
| runs-on: [self-hosted, Linux, X64, 32-core] |
this is not the correct setup for the runner, see https://github.com/EnzymeAD/Enzyme/blob/main/.github/workflows/enzyme-mlir.yml for an example
it needs to run on linux-x86-n2-32
| cmake ../enzyme -DLLVM_EXTERNAL_LIT=`which lit` -DCMAKE_BUILD_TYPE=Release -DLLVM_DIR=/usr/lib/llvm-${{ matrix.llvm }}/lib/cmake/llvm | ||
| make -j `nproc` LLVMEnzyme-${{ matrix.llvm }} | ||
| - name: Clone and configure Rust compiler | ||
| run: | |
this is never told to use the enzyme we built above?
.github/workflows/enzyme-rust.yml
Outdated
| source ~/.cargo/env | ||
| git clone --depth 1 --branch master https://github.com/rust-lang/rust.git | ||
| cd rust | ||
| ./configure --enable-llvm-link-shared --enable-llvm-plugins --enable-llvm-enzyme --release-channel=nightly --enable-llvm-assertions --enable-clang --enable-lld --enable-option-checking --enable-ninja --disable-docs |
we should also use ccache here, and cache intermediate artifacts
|
@wsmoses @ZuseZ4 #2430 (comment) Could you give me some hints, examples, suggestions, or something I should study? |
|
I have no experience with ccache and to me it sounds like something that should be added in a follow-up PR to avoid feature creep. But it looks like you've already added it in your last commit? If it works I guess it should stay. For the pinning of the version, I would assume we want to fix the rustc we build, and not necessarily the Rust version we use to build rustc. It's fine to do both, but right now you only pin the version used to build. You should also check out a specific commit of github.com/rust-lang/rust instead of just the master branch. This way, we can verify that new Enzyme versions don't break support. We would update the rustc commit every once in a while. This way, an update to rustc can't break Enzyme CI. I don't know what the relevant difference is between your .yml file and the MLIR example given, so I'd leave that for Billy to answer. |
.github/workflows/enzyme-rust.yml
Outdated
| echo "CC=ccache gcc" >> $GITHUB_ENV | ||
| echo "CXX=ccache g++" >> $GITHUB_ENV |
Who uses this? What exactly is built using GCC? I think Rust still uses LLVM. The rust invocation will prefix its invocations with ccache by itself as far as I can see, thanks to the --enable-ccache a few lines above.
.github/workflows/enzyme-rust.yml
Outdated
| run: | | ||
| . ~/.cargo/env | ||
| rm -rf build && mkdir build && cd build | ||
| RUST_LLVM_DIR=${GITHUB_WORKSPACE}/rust/build/x86_64-unknown-linux-gnu/llvm |
I would shell-quote this. GitHub probably has no space or special chars in the checkout dir, but better be safe.
I shell-quoted it in this commit!
use quotes
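A minimal sketch of the quoted form, for illustration only. The step name is hypothetical, and the `lib/cmake/llvm` suffix is an assumption modelled on the system LLVM path used in the earlier `cmake` invocation; it is not confirmed by this thread for the rustc build tree.

```yaml
- name: Build Enzyme against the rustc LLVM  # hypothetical step name
  run: |
    . ~/.cargo/env
    rm -rf build && mkdir build && cd build
    # Quoted so that a workspace path containing spaces or special
    # characters is still passed as a single word.
    RUST_LLVM_DIR="${GITHUB_WORKSPACE}/rust/build/x86_64-unknown-linux-gnu/llvm"
    # Assumed layout: the LLVM CMake package lives under lib/cmake/llvm.
    cmake ../enzyme -DCMAKE_BUILD_TYPE=Release -DLLVM_DIR="${RUST_LLVM_DIR}/lib/cmake/llvm"
```

Quoting both the assignment and the later expansion matters: an unquoted `${RUST_LLVM_DIR}` on the `cmake` line would undergo word splitting even if the assignment itself were quoted.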
.github/workflows/enzyme-rust.yml
Outdated
| run: | | ||
| curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y | ||
| . ~/.cargo/env | ||
| # Pin to specific nightly commit for reproducibility |
What is this used for? Rustc bootstrap? Might be worth it to add a comment saying what is now reproducible. Enzyme itself is built against the latest rust compiler (you check out the main branch of rust-lang/rust), so I imagine it won't really help there?
959f1ee to
f6c53f9
Compare
.github/workflows/enzyme-rust.yml
Outdated
| container: | ||
| image: ${{ (contains(matrix.os, 'linux') && 'ghcr.io/enzymead/reactant-docker-images@sha256:91e1edb7a7c869d5a70db06e417f22907be0e67ca86641d48adcea221fedc674' ) || '' }} | ||
| image: ${{ (contains(matrix.os, 'linux') && 'ghcr.io/enzymead/reactant-docker-images:main' ) || '' }} |
this should use the specific hash, not main
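The runner and container comments in this thread can be combined into a rough sketch like the one below. The runner label and the image digest are copied from the review comments and the diff above; the surrounding job layout (`fail-fast`, the single checkout step) is illustrative and not necessarily what was merged.

```yaml
jobs:
  rust-autodiff:
    name: Rust Autodiff Tests
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        # Runner label requested in review.
        os: [linux-x86-n2-32]
    container:
      # Pin the image by digest rather than the moving `main` tag, so a new
      # image push cannot silently change the CI environment.
      image: ${{ (contains(matrix.os, 'linux') && 'ghcr.io/enzymead/reactant-docker-images@sha256:91e1edb7a7c869d5a70db06e417f22907be0e67ca86641d48adcea221fedc674' ) || '' }}
    steps:
      - uses: actions/checkout@v4
```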
.github/workflows/enzyme-rust.yml
Outdated
| - name: Install dependencies | ||
| run: | | ||
| apt-get update | ||
| apt-get install -y binutils ninja-build cmake gcc g++ python3 python3-dev ccache nodejs npm |
do you need all these deps?
.github/workflows/enzyme-rust.yml
Outdated
| . ~/.cargo/env | ||
| git clone https://github.com/rust-lang/rust.git | ||
| cd rust | ||
| git checkout 51ff895062ba60a7cba53f57af928c3fb7b0f2f4 |
can you move the commit to the matrix? that'll make it easier to update, and it also helps if in the future we want to check out multiple commits and/or stable tags
ec70a9d to
2071224
Compare
|
This currently builds Enzyme twice, once as part of rustc and once standalone. |
.github/workflows/enzyme-rust.yml
Outdated
| - name: Clone and configure Rust compiler | ||
| run: | | ||
| . ~/.cargo/env | ||
| git clone https://github.com/rust-lang/rust.git |
you should use another actions/checkout for this [and can give it the relevant tag/commit]
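One way the two suggestions (commit in the matrix, second `actions/checkout`) might fit together is sketched below. The matrix key `rust_commit` is a hypothetical name; the hash is the one pinned in the diff earlier in this thread. `repository`, `ref`, and `path` are standard `actions/checkout` inputs.

```yaml
jobs:
  rust-autodiff:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [linux-x86-n2-32]
        # Hypothetical matrix key; add more commits or tags here to test
        # several rustc versions.
        rust_commit: ['51ff895062ba60a7cba53f57af928c3fb7b0f2f4']
    steps:
      - uses: actions/checkout@v4
        name: Checkout Enzyme
      - uses: actions/checkout@v4
        name: Checkout Rust compiler
        with:
          repository: rust-lang/rust
          ref: ${{ matrix.rust_commit }}
          # Placed next to the Enzyme sources, i.e. $GITHUB_WORKSPACE/rust.
          path: rust
```

With the commit in the matrix, bumping the tested rustc is a one-line change, and the job name can include `${{ matrix.rust_commit }}` if several commits are tested side by side.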
|
@wsmoses this looks ready? Can you run CI, I don't have permission |
eadd4f3 to
9e3b7aa
Compare
| --enable-clang \ | ||
| --enable-lld \ | ||
| --disable-optimize-llvm \ | ||
| --enable-option-checking \ |
we should use the github actions cache here, but let's see if it builds first
otherwise ccache doesn't do anything as the run is ephemeral
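A rough sketch of persisting the ccache directory across ephemeral runs with `actions/cache`. The `.ccache` location, the step names, and the key prefix are assumptions made for illustration; `CCACHE_DIR` is the environment variable ccache reads to locate its cache, and `path`, `key`, and `restore-keys` are standard `actions/cache` inputs.

```yaml
steps:
  # These steps are assumed to run after all checkout steps, because
  # actions/checkout cleans untracked files from the workspace by default.
  - name: Point ccache at a directory inside the workspace
    run: echo "CCACHE_DIR=${GITHUB_WORKSPACE}/.ccache" >> "$GITHUB_ENV"
  - name: Restore and save the ccache directory
    uses: actions/cache@v4
    with:
      # Relative paths are resolved against the workspace directory.
      path: .ccache
      # A unique key per commit, so every run saves an updated cache...
      key: ccache-rust-${{ runner.os }}-${{ github.sha }}
      # ...while restoring from the most recent earlier cache with this prefix.
      restore-keys: |
        ccache-rust-${{ runner.os }}-
```

The per-commit key combined with a prefix in `restore-keys` is a common pattern for compiler caches: an exact hit is rare, so each run starts from the latest previous cache and uploads a fresh one at the end.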
|
fails |
|
@sgasho Not sure if it's related, or needed but you can also try to do a submodule update for enzyme and commit it. |
|
726c421 to
79282f3
Compare
|
@sgasho Looks like you managed it! The last test is failing, but that one frequently changes the IR generated, not sure if it's because of rustc, LLVM, or Enzyme changing. I'll look into making it more robust, it's not like that feature (batching) is well tested in Rust atm anyway. From my side that sounds like it's good to merge? Also side note, we'll probably switch from linking against Enzyme to just dlopen'ing Enzyme, which should make this CI even simpler in the future: rust-lang/rust#146623 |
…youxu bless autodiff batching test This PR blesses a broken test and unblocks running Rust in the Enzyme CI: EnzymeAD/Enzyme#2430 Enzyme is the plugin used by our std::autodiff and (future) std::batching modules, both of which are not built by default. In the near future we also hope to enable std::autodiff in the Rust CI. This test is the only one to combine two features, automatic differentiation and batching/vectorization. This combination is even more experimental than either feature on its own. I have a WIP branch in which I enable more vectorization/batching and as part of that I'll think more about how to write those tests in a robust way (and likely change the interface). Until that lands, I don't care too much about what specific IR we generate here; it's just nice to track changes. r? compiler
|
Not sure about that part. |
|
I'm not sure why it's still not caching, but at this point it's probably easier to wait for my fix on the rustc side to land. |
|
@sgasho The PR landed. If you update again and use the new build command from that PR it shouldn't build LLVM anymore. I'd also remove ccache now, since it seems unreliable. We should be fast enough without it. |
|
I would still keep ccache, can you restore that? |
|
hmm....I couldn't reproduce this on my local machine using act. |
|
-- The C compiler identification is GNU 13.3.0
-- The CXX compiler identification is GNU 13.3.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
LLVM_SHLIBEXT=.so
LLVM ABS DIR /__w/Enzyme/Enzyme/rust/build/x86_64-unknown-linux-gnu/ci-llvm
CMAKE_PREFIX_PATH
CMake Error at CMakeLists.txt:74 (find_package):
Could not find a package configuration file provided by "LLVM" with any of
the following names:
LLVMConfig.cmake
llvm-config.cmake
Add the installation prefix of "LLVM" to CMAKE_PREFIX_PATH or set
"LLVM_DIR" to a directory containing one of the above files. If "LLVM"
provides a separate development package or SDK, be sure it has been
installed.
-- Configuring incomplete, errors occurred!
thread 'main' (2652) panicked at /root/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/cmake-0.1.54/src/lib.rs:1119:5:
command did not execute successfully, got: exit status: 1
build script failed, must exit now
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
finished in 0.289 seconds
Build completed unsuccessfully in 0:01:42
Error: Error: ScriptExecutorError when trying to execute: "Error: Job failed with exit code 1."
Error: Process completed with exit code 1.
Error: Executing the custom container implementation failed. Please contact your self hosted runner administrator. |
.github/workflows/enzyme-rust.yml
Outdated
| - uses: actions/checkout@v4 | ||
| name: Checkout Enzyme | ||
| - uses: actions/checkout@v4 |
| - uses: actions/checkout@v4 | |
| name: Checkout Enzyme | |
| - uses: actions/checkout@v4 | |
| - uses: actions/checkout@v5 | |
| name: Checkout Enzyme | |
| - uses: actions/checkout@v5 |
|
The second error looks familiar, https://rust-lang.zulipchat.com/#narrow/channel/182449-t-compiler.2Fhelp/topic/distribute.20LLVM.20CMake.20modules.3F/near/420114940 |
|
This PR could be a potential solution @sgasho @I-Al-Istannen. |
|
It looks like the CMake files alone are not enough, and people are not keen on shipping additional things just for one user, so Enzyme should just build its own LLVM. Even with a cache that barely has any cache hits, the last run took only 23 minutes once we remove clang and lld. That's already at the lower end of what the Julia runners need, so from a Rust point of view I'd recommend landing that and opening an issue for someone to investigate why ccache has a low hit rate, to make it even faster. |
|
What was the error message when cmake was shipped? |
|
rust-lang/rust#148027 (comment), and looking at the corresponding CMake file it would have looked for all other archives next (if I read cmake correctly). |
|
We just added Rust CI to our Enzyme fork at rust-lang/enzyme. @sgasho I'm not sure what's the difference and why the rust-lang/enzyme had a much higher cache hit rate, but maybe the PR helps you to figure out the difference? Can you try to update your pr (while still keeping the faster custom runners)? |
|
My guess is that you're missing the I have a bugfix in the rust fork for it, which is likely why CI there doesn't need it. Upstreaming unfortunately seems blocked on some julia failure, hopefully someone will pick it up. |
8ba359e to
bf4d02d
Compare
lgtm, @giordano can you also give this a once over
Looks good to me 👍
|
@wsmoses @giordano @ZuseZ4 @I-Al-Istannen |
|
Congrats on the merge and grit to keep at it :) |
|
Agree, thanks a lot for having the patience to work two months on it! Let's hope that we'll soon get ci (and thus nightly support) within rust as well. |



closes: #2429
related: rust-lang/rust#145899
result of act dry-run