[ci] Organizing TheRock build_linux_packages #782
Conversation
| repository: ${{ github.repository }} | ||
| ref: refs/pull/${{ github.event.pull_request.number }}/merge | ||
| sparse-checkout: | | ||
| .github | ||
| ${{ inputs.subtree }} | ||
| fetch-depth: 0 # Needed for subtree splitting | ||
| token: ${{ steps.generate-token.outputs.token }} | ||
| path: "rocm-libraries" |
There was a problem hiding this comment.
Too much magic in here, unrelated to "building Linux packages". Please elaborate on what is broken and what other approaches you have considered to fix it.
There was a problem hiding this comment.
I agree. While having a sparse checkout is definitely an improvement it is unclear to me why a token needs to be created. This at needs least some more documentation.
With regards to
This update in build_linux_packages.yml allows forked PRs to be able to checkout correctly
is it worth fixing the checkout in forks (why should this be problematic?) as a build still requires to run on azure-linux-scale-rocm, thus the CI wont work in forks anyway and might only work if sending a PR.
There was a problem hiding this comment.
See also my feedback at #594 (comment) .
I'd like to keep the workflows as simple and reusable as possible. All this "if triggered by this other github repository then do this, else do that" logic is tightly coupling two repositories and making the workflow harder to understand and edit. We would also need to make the same changes to any other workflow that we want to trigger from the monorepo, like building and testing Windows packages. We'll also have more repositories that we want to test changes for (systems monorepo, profiler repo, etc.), so this approach of modifying a checkout step (in yml, not even in a reusable python script) won't scale.
If we instead created branches in TheRock (or a fork of TheRock) with all submodule modifications necessary, we'd be able to reuse any workflow for free.
There was a problem hiding this comment.
Another option could be putting source fetching and repository/fork checking out in a reusable workflow that we include in each of our workflows.
There was a problem hiding this comment.
Anything local to a workflow (and not persisted in a branch or at least locally reproducible via scripts) will be harder to debug and work with outside of CI workflows. Such a reusable workflow or script could itself create a branch (possibly in a fork to keep the main repository tidier, or we could just accept the large number of branches here).
There was a problem hiding this comment.
Ah good call out. Makes sense, especially about azure-linux-scale-rocm. I think I'll go with either the forked approach or a reusable workflow, and leaning more towards the reusable workflow! I'll make some adjustments!!!
| name: "Artifact Upload" | ||
|
|
||
| inputs: | ||
| build_directory: | ||
| type: string | ||
| amdgpu_families: | ||
| type: string |
There was a problem hiding this comment.
I have some conflicting feelings about adding this and build_linux_action as composite actions. My original suggestion was to extract the repository setup into an action and not the building.
Pros:
- It keeps the build workflows more focused on the high level operations needed.
- It allows us to share these steps with multiple workflow files.
Cons:
- Reuse is possible by moving logic from yml into scripts. Ideally the scripts are integrated into the build scripts or infrastructure servers themselves and run seamlessly. I'm not really seeing the value in having a different repository reuse github actions. The code should be simple enough that actions can be easily reimplemented.
- Splitting workflows across multiple files makes it harder to see what they are doing both for CI/CD maintenance and for developers wanting to reproduce what the CI does.
- The logs on https://github.com/ROCm/TheRock/actions/runs/15614427714/job/43986995984?pr=782 look pretty clean, but the individual steps under each workflow are now folded together. I like being able to see "fetch sources", "CMake configure", and "CMake build" separately with their own timing, for example.
I'm happy to unblock the monorepo testing work, so I can put aside some of the concerns. We can remove code from these files as they move to scripts or get more naturally integrated into the build system, then eventually maybe move back into inlined workflow steps to get back the benefits of having fine-grained logging and timing.
There was a problem hiding this comment.
I was going to follow your initial suggestion, but it felt weird to include checkout steps for rocm-libraries when TheRock never even uses it. For me, it felt better to keep build_linux_packages as a reusable workflow.
I agree that it does bunch it up :( so this makes debugging a bit more difficult
In that case, I'll actually move these to scripts (s3 artifact upload practically does this), so it can be reusable, better visibility in steps and easily implemented
There was a problem hiding this comment.
Honestly... the more I am working on this, I think it may be best to just re-create this yml file inside of rocm-libraries, and the only new custom step would be checking out rocm-libraries.
But, this is still good organization and cleanup, so I will keep this PR
There was a problem hiding this comment.
Yeah, it's good to explore the convince ourselves that the direction makes sense. This is also highlighting some of the areas that should be streamlined regardless of which repository the code is running from (or running locally).
| - name: Create Logs index Files | ||
| shell: bash | ||
| if: always() | ||
| run: | | ||
| python3 build_tools/github_actions/create_log_index.py \ | ||
| --build-dir=${{ inputs.build_directory }} \ | ||
| --amdgpu-family=${{ inputs.amdgpu_families }} |
There was a problem hiding this comment.
(Not in this PR, but relating to the positioning of this as a composite workflow)
This should be moved to server-side
| - name: Upload Logs | ||
| shell: bash | ||
| if: always() | ||
| run: | | ||
| python3 build_tools/github_actions/upload_build_logs_to_s3.py \ | ||
| --build-dir=${{ inputs.build_directory }} \ | ||
| --run-id ${{ github.run_id }} \ | ||
| --amdgpu-family ${{ inputs.amdgpu_families }} |
There was a problem hiding this comment.
(Not in this PR, but relating to the positioning of this as a composite workflow)
This should be moved to teatime.py: #648
| - name: Report | ||
| shell: bash | ||
| if: ${{ !cancelled() }} | ||
| run: | | ||
| echo "Full SDK du:" | ||
| echo "------------" | ||
| du -h -d 1 build/dist/rocm | ||
| echo "Artifact Archives:" | ||
| echo "------------------" | ||
| ls -lh build/artifacts/*.tar.xz | ||
| echo "Artifacts:" | ||
| echo "----------" | ||
| du -h -d 1 build/artifacts | ||
| echo "CCache Stats:" | ||
| echo "-------------" | ||
| ccache -s |
There was a problem hiding this comment.
(Not in this PR, but relating to the positioning of this as a composite workflow)
This should be moved to a script - we can run a "check environment before building" script for #762 and a "check environment after building" script here
| def build_linux_configure(): | ||
| logging.info(f"Building package {package_version}") | ||
| cmd = f""" | ||
| cmake -B build -GNinja . | ||
| -DCMAKE_C_COMPILER_LAUNCHER=ccache -DCMAKE_CXX_COMPILER_LAUNCHER=ccache | ||
| -DTHEROCK_AMDGPU_FAMILIES={amdgpu_families} | ||
| -DTHEROCK_PACKAGE_VERSION="{package_version}" | ||
| -DTHEROCK_VERBOSE=ON | ||
| -DBUILD_TESTING=ON | ||
| {extra_cmake_options} | ||
| """ | ||
| subprocess.run(cmd, cwd=THEROCK_DIR, check=True) |
There was a problem hiding this comment.
This does make some sense to put in a script, given the number of options.... but we could also use a CMake preset instead. Can even use environment variables in them: https://cmake.org/cmake/help/latest/manual/cmake-presets.7.html.
https://github.com/ROCm/TheRock/blob/main/CMakePresets.json
ScottTodd
left a comment
There was a problem hiding this comment.
Good enough for now. I've carried similar build.sh scripts that apply CMake options before (e.g. https://github.com/iree-org/iree/blob/main/build_tools/cmake/build_all.sh) and they're okay... CMake presets is one modern way to share such settings, but it is awkward in a few ways.
|
Can you update the PR title and description before merging? |
build_linux_packages
marbre
left a comment
There was a problem hiding this comment.
With the new scope, this PR is fine to land. Thanks.
Currently in
rocm-libraries, we usebuild_linux_packages.ymlto build sub-projects of the monorepo.Instead, we are removing the
rocm-librariesdependency, and lettingrocm-librariespull TheRock, and use specific scripts. Doing some organization as well, moving "cmake configure" into a script.