Skip to content

[ci] Organizing TheRock build_linux_packages #782

Merged
geomin12 merged 18 commits into
mainfrom
users/geomin12/fix-monorepo-fork
Jun 16, 2025
Merged

[ci] Organizing TheRock build_linux_packages #782
geomin12 merged 18 commits into
mainfrom
users/geomin12/fix-monorepo-fork

Conversation

@geomin12
Copy link
Copy Markdown
Contributor

@geomin12 geomin12 commented Jun 5, 2025

Currently in rocm-libraries, we use build_linux_packages.yml to build sub-projects of the monorepo.

Instead, we are removing the rocm-libraries dependency, and letting rocm-libraries pull TheRock, and use specific scripts. Doing some organization as well, moving "cmake configure" into a script.

Comment on lines 80 to 87
repository: ${{ github.repository }}
ref: refs/pull/${{ github.event.pull_request.number }}/merge
sparse-checkout: |
.github
${{ inputs.subtree }}
fetch-depth: 0 # Needed for subtree splitting
token: ${{ steps.generate-token.outputs.token }}
path: "rocm-libraries"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too much magic in here, unrelated to "building Linux packages". Please elaborate on what is broken and what other approaches you have considered to fix it.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. While having a sparse checkout is definitely an improvement it is unclear to me why a token needs to be created. This at needs least some more documentation.

With regards to

This update in build_linux_packages.yml allows forked PRs to be able to checkout correctly

is it worth fixing the checkout in forks (why should this be problematic?) as a build still requires to run on azure-linux-scale-rocm, thus the CI wont work in forks anyway and might only work if sending a PR.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also my feedback at #594 (comment) .

I'd like to keep the workflows as simple and reusable as possible. All this "if triggered by this other github repository then do this, else do that" logic is tightly coupling two repositories and making the workflow harder to understand and edit. We would also need to make the same changes to any other workflow that we want to trigger from the monorepo, like building and testing Windows packages. We'll also have more repositories that we want to test changes for (systems monorepo, profiler repo, etc.), so this approach of modifying a checkout step (in yml, not even in a reusable python script) won't scale.

If we instead created branches in TheRock (or a fork of TheRock) with all submodule modifications necessary, we'd be able to reuse any workflow for free.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option could be putting source fetching and repository/fork checking out in a reusable workflow that we include in each of our workflows.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything local to a workflow (and not persisted in a branch or at least locally reproducible via scripts) will be harder to debug and work with outside of CI workflows. Such a reusable workflow or script could itself create a branch (possibly in a fork to keep the main repository tidier, or we could just accept the large number of branches here).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good call out. Makes sense, especially about azure-linux-scale-rocm. I think I'll go with either the forked approach or a reusable workflow, and leaning more towards the reusable workflow! I'll make some adjustments!!!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ScottTodd pls review!

Comment on lines +1 to +7
name: "Artifact Upload"

inputs:
build_directory:
type: string
amdgpu_families:
type: string
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some conflicting feelings about adding this and build_linux_action as composite actions. My original suggestion was to extract the repository setup into an action and not the building.

Pros:

  • It keeps the build workflows more focused on the high level operations needed.
  • It allows us to share these steps with multiple workflow files.

Cons:

  • Reuse is possible by moving logic from yml into scripts. Ideally the scripts are integrated into the build scripts or infrastructure servers themselves and run seamlessly. I'm not really seeing the value in having a different repository reuse github actions. The code should be simple enough that actions can be easily reimplemented.
  • Splitting workflows across multiple files makes it harder to see what they are doing both for CI/CD maintenance and for developers wanting to reproduce what the CI does.
  • The logs on https://github.com/ROCm/TheRock/actions/runs/15614427714/job/43986995984?pr=782 look pretty clean, but the individual steps under each workflow are now folded together. I like being able to see "fetch sources", "CMake configure", and "CMake build" separately with their own timing, for example.

I'm happy to unblock the monorepo testing work, so I can put aside some of the concerns. We can remove code from these files as they move to scripts or get more naturally integrated into the build system, then eventually maybe move back into inlined workflow steps to get back the benefits of having fine-grained logging and timing.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to follow your initial suggestion, but it felt weird to include checkout steps for rocm-libraries when TheRock never even uses it. For me, it felt better to keep build_linux_packages as a reusable workflow.

I agree that it does bunch it up :( so this makes debugging a bit more difficult

In that case, I'll actually move these to scripts (s3 artifact upload practically does this), so it can be reusable, better visibility in steps and easily implemented

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly... the more I am working on this, I think it may be best to just re-create this yml file inside of rocm-libraries, and the only new custom step would be checking out rocm-libraries.

But, this is still good organization and cleanup, so I will keep this PR

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's good to explore the convince ourselves that the direction makes sense. This is also highlighting some of the areas that should be streamlined regardless of which repository the code is running from (or running locally).

Comment on lines +19 to +25
- name: Create Logs index Files
shell: bash
if: always()
run: |
python3 build_tools/github_actions/create_log_index.py \
--build-dir=${{ inputs.build_directory }} \
--amdgpu-family=${{ inputs.amdgpu_families }}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Not in this PR, but relating to the positioning of this as a composite workflow)

This should be moved to server-side

Comment on lines +36 to +43
- name: Upload Logs
shell: bash
if: always()
run: |
python3 build_tools/github_actions/upload_build_logs_to_s3.py \
--build-dir=${{ inputs.build_directory }} \
--run-id ${{ github.run_id }} \
--amdgpu-family ${{ inputs.amdgpu_families }}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Not in this PR, but relating to the positioning of this as a composite workflow)

This should be moved to teatime.py: #648

Comment on lines +55 to +70
- name: Report
shell: bash
if: ${{ !cancelled() }}
run: |
echo "Full SDK du:"
echo "------------"
du -h -d 1 build/dist/rocm
echo "Artifact Archives:"
echo "------------------"
ls -lh build/artifacts/*.tar.xz
echo "Artifacts:"
echo "----------"
du -h -d 1 build/artifacts
echo "CCache Stats:"
echo "-------------"
ccache -s
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Not in this PR, but relating to the positioning of this as a composite workflow)

This should be moved to a script - we can run a "check environment before building" script for #762 and a "check environment after building" script here

Comment thread .github/workflows/build_linux_packages.yml Outdated
Comment thread build_tools/github_actions/build_linux_packages.py Outdated
Comment on lines +24 to +35
def build_linux_configure():
logging.info(f"Building package {package_version}")
cmd = f"""
cmake -B build -GNinja .
-DCMAKE_C_COMPILER_LAUNCHER=ccache -DCMAKE_CXX_COMPILER_LAUNCHER=ccache
-DTHEROCK_AMDGPU_FAMILIES={amdgpu_families}
-DTHEROCK_PACKAGE_VERSION="{package_version}"
-DTHEROCK_VERBOSE=ON
-DBUILD_TESTING=ON
{extra_cmake_options}
"""
subprocess.run(cmd, cwd=THEROCK_DIR, check=True)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does make some sense to put in a script, given the number of options.... but we could also use a CMake preset instead. Can even use environment variables in them: https://cmake.org/cmake/help/latest/manual/cmake-presets.7.html.

https://github.com/ROCm/TheRock/blob/main/CMakePresets.json

Copy link
Copy Markdown
Member

@ScottTodd ScottTodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good enough for now. I've carried similar build.sh scripts that apply CMake options before (e.g. https://github.com/iree-org/iree/blob/main/build_tools/cmake/build_all.sh) and they're okay... CMake presets is one modern way to share such settings, but it is awkward in a few ways.

@ScottTodd
Copy link
Copy Markdown
Member

Can you update the PR title and description before merging?

@geomin12 geomin12 changed the title [monorepo] Fixing forks for monorepo [ci] Organizing TheRock build_linux_packages Jun 13, 2025
Copy link
Copy Markdown
Member

@marbre marbre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the new scope, this PR is fine to land. Thanks.

@geomin12 geomin12 merged commit eaa78bc into main Jun 16, 2025
22 of 23 checks passed
@geomin12 geomin12 deleted the users/geomin12/fix-monorepo-fork branch June 16, 2025 15:57
@github-project-automation github-project-automation Bot moved this from TODO to Done in TheRock Triage Jun 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants