Skip to content

[ci] Reorganizing artifact upload#730

Merged
geomin12 merged 29 commits into
mainfrom
users/geomin12/s3-organization
Jun 4, 2025
Merged

[ci] Reorganizing artifact upload#730
geomin12 merged 29 commits into
mainfrom
users/geomin12/s3-organization

Conversation

@geomin12
Copy link
Copy Markdown
Contributor

@geomin12 geomin12 commented May 28, 2025

Changes:

  • Moving log upload, job summary upload and artifact upload to a script
  • If the owner and repo are not ROCm/TheRock, artifacts are uploaded to the external s3 bucket
  • Using the third-party indexer.py file now

Comment thread .github/actions/setup_test_environment/action.yml Outdated
Comment on lines +28 to +35
def create_index_file(args):
logging.info("Creating index file")
index_file_path = THEROCK_DIR / "third-party" / "indexer" / "indexer.py"
build_dir = args.build_dir

subprocess.run(
[sys.executable, index_file_path, "-f", "*.tar.xz*", build_dir / "artifacts"]
)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overlaps with some work that @erman-gurses has been doing over at #648, moving the log file uploading and maybe log index page generation into the teatime.py script invoked by the build system.

A few things to consider about the architecture:

  • We should have flexible scripts that can be reused from multiple different contexts, like CI jobs and on developer machines. The log indexer and artifact fetcher are both good examples right of that right now. As we add more composite scripts like this one, we should be careful to preserve that property.
  • Do we want "artifact upload" to be scoped to github actions, or can it be more generic? I think here we could move this script up a level and keep the github-specific steps behind environment variables, as is already done in teatime.py:
    CI systems can set `TEATIME_LABEL_GH_GROUP=1` in the environment, which will
    cause labeled console output to be printed using GitHub Actions group markers
    instead of line prefixes. This causes the output to show in the log viewer
    with interactive group handles.

Comment thread build_tools/github_actions/artifact_upload.py Outdated
Comment thread build_tools/github_actions/artifact_upload.py Outdated
Comment thread build_tools/github_actions/artifact_upload.py Outdated
Comment thread build_tools/github_actions/artifact_upload.py Outdated
Comment thread build_tools/github_actions/artifact_upload.py Outdated
Comment thread build_tools/github_actions/artifact_upload.py Outdated
Comment thread build_tools/fetch_artifacts.py Outdated
Comment on lines +153 to +157
python build_tools/github_actions/artifact_upload.py \
--repo ${{ github.repository }} \
--run-id ${{ github.run_id }} \
--amdgpu-family ${{ env.AMDGPU_FAMILIES }} \
--build-dir build/
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has come up in the review of @erman-gurses 's #648 a bit, but I do like how this collapses the .yml down to a single python script invocation instead of inlining the individual subcommands and bash conditions.

Where it interfaces with that PR is that the build logs will be uploaded to S3 during the build, either at the end of the build or continuously (streaming). The build log index can either also be generated by the same process or it can be generated here at the end in the action. Longer term I think we should consider having a dynamic index page or some server-side processing so we don't need to think as carefully during the build about generating a static index page and uploading it along with the otherwise loose log files.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point, I need integrate the log part into teatime.py script. I need to change the architecture of #648 anyway. If @ScottTodd thinks this will make teatime.py script integration easier, I am ok to land this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw that too, I think i'll split this up into more scripts, that way it'll be easier to pinpoint where to add the teatime script

Copy link
Copy Markdown
Contributor

@erman-gurses erman-gurses Jun 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like that. The separation will be easier for me.

@erman-gurses
Copy link
Copy Markdown
Contributor

The PR does not only handle the artifacts here but also the logs. I think artifact_upload.py does not reflect the full functionality here

Copy link
Copy Markdown
Contributor

@erman-gurses erman-gurses left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Couple of minor things.

with:
repository: "ROCm/TheRock"
# for testing purposes
ref: users/geomin12/s3-organization
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can remove this after the testing is done.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Run working in rocm-libraries

Were these ref: changes even needed on this PR itself? I'm a bit confused about the sequencing here. Prefer to keep test changes out of PRs so we can accurately review the changes as they will be merged.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no it is not! i just added it for testing purposes to have it in rocm-libraries. i'll remove it now and have this run through

Comment thread .github/workflows/build_linux_packages.yml
Copy link
Copy Markdown
Member

@ScottTodd ScottTodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking pretty good

Comment thread build_tools/github_actions/artifact_upload.py Outdated
Comment thread build_tools/github_actions/artifact_upload.py Outdated
Comment on lines +148 to +150
parser.add_argument(
"--amdgpu-family", type=str, required=True, help="AMD GPU family to upload"
)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd really like to remove this amdgpu-family plumbing, to keep as much of our infrastructure as generic as possible.

Once we move the index generation server side there will only be one usage in this file:

s3_base_path = f"{bucket_uri}/logs/{amdgpu_family}"

For that we can treat it as a generic "logs subdirectory". In fact, we could even make that change right now, just across all the other places that also look at the string (e.g. the index page url).

@geomin12
Copy link
Copy Markdown
Contributor Author

geomin12 commented Jun 3, 2025

Run working in rocm-libraries

Copy link
Copy Markdown
Member

@ScottTodd ScottTodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bit of clean up then this looks good to me

with:
repository: "ROCm/TheRock"
# for testing purposes
ref: users/geomin12/s3-organization
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Run working in rocm-libraries

Were these ref: changes even needed on this PR itself? I'm a bit confused about the sequencing here. Prefer to keep test changes out of PRs so we can accurately review the changes as they will be merged.

Comment thread .github/workflows/build_linux_packages.yml Outdated
Comment thread .github/workflows/build_linux_packages.yml Outdated
Comment thread .github/workflows/build_windows_packages.yml Outdated
Comment on lines -227 to -231
shell: powershell
run: |
$Env:PATH += ";C:\Program Files\Amazon\AWSCLIV2"
python3 build_tools/upload_logs_to_s3.py `
--build-dir=${{ env.BASE_BUILD_DIR_POWERSHELL }} `
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like BASE_BUILD_DIR_POWERSHELL is no longer used thanks to these scripts replacing shell: powershell usage. We can remove that variable in a follow-up.

Comment thread .github/workflows/build_windows_packages.yml Outdated
Comment thread .github/workflows/test_rocprim.yml Outdated
@geomin12 geomin12 requested a review from ScottTodd June 3, 2025 21:43
@geomin12 geomin12 merged commit 1b9fe8a into main Jun 4, 2025
23 checks passed
@geomin12 geomin12 deleted the users/geomin12/s3-organization branch June 4, 2025 00:41
@github-project-automation github-project-automation Bot moved this from TODO to Done in TheRock Triage Jun 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants