[ci] Enabling aws-linux-scale-rocm to 10% #4899

Merged
geomin12 merged 7 commits into main from users/geomin12/aws-runner on Apr 28, 2026

@geomin12
Contributor

As we migrate our build machines from Azure to AWS, we are routing 10% of the CPU build load to AWS while the remaining 90% stays on Azure, so that we can migrate and test gradually.

geomin12 and others added 6 commits April 28, 2026 08:52
Add logic to configure_ci.py and configure_multi_arch_ci.py to select
build runners using a weighted distribution:
- Default builds: 90% azure-linux-scale-rocm, 10% aws-linux-scale-rocm
- Sanitizer builds (asan/tsan): 100% azure-linux-scale-rocm-heavy-ramdisk

Changes:
- Add BUILD_RUNNER_LABELS config and select_build_runner() function
  to amdgpu_family_matrix.py
- Add build_runs_on field to BuildConfig in configure_multi_arch_ci.py
- Add build-runs-on field to matrix output in configure_ci.py
- Pass build_runs_on through workflow chain to build stage jobs
- Update multi_arch_build_portable_linux_artifacts.yml to use
  build_runs_on when provided, with fallback to existing logic

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
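
The weighted selection described in this commit message could look roughly like the sketch below. This is not the code from the PR: the runner labels, the 90/10 split, and the names `BUILD_RUNNER_LABELS` and `select_build_runner()` come from the commit message, while the dict layout, the `build_variant` and `rng` parameters, the `SANITIZER_VARIANTS` set, and the `"release"` variant name are assumptions made for illustration.

```python
import random

# Assumed layout: build kind -> {runner label: weight}.
# Labels and percentages are taken from the commit message; the structure is hypothetical.
BUILD_RUNNER_LABELS = {
    "default": {
        "azure-linux-scale-rocm": 90,
        "aws-linux-scale-rocm": 10,
    },
    "sanitizer": {
        "azure-linux-scale-rocm-heavy-ramdisk": 100,
    },
}

# Assumed variant names for sanitizer builds.
SANITIZER_VARIANTS = {"asan", "tsan"}


def select_build_runner(build_variant="release", rng=None):
    """Pick a runner label for one build using the configured weights.

    `rng` is a hypothetical hook: pass a seeded random.Random for
    reproducible picks; by default the module-level generator is used.
    """
    if rng is None:
        rng = random
    kind = "sanitizer" if build_variant in SANITIZER_VARIANTS else "default"
    weights = BUILD_RUNNER_LABELS[kind]
    labels = list(weights)
    # random.choices draws with replacement according to the relative weights.
    return rng.choices(labels, weights=[weights[label] for label in labels], k=1)[0]
```

Keeping the weights in one table means a later move to, say, a 50/50 split would only touch the config, not the selection logic.
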
Add TestBuildRunnerSelection with 2 deterministic tests:
- Test weighted selection (90% Azure, 10% AWS) for default builds
- Test sanitizer builds always use Azure ramdisk runner

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
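
One way to keep tests of a random selection deterministic is to patch the random source, as in the sketch below. The class name `TestBuildRunnerSelection` and the two test cases follow the commit message; the compact `select_build_runner()` stand-in, its signature, the `"release"` variant name, and the choice of `unittest.mock` are assumptions, and the tests in the PR may be written differently.

```python
import random
import unittest
from unittest import mock

AZURE = "azure-linux-scale-rocm"
AWS = "aws-linux-scale-rocm"
AZURE_RAMDISK = "azure-linux-scale-rocm-heavy-ramdisk"


def select_build_runner(build_variant="release"):
    """Compact stand-in for the function under test (assumed signature)."""
    if build_variant in ("asan", "tsan"):
        return AZURE_RAMDISK
    return random.choices([AZURE, AWS], weights=[90, 10], k=1)[0]


class TestBuildRunnerSelection(unittest.TestCase):
    def test_default_builds_use_weighted_selection(self):
        # Patching random.choices fixes the outcome and exposes the weights used.
        with mock.patch("random.choices", return_value=[AWS]) as choices:
            self.assertEqual(select_build_runner("release"), AWS)
        choices.assert_called_once_with([AZURE, AWS], weights=[90, 10], k=1)

    def test_sanitizer_builds_use_azure_ramdisk(self):
        # Sanitizer builds should never consult the random source at all.
        with mock.patch("random.choices") as choices:
            for variant in ("asan", "tsan"):
                self.assertEqual(select_build_runner(variant), AZURE_RAMDISK)
        choices.assert_not_called()
```

Saved in a test module, this can be run with `python -m unittest`. Asserting on the weights passed to `random.choices` checks the 90/10 configuration directly rather than sampling many draws.
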
@geomin12
Contributor Author

Contributor

@HereThereBeDragons left a comment

See comments.

sanity_check_only_for_family: ${{ matrix.family_info.sanity_check_only_for_family }}
release_type: ${{ inputs.release_type }}

build_python_packages:
Contributor

What about this job? Should it also get the dynamic CPU runners?

Contributor Author

We are doing an initial rollout to catch any issues. Once that looks good, we'll move to a 50% conversion and include other workflows.

sanity_check_only_for_family: ${{ matrix.family_info.sanity_check_only_for_family }}
release_type: ${{ inputs.release_type }}

build_python_packages:
Contributor

Same here, for the future.

Contributor Author

We haven't set up Windows on the AWS runners yet, but we will roll this out once those runners are allocated.

@geomin12 merged commit c70a244 into main on Apr 28, 2026
28 checks passed
@geomin12 deleted the users/geomin12/aws-runner branch on April 28, 2026 20:52
@github-project-automation (bot) moved this from TODO to Done in TheRock Triage on Apr 28, 2026
Projects

Status: Done

3 participants