Use 128GB runners for gfx1151 pytorch CI on Windows #4613
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -57,6 +57,8 @@ | |
| - test-runs-on-alternate-weight: (optional) Probability (0.0-1.0) of selecting the alternate runner. | ||
| - test-runs-on-multi-gpu: (optional) GitHub runner label for multi-GPU tests for this architecture | ||
| - benchmark-runs-on: (optional) GitHub runner label for benchmarks for this architecture | ||
| - pytorch-ci-test-runs-on: (optional) GitHub runner label for PyTorch wheel tests only; when set, | ||
| the workflow should pass `--test-project-name=pytorch` to configure_target_run.py to use this label instead of test-runs-on | ||
| - test-runs-on-kernel: (optional) dict of kernel-specific runner labels, keyed by kernel type (e.g. "oem") | ||
| - family: (required) AMD GPU family name, used for test selection and artifact fetching | ||
| - fetch-gfx-targets: (required) list of gfx targets to fetch split test artifacts for (e.g. ["gfx942", "gfx942:xnack+"]) | ||
|
|
@@ -120,6 +122,7 @@ | |
| }, | ||
| "windows": { | ||
| "test-runs-on": "windows-gfx1151-gpu-rocm", | ||
| "pytorch-ci-test-runs-on": "windows-strix-halo-gpu-rocm-128gb", | ||
|
Comment on lines 124 to +125
Member
@geomin12 / @amd-shiraz / @amd-justchen What's our spread of test runners for this We should either:
Right now
Contributor
I think they still include all of the machines, from 16, 32, 64, to 128 GB of total RAM. At one point I started adding runner labels for the minimum amount of RAM, for tests to select. Plumbing needs to be in place for that though. @geomin12, thoughts?
Contributor
Author
From my experience there are 128 GB models and 64 GB models, all configured to the maximum carveout sizes (96 GB and 48 GB, iirc).
Contributor
any update here? does the runner label exist?
Contributor
Author
The label
Contributor
@geomin12 / @amd-shiraz / @amd-justchen |
||
| # TODO(#2754): Add new benchmark-runs-on runner for benchmarks | ||
| "benchmark-runs-on": "windows-gfx1151-gpu-rocm", | ||
| "family": "gfx1151", | ||
|
|
||
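For readers of the diff, the override described in the new `pytorch-ci-test-runs-on` doc comment could be sketched roughly as below. This is a minimal illustration, not the actual logic in configure_target_run.py: the function `select_test_runner`, its parameters, and the `PLATFORM_INFO` excerpt are hypothetical names introduced here, and only the dictionary keys and label strings come from the diff.

```python
from typing import Optional

# Hypothetical excerpt mirroring the keys shown in the diff for gfx1151 on Windows.
PLATFORM_INFO = {
    "test-runs-on": "windows-gfx1151-gpu-rocm",
    "pytorch-ci-test-runs-on": "windows-strix-halo-gpu-rocm-128gb",
    "benchmark-runs-on": "windows-gfx1151-gpu-rocm",
    "family": "gfx1151",
}


def select_test_runner(
    platform_info: dict, test_project_name: Optional[str] = None
) -> str:
    """Pick a runner label for a test job (illustrative sketch only).

    When the project is "pytorch" and the optional override key is set,
    the override wins; otherwise fall back to the generic test-runs-on label.
    """
    if test_project_name == "pytorch":
        override = platform_info.get("pytorch-ci-test-runs-on")
        if override:
            return override
    return platform_info.get("test-runs-on", "")
```

With this shape, architectures that do not define the optional key keep using `test-runs-on` unchanged, so the override only affects the entries that opt in.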