[ci] Adding ability to run kernel-specified test runners#2902

Merged: geomin12 merged 8 commits into main from users/geomin12/oem-addition on Jan 30, 2026
Conversation

@geomin12
Contributor

@geomin12 geomin12 commented Jan 13, 2026

Motivation

To expand TheRock's pool of test machines, we are adding machines that run OEM kernels and letting users target these machines via labels.

Users can trigger this option in two ways: workflow_dispatch and pull request labels.

Technical Details

I ran `apt install linux-oem-24.04a` on a couple of Linux gfx1151 machines, then created an `oem` label and added logic that replaces the default test runner with the OEM test runner when the label is specified.
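
A minimal sketch of the runner replacement, for illustration only. The function name `apply_kernel_runner` is hypothetical and is not taken from the PR; the dictionary keys (`test-runs-on`, `test-runs-on-kernel`, `test-runs-on-multi-gpu`) come from the logged outputs in the Test Result section. The sketch also assumes, based on those logs, that families without a kernel-specific runner have their test runners cleared when a kernel is requested; the actual implementation may differ.

```python
import copy


def apply_kernel_runner(variants, kernel):
    """Hypothetical helper: return copies of `variants` adjusted for `kernel`.

    Variants that list a runner for `kernel` under "test-runs-on-kernel"
    get that runner as "test-runs-on". All other variants have their test
    runners cleared (an assumption inferred from the logged CI outputs).
    """
    updated = []
    for variant in variants:
        variant = copy.deepcopy(variant)  # leave the caller's data untouched
        kernel_runners = variant.get("test-runs-on-kernel", {})
        if kernel in kernel_runners:
            variant["test-runs-on"] = kernel_runners[kernel]
        else:
            variant["test-runs-on"] = ""
            if "test-runs-on-multi-gpu" in variant:
                variant["test-runs-on-multi-gpu"] = ""
        updated.append(variant)
    return updated


# Trimmed-down variants using runner names from the logged outputs.
variants = [
    {
        "test-runs-on": "linux-gfx1151-gpu-rocm",
        "test-runs-on-kernel": {"oem": "linux-strix-halo-gpu-rocm-oem"},
        "family": "gfx1151",
    },
    {
        "test-runs-on": "linux-mi325-1gpu-ossci-rocm-frac",
        "test-runs-on-multi-gpu": "linux-mi325-4gpu-ossci-rocm",
        "family": "gfx94X-dcgpu",
    },
]
result = apply_kernel_runner(variants, "oem")
```

Here `result[0]["test-runs-on"]` becomes the OEM runner, while the gfx94X-dcgpu entry has its test runners emptied, which matches the shape of the "OEM kernel label" output shown under Test Result.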

Test Plan

Testing all via CI

Test Result

First normal CI run (cancelled):
Output:

OUTPUT linux_variants=[{"test-runs-on": "linux-gfx120X-gpu-rocm", "family": "gfx120X-all", "bypass_tests_for_releases": true, "sanity_check_only_for_family": true, "build_variant_label": "release", "build_variant_suffix": "", "build_variant_cmake_preset": "", "artifact_group": "gfx120X-all"}, {"test-runs-on": "linux-mi325-1gpu-ossci-rocm-frac", "test-runs-on-multi-gpu": "linux-mi325-4gpu-ossci-rocm", "benchmark-runs-on": "linux-mi325-1gpu-ossci-rocm-frac", "family": "gfx94X-dcgpu", "build_variant_label": "release", "build_variant_suffix": "", "build_variant_cmake_preset": "", "artifact_group": "gfx94X-dcgpu"}, {"test-runs-on": "linux-gfx1151-gpu-rocm", "test-runs-on-kernel": {"oem": "linux-strix-halo-gpu-rocm-oem"}, "family": "gfx1151", "bypass_tests_for_releases": true, "sanity_check_only_for_family": true, "build_variant_label": "release", "build_variant_suffix": "", "build_variant_cmake_preset": "", "artifact_group": "gfx1151"}, {"test-runs-on": "", "family": "gfx110X-all", "bypass_tests_for_releases": true, "sanity_check_only_for_family": true, "build_variant_label": "release", "build_variant_suffix": "", "build_variant_cmake_preset": "", "artifact_group": "gfx110X-all"}]
OUTPUT linux_test_labels=[]
OUTPUT windows_variants=[{"test-runs-on": "", "family": "gfx120X-all", "bypass_tests_for_releases": true, "build_variant_label": "release", "build_variant_suffix": "", "build_variant_cmake_preset": "windows-release", "artifact_group": "gfx120X-all"}, {"test-runs-on": "windows-gfx1151-gpu-rocm", "benchmark-runs-on": "windows-gfx1151-gpu-rocm", "family": "gfx1151", "build_variant_label": "release", "build_variant_suffix": "", "build_variant_cmake_preset": "windows-release", "artifact_group": "gfx1151"}, {"test-runs-on": "windows-gfx110X-gpu-rocm", "family": "gfx110X-all", "bypass_tests_for_releases": true, "sanity_check_only_for_family": true, "build_variant_label": "release", "build_variant_suffix": "", "build_variant_cmake_preset": "windows-release", "artifact_group": "gfx110X-all"}]
OUTPUT windows_test_labels=[]
OUTPUT enable_build_jobs=true
OUTPUT test_type=smoke

Run with the OEM kernel label added:
Output:

OUTPUT linux_variants=[{"test-runs-on": "", "family": "gfx110X-all", "bypass_tests_for_releases": true, "sanity_check_only_for_family": true, "build_variant_label": "release", "build_variant_suffix": "", "build_variant_cmake_preset": "", "artifact_group": "gfx110X-all"}, {"test-runs-on": "linux-strix-halo-gpu-rocm-oem", "test-runs-on-kernel": {"oem": "linux-strix-halo-gpu-rocm-oem"}, "family": "gfx1151", "bypass_tests_for_releases": true, "sanity_check_only_for_family": true, "build_variant_label": "release", "build_variant_suffix": "", "build_variant_cmake_preset": "", "artifact_group": "gfx1151"}, {"test-runs-on": "", "test-runs-on-multi-gpu": "", "benchmark-runs-on": "linux-mi325-1gpu-ossci-rocm-frac", "family": "gfx94X-dcgpu", "build_variant_label": "release", "build_variant_suffix": "", "build_variant_cmake_preset": "", "artifact_group": "gfx94X-dcgpu"}, {"test-runs-on": "", "family": "gfx120X-all", "bypass_tests_for_releases": true, "sanity_check_only_for_family": true, "build_variant_label": "release", "build_variant_suffix": "", "build_variant_cmake_preset": "", "artifact_group": "gfx120X-all"}]
OUTPUT linux_test_labels=[]
OUTPUT windows_variants=[{"test-runs-on": "", "family": "gfx110X-all", "bypass_tests_for_releases": true, "sanity_check_only_for_family": true, "build_variant_label": "release", "build_variant_suffix": "", "build_variant_cmake_preset": "windows-release", "artifact_group": "gfx110X-all"}, {"test-runs-on": "", "benchmark-runs-on": "windows-gfx1151-gpu-rocm", "family": "gfx1151", "build_variant_label": "release", "build_variant_suffix": "", "build_variant_cmake_preset": "windows-release", "artifact_group": "gfx1151"}, {"test-runs-on": "", "family": "gfx120X-all", "bypass_tests_for_releases": true, "build_variant_label": "release", "build_variant_suffix": "", "build_variant_cmake_preset": "windows-release", "artifact_group": "gfx120X-all"}]
OUTPUT windows_test_labels=[]
OUTPUT enable_build_jobs=true
OUTPUT test_type=smoke

Workflow dispatch OEM kernel run:
Output:

OUTPUT linux_variants=[{"test-runs-on": "", "family": "gfx120X-all", "bypass_tests_for_releases": true, "sanity_check_only_for_family": true, "build_variant_label": "release", "build_variant_suffix": "", "build_variant_cmake_preset": "", "artifact_group": "gfx120X-all"}, {"test-runs-on": "linux-strix-halo-gpu-rocm-oem", "test-runs-on-kernel": {"oem": "linux-strix-halo-gpu-rocm-oem"}, "family": "gfx1151", "bypass_tests_for_releases": true, "sanity_check_only_for_family": true, "build_variant_label": "release", "build_variant_suffix": "", "build_variant_cmake_preset": "", "artifact_group": "gfx1151"}]
OUTPUT linux_test_labels=[]
OUTPUT windows_variants=[]
OUTPUT windows_test_labels=[]
OUTPUT enable_build_jobs=true
OUTPUT test_type=smoke

with final test results in this PR run

Submission Checklist

@geomin12 geomin12 added the test_runner:oem If added, the tests will run on a machine configured with `oem` kernel label Jan 13, 2026
@geomin12 geomin12 changed the title [ci] Adding kernel oem runner [ci] Adding ability to run kernel-specified test runners Jan 13, 2026
@geomin12 geomin12 added test_runner:oem If added, the tests will run on a machine configured with `oem` kernel and removed test_runner:oem If added, the tests will run on a machine configured with `oem` kernel labels Jan 14, 2026
@geomin12 geomin12 marked this pull request as ready for review January 14, 2026 17:24
Contributor

@HereThereBeDragons HereThereBeDragons left a comment

can we adjust the PR label from kernel:oem to test_runner:oem?

Comment thread .github/workflows/ci.yml
windows_use_prebuilt_artifacts:
type: boolean
description: "If enabled, the CI will pull Windows artifacts using artifact_run_id and only run tests"
additional_label_options:
Contributor

If this option is only for PRs, you don't need these inputs here, since you can get the PR label content from the GitHub event.

Contributor Author

I plan on leaving this option in, in case developers want to test without waiting for a build!
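
For readers following this thread, here is a rough sketch of how the two label sources (PR labels from the event and a workflow_dispatch input) could be combined. Everything here is an assumption made for illustration: the helper name `requested_kernels` is hypothetical, and the comma-separated format of the dispatch input is a guess, since the PR does not show how `additional_label_options` is parsed. Only the `test_runner:` label prefix comes from the PR itself.

```python
LABEL_PREFIX = "test_runner:"  # prefix taken from the `test_runner:oem` label


def requested_kernels(pr_labels, dispatch_input=""):
    """Hypothetical helper: collect kernel names from both label sources.

    `pr_labels` is a list of label names (for example, read from the pull
    request event). `dispatch_input` is assumed to be a comma-separated
    string of labels supplied through workflow_dispatch.
    """
    candidates = list(pr_labels)
    candidates += [item.strip() for item in dispatch_input.split(",")]

    kernels = []
    for label in candidates:
        if label.startswith(LABEL_PREFIX):
            kernel = label[len(LABEL_PREFIX):]
            # Skip empty names and duplicates, keeping first-seen order.
            if kernel and kernel not in kernels:
                kernels.append(kernel)
    return kernels


from_pr = requested_kernels(["test_runner:oem", "documentation"])
from_dispatch = requested_kernels([], "test_runner:oem")
from_nothing = requested_kernels([], "")
```

Accepting both sources keeps the manual path available for runs that reuse prebuilt artifacts, which is the use case described in the reply above.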

Comment thread build_tools/github_actions/tests/configure_ci_test.py
@geomin12 geomin12 added test_runner:oem If added, the tests will run on a machine configured with `oem` kernel and removed test_runner:oem If added, the tests will run on a machine configured with `oem` kernel labels Jan 28, 2026
@HereThereBeDragons
Contributor

HereThereBeDragons commented Jan 30, 2026

before merging: can we have one ci run with it?

@geomin12
Contributor Author

> before merging: can we have one ci run with it?

this run (https://github.com/ROCm/TheRock/actions/runs/21492661988/job/61927151095?pr=2902) tests using the oem label which is noted here:
Screenshot 2026-01-30 105543

@geomin12 geomin12 merged commit 9bf6056 into main Jan 30, 2026
45 checks passed
@github-project-automation github-project-automation Bot moved this from TODO to Done in TheRock Triage Jan 30, 2026
@geomin12 geomin12 deleted the users/geomin12/oem-addition branch January 30, 2026 20:56
Labels

test_runner:oem If added, the tests will run on a machine configured with `oem` kernel

Projects

Status: Done

2 participants