infra: align CI workflows with self-hosted runner strategy #2021
…ion integration test
… workflows in nmtcpp
❌ E2E Mobile Test Results - iOS
Overall Status: FAILED
Automated E2E mobile testing powered by AWS Device Farm
❌ E2E Mobile Test Results - Android
Overall Status: FAILED
Automated E2E mobile testing powered by AWS Device Farm
What problem does this PR solve?
The workflow fleet was inconsistent in runner labels and host selection across inference addon CI, which made self-hosted adoption uneven and introduced reliability issues (including artifact collisions and platform-specific instability).
A follow-up PR will unpin `tmp-self-hosted-runners` from the `uses:` lines in the workflow YAML files. Once that follow-up PR is merged, the `tmp-self-hosted-runners` branch can be deleted.
Another follow-up PR will update the benchmark workflows to use self-hosted runners as well.
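To illustrate what the pin looks like, here is a minimal sketch of a `uses:` line that references the `tmp-self-hosted-runners` branch. The organization, repository, workflow file, and job name are hypothetical placeholders, and the `@main` target after unpinning is an assumption; the PR does not state which ref the follow-up will use.

```yaml
jobs:
  # Hypothetical job name and reusable-workflow path, for illustration only.
  call-shared-workflow:
    # Currently pinned to the temporary branch:
    uses: example-org/example-repo/.github/workflows/example.yml@tmp-self-hosted-runners
    # After the follow-up PR the pin would be removed, for example (assumed ref):
    # uses: example-org/example-repo/.github/workflows/example.yml@main
```

The temporary branch has to stay in place until every such reference is updated, because a `uses:` line pointing at a deleted ref fails to resolve.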
How does it solve it?
- `runs-on` entries across addon CI now consistently use the intended self-hosted runner set.
- `main` updates are brought in while preserving the branch-specific runner migrations introduced for self-hosted execution.

Runner labels available (with redundancy & concurrency)
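A job selects one of the labels in this section through `runs-on`. The following is a minimal sketch rather than a workflow from this PR: the workflow name, job name, build command, and artifact path are hypothetical. Including the label in the artifact name is one common way to avoid artifact name collisions across matrix entries, and is not necessarily how this PR addresses them.

```yaml
# Hypothetical workflow; names, commands, and paths are illustrative only.
name: addon-ci-example
on: [pull_request]

jobs:
  build:
    strategy:
      fail-fast: false
      matrix:
        # Labels taken from the runner list in this PR description.
        runner:
          - qvac-ubuntu2204-x64
          - qvac-ubuntu2404-x64
          - qvac-win25-x64
    runs-on: ${{ matrix.runner }}
    steps:
      - uses: actions/checkout@v4
      - name: Build (placeholder command)
        run: echo "building on ${{ matrix.runner }}"
      - uses: actions/upload-artifact@v4
        with:
          # The label in the name keeps artifact names unique per matrix entry.
          name: build-${{ matrix.runner }}
          path: build/
```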
| Runner label | Hardware |
| --- | --- |
| `qvac-ubuntu2204-x64` | Intel(R) Xeon(R) Gold 5412U, 128GB RAM, 3.5TB Datacenter NVMe x 2 in RAID-1 (mirroring) |
| `qvac-ubuntu2204-x64-gpu` | 13th Gen Intel(R) Core(TM) i5-13500, 64GB RAM, 1.7TB Datacenter NVMe x 2 in RAID-1 (mirroring) + NVIDIA RTX 4000 SFF Ada Generation 20GB VRAM |
| `qvac-ubuntu2404-x64` | Intel(R) Xeon(R) Gold 5412U, 128GB RAM, 3.5TB Datacenter NVMe x 2 in RAID-1 (mirroring) |
| `qvac-ubuntu2404-x64-gpu` | 13th Gen Intel(R) Core(TM) i5-13500, 64GB RAM, 1.7TB Datacenter NVMe x 2 in RAID-1 (mirroring) + NVIDIA RTX 4000 SFF Ada Generation 20GB VRAM |
| `qvac-win25-x64` | Intel(R) Xeon(R) Gold 5412U, 128GB RAM, 3.5TB Datacenter NVMe x 2 in RAID-1 (mirroring), running Windows Server 2025 Standard Edition (14-core license) |
| `qvac-win25-x64-gpu` | 13th Gen Intel(R) Core(TM) i5-13500, 64GB RAM, 1.7TB Datacenter NVMe x 2 in RAID-1 (mirroring) + NVIDIA RTX 4000 SFF Ada Generation 20GB VRAM, running Windows Server 2025 Standard Edition (14-core license) |

The local Actions cache server runs on the first replica of `qvac-ubuntu2204-x64`. Cache retention is set to 7 days. With 3TB+ available, capacity is not a concern, but DevOps will keep an eye on usage.

Breaking changes