WIP switch GPU workers to image that uses multi engine generic worker #700

Draft: wants to merge 9 commits into main

bhearsum
Collaborator

This will be necessary for #466 due to the "simple" engine not supporting chain of trust.

This initial work is just a sanity check that nothing obvious breaks. Aside from tasks in automation, we also need to see how interactive tasks behave and whether they meet our needs (most notably, whether we can get root in interactive tasks when needed).

@bhearsum
Collaborator Author

It looks like these new workers largely work fine. We are hitting taskcluster/taskcluster#7128, which will need to be fixed before this is landable, but all of the GPU tasks seem to work well within Docker on this new image.

@bhearsum
Collaborator Author

> It looks like these new workers largely work fine. We are hitting taskcluster/taskcluster#7128, which will need to be fixed before this is landable, but all of the GPU tasks seem to work well within Docker on this new image.

taskcluster/taskcluster#7428 has this patch. We'll need a new worker image when that's done to pick it up. I also want to wait on mozilla-releng/fxci-config#244 to land to avoid some unnecessary confusion in this repo.

bhearsum force-pushed the gpu-multi branch 6 times, most recently from 26b3d9b to 996ab9e (January 10, 2025 15:04)
The new image we're upgrading GPU workers to uses Ubuntu 24.04, which makes it incompatible with various parts of the pipeline (mostly due to Python package pinning). As it turns out, the easiest way to fix this is to dockerize the GPU tasks.

We need slight updates to GPU task payloads to accommodate this.

This will fix mozilla#391.
Without this we end up with these files being inaccessible in subsequent tasks.
This has always been needed, but it was found on the host system on the previous image.
bhearsum force-pushed the gpu-multi branch 11 times, most recently from 7792ffc to a111917 (January 13, 2025 01:38)
bhearsum added a commit to bhearsum/fxci-config that referenced this pull request Jan 17, 2025
…GPU workers

And also, use the newly minted dated image for them, which appears to be working well in mozilla/translations#700.