-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ephemeral runner #831
Ephemeral runner #831
Conversation
4e2d4a9
to
f35ce70
Compare
@rofafor Hey! Thanks for this PR. But you don't need to worry about the validation error. It's just a validation error. |
@TingluoHuang JFYI, this seems like a revival of the discussion we had in the slack channel before. |
My original plan was to drop support for Could you also change L82 from:
to
and revert the change on L160, and finally revert removal of this part:
and change it to:
So that the new |
I tested and seems to be working with Enterprise servers as well when setting |
@rofafor we've been talking how to roll this out without breaking people's setups. Can you confirm if the your GHES https://ghe.company.com/api/v3/meta endpoint is callable from a runner without a token and if the endpoint is rate limited? Additionally, can you call it with a token and get an authenticated rate limit budget? What does the rate limit header look like Finally could you send along what the JSON looks like. |
Target system GHES 3.1.8. Tested on both on my laptop and actual GHA workflow with similar results (as expected):
Bad credentials:
With a proper token:
Without authentication:
...and the same with newer GHES 3.2.0:
|
ebc49e1
to
1b146e6
Compare
We've talked about this and we can't merge this as is. The problem we've got is once this is merged we will break anyone's GitHub Enterprise Server deployment as the ephemeral flag isn't available. The new flag will be supported in 3.3 I believe however that still isn't enough as not anyone will upgrade quickly, they could conceivably not upgrade for years when talking about banks). In order to support setting the new flag we need to expose the GitHub environment to the runner's entrypoint i.e. environment variables |
Acknowledged. Closing this one. |
Thanks, everyone! Please let me try adopting this, perhaps by adding the server-side of this. |
This will resolve issues like #694 if done correctly! |
#941 worth highlighting this is what will happen if we set the new runner ephemeral flag by default without knowing if it's safe to do so. |
@toast-gear Given GHES 3.3 has been released recently, would it be safe to merge this as-is, after some graceful period(a month or two, maybe)? |
I think once 3.3 has been out for a while, something like 2/3 months (its been a month so far https://docs.github.com/en/[email protected]/admin/release-notes, lots of people will probably upgrade their GHES instances over Dec and Jan as code freezes are fairly common this time of the year in regulated environments), then I think we should probably consider just dropping the logic for setting the old flag entirely and only support the new flag. Baselining onto GHES 3.3 as a minimum seems reasonable to me and will help keep the codebase maintainable and as simple as possible and would be my preferred approach. If we do want to keep logic for the old flag ( |
So I was trying to set the runners to use "--ephemeral" instead of "--once" as we kept seeing the race condition. We're still running 3.2.6 I was running And saw this: Can the docs be update to clarify this as a caveat? |
We don't document GitHub's documentation. The README already says that you need 3.3 https://github.com/actions-runner-controller/actions-runner-controller#github-enterprise-support to use ARC, we have baselined onto 3.3 and so you need to upgrade to 3.3. In addition, we are going to be considering moving to the new flag by default soon so upgrading to 3.3 is going to be critical. This PR has been held up due to the need for GHES to be on 3.3 so we needed to give people time, 3 months which included the Dec, Jan period where code freezes are common is enough time. Please do organise upgrading as soon as possible. We will retain the In addition, of note is the next release of the controller has removed the registration only runners as they aren't needed on 3.3 so you will need to upgrade to 3.3 to be able to scale from 0 if that is a feature you use on next release. |
My apologies, I had only scanned the readme when I was upgrading the chart. I'll be more attentive in future. Our GHE instance is maintained by a central team, so I will pass your feedback onto them. Currently we're only using organization and repo level runners (hosted by various different SRE teams), but if we do end up using enterprise runners in the future I'll provide feedback if it helps |
It's no worries, the solution technically does support < 3.3, just not all the features, however the maintainers don't get loads of time to work on ARC and so we need to be practical about versioning. We also don't have an enterprise account GHES or GHEC so ARC is far easier to maintain in general if we are strict on versions support. In general when new features come out which require a newer GHES version we will baseline onto that version after what we consider to be a reasonable time frame and so it's important your organisation keeps on top of upgrading your GHES version. In general we won't outright remove stuff until quite a while has passed e.g. the |
We've been encouraged to contribute to OSS a bit more, and I do know golang & k8s, so I'll happily contribute a bit once I've a few personal projects off my plate. |
@crackedupcorson that's great to hear pal, PRs are welcome, a good first issue is #728 for someone familiar with golang & k8s |
ac017f0
to
25570a0
Compare
@rofafor, we appreciate the PR and the work you put into it, we've decided however to roll out the ephemeral runners via the next controller release however 0510897 #1189. We are subsequently going to remove the --once flag entirely #1196 as we can't see a reason to keep it to be honest. What this means is your PR won't get merged and can be closed. Again appreciate the work, if we didn't have GHES to worry about we would have merged this a long time ago 😅. |
@toast-gear, I haven't been able to upgrade to GHES 3.3 yet and therefore this PR have been kind of abandoned from my side, so no harm done. |
@rofafor This has triggered several important discussions about when and how we should roll it out and without this we would have ended up breaking many things on the way. So, thank you! I appreciate your work Also, I hope you'll soon be able to upgrade to GHES 3.3. FWIW, static(persistent) runners are also improved in the next version of ARC, 0.22.0. If you see issues with |
The
--ephemeral
switch is now released and--once
was (accidentally?) removed in https://github.com/actions/runner/pull/660/files:Actually, the
--ephemeral
switch is already in the images published via CI, but missing one small fix. Here's my log from the currentlatest
tagged image:I had to leave
RUNNER_FEATURE_FLAG_EPHEMERAL
logic due to the fact that GitHub Enterprise Server 3.0 or 3.1 doesn't support this new ephemeral feature yet: