-
Notifications
You must be signed in to change notification settings - Fork 187
[rhcos-4.11] support being initiated from a cosa remote-session
#3108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[rhcos-4.11] support being initiated from a cosa remote-session
#3108
Conversation
The command previously wouldn't allow us to get/set fedora-coreos.parent-commit
and fedora-coreos.parent-version because it would try to traverse the
`fedora-coreos.` path. Add an exception here similar to `coreos-assembler.`.
Now we can do something like this in the pipeline:
```
cosa meta \
--set fedora-coreos.parent-commit=${parent_commit} \
--set fedora-coreos.parent-version=${parent_version}
```
(cherry picked from commit 433a68d)
This will allow us to specify where the AWS_CONFIG_FILE is from the CLI without having to set it in the environment. (cherry picked from commit db00529)
This will allow us to specify the AWS_CONFIG_FILE from the command line without having to set it in the environment. (cherry picked from commit 9214b9b)
This will allow us to specify the AWS_CONFIG_FILE from the command
line without having to set it in the environment.
This required reworking the cosalib/s3 library into an S3() class
so that we could call boto3.client('s3') after setting the
AWS_CONFIG_FILE environment variable in cmd-buildfetch. Otherwise
it would get called as soon as the library was imported and setting
the environment variable would have no effect (too late).
(cherry picked from commit 9d55a20)
This will allow us to specify the AWS_CONFIG_FILE from the command line without having to set it in the environment. (cherry picked from commit 2b0b36d)
Right now cmd-build will just pick up whatever the latest symlink points to and try to get previous build information from that, but what if the absolute latest build doesn't have a build for the architecture we are targetting? Currently cmd-build will fail with: ``` /usr/lib/coreos-assembler/cmd-build: line 163: /srv/builds/36.20220720.20.5/aarch64/meta.json: No such file or directory error: failed to execute cmd-build: exit status 1 ``` What we really need is to get the latest build for the given architecture and use that in cmd-build instead. (cherry picked from commit 2ca610c)
If we are starting out a brand new stream and the x86_64 build
has completed, but there was no build yet for $arch then the detected
previous buildid:
```
previous_buildid = parent_build or self.get_latest()
```
Will just be the build ID of the x86_64 build from `get_latest()`,
but this code tries to open a meta.json for it from $arch, which
won't exist. Here's the error we see:
```
Traceback (most recent call last):
File "<string>", line 5, in <module>
File "/usr/lib/coreos-assembler/cosalib/builds.py", line 134, in init_build_meta_json
with open(metapath) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/srv/builds/36.20220714.dev.0/aarch64/meta.json'
error: failed to execute cmd-build: exit status 1
```
This should fix that error by verifying the detected buildid actually
has a build for that architecture before continuing.
(cherry picked from commit 786e87b)
CI hit a race where parallel `buildextend-*` would race to generate `tmp/image.json`. We need to lock this for efficiency (avoid duplicate work) and correctness. (cherry picked from commit 6222e3f)
Linux `tmpfs` still doesn't support `user.` xattrs, and so for toolbox-like containers that use a (proper!) `tmpfs` for `/tmp` instead of having it be `overlayfs`, we need to ensure that we create our tempdir in `$workdir/tmp` - the workdir must have `user.` xattr support for the `cache/repo-build` anyways. Closes: coreos#2904 (cherry picked from commit e024805)
This was originally needed for `cosa sign` to force the re-import from the tarball to ensure that the detached metadata was present in the tmp repo after signing (3b192cc). But now with the switch to `ostree container`, we operate directly on the tmp repo and re-export the signed commit, so this is no longer needed. This also fixes `cosa generate-hashlist` also setting `force`, which I don't think it actually needed. (cherry picked from commit ccb4b73)
Prep for next patch. (cherry picked from commit 68f31d5)
`flufl` requires specifying lock lifetimes, i.e. how long the lock should be considered valid before others can also lock it. The default is 15s. This surprisingly means that a critical section could in fact be run concurrently if one of the locks takes more than 15s. I don't think we want to use lock lifetimes here, so let's make sure we always pass in a large number. The intended use case is that your program doesn't deadlock, but I'd prefer we deadlock and have pipelines fail instead after timing out at the Jenkins level. This should fix the `NotLockedError: Already unlocked` error messages we've had in the FCOS pipeline after parallelizing `cosa generate-hashlist` and `cosa buildextend-qemu` (related: coreos/fedora-coreos-pipeline#554). What would be interesting would be lock timeouts instead: e.g. have `with Lock(...)` automatically throw if it still hasn't acquired the lock after X duration. That'd allow us to error out earlier than the Jenkins timeout in case of something taking much longer than it really should. We could do this now but it'd require creating the lock separately from using it in a context. Latest upstream `flufl` supports specifying this as part of the constructor so let's wait for that instead: https://gitlab.com/warsaw/flufl.lock/-/merge_requests/47 (cherry picked from commit 1151356)
|
@dustymabe: PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
@dustymabe: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
travier
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM from a quick review
|
CI is known broken for |
We recently did some golang migration [1] and happened to add the
cosa remote-sessionpieces [2] after that migration. This makesit not easy to backport to other branches. For a period of time
let's use the latest COSA for our COSA pod for multi-arch builds
instead of the versioned COSA. We will still use the versioned COSA
on the remote so the actual building will happened with the correctly
versioned COSA (substituted below and passed as an argument to
cosa remote-session create --image=). However we still need to supportsome of the new command line flags that were added to aid in the transition
to
cosa remote-session. This PR backports various pieces needed.[1] #2919
[2] #2979