
Conversation

@cgwalters
Collaborator

No description provided.

@bootc-bot bootc-bot bot requested a review from jeckersb October 2, 2025 18:09
@github-actions github-actions bot added the area/install and area/documentation labels Oct 2, 2025

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a significant amount of work related to composefs, including switching the fs-verity hashing algorithm from SHA256 to SHA512, adding a new workflow for building 'sealed' images with Unified Kernel Images (UKIs), and making the composefs backend a default feature. The changes are mostly consistent and well-structured, with good use of type aliases to improve code clarity.

However, I've found a few issues that need attention:

  • There's some dead and potentially buggy code in the new xtask for building sealed images.
  • A new Dockerfile contains a leftover command that should be removed.
  • The documentation for newly added CLI options and subcommands is incomplete.

Please see my detailed comments for suggestions on how to address these points.
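As a side note on the hashing change, an fs-verity digest can be computed with SHA-512 using the upstream fsverity-utils CLI (illustrative only; whether bootc shells out to this tool is not established here):

```bash
# Print the fs-verity digest of a file using SHA-512 rather than the
# default SHA-256 (requires fsverity-utils).
fsverity digest --hash-alg=sha512 /path/to/file
```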

Comment on lines +117 to +131
**--composefs-native**



**--insecure**



Default: false

**--bootloader**=*BOOTLOADER*



Default: grub

Severity: medium

The documentation for the new options --composefs-native, --insecure, and --bootloader is incomplete. Please add descriptions for these options to explain their purpose and usage.
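For illustration, completed entries might look like this (the descriptions below are assumptions inferred from this PR's discussion, not the project's actual documentation):

**--composefs-native**

Enable the native composefs backend for the installation.

**--insecure**

Skip fs-verity validation of the target image.

Default: false

**--bootloader**=*BOOTLOADER*

Which bootloader to install, e.g. `grub` or `systemd-boot`.

Default: grub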

| **bootc usr-overlay** | Add a transient writable overlayfs on `/usr` |
| **bootc install** | Install the running container to a target |
| **bootc container** | Operations which can be executed as part of a container build |
| **bootc composefs-finalize-staged** | |

Severity: medium

The description for the new subcommand bootc composefs-finalize-staged is missing. Please add a description to explain what this command does.

@cgwalters
Collaborator Author

Split prep work for this to #1665

@cgwalters cgwalters force-pushed the composefs-by-default branch from 837dc44 to cccc2f8 Compare October 6, 2025 14:10
@cgwalters
Collaborator Author

@jeckersb I ended up reworking this one to use the bootc-in-container approach for the sealing side, because it would be too annoying in our CI to have it be on the host (which is Ubuntu here).

@jeckersb
Collaborator

jeckersb commented Oct 6, 2025

I'm guessing this has something to do with me running `just build-sealed` inside of my toolbox, but I'm getting...

$ podman build -t localhost/bootc --build-arg=COMPOSEFS_FSVERITY=23e148ffffc8f37e2609bfe450bc945fa97e984a788d93fc8f93f19f7fc9ef9a9359c252969d0e12ac10caa7dd9a9427f9ad1c74696dae9be2b2575817386ab9 --build-arg=base=localhost/bootc-unsealed --secret=id=key,src=target/test-secureboot/db.key --secret=id=cert,src=target/test-secureboot/db.crt -f Dockerfile.cfsuki .
Error: failed to parse query parameter 'secrets': "[\"id=key,src=podman-build-secret3148315696\",\"id=cert,src=podman-build-secret3059655711\"]": rename /var/tmp/libpod_builder3803266185/build/podman-build-secret3148315696 /var/tmp/libpod_builder3803266185/podman-build-secret3148315696: no such file or directory
error: command exited with non-zero code `podman build -t localhost/bootc --build-arg=COMPOSEFS_FSVERITY=23e148ffffc8f37e2609bfe450bc945fa97e984a788d93fc8f93f19f7fc9ef9a9359c252969d0e12ac10caa7dd9a9427f9ad1c74696dae9be2b2575817386ab9 --build-arg=base=localhost/bootc-unsealed --secret=id=key,src=target/test-secureboot/db.key --secret=id=cert,src=target/test-secureboot/db.crt -f Dockerfile.cfsuki .`: 125
error: Recipe `build-sealed` failed on line 18 with exit code 1

@cgwalters
Collaborator Author

Hmm, I don't know what's up with that error. Offhand, it's perhaps something to do with the source bind path when it's relative? Does it work if you make it an absolute path? What does your wrapper for podman in the toolbox look like?

@jeckersb
Collaborator

jeckersb commented Oct 7, 2025

> Hmm, I don't know what's up with that error. Offhand, it's perhaps something to do with the source bind path when it's relative? Does it work if you make it an absolute path? What does your wrapper for podman in the toolbox look like?

Nope, still doesn't work with absolute paths:

⬢ [jeckersb@toolbx bootc]$ ls -l /var/home/jeckersb/git/bootc/target/test-secureboot/db.{crt,key}
-rw-r--r--. 1 jeckersb jeckersb 1854 Oct  6 15:54 /var/home/jeckersb/git/bootc/target/test-secureboot/db.crt
-rw-------. 1 jeckersb jeckersb 3272 Oct  6 15:54 /var/home/jeckersb/git/bootc/target/test-secureboot/db.key
⬢ [jeckersb@toolbx bootc]$ podman build -t localhost/bootc --build-arg=COMPOSEFS_FSVERITY=23e148ffffc8f37e2609bfe450bc945fa97e984a788d93fc8f93f19f7fc9ef9a9359c252969d0e12ac10caa7dd9a9427f9ad1c74696dae9be2b2575817386ab9 --build-arg=base=localhost/bootc-unsealed --secret=id=key,src=/var/home/jeckersb/git/bootc/target/test-secureboot/db.key --secret=id=cert,src=/var/home/jeckersb/git/bootc/target/test-secureboot/db.crt -f Dockerfile.cfsuki .
Error: failed to parse query parameter 'secrets': "[\"id=key,src=podman-build-secret2171653322\",\"id=cert,src=podman-build-secret2161445125\"]": rename /var/tmp/libpod_builder2211788443/build/podman-build-secret2171653322 /var/tmp/libpod_builder2211788443/podman-build-secret2171653322: no such file or directory

I don't have any wrapper for podman in my toolbox, but I conditionally set CONTAINER_CONNECTION to go over the podman socket. From .bashrc:

if [ -n "${TOOLBOX_PATH}" ]
then
    # Inside a toolbox: route podman commands to the host's API socket
    CONTAINER_CONNECTION=host
fi

and then...

$ podman system connection list
Name        URI                                       Identity    Default     ReadWrite
host        unix:///run/user/1000/podman/podman.sock              true        true
$ systemctl --user status podman.socket
● podman.socket - Podman API Socket
     Loaded: loaded (/usr/lib/systemd/user/podman.socket; enabled; preset: disabled)
     Active: active (listening) since Mon 2025-10-06 21:50:56 EDT; 13min ago
 Invocation: 0257bbab67374412a75a2c0b32b77018
   Triggers: ● podman.service
       Docs: man:podman-system-service(1)
     Listen: /run/user/1000/podman/podman.sock (Stream)
     CGroup: /user.slice/user-1000.slice/[email protected]/app.slice/podman.socket

I suspect there's just something weird going on with passing secrets across via podman-remote from inside of the toolbox. I'll poke at it more tomorrow.

Given the eyeball test, the changes look good to me. From CI it looks like it needs a `cargo fmt` and an update to the test-image pull check, since now we're pulling an image which it expects to be absent?

@jeckersb
Collaborator

jeckersb commented Oct 7, 2025

> I suspect there's just something weird going on with passing secrets across via podman-remote from inside of the toolbox. I'll poke at it more tomorrow.

Testing a stripped-down case this morning, it works fine for me to pass secrets inside of the toolbox over podman-remote to the host. I've also verified that if I run the original podman command directly on the host, it works properly.

So I'm thinking now that it has something to do with passing secrets over podman-remote in combination with multistage builds.

@jeckersb
Collaborator

jeckersb commented Oct 7, 2025

> I suspect there's just something weird going on with passing secrets across via podman-remote from inside of the toolbox. I'll poke at it more tomorrow.
>
> Testing a stripped-down case this morning, it works fine for me to pass secrets inside of the toolbox over podman-remote to the host. I've also verified that if I run the original podman command directly on the host, it works properly.
>
> So I'm thinking now that it has something to do with passing secrets over podman-remote in combination with multistage builds.

Ahhhhhh it's this - containers/podman#25314

I'll put in a separate PR to add that silly workaround to our `.dockerignore`
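For reference, the workaround is presumably along these lines (the exact pattern is an assumption based on the temp-file names in the error output above):

```
# Hypothetical .dockerignore addition for containers/podman#25314: don't filter
# out the temporary secret files podman-remote places in the build context.
!podman-build-secret*
```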

@cgwalters cgwalters force-pushed the composefs-by-default branch 5 times, most recently from 3dbc6dd to ee05675 Compare October 8, 2025 17:42
@cgwalters
Collaborator Author

cgwalters commented Oct 8, 2025

OK, it took me a bit too long to figure out that the reason we were getting `cargo: command not found` is that the default GHA Ubuntu runners install Rust via rustup for the default runner user, so it isn't available as root.

This will get naturally fixed when we move to using bcvk here.

However... for now I moved the outer portion of the sealing back to a shell script; we've already moved some of the inner portion into the Rust code.
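A minimal sketch of the workaround this implies, assuming the rustup toolchain lives under the runner user's home (not necessarily what our CI does):

```bash
# cargo is installed via rustup under the runner user's $HOME, so it's not on
# root's PATH; forward it explicitly when escalating.
sudo env "PATH=$HOME/.cargo/bin:$PATH" cargo --version
```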

@cgwalters cgwalters force-pushed the composefs-by-default branch from ee05675 to a58d0a3 Compare October 8, 2025 18:22
@cgwalters
Collaborator Author

OK though, unfortunately right now tmt's virtual provisioner doesn't support UEFI. That looks pretty easy to fix, but I'm a bit tempted to try out having `bcvk libvirt` replace the testcloud stuff.

@cgwalters cgwalters enabled auto-merge (rebase) October 8, 2025 19:04
@cgwalters cgwalters requested a review from jeckersb October 8, 2025 19:04
@cgwalters cgwalters disabled auto-merge October 8, 2025 19:38
@cgwalters cgwalters force-pushed the composefs-by-default branch 4 times, most recently from bb56008 to b70bee9 Compare October 9, 2025 13:36
This ensures we're SHA-512 across the board.

Signed-off-by: Colin Walters <[email protected]>
- Use bash strict mode more consistently
- Drop the error redirections, which can mask problems, as recommended by AI

Signed-off-by: Colin Walters <[email protected]>
@cgwalters cgwalters force-pushed the composefs-by-default branch 2 times, most recently from f500118 to 39fffbd Compare October 16, 2025 13:32
@cgwalters
Collaborator Author

> Is that the primary thing blocking this at this point?

Yeah I think so, filed teemtee/tmt#4203

That said, we can work around this by using bcvk to provision a system external to tmt. That's not hard, but the downside is that it's logic that'd need to be replicated into anything else that wants to use tmt.

@cgwalters cgwalters force-pushed the composefs-by-default branch from 39fffbd to 1c4b1d9 Compare October 16, 2025 13:47
@jeckersb
Collaborator

This looks good testing it out on my end. It looks like CI is failing because bootupd is missing in the ostree case. I think maybe the jobs are stepping on each other, both trying to use the `localhost/bootc` tag? Maybe the Justfile just needs to be updated to use `localhost/bootc-sealed` in the sealed case...
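A sketch of that idea (hypothetical; the real recipe would keep the existing build args and secrets):

```bash
# Tag the sealed image distinctly so parallel CI jobs don't both race on
# the localhost/bootc tag.
podman build -t localhost/bootc-sealed -f Dockerfile.cfsuki .
```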

@cgwalters cgwalters force-pushed the composefs-by-default branch from 1c4b1d9 to c5c0137 Compare October 16, 2025 14:37
@cgwalters cgwalters enabled auto-merge (rebase) October 16, 2025 14:47
@cgwalters
Collaborator Author

> This looks good testing it out on my end. It looks like CI is failing because bootupd is missing in the ostree case.

Yeah, I'd messed up the bootupd detection; fixed now.


@jeckersb jeckersb left a comment


👍; assuming the in-flight jobs pass, this just needs trivial validation cleanup.

- Change the install logic to detect UKIs and automatically
  enable composefs
- Change the install logic to detect absence of bootupd
  and default to installing systemd-boot
- Move sealing bits to the toplevel
- Add Justfile entrypoints
- Add basic end-to-end CI coverage (install + run) using
  our integration tests
- Change lints to ignore `/boot/EFI`

Signed-off-by: Colin Walters <[email protected]>
@cgwalters cgwalters force-pushed the composefs-by-default branch from c5c0137 to 77685f0 Compare October 16, 2025 15:20
@cgwalters
Collaborator Author

Holy 🐮, so I removed the backgrounding `&` on the `rm -rf` in the setup as it felt obscure to me, but... the "bootc ubuntu setup" step in some of these jobs is now over 1h30m! In other runs it's just 7-9m.

Of course, us having to clean up tons of garbage from the stock runners is ridiculous, and hopefully GH will implement #1669.

But let's not kill CI over this one; in the meantime I'm going to try going back to optimizing the provisioner...
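For context, the change in question looks roughly like this (the path is a hypothetical example of preinstalled runner content):

```bash
# Before: backgrounded cleanup; obscure, but the job doesn't block on slow I/O.
rm -rf /opt/hostedtoolcache &
# After: explicit and synchronous; on a slow runner disk this can dominate setup time.
rm -rf /opt/hostedtoolcache
```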

@cgwalters
Collaborator Author

OK no, it's just 2/4 jobs here that had some kind of pathological slowness going on, but we don't have enough timing information to have a good idea of what specifically that was. Working on a followup commit.

@cgwalters cgwalters merged commit f4c678e into bootc-dev:main Oct 16, 2025
36 checks passed
@cgwalters
Collaborator Author

There was some chat (internal) that this might have broken Anaconda bootc installs. If so, I have a theory as to why:

We might not be correctly detecting bootupd presence in the case where we're doing a to-filesystem install outside of a container. We have CI that covers that this installs, but not that it boots. Well, glancing at the CI job for this https://github.com/bootc-dev/bootc/actions/runs/18566247913/job/52928051266 I do see bootupd being run.

Are the target systems here being booted via UEFI or BIOS?

Can you link to a failing CI job? Is this only happening in rawhide with the bootc COPR? Does it also happen on c10s for example?

And this doesn't reproduce with the ostreecontainer flow?

@elkoniu

elkoniu commented Oct 21, 2025

Hi @cgwalters, thanks for checking this out. So I was testing images published by bootc, and the last one working fine was: https://download.copr.fedorainfracloud.org/results/rhcontainerbot/bootc/fedora-rawhide-x86_64/09691020-bootc/

I saw your changes in this commit, and I forced Anaconda to have all the binaries bootc looks for installed: `bootupctl` and `bootctl`.

The result is that after the `bootc install to-filesystem` command completes (without returning an error) and a reboot is performed, we land in the grub shell. My guess is that grub is not able to find a kernel to run, as in the past we had a similar issue when we messed up a `/boot` partition mount.
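(For what it's worth, from the grub shell you can check whether the Boot Loader Specification entries are even visible; the device name here is just an example:)

```
grub> ls (hd0,gpt2)/loader/entries
grub> blscfg
```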

@elkoniu

elkoniu commented Oct 21, 2025

This is the log from the manual install process triggered in the Anaconda runtime (fails to boot):

[anaconda root@fedora ~]# bootc install to-filesystem --stateroot=default --source-imgref=registry:quay.io/fedora-testing/fedora-bootc:rawhide-standard --target-imgref=registry:quay.io/fedora-testing/fedora-bootc:rawhide-standard /mnt/sysroot
Installing image: docker://registry:quay.io/fedora-testing/fedora-bootc:rawhide-standard
Initializing ostree layout
layers already present: 0; layers needed: 1 (925.8 MB)
Fetched layers: 882.90 MiB in 12 minutes (1.23 MiB/s)
Deploying container image: done (14 seconds)
Installing bootloader via bootupd
Installed: grub.cfg
Installed: bootuuid.cfg
Trimming root
.: 16 GiB (17193353216 bytes) trimmed
Finalizing filesystem root
Trimming boot
boot: 1.8 GiB (1906499584 bytes) trimmed
Finalizing filesystem boot
Installation complete!
[anaconda root@fedora ~]# bootc --version
bootc 1.8.0

@elkoniu

elkoniu commented Oct 21, 2025

I found some of my old install logs and the only difference I can see is:

Added 01_users.cfg
Added 10_blscfg.cfg
Added 14_menu_show_once.cfg
Added 30_uefi-firmware.cfg
Added 41_custom.cfg

But this is something bootupd-specific, right?

@cgwalters
Collaborator Author

Oh yes that's the regression, looking

@cgwalters
Collaborator Author

It's odd, I do see that output in our CI job here. What's the reproducer for this again? Does it need rhinstaller/anaconda#6298?

@elkoniu

elkoniu commented Oct 21, 2025

> It's odd, I do see that output in our CI job here. What's the reproducer for this again? Does it need rhinstaller/anaconda#6298?

Yes, I am running this on top of the ISO produced out of this PR.

I just did a small test and manually amended `/boot/grub2/grub.cfg` with:

### BEGIN 01_users.cfg ###
# Keep the comment for grub2-set-password
### BEGIN /etc/grub.d/01_users ###
if [ -f ${prefix}/user.cfg ]; then
  source ${prefix}/user.cfg
  if [ -n "${GRUB2_PASSWORD}" ]; then
    set superusers="root"
    export superusers
    password_pbkdf2 root ${GRUB2_PASSWORD}
  fi
fi
### END 01_users.cfg ###

### BEGIN 10_blscfg.cfg ###
blscfg
### END 10_blscfg.cfg ###

### BEGIN 14_menu_show_once.cfg ###
# Force the menu to be shown once, with a timeout of ${menu_show_once_timeout}
# if requested by ${menu_show_once_timeout} being set in the env.
if [ "${menu_show_once_timeout}" ]; then
  set timeout_style=menu
  set timeout="${menu_show_once_timeout}"
  unset menu_show_once_timeout
  save_env menu_show_once_timeout
fi
### END 14_menu_show_once.cfg ###

### BEGIN 30_uefi-firmware.cfg ###
if [ "$grub_platform" = "efi" ]; then
        menuentry 'UEFI Firmware Settings' $menuentry_id_option 'uefi-firmware' {
                fwsetup
        }
fi
### END 30_uefi-firmware.cfg ###

### BEGIN 41_custom.cfg ###
if [ -f $prefix/custom.cfg ]; then
  source $prefix/custom.cfg
fi
### END 41_custom.cfg ###

And now the deployed OS was able to boot.

@elkoniu

elkoniu commented Oct 21, 2025

To rebuild an ISO I am using this script:
https://github.com/rhinstaller/anaconda/blob/72d232b94032b6724fd8ad4443f3946370c05391/scripts/testing/rebuild_iso

Then on top of that I am updating the ISO with:
https://github.com/rhinstaller/anaconda/blob/72d232b94032b6724fd8ad4443f3946370c05391/scripts/testing/update_iso

I am not sure which version of bootc Lorax is using (it is part of the toolchain used to rebuild the ISO).

@elkoniu

elkoniu commented Oct 21, 2025

To run the ISO I am using qemu:

#!/bin/bash

set -e

DISK_NAME=mydisk.img

# Prepare
qemu-img create $DISK_NAME 20G

# Run
qemu-system-x86_64 -machine q35 \
                   -m 10G \
                   -accel kvm \
                   -cpu host \
                   -smp cores=2,threads=4 \
                   -boot menu=on,splash-time=3000 \
                   -vga virtio \
                   -net user,hostfwd=tcp::10022-:22 \
                   -net nic \
                   -drive format=raw,file=$DISK_NAME \
                   -cdrom result/iso/boot-updated.iso

# Clean up
rm -rfv $DISK_NAME

Now I'm starting to wonder if this isn't some BIOS vs UEFI story, with bootupd behaving differently on your setup and mine.
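The qemu invocation above boots legacy BIOS by default; to test the UEFI path instead, something like the following works (the OVMF firmware path varies by distro; the one shown is Fedora's):

```bash
# Boot the VM via UEFI by loading the OVMF firmware into the pflash device.
qemu-system-x86_64 -machine q35 -m 10G -accel kvm -cpu host \
    -drive if=pflash,format=raw,readonly=on,file=/usr/share/edk2/ovmf/OVMF_CODE.fd \
    -drive format=raw,file=mydisk.img \
    -cdrom result/iso/boot-updated.iso
```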

@abadger

abadger commented Oct 22, 2025

@cgwalters Simon de Vlieger tracked down the important missing piece as `blscfg`. What do you think? Is this something that Anaconda is doing wrong, or is this a regression in bootc code that we just need to wait for a fix for? (This is the only thing left before we can merge the PR implementing the bootc kickstart command.)

@cgwalters
Collaborator Author

It must be a bootc regression. I will look at this tomorrow.

@cgwalters
Collaborator Author

To explicitly close the loop here, I couldn't reproduce this failure except when using a really old image. Please ping if that's incorrect.

@elkoniu

elkoniu commented Oct 29, 2025

Thanks for checking this, @cgwalters. I can confirm that with the latest images it works fine (https://quay.io/repository/fedora/fedora-bootc?tab=tags&tag=rawhide) and the grub config is consistent.
