Skip to content

manifest: add firstboot support#1913

Open
lzap wants to merge 3 commits intoosbuild:mainfrom
lzap:firstboot1
Open

manifest: add firstboot support#1913
lzap wants to merge 3 commits intoosbuild:mainfrom
lzap:firstboot1

Conversation

@lzap
Copy link
Contributor

@lzap lzap commented Sep 30, 2025

customizations: add FirstbootOptions type

In order to create firstboot customizations, we need to define a new type
in the manifest package. This commit introduces the FirstbootOptions type
along with its associated methods and tests.

The type uses a common Go pattern for handling unions, allowing for different
customization options such as CustomFirstbootOptions, SatelliteFirstbootOptions,
and AAPFirstbootOptions.

Function FirstbootOptionsFromBP converts a Blueprint firstboot customization
to a manifest FirstbootOptions which is a slice of scripts.
manifest: add firstboot support

This commit introduces support for firstboot customizations in the
manifest package. It includes the ability to define and manage
firstboot scripts, create necessary systemd units, and ensure
proper execution of these scripts during the first boot of the
system.
test: add firstboot to all-customizations.json

Replaces: #1705

This is now ready for final reviews.

@lzap lzap requested a review from a team as a code owner September 30, 2025 15:40
@lzap lzap changed the title Firstboot1 manifest: add firstboot support Sep 30, 2025
Copy link
Member

@supakeen supakeen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a quick read through, there might be other or more things but this is what jumped out with my morning coffee.

Two things:

  1. You forgot to add Firstboot to OSCustomizations.
  2. The panic thing commit can be a separate PR since it's the first thing I saw when opening this PR and I thought 'wait this isnt firstboot' ;)

@lzap
Copy link
Contributor Author

lzap commented Oct 2, 2025

Resolved all problems, thanks.

You forgot to add Firstboot to OSCustomizations.

Not sure what you mean, there is no new type everything uses existing stages:

  • fsnode
  • cacerts
  • systemd unit

I see correct data in generated manifests but I need to test booting an image. Here is a snippet from Fedora:

https://gist.github.com/lzap/ad28760b96ddf56e64ef4852f33650c5

@lzap
Copy link
Contributor Author

lzap commented Oct 2, 2025

So I boot-tested an image and all files are in place, but it does not work since I made a wrong assumption that multiple Exec=-/path/to/script would be executed one after another even if a script returned non-zero. That is not the case, unit will exit immediately and the dash character will mean it will be only reported as success, but the chain of execution of commands is not completed.

Several solutions:

  • Each first boot has its own unit. But since the BP was designed to keep the execution order of the array, Want/After would be necessary.
  • Create a helper script that executes everything in order.

What do you prefer?

Copy link
Contributor

@avitova avitova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried very hard to find something, but this actually looks very nice. No objections.

@achilleas-k
Copy link
Member

  • Each first boot has its own unit. But since the BP was designed to keep the execution order of the array, Want/After would be necessary.
  • Create a helper script that executes everything in order.

Want/After feels cleaner, but there might be details there that might not make it work as expected (like the second script starting after the first script has started, but before the first script has finished).

@achilleas-k
Copy link
Member

So I boot-tested an image and all files are in place, but it does not work since I made a wrong assumption that multiple Exec=-/path/to/script would be executed one after another even if a script returned non-zero.

I did a quick test and it works the way we want it with Type=oneshot and ExecStart=....

[achilleas@osbuild ~]$ cat /home/achilleas/.local/share/systemd/user/test.service
[Unit]
Description=test

[Service]
Type=oneshot
ExecStart=echo "ONE"
ExecStart=-ls /doesnotexist
ExecStart=echo "FIN"

[Install]
WantedBy=graphical-session.target
[achilleas@osbuild ~]$ systemctl --user start test.service
[achilleas@osbuild ~]$ systemctl --user status test.service
○ test.service - test
     Loaded: loaded (/home/achilleas/.local/share/systemd/user/test.service; disabled; preset: disabled)
    Drop-In: /usr/lib/systemd/user/service.d
             └─10-timeout-abort.conf
     Active: inactive (dead)

Oct 13 21:30:10 osbuild.devel systemd[1041]: Starting test.service - test...
Oct 13 21:30:10 osbuild.devel echo[1347]: ONE
Oct 13 21:30:10 osbuild.devel ls[1348]: ls: cannot access '/doesnotexist': No such file or directory
Oct 13 21:30:10 osbuild.devel echo[1350]: FIN
Oct 13 21:30:10 osbuild.devel systemd[1041]: Finished test.service - test.
Oct 13 21:31:30 osbuild.devel systemd[1041]: Starting test.service - test...
Oct 13 21:31:30 osbuild.devel echo[1360]: ONE
Oct 13 21:31:30 osbuild.devel ls[1362]: ls: cannot access '/doesnotexist': No such file or directory
Oct 13 21:31:30 osbuild.devel echo[1364]: FIN
Oct 13 21:31:30 osbuild.devel systemd[1041]: Finished test.service - test.

@achilleas-k
Copy link
Member

So I boot-tested an image and all files are in place, but it does not work since I made a wrong assumption that multiple Exec=-/path/to/script would be executed one after another even if a script returned non-zero.

I did a quick test and it works the way we want it with Type=oneshot and ExecStart=....

[achilleas@osbuild ~]$ cat /home/achilleas/.local/share/systemd/user/test.service
[Unit]
Description=test

[Service]
Type=oneshot
ExecStart=echo "ONE"
ExecStart=-ls /doesnotexist
ExecStart=echo "FIN"

[Install]
WantedBy=graphical-session.target
[achilleas@osbuild ~]$ systemctl --user start test.service
[achilleas@osbuild ~]$ systemctl --user status test.service
○ test.service - test
     Loaded: loaded (/home/achilleas/.local/share/systemd/user/test.service; disabled; preset: disabled)
    Drop-In: /usr/lib/systemd/user/service.d
             └─10-timeout-abort.conf
     Active: inactive (dead)

Oct 13 21:30:10 osbuild.devel systemd[1041]: Starting test.service - test...
Oct 13 21:30:10 osbuild.devel echo[1347]: ONE
Oct 13 21:30:10 osbuild.devel ls[1348]: ls: cannot access '/doesnotexist': No such file or directory
Oct 13 21:30:10 osbuild.devel echo[1350]: FIN
Oct 13 21:30:10 osbuild.devel systemd[1041]: Finished test.service - test.
Oct 13 21:31:30 osbuild.devel systemd[1041]: Starting test.service - test...
Oct 13 21:31:30 osbuild.devel echo[1360]: ONE
Oct 13 21:31:30 osbuild.devel ls[1362]: ls: cannot access '/doesnotexist': No such file or directory
Oct 13 21:31:30 osbuild.devel echo[1364]: FIN
Oct 13 21:31:30 osbuild.devel systemd[1041]: Finished test.service - test.

EDIT: Oh and that's actually how you did it. Curious why it didn't work. 🤔

@thozza
Copy link
Member

thozza commented Oct 14, 2025

  • Each first boot has its own unit. But since the BP was designed to keep the execution order of the array, Want/After would be necessary.
  • Create a helper script that executes everything in order.

Want/After feels cleaner, but there might be details there that might not make it work as expected (like the second script starting after the first script has started, but before the first script has finished).

In addition, specifying the ordering of the custom first-boot script on an arbitrary other service is something that should be considered from the beginning. Installing a 3rd party service, enabling it and then having a custom first-boot script that does something after it is started does not sound like a far-fetched use case to me.

@lzap
Copy link
Contributor Author

lzap commented Oct 22, 2025

Curious why it didn't work. 🤔

Ha, I know why. The AAP command I was looking for had curl -s (silent) so I saw nothing in logs. I retested without silent option and with echo TOUCH too and it works as expected:

image

From the man page:

If more than one command is configured, the commands are invoked sequentially in the order they appear in the unit file. If one of the commands fails (and is not prefixed with "-"), other lines are not executed, and the unit is considered failed.

So I am going to just remove the -s option and rebase for re-reviews.

@lzap
Copy link
Contributor Author

lzap commented Oct 22, 2025

Rebased, reversed the marker file logic.

Copy link
Member

@achilleas-k achilleas-k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the duplication of the Blueprint structures in pkg/customizations/firstboot/ is a bit unnecessary. The pkg/customizations/ types are meant to be convenient internal representations of options that serve as intermediate types between blueprint customizations and osbuild stages. Sometimes, it's convenient to have something that very closely resembles the blueprint itself (like with users and groups), because that makes the most sense. In other cases, it might resemble the stage options closer, or be completely independent.

I think in this case, the Blueprint structures aren't convenient internally. They're designed to be convenient for user input, but the union types make it awkward to handle. So copying them from one set of union types to another isn't really giving us any benefit.

Looking at the implementations of the each functions, it looks like for every firstboot customization we basically generate an executable file and an exec line for the systemd unit (with the optional - prefix). Also certs for AAP and Satellite. It seems to me that these things would be perfect as an intermediate representation of the firstboot customizations. So what I'm imagining is the following:

  1. pkg/customizations/firstboot/ defines a type, Script that's basically
type Script struct {
    Filename      string
    Contents      string
    IgnoreFailure bool
    Certs         []string
}
  1. The FirstBootOptionsFromBP(), instead of copying the BP structs to identical ones inside images, implements analogues to the each functions (the ones that are currently in pkg/manifest/) that take blueprint.FirstBootCustomization and return []firstboot.Script.

  2. In pkg/manifest/, a function takes []firstboot.Script and produces a set of []fsnode.File, CA certs, and systemd unit create stage options (equivalent to what parse() does now, but I imagine with a simpler implementation).

This way, it would be a bit easier to (for example) define a firstboot script internally for something we want to define statically in an image type. Consider:

regFirstboot := firstboot.SatelliteFirstbootOptions{
    FirstbootCommonOptions: firstboot.FirstbootCommonOptions{
        Name:          "satellite",
        IgnoreFailure: true,
    },
    CACerts: []string{"cert1", "cert2"},
    Command: "#!/usr/bin/bash\ncurl https://sat.example.com/register",
}

certs, files, unit, err := parse(&firstboot.FirstbootOptions{
  Scripts: []firstboot.FirstbootOption{
      regFirstBoot,
  }
})

vs

regFirstboot := firstboot.Script{
    Filename: "satellite",
    Command: "#!/usr/bin/bash\ncurl https://sat.example.com/register",
    IgnoreFailure: true
    CACerts: []string{"cert1", "cert2"},
}

files, certs, unit, err := genFirstbootComponents([]firstboot.Script{regFirstBoot})

(minor difference, but more readable IMO).

@lzap
Copy link
Contributor Author

lzap commented Oct 29, 2025

I was struggling to understand the difference between the two structures and this really helped to sort out my understanding. Yes, this makes sense and after I refactored the code it looks so much nicer. Rebased, and fixed tests as well.

Copy link
Member

@achilleas-k achilleas-k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. This looks a lot nicer now. Some minor changes requested below.

@lzap
Copy link
Contributor Author

lzap commented Nov 11, 2025

Rebased, resolved all your comments thanks.

diff --git a/pkg/customizations/firstboot/firstboot.go b/pkg/customizations/firstboot/firstboot.go
index 6a7b04bf5..0f05f9b5a 100644
--- a/pkg/customizations/firstboot/firstboot.go
+++ b/pkg/customizations/firstboot/firstboot.go
@@ -4,8 +4,6 @@ import (
 	"errors"
 	"fmt"
 	"path/filepath"
-	"strings"
-	"text/template"
 
 	"github.com/osbuild/blueprint/pkg/blueprint"
 	"github.com/osbuild/images/pkg/shutil"
@@ -90,25 +88,6 @@ func (AAPFirstbootOptions) isFirstbootOption()       {}
 
 var ErrFirstbootAlreadySet = errors.New("firstboot customization already set")
 
-var tmplFirstbootAAP = `#!/usr/bin/bash
-curl -i --data {{ .HostConfigKey }} {{ .URL }}
-`
-
-func renderFirstboot(tmplStr string, data any) (string, error) {
-	tmpl, err := template.New("firstboot-unit").Parse(tmplStr)
-	if err != nil {
-		return "", fmt.Errorf("error parsing firstboot unit template: %w", err)
-	}
-
-	var result strings.Builder
-	err = tmpl.Execute(&result, data)
-	if err != nil {
-		return "", fmt.Errorf("error rendering firstboot unit: %w", err)
-	}
-
-	return result.String(), nil
-}
-
 // FirstbootOptionsFromBP converts a blueprint FirstbootCustomization to
 // FirstbootOptions. Validation is done in the blueprint package, so this function
 // assumes the input is valid, however, JSON unmarshalling errors are possible.
@@ -162,17 +141,10 @@ func FirstbootOptionsFromBP(bpFirstboot blueprint.FirstbootCustomization) (*Firs
 			}
 			aapDone = true
 
-			data := struct {
-				URL           string
-				HostConfigKey string
-			}{
-				URL:           shutil.Quote(aap.JobTemplateURL),
-				HostConfigKey: shutil.Quote("host_config_key=" + aap.HostConfigKey),
-			}
-			contents, err := renderFirstboot(tmplFirstbootAAP, data)
-			if err != nil {
-				return nil, err
-			}
+			contents := fmt.Sprintf("#!/usr/bin/bash\ncurl -i --data %s %s\n",
+				shutil.Quote("host_config_key="+aap.HostConfigKey),
+				shutil.Quote(aap.JobTemplateURL),
+			)
 
 			fo.Scripts = append(fo.Scripts, Script{
 				Filename:      nameFunc(aap.Name, "aap"),
diff --git a/pkg/manifest/firstboot.go b/pkg/manifest/firstboot.go
index bd9a0dfdd..35f8cacdb 100644
--- a/pkg/manifest/firstboot.go
+++ b/pkg/manifest/firstboot.go
@@ -10,11 +10,10 @@ import (
 	"github.com/osbuild/images/pkg/osbuild"
 )
 
-// parse processes the firstboot options and returns a list of CA certificates to
+// handleFirstbootOptions processes the firstboot options and returns a list of CA certificates to
 // include in the image, a list of file nodes to create the firstboot scripts, and
 // a systemd unit to run the scripts on first boot.
-// TODO RENAME THIS
-func parse(fbo *firstboot.FirstbootOptions) ([]string, []*fsnode.File, *osbuild.SystemdUnitCreateStageOptions, error) {
+func handleFirstbootOptions(fbo *firstboot.FirstbootOptions) ([]string, []*fsnode.File, *osbuild.SystemdUnitCreateStageOptions, error) {
 	if fbo == nil {
 		return nil, nil, nil, nil
 	}
diff --git a/pkg/manifest/firstboot_test.go b/pkg/manifest/firstboot_test.go
index e9017d4d9..32e5291a8 100644
--- a/pkg/manifest/firstboot_test.go
+++ b/pkg/manifest/firstboot_test.go
@@ -134,7 +134,7 @@ echo 'unnamed'`
   }
 }`
 
-	certs, files, unit, err := parse(fbo)
+	certs, files, unit, err := handleFirstbootOptions(fbo)
 	assert.NoError(t, err)
 
 	assert.Equal(t, []string{"cert1", "cert2", "cert3", "cert4"}, certs)
diff --git a/pkg/manifest/os.go b/pkg/manifest/os.go
index 9ebc9781a..b9090de19 100644
--- a/pkg/manifest/os.go
+++ b/pkg/manifest/os.go
@@ -661,9 +661,9 @@ func (p *OS) serialize() (osbuild.Pipeline, error) {
 		pipeline = prependStage(pipeline, osbuild.NewDracutConfStage(dracutConfConfig))
 	}
 
-	fbCerts, fbFiles, fbUnit, err := parse(p.OSCustomizations.Firstboot)
+	fbCerts, fbFiles, fbUnit, err := handleFirstbootOptions(p.OSCustomizations.Firstboot)
 	if err != nil {
-		panic(err)
+		return osbuild.Pipeline{}, err
 	}
 
 	if len(fbFiles) > 0 {

@lzap
Copy link
Contributor Author

lzap commented Nov 11, 2025

Ah missed one comment about the commit ordering, reordered and generating checksums now.

@github-actions
Copy link

This PR is stale because it had no activity for the past 30 days. Remove the "Stale" label or add a comment, otherwise this PR will be closed in 7 days.

@lzap
Copy link
Contributor Author

lzap commented Dec 16, 2025

I rebased, this is ready.

@github-actions github-actions bot removed the Stale label Dec 17, 2025
Copy link
Member

@supakeen supakeen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rebased, this is ready.

It's failing in CI with the following, which leads to the check host saying things were unsuccessful:

Dec 16 13:51:44 my-host (-first-.)[1092]: osbuild-custom-first-boot.service: Executing: /usr/local/bin/osbuild-first-.
Dec 16 13:51:45 my-host osbuild-first-.[1094]:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Dec 16 13:51:45 my-host osbuild-first-.[1094]:                                  Dload  Upload   Total   Spent    Left  Speed
Dec 16 13:51:45 my-host osbuild-first-.[1094]: [158B blob data]
Dec 16 13:51:45 my-host osbuild-first-.[1094]: curl: (7) Failed to connect to localhost port 443 after 7 ms: Could not connect to server
Dec 16 13:51:45 my-host systemd[1]: osbuild-custom-first-boot.service: Child 1092 belongs to osbuild-custom-first-boot.service.
Dec 16 13:51:45 my-host systemd[1]: osbuild-custom-first-boot.service: Main process exited, code=exited, status=7/NOTRUNNING
Dec 16 13:51:45 my-host systemd[1]: osbuild-custom-first-boot.service: Failed with result 'exit-code'.
❌ Status for all failed units
× osbuild-custom-first-boot.service
     Loaded: loaded (/usr/lib/systemd/system/osbuild-custom-first-boot.service; enabled; preset: disabled)
     Active: failed (Result: exit-code) since Tue 2025-12-16 14:52:28 GMT; 30s ago
 Invocation: 4ec8bc4876f64bd7b9ae839f90c64818
    Process: 1435 ExecStartPre=/usr/bin/rm /var/local/.osbuild-custom-first-boot (code=exited, status=0/SUCCESS)
    Process: 1440 ExecStart=/usr/local/bin/osbuild-first-. (code=exited, status=7)
    Process: 1458 ExecStart=/usr/local/bin/osbuild-first-. (code=exited, status=7)
   Main PID: 1458 (code=exited, status=7)
   Mem peak: 2.3M
        CPU: 33ms

@lzap
Copy link
Contributor Author

lzap commented Dec 17, 2025

It's failing in CI with the following

Yeah already on it thanks. Feel free to drop code comments tho.

@lzap
Copy link
Contributor Author

lzap commented Dec 17, 2025

Ha a small overlook in how I handle those script names when sanitizing the filenames revealed that if user creates same names for two or more scripts it can be a problem as they overwrite each other. Fixing. Boot tested it locally, it should work.

Now, I am planning to write an extensive host check for this feature, but I would really prefer to get this PR merged first: #2043 so I can write it in Go.

@lzap
Copy link
Contributor Author

lzap commented Jan 12, 2026

Thanks for reviews.

@lzap lzap requested a review from supakeen January 12, 2026 10:36
Copy link
Member

@thozza thozza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

In general, the code looks good.

I've added some comments inline. While I have reservations regarding the ordering of the firstboot unit files, which I raised on multiple occasions, I can live with the code as is (Alhough I don't want to be the one who fixes this aspect in the future, because it will be bad to change any hard-coded ordering out of fear that some users may depend on it and we don't want to break them).

The blocking issues from my PoV:

  • handleFirstbootOptions() should be moved to the osbuild package
  • The first commit assigns to osc.Firstboot, which is added by the second commit.

}

if c != nil && c.Firstboot != nil {
osc.Firstboot, err = firstboot.FirstbootOptionsFromBP(*c.Firstboot)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

osc.Firstboot is added by the next commit 12ce45cd6531f4b2b08e078862674b7d7e11df84, so at this point, it does not exist yet.

Copy link
Contributor Author

@lzap lzap Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I messed up big during one of the rebases, sorry. With the refactoring you suggested, I am gonna squash the two now.

Wants: []string{"network-online.target"},
After: []string{"network-online.target", "osbuild-first-boot.service"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I said it before, but I'll say it again, since this had been ignored every time.

I strongly believe that the option to order the firstboot script against other systemd services should be exposed to the user and not hard-coded (part of it may make sense).

The use case: If I want to add a custom firstboot script, the network may not be enought if I want to further configure a custom workload. It may mean waiting for some other custom service to be running, or even waiting for other firstboot script to finish. Since we are adding this as new customization, it may be good to consider it.

Also, I'd argue that not all firstboot scripts need to depend on the network being available. Our firstboot script that does RHN registration historically needed network for obvious reasons.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I strongly believe that the option to order the firstboot script against other systemd services should be exposed to the user and not hard-coded (part of it may make sense).

Firstboot customization was specifically designed independent of the underlying implementation. From the day one, this was meant as a simple oneshot script not to compete with other options that are still possible. If anyone wants to mess around with systemd units and ordering, they can use file customization.

Apologies if I ignored any of your comments, this was not the intention. I can go back and add more tightened integration with systemd, however, I remember that someone specifically asked that firstboot (or any customization for that matter) should be generic. @achilleas-k ?

lzap added 3 commits January 23, 2026 15:39
In order to create firstboot customizations, we need to define a new type
in the manifest package. This commit introduces the FirstbootOptions type
along with its associated methods and tests.

The type uses a common Go pattern for handling unions, allowing for different
customization options such as CustomFirstbootOptions, SatelliteFirstbootOptions,
and AAPFirstbootOptions.

Function FirstbootOptionsFromBP converts a Blueprint firstboot customization
to a manifest FirstbootOptions which is a slice of scripts.
Adds some firstboot into all customization.
Also, in case of ostree cacert was not present and it is a dependency
of firstboot, add it too.
@lzap
Copy link
Contributor Author

lzap commented Jan 23, 2026

Rebased, refactored as you suggested, merged the two commits into one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants