Skip to content

AGENT-1302,OCPBUGS-61668: Merge interactive ignition into unconfigured-ignition#9941

Merged
openshift-merge-bot[bot] merged 13 commits intoopenshift:mainfrom
zaneb:interactive-unconfigured
Nov 8, 2025
Merged

AGENT-1302,OCPBUGS-61668: Merge interactive ignition into unconfigured-ignition#9941
openshift-merge-bot[bot] merged 13 commits intoopenshift:mainfrom
zaneb:interactive-unconfigured

Conversation

@zaneb
Copy link
Member

@zaneb zaneb commented Sep 12, 2025

Ensure that even without passing the --interactive flag, the ignition created with agent create unconfigured-ignition contains everything needed to enable the interactive flow except for the sentinel file /etc/assisted/interactive-ui.

Also ensure that when agent-tui fills in the template to supply the rendezvousIP, IPv6 addresses are handled correctly.

@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Sep 12, 2025
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Sep 12, 2025

@zaneb: This pull request references AGENT-1302 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

This pull request references Jira Issue OCPBUGS-61668, which is invalid:

  • expected the bug to target the "4.21.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Ensure that even without passing the --interactive flag, the ignition created with agent create unconfigured-ignition contains everything needed to enable the interactive flow except for the sentinel file /etc/assisted/interactive-ui.

Also ensure that when agent-tui fills in the template to supply the rendezvousIP, IPv6 addresses are handled correctly.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from andfasano and rwsu September 12, 2025 11:01
@zaneb
Copy link
Member Author

zaneb commented Sep 12, 2025

/jira refresh

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label Sep 12, 2025
@openshift-ci-robot
Copy link
Contributor

openshift-ci-robot commented Sep 12, 2025

@zaneb: This pull request references AGENT-1302 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

This pull request references Jira Issue OCPBUGS-61668, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @mhanss

Details

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot removed the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Sep 12, 2025
@openshift-ci openshift-ci bot requested a review from mhanss September 12, 2025 11:05
@zaneb zaneb force-pushed the interactive-unconfigured branch from 3626c67 to ef51e88 Compare September 12, 2025 11:54
zaneb added a commit to zaneb/agent-installer-utils that referenced this pull request Sep 12, 2025
Expect to read the rendezvous-host.env template from a separate
.template file. The template is placed there by
openshift/installer#9941.

Only fall back to expecting the .env file to be a template if the former
is not present. This change is backward-compatible.
@zaneb zaneb force-pushed the interactive-unconfigured branch 2 times, most recently from e8f1f7f to e0a8a83 Compare September 12, 2025 12:43
@zaneb zaneb force-pushed the interactive-unconfigured branch 6 times, most recently from c20ba5d to b1b6fcb Compare September 23, 2025 23:20
@zaneb
Copy link
Member Author

zaneb commented Sep 25, 2025

/retest

2 similar comments
@zaneb
Copy link
Member Author

zaneb commented Sep 25, 2025

/retest

@zaneb
Copy link
Member Author

zaneb commented Sep 26, 2025

/retest

@rwsu
Copy link
Contributor

rwsu commented Sep 26, 2025

/retest-required

@zaneb
Copy link
Member Author

zaneb commented Sep 28, 2025

/test e2e-agent-sno-ipv4-pxe

@zaneb zaneb force-pushed the interactive-unconfigured branch from b1b6fcb to cfd6866 Compare October 3, 2025 04:03
@zaneb
Copy link
Member Author

zaneb commented Oct 5, 2025

/test e2e-agent-compact-ipv4-appliance-diskimage

@zaneb
Copy link
Member Author

zaneb commented Oct 6, 2025

The appliance tests is failing in the equinix setup, so hard to get a good signal.

@zaneb
Copy link
Member Author

zaneb commented Oct 6, 2025

/test e2e-agent-compact-ipv4-appliance-diskimage

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Nov 6, 2025
[Unit]
Description=Service that runs the Agent Installer UI
Wants=network-online.target assisted-service.service
Conflicts=agent-register-cluster.service agent-import-cluster.service
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any specific reason to remove also agent-import-cluster.service from the conflicts? That's a day2 add-node service, so the UI shouldn't be used in such case

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conflicts are evaluated when systemd builds the transaction, but Conditions are only evaluated at runtime. So Conflicts are dangerous to use in combination with Condtions, because they could cause neither service to run.

The add-node ignition will not have /etc/assisted/interactive-ui, so that's enough.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 do you maybe mean that we should also review also our other services where this combo is present, relying only on the ConditionPathExists (for mutually exclusive situations, like the agent-import-cluster.service vs agent-start-ui.service?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't hurt to check, but I believe none of those conflicting services are ever enabled+Wanted at the same time.

ConditionPathExists=/etc/assisted/rendezvous-host.env
ConditionPathExists=/etc/assisted/rendezvous-host.env.template
ConditionPathExists=/etc/assisted/interactive-ui
ConditionPathExists=!/usr/local/bin/agent-tui
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, this condition is required to allow enabling by the default the agent-extract-tui.service, so that it will not conflict during the regular workflow (which takes care of extracting the agent-tui artifacts from the initrd) - with the final intention of simplifying the unconfigured-ignition asset? If so a small comment could be useful here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH it seemed self-explanatory to me that if we already have the agent-tui binary then we don't need to run a service to get it?

It already wouldn't happen in the regular workflow because we have always had ConditionPathExists=/etc/assisted/interactive-ui. So this condition isn't required at all. But one day when we enable the UI in the regular agent install, it will be one less bug that somebody has to fix.

This is described in the commit message for anybody who needs to know the context in future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least for me it was not immediately evident, since the service was expected to run only once (and currently only when the UI is present). And btw, I think in future we should change the triggering condition for this service, detaching completely from the UI and binding it to a more explicit one (ie something like /etc/assisted/no-registry) indicating that the payload is immediately available for consumption (and the same could be used by the TUI to skip the release check).
Yeah the commit message was useful (previous link seems broken now, I think it's this one), even though also a comment wouldn't have hurt IMHO

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now the appliance has the registry, but I believe does not have the agent-tui, and we don't want it to use the agent-tui because we don't want it creating the rendezvous-host.env file - only the config-iso should do that. One day we might want the GUI to be available in the appliance, but by conditioning on interactive-ui we are punting until then.

getRendezvousHostEnvTemplate(data, workflowType),
nodeZeroIP)
if err != nil {
panic(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting to return an error here rather than a panic

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The template that we are resolving is hard-coded here in the source. If that can't be resolved then it isn't due to bad input, just straight up bad code. We should fail as early and loudly as possible.

The function signature hasn't changed; it never returned an error.

if err != nil {
return err
}
rendezvousHostFile := ignition.FileFromString(rendezvousHostEnvPath, "root", 0644, rendezvousHostData)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there could be the case where both the /etc/assisted/rendezvous-host.env.template and /etc/assisted/rendezvous-host.env will be simultaneously present on the disk (with the template file always rendered)?
Given the new logic introduced in openshift/agent-installer-utils#156, in which cases it will be required to have also the non-template file, if it's ignored when the template is present? Maybe also here some comments could be useful to understand better the flow

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping the template in the /etc/assisted/rendezvous-host.env file is for backwards compatibility so that we don't change the way --interactive works. Once we've switched to not passing --interactive we can delete all of the code in the case workflow.AgentWorkflowTypeInstallInteractiveDisconnected: section.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't be immediately removed? AFAIK the rendezvousIP is always empty in the disconnected workflow (until now - the only expected integration will arrive from the SaaS UI that will generate the agent-config.yaml, something now captured)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since openshift/agent-installer-utils#156 landed, it probably could be removed. At the time I didn't want to create an ordering dependency with that PR, because that's a sure way to not get things landed, and right now I don't want to delay landing this in order to test any changes to the existing --interactive behaviour.

zaneb added 13 commits November 7, 2025 09:55
It doesn't hurt to have them.
Instead of just starting the UI container and leaving conmon to manage
it (e.g. by restarting on failure), continue to manage it with systemd
as we do for all other podman containers.
Rely on the /etc/assisted/interactive-ui file to enable the UI, and
never disable the service altogether.

Because Conflicts and Conditions are evaluated at different times (when
building the transaction vs. when starting the service), it is not safe
to combine them for the purposes of selecting which services to run.
Run this service whenever the UI is enabled (and agent-tui is not
already present). In practice this is the same as when it runs now, but
enables us to not make the list of enabled services specific to the
interactive installer.
Always add the rendezvous-host.env template to the
unconfigured-ignition, but use the .template suffix to avoid interfering
with any tests on the existence of the env file. We can modify agent-tui
to look for the template at this path prior to switching to the regular
unconfigured-ignition.
Previously when agent-tui filled this in as a template, an IPv6
rendezvous IP was not handled correctly.

Also, the Rendezvous IP needs to be canonicalized when the installer
fills in the template.
This way arguments can't get mixed up.
If the rendezvous IP is known, always create the rendezvous-host.env
environment file in the unconfigured-ignition.
Even if the RendezvousIP is not specified directly, we can still
calculate one from NMState data if it is provided. We already do this in
the regular agent ignition, so do the same in the unconfigured ignition.

There is no need to pass the host list because in the unconfigured
ignition hosts will not have assigned roles, so there is no additional
data there that can be used to improve the selection.
The unconfigured-ignition does not contain any auth keys. In the
appliance, all of the auth settings are part of the config-image.
Whatever settings are applied in the ignition get overwritten before any
of the services that use them start.

However, for the interactive install we must explicitly set the AuthType
to "none" to prevent services looking for keys that are not actually
generated.
We can enable this service unconditionally, because we already have a
sentinel for disabling services related to the config-image.
@zaneb zaneb force-pushed the interactive-unconfigured branch from bd25104 to bfcc946 Compare November 7, 2025 01:16
@openshift-ci-robot openshift-ci-robot removed the verified Signifies that the PR passed pre-merge verification criteria label Nov 7, 2025
@zaneb
Copy link
Member Author

zaneb commented Nov 7, 2025

/verified by e2e-agent-compact-ipv4-appliance-diskimage, e2e-agent-compact-ipv4-iso-no-registry

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Nov 7, 2025
@openshift-ci-robot
Copy link
Contributor

@zaneb: This PR has been marked as verified by e2e-agent-compact-ipv4-appliance-diskimage,e2e-agent-compact-ipv4-iso-no-registry.

Details

In response to this:

/verified by e2e-agent-compact-ipv4-appliance-diskimage, e2e-agent-compact-ipv4-iso-no-registry

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 7, 2025

@zaneb: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-azure-ovn-resourcegroup cfd6866 link false /test e2e-azure-ovn-resourcegroup
ci/prow/e2e-agent-two-node-fencing-ipv4 bfcc946 link false /test e2e-agent-two-node-fencing-ipv4
ci/prow/okd-scos-e2e-aws-ovn bfcc946 link false /test okd-scos-e2e-aws-ovn

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Copy link
Contributor

@andfasano andfasano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

ConditionPathExists=/etc/assisted/rendezvous-host.env
ConditionPathExists=/etc/assisted/rendezvous-host.env.template
ConditionPathExists=/etc/assisted/interactive-ui
ConditionPathExists=!/usr/local/bin/agent-tui
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least for me it was not immediately evident, since the service was expected to run only once (and currently only when the UI is present). And btw, I think in future we should change the triggering condition for this service, detaching completely from the UI and binding it to a more explicit one (ie something like /etc/assisted/no-registry) indicating that the payload is immediately available for consumption (and the same could be used by the TUI to skip the release check).
Yeah the commit message was useful (previous link seems broken now, I think it's this one), even though also a comment wouldn't have hurt IMHO

if err != nil {
return err
}
rendezvousHostFile := ignition.FileFromString(rendezvousHostEnvPath, "root", 0644, rendezvousHostData)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't be immediately removed? AFAIK the rendezvousIP is always empty in the disconnected workflow (until now - the only expected integration will arrive from the SaaS UI that will generate the agent-config.yaml, something now captured)

[Unit]
Description=Service that runs the Agent Installer UI
Wants=network-online.target assisted-service.service
Conflicts=agent-register-cluster.service agent-import-cluster.service
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 do you maybe mean that we should also review also our other services where this combo is present, relying only on the ConditionPathExists (for mutually exclusive situations, like the agent-import-cluster.service vs agent-start-ui.service?

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Nov 7, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andfasano

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 7, 2025
@bfournie
Copy link
Contributor

bfournie commented Nov 7, 2025

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 7, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit fd30059 into openshift:main Nov 8, 2025
26 of 28 checks passed
@openshift-ci-robot
Copy link
Contributor

@zaneb: Jira Issue OCPBUGS-61668: Some pull requests linked via external trackers have merged:

The following pull request, linked via external tracker, has not merged:

All associated pull requests must be merged or unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with /jira refresh.

Jira Issue OCPBUGS-61668 has not been moved to the MODIFIED state.

This PR is marked as verified. If the remaining PRs listed above are marked as verified before merging, the issue will automatically be moved to VERIFIED after all of the changes from the PRs are available in an accepted nightly payload.

Details

In response to this:

Ensure that even without passing the --interactive flag, the ignition created with agent create unconfigured-ignition contains everything needed to enable the interactive flow except for the sentinel file /etc/assisted/interactive-ui.

Also ensure that when agent-tui fills in the template to supply the rendezvousIP, IPv6 addresses are handled correctly.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-merge-robot
Copy link
Contributor

Fix included in accepted release 4.21.0-0.nightly-2025-11-13-042845

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. verified Signifies that the PR passed pre-merge verification criteria

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants