AGENT-1302,OCPBUGS-61668: Merge interactive ignition into unconfigured-ignition#9941
Conversation
|
@zaneb: This pull request references AGENT-1302 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. This pull request references Jira Issue OCPBUGS-61668, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/jira refresh |
|
@zaneb: This pull request references AGENT-1302 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. This pull request references Jira Issue OCPBUGS-61668, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
3626c67 to
ef51e88
Compare
Expect to read the rendezvous-host.env template from a separate .template file. The template is placed there by openshift/installer#9941. Only fall back to expecting the .env file to be a template if the former is not present. This change is backward-compatible.
e8f1f7f to
e0a8a83
Compare
c20ba5d to
b1b6fcb
Compare
|
/retest |
2 similar comments
|
/retest |
|
/retest |
|
/retest-required |
|
/test e2e-agent-sno-ipv4-pxe |
b1b6fcb to
cfd6866
Compare
|
/test e2e-agent-compact-ipv4-appliance-diskimage |
|
The appliance tests is failing in the equinix setup, so hard to get a good signal. |
|
/test e2e-agent-compact-ipv4-appliance-diskimage |
| [Unit] | ||
| Description=Service that runs the Agent Installer UI | ||
| Wants=network-online.target assisted-service.service | ||
| Conflicts=agent-register-cluster.service agent-import-cluster.service |
There was a problem hiding this comment.
Any specific reason to remove also agent-import-cluster.service from the conflicts? That's a day2 add-node service, so the UI shouldn't be used in such case
There was a problem hiding this comment.
Conflicts are evaluated when systemd builds the transaction, but Conditions are only evaluated at runtime. So Conflicts are dangerous to use in combination with Condtions, because they could cause neither service to run.
The add-node ignition will not have /etc/assisted/interactive-ui, so that's enough.
There was a problem hiding this comment.
🤔 do you maybe mean that we should also review also our other services where this combo is present, relying only on the ConditionPathExists (for mutually exclusive situations, like the agent-import-cluster.service vs agent-start-ui.service?
There was a problem hiding this comment.
Doesn't hurt to check, but I believe none of those conflicting services are ever enabled+Wanted at the same time.
| ConditionPathExists=/etc/assisted/rendezvous-host.env | ||
| ConditionPathExists=/etc/assisted/rendezvous-host.env.template | ||
| ConditionPathExists=/etc/assisted/interactive-ui | ||
| ConditionPathExists=!/usr/local/bin/agent-tui |
There was a problem hiding this comment.
IIUC, this condition is required to allow enabling by the default the agent-extract-tui.service, so that it will not conflict during the regular workflow (which takes care of extracting the agent-tui artifacts from the initrd) - with the final intention of simplifying the unconfigured-ignition asset? If so a small comment could be useful here
There was a problem hiding this comment.
TBH it seemed self-explanatory to me that if we already have the agent-tui binary then we don't need to run a service to get it?
It already wouldn't happen in the regular workflow because we have always had ConditionPathExists=/etc/assisted/interactive-ui. So this condition isn't required at all. But one day when we enable the UI in the regular agent install, it will be one less bug that somebody has to fix.
This is described in the commit message for anybody who needs to know the context in future.
There was a problem hiding this comment.
At least for me it was not immediately evident, since the service was expected to run only once (and currently only when the UI is present). And btw, I think in future we should change the triggering condition for this service, detaching completely from the UI and binding it to a more explicit one (ie something like /etc/assisted/no-registry) indicating that the payload is immediately available for consumption (and the same could be used by the TUI to skip the release check).
Yeah the commit message was useful (previous link seems broken now, I think it's this one), even though also a comment wouldn't have hurt IMHO
There was a problem hiding this comment.
Right now the appliance has the registry, but I believe does not have the agent-tui, and we don't want it to use the agent-tui because we don't want it creating the rendezvous-host.env file - only the config-iso should do that. One day we might want the GUI to be available in the appliance, but by conditioning on interactive-ui we are punting until then.
| getRendezvousHostEnvTemplate(data, workflowType), | ||
| nodeZeroIP) | ||
| if err != nil { | ||
| panic(err) |
There was a problem hiding this comment.
I was expecting to return an error here rather than a panic
There was a problem hiding this comment.
The template that we are resolving is hard-coded here in the source. If that can't be resolved then it isn't due to bad input, just straight up bad code. We should fail as early and loudly as possible.
The function signature hasn't changed; it never returned an error.
| if err != nil { | ||
| return err | ||
| } | ||
| rendezvousHostFile := ignition.FileFromString(rendezvousHostEnvPath, "root", 0644, rendezvousHostData) |
There was a problem hiding this comment.
So there could be the case where both the /etc/assisted/rendezvous-host.env.template and /etc/assisted/rendezvous-host.env will be simultaneously present on the disk (with the template file always rendered)?
Given the new logic introduced in openshift/agent-installer-utils#156, in which cases it will be required to have also the non-template file, if it's ignored when the template is present? Maybe also here some comments could be useful to understand better the flow
There was a problem hiding this comment.
Keeping the template in the /etc/assisted/rendezvous-host.env file is for backwards compatibility so that we don't change the way --interactive works. Once we've switched to not passing --interactive we can delete all of the code in the case workflow.AgentWorkflowTypeInstallInteractiveDisconnected: section.
There was a problem hiding this comment.
Couldn't be immediately removed? AFAIK the rendezvousIP is always empty in the disconnected workflow (until now - the only expected integration will arrive from the SaaS UI that will generate the agent-config.yaml, something now captured)
There was a problem hiding this comment.
Since openshift/agent-installer-utils#156 landed, it probably could be removed. At the time I didn't want to create an ordering dependency with that PR, because that's a sure way to not get things landed, and right now I don't want to delay landing this in order to test any changes to the existing --interactive behaviour.
It doesn't hurt to have them.
Instead of just starting the UI container and leaving conmon to manage it (e.g. by restarting on failure), continue to manage it with systemd as we do for all other podman containers.
Rely on the /etc/assisted/interactive-ui file to enable the UI, and never disable the service altogether. Because Conflicts and Conditions are evaluated at different times (when building the transaction vs. when starting the service), it is not safe to combine them for the purposes of selecting which services to run.
Run this service whenever the UI is enabled (and agent-tui is not already present). In practice this is the same as when it runs now, but enables us to not make the list of enabled services specific to the interactive installer.
Always add the rendezvous-host.env template to the unconfigured-ignition, but use the .template suffix to avoid interfering with any tests on the existence of the env file. We can modify agent-tui to look for the template at this path prior to switching to the regular unconfigured-ignition.
Escape a literal $ with $$.
Previously when agent-tui filled this in as a template, an IPv6 rendezvous IP was not handled correctly. Also, the Rendezvous IP needs to be canonicalized when the installer fills in the template.
This way arguments can't get mixed up.
If the rendezvous IP is known, always create the rendezvous-host.env environment file in the unconfigured-ignition.
Even if the RendezvousIP is not specified directly, we can still calculate one from NMState data if it is provided. We already do this in the regular agent ignition, so do the same in the unconfigured ignition. There is no need to pass the host list because in the unconfigured ignition hosts will not have assigned roles, so there is no additional data there that can be used to improve the selection.
The unconfigured-ignition does not contain any auth keys. In the appliance, all of the auth settings are part of the config-image. Whatever settings are applied in the ignition get overwritten before any of the services that use them start. However, for the interactive install we must explicitly set the AuthType to "none" to prevent services looking for keys that are not actually generated.
We can enable this service unconditionally, because we already have a sentinel for disabling services related to the config-image.
bd25104 to
bfcc946
Compare
|
/verified by e2e-agent-compact-ipv4-appliance-diskimage, e2e-agent-compact-ipv4-iso-no-registry |
|
@zaneb: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@zaneb: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| ConditionPathExists=/etc/assisted/rendezvous-host.env | ||
| ConditionPathExists=/etc/assisted/rendezvous-host.env.template | ||
| ConditionPathExists=/etc/assisted/interactive-ui | ||
| ConditionPathExists=!/usr/local/bin/agent-tui |
There was a problem hiding this comment.
At least for me it was not immediately evident, since the service was expected to run only once (and currently only when the UI is present). And btw, I think in future we should change the triggering condition for this service, detaching completely from the UI and binding it to a more explicit one (ie something like /etc/assisted/no-registry) indicating that the payload is immediately available for consumption (and the same could be used by the TUI to skip the release check).
Yeah the commit message was useful (previous link seems broken now, I think it's this one), even though also a comment wouldn't have hurt IMHO
| if err != nil { | ||
| return err | ||
| } | ||
| rendezvousHostFile := ignition.FileFromString(rendezvousHostEnvPath, "root", 0644, rendezvousHostData) |
There was a problem hiding this comment.
Couldn't be immediately removed? AFAIK the rendezvousIP is always empty in the disconnected workflow (until now - the only expected integration will arrive from the SaaS UI that will generate the agent-config.yaml, something now captured)
| [Unit] | ||
| Description=Service that runs the Agent Installer UI | ||
| Wants=network-online.target assisted-service.service | ||
| Conflicts=agent-register-cluster.service agent-import-cluster.service |
There was a problem hiding this comment.
🤔 do you maybe mean that we should also review also our other services where this combo is present, relying only on the ConditionPathExists (for mutually exclusive situations, like the agent-import-cluster.service vs agent-start-ui.service?
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andfasano The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/lgtm |
fd30059
into
openshift:main
|
@zaneb: Jira Issue OCPBUGS-61668: Some pull requests linked via external trackers have merged: The following pull request, linked via external tracker, has not merged:
All associated pull requests must be merged or unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with Jira Issue OCPBUGS-61668 has not been moved to the MODIFIED state. This PR is marked as verified. If the remaining PRs listed above are marked as verified before merging, the issue will automatically be moved to VERIFIED after all of the changes from the PRs are available in an accepted nightly payload. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Fix included in accepted release 4.21.0-0.nightly-2025-11-13-042845 |
Ensure that even without passing the
--interactiveflag, the ignition created withagent create unconfigured-ignitioncontains everything needed to enable the interactive flow except for the sentinel file/etc/assisted/interactive-ui.Also ensure that when agent-tui fills in the template to supply the rendezvousIP, IPv6 addresses are handled correctly.