Skip to content

AGENT-1303: Allow InfraEnv registration to use late binding#8059

Merged
openshift-merge-bot[bot] merged 3 commits intoopenshift:masterfrom
zaneb:infra-env-late-binding
Oct 18, 2025
Merged

AGENT-1303: Allow InfraEnv registration to use late binding#8059
openshift-merge-bot[bot] merged 3 commits intoopenshift:masterfrom
zaneb:infra-env-late-binding

Conversation

@zaneb
Copy link
Member

@zaneb zaneb commented Sep 24, 2025

In ABI, if no Cluster is present when registering the InfraEnv, register it without a cluster reference and allow it to be used for late-binding agents.

In practice the Cluster is always registered before the InfraEnv in the regular ABI flow. This change allows the InfraEnv registration to be used without the Cluster registration when the latter is expected to be performed by the UI.

The 'register' command has long since been split into separate
'registerCluster' and 'registerInfraEnv' phases, so the duplicate code
can be removed.
A failure to do a ListClusters call is different from successfully doing
the call and finding that there are no clusters registered. Handle the
two cases separately.
If there is no Cluster object present when registering an InfraEnv,
register it without a Cluster reference to allow for late binding.

There is no need to pass the openshiftVersion because an InfraEnv
without a version specified will always use the latest available OS
image (of which we only register one anyway).
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Sep 24, 2025
@openshift-ci-robot
Copy link

openshift-ci-robot commented Sep 24, 2025

@zaneb: This pull request references AGENT-1303 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

In ABI, if no Cluster is present when registering the InfraEnv, register it without a cluster reference and allow it to be used for late-binding agents.

In practice the Cluster is always registered before the InfraEnv in the regular ABI flow. This change allows the InfraEnv registration to be used without the Cluster registration when the latter is expected to be performed by the UI.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Sep 24, 2025
@openshift-ci
Copy link

openshift-ci bot commented Sep 24, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: zaneb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 24, 2025
@codecov
Copy link

codecov bot commented Sep 24, 2025

Codecov Report

❌ Patch coverage is 0% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.90%. Comparing base (cab6cc9) to head (252744e).
⚠️ Report is 43 commits behind head on master.

Files with missing lines Patch % Lines
cmd/agentbasedinstaller/client/main.go 0.00% 7 Missing ⚠️
cmd/agentbasedinstaller/register.go 0.00% 6 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8059      +/-   ##
==========================================
+ Coverage   73.68%   74.90%   +1.22%     
==========================================
  Files         400      403       +3     
  Lines       68657    73029    +4372     
==========================================
+ Hits        50587    54706    +4119     
- Misses      15356    15479     +123     
- Partials     2714     2844     +130     
Files with missing lines Coverage Δ
cmd/agentbasedinstaller/register.go 10.09% <0.00%> (-0.15%) ⬇️
cmd/agentbasedinstaller/client/main.go 0.00% <0.00%> (ø)

... and 34 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@zaneb
Copy link
Member Author

zaneb commented Sep 25, 2025

/retest-required

1 similar comment
@zaneb
Copy link
Member Author

zaneb commented Sep 26, 2025

/retest-required

if modelsCluster != nil {
log.Infof("Reference to cluster id: %s", modelsCluster.ID.String())
} else {
log.Info("No Cluster found, registering InfraEnv for late binding")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried a POC including this PR and openshift/installer#9960 This works fine such that now first the infra-env is created and cluster object is only attempted to be created when in case of OVE, user enters details such as name, basedomain, pull-secret etc from the first page and hits next button. At that time, assisted service does throw an error here because the infra-env was already created. -

Sep 30 19:04:47 localhost.localdomain service[3731]: time="2025-09-30T19:04:47Z" 
level=error msg="failed to create infraenv" 
func="github.com/openshift/assisted-service/internal/bminventory.(*bareMetalInventory).RegisterInfraEnvInternal.func1" 

file="/src/internal/bminventory/inventory.go:4983" cluster_id=f7861ee0-b07d-428a-b1c3-020c84ff3651 

error="ERROR: duplicate key value violates unique constraint \"infra_envs_pkey\" (SQLSTATE 23505)" 

go-id=324 pkg=Inventory request_id=16e21b82-9d86-424e-8a41-86c7ce3bd108

Sep 30 19:04:47 localhost.localdomain service[3731]: time="2025-09-30T19:04:47Z" 
level=error msg="Failed to register InfraEnv mycluster_infra-env with id f7861ee0-b07d-428a-b1c3-020c84ff3651. 
Error: ERROR: duplicate key value violates unique constraint \"infra_envs_pkey\" (SQLSTATE 23505)" 
func="github.com/openshift/assisted-service/internal/bminventory.(*bareMetalInventory).RegisterInfraEnvInternal"
file="/src/internal/bminventory/inventory.go:4993" 
cluster_id=f7861ee0-b07d-428a-b1c3-020c84ff3651 go-id=324 pkg=Inventory request_id=16e21b82-9d86-424e-8a41-86c7ce3bd108

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, it should first check if infra-env is already created

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agent-register-infraenv.service

Sep 30 18:51:47 localhost.localdomain agent-register-infraenv[4018]: time="2025-09-30T18:51:47Z" level=info msg="No Cluster found, registering InfraEnv for late binding"
Sep 30 18:51:47 localhost.localdomain agent-register-infraenv[4018]: time="2025-09-30T18:51:47Z" level=info msg="Registering infraenv"
Sep 30 18:51:47 localhost.localdomain podman[4001]: time="2025-09-30T18:51:47Z" level=info msg="No Cluster found, registering InfraEnv for late binding"
Sep 30 18:51:47 localhost.localdomain podman[4001]: time="2025-09-30T18:51:47Z" level=info msg="Registering infraenv"
Sep 30 18:51:47 localhost.localdomain podman[4001]: time="2025-09-30T18:51:47Z" level=info msg="Registered infraenv with id: f7861ee0-b07d-428a-b1c3-020c84ff3651"
Sep 30 18:51:47 localhost.localdomain agent-register-infraenv[4018]: time="2025-09-30T18:51:47Z" level=info msg="Registered infraenv with id: f7861ee0-b07d-428a-b1c3-020c84ff3651"```

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is expected with the current UI implementation, because in ABI we define the InfraEnv UUID at ISO generation time (it's passed as an env var to assisted-service). That shouldn't block merging of this PR because this one has no effect in practice by itself.

However, before merging openshift/installer#9960 we will need to update the UI to cope with this flow. To start with, having the UI check for the existence of an InfraEnv before creating a new one would allow you to get a bit further with the PoC.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pawanpinjarkar the main goal of the PoC is to support designing a solution for https://issues.redhat.com/browse/AGENT-1294, and thus identify a gap list - to prepare a breakdown for the epic. I also didn't expect that the UI worked out-of-the-box, and the good thing is that the PoC confirmed that - so another point for the gap list.
A quick hack to proceed with the exploration in the PoC is to disable the UI and manually simulate its behavior by directly interacting with AS after the agent-register-infraenv service (with the basic idea of supporting just one workflow, where instead of running the start-cluster-installation service then the agent-start-ui is executed (when requested)

@pawanpinjarkar
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 17, 2025
@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD 528a1a8 and 2 for PR HEAD 252744e in total

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD 75a5f7c and 1 for PR HEAD 252744e in total

@openshift-ci
Copy link

openshift-ci bot commented Oct 17, 2025

@zaneb: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/okd-scos-e2e-aws-ovn 252744e link false /test okd-scos-e2e-aws-ovn

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD 23fc26b and 0 for PR HEAD 252744e in total

@openshift-merge-bot openshift-merge-bot bot merged commit 181d75e into openshift:master Oct 18, 2025
21 of 22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants