Skip to content

Conversation

@dulek
Copy link
Contributor

@dulek dulek commented Mar 24, 2022

This addresses latest refactoring in CAPO based on kubernetes-sigs/cluster-api-provider-openstack#1153.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 24, 2022
@openshift-ci openshift-ci bot requested review from mandre and mdbooth March 24, 2022 11:27
@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 24, 2022
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 6, 2022
@dulek
Copy link
Contributor Author

dulek commented Apr 6, 2022

Still WiP, it uses my local rebase as we'd need openshift/cluster-api-provider-openstack#230 first.

@dulek
Copy link
Contributor Author

dulek commented Apr 7, 2022

/retest

Okay, the problem isn't new, we got to investigate it, but it's there with or without this patch.

@dulek dulek changed the title WiP: Convert Machines directly to InstanceSpec Bug 2054701: Convert Machines directly to InstanceSpec Apr 7, 2022
@openshift-ci openshift-ci bot added bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. and removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Apr 7, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 7, 2022

@dulek: This pull request references Bugzilla bug 2054701, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.11.0) matches configured target release for branch (4.11.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @eurijon

Details

In response to this:

Bug 2054701: Convert Machines directly to InstanceSpec

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot requested a review from eurijon April 7, 2022 11:42
@dulek
Copy link
Contributor Author

dulek commented Apr 7, 2022

One fact, I haven't yet proved that this actually solves events issue.

@mandre
Copy link
Member

mandre commented Apr 8, 2022

/test e2e-openstack

@dulek
Copy link
Contributor Author

dulek commented Apr 8, 2022

/hold

Okay, this doesn't record CAPO events and I think I know why.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 8, 2022
@dulek
Copy link
Contributor Author

dulek commented Apr 8, 2022

/hold cancel

@mdbooth, this is ready for reviews and I believe the e2e-openstack failures are unrelated. The events should work now.

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 8, 2022
@dulek
Copy link
Contributor Author

dulek commented Apr 8, 2022

I reorganized imports back to the initial state after my IDE tried to be smarter than us.

capov1 "sigs.k8s.io/cluster-api-provider-openstack/api/v1beta1"
"sigs.k8s.io/cluster-api-provider-openstack/pkg/cloud/services/compute"
clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 😀

Comment on lines 58 to 61
networkService, err := networking.NewService(&scope.Scope{
ProviderClient: osc.provider,
ProviderClientOpts: clientOptsForCloud(osc.cloud),
Logger: ctrl.Log.WithName("capo-network")})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make the scope generation common between getCompute and getNetwork?

func (osc *openStackContext) scope() *scope.Scope {
  &scope.Scope{...}
}

I also think the log name should be the same from both, as it's really describing the controller which is the source of the log message. Might call it cluster-api-provider-openstack rather than machine-api-provider-openstack to differentiate, though. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I felt current names are weird, but kept them.

dulek added 2 commits April 8, 2022 17:01
CreateInstance, DeleteInstance and some other methods in CAPO got
converted to use InstanceSpec instead of Machine objects as arguments.
This commit adapts MAPO for that, effectively replacing
mapov1.Machine->capov1.Machine conversion with
mapov1.Machine->capov1.InstanceSpec.

A bit of other changes related to the CAPO bump were needed too.
By default CAPO will use FakeRecorder for its events. We got to
initialize it with the MAPO recorder and this commit does so.
@mdbooth
Copy link
Contributor

mdbooth commented Apr 11, 2022

/retest-required

@mdbooth
Copy link
Contributor

mdbooth commented Apr 11, 2022

/test e2e-openstack

1 similar comment
@EmilienM
Copy link
Member

/test e2e-openstack

@mdbooth
Copy link
Contributor

mdbooth commented Apr 12, 2022

/test e2e-openstack

1 similar comment
@mdbooth
Copy link
Contributor

mdbooth commented Apr 12, 2022

/test e2e-openstack

@EmilienM
Copy link
Member

/retest
deprovision failed, just looking if it's consistent.

@EmilienM
Copy link
Member

/approve

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 12, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dulek, EmilienM

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 12, 2022
@EmilienM
Copy link
Member

/retest

@mdbooth
Copy link
Contributor

mdbooth commented Apr 12, 2022

/lgtm
/hold

@openshift-ci openshift-ci bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged. labels Apr 12, 2022
@EmilienM
Copy link
Member

I'm debugging the deprovision issue here: #37

@EmilienM
Copy link
Member

Deprovision fails because some resources fail to be destroyed by the installer: https://paste.opendev.org/show/bgXseRNEjdrdaoGdq09K/
The logs come from a cluster using that PR.

@EmilienM
Copy link
Member

The staled ports with this PR: https://paste.opendev.org/show/bSKOgUtbphAicEXZ1uln/

@EmilienM
Copy link
Member

Yeah I found the issue, somehow the worker ports aren't tagged with openshiftClusterID anymore and this is required by the installer to remove these ports before the workers servers are removed.

@EmilienM
Copy link
Member

@EmilienM
Copy link
Member

From Neutron server logs, the port was created like this:

2022-04-12 20:06:48.576 18 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): CheckRevisionNumberCommand(name=0dcc3f34-ff3b-4dab-bfe8-9714ca7400e6, resource={'id': '0dcc3f34-ff3b-4dab-bfe8-9714ca7400e6', 'name': '887wp37w-39327-7kfgw-worker-0-2hzs9-0', 'network_id': 'a2fba078-1e58-4321-ae96-ad74e92bad9b', 'tenant_id': 'fc706a2473b64d93bb1f13493d80e330', 'mac_address': 'fa:16:3e:cb:28:60', 'admin_state_up': True, 'status': 'DOWN', 'device_id': 'fde08fdc-388b-470b-8e0f-628c6bf7ff8e', 'device_owner': 'compute:central', 'fixed_ips': [{'subnet_id': 'a93e4b64-8a0f-4ea3-9109-17e6c5818067', 'ip_address': '10.0.0.59'}], 'allowed_address_pairs': [{'mac_address': 'fa:16:3e:cb:28:60', 'ip_address': '10.0.0.5'}, {'mac_address': 'fa:16:3e:cb:28:60', 'ip_address': '10.0.0.7'}], 'extra_dhcp_opts': [], 'security_groups': ['b05af4fe-f33e-4e79-b9eb-af35e371107e'], 'description': 'Created by cluster-api-provider-openstack cluster openshift-machine-api-887wp37w-39327-7kfgw', 'binding:vnic_type': 'normal', 'binding:profile': {}, 'binding:host_id': 'mecha-az1.ci.vexxhost.ca', 'binding:vif_type': 'ovs', 'binding:vif_details': {'port_filter': True}, 'qos_policy_id': None, 'port_security_enabled': True, 'dns_name': '887wp37w-39327-7kfgw-worker-0-2hzs9', 'dns_assignment': [{'ip_address': '10.0.0.59', 'hostname': '887wp37w-39327-7kfgw-worker-0-2hzs9', 'fqdn': '887wp37w-39327-7kfgw-worker-0-2hzs9.ci.vexxhost.ca.'}], 'resource_request': None, 'trunk_details': {'trunk_id': '656712f8-10dd-497f-95b9-6b1be6b6b38c', 'sub_ports': []}, 'ip_allocation': 'immediate', 'tags': ['cluster-api-provider-openstack', 'openshift-machine-api-887wp37w-39327-7kfgw'], 'created_at': '2022-04-12T19:11:35Z', 'updated_at': '2022-04-12T20:06:48Z', 'revision_number': 6, 'project_id': 'fc706a2473b64d93bb1f13493d80e330', 'network': {'id': 'a2fba078-1e58-4321-ae96-ad74e92bad9b', 'name': '887wp37w-39327-7kfgw-openshift', 'tenant_id': 'fc706a2473b64d93bb1f13493d80e330', 'admin_state_up': True, 'mtu': 1242, 'status': 'ACTIVE', 'subnets': ['a93e4b64-8a0f-4ea3-9109-17e6c5818067'], 'shared': False, 'availability_zone_hints': [], 'availability_zones': [], 'ipv4_address_scope': None, 'ipv6_address_scope': None, 'router:external': False, 'vlan_transparent': None, 'description': 'Created By OpenShift Installer', 'qos_policy_id': None, 'port_security_enabled': True, 'dns_domain': '', 'l2_adjacency': True, 'tags': ['887wp37w-39327-7kfgw-primaryClusterNetwork', 'openshiftClusterID=887wp37w-39327-7kfgw'], 'created_at': '2022-04-12T19:01:22Z', 'updated_at': '2022-04-12T19:01:29Z', 'revision_number': 3, 'project_id': 'fc706a2473b64d93bb1f13493d80e330', 'provider:network_type': 'geneve', 'provider:physical_network': None, 'provider:segmentation_id': 10}}, resource_type=ports, if_exists=True) do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84

Something removed the tag later and it was re-updated:

2022-04-12 20:06:53.343 22 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): CheckRevisionNumberCommand(name=0dcc3f34-ff3b-4dab-bfe8-9714ca7400e6, resource={'id': '0dcc3f34-ff3b-4dab-bfe8-9714ca7400e6', 'name': '887wp37w-39327-7kfgw-worker-0-2hzs9-0', 'network_id': 'a2fba078-1e58-4321-ae96-ad74e92bad9b', 'tenant_id': 'fc706a2473b64d93bb1f13493d80e330', 'mac_address': 'fa:16:3e:cb:28:60', 'admin_state_up': True, 'status': 'DOWN', 'device_id': '', 'device_owner': '', 'fixed_ips': [{'subnet_id': 'a93e4b64-8a0f-4ea3-9109-17e6c5818067', 'ip_address': '10.0.0.59'}], 'allowed_address_pairs': [{'mac_address': 'fa:16:3e:cb:28:60', 'ip_address': '10.0.0.5'}, {'mac_address': 'fa:16:3e:cb:28:60', 'ip_address': '10.0.0.7'}], 'extra_dhcp_opts': [], 'security_groups': ['b05af4fe-f33e-4e79-b9eb-af35e371107e'], 'description': 'Created by cluster-api-provider-openstack cluster openshift-machine-api-887wp37w-39327-7kfgw', 'binding:vnic_type': 'normal', 'binding:profile': {}, 'binding:host_id': '', 'binding:vif_type': 'unbound', 'binding:vif_details': {}, 'qos_policy_id': None, 'port_security_enabled': True, 'dns_name': '', 'dns_assignment': [{'ip_address': '10.0.0.59', 'hostname': 'host-10-0-0-59', 'fqdn': 'host-10-0-0-59.ci.vexxhost.ca.'}], 'resource_request': None, 'trunk_details': {'trunk_id': '656712f8-10dd-497f-95b9-6b1be6b6b38c', 'sub_ports': []}, 'ip_allocation': 'immediate', 'tags': ['cluster-api-provider-openstack', 'openshift-machine-api-887wp37w-39327-7kfgw'], 'created_at': '2022-04-12T19:11:35Z', 'updated_at': '2022-04-12T20:06:52Z', 'revision_number': 7, 'project_id': 'fc706a2473b64d93bb1f13493d80e330', 'network': {'id': 'a2fba078-1e58-4321-ae96-ad74e92bad9b', 'name': '887wp37w-39327-7kfgw-openshift', 'tenant_id': 'fc706a2473b64d93bb1f13493d80e330', 'admin_state_up': True, 'mtu': 1242, 'status': 'ACTIVE', 'subnets': ['a93e4b64-8a0f-4ea3-9109-17e6c5818067'], 'shared': False, 'availability_zone_hints': [], 'availability_zones': [], 'ipv4_address_scope': None, 'ipv6_address_scope': None, 'router:external': False, 'vlan_transparent': None, 'description': 'Created By OpenShift Installer', 'qos_policy_id': None, 'port_security_enabled': True, 'dns_domain': '', 'l2_adjacency': True, 'tags': ['887wp37w-39327-7kfgw-primaryClusterNetwork', 'openshiftClusterID=887wp37w-39327-7kfgw'], 'created_at': '2022-04-12T19:01:22Z', 'updated_at': '2022-04-12T19:01:29Z', 'revision_number': 3, 'project_id': 'fc706a2473b64d93bb1f13493d80e330', 'provider:network_type': 'geneve', 'provider:physical_network': None, 'provider:segmentation_id': 10}}, resource_type=ports, if_exists=True) do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84

@EmilienM
Copy link
Member

I created a cluster with that PR and I could reproduce the issue, see the worker port:

openstack port show 3r9n99h7-39327-g8hwd-worker-0-7bxx4-0
+-------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| Field                   | Value                                                                                                                               |
+-------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| admin_state_up          | UP                                                                                                                                  |
| allowed_address_pairs   | ip_address='10.0.0.5', mac_address='fa:16:3e:25:36:24'                                                                              |
|                         | ip_address='10.0.0.7', mac_address='fa:16:3e:25:36:24'                                                                              |
| binding_host_id         | None                                                                                                                                |
| binding_profile         | None                                                                                                                                |
| binding_vif_details     | None                                                                                                                                |
| binding_vif_type        | None                                                                                                                                |
| binding_vnic_type       | normal                                                                                                                              |
| created_at              | 2022-04-13T02:20:00Z                                                                                                                |
| data_plane_status       | None                                                                                                                                |
| description             | Created by cluster-api-provider-openstack cluster openshift-machine-api-3r9n99h7-39327-g8hwd                                        |
| device_id               | 77a087bb-73c0-4490-9a89-b449df97ccb6                                                                                                |
| device_owner            | compute:central                                                                                                                     |
| device_profile          | None                                                                                                                                |
| dns_assignment          | fqdn='3r9n99h7-39327-g8hwd-worker-0-7bxx4.ci.vexxhost.ca.', hostname='3r9n99h7-39327-g8hwd-worker-0-7bxx4', ip_address='10.0.0.170' |
| dns_domain              | None                                                                                                                                |
| dns_name                | 3r9n99h7-39327-g8hwd-worker-0-7bxx4                                                                                                 |
| extra_dhcp_opts         |                                                                                                                                     |
| fixed_ips               | ip_address='10.0.0.170', subnet_id='a2033a90-3582-412e-8b89-3538ec57e250'                                                           |
| id                      | 32f34f0c-57ef-469b-b7dc-8bc13a13036a                                                                                                |
| ip_allocation           | immediate                                                                                                                           |
| mac_address             | fa:16:3e:25:36:24                                                                                                                   |
| name                    | 3r9n99h7-39327-g8hwd-worker-0-7bxx4-0                                                                                               |
| network_id              | 98a5cefc-15f0-4093-872b-f4dd503943ee                                                                                                |
| numa_affinity_policy    | None                                                                                                                                |
| port_security_enabled   | True                                                                                                                                |
| project_id              | fc706a2473b64d93bb1f13493d80e330                                                                                                    |
| propagate_uplink_status | None                                                                                                                                |
| qos_network_policy_id   | None                                                                                                                                |
| qos_policy_id           | None                                                                                                                                |
| resource_request        | None                                                                                                                                |
| revision_number         | 5                                                                                                                                   |
| security_group_ids      | f395bad8-bad5-49e6-b533-c2b4d0ae5f90                                                                                                |
| status                  | ACTIVE                                                                                                                              |
| tags                    | cluster-api-provider-openstack, openshift-machine-api-3r9n99h7-39327-g8hwd                                                          |
| trunk_details           | {'trunk_id': '51dccae4-8f2f-454b-a8b0-db6f43aede8c', 'sub_ports': []}                                                               |
| updated_at              | 2022-04-13T02:20:05Z                                                                                                                |
+-------------------------+-------------------------------------------------------------------------------------------------------------------------------------+

We're missing openshiftClusterID tag here.

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Apr 13, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 13, 2022

@dulek: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@mdbooth
Copy link
Contributor

mdbooth commented Apr 13, 2022

/lgtm
/hold cancel

@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Apr 13, 2022
@openshift-merge-robot openshift-merge-robot merged commit 044ab8d into openshift:master Apr 13, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Apr 13, 2022

@dulek: All pull requests linked via external trackers have merged:

Bugzilla bug 2054701 has been moved to the MODIFIED state.

Details

In response to this:

Bug 2054701: Convert Machines directly to InstanceSpec

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mdbooth mdbooth deleted the conversion branch April 13, 2022 10:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants