Bug 2054701: Convert Machines directly to InstanceSpec #33

dulek · 2022-03-24T11:27:09Z

This addresses latest refactoring in CAPO based on kubernetes-sigs/cluster-api-provider-openstack#1153.

dulek · 2022-04-06T16:18:46Z

Still WiP, it uses my local rebase as we'd need openshift/cluster-api-provider-openstack#230 first.

dulek · 2022-04-07T09:24:45Z

/retest

Okay, the problem isn't new, we got to investigate it, but it's there with or without this patch.

openshift-ci · 2022-04-07T11:42:23Z

@dulek: This pull request references Bugzilla bug 2054701, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug

bug is open, matching expected state (open)
bug target release (4.11.0) matches configured target release for branch (4.11.0)
bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @eurijon

Details

In response to this:

Bug 2054701: Convert Machines directly to InstanceSpec

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

dulek · 2022-04-07T12:07:26Z

One fact, I haven't yet proved that this actually solves events issue.

mandre · 2022-04-08T06:57:03Z

/test e2e-openstack

dulek · 2022-04-08T11:38:43Z

/hold

Okay, this doesn't record CAPO events and I think I know why.

dulek · 2022-04-08T11:49:33Z

/hold cancel

@mdbooth, this is ready for reviews and I believe the e2e-openstack failures are unrelated. The events should work now.

dulek · 2022-04-08T12:02:07Z

I reorganized imports back to the initial state after my IDE tried to be smarter than us.

mdbooth · 2022-04-08T12:04:49Z

pkg/machine/actuator.go

 	capov1 "sigs.k8s.io/cluster-api-provider-openstack/api/v1beta1"
 	"sigs.k8s.io/cluster-api-provider-openstack/pkg/cloud/services/compute"
-	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"



mdbooth · 2022-04-08T13:59:20Z

pkg/machine/utils.go

+		networkService, err := networking.NewService(&scope.Scope{
+			ProviderClient:     osc.provider,
+			ProviderClientOpts: clientOptsForCloud(osc.cloud),
+			Logger:             ctrl.Log.WithName("capo-network")})


Can we make the scope generation common between getCompute and getNetwork?

func (osc *openStackContext) scope() *scope.Scope { &scope.Scope{...} }

I also think the log name should be the same from both, as it's really describing the controller which is the source of the log message. Might call it cluster-api-provider-openstack rather than machine-api-provider-openstack to differentiate, though. What do you think?

Makes sense, I felt current names are weird, but kept them.

CreateInstance, DeleteInstance and some other methods in CAPO got converted to use InstanceSpec instead of Machine objects as arguments. This commit adapts MAPO for that, effectively replacing mapov1.Machine->capov1.Machine conversion with mapov1.Machine->capov1.InstanceSpec. A bit of other changes related to the CAPO bump were needed too.

By default CAPO will use FakeRecorder for its events. We got to initialize it with the MAPO recorder and this commit does so.

mdbooth · 2022-04-11T10:18:12Z

/retest-required

mdbooth · 2022-04-11T10:20:12Z

/test e2e-openstack

EmilienM · 2022-04-11T14:18:49Z

/test e2e-openstack

mdbooth · 2022-04-12T08:53:47Z

/test e2e-openstack

mdbooth · 2022-04-12T09:09:50Z

/test e2e-openstack

EmilienM · 2022-04-12T12:45:02Z

/retest
deprovision failed, just looking if it's consistent.

EmilienM · 2022-04-12T12:56:45Z

/approve

openshift-ci · 2022-04-12T12:58:11Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dulek, EmilienM

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [EmilienM]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

EmilienM · 2022-04-12T15:51:27Z

/retest

mdbooth · 2022-04-12T15:51:59Z

/lgtm
/hold

EmilienM · 2022-04-12T18:46:24Z

I'm debugging the deprovision issue here: #37

EmilienM · 2022-04-12T20:08:58Z

Deprovision fails because some resources fail to be destroyed by the installer: https://paste.opendev.org/show/bgXseRNEjdrdaoGdq09K/
The logs come from a cluster using that PR.

EmilienM · 2022-04-12T20:10:51Z

The staled ports with this PR: https://paste.opendev.org/show/bSKOgUtbphAicEXZ1uln/

EmilienM · 2022-04-12T20:12:06Z

Yeah I found the issue, somehow the worker ports aren't tagged with openshiftClusterID anymore and this is required by the installer to remove these ports before the workers servers are removed.

EmilienM · 2022-04-12T20:15:28Z

destroy logs: https://paste.opendev.org/show/b3tWyyOqL9YpgNPQkPn3/

EmilienM · 2022-04-12T20:28:14Z

From Neutron server logs, the port was created like this:

2022-04-12 20:06:48.576 18 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): CheckRevisionNumberCommand(name=0dcc3f34-ff3b-4dab-bfe8-9714ca7400e6, resource={'id': '0dcc3f34-ff3b-4dab-bfe8-9714ca7400e6', 'name': '887wp37w-39327-7kfgw-worker-0-2hzs9-0', 'network_id': 'a2fba078-1e58-4321-ae96-ad74e92bad9b', 'tenant_id': 'fc706a2473b64d93bb1f13493d80e330', 'mac_address': 'fa:16:3e:cb:28:60', 'admin_state_up': True, 'status': 'DOWN', 'device_id': 'fde08fdc-388b-470b-8e0f-628c6bf7ff8e', 'device_owner': 'compute:central', 'fixed_ips': [{'subnet_id': 'a93e4b64-8a0f-4ea3-9109-17e6c5818067', 'ip_address': '10.0.0.59'}], 'allowed_address_pairs': [{'mac_address': 'fa:16:3e:cb:28:60', 'ip_address': '10.0.0.5'}, {'mac_address': 'fa:16:3e:cb:28:60', 'ip_address': '10.0.0.7'}], 'extra_dhcp_opts': [], 'security_groups': ['b05af4fe-f33e-4e79-b9eb-af35e371107e'], 'description': 'Created by cluster-api-provider-openstack cluster openshift-machine-api-887wp37w-39327-7kfgw', 'binding:vnic_type': 'normal', 'binding:profile': {}, 'binding:host_id': 'mecha-az1.ci.vexxhost.ca', 'binding:vif_type': 'ovs', 'binding:vif_details': {'port_filter': True}, 'qos_policy_id': None, 'port_security_enabled': True, 'dns_name': '887wp37w-39327-7kfgw-worker-0-2hzs9', 'dns_assignment': [{'ip_address': '10.0.0.59', 'hostname': '887wp37w-39327-7kfgw-worker-0-2hzs9', 'fqdn': '887wp37w-39327-7kfgw-worker-0-2hzs9.ci.vexxhost.ca.'}], 'resource_request': None, 'trunk_details': {'trunk_id': '656712f8-10dd-497f-95b9-6b1be6b6b38c', 'sub_ports': []}, 'ip_allocation': 'immediate', 'tags': ['cluster-api-provider-openstack', 'openshift-machine-api-887wp37w-39327-7kfgw'], 'created_at': '2022-04-12T19:11:35Z', 'updated_at': '2022-04-12T20:06:48Z', 'revision_number': 6, 'project_id': 'fc706a2473b64d93bb1f13493d80e330', 'network': {'id': 'a2fba078-1e58-4321-ae96-ad74e92bad9b', 'name': '887wp37w-39327-7kfgw-openshift', 'tenant_id': 'fc706a2473b64d93bb1f13493d80e330', 'admin_state_up': True, 'mtu': 1242, 'status': 'ACTIVE', 'subnets': ['a93e4b64-8a0f-4ea3-9109-17e6c5818067'], 'shared': False, 'availability_zone_hints': [], 'availability_zones': [], 'ipv4_address_scope': None, 'ipv6_address_scope': None, 'router:external': False, 'vlan_transparent': None, 'description': 'Created By OpenShift Installer', 'qos_policy_id': None, 'port_security_enabled': True, 'dns_domain': '', 'l2_adjacency': True, 'tags': ['887wp37w-39327-7kfgw-primaryClusterNetwork', 'openshiftClusterID=887wp37w-39327-7kfgw'], 'created_at': '2022-04-12T19:01:22Z', 'updated_at': '2022-04-12T19:01:29Z', 'revision_number': 3, 'project_id': 'fc706a2473b64d93bb1f13493d80e330', 'provider:network_type': 'geneve', 'provider:physical_network': None, 'provider:segmentation_id': 10}}, resource_type=ports, if_exists=True) do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84

Something removed the tag later and it was re-updated:

2022-04-12 20:06:53.343 22 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): CheckRevisionNumberCommand(name=0dcc3f34-ff3b-4dab-bfe8-9714ca7400e6, resource={'id': '0dcc3f34-ff3b-4dab-bfe8-9714ca7400e6', 'name': '887wp37w-39327-7kfgw-worker-0-2hzs9-0', 'network_id': 'a2fba078-1e58-4321-ae96-ad74e92bad9b', 'tenant_id': 'fc706a2473b64d93bb1f13493d80e330', 'mac_address': 'fa:16:3e:cb:28:60', 'admin_state_up': True, 'status': 'DOWN', 'device_id': '', 'device_owner': '', 'fixed_ips': [{'subnet_id': 'a93e4b64-8a0f-4ea3-9109-17e6c5818067', 'ip_address': '10.0.0.59'}], 'allowed_address_pairs': [{'mac_address': 'fa:16:3e:cb:28:60', 'ip_address': '10.0.0.5'}, {'mac_address': 'fa:16:3e:cb:28:60', 'ip_address': '10.0.0.7'}], 'extra_dhcp_opts': [], 'security_groups': ['b05af4fe-f33e-4e79-b9eb-af35e371107e'], 'description': 'Created by cluster-api-provider-openstack cluster openshift-machine-api-887wp37w-39327-7kfgw', 'binding:vnic_type': 'normal', 'binding:profile': {}, 'binding:host_id': '', 'binding:vif_type': 'unbound', 'binding:vif_details': {}, 'qos_policy_id': None, 'port_security_enabled': True, 'dns_name': '', 'dns_assignment': [{'ip_address': '10.0.0.59', 'hostname': 'host-10-0-0-59', 'fqdn': 'host-10-0-0-59.ci.vexxhost.ca.'}], 'resource_request': None, 'trunk_details': {'trunk_id': '656712f8-10dd-497f-95b9-6b1be6b6b38c', 'sub_ports': []}, 'ip_allocation': 'immediate', 'tags': ['cluster-api-provider-openstack', 'openshift-machine-api-887wp37w-39327-7kfgw'], 'created_at': '2022-04-12T19:11:35Z', 'updated_at': '2022-04-12T20:06:52Z', 'revision_number': 7, 'project_id': 'fc706a2473b64d93bb1f13493d80e330', 'network': {'id': 'a2fba078-1e58-4321-ae96-ad74e92bad9b', 'name': '887wp37w-39327-7kfgw-openshift', 'tenant_id': 'fc706a2473b64d93bb1f13493d80e330', 'admin_state_up': True, 'mtu': 1242, 'status': 'ACTIVE', 'subnets': ['a93e4b64-8a0f-4ea3-9109-17e6c5818067'], 'shared': False, 'availability_zone_hints': [], 'availability_zones': [], 'ipv4_address_scope': None, 'ipv6_address_scope': None, 'router:external': False, 'vlan_transparent': None, 'description': 'Created By OpenShift Installer', 'qos_policy_id': None, 'port_security_enabled': True, 'dns_domain': '', 'l2_adjacency': True, 'tags': ['887wp37w-39327-7kfgw-primaryClusterNetwork', 'openshiftClusterID=887wp37w-39327-7kfgw'], 'created_at': '2022-04-12T19:01:22Z', 'updated_at': '2022-04-12T19:01:29Z', 'revision_number': 3, 'project_id': 'fc706a2473b64d93bb1f13493d80e330', 'provider:network_type': 'geneve', 'provider:physical_network': None, 'provider:segmentation_id': 10}}, resource_type=ports, if_exists=True) do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84

EmilienM · 2022-04-13T02:24:47Z

I created a cluster with that PR and I could reproduce the issue, see the worker port:

openstack port show 3r9n99h7-39327-g8hwd-worker-0-7bxx4-0
+-------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| Field                   | Value                                                                                                                               |
+-------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| admin_state_up          | UP                                                                                                                                  |
| allowed_address_pairs   | ip_address='10.0.0.5', mac_address='fa:16:3e:25:36:24'                                                                              |
|                         | ip_address='10.0.0.7', mac_address='fa:16:3e:25:36:24'                                                                              |
| binding_host_id         | None                                                                                                                                |
| binding_profile         | None                                                                                                                                |
| binding_vif_details     | None                                                                                                                                |
| binding_vif_type        | None                                                                                                                                |
| binding_vnic_type       | normal                                                                                                                              |
| created_at              | 2022-04-13T02:20:00Z                                                                                                                |
| data_plane_status       | None                                                                                                                                |
| description             | Created by cluster-api-provider-openstack cluster openshift-machine-api-3r9n99h7-39327-g8hwd                                        |
| device_id               | 77a087bb-73c0-4490-9a89-b449df97ccb6                                                                                                |
| device_owner            | compute:central                                                                                                                     |
| device_profile          | None                                                                                                                                |
| dns_assignment          | fqdn='3r9n99h7-39327-g8hwd-worker-0-7bxx4.ci.vexxhost.ca.', hostname='3r9n99h7-39327-g8hwd-worker-0-7bxx4', ip_address='10.0.0.170' |
| dns_domain              | None                                                                                                                                |
| dns_name                | 3r9n99h7-39327-g8hwd-worker-0-7bxx4                                                                                                 |
| extra_dhcp_opts         |                                                                                                                                     |
| fixed_ips               | ip_address='10.0.0.170', subnet_id='a2033a90-3582-412e-8b89-3538ec57e250'                                                           |
| id                      | 32f34f0c-57ef-469b-b7dc-8bc13a13036a                                                                                                |
| ip_allocation           | immediate                                                                                                                           |
| mac_address             | fa:16:3e:25:36:24                                                                                                                   |
| name                    | 3r9n99h7-39327-g8hwd-worker-0-7bxx4-0                                                                                               |
| network_id              | 98a5cefc-15f0-4093-872b-f4dd503943ee                                                                                                |
| numa_affinity_policy    | None                                                                                                                                |
| port_security_enabled   | True                                                                                                                                |
| project_id              | fc706a2473b64d93bb1f13493d80e330                                                                                                    |
| propagate_uplink_status | None                                                                                                                                |
| qos_network_policy_id   | None                                                                                                                                |
| qos_policy_id           | None                                                                                                                                |
| resource_request        | None                                                                                                                                |
| revision_number         | 5                                                                                                                                   |
| security_group_ids      | f395bad8-bad5-49e6-b533-c2b4d0ae5f90                                                                                                |
| status                  | ACTIVE                                                                                                                              |
| tags                    | cluster-api-provider-openstack, openshift-machine-api-3r9n99h7-39327-g8hwd                                                          |
| trunk_details           | {'trunk_id': '51dccae4-8f2f-454b-a8b0-db6f43aede8c', 'sub_ports': []}                                                               |
| updated_at              | 2022-04-13T02:20:05Z                                                                                                                |
+-------------------------+-------------------------------------------------------------------------------------------------------------------------------------+

We're missing openshiftClusterID tag here.

openshift-ci · 2022-04-13T09:51:55Z

@dulek: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

mdbooth · 2022-04-13T09:58:30Z

/lgtm
/hold cancel

openshift-ci · 2022-04-13T10:00:56Z

@dulek: All pull requests linked via external trackers have merged:

openshift/machine-api-provider-openstack#33

Bugzilla bug 2054701 has been moved to the MODIFIED state.

Details

In response to this:

Bug 2054701: Convert Machines directly to InstanceSpec

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 24, 2022

openshift-ci bot requested review from mandre and mdbooth March 24, 2022 11:27

openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 24, 2022

dulek force-pushed the conversion branch from 88d467d to 21b7f3c Compare April 6, 2022 16:00

openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 6, 2022

dulek force-pushed the conversion branch from 21b7f3c to 23c3f2d Compare April 6, 2022 16:18

Rebase CAPO locally

e129627

dulek force-pushed the conversion branch from 23c3f2d to fc7adb2 Compare April 7, 2022 11:39

dulek changed the title ~~WiP: Convert Machines directly to InstanceSpec~~ Bug 2054701: Convert Machines directly to InstanceSpec Apr 7, 2022

openshift-ci bot requested a review from eurijon April 7, 2022 11:42

openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 8, 2022

openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 8, 2022

dulek force-pushed the conversion branch from 30e4334 to 34674dd Compare April 8, 2022 12:01

dulek force-pushed the conversion branch from 34674dd to 7e0135e Compare April 8, 2022 13:33

mdbooth suggested changes Apr 8, 2022

View reviewed changes

openshift-ci bot assigned mdbooth Apr 8, 2022

dulek added 2 commits April 8, 2022 17:01

Initialize CAPO event recorder

38c21d3

By default CAPO will use FakeRecorder for its events. We got to initialize it with the MAPO recorder and this commit does so.

EmilienM mentioned this pull request Apr 11, 2022

machine/convert: reverse machineSpec.Ports order #36

Closed

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 12, 2022

openshift-ci bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged. labels Apr 12, 2022

Add missing Tags in conversion

46404d1

openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Apr 13, 2022

openshift-ci bot added lgtm Indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Apr 13, 2022

openshift-merge-robot merged commit 044ab8d into openshift:master Apr 13, 2022

mdbooth deleted the conversion branch April 13, 2022 10:01

Bug 2054701: Convert Machines directly to InstanceSpec #33

Bug 2054701: Convert Machines directly to InstanceSpec #33

Uh oh!

Conversation

dulek commented Mar 24, 2022

Uh oh!

dulek commented Apr 6, 2022

Uh oh!

dulek commented Apr 7, 2022

Uh oh!

openshift-ci bot commented Apr 7, 2022

Uh oh!

dulek commented Apr 7, 2022

Uh oh!

mandre commented Apr 8, 2022

Uh oh!

dulek commented Apr 8, 2022

Uh oh!

dulek commented Apr 8, 2022

Uh oh!

dulek commented Apr 8, 2022

Uh oh!

mdbooth Apr 8, 2022

Choose a reason for hiding this comment

Uh oh!

mdbooth Apr 8, 2022

Choose a reason for hiding this comment

Uh oh!

dulek Apr 8, 2022

Choose a reason for hiding this comment

Uh oh!

mdbooth commented Apr 11, 2022

Uh oh!

mdbooth commented Apr 11, 2022

Uh oh!

EmilienM commented Apr 11, 2022

Uh oh!

mdbooth commented Apr 12, 2022

Uh oh!

mdbooth commented Apr 12, 2022

Uh oh!

EmilienM commented Apr 12, 2022

Uh oh!

EmilienM commented Apr 12, 2022

Uh oh!

openshift-ci bot commented Apr 12, 2022

Uh oh!

EmilienM commented Apr 12, 2022

Uh oh!

mdbooth commented Apr 12, 2022

Uh oh!

EmilienM commented Apr 12, 2022

Uh oh!

EmilienM commented Apr 12, 2022

Uh oh!

EmilienM commented Apr 12, 2022

Uh oh!

EmilienM commented Apr 12, 2022

Uh oh!

EmilienM commented Apr 12, 2022

Uh oh!

EmilienM commented Apr 12, 2022

Uh oh!

EmilienM commented Apr 13, 2022

Uh oh!

openshift-ci bot commented Apr 13, 2022

Uh oh!

mdbooth commented Apr 13, 2022

Uh oh!

openshift-ci bot commented Apr 13, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants