Skip to content

Conversation

@iamemilio
Copy link

@iamemilio iamemilio commented Mar 9, 2021

Allows you to specify how many ports to attach to each machine created for a given network.
This patch also includes some fixes to how ports are attached to nodes:

  • ensures no duplicate ports
  • ensures ports are not taken
  • ensures ports meet networking needs specified by user

Fixes: OSASINFRA-2364

@iamemilio iamemilio requested a review from adduarte March 9, 2021 17:24
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 9, 2021
@pierreprinetti
Copy link
Member

I know it feels like overhead, but I would be grateful if, when possible, you could add a link to a Jira card (or a BZ) to PRs, so that it's easier for the bypasser to get some context

@iamemilio
Copy link
Author

/retest

1 similar comment
@pierreprinetti
Copy link
Member

/retest

@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please ask for approval from iamemilio after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@iamemilio iamemilio changed the title [WIP] Port Count Port Count Mar 22, 2021
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 22, 2021
@iamemilio
Copy link
Author

don't retest until Nova is back up on vexxhost

@iamemilio iamemilio force-pushed the portCount branch 2 times, most recently from 06883a5 to 3d8472f Compare March 23, 2021 20:06
@iamemilio
Copy link
Author

/retest

portTags: net.PortTags,
vnicType: net.VNICType,
portSecurity: net.PortSecurity,
portName: fmt.Sprintf("%s-%d", name, i),
Copy link

@MaysaMacedo MaysaMacedo Mar 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure where the instance name is enforced, but could it have the following format <infra-id>-master-<n>? and consequently port name <infra-id>-master-<n>-<n>?

Copy link
Author

@iamemilio iamemilio Mar 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, I am not sure actually. Can CAPO manage control plane resources? If so, then this is a possibility. For workers the name scheme for machines is <cluster-id>-<machineset name>-<unique hash>

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, it does, so this could be a problem. The way I see it, there are 2 ways to handle this:

  1. Change the naming scheme of all instances to follow the <cluster-id>-<machineset name>-<unique hash> model.
  2. Only append a number at the end of a port's name if more than one port is requested, that way the only valid usage is for worker nodes (for now)

@MaysaMacedo I am not sure if either of these would have consequences for kuryr, but if either is acceptable, I would lean towards 1.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I thought about this, and while its ugly for now, I would rather just leave it and make it a consideration in a bz we are going to file to eventually have upstream parity with this feature based on this work happening upstream: kubernetes-sigs#778

We will fix this post FF by allowing the port names to be explicitly setable.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're working on moving away from relying on ports names on 4.8, so it's fine. I was just not sure is the name of that Port would be something like ostest-qxzjd-master-port-0-3 and if it that would be expected.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that would be a possibility for the master nodes. Its ugly, but we are going to fix it by letting you specify port names when we adopt kubernetes-sigs#778

@mdbooth
Copy link

mdbooth commented Mar 24, 2021

@iamemilio @mandre I just had a quick look in upstream CAPO and I didn't immediately see any SR-IOV support there.

I wonder if we should discuss this patch upstream first to ensure that:

  • I didn't miss anything
  • This approach will be acceptable there
  • There's nobody else working on an incompatible solution which hasn't landed yet

It would be good to at least validate design changes like this, even if we don't do a full upstream port ourselves.

@iamemilio
Copy link
Author

This approach will likely not be acceptable upstream unless they have support for OpenShift Operators. I would need to follow up with NFV to see if there is any upstream equivilent of the OpenShift SR-IOV Operator for upstream kubernetes.

Emilio Garcia added 2 commits March 25, 2021 13:41
SR-IOV may require a single port interface be attached to a node for
each workload (pod) that utilizes SR-IOV networking.
SR-IOV users need to be able to disable the port security. This feature
allows users to disable it in the machine spec for a given network or
subnet.
@iamemilio
Copy link
Author

/retest

1 similar comment
@iamemilio
Copy link
Author

/retest

Emilio Garcia added 2 commits March 26, 2021 12:35
In order to aviod potential races where other threads take the same
ports, we have to remove the re-use logic.

Fixes Bug 1943599
The directory where origin images are stored in registry.ci.openshift.org has
changed.
@iamemilio
Copy link
Author

/retest

1 similar comment
@iamemilio
Copy link
Author

/retest

@iamemilio
Copy link
Author

Tested in my local env, and this works fine. There seems to be an issue with the CI

@iamemilio
Copy link
Author

/retest

1 similar comment
@iamemilio
Copy link
Author

/retest

@openshift-ci
Copy link

openshift-ci bot commented Mar 29, 2021

@iamemilio: The following test failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-openstack 1ad7796 link /test e2e-openstack

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@iamemilio
Copy link
Author

Closing in favor of #173

@iamemilio iamemilio closed this Mar 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants