@jcpowermac
Contributor

Alternative to #2558

@jcpowermac jcpowermac changed the title from "vsphere: platform none, move to base" to "Bug 1952358: vsphere: platform none, vmxnet3v4 fix move to base" Apr 30, 2021
@openshift-ci-robot openshift-ci-robot added the bugzilla/severity-urgent label (Referenced Bugzilla bug's severity is urgent for the branch this PR is targeting.) Apr 30, 2021
@openshift-ci-robot
Contributor

@jcpowermac: This pull request references Bugzilla bug 1952358, which is valid. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.8.0) matches configured target release for branch (4.8.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @mike-nguyen


In response to this:

Bug 1952358: vsphere: platform none, vmxnet3v4 fix move to base

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the bugzilla/valid-bug label (Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.) Apr 30, 2021
@yuqi-zhang
Contributor

It would also be good if we could at least manually verify this for none, but I am more in favour of this approach than #2558.

@jcpowermac
Contributor Author

It would also be good if we could at least manually verify this for none, but I am more in favour of this approach than #2558.

Sounds like a good idea, let me do that.

@jcpowermac
Contributor Author

/test e2e-vsphere
/test e2e-vsphere-upi

@jcpowermac
Contributor Author

As requested, I tested a `platform: none` install (on vSphere).
After the install, I upgraded a single guest, control-plane-0, to hardware version 17.

install-config.yaml snippet

apiVersion: v1
baseDomain: vmc.devcluster.openshift.com
metadata:
  name: jcupi 
platform:
  none: {}
pullSecret: |

$ oc debug node/control-plane-0
Starting pod/control-plane-0-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.19.4
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host

UDP tunnel offload is disabled, as expected

sh-4.4# ethtool -k ens192 | grep -v fixed
Features for ens192:
rx-checksumming: on
tx-checksumming: on
        tx-checksum-ip-generic: on
scatter-gather: on
        tx-scatter-gather: on
tcp-segmentation-offload: on
        tx-tcp-segmentation: on
        tx-tcp-mangleid-segmentation: off
        tx-tcp6-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
receive-hashing: on
highdma: on
tx-udp_tnl-segmentation: off
tx-udp_tnl-csum-segmentation: off
rx-gro-list: off
tx-nocache-copy: off
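
The same check can be done non-interactively. This is a minimal sketch, assuming the `ethtool -k` output format shown above; the helper name `udp_tnl_offload_is_off` is hypothetical and not part of this PR.

```shell
# Hypothetical helper (not part of this PR): reads `ethtool -k <iface>` output
# on stdin and succeeds only when both UDP tunnel offload features are "off".
udp_tnl_offload_is_off() {
  awk '
    /^[ \t]*tx-udp_tnl-segmentation:/      { seg = $2 }
    /^[ \t]*tx-udp_tnl-csum-segmentation:/ { csum = $2 }
    END { exit !(seg == "off" && csum == "off") }
  '
}

# Usage on a node (ens192 is the interface from the output above):
#   ethtool -k ens192 | udp_tnl_offload_is_off && echo "UDP tunnel offload disabled"
```

The helper fails when either feature is missing from the input, so it also catches an interface that does not report these features at all.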

Dispatch file, as expected

sh-4.4# cat /etc/NetworkManager/dispatcher.d/99-vsphere-disable-tx-udp-tnl 
#!/bin/bash
# Workaround:
# https://bugzilla.redhat.com/show_bug.cgi?id=1941714
# https://bugzilla.redhat.com/show_bug.cgi?id=1935539

driver=$(nmcli -t -m tabular -f general.driver dev show "${DEVICE_IFACE}")

if [[ "$2" == "up" && "${driver}" == "vmxnet3" ]]; then
  logger -s "99-vsphere-disable-tx-udp-tnl triggered by ${2} on device ${DEVICE_IFACE}."
  ethtool -K ${DEVICE_IFACE} tx-udp_tnl-segmentation off
  ethtool -K ${DEVICE_IFACE} tx-udp_tnl-csum-segmentation off
fi
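
To make the trigger condition in the dispatcher file explicit, here is a small sketch that isolates it. NetworkManager passes the action to a dispatcher script as `$2`; the function name `should_disable_udp_tnl` is hypothetical and exists only for illustration.

```shell
# Hypothetical helper (not part of this PR): mirrors the condition used in
# 99-vsphere-disable-tx-udp-tnl. $1 is the dispatcher action, $2 the driver.
should_disable_udp_tnl() {
  [ "$1" = "up" ] && [ "$2" = "vmxnet3" ]
}

# Usage: only "up" events on vmxnet3 devices would reach the ethtool -K calls.
#   should_disable_udp_tnl up vmxnet3 && echo "would disable tx-udp_tnl offloads"
```

Any other action (for example `down`) or any other driver leaves the interface untouched, which is why the change should not affect non-vmxnet3 devices.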

No cloud-config, as expected

sh-4.4# cat /etc/kubernetes/cloud.conf 
sh-4.4# 
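
For a scripted version of this check, a minimal sketch follows. It assumes that "no cloud-config" means the file is either absent or empty; the helper name `cloud_conf_is_empty` is hypothetical.

```shell
# Hypothetical helper (not part of this PR): succeeds when the given file is
# missing or has zero length. Defaults to the path inspected above.
cloud_conf_is_empty() {
  # `-s` is true only when the file exists and its size is greater than zero.
  [ ! -s "${1:-/etc/kubernetes/cloud.conf}" ]
}

# Usage on a node:
#   cloud_conf_is_empty && echo "no cloud-config present"
```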

@jcpowermac
Contributor Author

The vSphere job failures were caused by individual tests; I believe they are unrelated to this change.

@sinnykumari
Contributor

I find this implementation better than #2558. Can we also add a description to the commit message explaining the reasoning behind this template move?

@cuppett
Member

cuppett commented May 3, 2021

/retest

The template is being moved from `templates/common/vsphere`
which is only applied when `platform: vsphere` is set.

There are customers running `platform: none` on vSphere or ESXi directly
that can be affected by the vmxnet3v4 issue.

@jcpowermac jcpowermac force-pushed the vsphere-platform-none-vmxnet3v4-base branch from bf130b1 to 0daceef May 3, 2021 12:51
@jcpowermac
Contributor Author

I find this implementation better than #2558. Can we also add a description to the commit message explaining the reasoning behind this template move?

@sinnykumari updated.

@ashcrow
Member

ashcrow commented May 3, 2021

level=error msg=Error: could not contact Ironic API: timeout reached

This failure could be an infra issue.

@jcpowermac
Contributor Author

The vSphere test failures look unrelated.

@ashcrow
Member

ashcrow commented May 3, 2021

/retest

@openshift-ci
Contributor

openshift-ci bot commented May 3, 2021

@jcpowermac: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-vsphere-upi | bf130b1ecea736511a0000704d463330f299454a | link | /test e2e-vsphere-upi |
| ci/prow/e2e-vsphere | bf130b1ecea736511a0000704d463330f299454a | link | /test e2e-vsphere |
| ci/prow/e2e-vsphere-upgrade | 0daceef | link | /test e2e-vsphere-upgrade |
| ci/prow/e2e-metal-ipi | 0daceef | link | /test e2e-metal-ipi |
| ci/prow/okd-e2e-aws | 0daceef | link | /test okd-e2e-aws |
| ci/prow/e2e-aws-disruptive | 0daceef | link | /test e2e-aws-disruptive |


@jcpowermac
Contributor Author

Are we waiting for those last jobs to pass? From dealing with other PRs in MCO, those are rarely green.

It is still going to take quite a bit of time for this to merge and then be backported to 4.8 and 4.7.

@ashcrow
Member

ashcrow commented May 3, 2021

My $0.02: If we can explain why these are failing and have confidence this works then I don't believe we need to wait.

Contributor

@yuqi-zhang yuqi-zhang left a comment

I think the failing CI jobs are unrelated and not required, so let's go ahead and move forward with this, given that:

  1. Joseph was able to confirm this works on none
  2. This does not seem to affect any other platform
  3. We will be looking to revert this as soon as the underlying issues are resolved

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm label (Indicates that a PR is ready to be merged.) May 3, 2021
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jcpowermac, yuqi-zhang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) May 3, 2021
@openshift-merge-robot openshift-merge-robot merged commit 35e25c3 into openshift:master May 3, 2021
@openshift-ci-robot
Contributor

@jcpowermac: All pull requests linked via external trackers have merged:

Bugzilla bug 1952358 has been moved to the MODIFIED state.

In response to this:

Bug 1952358: vsphere: platform none, vmxnet3v4 fix move to base

@openshift-cherrypick-robot

@cuppett: new pull request created: #2564

In response to this:

/cherry-pick release-4.7

@openshift-cherrypick-robot

@cuppett: new pull request could not be created: failed to create pull request against openshift/machine-config-operator#release-4.7 from head openshift-cherrypick-robot:cherry-pick-2559-to-release-4.7: status code 422 not one of [201], body: {"message":"Validation Failed","errors":[{"resource":"PullRequest","code":"custom","message":"A pull request already exists for openshift-cherrypick-robot:cherry-pick-2559-to-release-4.7."}],"documentation_url":"https://docs.github.com/rest/reference/pulls#create-a-pull-request"}

In response to this:

/cherry-pick release-4.7

8 participants