Skip to content

Conversation

@ricky-rav
Copy link
Contributor

@ricky-rav ricky-rav commented Dec 14, 2021

A new Network Manager dispatcher script, placed in mtu-migration-dispatcher.yaml, responds to UP events on selected host interfaces when an MTU migration is ongoing. It sets appropriate (transient) MTU values for host interfaces and their routes in the first phase of the MTU migration in order to avoid disruption while the cluster MTU is also being modified.

Permanently applying an MTU value to host interfaces is instead under the cluster administrators's responsibility; an option for the cluster admin, for instance, would be to apply a MachineConfig object that adds a custom Network Manager profile to the host.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 14, 2021
Copy link
Contributor

@jcaamano jcaamano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I think that there is no need to have a phase2 which simplifies things quite a bit.

Comment on lines 177 to 180
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only need the default gateway interface in this case?

@jcaamano
Copy link
Contributor

In general I think that there is no need to have a phase2 which simplifies things quite a bit.

Now that I think of it, you might have added a phase2 thinking that MTU changes persist. But they do not. MTU set through iproute2 both on interfaces and routes do not persist accross reboots, so there no need for a phase2.

@jcaamano
Copy link
Contributor

In general I think that there is no need to have a phase2 which simplifies things quite a bit.

Now that I think of it, you might have added a phase2 thinking that MTU changes persist. But they do not. MTU set through iproute2 both on interfaces and routes do not persist accross reboots, so there no need for a phase2.

The exception to this would be the MTU set in OVSDB but that would be reset anyway by however the admin chooses to deploy the actual target MTU config.

@ricky-rav ricky-rav changed the title [WIP] mtu-migration service and mtu-migration-dispatcher script mtu-migration-dispatcher script Dec 14, 2021
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 14, 2021
Copy link
Contributor

@jcaamano jcaamano Dec 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check for greater than?

@jcaamano
Copy link
Contributor

/retest

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be foundMTUMigrationWorker = ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need to do this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, since we're exiting, there's indeed no need.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_mtu_on_dev returns 1 if it isnt found and this logic seems to continue. Could that cause potential issues with assumign MTU is 1? Should you return -1 or something instead and exit?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_mtu_on_dev gives back the result of cat ..... The return is just the status of the function that should fail the script if different than 0 because of set -x. But let's change it because it is confusing. Just doing an an exit 1 there should be equivalent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be nice to set the values of the arguments up here rather than having to search to the bottom of the script to find it. Would also be nice to have some comment indicating what these arguments are

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where does CONNECTION_UUID come from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's among the environment variables set by network manager when calling the dispatcher script. I will add a comment that clarifies that and the input arguments as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this check should go under the if statement on line 90 and look like this:

        if [ $ORIGINAL_MTU -gt $TARGET_MTU ]; then
            if [ $is_vlan_parent -eq 1 ]; then
                 echo "No need to set MTU as current is greater than target and is a vlan parent"
                 return 0
            fi
            routable_mtu=$TARGET_MTU
            mtu=$ORIGINAL_MTU
...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you print this later so no need to print here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit that doesnt really matter but [0-9]+ (since if the route says mtu there should be at least 1 char here and should be a digit)

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 16, 2021

@ricky-rav: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-disruptive 2844a3d2ffb3aad02f10a741917a2948344fde41 link false /test e2e-aws-disruptive
ci/prow/e2e-aws-workers-rhel8 2844a3d2ffb3aad02f10a741917a2948344fde41 link false /test e2e-aws-workers-rhel8
ci/prow/e2e-aws-workers-rhel7 2844a3d2ffb3aad02f10a741917a2948344fde41 link false /test e2e-aws-workers-rhel7

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

A new Network Manager dispatcher script, placed in mtu-migration-dispatcher.yaml, responds to pre-up events on selected host interfaces when an MTU migration is ongoing. It sets appropriate (transient) MTU values for host interfaces and their routes in the first phase of the MTU migration in order to avoid disruption while the cluster MTU is also being modified.

Permanently applying an MTU value to host interfaces is instead under the cluster administrators's responsibility; an option for the cluster admin, for instance, would be to apply a MachineConfig object that adds a custom Network Manager profile to the host.

Signed-off-by: Riccardo Ravaioli <[email protected]>
Co-authored-by: Jaime Caamaño Ruiz <[email protected]>
@jcaamano
Copy link
Contributor

/retest

Copy link
Contributor

@trozet trozet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@trozet
Copy link
Contributor

trozet commented Dec 16, 2021

/assign sinnykumari

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 16, 2021
Copy link
Contributor

@sinnykumari sinnykumari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from MCO side

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 16, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ricky-rav, sinnykumari, trozet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 16, 2021
@openshift-merge-robot openshift-merge-robot merged commit 4f93ba3 into openshift:master Dec 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants