Skip to content

Conversation

@celebdor
Copy link
Contributor

node-ip is a subcommand that allows the user to see which IP should the
node use in cases of multiple interface and multiple address nodes. This
is useful to prevent cases where Container Runtime related services bind
to an interface that is not reachable in the control plane.

It has two commands:

  • show: Takes one or more Virtual IPs of the control plane and it gives
    you one eligible IP on stdout.

  • set: Takes one or more Virtual IPs of the control plane and sets
    systemd service configuration for services like CRI-O or Kubelet that
    need to bind to the control plane.

@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 13, 2020
@celebdor
Copy link
Contributor Author

Remaining:

  • Finish set
  • Once this makes it into RHCOS, make a PR to have the VIP using platforms change to this over the Python implementation.

@celebdor
Copy link
Contributor Author

/assign @vrutkovs

@vrutkovs
Copy link
Contributor

/cc @LorbusChris

we'll want this in fcos branch too

@celebdor
Copy link
Contributor Author

The Set command is working :-)

@celebdor
Copy link
Contributor Author

/retitle MCD: Add node-ip subcommand

@openshift-ci-robot openshift-ci-robot changed the title WIP MCD: Add node-ip subcommand MCD: Add node-ip subcommand Mar 13, 2020
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 13, 2020
@celebdor
Copy link
Contributor Author

/assign @bcrochet @cybertron

@vrutkovs
Copy link
Contributor

/retest

@LorbusChris
Copy link
Contributor

LorbusChris commented Mar 13, 2020

@celebdor please run make go-deps to fix the code-generator scripts file permission change
(changing 100755 → 100644 is not wanted here)

/cherry-pick fcos

@openshift-cherrypick-robot

@LorbusChris: once the present PR merges, I will cherry-pick it on top of fcos in a new PR and assign it to you.

Details

In response to this:

@celebdor please run make go-deps to fix the code-generator scripts file permission change

/cherry-pick fcos

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link
Contributor

@LorbusChris LorbusChris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@celebdor
Copy link
Contributor Author

@celebdor please run make go-deps to fix the code-generator scripts file permission change
(changing 100755 → 100644 is not wanted here)

/cherry-pick fcos

Done with the make go-deps

@celebdor celebdor requested a review from LorbusChris March 13, 2020 14:58
@ashcrow
Copy link
Member

ashcrow commented Mar 13, 2020

/retest

1 similar comment
@LorbusChris
Copy link
Contributor

/retest

@yuqi-zhang
Copy link
Contributor

This looks sane although I am not the most familiar with regarding node ip. Since this is a new cmd it'd be nice to add some basic unit tests like Sinny said.

@LorbusChris
Copy link
Contributor

Not to say it would replace unit testing, but maybe add a commit that enables the use of this new cmd for e2e testing on openstack and baremetal?
E.g. like in a477762

@kikisdeliveryservice
Copy link
Contributor

I'd also like tests added..

@rphillips @umohnani8 @haircommander

Any thoughts? Is MCD appropriate place for this? Anything to consider from a crio/kubelet perspective?

@kikisdeliveryservice
Copy link
Contributor

Basic question: Is this making changes via commandline?

If so, if the point of the MCO is to make changes via MC/Kubeletcfgs/etc... does it make sense?

I dont know, you tell me =)

@cgwalters
Copy link
Member

I think I see the rationale for this, but...it's really weird from an architecture perspective to have the MCD go poking at both kubelet and crio configuration.

crio already has code to try to detect this, no? Since we only support crio, any reason why this isnt' a crio command or so?

@cgwalters
Copy link
Member

This is a generic Kubernetes problem too right? Feels like we could try to upstream this somewhere in Kube, though I am not plugged in enough there to say.

@kikisdeliveryservice
Copy link
Contributor

until this gets sorted out

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 25, 2020
@kikisdeliveryservice
Copy link
Contributor

kikisdeliveryservice commented Mar 25, 2020

Thinking about this more:

The flow for interacting with kubelet/crio via MCO is kubelet/ctrcfg -> machine config -> pool. This seems to bypass this via a new interface (the MCO's UI is machine configs). I feel uncomfortable using the MCO to make on the fly changes that aren't captured in one of the above cfgs. I'm not sure I understand why this is bypassing that existing UI? Or maybe this means, as Colin suggested, that these changes are better made elsewhere?

@kikisdeliveryservice
Copy link
Contributor

Also ping: @openshift/openshift-team-sdn

Was told you might have some thoughts on this.

@celebdor
Copy link
Contributor Author

It was suggested by members of the MCO team that, in order to move from Python to golang, we put this functionality as subcommands of MCD. This way it gets propagated to the RHCOS images.

The purpose of this commands is to be available for scripts that ARE created with the MachineConfig interface, namely storage->files that get placed in /etc/NetworkManager dispatcher scripts location.

The reason why this is not all just a MachineConfig placing a couple of /etc/systemd/service.d/ files is that finding out which IP is on the control plane network is something to be done at runtime. This could then be tackled upstream in CRI-O and Kubelet. The latter we know has some code to choose the reporting address and it surely can be enhanced there not to use deprecated IPv6 addresses which is something that is generically wrong. However, that does not mean that it would pick an address in the control plane rather than in the provisioning or the storage networks and it seems rather unlikely that we could convince them to add code for that (although if the consensus here is that we do just that, we can give a shot at making a Kubelet PR that allows you to specify from which subnet the address should be chosen). The CRI-O situation is quite similar, it would mean pushing there the concept of preferred subnets and waiting for CRI-O and Kubelet versions to include this preferred subnets knobs (which they may well say that are unnecessary logic since you can just set a config option with the IP of your choice) before we can drop Python from RHCOS.

@ashcrow ashcrow requested a review from ericavonb March 26, 2020 13:09
@danwinship
Copy link
Contributor

The latter we know has some code to choose the reporting address and it surely can be enhanced there not to use deprecated IPv6 addresses which is something that is generically wrong. However, that does not mean that it would pick an address in the control plane rather than in the provisioning or the storage networks and it seems rather unlikely that we could convince them to add code for that

Indeed, the entire point of having the --node-ip argument is that kubelet couldn't possibly guess the right value in every possible weird setup.

The CRI-O situation is quite similar

I don't think it is. The default value of --stream-address is not the autodetected node IP, it's 127.0.0.1. I think the problem there is just that you need to set it to ::1 on IPv6 nodes, not that you need to set it to the correct node IP. (Or, really, we need golang to support being able to listen dual-stack on localhost...)

Use: "node-ip",
DisableFlagsInUseLine: true,
Short: "Node IP tools",
Long: "Node IP has tools that aid in the configuration of nodes in platforms that use Virtual IPs",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not really clear in this context what "virtual" means. It might be better to just say "multiple IPs"?

var nodeIPSetCmd = &cobra.Command{
Use: "set",
DisableFlagsInUseLine: true,
Short: "Sets container runtime services to bind to a configured IP address that directly routes to the given virtual IPs",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I assume the usage here is that you pass it the IPs of the other masters, in order to find an IP on the interface that routes to those other IPs, rather than an IP on a provisioning/storage network? In that case, the IPs you pass would be non-virtual IPs, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You give it the virtual IPs so that then it can find which non virtual IPs are on the virtual IPs network, which will rule out non virtual IPs in other networks (like a storage network).

var nodeIPShowCmd = &cobra.Command{
Use: "show",
DisableFlagsInUseLine: true,
Short: "Show a configured IP address that directly routes to the given Virtual IPs",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not just a configured IP address, it's the right one (for some definition of "right")

@haircommander
Copy link
Member

The latter we know has some code to choose the reporting address and it surely can be enhanced there not to use deprecated IPv6 addresses which is something that is generically wrong. However, that does not mean that it would pick an address in the control plane rather than in the provisioning or the storage networks and it seems rather unlikely that we could convince them to add code for that

Indeed, the entire point of having the --node-ip argument is that kubelet couldn't possibly guess the right value in every possible weird setup.

The CRI-O situation is quite similar

I don't think it is. The default value of --stream-address is not the autodetected node IP, it's 127.0.0.1. I think the problem there is just that you need to set it to ::1 on IPv6 nodes, not that you need to set it to the correct node IP. (Or, really, we need golang to support being able to listen dual-stack on localhost...)

yeah, this is the only case I can see that CRI-O would need to be configured like this. For all other IP address choices it makes (choosing hostIP for a pod), it defers to the kubelet.

If that case, or the case of configuring the kubelet alone, is deemed worthy of a new command in the MCD, I don't see anything wrong with this change 👍

@LorbusChris
Copy link
Contributor

I'm wondering, if MCO is not comfortable taking this functionality in, couldn't it also just live in https://github.com/openshift/baremetal-runtimecfg?

@celebdor
Copy link
Contributor Author

I'm wondering, if MCO is not comfortable taking this functionality in, couldn't it also just live in https://github.com/openshift/baremetal-runtimecfg?

Then it would not be in the RHCOS image (MCD is installed there from an RPM package). Do you suggest to use podman to extract it in ExecStartPre of the units that use it?

@LorbusChris
Copy link
Contributor

LorbusChris commented Mar 27, 2020

Couldn't we run it as a containerized command (without extracting)? Like we would've done with the python scripts.

@cgwalters
Copy link
Member

If we move it into the baremetal operator, we're kind of saying that multiple NICs and more sophisticated network is only a baremetal problem, but I don't think that's true.

@celebdor
Copy link
Contributor Author

@cgwalters I think @LorbusChris suggested baremetal-runtimecfg, which now that it is used by several platforms could use a rename (naming is hard :'( ), not the baremetal operator.

@kikisdeliveryservice
Copy link
Contributor

@cgwalters I think @LorbusChris suggested baremetal-runtimecfg, which now that it is used by several platforms could use a rename (naming is hard :'( ), not the baremetal operator.

Hi!
Q:
who is using baremetal-runtimecfg vs who is covered by the use-case for this PR?

@vrutkovs
Copy link
Contributor

oVirt, Openstack and Baremetal IPI are using these scripts

@LorbusChris
Copy link
Contributor

However, this PR wouldn't actually make any of those platforms use the new commands. To activate, one would need a commit like this: 82fa8a7 (just with the correct MCD path for OCP)

@kikisdeliveryservice
Copy link
Contributor

oVirt, Openstack and Baremetal IPI are using these scripts

do they also use baremetal-runtimecfg?

@kikisdeliveryservice
Copy link
Contributor

Couldn't we run it as a containerized command (without extracting)? Like we would've done with the python scripts.

@celebdor WDYT? if all of those platforms do indeed use baremetal-runtimecfg (rename pending 😉 )

@LorbusChris
Copy link
Contributor

This is superseded by #1659

/close

@openshift-ci-robot
Copy link
Contributor

@LorbusChris: Closed this PR.

Details

In response to this:

This is superseded by #1659

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.