Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP: ComponentConfig fields should be optional #1173

Conversation

justinsb
Copy link
Member

Proposing that we make all fields in ComponentConfig optional/nilable,
as initially discussed in
https://groups.google.com/d/topic/kubernetes-wg-component-standard/hO7Lmi0cwQU/discussion

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: justinsb
To complete the pull request process, please assign lavalamp
You can assign the PR to them by writing /assign @lavalamp in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 26, 2019
@k8s-ci-robot k8s-ci-robot requested review from deads2k and lavalamp July 26, 2019 15:10
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels Jul 26, 2019
Proposing that we make all fields in ComponentConfig optional/nilable,
as initially discussed in
https://groups.google.com/d/topic/kubernetes-wg-component-standard/hO7Lmi0cwQU/discussion
@justinsb justinsb force-pushed the component-config-fields-should-be-optional branch from 1c12898 to a05f968 Compare July 26, 2019 15:25
@k8s-ci-robot
Copy link
Contributor

@justinsb: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-enhancements-verify a05f968 link /test pull-enhancements-verify

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Copy link
Contributor

@mtaufen mtaufen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it. Let's do it.


After demonstrating this in v1.16 (ideally in v1.17) we will change the types of
the fields in kubelet so that they are nilable. This may require introduction
of a v1beta2 (TBD).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we have to think about this carefully. It would probably "just work" as far as config files, though something else like kubeadm may be relying on the v1beta1 struct types to generate files.
@liggitt can you think of any other edge cases here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are two types of clients to consider:

  • clients using the go types... changing fields to pointers would introduce NPE risks when they updated to new versions
  • clients using serialized config files. as long as the fields changed to pointers set omitempty, we would not serialize null in unexpected places

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changing fields to pointers would introduce NPE risks

the same is true for the kubelet itself, actually, so a very diligent sweep would be required to guard against that

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we break it up field-by-field and write really good instructions on how to modify and test consumer code, we can get a broad swath of people attacking this. But the isntructions have to be ACTIONABLE.

We will begin with kube-scheduler and kube-proxy, as those are the smallest;
there is also more desire to promote those to beta.

Implementation details TBD!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Off the top of my head:

  1. Make everything a pointer in the external type(s).
  2. Update defaulting code to check nil everywhere instead of zero values (yay for consistency!).
  3. Backfill zero-value defaults for previously-non-pointer fields.
  4. Run codegen.
  5. Run tests.

Hopefully most logic is based off the internal type and changes are minimal. Some tests that round-trip may need updates.

Might get interesting with deeply-nested structures. Best way to find out is to try it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making everything a pointer is a nice thing from consistency standpoint, however it also looks annoying in Go code. It won't have much impact on users, that treat component configs as YAML templates only, but Go code vendoring users are going to be in for a hard time (or at least, if I understand correctly, for a huge use of k8s.io/utils/pointer).

The real problem here is tied to omitempty and the JSON library, which is very cautious in respecting it. In fact, in non-trivial cases, it won't respect it at all. The Go-YAML library has a IsZeroer interface, but that is not supported in the builtin Go JSON library.

allows us to prove the value before we change types with stricter versioning
requirements.

We risk changing the behaviour of an unspecified field in the componentconfig
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there still a risk if we backfill zero-value defaults?

### Goals

- Allow generation of compact componentconfig using the same types
- Start to replace some of the "magic sentinel" values that are used to mean "not set"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think conversion v1 <-> v2 will lose set/unset information because it goes through the internal type, even if v1 and v2 consist of all pointers. There's still significant utility to this proposal, but conversions that preserve intent would also be very useful. It's complicated, because the convertor needs to understand defaults for both versions so it can shim correctly, and not sure how easily the current machinery would extend to that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you would have to make the fields in the internal type pointers as well if you want to be able to preserve the set/unset aspect after conversion to the internal hub version

pointer-types in our componentconfig.

This also allows for generation of a minimal componentconfig from code, and
should faciliate patching and merging of componentconfig.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

facilitate

participating-sigs:
- sig-cluster-lifecycle
reviewers:
- @mtaufen
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These might need to be quoted, verify-kep-metadata failed.

Copy link
Contributor

@rosti rosti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @justinsb !
This is indeed a pressing issue (as we found with kubeadm and v1beta1).
However, I do think, that we need to start at the top. We need to figure out component config storage, conversion and validation responsibilities with respect of cluster lifecycle tools. I talked to @mtaufen about a possible design meeting with regards to these questions on last week's WG Component Standard meeting.
Surely, once these questions are cleared, we would have a lot more understanding to what to do with questions like this one. Like, to what extent to make versioned Go types hard to use.


## Proposal

In v1.16 we will change the types of the fields in the kube-proxy,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1.16 is a very optimistic time frame, given the fact, that there is only a month before code freeze and a bit more than a day for the enhancement freeze.

We will begin with kube-scheduler and kube-proxy, as those are the smallest;
there is also more desire to promote those to beta.

Implementation details TBD!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making everything a pointer is a nice thing from consistency standpoint, however it also looks annoying in Go code. It won't have much impact on users, that treat component configs as YAML templates only, but Go code vendoring users are going to be in for a hard time (or at least, if I understand correctly, for a huge use of k8s.io/utils/pointer).

The real problem here is tied to omitempty and the JSON library, which is very cautious in respecting it. In fact, in non-trivial cases, it won't respect it at all. The Go-YAML library has a IsZeroer interface, but that is not supported in the builtin Go JSON library.

However, while flags make the distinction between not-specified and
set-to-empty-value, our api types generally do not. By using pointer-types it
is possible to preserve this difference, this KEP proposes that we should use
pointer-types in our componentconfig.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1! Lack of this ability has already caused problem for the scheduler. We cannot turn on pprof by default for the scheduler. In order to set the default value of "EnableProfiling" to true, we first need to find that the flag is not provided, but we cannot, because the flag is false both when it is explicitly set to false and when it is not provided. This applies to other boolean and integer options.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bsalamat you can tell whether a flag was provided or not using pflag flagset's Changed method:
kubernetes-sigs/cluster-addons@d5f8d8d#diff-fe99745b837e4e6db3d1c39b19f39ed3R41
That can be used for the scheduler's flag logic 👍.

I think what is being expressed here is that is that when using a config file, once you have the struct, you won't be able to get this information unless we make all fields pointers or encode the information somewhere else.

@timothysc
Copy link
Member

/assign @timothysc @neolit123

@alejandrox1
Copy link
Contributor

/assign

@fabriziopandini
Copy link
Member

/assign @fabriziopandini

should faciliate patching and merging of componentconfig.

ComponentConfig types are used with v1alpha1 versions for kube-scheduler,
kube-proxy, kube-controller-manager, cloud-controller-manager; I intend to make
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kubeadm already consumes kube-proxy v1alpha1, and fear this change - without cutting a new version - might create some problem...

Copy link
Member

@fabriziopandini fabriziopandini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@justinsb thanks for this write up!
Only one additional point from my side related to the migration from the current version to "all pointer" version.

Currently, conversions are implemented in the "private" part of the component config, while consumers like kubeadm are using the public API. This potentially can lead to forcing each component config consumer to re-implement the conversion logic, which is far from ideal.
IMO, before proceeding with this KEP, it should be discussed how to ensure "smooth" migration from public versions of the same API

kube-proxy, kube-controller-manager, cloud-controller-manager; I intend to make
these nilable. kubelet uses a ComponentConfig type that is v1beta1, I propose
not changing those at the current time, but rather proving out the approach with
the v1alpha1 types.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to put details here wrt if successful we will... This should also serve as a blocker towards graduating other component configs.


I initially raised this in a [mailing list
thread](https://groups.google.com/d/topic/kubernetes-wg-component-standard/hO7Lmi0cwQU/discussion)
with wg-component-standard. When trying to create componentconfig from code,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

were any fields omitempty?


Supporting merging of configurations that better reflects intent has come up a
few times: there have been discussions of merging fields with flags and the
precedence when doing so, merging fields set at different levels of a hierarchy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get what you're saying, but I don't think most others would. An example on hierarchical ordering I think would help make your case.

@stealthybox
Copy link
Member

/wg component-standard

@k8s-ci-robot k8s-ci-robot added the wg/component-standard Categorizes an issue or PR as relevant to WG Component Standard. label Sep 21, 2019
@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 23, 2020
@fejta-bot
Copy link

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 24, 2020
@fejta-bot
Copy link

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Copy link
Contributor

@fejta-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. wg/component-standard Categorizes an issue or PR as relevant to WG Component Standard.
Projects
None yet
Development

Successfully merging this pull request may close these issues.