Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use image_family instead of static image reference #17722

Merged
merged 1 commit into from
Aug 18, 2020

Conversation

MHBauer
Copy link
Contributor

@MHBauer MHBauer commented May 27, 2020

The intention here seems to be to test on the stable version of cos.

The history of this file shows many other environments, such as coroes, containervm, different versions of ubuntu with different versions of docker.

Given that it hasn't been running on cos for 6 weeks or so, I'm not sure there's much signal in running on multiple cos versions. cos-stable image_family should float as we like. Alternatively, we can go on an lts image family, such as cos-77-lts.

This would effect these jobs as they reference this file:

config/jobs/cadvisor/cadvisor.yaml
42:        - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml --test-suite=cadvisor
98:      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml --test-suite=cadvisor

config/jobs/kubernetes/sig-node/node-docker.yaml
18:      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml
47:      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml
75:      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml

config/jobs/kubernetes/sig-node/sig-node-presubmit.yaml
36:        - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml
145:        - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml

config/jobs/kubernetes/sig-node/node-kubelet.yaml
27:      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml
54:      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml
153:      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml
180:      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml
210:      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml

config/jobs/kubernetes/node-problem-detector/node-problem-detector-ci.yaml
93:        "--node-args=--image-config-file=/home/prow/go/src/k8s.io/test-infra/jobs/e2e_node/image-config.yaml --extra-envs=${EXTRA_ENVS}"

config/jobs/kubernetes/node-problem-detector/node-problem-detector-presubmits.yaml
90:          "--node-args=--image-config-file=/home/prow/go/src/k8s.io/test-infra/jobs/e2e_node/image-config.yaml --extra-envs=${EXTRA_ENVS}"

config/jobs/kubernetes/sig-release/release-branch-jobs/1.18.yaml
123:      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml
155:      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml
1052:        - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml

config/jobs/kubernetes/sig-release/release-branch-jobs/1.15.yaml
86:      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml
118:      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml
757:        - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml

config/jobs/kubernetes/sig-release/release-branch-jobs/1.16.yaml
86:      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml
118:      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml
804:        - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml

config/jobs/kubernetes/sig-release/release-branch-jobs/1.17.yaml
123:      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml
155:      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml
860:        - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels May 27, 2020
@MHBauer
Copy link
Contributor Author

MHBauer commented May 27, 2020

PTAL
/cc @vpickard @bart0sh @bsdnet

@k8s-ci-robot
Copy link
Contributor

@MHBauer: GitHub didn't allow me to request PR reviews from the following users: bsdnet.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

PTAL
/cc @vpickard @bart0sh @bsdnet

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot requested a review from bart0sh May 27, 2020 01:04
Copy link
Contributor

@bsdnet bsdnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cos-stable is a moving target, and it maybe change , fo example: it maybe cos-83 or cos-84.
If the purpose of the image is for pre-commit , I would suggest using LTS image/family instead.
To do step by step, maybe we rename cos-stable-77-12371-227-0 to cos-77-12371-227-0
first for now. It is a safe bet per my today's knowledge.

cos-stable2:
image_regex: cos-stable-77-12371-227-0 # docker v19.03.1, deprecated after 2020-12-17
cos-stable:
image_family: cos-stable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to use family, cos-81-lts should be more proper.

@bart0sh
Copy link
Contributor

bart0sh commented May 27, 2020

@bsdnet

cos-stable is a moving target, and it maybe change , fo example: it maybe cos-83 or cos-84

Using cos-81-lts family can potentially break jobs. When gcp switches to cos-83-lts they may not provide cos-81-lts anymore. This is what will happen with cos-73-lts soon as it approaches EOL.

@bart0sh
Copy link
Contributor

bart0sh commented May 27, 2020

To do step by step, maybe we rename cos-stable-77-12371-227-0 to cos-77-12371-227-0
first for now. It is a safe bet per my today's knowledge.

We don't know that as we haven't tested it. Recently we did similar switch from cos-stable-73 to cos-73 and it broke one of the PR jobs.

Copy link
Contributor Author

@MHBauer MHBauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should 1 be cos-stable and 2 be cos-XX-lts? ought to require rolling them less frequently?

@bsdnet
Copy link
Contributor

bsdnet commented May 27, 2020

@bsdnet

cos-stable is a moving target, and it maybe change , fo example: it maybe cos-83 or cos-84

Using cos-81-lts family can potentially break jobs. When gcp switches to cos-83-lts they may not provide cos-81-lts anymore. This is what will happen with cos-73-lts soon as it approaches EOL.
No, this won't. LTS is a long term supported. cos-83 will not become LTS.
You could see that from the past: cos-69, cos-73, cos-77, next LTS is cos-81.

@bsdnet
Copy link
Contributor

bsdnet commented May 27, 2020

should 1 be cos-stable and 2 be cos-XX-lts? ought to require rolling them less frequently?

Yes, if it is cos-xx-lts, and it is stable and less frequently. For cos-stable, it supposes every 6 weeks.
See https://cloud.google.com/container-optimized-os/docs/resources/support-policy

@vpickard
Copy link
Contributor

vpickard commented Jun 5, 2020

should 1 be cos-stable and 2 be cos-XX-lts? ought to require rolling them less frequently?

Yes, if it is cos-xx-lts, and it is stable and less frequently. For cos-stable, it supposes every 6 weeks.
See https://cloud.google.com/container-optimized-os/docs/resources/support-policy

This seems to be a very reasonable approach for the cos images. I'd suggest updating this PR to do that. @bsdnet and Ning Liao (nick?) are working on the cos image update policy, we can adjust this again later if needed.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 5, 2020
Copy link
Contributor

@vpickard vpickard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the conversation, I agree it would be good to have 1 be cos-stable and 2 be cos-XX-lts. Would be good to update the PR to reflect that.

Also, @bsdnet and Ning are creating/reviewing the cos image policy. We could update later if needed based on policy.

@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Jun 5, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 5, 2020
@bart0sh
Copy link
Contributor

bart0sh commented Jun 8, 2020

Have I understood correctly that we're discussing these 2 possible image maintenance policies?

  • use image_family. This gives us less maintenance effort for the price of possible failures when COS changes image versions.
  • continue using image. This gives us much less chances for tests to fail for the price of more maintenance effort. In this scenario test job maintainers should test new images and change image names in a job yaml manually when COS changes image versions.

@vpickard
Copy link
Contributor

vpickard commented Jun 8, 2020

Have I understood correctly that we're discussing these 2 possible image maintenance policies?

  • use image_family. This gives us less maintenance effort for the price of possible failures when COS changes image versions.
  • continue using image. This gives us much less chances for tests to fail for the price of more maintenance effort. In this scenario test job maintainers should test new images and change image names in a job yaml manually when COS changes image versions.

Yes, that is the current thinking. I believe we should proceed with this approach. @bsdnet and Ning can share their thoughts on COS image policy, and we can adjust if needed going forward.

@bart0sh Does that sound reasonable to you?

@bart0sh
Copy link
Contributor

bart0sh commented Jun 9, 2020

@vpickard @bsdnet @MHBauer So, what would be our decision? Should we continue using 'image:', start using 'image-family:' or anything else? As far as I understood this proposal we should keep using 'image:' and implement automation around it.

@SergeyKanzhelev
Copy link
Member

@bsdnet any updates on it?

With the current structure of tests - jobs/e2e_node/image-config.yaml is used in all type of tests, presubmits, release blocking and feature tests. Perhaps we should have separate image configuration files than? What do you think?

@karan
Copy link
Contributor

karan commented Aug 17, 2020

+1 to running tests with image_family (except let's using cos-xx-lts instead of cos-stable). We should be certifying k8s for major releases of images (COS LTS in this case) and not specific images. I would also prefer to use that in presubmits and everywhere else to get as much test coverage as possible, and detect issues sooner (in presubmits).

@vpickard
Copy link
Contributor

Based on the sig-node ci meeting today, the general consensus is to have one image for all of the job types (PR, CI, etc), and the preferred image would be to use the image_family (cos-81-lts, for example).

+1 to @karan comment above.

Copy link
Member

@SergeyKanzhelev SergeyKanzhelev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 17, 2020
@MHBauer
Copy link
Contributor Author

MHBauer commented Aug 18, 2020

/assign @dims @dashpole

@dims
Copy link
Member

dims commented Aug 18, 2020

/approve
/lgtm

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims, MHBauer, SergeyKanzhelev

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 18, 2020
@k8s-ci-robot k8s-ci-robot merged commit b5fdd37 into kubernetes:master Aug 18, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.19 milestone Aug 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

9 participants