Skip to content

Conversation

@eparis
Copy link
Member

@eparis eparis commented Aug 26, 2018

This would break CL....

@openshift-ci-robot openshift-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Aug 26, 2018
@wking
Copy link
Member

wking commented Aug 26, 2018

/retest

#181 fixed the gofmt stuff. Also kick Jenkins for smoke tests:

retest this please

@wking
Copy link
Member

wking commented Aug 26, 2018

Can we drop this from modules/tectonic/resources/tectonic-wrapper.sh too? I don't understand where that script is being run yet, but it would be nice to drop our dependency on the hyperkube image completely.

@eparis
Copy link
Member Author

eparis commented Aug 27, 2018

/retest
I can try to see what uses that script (and in turn image) as my next side project...

@eparis
Copy link
Member Author

eparis commented Aug 28, 2018

Can we merge this? If I promise to come back and look at the (unrelated) tectonic-wrapper.sh? and start ripping out CL entirely...

@wking
Copy link
Member

wking commented Aug 28, 2018

The smoke test errors included:

 smoke_test.go:113: retrying in 10s
 smoke_test.go:112: failed with error: failed to connect to API server: the server could not find the requested resource
 smoke_test.go:113: retrying in 10s
 common_test.go:18: Failed to connect to API server in 15m0s.

I don't know if that's related or not. Maybe you need to bump the RHCOS AMI before using the system kubelet? [Edit: actually, the smoke tests don't use that config, so I have no idea what's broken :/]

@wking
Copy link
Member

wking commented Aug 28, 2018

I don't know if that's related or not. Maybe you need to bump the RHCOS AMI before using the system kubelet? [Edit: actually, the smoke tests don't use that config, so I have no idea what's broken :/]

Now that I've finished my coffee, I've remembered ;). The smoke tests are currently running via Jenkins in a separate AWS account (Prow is in 460538899914, Jenkins/smoke is in 846518947292). The RHCOS AMIs aren't public, and only the Prow account has the access credentials for them. So the smoke-test failures are probably because the CoreOS AMI being used is missing a host kubelet, and everything after that hangs. In order to get this PR to work, we need to do one of:

a. Add a native kubelet to the CoreOS stable series.
b. Move the smoke tests over to Prow and/or the Prow AWS account, from which they'll have access to the RHCOS AMIs.
c. Drop the smoke tests in favor of more e2e-aws coverage. Work in progress on this front in openshift/release#1271.

In general, moving the default to use a private AMI stream is going to make it harder for devs and others to use the installer. Is there a public stream that is similiar enough to the RHCOS stream that we can use for development?

@abhinavdahiya
Copy link
Contributor

abhinavdahiya commented Aug 28, 2018

Can we drop this from modules/tectonic/resources/tectonic-wrapper.sh too? I don't understand where that script is being run yet, but it would be nice to drop our dependency on the hyperkube image completely.

The tectonic-wrapper.sh allows tectonic.sh to run inside a container that has oc binary

KUBECTL="oc --config=$KUBECONFIG"

@wking
Copy link
Member

wking commented Aug 28, 2018

The tectonic-wrapper.sh allows tectonic.sh to run inside a container that has oc binary.

Hyperkube seems like a heavy dependency if that's all we'll need it for. Do the origin folks publish an image that has only oc, kubectl, and enough-of-POSIX (e.g. via BusyBox)? Effectively that would be an origin-client image. Or maybe I'm overestimating the size of hyperkube?

@abhinavdahiya
Copy link
Contributor

Hyperkube seems like a heavy dependency if that's all we'll need it for.

We needed hyperkube on host for kubelet anyways; So it wasn't a problem. But moving forward as we would not need hyperkube image, i think we can move to minimal image.

But oc is required; kubectl does not work well with securitycontextconstraints objects.

@eparis eparis force-pushed the use-kubelet-on-node branch from f759b97 to 7a43f50 Compare August 29, 2018 13:41
@eparis eparis changed the title Use the kubelet on RHCOS instead of from origin-node image WIP: Use the kubelet on RHCOS instead of from origin-node image Aug 29, 2018
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 29, 2018
@eparis eparis force-pushed the use-kubelet-on-node branch from 7a43f50 to 4b088fb Compare August 29, 2018 17:17
@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Aug 29, 2018
@eparis
Copy link
Member Author

eparis commented Aug 29, 2018

https://github.com/coreos-inc/tectonic-operators/pull/447 would need to merge, and then this get updated.

@eparis
Copy link
Member Author

eparis commented Aug 29, 2018

/retest
now that my forked tectonic-operators are public

@sjenning
Copy link
Contributor

fyi me @rphillips @derekwaynecarr

@abhinavdahiya
Copy link
Contributor

@eparis https://jenkins-kube-lifecycle.prod.coreos.systems/job/tectonic-operators/job/master/418/display/redirect created new operators with changes from https://github.com/coreos-inc/tectonic-operators/pull/447 and https://github.com/coreos-inc/tectonic-operators/pull/451.

You can use b0e6febc1dfe4998f911c4ce2c911228a3fdcb38 tag instead of personal images

@eparis eparis force-pushed the use-kubelet-on-node branch from 4b088fb to bc9daa4 Compare September 1, 2018 00:05
@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 1, 2018
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: eparis
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: yifan-gu

If they are not already assigned, you can assign the PR to them by writing /assign @yifan-gu in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

The change in this repo changes the usage for the bootstrap node.
The change to the operators changes the usage in the rest of the nodes.
@eparis eparis force-pushed the use-kubelet-on-node branch from bc9daa4 to 86669fe Compare September 1, 2018 00:10
@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 1, 2018
@eparis
Copy link
Member Author

eparis commented Sep 4, 2018

/retest

@openshift-ci-robot
Copy link
Contributor

@eparis: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/prow/e2e-aws 86669fe link /test e2e-aws

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@wking wking mentioned this pull request Sep 5, 2018
@crawford
Copy link
Contributor

crawford commented Sep 6, 2018

@eparis I'm working on stripping Container Linux from the installer. Do you mind if I close this out in favor of a future PR?

@eparis eparis closed this Sep 6, 2018
@eparis eparis deleted the use-kubelet-on-node branch February 18, 2019 20:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants