
Allow openshift-ansible image to deploy to GCP #6541

Merged
sdodson merged 4 commits into openshift:master from smarterclayton:move_gcp_in
Jan 24, 2018

Conversation

@smarterclayton
Contributor

@smarterclayton smarterclayton commented Dec 21, 2017

This moves all core GCP functionality into the openshift-ansible repo, adds
the necessary equivalent entrypoint to the openshift-ansible installer
image, and ensures the dynamic inventory mechanisms in openshift-ansible
continue to work. After this change, https://github.com/openshift/origin-gce can be removed.

TODO:

  • Create a test PR job that builds the ansible image, then uses it to deploy to GCP (similar to conformance_gce).

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 21, 2017
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 21, 2017
@openshift-ci-robot openshift-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Dec 21, 2017
@openshift-ci-robot openshift-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Dec 21, 2017
@openshift-merge-robot openshift-merge-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Dec 21, 2017
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 23, 2017
@smarterclayton smarterclayton force-pushed the move_gcp_in branch 2 times, most recently from 432cf00 to 231a877 on December 31, 2017 at 23:47
@smarterclayton smarterclayton changed the title WIP - moving origin-gce functionality into openshift-ansible Allow openshift-ansible image to deploy to GCP Jan 1, 2018
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 1, 2018
@smarterclayton smarterclayton force-pushed the move_gcp_in branch 4 times, most recently from 0b26570 to 716a206 on January 2, 2018 at 16:57
@sdodson
Member

sdodson commented Jan 2, 2018

/assign kwoodson mgugino

@openshift-ci-robot

@sdodson: GitHub didn't allow me to assign the following users: mgugino.

Note that only openshift members can be assigned.


In response to this:

/assign kwoodson mgugino

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sdodson
Member

sdodson commented Jan 2, 2018

/assign michaelgugino

@smarterclayton
Contributor Author

/retest

ansible.cfg Outdated
gathering = smart
fact_caching = jsonfile
- fact_caching_connection = $HOME/ansible/facts
+ fact_caching_connection = /tmp/ansible/cached_facts
Contributor

This should probably stay variablized.

Contributor Author

Hrm, why would a folder in $HOME be the right place for a cache? Especially a non-dotted name?

@@ -0,0 +1,402 @@
#!/usr/bin/env python
Contributor

any reason to rename this file? This looks like gce.py. I'd prefer if we kept the same name.

Contributor Author

I want to make it clear what it's for, and the naming is redundant. It has additional patches on top of gce.py. I can rename it back, but I could also add a comment at the top that talks about its provenance.

Contributor

Ok, this should be fine. Add the comment and note what the expectation is around keeping this file updated when a newer gce.py comes out.

@@ -0,0 +1 @@
{} No newline at end of file
Contributor

Is this needed?

Contributor Author

Yes, this is the default inventory for dynamic behavior.

@@ -0,0 +1,109 @@
---
Contributor

Should we rename aws/gcp files to match? I think the aws equivalent is build_ami.yml.

Also, should we move most of the heavy lifting under the roles/openshift_gce/build_image.yml so that the logic is under the role?

Contributor Author

AMI is Amazon-specific; GCE always calls it an image. I would argue for build_image for both rather than build_ami for both.

Contributor Author

I think maybe yes, but this was going to be a future refactor point.

Contributor Author

Do you mind if I refactor this as a follow up?

Contributor

That sounds fine.

@@ -0,0 +1,38 @@
---
Contributor

Consider moving the logic under the role and having a shim that calls import_role with tasks_from: publish_image.yml.
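A minimal sketch of the kind of shim being suggested, with the role name and tasks file taken from the discussion and the play boilerplate assumed:

```yaml
# Hypothetical shim playbook: keep the heavy lifting in the role and make
# the playbook a thin entrypoint.
- name: Publish the GCP image
  hosts: localhost
  connection: local
  gather_facts: no
  tasks:
  - import_role:
      name: openshift_gcp
      tasks_from: publish_image.yml
```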

@@ -0,0 +1,18 @@
---
Contributor

Is this something that we should be sharing across clouds? Should this move into a logrotate role?

Contributor Author

Probably yes, ok to do as a followup? This was originally to ensure we were rotating everything, and on cloud images we can definitely be more opinionated.

Contributor

Fine as a follow up. Just want to share as much of this useful code as possible.

@@ -0,0 +1,12 @@
---
Contributor

This file or an equivalent lives here:

openshift-ansible/roles/openshift_cloud_provider

These settings should move to the tasks/gce.yml section.

Contributor Author

Moving. Note that multizone does not work on GCP if only one zone is deployed, so I defaulted this to false.
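For illustration only, a hedged sketch of how the relocated setting might look as a task in the openshift_cloud_provider role's tasks/gce.yml; the config path, section name, and variable name are assumptions, not code from this PR:

```yaml
# Hypothetical task: write the multizone flag into the GCE cloud provider
# config, defaulting it to false because multizone does not work on GCP
# when only a single zone is deployed.
- name: Configure GCE cloud provider multizone behaviour
  ini_file:
    path: /etc/origin/cloudprovider/gce.conf   # assumed location
    section: Global
    option: multizone
    value: "{{ openshift_gcp_multizone | default(false) | to_json }}"
```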

@@ -0,0 +1,11 @@
{% set storagedriver = provision_gce_docker_storage_driver | default('overlay2') %}
Contributor

This lives in container-runtime/templates/docker-storage-setup.j2.


oc serviceaccounts create-kubeconfig -n openshift-infra node-bootstrapper > /root/bootstrap.kubeconfig
gcloud compute project-info --project '{{ openshift_gcp_project }}' add-metadata --metadata-from-file '{{ openshift_gcp_prefix + openshift_gcp_clusterid | default("default") }}-bootstrap-config=/root/bootstrap.kubeconfig'
rm -f /root/bootstrap.kubeconfig No newline at end of file
Contributor

add newline \n


set -euo pipefail

oc serviceaccounts create-kubeconfig -n openshift-infra node-bootstrapper > /root/bootstrap.kubeconfig
Contributor

Where does this run? Should this /root/bootstrap.kubeconfig be in /etc/origin/master?

Contributor Author

This is the job that keeps the bootstrap metadata in project instances up to date. So it's creating a new bootstrap config, uploading it to gcp metadata, then deleting the temporary local file.
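For context, a hedged sketch of how such a recurring refresh job could be scheduled with Ansible; the schedule and script path are illustrative assumptions, not taken from this PR:

```yaml
# Hypothetical scheduling task: refresh the bootstrap kubeconfig stored in
# GCP project metadata once an hour using the script shown above.
- name: Schedule bootstrap kubeconfig refresh
  cron:
    name: refresh-bootstrap-kubeconfig
    minute: "0"
    user: root
    job: /usr/local/bin/openshift-bootstrap-update   # assumed script path
```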

@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 22, 2018
@smarterclayton
Contributor Author

/test gcp

2 similar comments
@smarterclayton
Contributor Author

/test gcp

@smarterclayton
Contributor Author

/test gcp

@smarterclayton
Contributor Author

smarterclayton commented Jan 22, 2018

The "install to GCP from openshift-ansible PR using image" test job is now passing. Can you guys help get the final round of reviews done so we can merge this and iterate? I want to see some movement on AWS cluster setup.

Contributor

@michaelgugino michaelgugino left a comment

I would like to see a few minor changes.

- import_playbook: init/main.yml

- import_playbook: openshift-checks/private/install.yml
- import_playbook: control_plane.yml
Contributor

Are users going to be encouraged to call control_plane.yml, or is this just for dev/test purposes?

As-is, init/main.yml will be called 3 times; we don't want that. We need to come up with a way to ensure init/main.yml is only called once per run of ansible-playbook.

I suggest we don't do this at all.

Contributor Author

I think eventually control_plane is the only part that will be in openshift-ansible, and the other parts will be called by different tools (or at a higher level).

Contributor Author

I agree the init part, and this being available to end users, are both issues. Do we have a practical way to run-once an included playbook? If we don't, then I think we should get rid of deploy_cluster and have control_plane and components (or whatever we call it) and do them in sequence. Installing a control plane has nothing to do with the components on top of it.
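One possible run-once pattern, sketched here with hypothetical variable names (this is not something the PR implements): have init/main.yml record a fact on the hosts it touches and gate any later import on that fact.

```yaml
# In the composing playbook: skip init if it already ran earlier in this
# ansible-playbook invocation.
- import_playbook: init/main.yml
  when: not (l_init_completed | default(False) | bool)

# As the final play of init/main.yml: mark init as done for the targeted
# hosts so subsequent imports of init/main.yml become no-ops.
- name: Record that initialization has completed
  hosts: all
  gather_facts: no
  tasks:
  - set_fact:
      l_init_completed: True
```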

Contributor

I would prefer to have the split of playbooks entirely as you suggest. However, I don't think we want to do that this late in the cycle.

Contributor Author

As a way forward, I think we can simply make the two parts "private" in a separate folder and include them here, and in AWS and GCP, with comments about pre and post conditions.

Contributor Author

Updated with the less contentious path, including a thorough description of what the two playbooks are for.


- import_playbook: openshift-master/private/additional_config.yml

- import_playbook: openshift-node/private/config.yml
Contributor

We could just add the install_gcp step here; it only runs against masters, and it can be controlled by inventory, so there's no need to split these steps up. In the gcp playbook above, the only real difference is that an additional play is called in the middle of deploy_cluster.yml.
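A minimal sketch of the inventory-gated approach described above; the variable, its value, and the playbook path are illustrative assumptions rather than code from this PR:

```yaml
# Hypothetical excerpt of deploy_cluster.yml: the GCP-specific play is
# imported in place, but skipped entirely unless the inventory opts in.
- import_playbook: openshift-master/private/additional_config.yml

- import_playbook: gcp/private/install_gcp.yml
  when: openshift_cloudprovider_kind | default('') == 'gce'

- import_playbook: openshift-node/private/config.yml
```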

Contributor Author

My concern is that long term we are going to have a very hard split here. If we're not ready to do that it's fine, but I don't want to see coupling between "have APIs running" and "install opinionated components" for much longer.

Contributor Author

I don't see any reason the control plane needs to know about GCP at this point - that's a GCP specific detail, or an AWS specific detail, not something the generic playbook should be aware of. We should be using composition, not inheritance, for many of these areas (where we are leveraging openshift-ansible as lower level playbooks)

Contributor

My concern is that long term we are going to have a very hard split here. If we're not ready to do that it's fine, but I don't want to see coupling between "have APIs running" and "install opinionated components" for much longer.

We have done a lot of decoupling in this release. Many people want a 'run this one command line' operation to install everything.

I suggest putting the install_gcp play here for simplicity's sake. It's not a lot of tasks to skip if it's not normally used, and it saves you from having to include a bunch of plays in your playbook.

deploy_cluster.yml is a special, top-level play. It utilizes re-usable plays. I don't see value in splitting it at this time, as this gcp play is the only one that needs that to happen and we can seamlessly add the install_gcp play into deploy_cluster.

Contributor Author

@smarterclayton smarterclayton Jan 22, 2018

AWS needs it split as well. Basically, deploy_cluster.yml is really "BYO cluster". BYO, AWS, GCP, OpenStack, and Azure would all be peers in importance. I'm fine with "BYO cluster" being its own playbook, but we should not be duplicating the steps in it across 5 different providers. So it's not really the right factoring even for 2/3 of the targets we deploy to at this very moment.

Contributor

IMO, the ideal workflow is:

  1. Deploy hosts, either manually or via a play we provide
  2. Prerequisites and user-provided setup tasks (cloud-init script, what have you)
  3. deploy_cluster.yml

Each individual cloud provider is going to have its own set of unique steps that need to be performed; using image-based deployments is a whole other topic altogether. I think deploy_cluster should be provider agnostic; the more we can have all providers converge on the same path, the better.

Contributor Author

For cloud providers (including OpenStack), the flow is going to end up looking more like:

  1. Use our golden image, or customize your own, where nodes are already preconfigured
  2. Deploy one or more core scale groups, using those images (that already have nodes configured)
  3. Install containerized components onto the masters (not RPM based, but image based) such that the control plane comes up
  4. Accept / bootstrap the master nodes and any other initial capacity provided
  5. Install core components like the autoscaler and autosigner
  6. Install additional components like the router, registry, etc

However, we are not going to be driving these from unified playbooks for much longer - instead, the cluster operator is going to be invoking provisioning in what is probably going to be three stages

  1. provision cloud topology (load balancers, etc) and scale groups
  2. provision control plane onto a set of nodes
  3. install individual components

Each of those is going to be an entrypoint playbook that needs to share code with deploy_cluster - deploy_cluster is not really going to be called by anyone except people on bare metal. I agree on bare metal the workflow you described is probably ok, except that's only 25% of the install footprint and not a priority. We should not be optimizing for "deploy_cluster.yml is the single entrypoint" any longer. We need to treat the three steps as discrete, and make sure that we are reusing those pieces. deploy_cluster can be cloud provider agnostic, but only because it won't get called :)

Contributor

I definitely prefer not having a single playbook try to do all things. I think it makes sense to have them be completely separate concerns.

Each of those is going to be an entrypoint playbook that needs to share code with deploy_cluster - deploy_cluster is not really going to be called by anyone except people on bare metal. I agree on bare metal the workflow you described is probably ok, except that's only 25% of the install footprint and not a priority.

While I think many users will not be using actual bare metal hosts, many of them will be provisioning hosts manually on a variety of clouds and virt infrastructure. I imagine some users will be quite happy with the instance/infra provisioning plays, and others will not be.

I think there is a lot of value in having a simple deploy process of

  1. Provision some elbs
  2. Provision some hosts
  3. Provision openshift

We had discussed previously making all the nodes 'bootstrap-able'. I think we should try to move in that direction so we have as common an install experience as possible across all deployment types.

name: openshift_gcp
tasks_from: setup_scale_group_facts.yml

- name: configure the control plane
Contributor

Just make this ../deploy_cluster.yml and add install_gcp.yml to deploy_cluster.yml in the appropriate spot. You can use a boolean set via inventory or via this playbook to ensure that install_gcp.yml is called only when we want it.

Contributor

We're probably going to want to add prerequisites.yml to this playbook. I have been working on some patches that integrate prerequisites.yml into these types of special plays.

See this node-scaleup PR for an example: #6784
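Without guessing at the contents of #6784, the shape being suggested is roughly the following, with the relative paths shown here as assumptions:

```yaml
# Hypothetical GCP entrypoint: run the shared prerequisites (container
# runtime, repos, etc.) before the provider-specific provisioning plays.
- import_playbook: ../../prerequisites.yml

- import_playbook: provision_install.yml   # assumed name of the GCP install playbook
```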

Contributor Author

The question is - why does prerequisites have to be special, vs being part of lower level playbooks? Is it only the dependency on openshift_facts?

Contributor

It's special because it incorporates some things (like installing docker and configuring docker storage) that were previously not included in openshift-ansible and had to be completed by the user before running openshift-ansible.

container_runtime (previously the docker role) needs to be configured very early in the deployment process. The reasons for this are:

  1. configuring registry authentication credentials
  2. openshift_version depends on correctly setup container runtime to query latest/available tags for containerized installs
  3. health checks want to ensure containers are available and registries are configured before running the installer and failing halfway through.

We could put this role in the init code path, but certain users don't want us to touch docker after it's been installed, because as you may know, we have given users license to modify things outside of openshift-ansible, and re-running the container_runtime role may have undesired effects for them. Additionally, it runs against the 'nodes' group, so having to run those tasks each time a play is run adds a lot of execution time to large clusters.

We could probably add it to deploy_cluster.yml now that we have refactored a lot of other components. But I think we're a little late in the cycle for that for 3.9 and we could potentially do it in 3.10.

If you look at 3.7 (or even worse, 3.6), you will see that it was a very tangled web of dependencies surrounding the docker role. I agree that prerequisites.yml is a compromise and might not be the best solution, but it works and is an improvement over past releases.

Contributor Author

openshift_version depends on correctly setup container runtime to query latest/available tags for containerized installs

This is bad. This shouldn't be a dependency then.

health checks want to ensure containers are available and registries are configured before running the installer and failing halfway through.

This seems buggy, like the above. That's not something we should be doing on the nodes.

Contributor Author

Note that I'm not arguing that these aren't valuable and important checks. My issue is that they are fundamentally things that provisioning has to do, so for the cloud providers we will always need to call them. Doing that from inside a system is going to be problematic because we will always have external dependencies. We either need to avoid the dependency on the container runtime to do the check, or figure out how to make those checks special and move them somewhere else.

Contributor

This is bad. This shouldn't be a dependency then.

If you compare what we have now versus the 3.7 and 3.6 releases, you'll see a vast reduction in the web of dependencies in this area. I agree we can definitely do more here, but I think we're in a pretty good place at the moment compared to previous releases. Personally, I would like to force users to specify exactly what they want and take the training wheels off in regards to openshift_version, but that's not something we implemented this cycle. I think we have the groundwork now to move in that direction if we choose to.

We either need to avoid the dependency on the container runtime to do the check, or figure out how to make those checks special and move them somewhere else.

container_runtime needs to be setup at some point regardless. We might as well do it right up front and check to make sure things are working before we start deploying to 100's of nodes. Whether we're using atomic, skopeo, or docker to interact with registries, we still need to ensure those items are properly configured on the hosts. I don't see a reason to not configure container_runtime up front when it's a prerequisite to pretty much the whole deployment.

Contributor Author

We might as well do it right up front and check to make sure things are working before we start deploying to 100's of node

This is the old way :) In the cloud provider model, we will need to check these things before we have a single node provisioned, because we are going to be using golden images. Anything that we need in order to check this needs to be something that can be installed into the docker image that the playbook runs out of, and it's better for it not to be tied in any way to a particular container runtime choice (whether skopeo or docker) because the component won't know that.

I think the checks are valuable, but we can't rely on installing a container runtime on a given provisioned node in order to run the checks. We probably shouldn't be installing skopeo either, given the amount of dependencies it pulls in and its relative inflexibility. The checks won't make any sense in cloud providers really, because we won't even invoke ansible on the nodes during the install (i.e. both AWS and GCP really operate only on the nodes selected as masters).

Contributor Author

At best, this seems like a check that has to be done after container runtime is installed in the node job, not up front in the playbooks.

Contributor

This is the old way :) In the cloud provider model, we will need to check these things before we have a single node provisioned, because we are going to be using golden images.

Building the golden images is currently no different on aws vs deploying on bare metal, as far as container_runtime is concerned. If the container runtime is not set up, then the node isn't provisioned; if there's no container runtime, it can't run containers.

We can't really check 'before' a single node is provisioned; the value is in checking that each node is configured correctly (ie, networking, proxy), not so much that you put in valid credentials and registry url.

We have a lot of checks built in for user friendliness. You or I or anyone who commits PRs here can probably identify most problems somewhat quickly; the average user just sees that the nodes didn't deploy and doesn't know why. Having the checks in place that we do gives them much better feedback if they are not advanced ansible users (I would argue many of our users are not 'advanced' ansible users; they just want the installer to work).

@@ -0,0 +1 @@
../../../filter_plugins No newline at end of file
Contributor

We are no longer doing filter_plugins this way. All filter plugins should be moved to lib_utils and that role should be brought into scope via a meta-depends in a role or at the play level in the 'roles:' section.
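For reference, the meta-depends pattern the comment describes looks roughly like this; the role that declares the dependency is a hypothetical example:

```yaml
# roles/openshift_gcp/meta/main.yml (hypothetical file): depending on
# lib_utils brings its filter plugins into scope whenever this role runs.
dependencies:
- role: lib_utils
```

At the play level, the equivalent is simply listing lib_utils under the play's roles: section before any tasks that need its filters.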

Contributor Author

I'm not using any special filter plugins; I'll run this without.

@smarterclayton
Contributor Author

/test launch-gcp

@smarterclayton
Contributor Author

/retest

Also add libcloud (required for dynamic GCE lookup) and which (relied on
by gcloud).
This moves all core functionality into the openshift-ansible repo, adds
the necessary equivalent entrypoint to the openshift-ansible installer
image, and ensures the dynamic inventory mechanisms in openshift-ansible
continue to work.

Notable changes from origin-gce:

* playbook extensions changed to .yml
* dynamic inventory subdirectory created to prevent accidental use
* use the custom entrypoint entrypoint-gcp for this image
* move tasks into openshift_gcp role
@smarterclayton smarterclayton force-pushed the move_gcp_in branch 2 times, most recently from a8c2910 to 9f01113 on January 23, 2018 at 15:58
- import_playbook: init/main.yml

- import_playbook: openshift-checks/private/install.yml
- import_playbook: common/openshift-cluster/private/control_plane.yml
Contributor

We're trying to remove things from 'openshift-cluster'

We can't call this control_plane if it installs nodes, because our upgrade_control_plane only upgrades masters and etcd hosts. Calling everything below pods the control plane is confusing.

Contributor

I think 'base_cluster' is better than 'control_plane'

I think the file path should be
common/private/base_cluster.yml

Contributor Author

I'm fine splitting out the nodes. GCP and AWS will be using golden images so in general they wouldn't call this step either (I call it today in case people want to manage masters directly in an invocation).

Contributor Author

Actually, when we change how the control plane is installed (moving to static pods and self-hosted) we'll have to configure masters as nodes first. So we shouldn't get attached to the current node order anyway.

For posterity, the arc we're on for the control plane is going to support only the following steps:

  1. runc installed
  2. kubelet installed inside system container
  3. host level config that includes static pods
  4. kubelet launches static pods, which include the control plane
  5. static pods form working control plane
  6. master nodes are bootstrapped into the cluster

The "bootstrap node" configuration will then be required as part of control plane setup (or required to have been completed). So we should expect that some node config goes before the control plane, but that all nodes are configured in bootstrap mode (the current node config is going to be deprecated and removed).

I'm going to move the node setup into the callers for now.


- import_playbook: openshift-management/private/config.yml
when: openshift_management_install_management | default(false) | bool
- import_playbook: common/openshift-cluster/private/components.yml
Contributor

I prefer this to be in common/private as well.

@smarterclayton
Contributor Author

/test gcp

@smarterclayton
Contributor Author

/test gcp

@smarterclayton
Contributor Author

@michaelgugino last comments addressed

Contributor

@michaelgugino michaelgugino left a comment

looks good to me.

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 24, 2018
Contributor

@kwoodson kwoodson left a comment

Sorry for the late 👍 . Was out of town last week. Looks like a lot of good code coming in. Some follow up needed but thanks for the PR @smarterclayton.

/lgtm

@openshift-merge-robot
Contributor

/test all [submit-queue is verifying that this PR is safe to merge]

@openshift-ci-robot

@smarterclayton: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/openshift-jenkins/install f2b14d2 link /test install
ci/openshift-jenkins/extended_conformance_install_crio f2b14d2 link /test crio

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@sdodson
Member

sdodson commented Jan 24, 2018

well known conformance flake

@sdodson sdodson merged commit 7e112b2 into openshift:master Jan 24, 2018