Conversation

@mtnbikenc (Member)

  • Removes playbooks/byo/config.yml standard entry-point
  • Adds new playbooks/deploy_cluster.yml

This PR is also intended to facilitate the conversation about how to rearrange and componentize the installation, instead of having one monolithic install process.

Trello: https://trello.com/c/5HLEg8UM/555-5-playbook-consolidation-openshift-cluster

@mtnbikenc self-assigned this on Dec 5, 2017
@openshift-ci-robot added the size/M label (30-99 lines, ignoring generated files) on Dec 5, 2017
@michaelgugino (Contributor) left a comment

/lgtm

Do we want to move to using a script in the CI job so we can just make future changes here?

@openshift-ci-robot added the lgtm label (ready to be merged) on Dec 5, 2017
@mtnbikenc (Member, Author)

@sdodson @michaelgugino @kwoodson
Since we are changing the UX for using openshift-ansible, I'd like to talk about how we can address some of the concerns regarding having a single monolithic install playbook. Here is one approach.

Multiple optional install playbooks

Optionally run the nfs and loadbalancer playbooks
openshift-nfs/config.yml
openshift-loadbalancer/config.yml

Run the playbook for installing the cluster
deploy_cluster.yml

---
- import_playbook: init/main.yml
- import_playbook: openshift-checks/private/install.yml
- import_playbook: openshift-etcd/private/config.yml
- import_playbook: openshift-master/private/config.yml
- import_playbook: openshift-node/private/config.yml

Optionally run the glusterfs playbook.
openshift-glusterfs/config.yml

Run the playbook for installing addons
deploy_addons.yml

---
- import_playbook: init/main.yml
- import_playbook: openshift-master/private/additional_config.yml
- import_playbook: openshift-hosted/private/config.yml
- import_playbook: openshift-metrics/private/config.yml
  when: openshift_metrics_install_metrics | default(false) | bool
- import_playbook: openshift-logging/private/config.yml
  when: openshift_logging_install_logging | default(false) | bool
- import_playbook: openshift-prometheus/private/config.yml
  when: openshift_hosted_prometheus_deploy | default(false) | bool
- import_playbook: openshift-service-catalog/private/config.yml
  when: openshift_enable_service_catalog | default(true) | bool
- import_playbook: openshift-management/private/config.yml
  when: openshift_management_install_management | default(false) | bool

Move deprecated variable warnings to a callback_plugin.

Do the above suggestions address the concerns and problems discussed in #5846? How could this process be improved for the customer as well as handling provisioning?

(Pending feedback)
/hold

@openshift-ci-robot added the do-not-merge/hold label (someone has issued a /hold command) on Dec 6, 2017
@michaelgugino (Contributor)

I'm not sure whether you can put when statements on import_playbook.

I think playbooks/deploy_cluster.yml should include playbooks/openshift-addons/private/config.yml

All of the add-ons should be boolean-driven.
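
A minimal sketch of that wiring, assuming a hypothetical playbooks/openshift-addons/private/config.yml (the path is the proposal above, not an existing file; the gated imports mirror the deploy_addons.yml sketch earlier in the thread):

---
# playbooks/openshift-addons/private/config.yml (hypothetical)
# Each add-on import is gated by its own boolean, defaulting to off.
- import_playbook: ../../openshift-metrics/private/config.yml
  when: openshift_metrics_install_metrics | default(false) | bool
- import_playbook: ../../openshift-logging/private/config.yml
  when: openshift_logging_install_logging | default(false) | bool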

@sdodson (Member)

sdodson commented Dec 6, 2017

I'm fine with the proposed structure. The private subdirectory gives us a decent method for things like playbooks/openshift-metrics/config.yml to include ../init.yml and then ./private/config.yml, so that each component can be addressed at varying levels: cluster deploy, addon deploy, or metrics alone.
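
As a concrete sketch of that layering (the relative paths are an assumption based on the tree above, not verified):

---
# playbooks/openshift-metrics/config.yml: public entry point (illustrative)
# Run the shared initialization, then the private component playbook, so that
# metrics can be deployed alone or composed into a larger install.
- import_playbook: ../init/main.yml
- import_playbook: private/config.yml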

I sure hope we can import_playbook conditionally; if not, we'll sink a lot of time skipping tasks in entire playbooks whenever someone isn't hand-crafting their installation workflow and is instead using our high-level playbooks.

I wonder if now would be a good time to consider how we'd do a "config loop"? I'm pretty sure we'll be asked to address that soon. Should we break each component up into "install", "configure", and "upgrade", where each is a completely independent set of tasks? Deploying a cluster would include install and configure. A config loop would include only configure. Building a golden image could include only install, plus alternate config tasks that are specific to bootstrap.
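
A sketch of how that split could compose, using hypothetical per-component playbooks (install.yml and configure.yml are assumptions, not existing files):

---
# Full cluster deploy for one component: install, then configure.
- import_playbook: openshift-metrics/private/install.yml
- import_playbook: openshift-metrics/private/configure.yml
# A config loop would import only configure.yml; a golden-image build would
# import install.yml plus bootstrap-specific config tasks.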

@michaelgugino (Contributor)

Okay, so I tested the when condition on import_playbook. I believe the docs mention this: the condition is applied to every role and task found inside, not used to decide whether to apply the playbook itself. So we're still going to see a bunch of skipped tasks.

For most add-ons, the skipped tasks will cost almost no time, since they run against the masters or a single master.

For add-ons that need to be run against nodes, we should dynamically build groups like we do currently for the network plugins.
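
For illustration, that dynamic-group pattern looks roughly like this (the group and variable names here are made up):

---
# Play 1: evaluate which nodes need the add-on and place them in a
# runtime-only group, mirroring the network plugin pattern.
- name: Evaluate node group for the example add-on
  hosts: oo_nodes_to_config
  gather_facts: false
  tasks:
  - name: Add matching hosts to oo_nodes_use_example
    group_by:
      key: oo_nodes_use_example
    when: openshift_use_example_addon | default(false) | bool

# Play 2: only hosts placed in the group run the add-on roles; if the group
# is empty, the play is skipped instead of iterating skipped tasks per node.
- name: Configure the example add-on
  hosts: oo_nodes_use_example
  roles:
  - example_addon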

@mtnbikenc (Member, Author)

We already have when: on playbook includes; it should be the same for import_playbook:.
https://github.com/openshift/openshift-ansible/blob/master/playbooks/common/openshift-cluster/config.yml#L7

Provisioning has a problem with the addons being part of the main cluster deployment, which results in rewriting the deployment steps.

The steps I'm attempting to define are (a rough composition is sketched after the list):

  1. Do things upfront (prerequisites, nfs, loadbalancer, networkmanager, etc.)
  2. Build the basic cluster (etcd, master, nodes)
  3. Run any addons
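
Composed as top-level playbooks, those three steps might look like the following sketch (the layout is an assumption, not the final structure):

---
# 1. Do things upfront
- import_playbook: init/main.yml
- import_playbook: openshift-nfs/private/config.yml
- import_playbook: openshift-loadbalancer/private/config.yml

# 2. Build the basic cluster
- import_playbook: openshift-etcd/private/config.yml
- import_playbook: openshift-master/private/config.yml
- import_playbook: openshift-node/private/config.yml

# 3. Run any addons
- import_playbook: deploy_addons.yml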

@mtnbikenc (Member, Author)

Skipped tasks on conditional playbook includes are nothing new. I would have liked an include_playbook:, which would have been more dynamic at that level, but that is not a thing.

@mtnbikenc (Member, Author)

I'd also like to come up with a better way to handle those dynamically built groups, in order to avoid:

 [WARNING]: Could not match supplied host pattern, ignoring: oo_nodes_use_flannel
 [WARNING]: Could not match supplied host pattern, ignoring: oo_nodes_use_calico
 [WARNING]: Could not match supplied host pattern, ignoring: oo_nodes_use_contiv
 [WARNING]: Could not match supplied host pattern, ignoring: oo_nodes_use_kuryr
 [WARNING]: Could not match supplied host pattern, ignoring: oo_nodes_use_nuage
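
One way to avoid those warnings, assuming upstream keeps this pattern working, would be to pre-declare the groups as empty in a YAML inventory so the host patterns always resolve (a sketch, not something this PR implements):

all:
  children:
    oo_nodes_use_flannel: {}
    oo_nodes_use_calico: {}
    oo_nodes_use_contiv: {}
    oo_nodes_use_kuryr: {}
    oo_nodes_use_nuage: {}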

@michaelgugino (Contributor)

I'd also like to come up with a better way to handle those dynamically built groups, in order to avoid:

I can live with a couple of messages like that, as long as upstream doesn't break the functionality around that pattern.

Provisioning has a problem with the addons being part of the main cluster deployment, which results in rewriting the deployment steps.

I'm not sure what you mean.

@mtnbikenc (Member, Author)

Provisioning has a problem with the addons being part of the main cluster deployment, which results in rewriting the deployment steps.

I'm not sure what you mean.

@kwoodson had to rewrite the cluster install steps in #5846 instead of just being able to consume a portion of the cluster build process. Any changes we make to that install process will have to be updated in both the AWS provisioning playbooks and whatever we have as deploy_cluster.yml.

@mtnbikenc force-pushed the consolidate-deploy-cluster branch from 87049ec to d1c96ee on Dec 6, 2017, 21:33
@openshift-merge-robot (Contributor)

/lgtm cancel //PR changed after LGTM, removing LGTM. @michaelgugino @mtnbikenc

@openshift-merge-robot removed the lgtm label on Dec 6, 2017
@mtnbikenc force-pushed the consolidate-deploy-cluster branch from d1c96ee to cccdd4a on Dec 6, 2017, 21:44
@openshift-ci-robot added the size/L label (100-499 lines, ignoring generated files) and removed the size/M label on Dec 6, 2017
@mtnbikenc (Member, Author)

/retest

@openshift-ci-robot

openshift-ci-robot commented Dec 7, 2017

@mtnbikenc: The following test failed, say /retest to rerun them all:

Test name: ci/openshift-jenkins/extended_conformance_install_crio
Commit: cccdd4a
Rerun command: /test crio

Please help us cut down on flakes by linking to an open issue when you hit one in your PR.


@mtnbikenc (Member, Author)

bot, retest this please

@mtnbikenc (Member, Author)

bot?

@sdodson merged commit 9ee49e0 into openshift:master on Dec 7, 2017
@mtnbikenc deleted the consolidate-deploy-cluster branch on Dec 7, 2017, 21:37
@mtnbikenc removed the do-not-merge/hold label on Dec 12, 2017
@mtnbikenc (Member, Author)

/cherrypick release-3.8

@openshift-cherrypick-robot

@mtnbikenc: #6361 failed to apply on top of branch "release-3.8": exit status 128

In response to this:

/cherrypick release-3.8

@mfojtik (Contributor)

mfojtik commented Jan 23, 2018

@mtnbikenc (Member, Author)

Opened PR to address recent changes: openshift/openshift-docs#7274

@smarterclayton (Contributor)

As part of #6541 deploy_cluster was split into two equal includes (components and control plane) which are now included by deploy_cluster, AWS, and GCP (the top level entry points). Deploy cluster is only relevant for the preconfigured hosts use case (over time, only bare metal), and will not be a common path on cloud providers.
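
The resulting shape, as I understand the split (the paths follow #6541's direction and should be treated as approximate):

---
# deploy_cluster.yml after the split (approximate)
- import_playbook: init/main.yml
- import_playbook: common/private/control_plane.yml
- import_playbook: common/private/components.yml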
