Skip to content

GCE-support#658

Merged
twiest merged 8 commits intoopenshift:masterfrom
chengchengmu:gce-support
Oct 12, 2015
Merged

GCE-support#658
twiest merged 8 commits intoopenshift:masterfrom
chengchengmu:gce-support

Conversation

@chengchengmu
Copy link
Contributor

Hello,

The file that has been merged by error is reverted to the previous and correct version : playbooks/openstack/openshift-cluster/launch.yml

The other file that impacts Openstack is nova.py. The modification is a fix.
Now openstack deployment can be launched from a directory different than openshift-ansible.

This code successfully deploys clusters on openstack and it seems to be working well with or without infra node.

Thank you

arsogukpinar and others added 2 commits October 5, 2015 18:20
Template name is conflicting with the template name from 'eap6-basic-sti.json' .
@openshift-bot
Copy link

Can one of the admins verify this patch?

Add kind/apiVersion to scheduler.json template
@chengchengmu chengchengmu mentioned this pull request Oct 6, 2015
@chengchengmu
Copy link
Contributor Author

The only difference of my code with previous PR #641 is that playbooks/openstack/openshift-cluster/launch.yml is right now

@twiest
Copy link
Contributor

twiest commented Oct 7, 2015

@wshearn can you please make sure this PR works for you. You said the last one (PR #641) broke you guys.

@twiest
Copy link
Contributor

twiest commented Oct 7, 2015

ok to test

@twiest
Copy link
Contributor

twiest commented Oct 7, 2015

@sdodson are you ok with this version?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though this change makes sense, this seems unrelated to the purpose of this PR, why's this here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this doesn't make sense to me. The way it's written it would use nova.ini in the same location as nova.py (ie in the git checkout) which I don't suspect anyone would modify in place. I think it'd be more likely that they'd expect it to look for it in cwd, do you agree?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you that this change is not related at all to GCE ... however this change makes sense :

nova.ini is located in the same directory than nova.py
Getcwd will return the location from where we execute the script nova.py resulting in an error if the script is executed from directory different. So this is fixing that issue.

On AWS, GCE, LIBVIRT we take the .ini file in the same location as the .py
So I think it was just a mistake to use getcwd for Openstack

Anyways we can take this change off and make an other PR.
It's up to you to judge if it's worth for a minor change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with leaving it in.

@sdodson
Copy link
Member

sdodson commented Oct 7, 2015

@twiest LGTM

@twiest
Copy link
Contributor

twiest commented Oct 7, 2015

@sdodson thanks

Just waiting on a +1 from @wshearn. Once we get that, I think we'll be ready to merge.

@chengchengmu
Copy link
Contributor Author

@sdodson @twiest Thank you

Just waiting for green light of @wshearn

@wshearn
Copy link
Contributor

wshearn commented Oct 8, 2015

👎 Still breaks node labels

localhost                  : ok=79   changed=9    unreachable=0    failed=0   
whearntest-master-0a2ff    : ok=166  changed=57   unreachable=0    failed=0   
whearntest-node-compute-69228 : ok=61   changed=25   unreachable=0    failed=0   
whearntest-node-compute-a2a1f : ok=61   changed=25   unreachable=0    failed=0   
whearntest-node-compute-f6333 : ok=61   changed=25   unreachable=0    failed=0   
whearntest-node-infra-8c6d1 : ok=61   changed=25   unreachable=0    failed=0   
whearntest-node-infra-97b9f : ok=61   changed=25   unreachable=0    failed=0   

[02:01 PM] [gce-support %]
[0]Pluto:~/Work/ansible/test_labels $ ssh root@52.23.238.XXX
Warning: Permanently added '52.23.238.XXX' (ECDSA) to the list of known hosts.
Last login: Thu Oct  8 14:01:52 2015 from XXXZZZWWWYYY
[root@172 ~]# oc get nodes
NAME          LABELS                               STATUS                     AGE
172.20.3.23   kubernetes.io/hostname=172.20.3.23   Ready                      3m
172.20.3.24   kubernetes.io/hostname=172.20.3.24   Ready                      3m
172.20.3.58   kubernetes.io/hostname=172.20.3.58   Ready                      3m
172.20.3.59   kubernetes.io/hostname=172.20.3.59   Ready                      3m
172.20.3.60   kubernetes.io/hostname=172.20.3.60   Ready                      3m
172.20.3.76   kubernetes.io/hostname=172.20.3.76   Ready,SchedulingDisabled   2m

@chengchengmu
Copy link
Contributor Author

@wshearn what do we expect for node's labels ?

When the variable openshift_node_labels is not defined, should it default to none instead of empty ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @wshearn noted this breaks node labels, should be

labels: "{{ lookup('oo_option', 'openshift_node_labels') | default( openshift_node_labels | default(none) ) }}"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, that doesn't work either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

labels: "{{ lookup('oo_option', 'openshift_node_labels') | default( openshift_node_labels | default(none) , true) }}"
With @twiest 's adivce, what do you think about this ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@menren Looks good to me, but will need to be tested by @wshearn to verify it works for him.

Please update this PR with the change and when you're done, ask @wshearn to test it.

Thanks!

@wshearn
Copy link
Contributor

wshearn commented Oct 8, 2015

This is a recently built cluster:

[root@172 ~]# oc get nodes
NAME           LABELS                                                              STATUS                     AGE
172.31.25.70   kubernetes.io/hostname=172.31.25.70,region=us-east-1,type=master    Ready,SchedulingDisabled   2d
172.31.30.88   kubernetes.io/hostname=172.31.30.88,region=us-east-1,type=compute   Ready                      2d
172.31.30.89   kubernetes.io/hostname=172.31.30.89,region=us-east-1,type=compute   Ready                      2d
172.31.30.90   kubernetes.io/hostname=172.31.30.90,region=us-east-1,type=compute   Ready                      2d
172.31.30.91   kubernetes.io/hostname=172.31.30.91,region=us-east-1,type=compute   Ready                      2d
172.31.30.92   kubernetes.io/hostname=172.31.30.92,region=us-east-1,type=compute   Ready                      2d
172.31.30.93   kubernetes.io/hostname=172.31.30.93,region=us-east-1,type=compute   Ready                      2d
172.31.31.6    kubernetes.io/hostname=172.31.31.6,region=us-east-1,type=infra      Ready                      2d
172.31.31.7    kubernetes.io/hostname=172.31.31.7,region=us-east-1,type=infra      Ready                      2d

Both of these clusters were built in AWS and not GCE.

@chengchengmu
Copy link
Contributor Author

oo_option returns an empty array, even empty the variable is defined right ?
if right, it won't default to the variable openshift_node_labels

If this is the case, giving priority to the variable openshift_node_labels should resolve this issue.

I will check it tomorrow at office

@wshearn @sdodson
Thank you

@twiest
Copy link
Contributor

twiest commented Oct 8, 2015

@menren To replace an empty array (or anything that evaluates to "false" in python), pass a second parameter to default like this:

"{{ [] | default(my_default, true) }}"

In this case, my_default will replace the empty list.

For more information, see our best practices guide:
https://github.com/openshift/openshift-ansible/blob/master/docs/best_practices_guide.adoc#filters

@chengchengmu
Copy link
Contributor Author

@twiest thank you for sharing knowledge !

…_labels when oo_option returns an empty list
@chengchengmu
Copy link
Contributor Author

@wshearn
Now labels takes the value of the oo_option variable (passed by -o option of bin/cluster), then defaults to the variable openshift_node_labels defined by playbooks if defined, else defaults to none.
This should fix the labels. Could you check please ?

However this choice may be discussed :
Currently oo_option variable overrides the playbook's variable.
We could also concatenate both variables

What is your opinion ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per the best practices guide, we should always pass in "true" as the second param of default, unless we're dealing with boolean values.

So, this should be:

infra_names: "{{ infra_names_output.results | default([], true)

Best practices guide:
https://github.com/openshift/openshift-ansible/blob/master/docs/best_practices_guide.adoc#filters

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pass in true for both default calls (same as above)

@twiest
Copy link
Contributor

twiest commented Oct 9, 2015

ok to test

@chengchengmu
Copy link
Contributor Author

@twiest All the default mentionned have second param. as true now.
Thanks

@twiest
Copy link
Contributor

twiest commented Oct 9, 2015

ok to test

@twiest
Copy link
Contributor

twiest commented Oct 9, 2015

This LGTM. Waiting on @wshearn to see if the problem he was seeing has been fixed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@menren hey, sorry for missing this earlier, but I just noticed this.

So we use infra nodes, why is this commented out?

We want it like the AWS playbook which has this section in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@twiest

We use set_node_launch_facts_tasks.yml for infra_node (https://github.com/openshift/openshift-ansible/blob/master/playbooks/gce/openshift-cluster/launch.yml#L31) : that's not working obviously, the playbook stop on error while trying to launch infra nodes instances.

I tried to repair it with set_infra_launch_facts_tasks.yml, the instances were correctly launched, however I encountered many issues at that time (in August).

Currently the GCE deployment isn't working at all, with or without infra-node. I suppose nobody is working with it. If somebody is working with it, he/she has to tell me how it can work in the current state.

By commenting this code, the rest is working.
We have to wait an other patch in order to make the infra-node work.

This PR fixes the standard deployment without infra and will allow people to work on GCE deployment and repair the infra-node

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. I just tried and I was able to launch by making the infra section look the same on gce as it does in aws.

So, it now looks like this in my branch:

  - include: ../../common/openshift-cluster/set_node_launch_facts_tasks.yml
    vars:
      type: "infra"
      count: "{{ num_infra }}"
  - include: tasks/launch_instances.yml
    vars:
      instances: "{{ node_names }}"
      cluster: "{{ cluster_id }}"
      type: "{{ k8s_type }}"
      g_sub_host_type: "{{ sub_host_type }}"

  - add_host:
      name: "{{ master_names.0 }}"
      groups: service_master
    when: master_names is defined and master_names.0 is defined

Locally, I also removed this file as it's no longer being used:

playbooks/common/openshift-cluster/set_infra_launch_facts_tasks.yml

I'm ok if we get this as a separate PR as I have some other things I had to patch to be able to launch a cluster in GCE. I'll @ mention you when I create my PR to make sure it works for you too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@twiest good if you can repair the infra nodes for GCE because I work without it and don't know how it works exactly

@wshearn
Copy link
Contributor

wshearn commented Oct 12, 2015

👍

PLAY RECAP ******************************************************************** 
localhost                  : ok=79   changed=9    unreachable=0    failed=0   
whearntest-master-7b695    : ok=166  changed=58   unreachable=0    failed=0   
whearntest-node-compute-2ec70 : ok=61   changed=25   unreachable=0    failed=0   
whearntest-node-compute-a24b5 : ok=61   changed=25   unreachable=0    failed=0   
whearntest-node-compute-b899a : ok=61   changed=25   unreachable=0    failed=0   
whearntest-node-infra-c02a3 : ok=61   changed=25   unreachable=0    failed=0   
whearntest-node-infra-d2a77 : ok=61   changed=25   unreachable=0    failed=0   

[01:28 PM] [gce-support %]
[0]Pluto:~/Work/ansible/test_labels $ ssh root@XX.YY.WW.ZZ "oc get nodes"
Warning: Permanently added 'XX.YY.WW.ZZ' (ECDSA) to the list of known hosts.
NAME           LABELS                                                              STATUS                     AGE
172.20.3.126   kubernetes.io/hostname=172.20.3.126,region=us-east-1,type=infra     Ready                      4m
172.20.3.127   kubernetes.io/hostname=172.20.3.127,region=us-east-1,type=infra     Ready                      4m
172.20.3.230   kubernetes.io/hostname=172.20.3.230,region=us-east-1,type=compute   Ready                      4m
172.20.3.231   kubernetes.io/hostname=172.20.3.231,region=us-east-1,type=compute   Ready                      4m
172.20.3.232   kubernetes.io/hostname=172.20.3.232,region=us-east-1,type=compute   Ready                      4m
172.20.3.54    kubernetes.io/hostname=172.20.3.54,region=us-east-1,type=master     Ready,SchedulingDisabled   4m

@chengchengmu
Copy link
Contributor Author

@wshearn Thank you for the test and the thumb

@twiest
Copy link
Contributor

twiest commented Oct 12, 2015

Merging because major issues have been addressed. We'll fix the infra node issue in a separate PR.

twiest added a commit that referenced this pull request Oct 12, 2015
@twiest twiest merged commit 3547698 into openshift:master Oct 12, 2015
@twiest
Copy link
Contributor

twiest commented Oct 12, 2015

@menren great job on this PR. Thank you for your tireless efforts to make our GCE support better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants