Initial Push of Azure Ansible Ref Arch Templates #125
themurph merged 34 commits into openshift:master from glennswest:master
Conversation
Can one of the admins verify this patch?
@@ -0,0 +1,8 @@
oc new-project demo
Do you want to try to migrate to the common validation playbooks here? Less work to maintain and closer to the ansible migration
Not at this point, as I want to tech freeze and get the docs finished. Then we can add lots of value; I already have hooks for the 3.5 release.
"rhel" : {
    "publisher" : "RedHat",
    "offer" : "RHEL",
    "sku" : "7.2",
Should we ship 7.3 due to the requirements for OpenShift 3.4?
If you're worried about OCP 3.3 on RHEL, we shipped it about a month ago on AWS with no issues.
Not there yet. This is currently 3.3, so it will only change when 3.4 becomes available; it's not changeable until then.
OK, 7.3 is available; I will change it.
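For reference, the bumped image block might look like the following. This is a sketch only; the exact RHEL 7.3 sku string and the "version" field should be verified against what the Azure Marketplace actually publishes:

```json
"rhel" : {
    "publisher" : "RedHat",
    "offer" : "RHEL",
    "sku" : "7.3",
    "version" : "latest"
}
```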
"fileUris" : [
    "[concat(parameters('baseTemplateUrl'), 'bastion.sh')]"
],
"commandToExecute" : "[ concat('sh bastion.sh ', resourceGroup().name, ' ', parameters('adminUsername'), ' ', parameters('adminPassword'), ' ', reference(concat(parameters('vmName'), 'pip')).dnsSettings.fqdn, ' ', padLeft(parameters('numberOfNodes'), 2, '0'), ' ', parameters('routerExtIP'),' ', parameters('RHNUserName'), ' ', parameters('RHNPassword'),' ', parameters('SubscriptionPoolId'), ' ', variables('sQuote'), parameters('sshPrivateData'), variables('sQuote'), ' ', variables('sQuote'), parameters('sshKeyData'), variables('sQuote') ) ]"
Probably just a small nit, but maybe change the var to RHSM rather than RHN. It should make retrofitting to the common plays easier.
This is actually a variable passed between the ARM template and the bash template; it has no material effect.
You are correct that prefixing your credential variables with RHN has no material effect in this case. However, the reasoning behind @cooktheryan's suggestion of standardizing the credential variables by using the prefix RHSM over RHN will go a long way in ensuring that future refactoring efforts are less painful. I'm certain that @cooktheryan wasn't trying to single you out in this comment; avoiding technical debt right from the start is very relevant and needs all of our attention. Standardizing these common variables is something we, as a team, all need to work on together. We could tack this on to #117 or create a new issue, but either way, doing the work up front will pay dividends in the future. /cc @dav1x @markllama @scollier
+1. Since the VMware and GCE plays all used the same var names, it has made #126 that much easier. The common vars also allowed me to use all the same plays regardless of provider, which is our end goal in getting this work productized.
"properties" : {
    "protocol" : "Tcp",
    "sourcePortRange" : "*",
    "destinationPortRange" : "9090",
Will you be proxying access to the cockpit UI through the bastion?
chmod 600 /root/.ssh/id_rsa

sleep 30
cat <<EOF > /root/setup_ssmtp.sh
Have you tested failing the mail step? The reason I ask is that it would be a bummer if the installation hung just due to an email. Also, is the email portion required or optional? If it is currently required, could it be made optional?
Yes, many times. It runs fine without the mail even being there. (Early on I hit that quite a few times, so it's safe.)
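The tolerance described above can also be made explicit in the script. A minimal sketch of wrapping an optional step so its failure can never block the install; the `run_optional` helper is hypothetical, and `false`/`true` stand in for the real `setup_ssmtp.sh` invocation:

```shell
#!/bin/sh
# Hypothetical helper: run a step, warn on failure, but always return success
# so the surrounding installation flow keeps going.
run_optional() {
    "$@" || echo "WARNING: optional step '$*' failed; continuing" >&2
}

run_optional false   # stand-in for a failing setup_ssmtp.sh
run_optional true    # stand-in for a successful run
echo "install continues"
```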
cat <<EOF > /etc/sysconfig/docker-storage-setup
DEVS=/dev/sdc
VG=docker-vg
Might be worth adding this in as well:
DATA_SIZE=95%VG
EXTRA_DOCKER_STORAGE_OPTIONS="--storage-opt dm.basesize=3G"
The extra storage options will help with quota, if I remember right. Also, I'm not sure how much of the created VG is used; it might be worth testing or providing a value.
Will remove docker support from the bastion.
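For the nodes that do keep docker storage (the bastion is dropping it), the combined /etc/sysconfig/docker-storage-setup suggested above would read as below. The DATA_SIZE and dm.basesize values are the reviewer's suggestions, not tested defaults:

```
DEVS=/dev/sdc
VG=docker-vg
DATA_SIZE=95%VG
EXTRA_DOCKER_STORAGE_OPTIONS="--storage-opt dm.basesize=3G"
```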
subscription-manager repos --enable="rhel-7-server-ose-3.3-rpms"
yum -y install atomic-openshift-utils
yum -y install git net-tools bind-utils iptables-services bridge-utils bash-completion httpd-tools
yum -y install docker
Will you need docker on the bastion?
There is support there for it, but at the moment, no.
Will remove docker support from the bastion.
rhn_pool_id=${RHNPOOLID}
openshift_install_examples=true
deployment_type=openshift-enterprise
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/master/htpasswd'}]
I wouldn't do HTPasswd, especially if we are multi-master. You should be able to borrow the Gmail, GitHub, or LDAP creds setup from one of the completed archs.
Actually, the plan is that HTPasswd is the "emergency login". I've done the work for Keycloak already, and it will be the auth solution; this has been the strategy since day one. With Keycloak everything can be integrated. I've done initial testing with it; I'm only missing the auto setup of the providers.
remote_user=${AUSERNAME}

openshift_master_default_subdomain=${ROUTEREXTIP}.xip.io
openshift_use_dnsmasq=False
@detiber I think you had a comment about this before, but I can't remember if there was a resolution.
I did test this, and everything appears to be working. I read through the docs and, as far as I can tell, everything is functional. So it's a technical risk at this point to change it.
openshift_public_hostname=${RESOURCEGROUP}.trafficmanager.net

openshift_master_cluster_method=native
openshift_master_cluster_hostname=${RESOURCEGROUP}.trafficmanager.net
Are these hard-coded domains correct?
Yes, and working, and mandated by Azure. It supports multiple clusters, driven by resource group.
Is there a play that sets the domain to a customer's domain?
First, we do give them a chance to specify one within the Azure naming conventions; then it's fully automatic. If they want a totally separate DNS server or domain outside, they can manually add it.
# default selectors for router and registry services
openshift_router_selector='region=infra'
openshift_registry_selector='region=infra'
You should be able to scale the router and registry here based on the number of infra nodes you have.
I can consider that in the future; I need to look at how to make this happen.
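If this is picked up later, a hedged inventory sketch of the idea: it assumes the openshift_hosted_* replica variables are supported by the installer version in use, and that there are two infra nodes:

```
openshift_hosted_router_replicas=2
openshift_hosted_registry_replicas=2
```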
# default selectors for router and registry services
openshift_router_selector='region=infra'
openshift_registry_selector='region=infra'
Should probably provide a default node selector
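A minimal sketch of such a default, assuming the region=primary labels used in this inventory and that osm_default_node_selector is honored by the installer version in use:

```
osm_default_node_selector='region=primary'
```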
master2 openshift_node_labels="{'region':'master','zone':'default'}"
master3 openshift_node_labels="{'region':'master','zone':'default'}"
node[01:${NODECOUNT}] openshift_node_labels="{'region': 'primary', 'zone': 'default'}"
infranode1 openshift_node_labels="{'region': 'infra', 'zone': 'default'}"
I'm not sure region is the best name for the labels. Once we move to multi-region or more federated deployments, region may be used to show the geographic location of the nodes. Most of the providers used role rather than region. As for the node label, rather than primary we used app. Completely optional on primary vs. app, but it keeps the archs in sync.
I can change the nodes to "app" if you think that's sensible. The above is based on the documentation, so which is better: stick to the docs, or copy the other providers?
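If the other providers' convention is adopted, the labels might read as below. A sketch only; the router and registry selectors elsewhere in the inventory would have to change to match (e.g. 'role=infra'):

```
master[1:3] openshift_node_labels="{'role': 'master', 'zone': 'default'}"
node[01:${NODECOUNT}] openshift_node_labels="{'role': 'app', 'zone': 'default'}"
infranode1 openshift_node_labels="{'role': 'infra', 'zone': 'default'}"
```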
cat <<EOF > /home/${AUSERNAME}/subscribe.yml
---
- hosts: all
  vars:
I believe in the previous PR @detiber had suggested serial during the registration process. If not, this should probably be in some sort of serial format just to prevent a herd trying to register all at once.
Actually, the subscription-with-retry that I've implemented is solid. Across a large number of tests, no matter what, I'd get failures until I added the retry logic; with it, registration is very solid and works every time. Going serial without the retries would significantly increase failures and "bad" days where it just didn't work.
Also, there are issues in the subscribe Ansible module that are new in Ansible 2.1.
Glenn, can you point to or create a BZ for this issue? There are hundreds of people who rely on the official "Subscription Manager" module, so if you have a reproducible bug, make sure to at least report it, or even better, submit a patch.
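For reference, the retry pattern described can be expressed with Ansible's until/retries mechanism. A sketch with placeholder variable names, not the actual task from this PR:

```yaml
- name: Register with subscription manager, retrying transient failures
  command: subscription-manager register --username={{ rhsm_user }} --password={{ rhsm_pass }}
  register: reg_result
  until: reg_result.rc == 0
  retries: 5
  delay: 30
  no_log: true   # keep credentials out of the log
```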
ignore_errors: yes
- name: disable all repos
  shell: subscription-manager repos --disable="*"
- name: enable rhel7 repo
Rather than using shell for all of the repos, I would suggest using the already-used playbooks/roles/rhsm-repos/tasks/main.yaml.
This was changed due to Ansible problems.
- name: install the latest version of PyYAML
  yum: name=PyYAML state=latest
- name: Install the ose client
  yum: name=atomic-openshift-clients state=latest
Probably not required, as this should be taken care of during the install on all hosts except the bastion.
I found it useful for checks and debugging, as it was not happening otherwise.
async: 1200
poll: 10
- name: Wait for Things to Settle
  pause: minutes=5
I would potentially add installing, starting, and enabling NetworkManager.
It is installed on all the hosts except the bastion.
So is there a requirement on the bastion host?
No, it will not be required.
- name: Change Node perFSGroup Quota Setting
  lineinfile: dest=/etc/origin/node/node-config.yaml regexp=^perFSGroup: line="  perFSGroup: 512Mi"
- name: Update Mount to Handle Quota
  mount: fstype=xfs name=/var/lib/origin/openshift.local/volumes src=/dev/sdd opts="gquota" state="mounted"
Is it possible to cloud-init this task?
Actually, this could potentially move into per-node scripts, but Ansible does seem to do it fine.
- name: Create Master Directory
  file: path=/etc/origin/master state=directory
- name: add initial user to OpenShift Enterprise
  shell: htpasswd -c -b /etc/origin/master/htpasswd ${AUSERNAME} ${PASSWORD}
Definitely look towards another auth.
"priority" : 2000,
"direction" : "Inbound"
}
}
This may be answered later on, but is 22 allowed only from the bastion, or is 22 allowed from both internal and external to the infra nodes?
22 is allowed only from the bastion host.
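A sketch of what such a restricted rule might look like in the network security group; the rule name and the bastion subnet CIDR are hypothetical and would need to match the template's actual address space:

```json
{
    "name" : "ssh-from-bastion",
    "properties" : {
        "protocol" : "Tcp",
        "sourcePortRange" : "*",
        "destinationPortRange" : "22",
        "sourceAddressPrefix" : "10.0.0.0/24",
        "access" : "Allow",
        "priority" : 2000,
        "direction" : "Inbound"
    }
}
```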
#!/bin/bash

#yum -y update
yum -y install wget git net-tools bind-utils iptables-services bridge-utils bash-completion docker
I don't think you need these packages, as the installer should take care of docker and possibly any other items.
This is recommended in the docs: https://docs.openshift.com/container-platform/3.3/install_config/install/host_preparation.html
So how do you want to handle it?
I believe these packages are only required on the host performing the installation. If you check out the AWS plays, we don't actually install those packages.
EXTRA_DOCKER_STORAGE_OPTIONS="--storage-opt dm.basesize=3G"
EOF

docker-storage-setup
I'd let the installer handle storage setup and the enabling of docker. This may be a good location, though, to do your mounts.
This is from the documentation, plus the 3G was recommended in a previous comment.
How should I handle the difference?
Maybe that was a bad comment on my part. I would just remove the step of actually running docker-storage-setup yourself.
chmod 600 /root/.ssh/id_rsa

subscription-manager unregister
subscription-manager register --username $RHNUSERNAME --password $RHNPASSWORD
Could you just re-use the existing ansible items from above here?
Note the intention of doing it here was to get the logging host up so it could provide logging before everything else is ready. For the moment, I'm deleting the logging host.
"protocol" : "Tcp",
"sourcePortRange" : "*",
"destinationPortRange" : "22",
"sourceAddressPrefix" : "*",
I feel like master 22 access should be limited to the bastion only.
Funny, it should be the same as infra; I'll dig in and figure out why.
"properties" : {
"protocol" : "Tcp",
"sourcePortRange" : "*",
"destinationPortRange" : "8443",
I'm not sure how Azure handles VIPs, but is it possible to use 443 via the console_port variable at install time, and set the VIP to use 443 as well? Just to use standard ports; I remember how relieved I was to move from 8443 to 443.
I won't block on this, but please consider standardizing the port for the same reasons as stated above #125 (comment)
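If the port is standardized later, a hedged inventory sketch: openshift_master_api_port and openshift_master_console_port are the usual installer variables for this, but confirm they are honored by the installer version in use, and the Azure load balancer / traffic manager front end would have to be updated to match:

```
openshift_master_api_port=443
openshift_master_console_port=443
```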
…nd script names that are external cannot be changed
…variable names and functional areas
…e is correct intent.
…ontainer Platform
At this point, I've processed the list of suggestions, with the following exceptions:
On the comment about auth: it will be done shortly in Keycloak, but is out of scope for this commit.

@glennswest Great job on refactoring, resolving, answering, and pushing out all these requests for changes. Approving and merging.
Initial Push of Azure Ansible Ref Arch Templates
Significant Changes:
Architecture
All other scripts, playbooks, etc. are created within the one script. Part of this is based on the way Azure Linux Extensions work, and the rest is done for transparency.