This repository has been archived by the owner on May 22, 2020. It is now read-only.

min-turnup: azure #106

Closed
wants to merge 4 commits into from

Conversation

colemickens

I changed:

  • Changed --service-cluster-ip-range so that it does not overlap with the IP range that will be used for the actual nodes.
  • Changed the manifest templates to jinja2; it was easier to express the conditional syntax I needed for adding the --cloud-config flag.
  • Changed the master pod manifests to use hyperkube-amd64 instead of the kube-{apiserver,scheduler,controller-manager} images. kubelet.service was already using hyperkube-amd64, so the image is already on the box, and it makes it easier for me to test cloudprovider changes when I only have to rebuild and upload hyperkube instead of 4+ containers.
  • Added a Dockerfile that has terraform, jq, jsonnet, etc. (especially since this relies on a pre-release version of terraform [azurerm: Can't enable ipv4_forwarding on a NIC, hashicorp/terraform#6803]).

This works around Azure's lack of a metadata service by using terraform's template_file functionality. I use terraform's base64encode and file functions to access the contents of the crypto assets, and use template_file to interpolate them into a configure-vm.sh template that is rendered and passed as custom_data to the VM. Similarly, it passes through some environment information and the Active Directory credentials that are required by the cloudprovider implementation. The template_file approach was chosen instead of the remote-exec and file provisioners in terraform because those don't currently work easily for Azure: hashicorp/terraform#7122.
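
Roughly, the wiring looks like the sketch below, written in the same jsonnet-to-terraform style as the rest of phase1. The file names, template variables, and VM fields here are illustrative stand-ins, not the exact ones in the diff:

local cfg = {
  // Stand-ins for values that really come from the user's config (.config.json).
  azure: { tenant_id: "...", client_id: "...", client_secret: "..." },
};

{
  resource: {
    template_file: {
      configure_vm: {
        // Read the script template and interpolate crypto assets + AD credentials into it.
        template: '${file("configure-vm.sh.in")}',
        vars: {
          ca_pem: '${base64encode(file("../../phase1b/crypto/ca.pem"))}',
          apiserver_pem: '${base64encode(file("../../phase1b/crypto/apiserver.pem"))}',
          tenant_id: cfg.azure.tenant_id,
          client_id: cfg.azure.client_id,
          client_secret: cfg.azure.client_secret,
        },
      },
    },
    azurerm_virtual_machine: {
      master: {
        // ...size, NIC, storage profile, etc. elided...
        os_profile: {
          computer_name: "master",
          admin_username: "kube",
          // The rendered script is handed to the VM as custom_data in place of a metadata service.
          custom_data: "${template_file.configure_vm.rendered}",
        },
      },
    },
  },
}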

This relies on the azure cloudprovider, so it currently defaults to my own hyperkube-amd64 repo/image. (I'm blocked on sending the PR for the cloudprovider for reasons... but hopefully that will stop being a thing soon.)

It should probably include a script to help create the Azure Service Principal that could then be used by Terraform and passed through to the cloudprovider.

So, it's not ready to merge yet for the reasons mentioned above, but I wanted to get this out for any comments. I was able to deploy a 20-node cluster on the first try after getting the single master/node setup working: http://i.imgur.com/boyFZsw.png.

The biggest issue I see is that addons seem to be unaddressed. For now I haven't added kube-proxy as a manifest pod since I assume we're going to run it via a DaemonSet addon....
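
Just to illustrate the direction (this is not part of this change, and the image tag, flags, and mounts below are assumed), a kube-proxy DaemonSet addon could look roughly like this in jsonnet:

{
  apiVersion: "extensions/v1beta1",
  kind: "DaemonSet",
  metadata: { name: "kube-proxy", namespace: "kube-system" },
  spec: {
    template: {
      metadata: { labels: { "k8s-app": "kube-proxy" } },
      spec: {
        hostNetwork: true,
        containers: [{
          name: "kube-proxy",
          image: "gcr.io/google_containers/hyperkube-amd64:v1.3.0",  // tag assumed
          command: ["/hyperkube", "proxy", "--kubeconfig=/etc/kubernetes/kubeconfig"],
          securityContext: { privileged: true },
          volumeMounts: [{ name: "kubernetes", mountPath: "/etc/kubernetes", readOnly: true }],
        }],
        volumes: [{ name: "kubernetes", hostPath: { path: "/etc/kubernetes" } }],
      },
    },
  },
}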

@colemickens
Author

colemickens commented Jun 12, 2016

Other changes:

  • added /etc/kubernetes as a volume for the manifest pods (azure cloudprovider config lives at /etc/kubernetes/azure.json).
  • removed GCE PD from etcd (should be added back behind a conditional for GCE, just haven't yet)

@colemickens
Author

Another consideration: I don't think either the gce/azure variant output a user-friendly kubeconfig at the end for connectivity outside of the cluster.

@mikedanese mikedanese self-assigned this Jun 13, 2016
@mikedanese
Contributor

Thanks Cole! This looks great, I will give this a review soon. cc @errordeveloper

@colemickens
Author

(Another random question I've had: is it completely safe to run the kubelet in a container now with Docker 1.11.x? E.g., will volume/secret mounts work?)

@errordeveloper

Secrets work for sure. Formal validation is still required, but it generally works with 1.10 and 1.11, and secrets are broken with 1.9 and older versions.


@colemickens
Author

(I guess I should've known as much -- I deployed my test app using secrets to the cluster I brought up with this.)

…as a SAN. Let kubelet restart. Add set-kubeconfig helper for configuring kubeconfig after deployment. Add note about extra-sans-ip being unused.
@colemickens
Author

Flakiness is fixed as far as I can tell. etcd had restarted, but it hadn't been persisting anywhere on the host, and afterwards the apiserver and kubelets did not get along. I suspect the kubelets weren't completely re-registering and so were failing to update their status since the apiserver no longer knew about them. Either way, etcd is persisting now and I haven't had problems with nodes disappearing anymore.

I also had to add the cluster IP of the kubernetes service as a SAN so that the certs would be valid for in-cluster-config usage (kubernetes-dashboard was complaining; I don't know if it pre-resolves the kubernetes service cluster IP or if it's built to not rely on DNS being available. Regardless...)

Added a helper for configuring kubectl after deployment. It probably should be temporary or should be improved, as it requires the user to manually get MASTER_IP and set CLUSTER_NAME, even though those things could be acquired through Terraform and/or by inspecting .config.json and reading the instance prefix as the CLUSTER_NAME.

"--etcd-servers=http://127.0.0.1:2379",
"--cloud-provider={{ phase1['cloud_provider'] }}",
{% if phase1['cloud_provider'] == "azure" %}
"--cloud-config=/etc/kubernetes/azure.json",
Contributor

For comparison, this could be done with jsonnet like this:

local build_params(arr) =
  std.flattenArrays(std.filter(function(a) a != null, arr));

build_params([
  [
    "--this",
    "--that",
  ],
  if "gce" == "gce" then
    ["--life"],
  if "azure" == "gce" then
    ["--death"]
  else
    ["--parties"],
])

produces:

[
   "--this",
   "--that",
   "--life",
   "--parties"
]

Author

@colemickens colemickens Jun 16, 2016

Hm, do you think that is better? I think the j2 version is much easier to parse/grok at a glance, albeit with slightly more json cruft.

Contributor

I want to avoid a templating language with a python dependency because of kubernetes/kubernetes#27547. I prefer to use a language with an interpreter that can be statically linked, e.g. go templates or jsonnet. I don't like using text templating languages to template data formats, so I lean towards jsonnet.

Author

Isn't this running from within the ansible container anyway? I'm fine with changing it, but I think it's orthogonal to whether or not python is on the host, no?

Contributor

@mikedanese mikedanese Jun 21, 2016

It's not so much that we can't possibly get it to run on these OSes; it's more about the size of the dependency. python:2.7-slim was at 200MB when I looked, which is larger than the GCI VM image. This hits us on time to boot and is prohibitive. So now we have code like this: https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/gci/configure-helper.sh#L813-L845. There's a benefit to sharing templates of some form for the kubernetes components, but I don't think it's viable long term to do this with jinja. In comparison, jsonnet is a couple of MBs and only has a dependency on libc (when it's not statically compiled). Go templates would be similarly lean.

Author

Sounds good, I've also noticed that pulling the install-k8s container adds noticeable time to the deployment.

I'll convert it back to jsonnet in my kubernetes-anywhere branch.

@mikedanese
Contributor

So we were using /var/etcd instead of /var/lib/etcd before this change, which caused data not to be saved between etcd container restarts?

DNS.4 = kubernetes.default.svc.cluster.local
DNS.5 = names.master_vm
IP.1 = ${azurerm_public_ip.pip.ip_address}
IP.2 = 10.3.0.1
Contributor

We should probably make node cidr, service cidr, pod cidr configurable ASAP.
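
For example, these could be surfaced in the jsonnet config rather than hard-coded; the field names and values below are purely illustrative, not what the config currently exposes:

{
  phase1: {
    node_cidr: "10.240.0.0/16",     // subnet the node NICs land in
    service_cidr: "10.3.0.0/16",    // --service-cluster-ip-range; 10.3.0.1 above would sit in this range
    pod_cidr: "10.2.0.0/16",        // range handed out to pods
  },
}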


+1


@errordeveloper errordeveloper Jun 14, 2016

It seems like a combination of the template and file provisioners could work better here... (Please ignore that; it won't work for various reasons.) Anyhow, it'd be great to investigate whether the TLS provider would do the job (see #35).
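
If someone does look into that, a rough sketch of generating a self-signed CA with the Terraform TLS provider, expressed in the same jsonnet-to-terraform style, might look like the following. The resource and attribute names come from the TLS provider; the values and layout are just an assumed example:

{
  resource: {
    tls_private_key: {
      // CA key generated in Terraform state rather than on disk.
      ca: { algorithm: "RSA", rsa_bits: 2048 },
    },
    tls_self_signed_cert: {
      ca: {
        key_algorithm: "RSA",
        private_key_pem: "${tls_private_key.ca.private_key_pem}",
        is_ca_certificate: true,
        validity_period_hours: 24 * 365,
        allowed_uses: ["cert_signing", "key_encipherment", "server_auth", "client_auth"],
        subject: { common_name: "kubernetes-ca" },
      },
    },
  },
}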

Contributor

Using terraform to build the certs SGTM

Author

It sounds good to me, but I don't want to try to add that on top of this PR. I can potentially look at it in the future (but that would be after azure-cloudprovider).

@mikedanese
Contributor

mikedanese commented Jun 14, 2016

@colemickens what are your thoughts after implementing a phase 1? How was it better and worse than implementing a kube-up.sh provider? Excluding the rewrite of the manifest files, the diff seems fairly small.

@colemickens
Author

Thank you both for the review and comments. They all make sense and I'll address these ASAP, but it may take a few days - I'm out of town right now for my brother's wedding and it's taking more of my attention than I'd expected. (I'm also simultaneously trying to get the cloudprovider itself out for review.)

@colemickens
Author

colemickens commented Jun 16, 2016

@mikedanese I muddled the etcd/flakiness discussion, sorry. I had removed the hostPath volume entirely for Azure since it was using the GCE PD mount path. The fix was that I re-added the volume, not that I changed the path (that was unintentional). I've actually reverted the path to /var/etcd as it was originally and I've added conditional logic so that it uses the GCE persistent disk path on GCE and /var/etcd on Azure.
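
Roughly, the conditional amounts to something like this, shown in jsonnet for brevity even though the PR's templates are jinja2, and with the GCE persistent-disk mount path assumed:

local etcd_data_volume(cloud_provider) =
  if cloud_provider == "gce" then
    // GCE: data lives on the mounted master persistent disk (path assumed).
    { name: "etcd-data", hostPath: { path: "/mnt/master-pd/var/etcd" } }
  else
    // Azure: plain hostPath on the master's OS disk.
    { name: "etcd-data", hostPath: { path: "/var/etcd" } };

etcd_data_volume("azure")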

@colemickens
Author

I've responded to or fixed all of the comments and pushed a new commit. Note that I didn't revert j2->jsonnet or fix the short abbreviated names in Terraform, but I did offer my comments about those suggestions above.

Overall, yes, this was more straightforward and less painful than diving into the Salt mine. I don't have a lot else to say other than that, though the thought in the back of my mind is that the Salt kube-up.sh did more than this is doing. That might just be addons though, I can't think of much else.

I have several questions though:

  1. What is the expected user experience? Right now, I am entering a docker container (because I need terraform + jsonnet and they're not in my package manager), then I generate the terraform config. Then I apply the terraform config a couple of times (to get around flakiness), and then I set CLUSTER_NAME and MASTER_IP and call make set-kubeconfig. The last step, at least, could be output by terraform since it has access to cfg.phase1.instance_prefix and obviously the public IP, but we still need some entrypoint script to drive the process. (And for Azure I need to help the user create the service account to use for the deployment [both in Terraform and to plumb credentials to the box because Azure doesn't have managed VM account identities].)
  2. What is the plan for addons?
  3. Are we taking a dependency on systemd (or ubuntu even)? (like the systemctl in configure-vm.sh, along with Ansible service units?)
  4. Can we make Flannel work with this (the experimental overlay flag?)? Otherwise the Azure min-turnup will be reliant on my personal branch until k8s 1.4 with Azure CloudProvider.
  5. Is using Ansible for upgrade scenarios an explicit goal? Or, rather, is supporting upgrades with min-turnup an explicit goal?
  6. (minor) Is config.phase2.extra-api-sans actually used anywhere? I thought I'd seen it at an older revision but I can't see it being used now.

And then these additional concerns:

  1. Terraform is flaky for the Azure subnet. It requires retrying. Not a huge deal. I have an issue filed on Terraform.
  2. Terraform, a number of times, gave me failures about kubelet.tar missing, which is weird because it would have existed from previous runs and been clobbered in place. Retrying bypasses this as well.
  3. We should remove the root CA private key from the bundle that is uploaded to the machines.

@colemickens
Author

Unfortunately terraform rc2 seems to have a blocking regression from rc1: hashicorp/terraform#7248

@colemickens
Author

After seeing how far ahead kubernetes-anywhere is (it answers many of my questions above), along with @mikedanese's makefile PR, I decided to move this work over there myself.

https://github.com/colemickens/kubernetes-anywhere/commits/azure

There are some hacks in there to work around Terraform bugs and some weird bug I'm now experiencing in Docker. They're noted in comments with links to the upstream bugs.

It's not a super nice git history. I have fixed some typos and made some slight, but hopefully appropriate changes to the GCE phase1. I can split this into smaller commits if needed.

That's on top of the new, unmerged makefile PR. Let me know how we want to proceed here. I can go ahead and close this and open a PR on kubernetes-anywhere, but maybe we want to reach consensus here on the things I punted on in the comments.

The Azure deployment has an extra feature: it writes out a kubeconfig file at the end so that it's easy to connect to the cluster immediately (well, not quite immediately; the ansible playbook docker image takes a while to pull and execute).

@mikedanese
Contributor

Also you need to run make fmt to format the jsonnet.

@mikedanese
Contributor

Code is gone.
