
No node_taints for default pool since v2.34 #10490

Closed
tantweiler opened this issue Feb 5, 2021 · 7 comments

@tantweiler

Terraform version: 0.14.4
Provider version: v2.46.1
Kubernetes version: AKS 1.18.14

Steps to reproduce

Since AKS needs a default node pool and does not accept empty AKS clusters, we used the default node pool just for "system management" and additional node pools as user pools. This is unfortunate because the default pool also has to be kept redundant for high-availability purposes, which means unnecessary costs! To prevent user pods from running on the default node pool, we set node_taints on the default pool like this so that user pods are scheduled on the user pools only:

    default_node_pool {
        name           = "defaultpool"
        node_count     = 3
        vm_size        = "Standard_B2s"
        vnet_subnet_id = azurerm_subnet.akssubnet.id
        node_taints    = ["CriticalAddonsOnly=true:PreferNoSchedule"]
    }
........
resource "azurerm_kubernetes_cluster_node_pool" "userpool" {
  name                  = "userpool"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.k8s.id
  vm_size               = var.aks_vm_size
  node_count            = 3
  mode                  = "User"
  enable_auto_scaling   = true
  max_count             = 18
  min_count             = 3
  os_disk_size_gb       = 200
  vnet_subnet_id        = azurerm_subnet.akssubnet.id
}


With the new provider version this no longer works:

Error: expanding `default_node_pool`: The AKS API has removed support for tainting all nodes in the default node pool and it is no longer possible to configure this. To taint a node pool, create a separate one
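
For context, the error points at the supported workaround: taints can still be set on a separately managed pool via the azurerm_kubernetes_cluster_node_pool resource. A minimal sketch of that, reusing the cluster and subnet references from the config above (the pool name and taint value are illustrative only):

resource "azurerm_kubernetes_cluster_node_pool" "tainted" {
  # Illustrative pool: node_taints is still accepted on separately managed pools,
  # only the default_node_pool block rejects it.
  name                  = "taintedpool"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.k8s.id
  vm_size               = "Standard_B2s"
  node_count            = 3
  vnet_subnet_id        = azurerm_subnet.akssubnet.id
  node_taints           = ["workload=system:PreferNoSchedule"] # illustrative taint
}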

Expected behavior

The use case is to change the VM type of the node pool without losing the whole configuration and the pods when only a default pool is used.

  1. Preferred option: Accept cluster objects without a default pool, which could mean AKS accepts even empty cluster objects like GKE does, with the pod configuration kept by the k8s control plane. When the node pool is replaced with a new pool because we changed the VM type, the pods go into pending state and wait for compute resources before being re-scheduled.
  2. Alternative option: Re-enable node_taints for the default pool.

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@tantweiler changed the title from "No node_taints for default pool since v2.46" to "No node_taints for default pool since v2.34" on Feb 5, 2021
@favoretti
Collaborator

#10307 was just merged. For now it's a half-way measure that creates the default pool as a system pool, but we're still discussing re-enabling other (soft) taints as well.

@tombuildsstuff
Contributor

Duplicate of #9183

@ghost

ghost commented Feb 11, 2021

This has been released in version 2.47.0 of the provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. As an example:

provider "azurerm" {
    version = "~> 2.47.0"
}
# ... other configuration ...
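
As a side note for anyone upgrading: after raising the version constraint, the newer provider can be fetched with terraform init -upgrade (assuming the standard CLI workflow):

terraform init -upgrade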

@Cyanopus

@tombuildsstuff This is not fixed in 2.47.0. Can you please check?

@robinmanuelthiel

@Cyanopus To avoid confusion, it might be worth noting that this fix does not allow setting the CriticalAddonsOnly taint as asked for in the original question of this issue, i.e. like this...

default_node_pool {
  node_taints = ["CriticalAddonsOnly=true:PreferNoSchedule"] # <- still not supported, use solution below
}

... but introduces a new only_critical_addons_enabled argument to the default_node_pool block. So adding the CriticalAddonsOnly taint now works like this:

default_node_pool {
  only_critical_addons_enabled = true # supported
}

@favoretti
Collaborator

@robinmanuelthiel Thank you, you are completely correct. I'm still on the fence about allowing other soft taints for the default node pool, since the AKS team is also on the fence about keeping that functionality :) But this at least allows for a clean separation of system and application pools.
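
For anyone finding this later, a minimal sketch of that separation, combining the only_critical_addons_enabled default pool from the comment above with a user-mode pool based on the resource from the original report (sizing values are placeholders):

default_node_pool {
  name                         = "defaultpool"
  node_count                   = 3
  vm_size                      = "Standard_B2s"
  only_critical_addons_enabled = true # AKS applies the CriticalAddonsOnly taint to this pool
}

resource "azurerm_kubernetes_cluster_node_pool" "userpool" {
  name                  = "userpool"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.k8s.id
  vm_size               = "Standard_B2s"
  node_count            = 3
  mode                  = "User" # application workloads are scheduled here
}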

@ghost

ghost commented Mar 13, 2021

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost locked as resolved and limited conversation to collaborators on Mar 13, 2021