
google_container_cluster tries to recreate cluster always when used in combination with google_container_node_pool #2115

Closed
mpgomez opened this issue Sep 26, 2018 · 13 comments


mpgomez commented Sep 26, 2018

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
  • If an issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to "hashibot", a community member has claimed the issue already.

Terraform Version

Terraform v0.11.8

  • provider: google: 1.18

Affected Resource(s)

  • google_container_cluster
  • google_container_node_pool

Terraform Configuration Files

resource "google_container_cluster" "primary" {
  name               = "${var.cluster_name}"
  # If we want a regional cluster, should we be looking at https://cloud.google.com/kubernetes-engine/docs/concepts/regional-clusters#regional
  #  region = "${var.region}"
  zone               = "${var.main_zone}"
  additional_zones   = "${var.additional_zones}"
  # Node count per zone
  initial_node_count = 1
  project            = "${var.project}"
  remove_default_node_pool = true
  enable_legacy_abac = true

  node_config {
    oauth_scopes = [
      "https://www.googleapis.com/auth/compute",
      "https://www.googleapis.com/auth/devstorage.read_write",
      "https://www.googleapis.com/auth/sqlservice.admin",
      "https://www.googleapis.com/auth/cloud-platform",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]
  }
  addons_config {
    horizontal_pod_autoscaling {
      disabled = false
    }
  }
}

resource "google_container_node_pool" "nodepool" {
  name               = "${var.cluster_name}nodepool"
  zone               = "${var.main_zone}"
  cluster            = "${google_container_cluster.primary.name}"
  node_count         = "${var.node_count}"

  autoscaling {
    min_node_count = "${var.min_node_count}"
    max_node_count = "${var.max_node_count}"
  }
}

Debug Output

There's a lot of info in those logs, too much to share them openly. Is there a tool to anonymize them? Happy to share them if there's no sensitive data in them; I couldn't find much info about this.

Panic Output

It does not crash

Expected Behavior

Once applied successfully, running terraform plan again should report that no changes are needed.

Actual Behavior

If I run terraform plan right after applying the changes successfully, I get:

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

-/+ module.google.google_container_cluster.primary (new resource required)
      id:                                                    "dev" => <computed> (forces new resource)
      additional_zones.#:                                    "1" => "1"
      additional_zones.2873062354:                           "europe-west2-a" => "europe-west2-a"
      addons_config.#:                                       "1" => "1"
      addons_config.0.horizontal_pod_autoscaling.#:          "1" => "1"
      addons_config.0.horizontal_pod_autoscaling.0.disabled: "false" => "false"
      addons_config.0.http_load_balancing.#:                 "0" => <computed>
      addons_config.0.kubernetes_dashboard.#:                "0" => <computed>
      addons_config.0.network_policy_config.#:               "1" => <computed>
      cluster_ipv4_cidr:                                     "10.20.0.0/14" => <computed>
      enable_binary_authorization:                           "false" => "false"
      enable_kubernetes_alpha:                               "false" => "false"
      enable_legacy_abac:                                    "true" => "true"
      endpoint:                                              "****" => <computed>
      initial_node_count:                                    "1" => "1"
      instance_group_urls.#:                                 "2" => <computed>
      logging_service:                                       "logging.googleapis.com" => <computed>
      master_auth.#:                                         "1" => <computed>
      master_version:                                        "1.9.7-gke.6" => <computed>
      monitoring_service:                                    "monitoring.googleapis.com" => <computed>
      name:                                                  "dev" => "dev"
      network:                                               "****" => "default"
      network_policy.#:                                      "1" => <computed>
      node_config.#:                                         "1" => "1"
      node_config.0.disk_size_gb:                            "100" => <computed>
      node_config.0.disk_type:                               "pd-standard" => <computed>
      node_config.0.guest_accelerator.#:                     "0" => <computed>
      node_config.0.image_type:                              "COS" => <computed>
      node_config.0.local_ssd_count:                         "0" => <computed>
      node_config.0.machine_type:                            "n1-standard-1" => <computed>
      node_config.0.oauth_scopes.#:                          "6" => "6"
      node_config.0.oauth_scopes.1277378754:                 "https://www.googleapis.com/auth/monitoring" => "https://www.googleapis.com/auth/monitoring"
      node_config.0.oauth_scopes.1328717722:                 "" => "https://www.googleapis.com/auth/devstorage.read_write" (forces new resource)
      node_config.0.oauth_scopes.1632638332:                 "https://www.googleapis.com/auth/devstorage.read_only" => "" (forces new resource)
      node_config.0.oauth_scopes.172152165:                  "https://www.googleapis.com/auth/logging.write" => "https://www.googleapis.com/auth/logging.write"
      node_config.0.oauth_scopes.1733087937:                 "" => "https://www.googleapis.com/auth/cloud-platform" (forces new resource)
      node_config.0.oauth_scopes.299962681:                  "" => "https://www.googleapis.com/auth/compute" (forces new resource)
      node_config.0.oauth_scopes.316356861:                  "https://www.googleapis.com/auth/service.management.readonly" => "" (forces new resource)
      node_config.0.oauth_scopes.3663490875:                 "https://www.googleapis.com/auth/servicecontrol" => "" (forces new resource)
      node_config.0.oauth_scopes.3859019814:                 "https://www.googleapis.com/auth/trace.append" => "" (forces new resource)
      node_config.0.oauth_scopes.4205865871:                 "" => "https://www.googleapis.com/auth/sqlservice.admin" (forces new resource)
      node_config.0.preemptible:                             "false" => "false"
      node_config.0.service_account:                         "default" => <computed>
      node_pool.#:                                           "1" => <computed>
      node_version:                                          "1.9.7-gke.6" => <computed>
      private_cluster:                                       "false" => "false"
      project:                                               "***" => "***"
      region:                                                "" => <computed>
      remove_default_node_pool:                              "true" => "true"
      zone:                                                  "europe-west2-b" => "europe-west2-b"


Plan: 1 to add, 0 to change, 1 to destroy.

------------------------------------------------------------------------

This plan was saved to: devplan.tfplan

To perform exactly these actions, run the following command to apply:
    terraform apply "devplan.tfplan"

Steps to Reproduce

  1. terraform apply
  2. terraform apply again

Important Factoids

This was not happening when using the default node pool. I started seeing the issue after using my own node pool instead, so I think it may be related to the node pool.

References

Maybe related to hashicorp/terraform#18209?

ghost added the bug label Sep 26, 2018
paddycarver (Contributor) commented

Hmm, looking at that plan, what stands out to me is:

      node_config.0.oauth_scopes.#:                          "6" => "6"
      node_config.0.oauth_scopes.1277378754:                 "https://www.googleapis.com/auth/monitoring" => "https://www.googleapis.com/auth/monitoring"
      node_config.0.oauth_scopes.1328717722:                 "" => "https://www.googleapis.com/auth/devstorage.read_write" (forces new resource)
      node_config.0.oauth_scopes.1632638332:                 "https://www.googleapis.com/auth/devstorage.read_only" => "" (forces new resource)
      node_config.0.oauth_scopes.172152165:                  "https://www.googleapis.com/auth/logging.write" => "https://www.googleapis.com/auth/logging.write"
      node_config.0.oauth_scopes.1733087937:                 "" => "https://www.googleapis.com/auth/cloud-platform" (forces new resource)
      node_config.0.oauth_scopes.299962681:                  "" => "https://www.googleapis.com/auth/compute" (forces new resource)
      node_config.0.oauth_scopes.316356861:                  "https://www.googleapis.com/auth/service.management.readonly" => "" (forces new resource)
      node_config.0.oauth_scopes.3663490875:                 "https://www.googleapis.com/auth/servicecontrol" => "" (forces new resource)
      node_config.0.oauth_scopes.3859019814:                 "https://www.googleapis.com/auth/trace.append" => "" (forces new resource)
      node_config.0.oauth_scopes.4205865871:                 "" => "https://www.googleapis.com/auth/sqlservice.admin" (forces new resource)

So here's what I think's happening:

  • The node_config in the container_cluster sets the scopes it wants all node pools to use.
  • The node pool you're adding has a default node_config.
  • Terraform is getting confused about whether you want the node_config from the container_cluster or the default node_config from the node pool.

It's not perfect, but I believe if you move the node_config block from container_cluster into the node_pool, that confusion will be resolved.

I'll investigate and see if we can't come up with a better solution for this to make it work intuitively.
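
A sketch of that workaround applied to the configuration above (same variables as the original config; the node_config block is simply moved from the cluster into the node pool):

resource "google_container_cluster" "primary" {
  name                     = "${var.cluster_name}"
  zone                     = "${var.main_zone}"
  additional_zones         = "${var.additional_zones}"
  initial_node_count       = 1
  project                  = "${var.project}"
  remove_default_node_pool = true
  enable_legacy_abac       = true

  # node_config intentionally removed here; it now lives on the node pool below.

  addons_config {
    horizontal_pod_autoscaling {
      disabled = false
    }
  }
}

resource "google_container_node_pool" "nodepool" {
  name       = "${var.cluster_name}nodepool"
  zone       = "${var.main_zone}"
  cluster    = "${google_container_cluster.primary.name}"
  node_count = "${var.node_count}"

  autoscaling {
    min_node_count = "${var.min_node_count}"
    max_node_count = "${var.max_node_count}"
  }

  # Moved from google_container_cluster.primary, so the scopes are owned by
  # the separately managed pool rather than by the (removed) default pool.
  node_config {
    oauth_scopes = [
      "https://www.googleapis.com/auth/compute",
      "https://www.googleapis.com/auth/devstorage.read_write",
      "https://www.googleapis.com/auth/sqlservice.admin",
      "https://www.googleapis.com/auth/cloud-platform",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]
  }
}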

mpgomez (Author) commented Sep 27, 2018

That actually makes a lot of sense. I didn't think about that.
I was just confused by this line:
id: "europe-west2-b/dev/devnodepool" => (forces new resource)

Thank you very much! (yes, it does indeed fix the problem)

pdemagny commented

OMG, thanks for this! I've been banging my head against this for a few days ;)

paddycarver (Contributor) commented

So it sounds like we either have a documentation problem or a validation problem. I'm not fully up to speed on why we have node_config at both the cluster and node pool levels, so I don't have all the use cases in mind and can't say what the ideal solution is here. But I think we can improve this through documentation, by not letting the cluster set node_config, or by handling an empty node_config on a node pool better. I'll leave this open so we can investigate those options.

danawillow (Contributor) commented

@paddycarver the answer to your question is that the node_config on the cluster corresponds to the default node pool. The ideal solution would be a default_node_pool block on the cluster, but alas, that's not what the API gives us to work with. In the meantime, we can probably solve this through documentation.
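
To illustrate (a hypothetical minimal example, not anyone's real config): because node_config on the cluster configures only the default node pool, pairing it with remove_default_node_pool means you are configuring a pool that is deleted right after creation.

resource "google_container_cluster" "example" {
  name               = "example"          # hypothetical name
  zone               = "us-central1-a"    # hypothetical zone
  initial_node_count = 1

  # This block configures ONLY the default node pool...
  node_config {
    oauth_scopes = ["https://www.googleapis.com/auth/cloud-platform"]
  }

  # ...which this setting deletes as soon as the cluster is up.
  remove_default_node_pool = true
}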

flokli (Contributor) commented May 28, 2019

Wow, until this is resolved, a big fat warning should be added to the docs.

We advertise this as the recommended way to bootstrap a GKE cluster, yet it recreates the cluster on every terraform apply.

flokli added a commit to flokli/terraform-provider-google that referenced this issue May 28, 2019
don't advertise a separately managed node pool as recommended, until hashicorp#2115 is fixed.
flokli (Contributor) commented May 28, 2019

Flipped the default and added a warning in #3733.

rileykarson (Collaborator) commented

Hey @flokli! Our recommendation is to use separately managed node pools and not use the default node pool at all.

If you specify a node_config block, you're telling Terraform you want to use the default node pool. That block was badly named by the API, and by extension by the original implementation in Terraform: despite the name omitting a default_ prefix, it only applies to the default node pool.

As shown in the recommended example, both node_config and node_pool should be omitted.
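
A sketch of that recommended shape (resource names, cluster name, and zone below are placeholders, following the pattern of the docs example linked later in this thread):

resource "google_container_cluster" "primary" {
  name = "my-cluster"       # placeholder
  zone = "us-central1-a"    # placeholder

  # No node_config and no node_pool blocks. The default node pool is created
  # only because the API requires one, then removed immediately.
  remove_default_node_pool = true
  initial_node_count       = 1
}

resource "google_container_node_pool" "primary_nodes" {
  name       = "my-node-pool"    # placeholder
  zone       = "us-central1-a"
  cluster    = "${google_container_cluster.primary.name}"
  node_count = 1
}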

flokli (Contributor) commented May 28, 2019

@rileykarson if I copy that exact example:
https://www.terraform.io/docs/providers/google/r/container_cluster.html#example-usage-with-a-separately-managed-node-pool-recommended-

and terraform apply a second time, it'll destroy and recreate the whole cluster.

arianvp commented May 29, 2019

I just tested this, and I can confirm that the 'recommended' example destroys itself on every run of terraform apply, even when not using the default pool.

rileykarson (Collaborator) commented May 29, 2019

The same is true of the other example using the default node pool, and neither is related to the configuration of node pools. This is caused by a breaking change in the GKE API, where a default value was changed. A patch is underway in GoogleCloudPlatform/magic-modules#1844. See #3672 / #3369.

rileykarson (Collaborator) commented

https://www.terraform.io/docs/providers/google/r/container_cluster.html#node_config is now clearer about applying only to the default node pool. I don't think there's anything actionable left here, so I'm going to close this out. If anyone has anything unresolved and thinks this should be reopened, feel free to comment and I will.

ghost commented Jul 12, 2019

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!
