Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: use gcloud module for scripts, closes #401 #404

Merged
merged 12 commits into from
Jun 18, 2020
Merged

refactor: use gcloud module for scripts, closes #401 #404

merged 12 commits into from
Jun 18, 2020

Conversation

marshallford
Copy link
Contributor

@marshallford marshallford commented Jan 16, 2020

Uses gcloud module for wait-for-cluster and delete-default-resource scripts.

Fixes #401

In other words: Removes dependency on locally installed glcloud, kubectl, and jq binaries.

Copy link
Contributor

@morgante morgante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately it looks like we're still seeing errors with the dependency:

Step #23 - "converge simple-regional-with-networking-local":        Error: Invalid value for module argument
Step #23 - "converge simple-regional-with-networking-local":        
Step #23 - "converge simple-regional-with-networking-local":          on ../../../cluster.tf line 236, in module "gcloud_wait_for_cluster":
Step #23 - "converge simple-regional-with-networking-local":         236:   module_depends_on = [
Step #23 - "converge simple-regional-with-networking-local":         237:     google_container_cluster.primary,
Step #23 - "converge simple-regional-with-networking-local":         238:     google_container_node_pool.pools,
Step #23 - "converge simple-regional-with-networking-local":         239:   ]
Step #23 - "converge simple-regional-with-networking-local":        
Step #23 - "converge simple-regional-with-networking-local":        The given value is not suitable for child module variable "module_depends_on"
Step #23 - "converge simple-regional-with-networking-local":        defined at
Step #23 - "converge simple-regional-with-networking-local":        .terraform/modules/example.gke.gcloud_wait_for_cluster/terraform-google-modules-terraform-google-gcloud-55b927a/variables.tf:29,1-29:
Step #23 - "converge simple-regional-with-networking-local":        all list elements must have the same type.

@marshallford
Copy link
Contributor Author

marshallford commented Jan 27, 2020

I'm new to the module_depends_on concept but both elements in that list are of type "resource". Is that not specific enough? Should I instead depend on two string values returned by the resources?

@morgante
Copy link
Contributor

I'm new to the module_depends_on concept but both elements in that list are of type "resource". Is that not specific enough? Should I instead depend on two string values returned by the resources?

Yeah I think they need to be the exact same type. Let's try making them two string values.

@marshallford
Copy link
Contributor Author

How does the usage of for_each in the resource google_container_node_pool.pools impact how I reference a returned value?

  module_depends_on = [
    google_container_cluster.primary.master_version,
    google_container_node_pool.pools.name, // not just one pool name?
  ]

@morgante
Copy link
Contributor

How does the usage of for_each in the resource google_container_node_pool.pools impact how I reference a returned value?

You'll probably need to use splat syntax:

  module_depends_on = concat([google_container_cluster.primary.master_version], google_container_node_pool.pools.*.name)

@morgante
Copy link
Contributor

I'm getting this error now:

Step #35 - "converge stub-domains-local":        Error: Unsupported attribute
Step #35 - "converge stub-domains-local":        
Step #35 - "converge stub-domains-local":          on ../../../cluster.tf line 238, in module "gcloud_wait_for_cluster":
Step #35 - "converge stub-domains-local":         238:     join("-", google_container_node_pool.pools[*].name),
Step #35 - "converge stub-domains-local":        
Step #35 - "converge stub-domains-local":        This object does not have an attribute named "name".

@marshallford
Copy link
Contributor Author

google_container_node_pool does not export any values to depend on.

@marshallford marshallford changed the title closes #401: uses gcloud module for scripts refactor: use gcloud module for scripts, closes #401 Jan 28, 2020
@marshallford
Copy link
Contributor Author

I'd like to keep this PR moving. @morgante, what's the current error? I'd love to wrap this up.

@morgante
Copy link
Contributor

morgante commented Feb 6, 2020

Rerunning tests. @marshallford Could you resolve conflicts?

@morgante
Copy link
Contributor

morgante commented Feb 6, 2020

Here's the error I'm seeing in CI, which makes me thing the dependency order isn't working properly:

Step #43 - "converge stub-domains-upstream-nameservers-local":        Error: configmaps "kube-dns" already exists
Step #43 - "converge stub-domains-upstream-nameservers-local":        
Step #43 - "converge stub-domains-upstream-nameservers-local":          on ../../../dns.tf line 88, in resource "kubernetes_config_map" "kube-dns-upstream-nameservers-and-stub-domains":
Step #43 - "converge stub-domains-upstream-nameservers-local":          88: resource "kubernetes_config_map" "kube-dns-upstream-nameservers-and-stub-domains" {

@marshallford
Copy link
Contributor Author

I recant my earlier statement, this recent PR clarifies that google_container_node_pool does export attributes: hashicorp/terraform-provider-google#5477

@morgante
Copy link
Contributor

@marshallford FYI this is still failing. This is the error:

       Error: Invalid value for module argument
       
         on ../../../cluster.tf line 237, in module "gcloud_wait_for_cluster":
        237:   module_depends_on = [
        238:     google_container_cluster.primary.master_version,
        239:     google_container_node_pool.pools[*].name
        240:   ]
       
       The given value is not suitable for child module variable "module_depends_on"
       defined at
       .terraform/modules/example.gke.gcloud_wait_for_cluster/terraform-google-modules-terraform-google-gcloud-84e31b8/variables.tf:35,1-29:
       all list elements must have the same type.
       
       
       Error: Unsupported attribute
       
         on ../../../cluster.tf line 239, in module "gcloud_wait_for_cluster":
        239:     google_container_node_pool.pools[*].name

@bharathkkb
Copy link
Member

@marshallford it would be great to get this in. Anything we can do to help complete this?

@marshallford
Copy link
Contributor Author

marshallford commented Jun 12, 2020

@bharathkkb As you can see, I fought quite a bit with TF trying to establish a dependency between gcloud_wait_for_cluster and google_container_node_pool. If you have any suggestions on using the helper input module_depends_on please let me know.

@bharathkkb
Copy link
Member

bharathkkb commented Jun 12, 2020

@marshallford I have a couple of ideas. Let me know if you have additional questions
i) I think the way to do the implicit dependency is

module_depends_on = concat(
    [google_container_cluster.primary.master_version],
    [for pool in google_container_node_pool.pools: pool.name]
  )

This similar to what @morgante suggested but I don't believe you can use splat with for_each resources as it will output a map. So I have used the usual for expansion here.

ii) A few other things are we should expose skip_download and possibly upgrade from the gcloud module so that users have that additional flexibility and imho skip_download should default to true. @morgante any thoughts on this?

@bharathkkb
Copy link
Member

@marshallford it looks like a bunch of unrelated verify steps are failing. I see a couple of converge steps passed, so it looks promising. Could you try a rebase so we have the latest tests from master?

@bharathkkb
Copy link
Member

@marshallford this is the test that failed and it was fixed here: #454

×  Command: `gcloud beta --project=ci-gke-10f1 container clusters --zone=us-central1 describe simple-regional-beta-cluster-1hne --format=json` cluster has the expected nodeMetadata conseal config
     
     expected: {"nodeMetadata"=>"EXPOSE"}
          got: {"mode"=>"GCE_METADATA", "nodeMetadata"=>"EXPOSE"}
     
     (compared using ==)
     
     Diff:
     @@ -1,2 +1,3 @@
     +"mode" => "GCE_METADATA",
      "nodeMetadata" => "EXPOSE",

     ✔  Command: `gcloud beta --project=ci-gke-10f1 container clusters --zone=us-central1 describe simple-regional-beta-cluster-1hne --format=json` cluster has the expected podSecurityPolicyConfig config

@marshallford
Copy link
Contributor Author

I hope I rebased correctly, dealing with the generated files wasn't fun. Had to step through a couple of manual fixes and git rebase --continue.

@bharathkkb
Copy link
Member

@marshallford I am seeing some submodule generation errors, could you run make build and commit? That should take care of that error.

Checking submodule's files generation
diff -r '--exclude=.terraform' '--exclude=.kitchen' '--exclude=.git' /workspace/modules/beta-public-cluster-update-variant/cluster.tf /tmp/tmp.HIfNPcb4Hd/workspace/modules/beta-public-cluster-update-variant/cluster.tf
471,472c471,474
< resource "null_resource" "wait_for_cluster" {
<   count = var.skip_provisioners ? 0 : 1
---
> module "gcloud_wait_for_cluster" {
>   source  = "terraform-google-modules/gcloud/google"
>   version = "~> 1.0.1"
>   enabled = var.skip_provisioners
474,477c476,479
<   triggers = {
<     project_id = var.project_id
<     name       = var.name
<   }
---
>   create_cmd_entrypoint  = "${path.module}/scripts/wait-for-cluster.sh"
>   create_cmd_body        = "${var.project_id} ${var.name}"
>   destroy_cmd_entrypoint = "${path.module}/scripts/wait-for-cluster.sh"
>   destroy_cmd_body       = "${var.project_id} ${var.name}"
.
.

@bharathkkb
Copy link
Member

@marshallford looks like tests are green 😄

A few other things are we should expose skip_download and possibly upgrade from the gcloud module so that users have that additional flexibility and imho skip_download should default to true.

I would still like to provide this functionality if possible in this PR

@bharathkkb bharathkkb requested a review from morgante June 13, 2020 21:36
@marshallford
Copy link
Contributor Author

@bharathkkb Agreed. Given that I've been at odds with the module dependency issue I figured I'd focus on that first.

Can you expand on what you mean by

possibly upgrade from the gcloud module

@bharathkkb
Copy link
Member

Can you expand on what you mean by

possibly upgrade from the gcloud module

@marshallford sure. From the gcloud module I think we should surface the following upgrade, gcloud_sdk_version, skip_download as optional variables for the GKE modules.

imho the recommended defaults should be upgrade:false, gcloud_sdk_version:296.0.0 and skip_download:true. I believe these defaults will keep parity with the way we currently do it with local-exec. @morgante any thoughts on these defaults?

@morgante
Copy link
Contributor

imho the recommended defaults should be upgrade:false, gcloud_sdk_version:296.0.0 and skip_download:true. I believe these defaults will keep parity with the way we currently do it with local-exec. @morgante any thoughts on these defaults?

I don't think we need to expose sdk_version, but otherwise agreed.

@bharathkkb
Copy link
Member

@marshallford the lint tests are failing due to a timeout issue, should be safe to ignore.

Copy link
Member

@bharathkkb bharathkkb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@morgante morgante dismissed their stale review June 15, 2020 18:36

stale

@bharathkkb
Copy link
Member

@morgante do we need anything else before we can merge this?

@morgante morgante merged commit 65172de into terraform-google-modules:master Jun 18, 2020
@marshallford marshallford deleted the gcloud-module branch June 18, 2020 03:43
CPL-markus pushed a commit to WALTER-GROUP/terraform-google-kubernetes-engine that referenced this pull request Jul 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

gcloud executable requirement, removal of jq
3 participants