
[Provider: google-cloud] deleting an attached disk should not be possible #2078

Closed

ghost opened this issue Sep 19, 2018 · 13 comments

@ghost
ghost commented Sep 19, 2018

Affected Resource(s)

  • google_compute_snapshot
  • google_compute_instance
  • google_compute_disk

This issue was originally opened by @jeff-knurek as hashicorp/terraform#18898. It was migrated here as a result of the provider split. The original body of the issue is below.


There have been enough issues in the past about attached disks and the inability to properly manage their desired state:
hashicorp/terraform#7792
hashicorp/terraform#12760
hashicorp/terraform#12398

But if I run: gcloud compute disks delete DISKNAME
I get the following error from Google:

The following disks will be deleted:
 - [DISKNAME] in [ZONE]

Do you want to continue (Y/n)?  y

ERROR: (gcloud.compute.disks.delete) Could not fetch resource:
 - The disk resource 'projects/-/zones/-/disks/DISKNAME' is already being used by 'projects/-/zones/-/instances/gke-0c4z....'

However, if I comment out the disk configuration in terraform and run apply,

Terraform will perform the following actions:
- module.-.module.-.google_compute_disk.disk
....
Apply complete! Resources: 0 added, 0 changed, 1 destroyed.

This leaves everything that was attached to the disk in a strange state, unable to connect.

My expected behaviour is to see the same error message that the gcloud command outputs.

b/305377667

@chrisst
Contributor

chrisst commented Sep 19, 2018

There are multiple ways to attach disks to instances, so I'm not sure exactly what state you are in right now. @jeff-knurek, can you upload a more complete configuration and debug logs so that we can help investigate?

@jeff-knurek

here's an example of the disk provisioning with Terraform:

resource "google_compute_disk" "disk-test" {
  name    = "disk-test"
  project = "k8s-test"
  type    = "pd-standard"
  zone    = "europe-west1-b"
  size    = 50
}

I'd say the steps are pretty straightforward and don't need logs, but I'll detail them anyhow.
Steps:

  1. run terraform apply with above example to create disk
  2. put the disk to use (below I've added some k8s config that will do just that), but I'd say the issue exists as long as the disk is in use in some way
  3. comment out the above terraform example and run apply again (this will cause terraform to delete the disk)

That's it. Step 3 should error out, but instead it deletes the disk with no error. The setup that has the disk in use is unaware of the disk removal, because this isn't expected behavior.


example k8s manifest to put disk in use:

apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: my-test
  namespace: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-test
  template:
    metadata:
      labels:
        app: my-test
    spec:
      containers:
      - name: bbox
        image: busybox
        command: ['sh', '-c', 'echo The app is running! && sleep 3600']
        ports:
        - containerPort: 80
          name: http
        volumeMounts:
        - name: disk-test
          mountPath: /tmp/test
        resources:
          requests:
            memory: "50Mi"
            cpu: "50m"
      volumes:
      - name: disk-test
        persistentVolumeClaim:
          claimName: pv-test

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-test
  namespace: test
spec:
  storageClassName: ""
  capacity:
    storage: 50G
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: disk-test
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-test
  namespace: test
spec:
  storageClassName: ""
  volumeName: pv-test
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50G

@chrisst
Contributor

chrisst commented Sep 25, 2018

It looks like you have Terraform managing the creation of the disk, but you are using a different tool to manage the attachment of the disk to an instance. In that case Terraform has no knowledge that the disk was supposed to be attached to an instance, and so when you remove the disk from your config, if Terraform were the authoritative source of your configuration, it could be considered correct behavior to detach and delete the disk rather than fail.

We are currently intentionally detaching disks from an instance during the delete process, so I'm not sure I would call this a bug. But given that the REST API fails to delete a disk in use, it's a bit of a gray area IMO.

Finally, it would be a breaking change to do what you're suggesting, so I'll bring it up with the team and see where we fall.

@chrisst
Contributor

chrisst commented Sep 25, 2018

Also, if you want to share more details about why you want this behavior, we can think about whether there are different or better solutions.

In the meantime I propose a few possible solutions:

  • Using Terraform as your single source of truth is usually preferable to mixing different tools. There is an attached_disk resource that can be used to make the connection between a disk resource and an instance; it will complain at plan time when you remove a disk that is still referred to elsewhere (see the sketch after this list).
  • Use the prevent_destroy lifecycle flag to throw errors and prevent accidental removal of resources.
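
A minimal sketch combining both suggestions, based on the disk example above; google_compute_instance.vm is a hypothetical instance resource assumed to be defined elsewhere in the config:

resource "google_compute_disk" "disk-test" {
  name    = "disk-test"
  project = "k8s-test"
  type    = "pd-standard"
  zone    = "europe-west1-b"
  size    = 50

  lifecycle {
    # Any plan that would destroy this disk now fails at plan time.
    prevent_destroy = true
  }
}

# Managing the attachment in Terraform makes the disk's "in use" state
# visible to the plan: commenting out the disk leaves a dangling reference
# here, which errors at plan time instead of silently deleting the disk.
resource "google_compute_attached_disk" "disk-test" {
  disk     = google_compute_disk.disk-test.self_link
  instance = google_compute_instance.vm.self_link # hypothetical instance
}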

@jeff-knurek

We are currently intentionally detaching disks from an instance during the delete process, so I'm not sure I would call this a bug. But given that the REST API fails to delete a disk in use, it's a bit of a gray area IMO.

That makes sense, and I can see how this falls into a gray area.


From our perspective, it's not possible to use Terraform for all our resource management. The Google and Kubernetes providers have come a long way in the past year, but functionality is still lacking in places. And since disk allocation is managed by the Kubernetes API specifically, I'm not clear on how Terraform would be able to know about the state of an attached disk in this use case.

We try to use Terraform as a replacement for the gcloud CLI tool everywhere possible, and that is why we expect Terraform to respect the API failure here.

@paddycarver
Contributor

I think there's an argument for letting Kubernetes manage the disk attachments. That's a really gray area, but there's a certain level of pragmatism I think we have to apply in that case. However, I'm not 100% clear on how it's relevant. I think the more important question is: why are you removing the disks from your config while they're in use? What does the error enable you to do that you can't do today?

@paddycarver paddycarver modified the milestone: 2.0.0 Sep 30, 2018
@paddycarver paddycarver removed this from the 2.0.0 milestone Oct 18, 2018
@chrisst
Contributor

chrisst commented Apr 10, 2019

@paddycarver I've been thinking about this a bit more. Right now, if a disk is attached to a compute_instance using either an attached_disk resource or source on the compute instance, this isn't a great experience: any operation that forces a recreation of the compute_instance will fail, because the disk gets deleted out from under Terraform during the deletion of the instance (both attachment styles are sketched below).

To make sure we aren't orphaning disks, we still need to delete them when they're attached with boot_disk or the attached_disk attribute. But I think we could check whether those attributes are set and choose to detach instead of destroy when they aren't.
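
For reference, a sketch of the two attachment styles in question; resource names here are illustrative, and a real config would use only one style for a given disk:

resource "google_compute_instance" "vm" {
  name         = "vm-test"
  machine_type = "n1-standard-1"
  zone         = "europe-west1-b"

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-9"
    }
  }

  # Style 1: inline attachment via the instance's attached_disk attribute.
  attached_disk {
    source = google_compute_disk.disk-test.self_link
  }

  network_interface {
    network = "default"
  }
}

# Style 2: fine-grained attachment via a separate resource. The explicit
# dependency edge lets Terraform detach the disk before deleting it or
# the instance.
resource "google_compute_attached_disk" "attachment" {
  disk     = google_compute_disk.disk-test.self_link
  instance = google_compute_instance.vm.self_link
}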

@paddycarver paddycarver added this to the Backlog milestone Dec 12, 2019
@sanghaniJ
Contributor

b/262494546

@sanghaniJ
Contributor

sanghaniJ commented Dec 14, 2022

I've looked into this issue; the request is that attached disks should not be deletable, and that an error should be thrown instead, similar to the error thrown by the gcloud CLI in the comment above.

  • When disks are attached to any instances, they are first detached during deletion. That behavior was in place when this Pull Request was merged.
  • But per the description above, the attached resource isn't managed by Terraform, so Terraform can't detect the change. If the same attachment were done using Terraform, an error would have been thrown. Below is the config I tried, which validates the core Terraform behavior.

data "google_compute_image" "my_image" {
  family  = "debian-11"
  project = "debian-cloud"
}

resource "google_compute_disk" "foobar" {
  name  = "test"
  image = data.google_compute_image.my_image.self_link
  size  = 10
  type  = "pd-ssd"
  zone  = "us-central1-a"
}

resource "google_compute_snapshot" "foobar" {
  name        = "test"
  source_disk = google_compute_disk.foobar.name
  zone        = "us-central1-a"
}

Now, in this case, if I comment out the disk, an error is thrown stating Error: Reference to undeclared resource.

  • In cases where the disk is attached to resources not managed via Terraform, we could print a warning stating Disk is attached to instances which will be detached during deletion.
  • Also, in order to block deletion of an attached disk, we could check whether it is already attached and, if so, return an error stating Disk is already in use, please detach in order to delete.
  • That can be done in the Delete function by checking v, ok := readRes["users"].([]interface{}) and returning an error. This check is currently present at line#1022; when the condition is true, it detaches and then deletes.
  • The suggested change might be a breaking change as well.

@rileykarson
Collaborator

Terraform detaching disk users made a lot of sense when the disk resource was initially implemented, but fine-grained resources like the attachment resource may make this less necessary now. We could consider guarding the behaviour behind a deletion_policy flag, or removing it entirely and requiring users to manage their disk attachments in Terraform so the attachments are deleted first in the execution graph.

Not sure if that works for all use cases, though; we'd want to investigate this substantially before changing anything.
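
A sketch of what such a guard might look like; the deletion_policy argument here is hypothetical, not an existing google_compute_disk argument:

resource "google_compute_disk" "disk-test" {
  name = "disk-test"
  type = "pd-standard"
  zone = "europe-west1-b"
  size = 50

  # Hypothetical argument: "DETACH" would keep today's detach-then-delete
  # behaviour as the default, while "ERROR" would surface the API failure
  # when the disk is still attached.
  deletion_policy = "ERROR"
}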

@melinath
Collaborator

Note that if the behavior were guarded behind a deletion policy and the current behavior remained the default, that would not be considered a breaking change.

@zli82016
Collaborator

zli82016 commented Jul 8, 2024

Closing the issue with the response from the service team:

The API does not allow deletion of a disk if it is attached to an instance. There is no force-delete option; the disk has to be detached first.

It is recommended to manage all resources using Terraform.

@zli82016 zli82016 closed this as completed Jul 8, 2024

github-actions bot commented Aug 9, 2024

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 9, 2024