
Re-attaching EBS volumes to new EC2 instances... #2740

Closed

bdesilva opened this issue Jul 15, 2015 · 15 comments

Hey guys,

I have a scenario (this setup is for my Elasticsearch cluster on AWS) where, if I bring down an EC2 instance in order to move to a new Elasticsearch version, I would like to detach my EBS volume and re-attach it to the new EC2 instance running the updated Elasticsearch AMI.

I believe I can already detach an EBS volume on EC2 destruction this way:

ebs_block_device {
  device_name           = "/dev/sdh"
  volume_type           = "io1"
  iops                  = "4000"
  volume_size           = "500"
  delete_on_termination = "false" # the key setting: keep the volume when the instance is destroyed
}

However, I'm still trying to find out how to re-attach this existing EBS volume to a new EC2 instance in Terraform prior to bringing that EC2 instance/node back into my cluster.

Can you guys please help me with this?

Thanks,
Ben.

Pryz (Contributor) commented Jul 22, 2015

Same issue here.

I also tried creating an aws_ebs_volume and attaching it with aws_volume_attachment, but then I don't see how we can bootstrap (mkfs + mount) the disk with cloud-init or a provisioning script.
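
(For reference, a minimal sketch of that pairing; the resource names, AZ, and sizes here are illustrative, not from this thread:)

resource "aws_ebs_volume" "data" {
  availability_zone = "us-west-2a" # must match the instance's AZ
  size              = 500
  type              = "io1"
  iops              = 4000
}

resource "aws_volume_attachment" "data" {
  device_name = "/dev/sdh"
  volume_id   = "${aws_ebs_volume.data.id}"
  instance_id = "${aws_instance.web.id}"
}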

johannesboyne (Contributor) commented:

I've managed to reattach a volume (including mounting) using the user_data field:

user_data = "#!/bin/bash\nmkdir /data; mount /dev/xvdh /data;

resource "aws_instance" "web" {
  ami = "ami-b7f0f987"
  instance_type = "t2.micro"
  availability_zone = "${var.aws_region}a"
  vpc_security_group_ids = ["${aws_security_group.default.id}"]
  iam_instance_profile = "ecsInstanceRole"
  key_name = "us-west-ecs"
  user_data = "#!/bin/bash\nmkdir /data; mount /dev/xvdh /data; service docker restart; echo 'ECS_CLUSTER=${aws_ecs_cluster.cms.name}\nECS_ENGINE_AUTH_TYPE=dockercfg\nECS_ENGINE_AUTH_DATA={\"${var.registry}\": {\"auth\": \"${var.auth}\",\"email\": \"${var.email}\"}}' >> /etc/ecs/ecs.config;"
}

# Attach DBMS Volume
resource "aws_volume_attachment" "ebs_att" {
  device_name = "/dev/xvdh"
  volume_id = "<volume-id to reattach>"
  instance_id = "${aws_instance.web.id}"

}

If you want to use the EBS volume from a Docker container, as in the example above, make sure service docker restart is included: the Docker daemon has to be restarted, otherwise the attached volume is not used, and unfortunately the AWS init process won't tell you that.

Pryz (Contributor) commented Sep 4, 2015

No update on this?

Should we use aws_ebs_volume instead?

johannesboyne (Contributor) commented:

Regarding your question

I don't see how we can bootstrap (mkfs + mount) the disk with cloud-init or a provisioning script

I'm pretty sure you could bootstrap it via the user_data field, doing something like:

#!/bin/bash
mkfs -t ext4 /dev/xvdh # bootstrapping
mkdir /data            # create mount point
mount /dev/xvdh /data  # mount it
# service docker restart (if you want to use it as a docker volume)
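
(For what it's worth, one way to wire that script into user_data is an HCL heredoc; a sketch, assuming the aws_instance example from earlier and that the volume surfaces as /dev/xvdh:)

resource "aws_instance" "web" {
  # ... ami, instance_type, etc. as in the example above ...

  user_data = <<EOF
#!/bin/bash
mkfs -t ext4 /dev/xvdh # bootstrapping: formats the volume, wiping any existing data
mkdir /data            # create the mount point
mount /dev/xvdh /data  # mount it
EOF
}

Note that an unconditional mkfs will wipe a volume you are re-attaching; a later comment in this thread guards it with a file -s check first.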

Should we use aws_ebs_volume instead?

Well, I guess for the creation it would be correct.

Does this help you?

Pryz (Contributor) commented Sep 4, 2015

Well, you can do that if you are using an "aws_ebs_volume" resource, but not an "ebs_block_device" inside an "aws_instance" resource.

I was hoping to do the bootstrap of the EBS volume outside of cloud-init, since I have a lot of different instances and most of the time the only differences are the EBS volumes.

johannesboyne (Contributor) commented:

O.k., now I see your point. It's an interesting question.

Pryz (Contributor) commented Sep 4, 2015

Using @phinze's suggestion for now. See #2050.

catsby (Contributor) commented Dec 2, 2015

Hey all – I believe this issue has been resolved with aws_volume_attachment. Mounting needs to be done in the user_data section as mentioned. Thanks!

catsby closed this as completed Dec 2, 2015
aiwilliams commented:

I have an existing volume that I want to attach to an Amazon Linux instance and ensure that upon reboot it will be reattached.

riak-user-data.sh:

#!/bin/bash

mkdir /data
echo '/dev/sdh /data ext4 defaults,nofail,noatime,nodiratime,barrier=0,data=writeback 0 2' >> /etc/fstab
mount -a

The Terraform configuration:

resource "aws_instance" "riak" {
  ...

  user_data = "${file("riak-user-data.sh")}"

  provisioner "remote-exec" {
    inline = [
      "sudo mkdir -m 0755 -p /etc/ansible/facts.d",
    ]
    connection {
      user = "ec2-user"
      bastion_host = "${data.terraform_remote_state.vpc.jumphost_eip_public_ip}"
      bastion_user = "ubuntu"
    }
  }
}

resource "aws_ebs_volume" "riak" { ... }

resource "aws_volume_attachment" "riak" {
  device_name = "/dev/sdh"
  volume_id   = "${aws_ebs_volume.riak.id}"
  instance_id = "${aws_instance.riak.id}"
}

The process runs as follows, eventually timing out because the remote-exec provisioner is never able to connect:

aws_instance.riak: Creating...
aws_instance.riak: Still creating... (10s elapsed)
aws_instance.riak (remote-exec): Using configured bastion host...
aws_instance.riak (remote-exec):   Host: 34.XXX.XXX.XXX
aws_instance.riak (remote-exec):   User: ubuntu
aws_instance.riak (remote-exec):   Password: false
aws_instance.riak (remote-exec):   Private key: false
aws_instance.riak (remote-exec):   SSH Agent: true
...
aws_instance.riak: Still creating... (5m30s elapsed)
Error applying plan:

1 error(s) occurred:

* aws_instance.riak: 1 error(s) occurred:

* timeout

It seems that the user_data leads to a system that fails to start SSH; likely the system is failing to boot completely (the AWS docs carry a clear warning about fstab entries for volumes that may not be attached). When I remove the specified user_data, the machine boots successfully and then we see:

aws_volume_attachment.riak: Creating...
  device_name:  "" => "/dev/sdh"
  force_detach: "" => "<computed>"
  instance_id:  "" => "i-01c2d24bbeecb4586"
  skip_destroy: "" => "true"
  volume_id:    "" => "vol-0d4d48ac7cdef77ec"
aws_volume_attachment.riak: Still creating... (10s elapsed)
aws_volume_attachment.riak: Still creating... (20s elapsed)
aws_volume_attachment.riak: Creation complete (ID: vai-3330874248)

Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

Of course, the volume is attached but not mounted.

My question is: how can any cloud-init user_data that attempts to mount the device ever execute reliably, considering that the attachment cannot occur until after the instance ID is obtained? I suppose that if I have no remote-exec, the API call to create the instance returns fast enough with an instance ID that the attachment API call can be made before the cloud-init user_data is executed?

I would appreciate knowing if I am thinking of this all wrong. Thanks!
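
(One hedged way around that ordering, assuming the volume eventually surfaces as /dev/xvdh: have the user_data script poll for the device node before touching it, e.g.:)

#!/bin/bash
# wait for the aws_volume_attachment to surface as a device node, then mount
while [ ! -e /dev/xvdh ]; do sleep 1; done
mkdir -p /data
mount /dev/xvdh /data

cloud-init then simply blocks until Terraform has made the attachment call, instead of racing it.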

aiwilliams commented:

After much testing, the most reliable solution has turned out to be using a provisioner on the aws_volume_attachment. No matter how long it takes to bring up the aws_instance, the attachment will not be mounted until the host is booted and SSH is available for provisioning.

resource "aws_volume_attachment" "riak" {
  skip_destroy = true
  provisioner "remote-exec" {
    script = "attach-data-volume.sh"
    connection {
      host = "${aws_instance.riak.public_ip}"
    }
  }
}
attach-data-volume.sh:

#!/bin/bash

devpath=$(readlink -f /dev/sdh)

# format only if the device exists and does not already carry an ext4 filesystem
sudo file -s $devpath | grep -q ext4
if [[ 1 == $? && -b $devpath ]]; then
  sudo mkfs -t ext4 $devpath
fi

sudo mkdir /data
sudo chown riak:riak /data
sudo chmod 0775 /data

echo "$devpath /data ext4 defaults,nofail,noatime,nodiratime,barrier=0,data=writeback 0 2" | sudo tee -a /etc/fstab > /dev/null
sudo mount /data

# TODO: /etc/rc3.d/S99local to maintain on reboot
echo deadline | sudo tee /sys/block/$(basename "$devpath")/queue/scheduler
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
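
(One sketch for the TODO above, assuming /dev/sdh resolves to xvdh and that the image runs /etc/rc.local at boot: append the same two tuning lines so they are reapplied after a reboot:)

cat <<'EOF' | sudo tee -a /etc/rc.local > /dev/null
# reapply EBS tuning at boot (device name assumed to resolve to xvdh)
echo deadline > /sys/block/xvdh/queue/scheduler
echo never > /sys/kernel/mm/transparent_hugepage/enabled
EOF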

mwakerman commented Mar 20, 2018

Is @aiwilliams' approach still considered the best way to do this? Notably, this doesn't work well (at all?) if your aws_instance is in a private subnet.

earzur commented Mar 21, 2018

@mwakerman I suppose you can set a bastion_host in your provisioner:

provisioner "remote-exec" {
  script = "attach-data-volume.sh"
  connection {
    host         = "${aws_instance.riak.public_ip}"
    bastion_host = "xxxxx"
  }
}

I have the same issue and just discovered that provisioners can be attached to any resource, not just the instance; that's going to fix a big issue for me.

mwakerman commented Mar 22, 2018

Thanks @earzur, that should work for us.

So it looks like Terraform doesn't allow you to use a passphrase-encrypted private_key in the connection block of the remote-exec provisioner. We may be able to temporarily add and then revoke unencrypted keys, but we might also try doing the mkfs and mount in user_data after polling until the EBS volume has been attached.

Ended up doing it all in user_data by giving the instance an IAM role that included an ec2:Describe* policy and waiting until the EBS volume attaches with (credit):

while [ ! -e /dev/xvdh ]; do sleep 1; done # wait until the device node appears

EC2_INSTANCE_ID=$(wget -q -O - http://169.254.169.254/latest/meta-data/instance-id || die "wget instance-id has failed: $?")
EC2_AVAIL_ZONE=$(wget -q -O - http://169.254.169.254/latest/meta-data/placement/availability-zone || die "wget availability-zone has failed: $?")
EC2_REGION=$(echo "$EC2_AVAIL_ZONE" | sed -e 's:\([0-9][0-9]*\)[a-z]*$:\1:')

#############
# EBS VOLUME
#
# note: /dev/sdh => /dev/xvdh
# see: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/device_naming.html
#############

# wait for EBS volume to attach
DATA_STATE="unknown"
until [ $DATA_STATE == "attached" ]; do
	DATA_STATE=$(aws ec2 describe-volumes \
	    --region $${EC2_REGION} \
	    --filters \
	        Name=attachment.instance-id,Values=$${EC2_INSTANCE_ID} \
	        Name=attachment.device,Values=/dev/sdh \
	    --query Volumes[].Attachments[].State \
	    --output text)
	echo 'waiting for volume...'
	sleep 5
done

echo 'EBS volume attached!'
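
(For that aws ec2 describe-volumes call to succeed, the instance profile needs describe permissions; a minimal sketch of the IAM wiring, with hypothetical resource names:)

resource "aws_iam_role" "ebs_wait" {
  name               = "ebs-wait-role" # hypothetical name
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Service": "ec2.amazonaws.com" },
    "Action": "sts:AssumeRole"
  }]
}
EOF
}

resource "aws_iam_role_policy" "ebs_describe" {
  name   = "ec2-describe"
  role   = "${aws_iam_role.ebs_wait.id}"
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "ec2:Describe*",
    "Resource": "*"
  }]
}
EOF
}

resource "aws_iam_instance_profile" "ebs_wait" {
  name = "ebs-wait-profile" # hypothetical name
  role = "${aws_iam_role.ebs_wait.name}"
}

The aws_instance then references the profile via iam_instance_profile.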

ghost commented Apr 4, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

ghost locked and limited conversation to collaborators Apr 4, 2020