AWS re:Invent 2014 – Day 2

The second day of the AWS re:Invent conference has drawn to a close, and I sit here with ringing ears from the re:Play party, where Skrillex was headlining. AWS certainly know how to throw a party – this one included a 20ft Tetris game, an entire arcade of retro games, a quadcopter obstacle course, and luminous dodgeball.

Back to the keynote: a large portion of today’s was spent letting large AWS customers talk about their positive experiences of using the “AWS platform”. This is an important phrase – one that has been used significantly more this year than in previous years, and it reinforces the message that Amazon are trying to push: that AWS is much more than just IaaS. As Amazon continue to release products and services that creep up the stack, this message will only become more important. The other key theme this year is containers – they have appeared from nowhere in the past 18 months and are gaining momentum rapidly. Expect to see a lot more of them in the coming years.

Automatically creating and purging snapshots for additional EBS volumes attached to an instance

If you are using AWS in anger, you will likely be storing data that you need to persist. Unfortunately, persistent data doesn’t fit well with Amazon’s philosophy of disposable infrastructure, so you’ll be using EBS volumes rather than ephemeral storage.

As the data’s important enough to persist, you’ll probably want to make sure it’s backed up too. The easiest way to back up EBS volumes is by snapshotting them. It’s easy to automate the process, so you can rest assured that your data is safe.
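Automating the schedule is typically just a pair of cron entries; a minimal sketch, assuming the scripts live in /home/ec2-user/scripts and log to the user’s home directory (both hypothetical paths):

```
# Hypothetical crontab for ec2-user: snapshot nightly at 01:00, purge at 02:00
0 1 * * * /home/ec2-user/scripts/create_volume_snapshot.sh >> /home/ec2-user/logs/create_volume_snapshot.log 2>&1
0 2 * * * /home/ec2-user/scripts/purge_volume_snapshot.sh >> /home/ec2-user/logs/purge_volume_snapshot.log 2>&1
```

Staggering the purge an hour after the create avoids the two scripts racing over the same temporary files.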

Snapshots are particularly useful as you can create new EBS volumes from a snapshot and mount them alongside your live data. Snapshots are also Amazon’s suggested mechanism to restore an EBS volume into another AZ if the AZ hosting your EBS volumes becomes unavailable for some reason.
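Restoring into another AZ then amounts to creating a fresh volume from the snapshot in a healthy zone and attaching it to a replacement instance. A hedged sketch using the same EC2 API tools as the scripts below, with hypothetical IDs – the command is echoed here rather than executed:

```shell
# Hypothetical snapshot ID and target AZ -- substitute your own
SNAPSHOT_ID="snap-0123abcd"
TARGET_AZ="us-east-1b"
# Build the restore command (echoed rather than run, so nothing is billed)
RESTORE_CMD="ec2-create-volume --snapshot ${SNAPSHOT_ID} -z ${TARGET_AZ}"
echo "${RESTORE_CMD}"
# After creation, attach the new volume to a replacement instance, e.g.:
# ec2-attach-volume vol-xxxxxxxx -i i-xxxxxxxx -d /dev/sdf
```

The new volume must be created in the same region as the snapshot, but can land in any AZ within it.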

Create snapshot script

Below is a script I wrote to automatically snapshot all volumes attached to a particular instance. The script is run from the host to which the EBS volume is attached, so you’ll need to have some AWS credentials in there for this to work. You could quite easily move the script to a centralised management host and feed in the instance ID or some other parameter, but I didn’t have a requirement to do that.

The script relies on a couple of custom Tags we apply to all our EBS volumes, but you could modify the script to work around this. I just couldn’t think of a simple way to de-duplicate all the volume IDs you get in the output from ec2-describe-volumes.
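For what it’s worth, one way to de-duplicate volume IDs without relying on tags would be to keep only the VOLUME lines and pipe the ID field through sort -u; a sketch against a fabricated fragment of ec2-describe-volumes-style output:

```shell
# Fabricated sample of ec2-describe-volumes output; the real output can
# repeat a VOLUME line per attachment/tag block, hence the duplicates
SAMPLE="VOLUME vol-aaaa1111 100 snap-1234 us-east-1a in-use
TAG volume vol-aaaa1111 MountPoint /data
VOLUME vol-aaaa1111 100 snap-1234 us-east-1a in-use
VOLUME vol-bbbb2222 50 snap-5678 us-east-1a in-use"
# Keep only VOLUME lines, take the 2nd field (the volume ID), de-duplicate
UNIQUE_VOLS=`echo "${SAMPLE}" | awk '$1 == "VOLUME" { print $2 }' | sort -u`
echo "${UNIQUE_VOLS}"
```

Each unique volume ID then comes out exactly once, regardless of how many times it appears in the raw output.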

#!/bin/bash
 
echo "#"
echo "# Starting create_volume_snapshot.sh on `date +%Y-%m-%d` at `date +%H%M`"
echo "#"
source /home/ec2-user/.bash_profile
export YV_INSTANCE_ID=`curl http://169.254.169.254/latest/meta-data/instance-id 2> /dev/null`
 
# Create a temp text file to store the information about the volumes
TEMP_VOL_FILE="/tmp/YV-ec2-describe-volumes-`date +%s`.txt"
ec2-describe-volumes -F "attachment.instance-id=${YV_INSTANCE_ID}" > ${TEMP_VOL_FILE}
# Check we received a non-empty response
RESPONSE_CHECK=`cat ${TEMP_VOL_FILE} | wc -l`
if [[ ${RESPONSE_CHECK} -eq 0 ]]; then
  echo "Call to ec2-describe-volumes resulted in a blank response, cannot continue."
  exit 1
fi
 
# Loop through the output file of ec2-describe-volumes command, searching for 4th field 
# which is Tag Key name (and should match MountPoint)
while read LINE
  do
  MPSEARCH=`echo $LINE | cut -f4 -d' '`
  if [[ ${MPSEARCH} == "MountPoint" ]]
    then
    # For each volume, get all data about the volume and output to a temporary file
    VOLUMEID=`echo $LINE | cut -f3 -d' '`
    ec2-describe-volumes ${VOLUMEID} > /tmp/${VOLUMEID}.txt
    # Extract the Role, Environment and MountPoint tag values
    ROLE=`grep Role /tmp/${VOLUMEID}.txt | awk -F" " '{ print $5 }'`
    ENVIRONMENT=`grep Environment /tmp/${VOLUMEID}.txt | awk -F" " '{ print $5 }'`
    MOUNTPOINT=`grep MountPoint /tmp/${VOLUMEID}.txt | awk -F" " '{ print $5 }'`
    DATESTAMP=`date +%Y-%m-%d-%H%M`
    echo "    VolumeID    : ${VOLUMEID}"
    echo "    Environment : ${ENVIRONMENT}"
    echo "    Role        : ${ROLE}"
    echo "    MountPoint  : ${MOUNTPOINT}"
    echo "    DateStamp   : ${DATESTAMP}"
    # Create a snapshot of the volume, including a description
    ec2-create-snapshot ${VOLUMEID} -d snap_${ENVIRONMENT}_${ROLE}_${MOUNTPOINT}_${YV_INSTANCE_ID}_${DATESTAMP}
    # Clean up volume information temporary file
    rm -f /tmp/${VOLUMEID}.txt
  fi
done < ${TEMP_VOL_FILE}
 
# Clean up temporary text file with all volume information
rm -f ${TEMP_VOL_FILE}
echo "#"
echo "# Finished create_volume_snapshot.sh on `date +%Y-%m-%d` at `date +%H%M`"
echo "#"

Purge snapshot script

As with everything in AWS, you pay for exactly what you use, so you don’t want to keep all those EBS snapshots indefinitely – over time you’ll start paying a lot to store them. What you need is a script which runs regularly and removes snapshots older than a certain age. The script below again relies on some of our custom EBS Tags, but it shouldn’t be too hard to modify for your purposes.
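The age check in the script below is just epoch arithmetic: convert both the cutoff and the snapshot’s start date to seconds since the epoch and compare. A standalone sketch with a fabricated snapshot date (assumes GNU date, as the script does):

```shell
# Cutoff: snapshots started before this moment are candidates for deletion
CUTOFF_EPOCH=`date --date='15 days ago' +%s`
# A fabricated snapshot start date, well in the past
SNAP_DATE="2014-01-01"
SNAP_EPOCH=`date --date="${SNAP_DATE}" +%s`
# Older epoch value means an older snapshot
if [[ ${SNAP_EPOCH} -lt ${CUTOFF_EPOCH} ]]; then
  RESULT="delete"
else
  RESULT="keep"
fi
echo "${RESULT}"
```

Comparing epochs sidesteps any string-format ambiguity in the timestamps that the API tools return.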

#!/bin/bash
 
echo "#"
echo "# Starting purge_volume_snapshot.sh on `date +%Y-%m-%d` at `date +%H%M`"
echo "#"
source /home/ec2-user/.bash_profile
export YV_INSTANCE_ID=`curl http://169.254.169.254/latest/meta-data/instance-id 2> /dev/null`
 
# Date variables
DATECHECK=`date +%Y-%m-%d --date '15 days ago'`
DATECHECK_EPOCH=`date --date="$DATECHECK" +%s`
 
# Get all volume info and copy to temp file
TEMP_VOL_FILE="/tmp/YV-ec2-describe-volumes-`date +%s`.txt"
ec2-describe-volumes -F "attachment.instance-id=${YV_INSTANCE_ID}" > ${TEMP_VOL_FILE}
# Check we received a non-empty response
RESPONSE_CHECK=`cat ${TEMP_VOL_FILE} | wc -l`
if [[ ${RESPONSE_CHECK} -eq 0 ]]; then
  echo "Call to ec2-describe-volumes resulted in a blank response, cannot continue."
  exit 1
fi
 
# Loop through the output file of ec2-describe-volumes command, searching for 4th field which is Tag Key name (and should match MountPoint)
while read LINE
  do
  MPSEARCH=`echo $LINE | cut -f4 -d' '`
  if [[ ${MPSEARCH} == "MountPoint" ]]
    then
    # For each volume, get all data about the volume and output to a temporary file
    VOLUMEID=`echo $LINE | cut -f3 -d' '`
    echo "Volume ID : ${VOLUMEID}"
    ec2-describe-snapshots -F "volume-id=${VOLUMEID}" > /tmp/${VOLUMEID}.txt
    # Loop to remove any snapshots older than 15 days
    while read SNAPLINE
      do
      SNAPSHOT_NAME=`echo $SNAPLINE | grep ${VOLUMEID} | awk '{ print $2 }'`
      # Skip any lines that don't describe a snapshot of this volume
      if [[ -z ${SNAPSHOT_NAME} ]]; then continue; fi
      DATECHECK_OLD=`echo $SNAPLINE | grep ${VOLUMEID} | awk '{ print $5 }' | awk -F "T" '{ printf "%s\n", $1 }'`
      DATECHECK_OLD_EPOCH=`date --date=${DATECHECK_OLD} +%s`
      echo "    Snapshot Name        : ${SNAPSHOT_NAME}"
      echo "    Datecheck -15d       : ${DATECHECK}"
      echo "    Datecheck -15d Epoch : ${DATECHECK_EPOCH}"
      echo "    Snapshot Epoch       : ${DATECHECK_OLD_EPOCH}"
      if [[ ${DATECHECK_OLD_EPOCH} -lt ${DATECHECK_EPOCH} ]]; then
        echo "Deleting snapshot $SNAPSHOT_NAME as it is more than 15 days old..."
        ec2-delete-snapshot $SNAPSHOT_NAME
      else
        echo "Not deleting snapshot $SNAPSHOT_NAME as it is less than 15 days old..."
      fi
    done < /tmp/${VOLUMEID}.txt
    # Clean up volume information temporary file
    rm -f /tmp/${VOLUMEID}.txt
  fi
done < ${TEMP_VOL_FILE}
 
# Clean up temporary text file with all volume information
rm -f ${TEMP_VOL_FILE}
echo "#"
echo "# Finished purge_volume_snapshot.sh on `date +%Y-%m-%d` at `date +%H%M`"
echo "#"

Acknowledgement

I would like to thank Kevin at stardot hosting for his post, on which the snapshot and purge scripts above are based.

I hope somebody finds these scripts useful – they’ve certainly been great for my peace of mind!