Ubuntu VMware to EC2

This is a guide to porting a local Ubuntu 10.04 install from VMware under OSX to running on Amazon EC2.

Flitterati is hosted on EC2. I’ve done most of the development on OSX, and then ported the scripts to an Ubuntu install on EC2. Unfortunately, there are enough differences to make this suck. When your development environment is too far from production, you create a lot of extra work for yourself.

Thankfully, with Amazon’s support for user-provided kernels, it’s become a lot easier to run a local Ubuntu on VMware Fusion and then transfer that system to EC2. This isn’t terribly well documented, as most folks seem to work only on EC2. It took a lot of trial and error, but now I have a functioning local Ubuntu install, and it’s trivial to upload and run exactly the same system on EC2.

These two blog posts were essential in getting me started and over the hurdles:

  • IonCannon.net – This is a very similar guide with the same objective. I had a few problems with it, and where this guide differs is in the places I had to adjust.
  • tomheady on Github – This guide gives a few more pointers about the necessary xen kernel install

The Objective

I’m on OSX, running VMware Fusion locally. I want to develop on a local Ubuntu instance that runs apache and couchdb in exactly the same way as it will run live, and work with it interactively. Then, as simply as possible, I want to upload the image to EC2 and run an instance with the same code.

I use the ephemeral store on EC2 to hold the CouchDB database. I want to have a /mnt directory that locally refers to another disk, and on EC2 refers to the ephemeral store.

In order to work with BBEdit and Eclipse locally, I want to mount the instance file system. I use Samba to do this.

Step 1 – Install Ubuntu on Fusion

I’m working with Ubuntu 10.04 LTS Server (Lucid). Go ahead and download the install .iso from Canonical, and begin installing under Fusion. Make the following changes:

  • Change the Fusion disk size to 2GB – this is plenty for my install, and saves time in uploading the raw disk image later
  • Add another Fusion disk of 150GB – this is the size I usually see on a small Amazon instance, and we’ll map this to match the ephemeral store. Both these disks should be on the SCSI bus
  • I set up networking to use NAT – this still allows connecting to services inside the VM from the desktop
  • Once the install routine is booted, choose a manual install so you can partition the disk and avoid LVM
  • Remove the swap partition, and instead format the 2GB disk as a single primary bootable partition. Don’t format the second disk yet
  • I set up the initial system with LAMP, OpenSSH, and Samba – you might want a different list of services
  • Install Grub

Step 2 – After Ubuntu is Running

I didn’t bother to set up VMware tools on the instance. All the following is from the command-line. Almost every command should actually start with ‘sudo’, but I’ve omitted that.

First we’ll mount our local disk the same way Amazon will mount the ephemeral drive:

  • ls /dev/s* – make sure you have sda and sdb, the two disks you created at the beginning
  • mkfs.ext4 /dev/sdb – you will receive a warning about creating a filesystem with no partition table, but this is the way the Amazon ephemeral store is mounted
  • Create a new fstab with these contents:
proc    /proc    proc    nodev,noexec,nosuid 0  0
/dev/sda1  / ext4  errors=remount-ro 0  1
/dev/sdb  /mnt  ext4  defaults 0  0
/dev/sda3 none  swap  sw   0  0

When running locally, there is no sda3. So this results in a warning on boot. I don’t really care, and it does result in swap space under EC2.
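If you rebuild the VM more than once, it can help to generate that fstab from a script. This is only a minimal sketch: the function name write_fstab and its target-path argument are my own invention, and it writes to a scratch file by default so nothing is overwritten until you copy it into place yourself.

```shell
#!/bin/sh
# Sketch: write the fstab shown above to a target file.
# write_fstab and its argument are illustrative names, not part of any tool.
write_fstab() {
    target="${1:-./fstab.new}"
    # Quoted heredoc delimiter: the contents are written literally.
    cat > "$target" <<'EOF'
proc       /proc  proc  nodev,noexec,nosuid  0  0
/dev/sda1  /      ext4  errors=remount-ro    0  1
/dev/sdb   /mnt   ext4  defaults             0  0
/dev/sda3  none   swap  sw                   0  0
EOF
}
```

A typical use would be write_fstab ./fstab.new, a quick review of the result, and then (as root) cp ./fstab.new /etc/fstab.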

Now install the xen kernel you need to run under EC2. Helpfully, Amazon boots with PV-Grub, which reads menu.lst. Ubuntu Lucid installs Grub 2, which doesn’t use menu.lst, so we can leave the kernel that Grub 2 boots alone, and only refer to the EC2 kernel from within menu.lst!

  • aptitude install linux-ec2
  • Create /boot/grub/menu.lst:
default 0
timeout 1
title EC2-kernel
    root (hd0,0)
    kernel /boot/vmlinuz-2.6.32-314-ec2 root=/dev/sda1
    initrd /boot/initrd.img-2.6.32-314-ec2
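The kernel version in that file has to match whatever linux-ec2 actually installed, and it will change when the package is updated. Here is a rough sketch that builds menu.lst from the newest -ec2 kernel found in /boot. The function names are hypothetical, and it assumes GNU sort (for the -V version ordering), which is what Ubuntu ships.

```shell
#!/bin/sh
# Sketch: build menu.lst for whichever -ec2 kernel is installed.
# latest_ec2_kernel and write_menu_lst are illustrative names.

# Print the newest version string (e.g. 2.6.32-314-ec2) found in a boot dir.
latest_ec2_kernel() {
    boot_dir="${1:-/boot}"
    ls "$boot_dir" | sed -n 's/^vmlinuz-\(.*-ec2\)$/\1/p' | sort -V | tail -n 1
}

# Write a menu.lst with the same layout as the one above.
write_menu_lst() {
    boot_dir="${1:-/boot}"
    target="${2:-./menu.lst}"
    version=$(latest_ec2_kernel "$boot_dir")
    if [ -z "$version" ]; then
        echo "no -ec2 kernel found in $boot_dir" >&2
        return 1
    fi
    cat > "$target" <<EOF
default 0
timeout 1
title EC2-kernel
    root (hd0,0)
    kernel /boot/vmlinuz-$version root=/dev/sda1
    initrd /boot/initrd.img-$version
EOF
}
```

Running write_menu_lst /boot ./menu.lst leaves the result in the current directory, so you can look it over before copying it to /boot/grub/menu.lst.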

Get easier access to the Ubuntu VM:

  • Get the IP address using ifconfig or something similar inside the VM
  • ssh from the desktop using this IP address
  • You should be able to log in using only a password at this point; we’ll disable password authentication later

Configure sshd and anything else necessary:

  • update-rc.d -f hwclock remove – the hardware clock is meaningless under xen, and it’s recommended that you stop the service
  • mkdir ~/.ssh
  • [from desktop] scp id_rsa.pub user@ipaddress:~/.ssh/authorized_keys2 – use the IP address found earlier, and copy your public key to the appropriate authorized_keys file
  • emacs /etc/ssh/sshd_config
    • PermitRootLogin no
    • PasswordAuthentication no
  • Confirm that you can no longer log in using a password, and you must present your RSA key instead. EC2 machines are scanned by hackers constantly, and there is no reason for allowing access with only a password.
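If you would rather not edit sshd_config by hand each time, the two settings can be applied with a small helper. This is a sketch under my own assumptions: set_sshd_option is a made-up name, it rewrites every existing line for the key (including commented-out ones), and it leaves a .bak copy next to the file.

```shell
#!/bin/sh
# Sketch: set "Key value" in an sshd_config-style file, replacing any
# existing (possibly commented-out) line for that key, or appending one.
# set_sshd_option is an illustrative helper name.
set_sshd_option() {
    file="$1"; key="$2"; value="$3"
    if grep -q "^[#[:space:]]*$key[[:space:]]" "$file"; then
        # Keep a backup as $file.bak; works with GNU and BSD sed.
        sed -i.bak "s/^[#[:space:]]*$key[[:space:]].*/$key $value/" "$file"
    else
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}
```

For this guide that would be set_sshd_option /etc/ssh/sshd_config PermitRootLogin no followed by set_sshd_option /etc/ssh/sshd_config PasswordAuthentication no. Running sshd -t afterwards checks the file for syntax errors before you restart the ssh service.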

Go ahead and reboot the instance, make sure everything is running well, and you still have access to ssh. That’s it for setting up the VM!
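One thing worth checking after the reboot is that /mnt really is the second disk. A minimal sketch, assuming a /proc/mounts-style file; the helper name is_mounted is my own, and the third argument exists only so the check can be tried against a sample file.

```shell
#!/bin/sh
# Sketch: succeed if $device is mounted on $mountpoint according to a
# /proc/mounts-style file (third argument, defaults to /proc/mounts).
# is_mounted is an illustrative helper name.
is_mounted() {
    device="$1"; mountpoint="$2"; mounts="${3:-/proc/mounts}"
    awk -v d="$device" -v m="$mountpoint" \
        '$1 == d && $2 == m { found = 1 } END { exit !found }' "$mounts"
}
```

On the VM, is_mounted /dev/sdb /mnt && echo "/mnt is on sdb" should print the message. swapon -s will show no swap locally, which matches the missing sda3 noted earlier.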

Step 3 – Convert the VM disk and Transfer to EC2

This is the step that you will probably run repeatedly as you develop. You can obviously script these steps, and the only real time sink is the upload of the 2GB disk image each time you wish to update the AMI.

From the host system, locate the .vmdk file that represents the boot drive of the VM. This should be in ~/Documents/Virtual Machines.localized/VMdir, or someplace similar. Then extract the disk:

  • sudo port install qemu – (or something similar) I had to install qemu
  • qemu-img convert -O raw EC2.vmdk ~/temp/vmbase.raw
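The raw image has to fit on the EBS volume you create in the next part, so it is worth checking its size after the conversion. A small sketch, with size_in_gib as a hypothetical helper name; it rounds the file size up to a whole number of gigabytes (counted as 1024^3 bytes).

```shell
#!/bin/sh
# Sketch: print the smallest whole number of GiB that can hold a file.
# size_in_gib is an illustrative helper name.
size_in_gib() {
    bytes=$(wc -c < "$1" | tr -d ' ')
    # Integer ceiling division by 1024^3.
    echo $(( (bytes + 1073741823) / 1073741824 ))
}
```

For the 2GB disk used here, size_in_gib ~/temp/vmbase.raw should print 2; if it prints something larger, the volume size passed to ec2-create-volume needs to grow to match.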

Run an Amazon instance so that you can attach a volume to it. This can be any Linux-based instance, such as the base Amazon Linux AMI. Install the AWS API tools if necessary. From that instance:

  • ec2-create-volume -z us-east-1d -s 2 – this creates a 2GB volume
  • ec2-attach-volume <volume-id> -i <instance-id> -d /dev/sdh
  • (copy the vmbase.raw disk image from the desktop to this instance)
  • dd if=vmbase.raw of=/dev/sdh bs=10M
  • ec2-create-snapshot -d "vmbase snapshot" <volume-id>
  • ec2-register -n "VMBase" -d "AMI created from VMware base" --root-device-name /dev/sda1 -b /dev/sda=<snapshot-id>:2:true -b /dev/sdb=ephemeral0 --kernel aki-4c7d9525 – this is for US East; for other regions, look up the matching hd00 AKI
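Since this part gets repeated for every AMI update, here is a rough sketch of how the last few commands could be wrapped in a script. It uses only the flags shown in the list above. The function names, the DRY_RUN switch, and the placeholder IDs are my own assumptions; by default it only prints what it would run, and the printed form drops the quoting around multi-word arguments.

```shell
#!/bin/sh
# Sketch of the Step 3 commands as shell functions. By default they only
# print the commands (DRY_RUN=1); set DRY_RUN=0 to execute them.
# All function and variable names here are illustrative.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Copy the raw image onto the attached volume, then snapshot the volume.
image_to_snapshot() {
    image="$1"; volume_id="$2"; device="${3:-/dev/sdh}"
    run dd if="$image" of="$device" bs=10M
    run ec2-create-snapshot -d "vmbase snapshot" "$volume_id"
}

# Register an AMI from the snapshot. The default AKI is the US East
# value quoted in this guide; pass another one for other regions.
register_ami() {
    snapshot_id="$1"; aki="${2:-aki-4c7d9525}"
    run ec2-register -n "VMBase" -d "AMI created from VMware base" \
        --root-device-name /dev/sda1 \
        -b "/dev/sda=${snapshot_id}:2:true" \
        -b /dev/sdb=ephemeral0 \
        --kernel "$aki"
}
```

With the defaults, image_to_snapshot vmbase.raw <volume-id> and register_ami <snapshot-id> just echo the commands, which is a cheap way to check the IDs before running with DRY_RUN=0.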

That’s it! You should be able to boot and log into the AMI. I just continue working normally on the VM, and copy to EC2 as necessary.

Some other notes

Making snapshots will create multiple .vmdk files. I’m not sure how to convert these into a .raw file successfully. Copying the VM using Fusion will create a fresh .vmdk file that can be converted.

If you copy the Fusion VM, you might lose network connectivity. Delete /etc/udev/rules.d/70-persistent-net.rules and reboot if this happens.

You can set a static IP for your VM:

  • In the VM properties, ask Fusion to create a MAC address for the network card
  • Add this to /Library/Application Support/VMware Fusion/vmnet8/dhcpd.conf:
host <name of VM here> {
    hardware ethernet <mac address here> ;
    fixed-address 192.168.146.10;
}

(make sure the IP address range matches the subnet mentioned in the routing tables higher in the same dhcpd.conf file)
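If you keep several VMs around, the host entry can be generated instead of typed. A minimal sketch; dhcp_host_entry is a made-up helper name, and the VM name and MAC address in the usage note are placeholders, not values from my setup.

```shell
#!/bin/sh
# Sketch: print a dhcpd.conf host stanza in the layout shown above.
# dhcp_host_entry is an illustrative helper name.
dhcp_host_entry() {
    name="$1"; mac="$2"; ip="$3"
    printf 'host %s {\n    hardware ethernet %s;\n    fixed-address %s;\n}\n' \
        "$name" "$mac" "$ip"
}
```

For example, dhcp_host_entry ubuntu-ec2 00:50:56:00:00:10 192.168.146.10 prints a stanza you can review and then append to dhcpd.conf as root.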