by Bryon D Beilman
Virtual machines can be a great way to consolidate your environment, provide redundant services without buying more hardware, and take advantage of underutilized servers. I have written about virtual machines before, indirectly, as they relate to sizing hardware and planning for growth.
There are many hypervisors out there, from Xen, Microsoft, VMware and others. We use VMware when required, and there is quite a bit of capability in the free version that can be downloaded from their web site. The commercial version of the software provides quite a bit of functionality to move virtual machines (VMs) around and to provide availability and backups. What about the free version (i.e., VMware 2.x)? There are two approaches to backing up virtual machines: "inside the VM" and "outside the VM".
Inside the VM
The VM is like any other operating system: it can be backed up using backup clients and software such as NetBackup, Backup Exec, dump, cpio, tar, rsync, or whatever you would use on a conventional machine. If you have a database, a unique application for which you are required to guarantee a backup SLA, or VMs that contain specific "state", then you need to consider this approach. If you are providing front-end servers, or your need for cheap or free backups overrides other requirements, you should consider the next approach.
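As a rough illustration of the "inside the VM" approach, here is a minimal sketch that archives one directory into a dated tarball from within the guest. The function name backup_dir, its two arguments and the archive naming are assumptions made for this example; a real setup would more likely use whichever backup client you already run.

```shell
#!/bin/bash
# Sketch of an in-guest backup: archive one directory into a dated tarball.
# backup_dir, its arguments, and the archive naming are illustrative assumptions.
backup_dir() {
    local src=$1 destdir=$2
    local stamp archive
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$destdir/$(basename "$src")-$stamp.tar.gz"
    mkdir -p "$destdir" || return 1
    # -C keeps the paths inside the archive relative to the parent of $src
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    # Print the archive path so the caller can copy it elsewhere or log it
    echo "$archive"
}
```

It would be called as something like `backup_dir /var/www /backup` (both paths hypothetical). For a live database, a file-level copy like this may not be consistent, so the database's own dump tool is usually the safer source for the archive.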
Outside the VM
The virtual machines can be started, stopped, captured in a snapshot and reconfigured on the fly from the VM management tools. The VMs are really a series of files stored in a location that you specify during installation. These .vm* files, along with some log files, lock files and a few other files, can be copied (outside of the guest OS) using standard utilities. I will give a simple example of how this can be done using VMware running in a CentOS Linux environment. The example shows how two machines, a primary and a secondary, can use simple scripts to provide a cold (or even warm) failover for DR or maintenance purposes. The other requirement is that it should be done with minimal interruption to the working environment and be reasonably automated. VMs can easily be taken down and copied, and this can also be done via the GUI, but automation is a better way to ensure that it actually happens, and that it happens during night or weekend maintenance periods.
Example:
In our example below, our services are provided by vmserv1. What we want to do is back up and copy the virtual machines from vmserv1 to vmserv2, so that vmserv2 can act as a warm or cold standby server. The same process should also allow you to do maintenance on vmserv1 while still maintaining the same services. Depending on how active your VMs are, vmserv2 can be any host that has the resources to run the VMs; it does not need to meet the same performance requirements as vmserv1.
The process:
1) Take VMware snapshots of the VMs that you want.
A VMware snapshot captures the state of a running VM at a point in time. While the snapshot is being taken the services may not be available, but it takes a few minutes and can be done in off-hours.
2) rsync them to vmserv2
rsync is a good tool because it can transfer only the files or bits that have changed, which makes it a fast and effective way to move the data.
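To get a feel for how much rsync would actually send on a given run, a dry run is a cheap check. The sketch below is only an illustration: the function name preview_sync and the RSYNC override variable are assumptions for this example, while --dry-run and --itemize-changes are standard rsync options.

```shell
#!/bin/bash
# Sketch: preview what rsync would transfer, without copying anything.
# preview_sync and the RSYNC override variable are illustrative assumptions.
preview_sync() {
    local src=$1 dest=$2
    # --dry-run only reports; --itemize-changes prints one line per changed item.
    # The trailing slash on the source means "the contents of this directory".
    "${RSYNC:-rsync}" -a --dry-run --itemize-changes "${src%/}/" "${dest%/}/"
}
```

With the paths used in this article it would be called as `preview_sync /data/VM/PROD vmserv2:/data/VM/PROD`. Running it a second time right after a real sync should list little or nothing, which is a quick way to confirm that only changes are being sent.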
We are showing a process that just uses two standalone servers, but another good approach is to put your VMs on a fast NFS server so that each server can use the same VM files. Like most IT problems, there are numerous solutions, each having its merits.
Before we show the script, you might be wondering how we can copy a live, running OS and have it work on the other side. Once the snapshot is taken, you can copy the entire VM while it is live, and on the target side (in this case vmserv2) you can then 'revert to last snapshot'. VMware will use the snapshot information to make the VM run as it was at the time of the snapshot. I have tested this method many times, and even tested some running database applications to make sure that data updated minutes before the snapshot is present on the newly brought-up VM. Those who haven't done this should also be aware that you should not bring up two identical VMs on two servers unless you are specifically controlling their networking to make sure there are no conflicts. The one thing you need to watch out for is that rsync may copy the .lck files and directories, so you will need to test and refine your process to remove the .lck-related files so that the failover VM starts properly.
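The lock-file cleanup mentioned above could look something like the following on the standby host. This is a minimal sketch, not a tested part of the process: the function name clean_vm_locks is made up for the example, and it assumes that every file or directory ending in .lck under the copied VM tree is safe to delete because none of those VMs is running there.

```shell
#!/bin/bash
# Sketch: delete VMware lock files/directories (*.lck) from a copied VM tree.
# clean_vm_locks is an illustrative name. Run it on the standby host only,
# never against a directory whose VMs are currently running.
clean_vm_locks() {
    local vmdir=$1
    [ -d "$vmdir" ] || { echo "no such directory: $vmdir" >&2; return 1; }
    # -prune stops find from descending into a .lck directory it is about to remove
    find "$vmdir" -name '*.lck' -prune -exec rm -rf -- {} +
}
```

On vmserv2 this would be run as `clean_vm_locks /data/VM/PROD` before reverting to the snapshot and starting the VM. Another option worth testing is to keep the lock files from being copied at all by adding `--exclude='*.lck'` to the rsync command.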
Here is a simple example script:
#!/bin/bash
# below <vmroot> is the user that runs your vmware, and <pass> is the password you use
# rsync needs to be set up using either an rsync server or via ssh keys, so that your process can communicate without interaction
# a few notifications are included, but this is an example, and you can provide many more checkpoints and comparisons, to make sure it succeeds.
TMPFILE=/tmp/process.$$
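# Gather the list of registered VMs (according to your configuration).
# grep drops the "Total ..." summary line, if vmrun prints one, so that it is not treated as a VM.
vmrun -h https://localhost:8333/sdk -u '<vmroot>' -p '<pass>' listRegisteredVM | grep -v '^Total' > "$TMPFILE"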
# Now create a snapshot of each VM; the rsync to the secondary server follows the loop
while read -r VM; do
    echo "VM is $VM"
    # -T expects a host type (e.g. "server" for VMware Server 2.x), not a host name
    /usr/bin/vmrun -T server -h https://localhost:8333/sdk -u '<vmroot>' -p '<pass>' snapshot "$VM"
done < "$TMPFILE"
echo "Now, rsync the data" >> "$TMPFILE"
/usr/bin/rsync -avzq /data/VM/PROD/ vmserv2:/data/VM/PROD >> "$TMPFILE" 2>&1
# Mail the VM list and the rsync output, then clean up
mail -s "VM snapshots/Rsync complete" itops@yourdomain.com < "$TMPFILE"
/bin/rm "$TMPFILE"
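The script's comments note that more checkpoints can be added. One possible shape for such a checkpoint is sketched below: a small wrapper that runs a step, logs its output and remembers whether anything failed, so the mail subject can say so. The names run_step, mail_subject, LOG and FAILED are illustrative and are not part of the script above.

```shell
#!/bin/bash
# Sketch of one extra checkpoint: run a step, log its output, remember failures.
# run_step, mail_subject, LOG, and FAILED are illustrative names.
LOG=$(mktemp /tmp/vmbackup.XXXXXX)
FAILED=0

run_step() {
    local label=$1
    shift
    echo "== $label" >> "$LOG"
    # Run the remaining arguments as a command; record a failure but keep going
    if ! "$@" >> "$LOG" 2>&1; then
        echo "FAILED: $label" >> "$LOG"
        FAILED=1
    fi
}

# Choose the mail subject from the recorded result.
mail_subject() {
    if [ "$FAILED" -eq 0 ]; then
        echo "VM snapshots/Rsync complete"
    else
        echo "VM snapshots/Rsync FAILED"
    fi
}
```

Each command in the script would then be wrapped, for example `run_step "rsync to vmserv2" /usr/bin/rsync -avzq /data/VM/PROD/ vmserv2:/data/VM/PROD`, and the report sent with `mail -s "$(mail_subject)" itops@yourdomain.com < "$LOG"`.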
It's simple, it works and it's free.