Backing up data is always a challenge. There are different ways to do backups: some are convenient, others are complicated depending on how your data is stored, and you often need to strike a balance between the two to accommodate recovery and business needs. Electronic Design Automation (EDA) environments are even more challenging, mainly because of the massive number of files generated as part of design and simulation.
First, repeat after me: RAID IS NOT BACKUP. A RAID array provides some protection against certain types of failures, namely the failure of individual disks. Multiple disk failures, or failures in the controllers or the storage server, can still result in corruption and data loss. And RAID certainly doesn't protect against the human factor: accidentally overwriting or deleting a file.
There are different ways to do backups these days, and the right method may depend on the type of storage used, whether you are using virtual machines, how long you want to retain the data, and so on. To simplify the discussion, we can generally divide backup methods into two classes: file-based and block-based.
Let's take, for example, the combination of a VMware vSphere environment and Datto’s SIRIS backups. vSphere provides the means to take snapshots of a given VM, which allows convenient rollback after updates, testing of changes, etc. When a snapshot is taken, the virtual disk devices, which themselves exist as files on the underlying storage, stop receiving new writes; instead, all subsequent changes to the virtual disks are written to new files called deltas. This means the state of the virtual disks is preserved as of the moment the snapshot was made. A backup product such as Datto SIRIS can then take the original virtual disk files, determine which blocks have changed since the last time it backed them up, and copy only those blocks. When the backup is done, the snapshot is released and the changed blocks that were written to the delta files get consolidated back into the originals. Note: While the snapshot exists, extra disk space is used to hold all the changes written to the delta files, so make sure you have sufficient disk space for the length of the backup run. How much you need depends on how much data changes during that time frame, so quiet periods are best.
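To make the "copy only the blocks that have changed" idea concrete, here is a minimal Python sketch. It is not how vSphere or Datto SIRIS actually track changes; it simply assumes a fixed block size and compares per-block SHA-256 digests against those saved from the previous run. The names `BLOCK_SIZE`, `block_hashes`, `changed_blocks`, and `incremental_backup` are hypothetical and exist only for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def block_hashes(image: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one SHA-256 digest per fixed-size block of a disk image."""
    return [
        hashlib.sha256(image[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(image), block_size)
    ]


def changed_blocks(previous: list[str], current: list[str]) -> list[int]:
    """Return the indexes of blocks that are new or differ from the last run."""
    return [
        index
        for index, digest in enumerate(current)
        if index >= len(previous) or previous[index] != digest
    ]


def incremental_backup(image: bytes, previous: list[str],
                       block_size: int = BLOCK_SIZE):
    """Return (blocks to copy keyed by index, digests to keep for next run)."""
    current = block_hashes(image, block_size)
    to_copy = {
        index: image[index * block_size:(index + 1) * block_size]
        for index in changed_blocks(previous, current)
    }
    return to_copy, current


if __name__ == "__main__":
    first = b"a" * BLOCK_SIZE + b"b" * BLOCK_SIZE
    copied, digests = incremental_backup(first, [])
    print("first run copies blocks:", sorted(copied))

    second = b"a" * BLOCK_SIZE + b"X" * BLOCK_SIZE
    copied, digests = incremental_backup(second, digests)
    print("second run copies blocks:", sorted(copied))
```

The first run has nothing to compare against, so every block is copied; later runs copy only the blocks whose digests differ. The cost of such a backup therefore follows the amount of changed data rather than the number of files inside the virtual disk.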
Added bonus: Datto SIRIS can run the VMs directly from the backups it makes, either on the SIRIS system itself or even in the cloud! This provides business continuity in addition to the ability to restore.
Some of these systems which support snapshots may also support an inherent ability to make a limited form of backup, sometimes referred to as "snapshot shipping". The idea is that if you already have a copy of a logical device on another system and you take snapshots of the original, you can send those snapshots over to the other system and recreate the changes there. The snapshots already contain only the changes to the filesystem, so the amount of data to transfer is minimal and can be suitable for sending directly offsite, depending on the overall size and the speed of your connection.
NetApp Filers support this through their SnapMirror and SnapVault products. The ZFS filesystem (supported on Solaris and BSD-derived systems, and as an add-on to Linux) does this as well using the "zfs send" and "zfs recv" commands. The best part is that in the case of loss of the original storage system, these backup systems are generally ready to go immediately — no need for an explicit restore to be done.
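As a rough illustration of snapshot shipping with ZFS, the Python sketch below only builds the command lines for one incremental transfer; it does not run them. The dataset, snapshot, and host names (`tank/projects`, `monday`, `tuesday`, `backuphost`, `backup/projects`) are made up, and the sketch assumes the earlier snapshot already exists on both sides and that the backup host is reachable over ssh.

```python
import shlex


def snapshot_shipping_commands(dataset, previous_snap, new_snap,
                               remote_host, remote_dataset):
    """Build (but do not run) the commands for one incremental transfer."""
    # Take a new snapshot of the source dataset.
    take = ["zfs", "snapshot", f"{dataset}@{new_snap}"]
    # Send only the changes between the previous and the new snapshot.
    send = ["zfs", "send", "-i",
            f"{dataset}@{previous_snap}", f"{dataset}@{new_snap}"]
    # Receive the stream into the copy held on the backup system.
    recv = ["ssh", remote_host, "zfs", "recv", remote_dataset]
    return take, send, recv


def as_shell_pipeline(send, recv):
    """Render the send and receive halves as a single shell pipeline."""
    return f"{shlex.join(send)} | {shlex.join(recv)}"


if __name__ == "__main__":
    take, send, recv = snapshot_shipping_commands(
        "tank/projects", "monday", "tuesday", "backuphost", "backup/projects"
    )
    print(shlex.join(take))
    print(as_shell_pipeline(send, recv))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; a real job would also need error handling and a policy for pruning old snapshots on both sides.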
The drawback is that typically the destination needs to be similar hardware and a compatible OS release, and depending on the rate of new data being written (and in an EDA environment, that can be large) and the size of the backup system, you may not be able to use this mechanism for long-term archives or transfer over slow network links. Consider a secondary backup system for long-term storage — but keep in mind you can run the archival backups off the backup system and not impact the primary system, so how long it takes is less of a consideration.
When you do go looking for a backup system, be aware that some of the more specialized systems that can do block-level backups may not support backing up anything other than the type of system they were designed for, or may carry additional costs for backing up different storage systems. For hybrid environments consisting of varying types of storage, it may be difficult to find a single solution that is fast, economical, and takes care of everything. If your environment incorporates lots of different systems, you may wish to consider more than one solution rather than choose one that is suboptimal all around.
In the end, there are many different options to choose from, and finding the right balance for your environment will be a challenge. Pay attention to what the vendors say about their options, but always keep in mind that in YOUR environment, the number of files may be more important than the total amount of data you need to back up, and vendors tend to say very little about that. If you can, choose a backup system that can deal with the underlying block structure of your storage instead of the traditional file methods, and if you are looking for new storage as well, consider how you are going to back it up!
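If you are not sure whether file count or total size dominates in your environment, a quick measurement helps. The following Python sketch walks a directory tree and reports both numbers; `tree_stats` and `TreeStats` are hypothetical names, and on a large EDA tree the walk itself can take a long time, which is a hint of what a file-based backup would face.

```python
import os
import sys
from typing import NamedTuple


class TreeStats(NamedTuple):
    files: int
    total_bytes: int

    @property
    def average_bytes(self) -> float:
        """Mean file size, or 0.0 for an empty tree."""
        return self.total_bytes / self.files if self.files else 0.0


def tree_stats(root: str) -> TreeStats:
    """Count the files under root and sum their sizes."""
    files = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                # lstat does not follow symlinks, so a link's target is not counted.
                info = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # the file vanished or is unreadable; skip it
            files += 1
            total += info.st_size
    return TreeStats(files, total)


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    stats = tree_stats(root)
    print(f"{stats.files} files, {stats.total_bytes} bytes, "
          f"{stats.average_bytes:.0f} bytes per file on average")
```

A very large file count combined with a small average file size suggests that per-file overhead, not raw throughput, will decide how long a file-based backup takes.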
1. Bonk bonk on the head if you missed that nerd reference.