Before you plan a virtual infrastructure and choose the appropriate data protection for it, you need to understand what that planning involves. Identifying the capabilities and limitations of data protection within your virtual infrastructure is one of the most critical tasks.
For simplicity, this article limits its virtualization platform example to VMware ESX. The process is the same for Microsoft Hyper-V, Virtual Iron and others until the final step, where you have to determine the right implementation.
With current virtualization technology, almost all applications can be virtualized. You just have to decide on a reasonable set of applications and characterize each one.
It is absolutely critical that you characterize these applications under their heaviest expected load, or you'll start running out of resources unexpectedly when you implement your virtual infrastructure. For each application, compile the following information:
Total memory footprint
How much memory does the application use at peak load? If the application "leaks" memory (its memory footprint grows even under constant load), you'll need to allow room for that as well.
Total CPU utilization
How many CPUs does the application use at peak load, and at what percentage utilization? Don't forget to note the type of CPU you used when you took your measurements.
Total disk space including growth to next budget cycle
Network bandwidth utilization
Network bandwidth used by this application at peak load. Remember to account for both directions of network traffic.
Storage network throughput (SCSI, FC, iSCSI, NAS) as both input and output
Measure this the same way you just measured your messaging network: throughput at peak load, in both directions.
Disk reads and writes
The disk activity that this application requires at peak load. There are other disk load parameters that may need to be characterized as well, depending on the application.
Memory bus utilization estimate (memory bus available bandwidth minus four times the total I/O)
Years of empirical data have upheld this useful rule of thumb. The estimate can be somewhat difficult to obtain, since it is not always easy to identify the memory bus speed of a particular system.
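The measurements above can be collected into one record per application and then totaled for the ESX server that will host them. The following Python sketch is only an illustration: the class name, field names, units (MB/s throughout) and sample figures are all assumptions, and "total I/O" is assumed here to mean messaging network plus storage network traffic in both directions. The memory bus estimate applies the rule of thumb stated above (available bandwidth minus four times the total I/O).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AppProfile:
    """Peak-load measurements for one application (hypothetical field names)."""
    name: str
    memory_gb: float        # memory footprint at peak, including any leak allowance
    cpu_count: int          # number of CPUs used at peak load
    cpu_percent: float      # utilization of those CPUs at peak load (0-100)
    disk_gb: float          # disk space including growth to the next budget cycle
    net_in_mbs: float       # messaging network, inbound MB/s
    net_out_mbs: float      # messaging network, outbound MB/s
    storage_in_mbs: float   # storage network, inbound MB/s
    storage_out_mbs: float  # storage network, outbound MB/s

    def total_io_mbs(self) -> float:
        # Assumption: total I/O is the sum of both networks in both directions.
        return (self.net_in_mbs + self.net_out_mbs
                + self.storage_in_mbs + self.storage_out_mbs)

    def cpu_equivalents(self) -> float:
        # Fully busy CPUs of the measured type that this load represents.
        return self.cpu_count * self.cpu_percent / 100.0


def memory_bus_headroom_mbs(bus_bandwidth_mbs: float, apps: List[AppProfile]) -> float:
    """Rule of thumb: available memory bus bandwidth minus four times total I/O."""
    total_io = sum(app.total_io_mbs() for app in apps)
    return bus_bandwidth_mbs - 4 * total_io


# Sample figures are invented for illustration only.
apps = [
    AppProfile("mail", 8.0, 2, 75.0, 500.0, 20.0, 30.0, 40.0, 60.0),
    AppProfile("web", 4.0, 4, 50.0, 100.0, 10.0, 50.0, 5.0, 15.0),
]

print("Memory (GB):", sum(a.memory_gb for a in apps))
print("CPU equivalents:", sum(a.cpu_equivalents() for a in apps))
print("Disk (GB):", sum(a.disk_gb for a in apps))
print("Memory bus headroom (MB/s):", memory_bus_headroom_mbs(6400.0, apps))
```

A negative or very small headroom figure suggests the memory bus, rather than CPU or memory capacity, may become the limit when these applications share one server.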
Is there a window during the day or night when the applications could reasonably be shut down and backed up?
Is there a window during the day or night when the total load on the ESX physical server is low enough that backups can be performed without negatively impacting the running applications? If neither the applications nor the ESX server has an available window, you will need to select a proxy backup method.
Do you need to be able to recover individual files on a regular basis? If so, you will most likely need to run a backup agent directly within a virtual machine.
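The three questions above can be read as a small decision procedure. The Python sketch below is one possible reading, not a definitive rule set: the function name, parameter names and the `IN_WINDOW` catch-all are assumptions introduced for illustration. Only the proxy and in-VM agent outcomes come directly from the text.

```python
from enum import Enum
from typing import List


class Method(Enum):
    AGENT_IN_VM = "backup agent inside the virtual machine"
    PROXY = "proxy backup method"
    # Assumed catch-all: any backup that runs on the ESX server in an available window.
    IN_WINDOW = "backup during the available window"


def candidate_methods(app_shutdown_window: bool,
                      esx_low_load_window: bool,
                      needs_file_recovery: bool) -> List[Method]:
    """Map the answers to the three requirements questions to candidate methods."""
    methods = []
    if needs_file_recovery:
        # Regular individual-file recovery most likely needs an agent in the VM.
        methods.append(Method.AGENT_IN_VM)
    if not app_shutdown_window and not esx_low_load_window:
        # No window for the applications or the ESX server: use a proxy.
        methods.append(Method.PROXY)
    else:
        methods.append(Method.IN_WINDOW)
    return methods


for answers in [(True, False, False), (False, False, True)]:
    print(answers, [m.value for m in candidate_methods(*answers)])
```

The function returns a list because the outcomes are not mutually exclusive: an application with no backup window may still need file-level recovery from an agent inside the virtual machine.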
If you've designed and implemented a few data protection architectures, this requirements gathering process is probably quite familiar to you. It doesn't change much for virtual infrastructures.
Once you understand your application and data protection requirements, there are some simple decisions to make:
Agents in each virtual machine
This is the simplest decision, since it mirrors what you are already doing with your physical infrastructure. The strengths of this approach:
There are two significant weaknesses to this approach:
Agent in Hypervisor Service Console
This is fairly simple as well. It requires only a single Red Hat Linux agent for each ESX server.