Five imperatives for extreme data protection in virtualized environments
Transforming an organization through server virtualization requires a strategic and coordinated approach. Data protection – which includes not only backup, but also secondary storage and disaster recovery considerations – is an area that can easily complicate virtualized data centers if implemented hastily. It is essential that data protection efforts reduce hardware purchases rather than require additional hardware to function. The following are five critical data protection imperatives that organizations must consider during virtual server planning.
1. Minimize impact to host systems during backups
In virtual environments, numerous virtual machines (VMs) share the resources of a single physical host. Backups – which are among the most resource-intensive operations – negatively impact the performance and response time of applications running on other VMs on the same host. On a large host with many VMs, competing backup jobs have been known to bring the host to a grinding halt, leaving critical data unprotected.
There are various approaches to minimizing the impact to host systems during backups, though each has drawbacks. The simplest is to limit the number of VMs on a given host, ensuring it never exceeds the number that can be effectively backed up. While effective, this runs counter to the purpose of virtualization, which is to consolidate applications onto the fewest possible physical servers. It would also limit the financial benefits of consolidation, perhaps significantly.
A second approach is to stagger the scheduling of VM backups. For example, if performance suffers when four backups run simultaneously, limit backups to three at a time. This can solve the performance issue, but it creates other challenges. Backup jobs can no longer be scheduled without referencing every other existing job. What if a particular job runs longer than expected and the next set of jobs starts? Suddenly, the performance limit has been exceeded. As data grows over time, jobs may take longer to run, creating backup overlap, and there is no clear way to account for full backups and incrementals in such a scheme. Even if the scheduling is worked out, the total backup window has been stretched significantly by spreading backups over time.
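One way to make staggering more robust is to cap concurrency rather than pin jobs to fixed start times, so a long-running job simply delays the next one instead of overlapping it. The Python sketch below illustrates the idea; the VM names and the backup-tool command are placeholders, not a reference to any particular product.

```python
import concurrent.futures
import subprocess

# Hypothetical list of VMs on one host and a placeholder backup command.
VMS = ["vm-app01", "vm-db01", "vm-web01", "vm-file01", "vm-mail01"]
MAX_CONCURRENT_BACKUPS = 3  # performance ceiling observed for this host

def run_backup(vm_name: str) -> int:
    """Run the (placeholder) backup command for a single VM and return its exit code."""
    result = subprocess.run(["backup-tool", "--vm", vm_name], check=False)
    return result.returncode

# A bounded worker pool enforces the concurrency cap no matter how long
# individual jobs run, avoiding the overlap problem of fixed time slots.
with concurrent.futures.ThreadPoolExecutor(max_workers=MAX_CONCURRENT_BACKUPS) as pool:
    results = dict(zip(VMS, pool.map(run_backup, VMS)))

for vm, code in results.items():
    print(f"{vm}: {'ok' if code == 0 else 'failed'}")
```

Even with a concurrency cap, however, the total backup window still grows as VMs are added, which is why scheduling alone is not a complete answer.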
An early technical attempt at solving the backup problem was the use of a proxy server. For VMware, this model is known as VMware Consolidated Backup, commonly called VCB. With VCB, a separate server is dedicated to running backups directly off the storage; the virtual machines themselves do not participate in backups. While this seemed good in theory, in practice there was still significant performance impact due to the use of VMware snapshots, and the configuration proved complex. As a result, few users adopted this model, and VMware has since dropped support for it.
In response, with vSphere 4.0 VMware released a new set of storage APIs called the vStorage APIs for Data Protection (VADP). These introduced the concept of Changed Block Tracking (CBT). Simply put, CBT tracks data changes at the block level rather than the file level, so significantly less data is moved during backup, making backups faster and more efficient. CBT goes a long way toward solving the problem of backup impact, though it still relies on VMware snapshots, which carry their own overhead, and the change tracking itself can slow virtual machine performance. CBT also requires backup software that integrates with the APIs, which can mean upgrading the backup environment or changing to a new vendor.
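To see why block-level tracking moves so much less data, consider the toy Python sketch below. It is not VMware's implementation – CBT maintains a change map inside the hypervisor rather than re-hashing the disk – but it shows how comparing fixed-size blocks isolates the small fraction of a virtual disk that actually changed between backups. The block size and sample data are illustrative assumptions.

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # illustrative block size; real change-tracking granularity differs

def block_hashes(disk: bytes) -> list[str]:
    """Fingerprint each fixed-size block of a disk image (a stand-in for a change map)."""
    return [
        hashlib.sha256(disk[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(disk), BLOCK_SIZE)
    ]

def changed_blocks(previous: list[str], current: list[str]) -> list[int]:
    """Return the indices of blocks that differ since the last backup."""
    return [i for i, (old, new) in enumerate(zip(previous, current)) if old != new]

# Example: a four-block "disk" where only block 2 changed between two backups.
base  = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"C" * BLOCK_SIZE + b"D" * BLOCK_SIZE
later = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"X" * BLOCK_SIZE + b"D" * BLOCK_SIZE

print(changed_blocks(block_hashes(base), block_hashes(later)))  # [2]
```

Only the blocks in that list need to travel over the network for the incremental backup; a file-level incremental would have to move every file touched, however small the change within it.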
A final approach is to install an efficient data protection agent on each virtual machine and then run backup jobs just as they would be run in a physical environment. Such an agent requires technology that tracks, captures, and transfers data streams at the block level without invoking VMware snapshots. With this approach, resident applications are not strained, open files are not an issue, and the file system, CPU, and other VMs are minimally impacted.
2. Reduce network traffic impact during backups to maximize backup speed
Reduction of network traffic is best achieved through very small backups, which dart across the network rapidly, eliminating network bottlenecks as the backup image travels from VM to LAN to SAN to backup target disk. Block-level incremental backups achieve this while full base backups, and even file-level incrementals, do not.
Minimal resource contention, low network traffic, and small snapshots all lead to faster backups, which deliver improved reliability (less time in the transfer process means less exposure to network problems) and allow for more frequent backups and recovery points. In a virtual environment, this also means more VMs can be backed up per server, increasing VM host density and amplifying the benefits of a virtualization investment. Technologies such as CBT and other block-level backup models are the best way to limit network impact.
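A rough back-of-envelope comparison makes the network argument concrete. The figures below – VM size, daily change rates, and link speed – are illustrative assumptions, not measurements, but they show the order-of-magnitude gap between full, file-level incremental, and block-level incremental transfers.

```python
# All figures are illustrative assumptions for a single VM.
vm_size_gb = 500
daily_file_churn = 0.05    # fraction of data in files touched since the last backup
daily_block_churn = 0.01   # fraction of blocks actually changed
link_gbps = 1.0            # effective network throughput

def transfer_hours(gb: float, gbps: float) -> float:
    """Time to move a payload of the given size over the given link."""
    return gb * 8 / gbps / 3600

for label, gb in [
    ("full backup", vm_size_gb),
    ("file-level incremental", vm_size_gb * daily_file_churn),
    ("block-level incremental", vm_size_gb * daily_block_churn),
]:
    print(f"{label:24s} {gb:6.1f} GB  ~{transfer_hours(gb, link_gbps):.2f} h")
```

Under these assumptions the block-level incremental moves 5 GB in well under a minute of wire time, while the full backup ties up the link for more than an hour – the difference between a backup that fits comfortably inside the window and one that competes with production traffic.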
3. Focus on simplicity and speed for recovery
Numerous user implementations have revealed that server virtualization introduces new recovery challenges. Recovery complications arise when backups are performed at the physical VM host level (complicating and prolonging granular restores) or through a proxy (necessitating multi-step recovery).
It is important to consider the availability of a searchable backup catalog when evaluating VM backup tools. Users of traditional, file-based backup often assume that the searchable catalog they are used to is available in any backup tool. But with VMs this is not always the case. Systems that do full VM image backups or use snapshot-based backups often are not able to catalog the data, meaning there is no easy way to find a file. Some provide partial insight, allowing users to manually browse a directory tree, but not allowing a search.
It is also important to understand how the tool handles file history. A common recovery use case is retrieving a file that has been corrupted when the exact time of corruption is not known, which requires examining several versions of the file. A well-designed recovery tool will accept both a file name and a date range and return every instance of the file held in the backup repository. While this may seem a minor point, it can make the difference between an easy five-minute recovery and a frustrating hour or two spent hunting for files.
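The sketch below shows what such a name-plus-date-range lookup might look like against a simple in-memory catalog. The catalog structure, file paths, and dates are hypothetical; a real product would back the catalog with an indexed database, but the query shape is the same.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class CatalogEntry:
    path: str             # file path as recorded at backup time
    backup_time: datetime
    backup_id: str        # identifier of the backup job that captured this version

# Hypothetical catalog entries for three nightly backups of the same file.
CATALOG = [
    CatalogEntry("/finance/q3-report.xlsx", datetime(2011, 9, 1, 22, 0), "inc-0901"),
    CatalogEntry("/finance/q3-report.xlsx", datetime(2011, 9, 8, 22, 0), "inc-0908"),
    CatalogEntry("/finance/q3-report.xlsx", datetime(2011, 9, 15, 22, 0), "inc-0915"),
]

def find_versions(name: str, start: datetime, end: datetime) -> list[CatalogEntry]:
    """Return every catalogued version of a file within a date range, newest first."""
    hits = [e for e in CATALOG if e.path.endswith(name) and start <= e.backup_time <= end]
    return sorted(hits, key=lambda e: e.backup_time, reverse=True)

# Corruption-recovery workflow: list every candidate version, then restore the right one.
for entry in find_versions("q3-report.xlsx", datetime(2011, 9, 1), datetime(2011, 9, 30)):
    print(entry.backup_time, entry.backup_id, entry.path)
```

Without a searchable catalog, the same task means mounting or browsing each backup image in turn until the right version turns up.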
Fast and simple recovery, at either a granular or virtual machine level, can be achieved if point-in-time server backup images on the target disks are always fully “hydrated” and ready to be used for multiple purposes. In fact, with a data protection model that follows this practice, immediate recovery to a virtual machine, cloning to a virtual machine, and even quick migration from a physical to a virtual machine are all done the same way – by simply transferring a server backup image onto a physical VM host server.
4. Minimize secondary storage requirements
Traditional backup results in multiple copies of the entire IT environment on secondary storage. Explosive data growth has made those copies larger than ever, and the need for extreme backup performance to accommodate more data has driven the move from tape backup to more expensive disk backup. The result is that secondary disk data reduction has become an unavoidable necessity.
Deduplication of redundant data can be performed at the source or at the target, and in isolation each approach has drawbacks. Each new data stream must be compared with an ever-growing history of previously stored data. Source-side deduplication can impact performance on backup clients because of the need to scan the data for changes, though it does reduce the amount of data sent over the wire. Target-side deduplication does nothing to change the behavior of the backup client or limit the data sent, though it significantly reduces the amount of disk required. A hybrid approach that combines efficient data protection software with target-side deduplication can help organizations achieve the full benefits of enterprise deduplication without giving up the advantages of either model.
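The Python sketch below shows the core mechanic of target-side deduplication: split the incoming backup stream into chunks, fingerprint each chunk, and store only the chunks not already held. Fixed-size chunking and the sample data are simplifying assumptions; production deduplication engines typically use variable-size chunking and persistent indexes.

```python
import hashlib

CHUNK_SIZE = 128 * 1024  # fixed-size chunks for simplicity

def deduplicate(stream: bytes, store: dict[str, bytes]) -> list[str]:
    """Split a backup stream into chunks, store only unseen chunks, return the recipe."""
    recipe = []
    for i in range(0, len(stream), CHUNK_SIZE):
        chunk = stream[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:   # new data: keep the chunk itself
            store[digest] = chunk
        recipe.append(digest)     # duplicate data: keep only a reference
    return recipe

store: dict[str, bytes] = {}
night1 = b"A" * CHUNK_SIZE * 3                      # first night's backup stream
night2 = b"A" * CHUNK_SIZE * 2 + b"B" * CHUNK_SIZE  # second night: mostly unchanged

deduplicate(night1, store)
deduplicate(night2, store)
print(len(store))  # 2 unique chunks stored instead of 6 raw chunks
```

When the backup software already sends block-level incrementals, the deduplicating target mostly receives genuinely new chunks, which is the efficiency the hybrid approach is after.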
5. Strive for administrative ease-of-use
Very few users have a 100% virtualized environment. Consequently, a data protection solution that behaves the same in virtual and physical environments is desirable.
A data protection solution in which a backup agent is installed on each VM can help ease the transition from physical to virtual. Concerns about backup agents needing to be added to every new virtual machine are overstated because each VM needs to be provisioned anyway – with an operating system and other commonly deployed applications and software. New virtual machines cloned from a base system will already include the data protection agent.
When evaluating solutions, it is vital to consider the entire backup lifecycle, from end to end.
For example, if some data sets need to be archived to tape, a deduplication device may not allow easy transfer of data to archive media. This might then require an entire secondary set of backup jobs to pull data off the device and transfer it to tape, greatly increasing management overhead. This kind of “surprise” is not something organizations want to discover after they have paid for and deployed a solution.
Ease of use can also be realized with features such as unified platform support, embedded archiving, and centralized scheduling, reporting, and maintenance – all from a single pane of glass.
A holistic view of virtualization
To maximize the value of a virtualization investment, planning at all levels is required. Data protection is a key component of a comprehensive physical-to-virtual (P2V) or virtual-to-virtual (V2V) migration plan.
The five imperatives recommended here can help significantly improve organizations’ long-term ROI around performance and hardware efficiencies and accelerate the benefits of virtualization. To complete this holistic vision, organizations must demand easy-to-use data protection solutions that rate highly on all five imperatives. Decision makers who follow these best practices may avoid the common data protection pitfalls that plague many server virtualization initiatives.