Increasingly long backup windows are an issue for most IT departments, and engineers have resorted to a number of different tactics and storage architectures to solve it. In recent years, one of the more popular methods has been disk-based backup that leverages deduplication.
While disk-based deduplication can improve the per-GB cost of backup storage, it also tends to introduce latency and extend backup processes.
When selecting a disk-based backup solution, you want to avoid just going with the default vendor. Do your homework and take the time to really understand the hardware, because the architecture you choose will affect your backup window.
Disk + Tape
Many companies use disk-to-disk-to-tape backup solutions, commonly known as D2D2T, to shrink backup windows and enable faster copies to tape media. Spinning disk can handle multiple simultaneous backup streams from backup applications for rapid backups, and can stream data to tape drives efficiently, allowing for fast copies to tape.
But D2D2T presents its own challenges. The first is that as backup data grows, the front-end disk storage repository has to scale along with it. For example, a 20TB environment needs 40TB of disk capacity to hold one week’s worth of backup data (one full and four daily incremental copies).
To store an additional week’s worth of data, capacity would have to double to 80TB. For this reason, most organizations can only afford to keep one to two weeks’ worth of backups on disk.
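The sizing arithmetic above can be captured in a quick helper. This is a minimal sketch, not a sizing tool: the 2x-per-week multiplier assumes one full plus four daily incrementals totaling roughly a second full, as in the example, and should be tuned to your actual change rate.

```python
def d2d2t_capacity_tb(env_tb: float, weeks: int, weekly_multiplier: float = 2.0) -> float:
    """Front-end disk capacity needed for a D2D2T repository.

    weekly_multiplier is the ratio of one week's retained backup data to
    the protected environment size (2.0 here, per the article's example
    of one full plus four daily incrementals for a 20TB environment).
    """
    return env_tb * weekly_multiplier * weeks

# A 20TB environment: one week needs 40TB of disk, two weeks need 80TB.
print(d2d2t_capacity_tb(20, 1))  # 40.0
print(d2d2t_capacity_tb(20, 2))  # 80.0
```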
The second challenge is that D2D2T infrastructure relies heavily on tape. To satisfy DR off-siting requirements, data must be exported to tape nightly and transported to a separate location. And replicating uncompressed data over the WAN isn’t practical, due to the amount of bandwidth that would be required.
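A back-of-the-envelope transfer-time calculation makes the bandwidth problem concrete. The link speed below is a hypothetical figure, and the sketch assumes the link runs at full, uncontended utilization:

```python
def wan_transfer_hours(data_tb: float, link_mbps: float) -> float:
    """Hours to move a backup set over a WAN link at 100% utilization.

    Uses decimal units: 1TB = 10^12 bytes, 1Mbps = 10^6 bits/sec.
    """
    bits = data_tb * 1e12 * 8
    return bits / (link_mbps * 1e6) / 3600

# Pushing a 40TB weekly backup set over a dedicated 100Mbps link:
print(round(wan_transfer_hours(40, 100)))  # 889 hours, i.e. over a month
```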
The Drawbacks of Dedupe
The challenges of D2D2T have sent IT planners in search of an alternative, and disk-based deduplication is one solution. With dedupe, multiple weeks’ worth of backup data can be stored on disk, since only unique data is written; blocks that repeat across successive backups are stored once.
Another benefit is that backup data can be efficiently replicated offsite over WAN links for DR and archiving purposes.
Using deduplication, many organizations have been able to reduce their reliance on tape. But most organizations never get rid of tape entirely, and continue to make tape-based copies on a monthly or quarterly basis for an extra layer of protection.
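The retention benefit can be sketched with the same sizing arithmetic used earlier. The 10:1 dedupe ratio below is a hypothetical figure for illustration; real ratios depend heavily on data type and change rate.

```python
def weeks_retained_on_disk(disk_tb: float, env_tb: float,
                           dedupe_ratio: float,
                           weekly_multiplier: float = 2.0) -> float:
    """Approximate weeks of backups a deduplicated repository can hold.

    dedupe_ratio is logical-to-physical (e.g. 10.0 for 10:1); the 2x
    weekly multiplier matches the earlier D2D2T sizing example.
    """
    logical_tb_per_week = env_tb * weekly_multiplier
    return disk_tb * dedupe_ratio / logical_tb_per_week

# The same 40TB that held one raw week holds ~10 weeks at a 10:1 ratio.
print(weeks_retained_on_disk(40, 20, 10.0))  # 10.0
```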
But the deduplication process itself can have unforeseen consequences, such as increased backup and recovery windows. The added latency of deduplicating data in CPU and memory before it is written to disk can stretch backup windows past your organization’s recovery time objectives, as discussed in the post, “How to Speed Up Recovery Times From Deduped Storage.”
Dedupe and Latency
To compensate for the latency introduced by deduplication, some disk backup appliances over-provision CPU and memory resources. But throwing hardware at the issue only inflates costs and reduces ROI. This is especially true in “scale-up” appliances that come equipped with maximum CPU and memory resources right out of the box.
In scale-up architectures, CPU, memory and disk are all housed within a single chassis. With these systems, I/O performance follows a bell curve: storage performance increases as more disk drives are added to the array, then plateaus and eventually declines as the array approaches maximum capacity.
Performance does not increase along with data and deduplication loads, resulting in lengthened backup windows. And the only way to reduce backup times is to replace the front-end controller with a larger, more powerful system.
This need to continually refresh with newer models places an increased burden on IT staff to plan for and manage upgrades and increases TCO, especially compared to alternatives, like scale-out systems.
Scale-out backup architectures solve some of the biggest drawbacks of scale-up backup appliances. Scale-out systems consist of independent nodes, each with its own storage, CPU, bandwidth, and memory, that are integrated together via software.
With scale-out systems, there is no need to over-provision CPU and memory resources on the initial deployment. You can start small, and then gradually scale out over time to meet growing capacity requirements.
One of the benefits to this approach is a simplified upgrade process. Generally, older appliances can be intermixed with newer systems, and there’s never a need to perform a forklift upgrade.
Another benefit is that as data grows, processor, memory, and bandwidth resources are also added, and backup windows stay fixed. This is a key advantage for organizations struggling with window creep.
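The “windows stay fixed” claim follows from simple arithmetic: if aggregate throughput grows linearly with node count while the data grows at the same rate, the ratio stays constant. A minimal sketch, where the per-node throughput figure is hypothetical:

```python
def backup_window_hours(data_tb: float, tb_per_hour_per_node: float,
                        nodes: int) -> float:
    """Backup window when aggregate throughput scales linearly with nodes.

    A scale-up array is the nodes=1 case: its window grows with the data.
    """
    return data_tb / (tb_per_hour_per_node * nodes)

# Data doubles from 40TB to 80TB. Adding a second node holds the window
# at 8 hours, while the single-controller system's window doubles.
print(backup_window_hours(40, 5, 1))  # 8.0
print(backup_window_hours(80, 5, 2))  # 8.0
print(backup_window_hours(80, 5, 1))  # 16.0
```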
With scale-out systems, full server appliances can be added into a scalable grid, where all onsite and offsite appliances are managed through a single user interface and data is load-balanced across all appliance nodes. Appliances come in a range of sizes, letting you add just the amount of compute and capacity you need.
Want to Know More?
There are several different options for backup strategy, and the best one for your company depends on multiple factors. But before you commit to any one platform, you need to know what you’re getting into and what kind of drawbacks you can expect.
If you’re in the process of weighing your backup options and could use an expert’s opinion, reach out to one of our storage and support specialists at 1.877.227.0828. As the World’s #1 Reseller of Certified Pre-Owned Storage Hardware & Support, we can help.