Executive Summary: Microsoft System Center Data Protection Manager (DPM) 2007 uses Microsoft Volume Shadow Copy Service (VSS) to provide in-depth protection for Microsoft SQL Server, Microsoft Exchange Server, SharePoint 2007, and Microsoft Virtual Server 2005 R2 SP1. DPM protects three types of information: transactional applications, nontransactional applications, and regular files; the type of data dictates how you set up DPM protection. DPM 2007 lets you enable end-user restores and perform bare-metal restores.
After spending some time with System Center Data Protection Manager (DPM) 2007, I’m sure you’ll see it’s a fantastic data protection solution for your Microsoft platform. It has a lot of clever stuff going on under the covers. In “DPM 2007: Set It Up and Get Started,” I gave an overview of DPM 2007 and described its requirements and how to get it up, running, and protecting data. In this article, I describe in more detail how DPM really protects your data and show you how to use DPM to restore protected data—both for administrators and end users.
Under the Covers
As you’ll see frequently in Microsoft presentations and white papers on DPM, Microsoft Volume Shadow Copy Service (VSS) is the “secret sauce” that makes DPM tick. It’s through VSS that DPM provides in-depth protection for Microsoft SQL Server, Microsoft Exchange Server, SharePoint 2007, and Microsoft Virtual Server 2005 R2 SP1. Let’s take a look at exactly how VSS is used to protect and store data in DPM.
The DPM agent injects an application-aware block-level filter on protected systems. The important part here is that DPM watches blocks of the file system. But rather than watching all blocks on a disk, the filter is application aware: It monitors only blocks that contain data for protected files, a process that leads to minimal protection overhead. This list of protected files is fluid and changes as the services write to different files. At all times, however, DPM watches only the protected blocks of files, no matter where those blocks currently are. Essentially, DPM lays down a bitmask over the disks it’s protecting; when a block changes, DPM flips a bit in the mask to signify the block has changed, and on the next express full backup, the content of the block is sent to the DPM server.
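To make the bitmask idea concrete, here is a minimal sketch in Python. It is a toy model, not DPM's actual filter: the ChangeTracker class, its method names, and the dictionary that stands in for a disk are all hypothetical, and block contents are just short byte strings.

```python
class ChangeTracker:
    """Toy model of a changed-block bitmask laid over protected blocks."""

    def __init__(self, protected_blocks):
        # One bit per block; a set bit means "this block holds protected data".
        self.protected_mask = 0
        for block_no in protected_blocks:
            self.protected_mask |= 1 << block_no
        # One bit per block; a set bit means "changed since the last backup".
        self.dirty_mask = 0

    def on_write(self, block_no):
        # Writes to unprotected blocks are ignored, which keeps overhead low.
        bit = 1 << block_no
        if self.protected_mask & bit:
            self.dirty_mask |= bit

    def express_full(self, disk):
        # Collect only the changed protected blocks, then clear the mask.
        changed = {
            block_no: data
            for block_no, data in disk.items()
            if (self.dirty_mask >> block_no) & 1
        }
        self.dirty_mask = 0
        return changed


# Hypothetical four-block disk; blocks 1 and 2 hold protected data.
disk = {0: b"os", 1: b"db-a", 2: b"db-b", 3: b"tmp"}
tracker = ChangeTracker(protected_blocks=[1, 2])
replica = {1: disk[1], 2: disk[2]}  # initial full copy on the "DPM server"

disk[2] = b"db-b2"
tracker.on_write(2)  # protected block: its bit is flipped
disk[3] = b"tmp2"
tracker.on_write(3)  # unprotected block: nothing is recorded

sent = tracker.express_full(disk)
replica.update(sent)
```

In this sketch only block 2 crosses the wire, yet the replica ends up holding a complete, current copy of the protected blocks, which is the point of an express full backup.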
Remember that with express full backups, you get a full backup on the DPM server but copy only the information that has changed on the protected client since the last backup. Figure 1 shows the express full backup process. In step 1 in the figure, the red blocks contain data protected by DPM that has already been fully copied to the DPM server. In step 2, the yellow blocks indicate that data on those blocks has changed. The changed blocks that DPM protects are sent to the DPM server and stored on the DPM replica. VSS is critical to this process because DPM calls the VSS writers of protected clients (for example, the OS file storage VSS writer or the Exchange VSS writer), and these VSS writers make sure all the data on disk is in a consistent state and stays in a consistent state while DPM copies the changed blocks to the DPM server.
Does this process mean that while the data is copied to the DPM server, the data on disk can’t be modified and your applications won’t function? No. When DPM requests a VSS snapshot, the VSS writer ensures the data on disk is consistent, then goes into copy-on-write mode until the snapshot is complete. In copy-on-write mode, the VSS writer monitors the data on disk. If a block that’s part of the snapshot needs to be changed during the snapshot process, the current content of the block is first copied to another location on disk and the snapshot map is updated to point to the new location for that block. Thereafter, the data on the block in its original location can be changed without affecting the snapshot.
Obviously, copy-on-write mode causes a performance drop, but this mode is used only while the data is copied to DPM. When the copy is complete, the copy-on-write is stopped, so this performance drop should be minimal. The copy-on-write is much faster than copying all the blocks to another location: Only blocks that change during the creation of the snapshot are moved. It also uses much less space on disk than a full copy.
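The copy-on-write behavior described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not the VSS implementation: the Volume class and its methods are hypothetical, and a Python list stands in for the blocks on disk.

```python
class Volume:
    """Toy volume that supports one copy-on-write snapshot at a time."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshot = None  # block number -> preserved original content

    def create_snapshot(self):
        # Nothing is copied up front; the snapshot starts out empty.
        self.snapshot = {}

    def write(self, block_no, data):
        # On the first write to a block during a snapshot, save the old
        # content so the snapshot view stays unchanged.
        if self.snapshot is not None and block_no not in self.snapshot:
            self.snapshot[block_no] = self.blocks[block_no]
        self.blocks[block_no] = data

    def read_snapshot(self, block_no):
        # Preserved blocks come from the saved copies, the rest from disk.
        if block_no in self.snapshot:
            return self.snapshot[block_no]
        return self.blocks[block_no]

    def delete_snapshot(self):
        # Once the copy to the backup server is done, stop copy-on-write.
        self.snapshot = None


vol = Volume([b"A", b"B", b"C"])
vol.create_snapshot()
vol.write(1, b"B2")  # old content of block 1 is preserved first
vol.write(1, b"B3")  # already preserved, so no second copy is made

snapshot_view = [vol.read_snapshot(i) for i in range(3)]
live_view = list(vol.blocks)
preserved = len(vol.snapshot)
vol.delete_snapshot()
```

Only one block is copied aside even though the volume has three blocks and block 1 was written twice, which is why copy-on-write costs far less time and disk space than a full copy.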
Protecting Data
But wait—there’s more! DPM protects three types of information: transactional applications such as Exchange and SQL Server, nontransactional applications such as SharePoint and Virtual Server, and just plain files such as, well, a file. The type of data dictates how you set up DPM protection.
When you’re protecting files with DPM, you configure how often to synchronize changes and at what times to create file restore points, which are the specific point-in-time views from which end users can restore files. You might synchronize every 30 minutes, for instance, but create restore points only at 8:00 A.M., 12:00 P.M., and 6:00 P.M. Users will see and be able to select only from the restore points instead of seeing every 30-minute synchronization as a possible restore time. Establishing restore points limits the number of possible recovery times. This limit is necessary because the previous versions client, the software that runs on client machines to show point-in-time copies of data, can see only 64 recovery points. Therefore, DPM also limits protected files to 64 recovery points. The DPM server still has the latest content from the synchronization schedule, so in a disaster you should be able to restore to within 30 minutes of the failure (or whatever your synchronization interval is set to).
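As a rough worked example of that schedule, the Python sketch below lists one day of 30-minute synchronizations and picks out the three restore points. The date is arbitrary, the variable names are mine, and the last line assumes that each scheduled restore point consumes exactly one of the 64 available recovery points, which is my simplification rather than a documented DPM rule.

```python
from datetime import datetime, timedelta

SYNC_INTERVAL = timedelta(minutes=30)
RESTORE_POINT_HOURS = [8, 12, 18]  # 8:00 A.M., 12:00 P.M., 6:00 P.M.
MAX_RECOVERY_POINTS = 64           # limit of the previous versions client


def sync_times(day_start):
    """Yield every synchronization time in the 24 hours after day_start."""
    t = day_start
    end = day_start + timedelta(days=1)
    while t < end:
        yield t
        t += SYNC_INTERVAL


day = datetime(2008, 1, 7)  # arbitrary example date
syncs = list(sync_times(day))

# Users see only the synchronizations that are also restore points.
restore_points = [
    t for t in syncs if t.minute == 0 and t.hour in RESTORE_POINT_HOURS
]

# Worst-case loss in a disaster is one synchronization interval.
worst_case_loss = SYNC_INTERVAL

# Full days of history that fit in the 64-point limit at 3 points per day.
days_covered = MAX_RECOVERY_POINTS // len(RESTORE_POINT_HOURS)
```

With these numbers, 48 synchronizations a day collapse to 3 user-visible restore points, and 64 recovery points stretch across 21 full days of history.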
You use the same approach for nontransactional applications; because they don’t have transaction logs, you would typically perform multiple express full backups during the day. For example, if you set DPM to perform an express full backup every two hours, you would never lose more than two hours of data. And remember, you can have as many as 512 express full backups, with each of those 512 as a possible point to recover to.
For transactional applications, you still perform express full backups; however, generally you perform only one a day. In addition, at a specified interval (for example, every 15 minutes), DPM pulls and stores the transaction logs of the application. In the event of a recovery, DPM restores the last express full backup, then applies all the transaction logs created since that backup was made. If only the database disk was corrupted and the transaction logs were on a separate disk on the live machine, DPM would also play back any transaction logs still on the live server that had not yet made it to DPM. This process means zero data loss for your transactional applications.
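The recovery sequence just described (restore the last express full backup, then replay the later transaction logs in order) can be sketched as follows. This is a simplified model, not DPM's code: the recover function is hypothetical, the database is a plain dictionary, each log entry is a key/value assignment, and times are minutes after midnight.

```python
def recover(express_fulls, log_batches, target_time):
    """Rebuild state as of target_time from backups plus transaction logs.

    express_fulls: list of (time, state_dict) tuples
    log_batches:   list of (time, [(key, value), ...]) tuples
    """
    # Step 1: restore the most recent express full backup at or before
    # the requested point in time.
    candidates = [b for b in express_fulls if b[0] <= target_time]
    base_time, base_state = max(candidates, key=lambda b: b[0])
    state = dict(base_state)

    # Step 2: replay, oldest first, every log batch taken after that
    # backup and no later than the requested point in time.
    for batch_time, entries in sorted(log_batches, key=lambda b: b[0]):
        if base_time < batch_time <= target_time:
            for key, value in entries:
                state[key] = value
    return state


# One express full backup at midnight, logs pulled every 15 minutes.
express_fulls = [(0, {"a": 1})]
log_batches = [
    (15, [("b", 2)]),
    (30, [("a", 3)]),
    (45, [("c", 4)]),
]

at_30 = recover(express_fulls, log_batches, target_time=30)
at_45 = recover(express_fulls, log_batches, target_time=45)
```

Each log pull adds another point in time you can recover to without the cost of another express full backup.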
You need to be careful that you don’t have anything else running on the transactional application servers that might interfere with the transaction logs. If DPM sees something that could truncate transaction logs, such as another backup solution or a log shipping or mirroring technology, the transaction log pull won’t be available. It also won’t be available if your configuration doesn’t keep transaction logs that DPM can collect, such as a SQL Server database that uses the simple recovery model.
Restoring Data
So DPM performs express full backups at certain intervals, and for transactional applications it also pulls the transaction logs more frequently. DPM uses VSS to manage the various previous versions of your data. What does this level of protection mean when you need to restore data, and what can you restore?
As Figure 2 shows, when you need to restore information from a transactional application, you can restore from the times you’ve performed an express full backup but also from any time that you’ve collected transaction logs, which could be as often as every 15 minutes. For nontransactional applications, you can restore from the times you performed express full backups, and for file resources, you can restore from the file recovery points specified plus the latest synchronized version, which is most likely more recent than the last recovery point defined.
So, that tells you what points in time you can recover, but what data can you recover and where to? Although you can set protection only at high-level containers with DPM—for example, you can protect an Exchange storage group but not an individual database or mailbox within a storage group—when you restore, you select from those smaller units within the high-level protection, as you can see in Figure 2. Because the entire storage group was captured, you can select to restore everything in the storage group, only specific databases in the storage group, or specific mailboxes in a database; DPM gives you very granular restoration capabilities. With SharePoint you can restore individual sites and even pages; with SQL Server, you can restore at the database level; with Virtual Server, you can restore a specific virtual machine (VM), and with file-based protection, you can recover all the way down to an individual file. When you restore data, only the necessary blocks are sent back, not a full snapshot, which would waste bandwidth and slow down the restore process.