A few days ago my wife asked me to look at her computer. Programs were taking ages to respond even to a simple mouse click, or just crashed. And the system box kept making odd clicking noises.
I confirmed her diagnosis of a failing hard drive and shut the computer down. The following lunchtime I drove to the nearest PC World and bought a replacement hard drive. I normally buy mail-order, but we wanted the computer back up as soon as possible. That evening I removed the faulty drive and installed the new one, and I also stuck in some extra RAM while I had the case open. Then it was time to find out if my rather sketchy disaster recovery plan was going to work.
I run Fedora on my own box, and both computers are backed up using Bacula. This is designed for backing up lots of computers to tape drives, but it also supports backup to disk drives. I have two USB drives, keep one plugged in, and swap them every month or so. I use the default Bacula backup schedule of a weekly full dump on both machines and nightly incremental dumps. Most of the Bacula components only run on Unix, but there is also a Windows client installed on my wife's computer (which runs Windows XP SP2). Bacula configuration is rather hairy, but if you have more than one computer it's well worth the effort.
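To illustrate why a schedule of full plus incremental dumps is enough to rebuild a disk, here is a rough sketch of how a restore chain is chosen: the most recent full backup, followed by every incremental taken after it. This is not Bacula code (Bacula's restore command should work the chain out for you); the `Backup` class, the `restore_chain` function and the dates are all made up for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Backup:
    day: date
    level: str  # "Full" or "Incremental" (hypothetical labels for this sketch)


def restore_chain(backups, target):
    """Return the backups needed to rebuild the state as of `target`, oldest first."""
    # Only backups taken on or before the target date can be used.
    usable = sorted((b for b in backups if b.day <= target), key=lambda b: b.day)
    # Locate the most recent full backup; without one nothing can be restored.
    full_index = None
    for i, backup in enumerate(usable):
        if backup.level == "Full":
            full_index = i
    if full_index is None:
        return []
    # The chain is that full backup plus every incremental taken after it.
    return usable[full_index:]


if __name__ == "__main__":
    # Made-up history: a full dump on Sunday, incrementals on the following nights.
    history = [
        Backup(date(2008, 1, 6), "Full"),
        Backup(date(2008, 1, 7), "Incremental"),
        Backup(date(2008, 1, 8), "Incremental"),
        Backup(date(2008, 1, 13), "Full"),
        Backup(date(2008, 1, 14), "Incremental"),
    ]
    for backup in restore_chain(history, date(2008, 1, 8)):
        print(backup.day, backup.level)
```

The practical consequence is that a disk failure late in the week needs every incremental since the last full dump to be readable, which is one reason to keep the full dumps reasonably frequent.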
The big headache for restoring Windows is the Registry. After previous bad experiences with this horrible blob of data I had made sure the registry was backed up by using regedit to dump the registry contents to a file before each Windows backup, so I hoped it would be OK. There are also problems with other files in C:\Windows that are constantly in use and therefore unwritable during system restore.
The Bacula manual pointed me at Bart PE Builder, which generates a Live CD version of Windows XP. It suggested that you might be able to run Bacula in this environment to get around these problems. Bacula ran fine, but I couldn't get the drivers for the network card to install, so that wouldn't work.
Instead I re-installed Windows XP SP1 from the original CD and ran Bacula there. The restore duly dumped copies of C: and D: in C:/temp/restore/c and /d. (The computer has two partitions because it used to have two drives. My wife got used to having C: and D: drives and various programs had been configured to look for data on D:, so I kept the layout even when it went back to a single drive.) Then I booted under the Bart PE disk, deleted the existing contents of C: (except for /temp) and copied the restored contents into the root directories. Then for the big test: would it reboot?
Well, sort of. The login screen came up, but when I tried to log in the computer hung. I tried rebooting in safe mode, and then in "safe mode with console", which seemed to be the bare minimum. That at least got me a command line, so I ran regedit and imported the registry copy that had been created by the last backup. This warned me that some registry keys could not be modified and claimed that the changes had failed. I rebooted again, and this time found that I could log in, but pretty much all the settings had been forgotten. Office wanted to re-install itself, and every time I started Word it asked for my name and initials twice. This suggested that important parts of the registry were not only still not recovered, but also unwritable.
Windows also decided that it was running on new hardware, and badgered me to activate it. So I did. I then tried logging in as a different user and restoring the registry again in the hope that being a different user would lock different bits of registry. This merely overwrote the new activation data and Windows now point blank refused to let me log in at all until I activated it again. So I tried. This time it told me that I had exceeded my activation limit and would have to phone up for activation. So I did. Windows gave me a 36 digit number and a robot on the other end of the phone line told me to type in the number. Then a polite gentleman named "Fred" with an Indian accent asked me how many computers I had Windows installed on and why I needed to activate it. Then he gave me another 36 digit number to type into Windows to activate it. This worked. But when I logged in Windows wasn't behaving any better.
A bit of Googling reminded me of something I should have remembered much earlier: Windows occasionally checkpoints its critical state, including the Registry, and you can wind it back to a previous known good state using the System Restore function. So I located a checkpoint from before all the trouble started, restored Windows to that state and rebooted.
When I tried to log in I got immediate joy: the desktop background had been restored. This suggested that the registry was now intact. But this restore had also overwritten the registration data, and Windows once again demanded to be activated before I could log in. Back to the phone, this time to a polite woman named June with a much stronger Indian accent who asked me the same questions and gave me yet another 36 digit number to type in. This time everything worked. Disaster recovery was completed.
I'd like to thank the authors of Bacula for their excellent backup program. It saved both of us a lot of heartache. In the past I've found it too easy to neglect backups, and sometimes our computers have gone for months without being backed up. When I got it properly configured (not a trivial task) Bacula made backups automatically with minimal intervention by me (basically, swapping USB drives occasionally). That meant I had a good recent backup to work from.
I'd also like to thank Bart Lagerweij, author of Bart PE. I could probably have managed by booting Knoppix and using its NTFS driver capture facility, but having a native Windows environment made life much easier.
6 comments:
Dumping the registry with regedit does not save everything! That will not help you to do a disaster recovery.
"Dumping the registry with regedit does not save everything!"
So I have discovered. The Bacula manual suggests using the following command instead:
ntbackup backup systemstate /F c:\systemstate.bkf
Which only goes to show that I should RTFM *before* disaster strikes.
Yes, you should do a system state backup and then do a backup using Bacula. You should also enable VSS for the backup job. This allows Bacula to back up locked/open files. The combination of the system state backup and a VSS backup is all you'll need to do a full recovery.
To do the recovery I typically install Windows to the workstation, get the network card up and install the Bacula client. I then restore all files and before rebooting I use ntbackup to restore the system state. This restores the registry and also restores DLLs (which isn't possible w/ the Bacula restore due to system file protection).
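The system state step described in these comments could be scripted so that it runs just before each backup, for example from a pre-job hook such as Bacula's ClientRunBeforeJob directive (check the manual for the exact syntax). What follows is only a sketch: the function names are invented for the example, the output path is the one from the command quoted earlier, and it defaults to a dry run that prints the command rather than executing it.

```python
import subprocess
import sys


def systemstate_command(output_file=r"c:\systemstate.bkf"):
    """Build the ntbackup command line quoted in the comments above."""
    return ["ntbackup", "backup", "systemstate", "/F", output_file]


def run_systemstate_backup(output_file=r"c:\systemstate.bkf", dry_run=True):
    """Run (or, in a dry run, just print) the system state backup.

    Returns the exit code of ntbackup, or 0 when nothing was executed.
    """
    cmd = systemstate_command(output_file)
    if dry_run or sys.platform != "win32":
        # In dry-run mode, or anywhere other than Windows, only show the command.
        print(" ".join(cmd))
        return 0
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Pass the exit code back so a calling backup job can notice a failure.
    sys.exit(run_systemstate_backup(dry_run=False))
```

Returning the exit code matters here: if the system state dump fails, it is better for the backup job to report an error than to quietly save a stale .bkf file.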
This quick-and-dirty method works for me:
a) Back up C: with Bacula; the job must finish with 0 errors (use onefs=no).
b) Restore from the Bacula server (bconsole, restore, etc.), putting everything in c:/tmp/bacula-*/C.
c) Reboot with a Linux live CD, or a Slackware DVD with its nice CLI.
d) Mount the C: partition with ntfs-3g.
e) Delete everything except tmp.
f) Move tmp/bacula*/C/* into the root of C:.
Then unmount and reboot.
This works fine on Windows 7, and is OK on Windows 2003.
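Steps e) and f) of the procedure above are the risky part, so here is a hedged sketch of them in Python, assuming the Windows partition is already mounted (for instance with ntfs-3g) and the restored tree sits under its tmp directory. The function name, its parameters and the example paths are invented for illustration. It defaults to a dry run that only lists what it would do.

```python
import shutil
from pathlib import Path


def swap_in_restored_tree(root, restored, keep="tmp", dry_run=True):
    """Replace everything under `root` except `keep` with the contents of `restored`.

    Returns the planned actions as a list of (verb, path) tuples.
    """
    root, restored = Path(root), Path(restored)
    actions = []
    # Step e: delete every top-level entry except the directory holding the restore.
    for entry in sorted(root.iterdir()):
        if entry.name.lower() == keep.lower():
            continue
        actions.append(("delete", entry))
        if not dry_run:
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
    # Step f: move each restored top-level entry into the root of the partition.
    for entry in sorted(restored.iterdir()):
        if entry.name.lower() == keep.lower():
            # A restored copy of the kept directory would collide with it; leave it.
            actions.append(("skip", entry))
            continue
        actions.append(("move", entry))
        if not dry_run:
            shutil.move(str(entry), str(root / entry.name))
    return actions


if __name__ == "__main__":
    # Hypothetical mount point and restore location; adjust before any real use.
    plan = swap_in_restored_tree("/mnt/windows", "/mnt/windows/tmp/bacula-restores/C")
    for verb, path in plan:
        print(verb, path)
```

Reading through the dry-run output before passing `dry_run=False` is cheap insurance, since the delete step cannot be undone.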
I really appreciate your effort. I have always wanted to know how disaster recovery works and what its fundamentals are. Thanks for sharing valuable content.
Great read, thank you.