I recently got to write a fun piece for InfoWorld called "Stupid user tricks" about protecting your network from human error. Researching the article revealed to me how many variables folks tend to miss when running a network, as well as when planning to protect and recover that network.
I suppose some of the errors I encountered while researching the article are more surprising to us consultant types because we live and breathe best practices. We live it, we breathe it, we get to install and bill for it, and then we get to walk away and do it all someplace else. Day-to-day systems administrators live and breathe a just-get-it-done philosophy, and they can't walk away.
So in that spirit, I've condensed some of the disaster-recovery best practices into a top six list for this week's column. Make sure you've got these six points covered, and you're much more likely to survive not only stupid human tricks but any kind of network disaster curveball Lady Fate may decide to pitch your way.
1. Test your backups.
This is first because it was by far the most popular entry. Someone installs a tape drive, loads the backup software, and schedules daily, weekly, and monthly backups. Something happens a year later, and it turns out nothing's actually been running. Backups are boring, I know. Not to mention mind-crushingly tedious. But if you don't have them when you need them, you're done. So do a test backup and restore after installing a new backup system. Then -- and this is critical, not optional -- do a test restore every week. That's right: every week. Not the whole tape, just a specific subset of folders. It shouldn't take more than 15 minutes, and it can save your professional career in a crisis. Just do it.
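The weekly drill can even be scripted. Here's a minimal sketch of the idea -- restore one known subset from the latest backup and compare it against the live copy. The paths and the use of `tar` as a stand-in for your real backup software are my assumptions for illustration, not a recommendation of any particular product:

```shell
#!/bin/sh
# Weekly test-restore sketch. The tar step is a stand-in for whatever
# your actual backup software does; the point is the restore-and-compare.
set -e

SRC=$(mktemp -d)        # pretend this is the live file share
RESTORE=$(mktemp -d)    # scratch area for the test restore
mkdir -p "$SRC/finance"
echo "Q3 numbers" > "$SRC/finance/report.txt"

# "Back up" the share (your backup package replaces this step).
BACKUP=$(mktemp -u).tar
tar -cf "$BACKUP" -C "$SRC" .

# Test restore: pull back just the finance folder, not the whole tape.
tar -xf "$BACKUP" -C "$RESTORE" ./finance

# Verify byte-for-byte: any difference means the backup isn't trustworthy.
if diff -r "$SRC/finance" "$RESTORE/finance" >/dev/null; then
    echo "TEST RESTORE OK"
else
    echo "TEST RESTORE FAILED" >&2
    exit 1
fi
```

Wire something like this to a scheduler and have it e-mail you the result, and the weekly chore stops depending on anyone remembering to do it.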
2. Spend a little money on your backup software.
Don't just buy Bob's Basic Backup package because it's cheap or came with the tape drive. Spend some bucks here. Make sure the software supports dynamic backups as well as individual folder and file restores. Take a step back and think about investing in a disk-heavy server to act as a disk-based staging area between the tape drive and the network. Many of the better packages, including those from CA, IBM/Tivoli, and Veritas, can manage this NAS-type device as well as the backup itself, which means not only safer data but much faster restore times. And the cost really isn't that huge.
3. Store a weekly copy off-site.
This was the next most popular entry for the article, even though it didn't get much play in the finished piece. If you're worried about recovering data should the office building burn down, then keeping all the data in the office building isn't all that bright. Explain this to your tightfisted boss using small words. Get a safe-deposit box or a secure business storage locker and bring your tapes there. One tape a week'll do you. Likely this is a quick 30 minutes out of your day door-to-door. Look at it like this: It's less desk work.
4. Block off access to servers.
If your business runs on its server applications, they shouldn't be accessible to just anyone -- including cleaning people. Put them in a room. Get ventilation. Think about things like sprinklers (bye-bye servers) versus Halon systems (servers live), UPS protection, a building-based power generator, and maybe even a Webcam-based monitoring system, such as the NetBotz system from APC. Know that room is safe, and know what's going on inside it. Then add this new thing to the door called a lock. Make sure only you, the IT staff, and a responsible member or two of the executive team have the clearance to open it. If the cleaning people need to get in there, open the door for them and show them where they can plug in their gear.
5. Map out a plan for what happens if the office building burns down.
Worst-case scenario day. There are oodles of options in this department, so I'm not going to try to list them all here, but do decide whether your business can afford to shut down after one of these events or whether it needs to recover somewhere else right away -- and how quickly it needs to recover. Then figure out what it needs in order to recover, and make sure you can deliver all those requirements in time. Yes, this is a lot of work.
6. Write all of the above down and title it "Disaster Recovery Plan".
Put gold and red star stickers on the cover, then put your name, your IT staffers' names, and a few executive managers' names on it, and make sure it's distributed to everyone who needs to see it. Then show it to everyone who doesn't need to see it. Make sure the section on what employees should do if the building blows up gets to the worker bees.
My snide tone and I are making all this sound obvious, but both the article and my wide IT travels have continually shown me throngs of people who keep putting these things off or simply ignoring them altogether. Yes, getting all this done is a month or more of real work. But having it in place when Godzilla steps on your server room: priceless.