Disaster Recovery Testing: Whose house are you going to meet at when your facility experiences a disaster?

If a disaster hit while you were reading this, where would you go? A better question is, where would all of your employees go? A common suggestion is a work-from-home strategy, where employees continue their daily work activities from home. Another is to use a backup recovery site. Both can be good approaches, but they need to be tested.

Many companies think they have tested the first solution because employees work from home all the time. The problem is that in a disaster scenario, everyone must work from home or the recovery site if the company’s facility is damaged.

While some employees may work from home day to day, having the entire company log in to the company network at once puts heavy stress on Internet bandwidth, VPN concentrators, Terminal Server sessions, Citrix sessions, and so on. When every employee pushes these IT resources to their limits simultaneously, systems can slow down or stop working altogether.

Work-from-home or backup location strategies can work, but many problems will not reveal themselves until you simulate the extra workload. Load balancing the traffic across multiple sites (if you have that luxury) is one way to address these issues. Unfortunately, many companies stop short of true testing: someone from IT simply verifies that one or two sessions are working and then reports to management that the solution “works great.”
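Before running a full-scale test, a quick back-of-the-envelope check can show whether remote access capacity is even in the right range. The sketch below is only an illustration: the resource names, capacities, per-user demand and headcount are all made-up assumptions, and a paper estimate like this is no substitute for simulating the real workload.

```python
from dataclasses import dataclass


@dataclass
class RemoteAccessResource:
    name: str
    capacity: float         # max concurrent sessions or Mbps the resource supports
    demand_per_user: float  # sessions or Mbps each remote employee consumes


def find_shortfalls(resources, headcount):
    """Return {name: shortfall} for each resource that cannot serve everyone at once."""
    shortfalls = {}
    for r in resources:
        required = r.demand_per_user * headcount
        if required > r.capacity:
            shortfalls[r.name] = required - r.capacity
    return shortfalls


# Hypothetical figures for illustration only; substitute your own measurements.
resources = [
    RemoteAccessResource("VPN concentrator sessions", 250, 1),
    RemoteAccessResource("Internet bandwidth (Mbps)", 500, 1.5),
    RemoteAccessResource("Citrix sessions", 400, 1),
]

for name, gap in find_shortfalls(resources, headcount=400).items():
    print(f"{name}: short by {gap:g}")
```

With these assumed numbers, the VPN concentrator and the Internet link both fall short once all 400 employees connect, even though a one- or two-session spot check would pass.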

You should also consider the equipment and resources required at each employee’s home or at the recovery site. Do employees each have a computer that can run necessary applications? Is there enough Internet bandwidth to run the applications? What about employees who live in a rural area with only dial-up modem access? Some companies suggest that employees use their company-supplied laptops, but this won’t work if employees routinely leave the laptops at work.

Maybe you already have the data connectivity issue figured out, but what about voice communications? Customers will need to be redirected from the company’s phone system to an employee working from home or at a recovery site.

If you have a backup location, it is important to think through logistics such as parking, bathroom availability, coffee supplies, etc. One company showed up at its backup location only to find that the facility had no power. Further investigation revealed that vandals had stolen the copper wire from the utility poles as well as the air conditioning system.

The first step to identifying some of the problems explained above is testing. Once you have identified any problems, you can then address them in a timely fashion rather than under the gun in the midst of a disaster.

The process of testing also provides a learning opportunity to help your organization become smarter, leaner and more efficient. Solving problems uncovered during testing can reveal better ways to carry out day-to-day business. Put those solutions to work and you will not only have a resilient disaster plan, but may also find ways to save the company money. Testing identifies the issue and ingenuity solves it.

Finally, your company has probably spent plenty of money figuring out how to replicate data off-site. Has any money been spent to figure out how the employees will get to that data? If users can’t get to the data, why replicate it? 

Have you had any experience (good or bad) relocating employees when recovering from a disaster? What problems did you discover while testing a solution or responding to an actual disaster? Please share your stories!

By: Steve O'Neal
