Texas A&M Health Science Center Reminds Us of the Importance of Testing

Our team recently volunteered for a community-wide disaster preparedness event, Disaster Day, hosted by the Texas A&M Health Science Center (TAMHSC).

The idea for Disaster Day began with Hurricane Ike in 2008, when the hospitals in College Station, TX started to overflow. Patients were taken to the Health Science Center (HSC) to be preliminarily treated until the hospitals cleared out, giving medical, nursing and pharmaceutical students real-life experience treating victims of a disaster.

TAMHSC saw this as an invaluable opportunity for students, but couldn’t rely on a natural disaster to occur every year. And that’s how the simulated exercise appropriately dubbed Disaster Day came to be. For the event, volunteers role-play disaster victims receiving treatment for “injuries” and “medical conditions” caused by the disaster. In the HSC students’ eyes, though, the disaster, along with the injuries it has caused, is real.

We chose to participate in Disaster Day this year not only to support the school but also to reinforce one of our main company philosophies: the importance of testing and preparedness. Two weeks before the event our volunteers attended a training session where we each received case studies of the injuries we'd be portraying — heart attacks due to stress and panic after a wildfire hit our area. To adequately prepare, we studied the cases and quizzed each other on our backstories until the day of the event.

Upon arriving at Disaster Day we were sent to the gym, which was set up as an emergency overflow treatment center for victims suffering from burns, dog bites, dehydration, smoke inhalation, etc. The scene was as chaotic as a real disaster would be. A girl lying in a cot was wailing in pain from burns over her entire body. Another victim coded, so emergency personnel ran over to start reviving her. A mother dropped to the floor when doctors were unable to resuscitate her daughter.

The simulation gave students a chance to test their medical knowledge and gave professors, who are real doctors and nurses, the opportunity to assess their students’ treatment of patients.

Just as the HSC needs to make sure its students are prepared for the medical field after graduating, you need to make sure your employees are prepared for a business interruption. Regularly testing your business continuity and disaster recovery (BC/DR) plan will give employees the opportunity to familiarize themselves with every aspect of the plan so they’re prepared in the event of a disaster.

If you’re new to the DR testing scene, check out some of our posts for tips.

Four Reasons Testing Your Business Continuity Program Is Essential

By Brandon Tanner, senior manager for Rentsys Recovery Services, Inc.

One of the most important things your business can do is test its business continuity program. You might assume that because you have a written plan your business is prepared for a disaster or business interruption, but how do you really know your plan works until you test it? Below are four reasons testing is essential. 

It Identifies Interdependencies


Performing business continuity tests helps you identify interdependencies and gaps within your system databases and technology. For example, a customer we tested with was able to recover their main application and network environment, but they discovered there was a particular database the application made a call to for a subroutine. That specific database was housed in a separate environment and wasn’t being backed up. As a result, the entire system application that relied on that database wouldn’t have been able to operate during a real-world recovery scenario, which would have prevented an entire business unit from functioning. By testing, they were able to identify that interdependency ahead of time. 
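A check like the one in this example can also be run on paper before a test. The sketch below walks a dependency map and flags any component a critical application relies on, directly or indirectly, that isn't in the backup set. The component names and the inventory format are hypothetical, chosen only to mirror the scenario above; a real inventory would come from your own configuration records.

```python
from collections import deque


def find_unprotected_dependencies(dependencies, backed_up, critical_apps):
    """Map each critical app to the components it relies on that are not backed up.

    dependencies: {component: [components it calls or relies on]}
    backed_up: set of components covered by the backup/recovery plan
    """
    gaps = {}
    for app in critical_apps:
        seen = set()
        queue = deque([app])
        # Breadth-first walk so indirect dependencies are included.
        while queue:
            component = queue.popleft()
            if component in seen:
                continue
            seen.add(component)
            queue.extend(dependencies.get(component, []))
        missing = sorted(c for c in seen if c not in backed_up)
        if missing:
            gaps[app] = missing
    return gaps


# Hypothetical inventory: the billing database calls a pricing database
# that lives in a separate environment and was never added to the backups.
dependencies = {
    "billing_app": ["billing_db", "network"],
    "billing_db": ["pricing_db"],
}
backed_up = {"billing_app", "billing_db", "network"}

gaps = find_unprotected_dependencies(dependencies, backed_up, ["billing_app"])
print(gaps)
```

A map like this only finds the dependencies someone has written down, which is why an actual recovery test is still needed to surface the ones nobody knew about.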

It Reveals Differences Between Production and Recovered Environments


Differences between your current production environment and recovered environment could cripple your employees' productivity. People are used to using an application a certain way and for a specific purpose on a day-to-day basis. If an application isn't configured to allow users to perform the desired functions, the application will become essentially useless to your employees. Testing will reveal any configuration changes you need to make. 
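One way to spot these differences during a test is to compare exported settings from both environments. The sketch below assumes each environment's settings can be exported as simple key-value pairs; the setting names and values are made up for illustration.

```python
ABSENT = "<absent>"  # marker for a setting that exists on only one side


def diff_settings(production, recovered):
    """Return {setting: (production_value, recovered_value)} for every mismatch."""
    differences = {}
    for key in sorted(set(production) | set(recovered)):
        prod_value = production.get(key, ABSENT)
        rec_value = recovered.get(key, ABSENT)
        if prod_value != rec_value:
            differences[key] = (prod_value, rec_value)
    return differences


# Hypothetical exports from each environment.
production = {"app_version": "4.2", "report_export": "enabled", "max_sessions": 200}
recovered = {"app_version": "4.2", "report_export": "disabled"}

for setting, (prod_value, rec_value) in diff_settings(production, recovered).items():
    print(f"{setting}: production={prod_value} recovered={rec_value}")
```

Each mismatch is a question to put to the people who use the application every day: would this difference stop you from doing your job?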

It Validates Compliance Requirements


Many businesses are required to have certain security protocols in place for compliance purposes. They also need to meet specific recovery time objectives (RTOs) driven by business objectives, regulatory requirements or both. Unfortunately, sometimes when businesses are in the middle of an event, they tend to try to recover as quickly as they can, which can open up security issues. With testing, you can assess your ability to recover within your RTOs while validating that the required security controls are in place.
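As a rough illustration, a test result can be scored on both points at once: did the recovery finish within the RTO, and were the required security controls confirmed in the recovered environment? The timestamps, the four-hour RTO and the control names below are hypothetical.

```python
from datetime import datetime, timedelta


def evaluate_test(started, restored, rto, required_controls, verified_controls):
    """Score a recovery test against its RTO and its required security controls."""
    elapsed = restored - started
    missing_controls = sorted(set(required_controls) - set(verified_controls))
    met_rto = elapsed <= rto
    return {
        "elapsed": elapsed,
        "met_rto": met_rto,
        "missing_controls": missing_controls,
        # A fast recovery that skipped a required control still fails.
        "passed": met_rto and not missing_controls,
    }


# Hypothetical test: recovered in 3.5 hours against a 4-hour RTO,
# but access logging was never switched on in the recovered environment.
result = evaluate_test(
    started=datetime(2015, 3, 16, 9, 0),
    restored=datetime(2015, 3, 16, 12, 30),
    rto=timedelta(hours=4),
    required_controls={"encryption_at_rest", "access_logging"},
    verified_controls={"encryption_at_rest"},
)
print(result)
```

In this made-up case the team beat the clock but would still fail the test, which is the kind of shortcut that tends to happen when speed is the only thing being measured.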

It Provides an Opportunity for Creating Time-Saving Documentation


It’s critical for people going through an exercise to document the issues they work through in a recovery scenario. That way, if other people are involved in a recovery situation down the road, they have documentation that can expedite recovery, rather than wasting time working out logistics that were resolved during a previous test.

If you work with a business continuity services provider, that third party can leverage documentation on your environment to speed up the recovery. After a disaster strikes, people are typically dealing with the effects of the event and making sure their families are taken care of, so key personnel are not always available to initiate the business’s recovery.

When we work with our customers during these types of events, we’re able to rely on the documentation we have to get the customer’s environment set up by the time the employees arrive on-site. In our experience, having detailed documentation can cut about six to 12 hours off the recovery process.

By proactively identifying weaknesses in your business continuity plan, you can save yourself a lot of headaches down the road.

What problems have you identified during a test? Tweet your answers to us at @RentsysRecovery and use the hashtag #TestingTimes.

Business Continuity Awareness Week: Testing Business Continuity Plans

Photo courtesy of the BCI
Today kicks off the Business Continuity Institute (BCI) Business Continuity Awareness Week. This year's theme is testing and exercising business continuity plans (BCPs), so in honor of #BCAW2015 we've put together a list of some of our top testing tips.

Conduct a Full-Scale Test Annually


One of the reasons business continuity plans fail is that staff members simply take for granted the things they need to do their jobs. To truly make sure your BCP will work as planned, it's helpful to do a complete run-through of the plan at least annually to make sure you're not forgetting anything.

Test With Critical Vendors to Make Sure Both Parties Are on the Same Page


Testing with critical vendors helps make sure you each have the proper expectations of how a recovery event will unfold. You should have requirements outlined in a service level agreement, but it's important to test them.

Involve Employees and Ask for Feedback


When testing, make sure all the people who would be involved in a recovery — whether they're new or experienced employees — participate in the test. Afterward, ask them for feedback to identify any unclear or inefficient areas in your strategy.

Learn From Your Peers


Interface with other companies in your industry to see what they've learned from their tests. Not only will you learn what BC strategies were successful for them, but you'll also learn from their mistakes.

Publicize Your Test Efforts


The purpose of testing is to identify areas of improvement in your BC strategy, of course, but it's also an opportunity for positive publicity. By publicizing your test efforts, you're letting your customers know that you're dedicated to being available to them in the event of a business interruption.

What are your testing tips? Tweet them to us at @RentsysRecovery and use the hashtag #BCAW2015.

Highlights From InformationWeek’s 2014 State of the Data Center Survey

InformationWeek recently released the results of a survey of 217 data center managers and decision makers at organizations with data centers of 1,000 square feet or larger. The findings reveal that virtualization and private cloud solutions are on the rise, and organizations want secure data storage and application infrastructure solutions that can scale to accommodate data center growth.

Below are a few highlights from the survey.

Virtualization and Cloud


  • 46 percent said 50 to 90 percent of servers will be virtualized by the end of 2015. This transition is driven by factors such as business continuity and disaster recovery, operational flexibility and agility, and high availability/service clustering.
  • 34 percent are prioritizing private cloud, and if they haven’t already implemented it, they’re on the way there.
  • 48 percent said 0 to 9 percent of new applications use public cloud services.

Data and Application Growth Management


  • The prevailing expectation (54 percent) is that demand for data center resources will grow somewhat as compared to last year.
  • When running new apps, 56 percent would add servers and virtualize rather than pursue alternatives such as building new facilities, relying on SaaS to avoid application hosting or repurposing existing hardware.
  • The top requirements for application infrastructure are reliability and availability, security and data protection, and flexibility to rapidly meet new business needs.
  • The number one reason for investing in an appliance rather than a separate server and software is that all hardware, software and support are included in a single product bundle.

Top Three Challenges to Data Center Operations


  • 10 Gbps or greater network technologies
  • Storage growth
  • Constrained budget

To access the complete survey results, visit informationweek.com.

Cloud Vaulting Doesn’t Always Equal Disaster Recovery

One of the key benefits of cloud services is that they enable faster and more cost-effective disaster recovery (DR). So once you’ve selected a cloud vaulting service and your data is tucked safely into the cloud, you can check DR off your to-do list, right? Not necessarily.

While cloud vaulting solutions can lend themselves to a DR strategy, simply sending your data to the cloud isn’t enough. There are a few components you need to look for to ensure your cloud solution has what it takes to meet your DR goals.

Complete Environment Backup


Recovery isn’t just about restoring file backups. It also requires restoring your entire IT environment. If your applications, operating systems and configurations aren’t backed up along with your data, you could be adding valuable hours to your recovery time, because you’ll need to reconfigure your servers, PCs and other hardware.

Failover


Before the time comes to recover your data, you need to be familiar with your vendor’s failover processes. Below are a few examples of questions you’ll want to ask:
  • How is failover initiated and managed?
  • How will I access the environment?
  • Will my environment be restored within my recovery time objectives (RTOs) and recovery point objectives (RPOs)?
Being able to access your mission-critical data and applications within your RTOs and RPOs is fundamental to a DR strategy. If the provider can’t meet your goals, it might be time to look at alternative backup and recovery solutions that meet your DR requirements.

Testing


Infrequent or nonexistent testing is a problem. Without testing, you can’t know for sure that you have a true DR solution. Once you have your backups configured, it’s tempting to take a set-it-and-forget-it approach to DR. But changes to your environment, file corruptions and other factors can create problems with your backups that can impede recovery. Testing before an incident will help you pinpoint and mitigate these issues.
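One small piece of such a test is confirming that a restored file is identical to the original. The sketch below compares SHA-256 checksums using Python's standard library; the file names and contents are placeholders created in a temporary folder so the example can run on its own.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large backups don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restored_copy_matches(original, restored):
    """True when the restored file is byte-for-byte identical to the original."""
    return sha256_of(original) == sha256_of(restored)


# Placeholder files standing in for a production file and two restored copies.
with tempfile.TemporaryDirectory() as workdir:
    original = Path(workdir) / "ledger.csv"
    good_restore = Path(workdir) / "ledger_restored.csv"
    bad_restore = Path(workdir) / "ledger_corrupt.csv"
    original.write_bytes(b"id,amount\n1,100\n")
    good_restore.write_bytes(b"id,amount\n1,100\n")
    bad_restore.write_bytes(b"id,amount\n1,1\x000\n")  # simulated corruption

    good_ok = restored_copy_matches(original, good_restore)
    bad_ok = restored_copy_matches(original, bad_restore)

print(good_ok, bad_ok)
```

A matching checksum only shows the file came back intact. It doesn't show that the application using it will start, so it complements a full recovery test rather than replacing one.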

To reap the benefits of cloud for DR, you need to make sure your vaulting solution has adequate recovery capabilities. Once you’ve identified a cloud recovery solution, follow these best practices for implementing it.

FFIEC Update: Ensuring Resiliency of Outsourced Technology Services

Earlier this month the Federal Financial Institutions Examination Council (FFIEC) released a new appendix to its Information Technology Examination Handbook: "Strengthening the Resilience of Outsourced Technology Services."

Outsourcing technology services often makes good business sense for financial services institutions. It allows them to benefit from outside expertise and alleviate internal workloads, increasing their professionalism and efficiency.

The FFIEC acknowledges this fact with one caveat: Your organization's management and board are still responsible for making sure "outsourced activities are conducted in a safe and sound manner." This responsibility entails making sure the third-party provider provides an adequate level of resiliency so as not to disrupt key processes in the financial services organization.

Below are a few key guidelines from the FFIEC document.

Address Risk


Because your firm is ultimately still responsible for outsourced business practices, be aware of the risk factors you face when working with a third-party technology services provider and establish controls to mitigate those risks. To assess the level of risk, perform due diligence into the provider’s business continuity program (BCP), establish clear guidelines in your contract with the provider and continually monitor the vendor’s services.

Be Aware of the Provider's Scalability


Organizations rely on technology for critical processes more than ever before. Any outage of critical technology can be detrimental to your business. For this reason, you need to be familiar with a service provider’s ability to respond to a few types of scenarios:

  • A widespread physical disaster or cyber threat in which multiple organizations are affected and need continued service.
  • An isolated incident affecting a single service provider location, which in turn affects several firms.
  • Other continuity scenarios, such as financial distress.

In each of these scenarios, assess the service provider’s ability to meet your recovery time objectives (RTOs) and recovery point objectives (RPOs). Prepare contingency plans to ensure the continuity of key applications.

Make Sure the Service Provider Has a Business Continuity Plan


A service provider needs to have identified single points of failure and created a comprehensive business continuity plan that addresses restoration of key services. Being familiar with the provisions of a service provider’s BCP will allow you to make adequate preparations in your own BCP.

Involve the Provider in Testing


Services provided by third parties should be included in regular business continuity testing, especially if the services provided are critical business functions.

The FFIEC recommends testing in conjunction with the service provider. These tests have a two-fold benefit in that they demonstrate both parties’ ability to recover within the designated time frames and to meet contractual obligations. However, some third parties service hundreds of organizations and as such might not be able to participate in one-on-one tests. In these cases, you should still ensure that you’re familiar with the provider’s testing scope, frequency and remediation activities.

Prepare for Cyber Threats


With the predominance of virtualized infrastructures, you need to adequately prepare for cyber threats. The FFIEC recommends preparing incident response strategies for the following types of threats:

  • Malware
  • Insider threats  
  • Data systems destruction and corruption
  • Communications system disruption
  • Simultaneous attacks on the firm and service provider
  • Cyber attacks

You should review incident response strategies to keep pace with the evolving threat landscape.

To read the full appendix, visit ithandbook.ffiec.gov.

[Webinar Recap] I Need A Compliant Business Continuity Strategy. Now What?

Today, organizations in regulated industries know that to remain compliant with industry and federal regulations, they need a well-rounded business continuity strategy. Unfortunately, developing a strategy can be a challenge, which is why during our webinar with DRJ earlier this week, Rentsys Senior Manager Brandon Tanner offered tips for getting started with compliance.

After the show, participants had several great questions for Brandon, so we’ve featured a few highlights below.

Q: How do I know which recovery time objectives (RTOs) and recovery point objectives (RPOs) are applicable to my organization?
A: I would start with asking, “What does our business impact analysis say today? What are our established RTOs and RPOs?”

Then I’d take a look at the regulatory bodies that are tied to your particular organization and industry and look to identify any areas where you’re told how to classify your data (for instance, critical or urgent) and given timelines associated with those. Also consult with some of your peers that may have information on that piece.

Finally you’ll want to look at service level agreements (SLAs) that your organization has tied to service delivery.

Those three things allow you to come to a reasoned framework for determining the appropriate RTOs and RPOs. If there’s a gap, you have a tool for discussing how to prioritize each of those requirements. You’ll want to meet the most aggressive requirement.
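To make "the most aggressive requirement" concrete, the sketch below takes the RTO and RPO from each of the three sources and keeps the shortest of each, along with where it came from. The hour values are invented for illustration and are not drawn from any regulation or from the webinar.

```python
from datetime import timedelta


def most_aggressive(requirements):
    """Pick the shortest RTO and shortest RPO across all sources.

    requirements: {source_name: {"rto": timedelta, "rpo": timedelta}}
    Returns {"rto": (value, source), "rpo": (value, source)}.
    """
    rto_source = min(requirements, key=lambda source: requirements[source]["rto"])
    rpo_source = min(requirements, key=lambda source: requirements[source]["rpo"])
    return {
        "rto": (requirements[rto_source]["rto"], rto_source),
        "rpo": (requirements[rpo_source]["rpo"], rpo_source),
    }


# Hypothetical figures for a single system.
requirements = {
    "business_impact_analysis": {"rto": timedelta(hours=24), "rpo": timedelta(hours=1)},
    "regulatory_guidance": {"rto": timedelta(hours=12), "rpo": timedelta(hours=4)},
    "service_level_agreement": {"rto": timedelta(hours=8), "rpo": timedelta(hours=4)},
}

targets = most_aggressive(requirements)
print(targets)
```

With these figures the SLA sets the RTO and the business impact analysis sets the RPO, showing that the two targets don't always come from the same source.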

Q: Who needs to have a SOC 2 and how is it different from a business associate agreement (BAA)?
A: Any critical vendor you’re dependent on and that is tied to your compliance requirements and service level agreements should have that SOC 2 report because you need to have visibility into what they’re doing.

A BAA is an agreement between the organizations. It does tie into HIPAA and how the data you deal with is protected, but what’s to validate that what’s in the BAA is actually happening? Now, obviously if the agreement has been signed and something does happen, there’s liability associated with it, but in a SOC 2 there’s actually validation from third parties. If you’re a healthcare organization, I’d require a SOC 2 and a BAA.

Q: What is the best approach to getting critical third-party providers to embrace BC compliance?
A: If you’ve got critical third-party vendors that are resistant to BC compliance, I would look for alternative vendors. But I would also say if you’re struggling there, it’s an executive-level decision.

If your business arrangements or compliance requirements are tied to that vendor embracing business continuity, whoever manages the business relationship should have those requirements written into the documentation. There should be a service level agreement tied to it and expectations that they will comply with those standards. The SLA needs to be tested, so the vendor needs to be able to prove to your organization that they have validated the requirements. Once you get that far, now you’re most likely talking again about the SOC 2.

To see the complete webinar, get it on demand here.