As organizations face the imminent threat of an IT service outage or cyberattack, they often fail to step back and assess how well they have planned for the crisis. According to recent research:
Perhaps the most regrettable part of it all? Almost half (45%) of these organizations already acknowledge the inadequacy of their disaster recovery capabilities. So, in this article, let’s discuss a framework and steps for creating a disaster recovery plan that sets you up for actual recovery, so you can stay resilient over the long term.
Disaster recovery planning is less about investing in cybersecurity solutions and multiple layers of cloud and on-site data center resources (though those support your overall business resilience plan). Instead, planning for disaster recovery is more about communications, governance, organizational structure and a culture of dealing with crises.
How do you ensure business continuity amid the persistent threat of disasters, whether from an external cyberattack, an IT service outage, a natural disaster or a disgruntled internal employee with access to sensitive business information?
A disaster recovery planning strategy guards against these risks as a subset of the Business Continuity Plan (BCP), in these focused stages of a disaster:
How do you plan for disaster recovery? Disaster recovery planning is about three key activities:
The goal of disaster recovery planning is to reduce business disruption when the underlying resources (computing, applications and data) are rendered unavailable. (The cause could be an unforeseen threat, or an inevitability that you can only prepare for so much.) A robust disaster recovery planning process ensures that cost-effective and practical measures are developed in anticipation of these threats, allowing the organization to recover from disasters that may take it by surprise.
(Understanding incident severity levels can help with risk prioritization.)
Here are a few important steps that you can follow for your disaster recovery planning:
The first step of an effective disaster recovery plan is to obtain strong support from all stakeholders, especially for resource investments and allocation.
Disaster recovery requires investments in technology resources and activities that do not offer an immediate ROI but are critical to reducing the opportunity cost of a downtime incident. While management is responsible for implementing and executing a disaster recovery plan, the plan's effectiveness depends on resource allocation, which requires approval from business decision makers and top management.
Establish a dedicated team that will oversee the planning, development and execution of a disaster recovery plan. This team can comprise cross-functional team members, across multiple levels of the organizational hierarchy. The goal of a planning committee is to:
Quantify the business impact of downtime incidents on different workloads and operational activities. Create a risk profile that accounts for the cost of downtime as well as the probability of the threat, resilience to the threat, available alternatives, the opportunity cost of downtime and its role in disrupting other dependent operational activities and services.
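As a rough illustration of the risk profile described above, the sketch below ranks workloads by expected annual loss, computed as downtime cost per hour times expected outage duration times the probability of an incident. The workload names, the dollar figures and the three-factor formula are all assumptions made for the example; a real risk profile would also weigh resilience, alternatives and dependencies.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # All fields are hypothetical estimates supplied by the planning team.
    name: str
    downtime_cost_per_hour: float       # estimated cost of one hour offline
    expected_outage_hours: float        # estimated duration of one incident
    annual_incident_probability: float  # chance of an incident per year, 0..1


def expected_annual_loss(w: Workload) -> float:
    """Simplified risk score: cost per hour x hours down x yearly probability."""
    return (
        w.downtime_cost_per_hour
        * w.expected_outage_hours
        * w.annual_incident_probability
    )


def rank_by_risk(workloads):
    """Return workloads sorted from highest to lowest expected annual loss."""
    return sorted(workloads, key=expected_annual_loss, reverse=True)


# Illustrative numbers only, not taken from any real assessment.
workloads = [
    Workload("checkout", 10000.0, 4.0, 0.10),
    Workload("internal-wiki", 200.0, 8.0, 0.50),
    Workload("reporting", 1500.0, 6.0, 0.20),
]

for w in rank_by_risk(workloads):
    print(f"{w.name}: {expected_annual_loss(w):,.0f}")
```

A frequent but cheap outage can rank below a rare but expensive one, which is why the score multiplies cost by probability instead of looking at either alone.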
Evaluate the cost of disaster recovery for each item; prioritize disaster recovery objectives for the most impactful operational activities and services. Some of the important metrics to consider are:
(Learn about cyber threat intelligence.)
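To make the cost-versus-impact prioritization above concrete, here is a minimal sketch that picks which services to protect under a fixed budget. It uses a simple greedy heuristic: fund the items with the highest avoided loss per dollar of recovery cost first. The item names, the figures, the budget and the greedy rule itself are assumptions for illustration; a greedy pass is not guaranteed to find the best possible combination.

```python
def prioritize(items, budget):
    """Greedy heuristic: fund the best avoided-loss-per-dollar items first.

    Each item is a dict with hypothetical keys "name", "avoided_loss"
    and "recovery_cost" (recovery_cost must be greater than zero).
    Returns the selected names and the unspent budget.
    """
    ranked = sorted(
        items,
        key=lambda i: i["avoided_loss"] / i["recovery_cost"],
        reverse=True,
    )
    selected, remaining = [], budget
    for item in ranked:
        # Skip anything that no longer fits in the remaining budget.
        if item["recovery_cost"] <= remaining:
            selected.append(item["name"])
            remaining -= item["recovery_cost"]
    return selected, remaining


# Illustrative figures only.
items = [
    {"name": "checkout", "avoided_loss": 4000, "recovery_cost": 1000},
    {"name": "reporting", "avoided_loss": 1800, "recovery_cost": 900},
    {"name": "internal-wiki", "avoided_loss": 800, "recovery_cost": 100},
]

selected, remaining = prioritize(items, budget=1500)
print(selected, remaining)
```

The ratio favors inexpensive protections with a solid payoff, so a low-cost item can be funded ahead of a higher-impact one; teams may prefer to force-include critical services before running a pass like this.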
Your disaster recovery plan can focus on a variety of recovery strategies based on the risk profile and business value. These strategies can include backup in a few areas:
If the applicable data and application backups are stored in the cloud, you may choose from a variety of storage tiers that give different levels of recovery performance and service level agreement (SLA) guarantees at different price points.
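One way to reason about the tier trade-off above is to pick the cheapest tier whose retrieval time still fits the recovery time you can tolerate. The sketch below assumes three made-up tiers with invented retrieval times and prices; actual tier names, SLAs and pricing vary by cloud provider and should be taken from the provider's documentation.

```python
from typing import NamedTuple, Optional


class Tier(NamedTuple):
    name: str
    retrieval_hours: float      # hypothetical time before data is readable
    price_per_gb_month: float   # hypothetical storage price


# Invented tiers for illustration; not any provider's real catalog.
TIERS = [
    Tier("hot", 0.0, 0.020),
    Tier("cool", 1.0, 0.010),
    Tier("archive", 12.0, 0.002),
]


def cheapest_tier(max_recovery_hours: float, tiers=TIERS) -> Optional[Tier]:
    """Return the lowest-priced tier that meets the recovery time limit.

    Returns None when no tier can restore data within the limit.
    """
    eligible = [t for t in tiers if t.retrieval_hours <= max_recovery_hours]
    if not eligible:
        return None
    return min(eligible, key=lambda t: t.price_per_gb_month)


choice = cheapest_tier(max_recovery_hours=4)
print(choice.name if choice else "no tier meets the target")
```

A workload that can wait a day for its backups lands on the archive tier, while one that must be back within minutes is limited to the hot tier at a higher price.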
To develop a practical disaster recovery plan, incentivize disaster recovery activities across all business functions and hierarchical levels. Understand each team's needs; identify their limitations, especially those pertaining to risk mitigation and recovery; and develop a governance and reporting mechanism that makes it easy to communicate and collaborate on threat risks, threat incidents and disaster recovery activities where and when needed.
Key starting points include a strong focus on eliminating silos between teams, hierarchical levels and business functions, and automating the reporting and collaboration process.
This posting does not necessarily represent Splunk's position, strategies or opinion.