One-size-fits-all? Not when it comes to disaster recovery

When it comes to Disaster Recovery (DR) and Business Continuity, everyone has their own ideas, not only about the definitions but also about how to implement a solid solution that will actually work when an organisation is on its knees. This is the key: you never choose when to have a disaster, and it can happen at any time. A Business Continuity Plan (BCP) has to take this into account, and it has to cover more than Technology. People and Process are just as important, if not more so.

In today’s DevOps world, where applications are evolving faster than ever, the processes and people managing them need to be just as flexible. An application that was in its pilot phase last week could now be in production and have gone through three development cycles. Furthermore, it may be used by hundreds or thousands of users. Was a plan put in place for how to recover this system in case of disaster? Who was consulted, and at what phase of the lifecycle? Have requirements changed since the last development cycle, or now that the application has become critical to a business function?

Planning for a DR event is made more complex when you start to think about what you are protecting your systems from. Are you limiting the scope of your solution to just natural and physical disasters? What about the increased threat of ransomware? These two types of disasters should be handled differently to ensure business impact is limited as much as possible. In terms of ransomware, traditional data replication technologies will not help you avert disaster as the problem is simply copied to secondary locations.
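The replication point above can be illustrated with a toy model. This is only a sketch: the `replicate` function, the file name and the retained `history` copy are hypothetical, and keeping a point-in-time copy is an assumption about one possible mitigation rather than something this article prescribes.

```python
def replicate(primary):
    """Mirror the primary exactly, the way plain replication does."""
    return dict(primary)

# Hypothetical data set with a single file.
primary = {"ledger.db": "clean"}

# Assumption: a point-in-time copy is retained before the attack.
history = [dict(primary)]
secondary = replicate(primary)

# Ransomware encrypts the primary, and replication copies the damage.
primary["ledger.db"] = "encrypted"
secondary = replicate(primary)

# The mirrored site is no help, but the retained copy is still usable.
restored = dict(history[-1])
print(secondary["ledger.db"])  # encrypted
print(restored["ledger.db"])   # clean
```

The secondary site faithfully mirrors the encrypted file, which is why a ransomware scenario needs its own handling in the plan.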

DR events for physical problems can also manifest themselves in different ways. For example, does an internet connectivity failure impact a system if users only access the system locally? Maybe it does, if the system relies on an external system for some functionality. Knowing what the system dependencies are is critical to a successful BCP implementation.

Once the dependencies are known, you can start to build a matrix. From here, decisions can be made about which systems are critical, which are nice to have, and which may not even be required in a DR event. This needs to be made clear in your BCP. Resources during an event are finite, and they must be working on restoring the right system at the right time. The time allowed for each restoration is known as the Recovery Time Objective (RTO): in simple terms, how long can the organisation be without the system? Within this RTO, the system needs to be restored to a functioning state that users can access. If the RTO is not achieved, what is the Maximum Tolerable Outage (MTO)? This benchmark reflects how long the business can sustain the loss of the system, and it can mean different things to different organisations. If the application is critical to the organisation, is a plan needed for if/when your DR site cannot be brought online? All of these alternate plans need to be counted as part of the MTO.
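As a rough illustration of the dependency matrix and RTO check described above, here is a minimal Python sketch. The system names, hour figures and the `restore_plan` helper are all hypothetical, and it assumes a single team restoring one system at a time, with every dependency listed as a system in its own right.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+

@dataclass
class System:
    name: str
    rto_hours: float      # how long the business can be without it
    restore_hours: float  # estimated effort to restore it
    depends_on: list[str] = field(default_factory=list)

def restore_plan(systems):
    """Return (name, finish_hour, meets_rto) tuples in a dependency-safe order."""
    by_name = {s.name: s for s in systems}
    sorter = TopologicalSorter({s.name: set(s.depends_on) for s in systems})
    sorter.prepare()  # raises CycleError on circular dependencies
    plan, ready, clock = [], [], 0.0
    while sorter.is_active():
        # Collect systems whose dependencies are already restored,
        # then work on the one with the tightest RTO first.
        ready.extend(sorter.get_ready())
        ready.sort(key=lambda n: by_name[n].rto_hours)
        name = ready.pop(0)
        clock += by_name[name].restore_hours
        plan.append((name, clock, clock <= by_name[name].rto_hours))
        sorter.done(name)
    return plan

# Hypothetical matrix for illustration only.
systems = [
    System("network", rto_hours=2, restore_hours=1),
    System("database", rto_hours=4, restore_hours=2, depends_on=["network"]),
    System("order-portal", rto_hours=4, restore_hours=1.5,
           depends_on=["database", "network"]),
    System("intranet-wiki", rto_hours=48, restore_hours=3, depends_on=["network"]),
]

plan = restore_plan(systems)
for name, finish, ok in plan:
    print(f"{name}: restored at hour {finish}, RTO {'met' if ok else 'MISSED'}")
```

In this made-up matrix the order portal would miss its RTO even with perfect prioritisation, which is exactly the kind of gap that should prompt the MTO conversation before an event rather than during one.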

Furthermore, don’t let preconceptions hold you back from exploring new ideas. Perhaps a third party might be able to help with thinking outside the box. Does your current IT provider offer a consultative service to approach these challenges? More often than not, it helps to get fresh eyes and ideas on them. The challenges that arise may also not fit your team’s core skillset, because the team is focused on business-as-usual operations. All of these factors can combine to create roadblocks to achieving a solid outcome for the organisation.

So that should be it, right? Nope, not even close. The question of how you get back to production after an event has passed is almost always overlooked by organisations while planning their BCP. Moving back to production can sometimes be even harder than failing over, because data may still be intact at the primary site. How you deal with this can determine whether you recover data that was thought to be lost in the synchronisation gap between the Primary and Secondary sites, the gap that the Recovery Point Objective (RPO) is meant to bound. Depending on how long the DR event lasted, this data could save users hundreds of hours of rework. Be sure to include getting back to production as part of your BCP. Trying to work it out after the event will just make the impact of the disaster even worse.
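To make the synchronisation gap concrete, the sketch below compares a replication gap against an RPO and lists writes that exist only at the primary site. The timestamps, record IDs and the `stranded_writes` helper are invented for illustration; a real failback process would depend on your replication tooling.

```python
from datetime import datetime, timedelta

def stranded_writes(primary_records, secondary_ids, last_sync):
    """IDs written on the primary after the last sync that never reached the secondary."""
    return sorted(
        rec_id
        for rec_id, written_at in primary_records.items()
        if written_at > last_sync and rec_id not in secondary_ids
    )

# Hypothetical timeline of a DR event.
last_sync = datetime(2024, 1, 10, 9, 0)
failure = datetime(2024, 1, 10, 9, 40)
rpo = timedelta(minutes=15)

gap = failure - last_sync
rpo_met = gap <= rpo  # a 40 minute gap against a 15 minute RPO

# Hypothetical records: ID mapped to the time it was written on the primary.
primary = {
    "inv-101": datetime(2024, 1, 10, 8, 50),
    "inv-102": datetime(2024, 1, 10, 9, 10),
    "inv-103": datetime(2024, 1, 10, 9, 35),
}
secondary = {"inv-101"}

recoverable = stranded_writes(primary, secondary, last_sync)
print(f"Gap: {gap}, RPO met: {rpo_met}")
print("Recoverable from primary on failback:", recoverable)
```

If the primary site's storage survived the event, those stranded records are the rework a planned failback could avoid.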

Have you gone through the issues discussed here? I’d love to hear about your experiences and what you did to address these and other challenges.
