The mere mention of the word “downtime” strikes fear in the hearts of many IT leaders, and opening Pandora’s Box to investigate the true costs associated with a critical infrastructure outage can be paralyzing. How do you begin? Which “costs” should be included? Can you predict what will cause a disaster? And once you’ve discovered the potential repercussions that can be caused by downtime, how do you implement an appropriate recovery plan?
When downtime is discussed, it’s usually framed in terms of monetary impact, using a generic figure that has nothing to do with any particular business. But as Gartner reports, this methodology is short-sighted and unrepresentative of the damage an outage can do to your business, no matter the duration of the episode.
“I&O leaders in IT organizations often try to leverage information from various sources about how much per hour an outage will cost. Vendors often promote vague, often unattributed, cost-of-downtime numbers for what a day, hour or minute of downtime is valued at.” Gartner research continues, “their lack of a solid foundation in meaningful metrics that the business values often results in an immediate dismissal of relevance by business leaders and a lack of funding for needed investments.”
Failure to look beyond a general estimate of the top-line financial impact an outage can have on a business will likely result in a woefully insufficient disaster recovery plan. Instead, as Gartner suggests, business leaders should expand their focus to include potential impacts to all stakeholders of the business, “no matter if they are customers, patients, citizens or the individual business operations leaders,” and consider “how those relevant outcomes are directly impacted by the loss of IT services.”
This spring’s news of recurring operational delays for JetBlue, American and Alaska Airlines flights illustrates how system failures hurt businesses beyond a loss in revenue. The computer issues, which left the airlines unable to check passengers in or issue boarding passes and left passengers unable to use self-serve kiosks, were not directly caused by the airlines themselves. Even so, their brand reputations and market share suffered when flights were delayed.
The upfront research should therefore be reframed to evaluate all the costs more thoroughly, including those associated with negative customer experiences, potential safety issues and damaged brand reputation. That evaluation is the first step in creating a comprehensive disaster recovery plan. Step two is identifying the myriad reasons your critical infrastructure could go down.
As technology advances, the list of downtime causes has grown, and businesses are struggling to keep up. Network, application and power outages, server and storage failures, usage spikes and human error have long contributed to an IT manager’s nightmares, but as ransomware attacks become more sophisticated, IT security departments are under increasing strain. And while natural disasters have always been on the list of potential causes of downtime, changing weather patterns have made tornado and hurricane seasons more difficult to predict.
Consider the airline downtime example above. Computer systems failures such as those that affected JetBlue, American and Alaska Airlines in March, April and May 2019 are damaging, but imagine if their third-party software had crashed during heavier travel periods such as the Thanksgiving and Christmas holidays. The multifaceted impact would be devastating for these companies.
So, what is an IT manager to do? Developing disaster preparedness and recovery plans for your critical infrastructure is not an easy task, but knowing that some sort of adversity is inevitable makes this planning critical to business viability. Indeed, 43% of businesses that suffer a catastrophic loss of data are forced to close immediately, while 51% are forced out of business within two years. The key to ensuring the resiliency of your critical infrastructure is widening your list of downtime costs and planning for the disasters you’ll never see coming.