By Ritu Dubey, Head of Sales and Market Development, Europe, Digitate

Never has a reliable internet connection seemed so critical. From grocery shopping to school, and from working to family gatherings, the COVID-19 pandemic has fundamentally changed our lives.

For businesses, one of the biggest challenges is the shift to homeworking at scale. Ensuring that they can continue to provide products and services and satisfy their customers against this backdrop is a struggle for many.

Systems and infrastructure that may previously have matched the shape of the company now face a whole different set of employee needs, while home internet connections and hardware are also being put to the test.

We are already seeing examples of infrastructure under strain. One leading cloud platform was hit by an outage for several hours in March, after a connectivity issue in one of its US data centres. A video calling service, which had seen a massive upsurge in usage, experienced a partial outage in April. In the US, a government loan scheme aimed at shoring up small businesses was beset with technical issues, with many banks and would-be users unable to access the online portal.

In this uncertain environment, it is natural that companies are looking to ‘make do and mend’, patching existing products and services with rapidly deployed bolt-ons to keep their business running while also improving communication and accessibility for people who work from home.

Many businesses have turned to video conferencing technology to continue their operations. Source: Shutterstock

The stakes are high

Downtime is a very visible form of failure that everyone – including customers – can see. However, as embarrassing as it is for the companies involved, it is a risk faced by every business. There are multiple potential causes – network failures, software malfunctions, usage spikes, human error and configuration error among them. 

Such high-profile outages are troubling for business leaders, who must grapple with the huge associated costs – running into the hundreds of millions – as well as the impact on the confidence of their customers. Fortune 1000 companies lose between $1.25 billion and $2.5 billion every year due to unplanned outages.

When time is money

The duration of outages, as well as their impact and cost, can vary hugely, not least because multiple parts of a business are likely to be affected simultaneously. Adding to the problem is often the size and scale of a company, which may have developed blind spots as it has grown. Evolving technology and platforms across multiple locations can cause weak points that are not immediately obvious without oversight of the entire system. With tightening operations budgets, this can be a constant challenge.

A 2014 study by Gartner estimated that the average cost of downtime is $5,600 per minute, while a more recent (2016) report by Ponemon put this cost at nearly $9,000 a minute.

This cost, of course, varies greatly depending on the size of the business affected and the sector it operates in. Banking, government, healthcare, manufacturing, media, retail, utilities and transport are among those most at risk – and where outages are the most costly.

Working out the cost of an outage to a business is not as simple as looking at lost revenue alone. Business disruption, reputational damage, customer churn and the effect on productivity levels also play a role. Further down the road, there may well be fallout from fines, litigation or settlements, third-party costs and equipment replacement.
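To put the per-minute figures quoted earlier in perspective, the short sketch below multiplies them by an outage duration. It is a first-order estimate only: the function name and the three-hour duration are illustrative assumptions, and it deliberately leaves out the harder-to-quantify items such as reputational damage, churn and later legal costs.

```python
# Per-minute downtime figures quoted in the article (USD).
GARTNER_2014_PER_MINUTE = 5_600
PONEMON_2016_PER_MINUTE = 9_000  # the article says "nearly $9,000"


def outage_cost(minutes: float, cost_per_minute: float) -> float:
    """Return a first-order estimate: outage duration times per-minute rate."""
    if minutes < 0 or cost_per_minute < 0:
        raise ValueError("minutes and cost_per_minute must be non-negative")
    return minutes * cost_per_minute


if __name__ == "__main__":
    # Hypothetical three-hour outage, chosen purely for illustration.
    duration_minutes = 3 * 60
    low = outage_cost(duration_minutes, GARTNER_2014_PER_MINUTE)
    high = outage_cost(duration_minutes, PONEMON_2016_PER_MINUTE)
    print(f"Estimated direct cost: ${low:,.0f} to ${high:,.0f}")
```

At those rates, a three-hour outage would come to roughly $1.0 million to $1.6 million before any of the indirect costs are counted.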

IT outages can cost companies billions of dollars a year. Source: Shutterstock

Building resilience

When outages do happen, resolution is often achieved by a trial-and-error approach – and it is dependent on intrinsic knowledge and teams who are working in operational and technology silos. This is likely to prolong the amount of time businesses are offline.

So what’s the solution?

Companies need to take steps to avoid outages and implement a recovery plan to get them back up and running as soon as possible. This should include cooperation with third-party providers and technology partners.

Agile businesses will be best placed to weather the current situation. The ability to adapt to demand quickly and fall back on a robust IT application system will help ensure that resilience.

There are some obvious key steps to be taken to reduce the risk of downtime, including eliminating single points of failure – balancing load between servers, following good back-up practices and building in technical fail-safes. What is becoming increasingly apparent is that sophisticated AI, predictive processes and automation are starting to play a critical role in prevention.
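As a minimal illustration of balancing load between servers without a single point of failure, the sketch below rotates requests across a pool and skips any server that fails a health check. The server names, the `make_balancer` helper and the health-check callback are all hypothetical; a production setup would typically rely on a dedicated load balancer rather than application code like this.

```python
from itertools import cycle
from typing import Callable, Sequence


def make_balancer(servers: Sequence[str], is_healthy: Callable[[str], bool]):
    """Return a function that picks the next healthy server in round-robin order."""
    if not servers:
        raise ValueError("at least one server is required")
    rotation = cycle(servers)

    def next_server() -> str:
        # Try each server at most once per call, so a total outage is
        # reported as an error instead of looping forever.
        for _ in range(len(servers)):
            candidate = next(rotation)
            if is_healthy(candidate):
                return candidate
        raise RuntimeError("no healthy servers available")

    return next_server


if __name__ == "__main__":
    pool = ["app-1", "app-2", "app-3"]   # hypothetical server names
    down = {"app-2"}                     # pretend this server has failed
    pick = make_balancer(pool, lambda name: name not in down)
    print([pick() for _ in range(4)])
```

With `app-2` marked as down, traffic simply alternates between the two remaining servers, which is the practical meaning of having no single point of failure.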

AI can be used to help businesses avoid IT outages. Source: Shutterstock

Broadly, this cognitive technology operates at three basic levels: the ability to perform tasks, perform activities, and handle situations. At this last level, intelligent incident or situation handling prioritizes what needs to be acted on, identifies the root cause and prescribes an action. It further augments productivity by performing the action autonomously.

Enabling these mission-critical applications to keep IT running is core to supporting current, essential services such as healthcare systems, utilities, telecom providers and retail and distribution services.

The current crisis has put additional strain on businesses, with remote working and COVID-19 impacts on staff affecting the normal processes. But many of the challenges we are facing now have common ground with smaller-scale problems that crop up during ‘business as usual’. 

It is also important to remember that the business landscape is going to look fundamentally different once the immediate crisis has passed. There will be increasing demand to run businesses effectively by working remotely, managing cash flows through smart supplier management, and shifting from a reactive to a proactive mode of IT operations by eliminating slow and error-prone manual processes.

Intelligent systems management, taking advantage of leading AI and automation platforms, can help provide businesses with the necessary firepower to meet this new normal.

These tools provide the necessary agility and resilience to adapt to changing environments, supporting businesses through periods of stress and growth by ensuring that the technology that underpins them is reliable and future-ready.

Digitate is a software venture of TCS. Launched in 2015, Digitate’s ignio™ is an award-winning solution that reimagines enterprise IT operations with its unique and innovative design that blends artificial intelligence, machine learning, and advanced software engineering to quickly and autonomously resolve issues when they arise, and preempt incidents wherever possible. ignio has been adopted by large, global enterprises, mostly Fortune 500 and Global 2000 corporations, which are leaders and innovators in their respective industries.