The Lowdown on Data Center Downtime:
Frequency, Root Causes and Costs
In 2013, Emerson Network Power again partnered with the Ponemon Institute to update its Study of Data Center Outages. The two-part study found that although the frequency and duration of data center downtime events have decreased slightly, unplanned outages remain a costly line item for organizations.
The first part of the study, which surveyed more than 450 U.S.-based data center professionals and focused on the root causes and frequency of downtime, found that organizations are more aware of data center downtime and its potential consequences, and are increasingly taking action to prevent outages.
The second part of the study includes an analysis of 67 U.S. data centers with a minimum size of 2,500 square feet, delving into the direct, indirect and opportunity costs associated with data center outages.
Downtime costs are increasing
The Cost of Downtime study puts the cost of an unplanned data center outage at slightly more than $7,900 per minute, a 41 percent increase from $5,600 in 2010. Total data center outages averaged a recovery time of 119 minutes, at a total cost of about $901,500.
Partial outages, or those limited to certain racks, averaged 56 minutes and cost approximately $350,400.
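The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers quoted above. The function and variable names are illustrative and do not come from the study. It also shows that the per-minute cost implied by each outage type differs from the $7,900 headline figure, which suggests the headline is an average across all outages rather than a multiplier for any single category.

```python
# Figures quoted in the text (U.S. dollars and minutes).
COST_PER_MINUTE_2010 = 5_600
COST_PER_MINUTE_2013 = 7_900


def percent_increase(old, new):
    """Relative change from old to new, as a percentage."""
    return (new - old) / old * 100


def implied_cost_per_minute(total_cost, minutes):
    """Average cost per minute implied by a total cost and a duration."""
    return total_cost / minutes


increase = percent_increase(COST_PER_MINUTE_2010, COST_PER_MINUTE_2013)
total_rate = implied_cost_per_minute(901_500, 119)   # total outages
partial_rate = implied_cost_per_minute(350_400, 56)  # partial outages

print(f"Increase since 2010: {increase:.0f}%")
print(f"Total outage: ${total_rate:,.0f} per minute")
print(f"Partial outage: ${partial_rate:,.0f} per minute")
```

Under these inputs, the increase rounds to 41 percent, and the implied rates come out to roughly $7,576 per minute for total outages and $6,257 per minute for partial outages.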
Outages are less frequent
An overwhelming majority (91 percent) of 2013 Causes of Downtime survey respondents reported having experienced an unplanned data center outage in the past 24 months. This is a slight decrease from the 95 percent of respondents in the 2010 study who reported unplanned outages.
Regarding the frequency of outages, respondents experienced an average of two complete data center outages during the past two years. Partial outages, or those limited to certain racks, occurred six times in the same timeframe. The average number of device-level outages, or those limited to individual servers, was the highest at 11. Complete and partial outage frequencies have declined slightly from the 2010 findings (complete: 2.5, partial: 7), while device-level outages edged up from 10.
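The comparison with 2010 can be laid out in a few lines. This is a minimal sketch using only the counts quoted above. The dictionary names are illustrative, and the per-year figure simply halves the 24-month count, which assumes outages were spread evenly across the period.

```python
# Average outages per respondent over 24 months, as quoted in the text.
outages_2010 = {"complete": 2.5, "partial": 7, "device-level": 10}
outages_2013 = {"complete": 2, "partial": 6, "device-level": 11}

# Change in the 24-month count from the 2010 study to the 2013 study.
changes = {level: outages_2013[level] - outages_2010[level]
           for level in outages_2010}

for level, delta in changes.items():
    per_year = outages_2013[level] / 2  # assumes an even spread over two years
    print(f"{level}: {delta:+g} vs. 2010, about {per_year:g} per year")
```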
Root causes of downtime
The Cost of Downtime study breaks down the total cost of unplanned outages by root cause:
- IT equipment failure ($959,000)
- Cyber crime ($882,000)
- UPS system failure ($478,000)
- Water, heat or CRAC failure ($517,000)
- Generator failure ($501,000)
- Weather incursion ($436,000)
- Accidental/human error ($380,000)
Common causes of outages
Eighty-three percent of survey respondents in the Causes of Downtime study said they knew the root cause of the unplanned outage. The most frequently cited root causes of outages include:
- UPS battery failure (55 percent)
- Accidental EPO/human error (48 percent)
- UPS capacity exceeded (46 percent)
- Cyber attack (34 percent)
- IT equipment failure (33 percent)
- Water incursion (32 percent)
- Weather related (30 percent)
- Heat related/CRAC failure (29 percent)
- UPS equipment failure (27 percent)
- PDU/circuit breaker failure (26 percent)
Fifty-two percent of respondents believe all or most of the unplanned outages could have been prevented.
Minimizing cost and defying downtime
While eliminating downtime altogether is a difficult undertaking, the studies did identify common attitudes and behaviors for reducing costs:
- Consider data center availability the highest priority above all others, including cost minimization and improving energy efficiency.
- Utilize all best practices in data center design and redundancy to maximize availability.
- Dedicate ample resources to bringing the data center back up after an unplanned outage.
- Have complete support from senior management on efforts to prevent and manage unplanned outages.
- Regularly test generators and switchgear to ensure emergency power is available if a utility outage occurs.
- Regularly test or monitor UPS batteries.
- Implement data center infrastructure management (DCIM).