Facilities Management in the Data Center
Today's data centers are designed and built to meet extremely stringent uptime requirements. No matter what type of facility is built, from one with many potential facility-related points of failure to one with no single point of failure, one critical factor must be addressed throughout: the potential for human error. Human error continues to be the greatest cause of unplanned downtime in the data center. The probability of human error causing failures is much greater in companies where processes are less mature and where the IT organization and the Data Center Facilities group are two separate and distinct (“siloed”) groups with little communication or interaction between them.
These “siloed” groups have traditionally developed different cultures and processes, often reporting to different senior executives within the organization, with little communication or process integration. Historically, when IT needed equipment provisioned, no specific requirements were communicated to Data Center Facilities, so servers were installed in racks without formal planning. This was considered “good enough” before the era of high-density servers, when a full rack of servers consumed only 2 kW of power. With the emergence of blades and other high-density equipment, however, a rack can consume up to an order of magnitude more power: 15 kW, 20 kW or even 40 kW. The majority of data centers in operation were built before 2002 to support 2 kW of power per rack, which becomes a recipe for disaster if the lack of communication and integrated processes between IT and Data Center Facilities persists. Data Center Facilities can no longer place a server in any open rack without understanding how it will affect the data center as a whole. Overlooking this impact analysis within change management processes can result in severe disruptions in service. In addition, managing data center capacity in terms of power, cooling and space has become significantly more complex. This “physical infrastructure” capacity management is particularly critical because provisioning additional space, power or cooling resources can take as long as 12 to 24 months.
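The impact analysis described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the Rack class, the placement_issues function and the sample figures (other than the 2 kW legacy rack budget mentioned above) are hypothetical, and a real check would also need to account for cooling and upstream circuit capacity.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Rack:
    """Hypothetical record of one rack's physical capacity."""
    name: str
    power_budget_kw: float   # design power the rack was built to support
    free_units: int          # unused rack units (U) of space
    loads_kw: list[float] = field(default_factory=list)  # draw of installed equipment

    @property
    def used_kw(self) -> float:
        return sum(self.loads_kw)


def placement_issues(rack: Rack, draw_kw: float, height_u: int) -> list[str]:
    """Return the capacities a proposed install would exceed (empty if none)."""
    issues = []
    if rack.used_kw + draw_kw > rack.power_budget_kw:
        issues.append("power")
    if height_u > rack.free_units:
        issues.append("space")
    return issues


# Assumed example: a legacy rack designed for 2 kW, already drawing 1.3 kW.
legacy_rack = Rack("A01", power_budget_kw=2.0, free_units=20,
                   loads_kw=[0.4, 0.4, 0.5])

print(placement_issues(legacy_rack, draw_kw=0.5, height_u=1))   # small 1U server
print(placement_issues(legacy_rack, draw_kw=4.0, height_u=10))  # hypothetical blade chassis
```

In this sketch the small server fits, while the blade chassis is flagged for power even though the rack has plenty of open space, which is exactly the case that placing equipment in "any open rack" overlooks.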
The traditional roles of the IT and Data Center Facilities groups have begun to merge into a consolidated group that collectively manages the aspects affecting both job functions. Some organizations have split the Data Center Facilities group out of the traditional Facilities organization and made it part of IT Operations, to accommodate the functions that require IT and Data Center Facilities to work together. Managing space, power and cooling, installing equipment, and conducting change management and configuration management have slowly become integrated activities.
Although this combination represents a needed first step, organizations must go even further for operations to run effectively. The new Data Center Facilities group must now adopt the same best-practice processes and tools used by the IT organization, such as IT service management frameworks like ITIL, to address change management, configuration management, incident management and capacity management. Only through a tightly integrated group that combines IT and Data Center Facilities can an organization move from a chaotic or reactive level of maturity to one that is service-oriented and adds significant value.
Today, the data center is a strategic resource to the enterprise. Managing it requires addressing people, process and technology issues. Integrating the Data Center Facilities and IT people and processes, and providing the appropriate tools, is of the utmost importance. For more information, see Aperture's solutions for "integrated" data center management.