THE ROLE OF INFRARED TESTING AT DATA CENTERS
By: Eric R. Stockton
CompuScanIR™ Division
www.compuscanir.com
800-248-7226
Introduction
Data centers and other mission critical facilities with
uninterruptible power supply (UPS) systems can experience extremely high ROI
(return on investment) with a properly integrated infrared program. A
successful infrared program includes all of the facility equipment from
construction and operation to server efficiency. Commissioning of data center
initial construction as well as the addition of new equipment is accepted as
standard operating procedure for information technology (IT) facilities. This
paper discusses the different aspects of the commissioning and maintenance process
and how the infrared (IR) thermographer fits into the process.
The Business Impact of Downtime at Data Centers
Downtime in these facilities is not an option. Infrared
thermography is being used for regular electrical switchgear surveys,
optimization of cooling systems and servers, and commissioning of all electrical
equipment, including UPS modules, PDU (power distribution unit) equipment and
computer servers. Many construction project specifications require infrared
surveys before the building is turned over to the owner. Data
center infrared thermography must provide total accountability for all infrared
data in the commissioning process, whether or not problems are found.
This accountability can be achieved by documenting all equipment inspected with
time, date, location and equipment condition. The thermographer must create a
data log and record the infrared video onto a digital storage device. New
technologies in data acquisition and report preparation will make
historical data (images previously taken) available for comparison, enabling
the thermographer to compare circuit boards and other UPS
equipment more closely with previously acquired images. If something fails or
causes downtime in the system, an IR image of that component may be referenced to
document that the equipment was operational, at thermal steady-state and in
acceptable condition when the survey was made.
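The data log described above can be sketched as a small record type. This is only an illustration under stated assumptions: the field names, the CSV layout and the helper functions are hypothetical and are not taken from any particular thermography software.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import datetime


@dataclass
class InspectionRecord:
    """One logged inspection. All field names are illustrative assumptions."""
    timestamp: datetime   # date and time of the inspection
    location: str         # room or area where the equipment sits
    equipment_id: str     # label of the item inspected
    condition: str        # e.g. "acceptable" or "exception"
    max_temp_c: float     # hottest spot measured in the image, deg C
    image_file: str       # where the stored IR image or video lives


def temperature_rise(current: InspectionRecord, baseline: InspectionRecord) -> float:
    """Rise in peak temperature (deg C) relative to an earlier survey."""
    if current.equipment_id != baseline.equipment_id:
        raise ValueError("records describe different equipment")
    return current.max_temp_c - baseline.max_temp_c


def to_csv(records) -> str:
    """Serialize every record, problem or not, to CSV text."""
    buf = io.StringIO()
    names = [f.name for f in fields(InspectionRecord)]
    writer = csv.DictWriter(buf, fieldnames=names)
    writer.writeheader()
    for rec in records:
        row = asdict(rec)
        row["timestamp"] = rec.timestamp.isoformat()
        writer.writerow(row)
    return buf.getvalue()
```

Logging equipment that shows no problem is deliberate here: a record with condition "acceptable" is what later documents that a component was in good order at the time of the survey.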
Table 1: Uptime and Maximum Downtime
| Availability | Uptime | Maximum downtime per year |
| Six nines | 99.9999% | 31.5 seconds |
| Five nines | 99.999% | 5 minutes 15 seconds |
| Four nines | 99.99% | 52 minutes 33 seconds |
| Three nines | 99.9% | 8 hours 46 minutes |
| Two nines | 99.0% | 87 hours 36 minutes |
| One nine | 90.0% | 36 days 12 hours |
(1. Hiles, Andrew 2004)
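The downtime figures follow directly from the uptime percentage: the maximum downtime is the unavailable fraction multiplied by the length of a year. A minimal sketch of that arithmetic, assuming a 365-day year (the function names are illustrative):

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000; leap years are ignored


def max_downtime_seconds(uptime_percent: float) -> float:
    """Maximum downtime per year, in seconds, for a given uptime percentage."""
    return SECONDS_PER_YEAR * (1.0 - uptime_percent / 100.0)


def format_duration(seconds: float) -> str:
    """Render a duration as days, hours, minutes and seconds (rounded to 1 s)."""
    total = round(seconds)
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    parts = [(days, "d"), (hours, "h"), (minutes, "min"), (secs, "s")]
    text = " ".join(f"{value} {unit}" for value, unit in parts if value)
    return text or "0 s"


if __name__ == "__main__":
    for pct in (99.9999, 99.999, 99.99, 99.9, 99.0, 90.0):
        print(f"{pct}% uptime: {format_duration(max_downtime_seconds(pct))}")
```

For example, 99.999% uptime leaves 0.001% of 31,536,000 seconds, or about 315 seconds (5 minutes 15 seconds) per year.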
Table 2:
| Industry sector | Revenue per hour ($) |
| Energy | 2,818,000 |
| Telecom | 2,066,000 |
| Manufacturing | 1,611,000 |
| Finance | 1,495,000 |
| Information Technology | 1,344,000 |
| Insurance | 1,202,000 |
| Retail | 1,107,000 |
Source: Meta Group
Estimates for other industries provide a cross-check. A
2004 survey, for instance, put losses on brokerage operations at
$4,500,000/hour, banking operations at $2,100,000/hour, media operations at
$1,150,000/hour and e-commerce operations at $113,000/hour. Retail operations
trailed at $90,000/hour. Share value can also be affected: eBay’s
outages in 1999 saw shares temporarily drop by over 26 percent, while E*Trade’s
similar problems saw a 22 percent temporary drop.
(2. Hiles, Andrew 2004)
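Combining an uptime level with an hourly revenue figure gives a rough sense of annual exposure. The sketch below uses the Meta Group figures from Table 2 and makes a simplifying assumption that is not stated in the source: every hour of downtime loses the full hourly revenue. The function and variable names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760; leap years are ignored

# Revenue per hour in dollars, as listed in Table 2 (Source: Meta Group).
REVENUE_PER_HOUR = {
    "Energy": 2_818_000,
    "Telecom": 2_066_000,
    "Manufacturing": 1_611_000,
    "Finance": 1_495_000,
    "Information Technology": 1_344_000,
    "Insurance": 1_202_000,
    "Retail": 1_107_000,
}


def annual_exposure(revenue_per_hour: float, uptime_percent: float) -> float:
    """Revenue at risk per year if the maximum allowed downtime is used up.

    Assumes (as a simplification) that all revenue stops during downtime.
    """
    downtime_hours = HOURS_PER_YEAR * (1.0 - uptime_percent / 100.0)
    return revenue_per_hour * downtime_hours


if __name__ == "__main__":
    for sector, revenue in REVENUE_PER_HOUR.items():
        print(f"{sector}: ${annual_exposure(revenue, 99.9):,.0f} per year at 99.9% uptime")
```

At three nines (8.76 hours of downtime per year), the Energy figure works out to roughly $24.7 million per year under this assumption.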
IR Commissioning of
The commissioning process should include these types of
equipment and considerations. The following infrastructure support equipment
should be tested:
Loading Considerations:
Figure 1) Resistive load bank is shown with a bad cable
connection.
Causes for Electrical Failure and Downtime in Data Centers
The critical power distribution system takes
conditioned power from the UPS and distributes it throughout the facility to
individual loads. Most site failures occur in areas where hot electrical work
is required and physical maintenance is difficult to perform.
Typical causes for failures include:
· cover slipped while accessing load panel,
· overheated breakers tripped unexpectedly,
· wires were not physically secured under screws,
· screws were not torqued adequately,
· wires or circuit breaker handles were dislodged while adjacent work was being performed,
· screws were stripped,
· insulation was skinned causing faulted wires,
· rotations were reversed.
(3. UpTime Institute, 2006)
Infrared Applications for Servers and Server Racks
Ten percent of all server racks currently in service
are too hot to meet industry standards for maximum IT reliability and
performance.
“Institute research into computer
room cooling indicates 1/3 all perforated tiles are incorrectly located and 60%
of all available cooling capacity is being wasted by bypass airflow. Increasing
under-floor static pressure to get air where it needs to go requires
permanently blocking all unnecessary air escape routes. This includes sealing
cable cutouts behind and underneath products or racks (this unmanaged airflow
is what is really cooling most computer rooms) as well as the penetrations in
the floor or walls or ceiling and any other openings in the raised floor.
Perforated floor tiles with 25% openings can be replaced with 40% and 60%
grates to permit a much higher airflow. For sites with unused raised floor
space deliberately spreading equipment out to create white space and reduce the
averaged gross watts per square foot power consumption will be a viable option.”
(4. Brill, Kenneth 2006)
Figure 2) Server cooling fans are shown. The top fan is
operating normally, while the bottom fan has failed.
Figure 3) High density server being tested at
increasing CPU utilizations.
Server infrared applications include:
· Thermally mapping complete data center from sub-floor to ceiling.
· Verifying proper hot aisle/cold aisle operation preventing short circuiting and bypassing of air flow.
· Verifying high density server farm cooling capabilities.
· Monitoring server rack temperature distribution patterns.
· Finding internal server fans which are inoperable or damaged.
Figure 4) Verifying proper hot aisle/cold aisle operation.
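Monitoring rack temperature distribution can be illustrated with a short sketch that flags racks whose hottest reading exceeds a site-chosen limit and reports the bottom-to-top temperature difference. The data layout, the function names and the 27.0 °C limit in the usage example are assumptions for illustration only; the limit should come from the standard the facility actually follows.

```python
def flag_hot_racks(readings: dict, limit_c: float) -> dict:
    """Return {rack: hottest reading} for racks exceeding limit_c.

    `readings` maps a rack label to temperatures in deg C, ordered from
    the bottom of the rack to the top (a hypothetical layout).
    """
    return {
        rack: max(temps)
        for rack, temps in readings.items()
        if temps and max(temps) > limit_c
    }


def vertical_gradient(temps: list) -> float:
    """Top reading minus bottom reading for one rack, in deg C."""
    if len(temps) < 2:
        raise ValueError("need at least a bottom and a top reading")
    return temps[-1] - temps[0]


if __name__ == "__main__":
    # Hypothetical survey readings: bottom, middle, top of each rack.
    survey = {
        "R01": [19.5, 22.0, 24.5],
        "R02": [21.0, 26.5, 31.0],
    }
    print(flag_hot_racks(survey, limit_c=27.0))   # limit is a placeholder
    print(vertical_gradient(survey["R02"]))
```

A large bottom-to-top difference on a rack is the kind of pattern that may point to air short circuiting or bypassing in the hot aisle/cold aisle arrangement and is worth following up with the thermal image itself.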
Safety Considerations
Of course, the thermographer must comply with all OSHA
and NFPA 70E requirements. The good news is that, unlike most industrial sites,
switchgear rooms and data centers have controlled temperatures and low
humidity, which makes wearing arc flash suits and associated safety
equipment much less onerous for the thermographer.
Figure 5) Thermographer inspecting a battery bank during full battery discharge
testing.
How does a thermographer become “qualified” and obtain contracts to do data center thermal survey work?
First, the thermographer must understand the critical
nature of the equipment being tested as well as the surrounding equipment.
Furthermore, he/she should understand that the work he/she is performing is
critical and vital to the operation. A thermographer wanting to do this type of
work should get general training and certification on electrical switchgear and
also get specific training on data center equipment. He/she should contact UPS
vendors and their clients and cultivate relationships with them.
Since this work carries high accountability, the
methodology for performing the surveys and creating the reports must be
“upgraded” from that used for the typical office building or factory. This means
the thermographer must use a high-resolution, sensitive, radiometric thermal
imager and learn how to record all thermal, visual and textual data with a
detailed data logging system. Also, data center work schedules often
include nighttime maintenance windows from Saturday midnight until Sunday
morning, so the thermographer must get used to working during off-peak
times. We know that large companies commission all data center equipment, but do
smaller companies have UPS and server systems? Absolutely! To complete the
commissioning process and maintain the systems, large and small companies alike
must find thermographers who are nearby and have experience in critical facility
activities. A thermographer interested in providing these services must be
commercially available to the UPS, electrical and facilities maintenance
contractors. A strong professional reputation with no accidents or system
failures is essential to being the preferred thermographer for data center
infrared work. What these infrared service clients want are the most
professional, experienced and qualified thermographers in the electrical
infrared industry.
References:
1. Hiles, Andrew (2004). Five Nines: Chasing the Dream? Continuity Central, <continuitycentral.com> (12/18/06).
2. Hiles, Andrew (2004). Five Nines: Chasing the Dream? Continuity Central, <continuitycentral.com> (12/18/06).
3. UpTime Institute (2006). Procedures and Guidelines for Safely Working in an Active Data Center, p. 9. UpTime Institute, <uptimeinstitute.org> (12/18/06).
4. Brill, Kenneth (2006). 2005-2010 Heat Density Trends in Data Processing, Computer Systems, and Telecommunications Equipment: Perspectives, Implications and the Current Reality in Many Data Centers, p. 13. UpTime Institute, <uptimeinstitute.org> (12/18/06).
Author Bio:
Eric R. Stockton received a BA in Zoology from the
Copyright January 2007
Published at IR/Info 2007 Conference in
