Stockton Infrared Thermographic Services, Inc.
“Five Nines” and Infrared (IR) Testing at Data
Centers
By Gregory R. Stockton
99.999% uptime…five nines. That is what IT
(information technology) customers are looking for. Uptime, or
“availability”, at data centers is an absolute necessity. A loss of power
to a data center can literally cost the owner millions. The power,
cooling and support systems are vital to the continuous flow of
information in these “mission critical” facilities. IR/PM (infrared
predictive maintenance) is a must. The electrical switchgear, UPS
(uninterruptible power supply), ATS (automatic transfer switches), server
systems and cooling systems must be checked with infrared thermography and
other testing means on a regular basis to ensure super-high reliability.
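To put "five nines" in concrete terms, the availability figure can be converted into a yearly downtime budget. The short Python sketch below is only an illustration: the function name and the use of a 365-day year are assumptions made here, not part of any standard. At 99.999% availability, the budget works out to a little over five minutes per year.

```python
# Yearly downtime budget for a given availability level.
# Assumption: a 365-day (non-leap) year; the function name is illustrative only.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(availability: float) -> float:
    """Return the downtime allowed per year, in minutes, for an availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


for label, availability in [
    ("three nines", 0.999),
    ("four nines", 0.9999),
    ("five nines", 0.99999),
]:
    minutes = allowed_downtime_minutes(availability)
    print(f"{label}: {minutes:.2f} minutes of downtime per year")
```

Each additional nine cuts the allowable downtime by a factor of ten, which is why a single unplanned outage can use up the entire budget for a year.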
Mission Critical
Mission critical facilities are like other facilities in that they have
electromechanical equipment that must be maintained. The difference is
that the operators of mission critical facilities, owing to the extremely
high availability requirements from management, have to pay much more
attention to the equipment so that it will not fail. This requires
dual-path power supply systems (for redundancy) and regular testing of the
systems.
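The benefit of a redundant power path can be estimated with a simple calculation. The sketch below is a simplification under stated assumptions: the paths are assumed to fail independently, the switchgear that ties them together is assumed never to fail, and the 0.999 per-path availability is a hypothetical value chosen for illustration. Real systems share components and failure causes, so actual availability will be lower than this idealized figure.

```python
# Idealized availability of redundant (parallel) power paths.
# Assumptions: paths fail independently and the transfer equipment is perfect.
def parallel_availability(path_availabilities):
    """Return the chance that at least one path is available."""
    all_paths_down = 1.0
    for availability in path_availabilities:
        if not 0.0 <= availability <= 1.0:
            raise ValueError("each availability must be between 0 and 1")
        all_paths_down *= 1.0 - availability
    return 1.0 - all_paths_down


# Hypothetical example: two paths, each available 99.9% of the time.
single_path = 0.999
dual_path = parallel_availability([single_path, single_path])
print(f"single path: {single_path:.6f}")
print(f"dual path:   {dual_path:.6f}")
```

The calculation also shows why regular testing matters: an undetected fault on the back-up path quietly reduces the system to single-path availability.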
Systems
- Dual-power technology requires two completely
independent electrical systems tied together with switchgear. When the
normal source of power fails, these dual-path power supply systems
quickly switch to a back-up source. A UPS system keeps the power flowing
until the normal source is restored or another source is brought on-line
and synchronized. Usually, the UPS, through a PDU or power distribution
unit, takes AC power, converts it to DC where a bank of batteries is
tied in and then inverts it back to AC to feed the computer hardware.
Figure 1. Typical PDU in a data center with load bank test being run.
Figure 2. SCR connection on an inverter assembly at over 550 °F.
Figure 3. Bolted/crimped connector on an output filter.
Since the systems often cannot be tested on-line, they
must be tested during “maintenance windows”: planned outages or times
when the impact of testing is low, so that simulations can be run. A load
bank applies a resistive load to the system so that the full load of the
equipment on the floor can be simulated and tested. Any problems
encountered during an infrared survey are repaired immediately and the
system is rechecked before the equipment is put back on-line.
- Battery back-up systems must be checked in a
real-time battery discharge situation to fully simulate an actual loss
of the normal source of power. The batteries, connections, cables,
switches and charging systems are checked for unwanted heating
conditions.
Figure 4. Small battery bank with a loose lug connection on the main breaker.
- Uniform cooling of all data center server, storage,
and computer equipment is essential for proper operation. The design
objective of the cooling system is to provide a clear path from the
source of the cooled air to the equipment and back to the cooling unit.
This issue has received much attention lately as miniaturization of the
equipment and economic pressures have increased the amount of heat that
is generated per square foot of floor space and per cubic foot of rack
space in the server racks. This hardware is sensitive to heat and
humidity, and some new designs are being tested so that failures do not
occur solely due to environmental conditions (see Figure 5). How perfect
an application for IR!
Figure 5. Server rack designs being tested for heat dissipation.
- Utility main power supplies are typically owned by
the local power company but are sometimes owned by the user. A looped
system feeds power from two different power company substations and can
be “back fed” if the power is out on the primary. No matter who the
technical owner of the utility equipment is, it must be checked with IR
like all other components (see Figure 6).
Figure 6. Pad-mounted transformer with loose connection on line side.
- Mechanical systems have the same stringent
requirements as the electrical systems. Again, this is achieved by
redundancy and failure prevention engineering.
Accountability
There must be total accountability for all infrared survey results,
especially those for the equipment associated with the UPS, computer and
server systems. This can be accomplished by recording the entire survey on
digital videotape and/or capturing fully-radiometric images of all
equipment, whether problems exist or not. In either case, a data log of
all equipment surveyed must be created, including a time/date stamp
reference for each piece of equipment. Documentation is very important.
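One way to keep such a data log is as a list of structured records that can be checked against the equipment inventory. The Python sketch below is only a minimal illustration: the field names, equipment identifiers, file names and dates are all hypothetical and do not describe any particular reporting format.

```python
# Minimal sketch of a survey data log; all names and values are hypothetical.
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SurveyEntry:
    equipment_id: str      # tag of the surveyed equipment
    system: str            # e.g. "UPS", "ATS", "PDU"
    surveyed_at: datetime  # time/date stamp for this piece of equipment
    image_ref: str         # radiometric image file or video time code
    problem_found: bool    # recorded whether or not a problem exists
    notes: str = ""


def unsurveyed(inventory, log):
    """Return inventory items that have no entry in the survey log."""
    surveyed = {entry.equipment_id for entry in log}
    return [item for item in inventory if item not in surveyed]


inventory = ["UPS-1", "ATS-1", "PDU-3"]
log = [
    SurveyEntry("UPS-1", "UPS", datetime(2005, 1, 15, 9, 30), "ups-1.img", False),
    SurveyEntry("PDU-3", "PDU", datetime(2005, 1, 15, 10, 5), "pdu-3.img", True,
                "hot connection, rechecked after repair"),
]

print("Not yet surveyed:", unsurveyed(inventory, log))
```

Recording an entry for every item, including those with no problem, makes it possible to show afterward that nothing was skipped.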
Summary
To achieve five nines availability, it is essential that competent IR
testing be performed on all electrical and mechanical systems in
conjunction with other testing and in cooperation with management and
maintenance personnel.
If you maintain an office building, manufacturing
facility or any other type of facility where uptime is important, you
should take time to follow what is happening with data centers, as they
are among the most mission critical of all operations.
Gregory R. Stockton, President
Stockton Infrared Thermographic Services, Inc.
8472 Adams Farm Road
Randleman, NC 27317
(800) 248-SCAN
www.stocktoninfrared.com
Copyright 1999-2005.
All rights reserved.