|
Over the past two decades, the goals in
high-performance computing have been to improve
performance and price/performance; these are
also the two metrics used to award
the Gordon Bell Prize at SC (formerly
Supercomputing) every year. However, after two
decades of focusing on improving performance,
performance, and occasionally
price/performance, our belief is that the key
metrics of this decade will be efficiency,
reliability, and availability. Why? From the
early 1990s to the early 2000s, the performance
of our n-body code for galaxy formation
improved 2000-fold, but the performance per
watt only improved 300-fold and the performance
per square foot only 65-fold. Clearly, we have
been building less and less efficient
supercomputers, resulting in the construction
of massive datacenters and even entirely new
buildings (and hence an extraordinarily high
total cost of ownership). Perhaps a more
insidious problem than this inefficiency is
that the reliability
(and usability) of these systems continues to
decrease as traditional supercomputers continue
to follow "Moore's Law for Power
Consumption."
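The growth figures above imply how much more
power and floor space the machines came to
need. A minimal sketch of that arithmetic,
assuming the three quoted factors cover the
same period and the same code (the variable
names are illustrative, not from the original
work):

```python
# Growth factors quoted in the text
# (early 1990s to early 2000s).
perf_gain = 2000.0          # performance
perf_per_watt_gain = 300.0  # performance per watt
perf_per_sqft_gain = 65.0   # performance per square foot

# If performance grew by P and performance per watt by E,
# power draw must have grown by P / E. The same reasoning
# applies to floor space.
power_growth = perf_gain / perf_per_watt_gain
space_growth = perf_gain / perf_per_sqft_gain

print(f"implied power growth: {power_growth:.1f}x")
print(f"implied space growth: {space_growth:.1f}x")
```

Under these assumptions, power draw grew roughly
7-fold and floor space roughly 31-fold over the
period.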
With the above discussion in mind, our first
attempt at achieving "Supercomputing in Small
Spaces" involved a novel cluster systems
architecture called the Bladed Beowulf, which
leveraged commodity parts from RLX Technologies
and World Wide Packets. This Bladed Beowulf
cluster consisted of compute nodes made from
commodity parts mounted on motherboard blades
called RLX ServerBlades (see Figure 1). Each
motherboard blade (node) originally contained a
633-MHz Transmeta TM5600 CPU, 256-MB memory,
10-GB hard disk, and three 100-Mb/s Fast
Ethernet network interfaces. Twenty-four such
ServerBlades mounted into a rack-mountable 3U
"RLX System 324" chassis, shown in Figure 2,
formed our first "Bladed Beowulf" called
MetaBlade.
In April 2002, we unveiled our
next-generation Bladed Beowulf, as shown in
Figure 3, dubbed "Green Destiny" --- a 240-node
version of MetaBlade that upgraded each CPU to
a 933-MHz/1-GHz Transmeta TM5800 with
customized high-performance code-morphing
software (HP-CMS). The speed bump in clock
frequency, along with an increase in memory
bandwidth, improved overall performance by 33%.
The customized HP-CMS system software improved
the floating-point performance by an additional
42%.
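The two improvements above can be combined into
a rough overall figure. The sketch below assumes
that the gains multiply and that both apply to
the same floating-point workload; the text
states neither explicitly, so treat the result
as an estimate rather than a measured number.

```python
# Gains quoted in the text for Green Destiny over MetaBlade.
hardware_gain = 0.33  # clock frequency plus memory bandwidth
hp_cms_gain = 0.42    # customized HP-CMS, floating point

# Assumption: the gains compound multiplicatively
# ("an additional 42%") rather than add.
combined_speedup = (1 + hardware_gain) * (1 + hp_cms_gain)

print(f"combined speedup: {combined_speedup:.2f}x")
```

Under that assumption, the per-node
floating-point speedup comes to roughly 1.9x.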
More importantly, Green Destiny consumed
only 3.2 kilowatts of power (roughly that of
two hairdryers) while occupying only five square
feet and delivering over 100 Gflops in LINPACK
performance. For two years, it provided
reliable supercomputing cycles while sitting in
a dusty 85-degree F warehouse at 7,400 feet
above sea level.
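These figures translate directly into the
efficiency metrics argued for earlier. A short
sketch, treating the quoted "over 100 Gflops" as
exactly 100 Gflops, so both results are lower
bounds (variable names are illustrative):

```python
# Figures quoted in the text for Green Destiny.
linpack_gflops = 100.0  # "over 100 Gflops": a lower bound
power_kw = 3.2          # kilowatts
footprint_sqft = 5.0    # square feet

# 1 Gflops = 1000 Mflops and 1 kW = 1000 W.
mflops_per_watt = (linpack_gflops * 1000) / (power_kw * 1000)
gflops_per_sqft = linpack_gflops / footprint_sqft

print(f"performance per watt: {mflops_per_watt:.2f} Mflops/W")
print(f"performance per sq ft: {gflops_per_sqft:.1f} Gflops/sq ft")
```

This gives at least about 31 Mflops per watt and
20 Gflops per square foot.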
|