Over the past two decades, the goals in high-performance computing have been to improve performance and price/performance; these are also the two metrics used to award the Gordon Bell Prize at SC (formerly Supercomputing) every year. However, after two decades of focusing on improving performance, performance, and occasionally price/performance, we believe that the key metrics of this decade will be efficiency, reliability, and availability. Why? From the early 1990s to the early 2000s, the performance of our n-body code for galaxy formation improved 2000-fold, but the performance per watt improved only 300-fold and the performance per square foot only 65-fold. Clearly, we have been building less and less efficient supercomputers, which has led to the construction of massive datacenters and even entirely new buildings (and hence to an extraordinarily high total cost of ownership). Perhaps more insidious than this inefficiency is that the reliability (and usability) of these systems continues to decrease as traditional supercomputers continue to follow "Moore's Law for Power Consumption."
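To make the gap between these three figures concrete, the short sketch below works out how much more power and floor space the 2000-fold speedup would have required. It assumes that "performance per watt" and "performance per square foot" are simple ratios taken over the same systems, so that resource growth equals the performance gain divided by the efficiency gain; the function and variable names are illustrative only.

```python
# Assumption: efficiency = performance / resource, so the implied growth in
# the resource is (performance gain) / (efficiency gain).

def implied_resource_growth(perf_gain: float, efficiency_gain: float) -> float:
    """Return how many times more of a resource the faster system consumes."""
    return perf_gain / efficiency_gain

# Figures quoted in the text (early 1990s to early 2000s).
PERF_GAIN = 2000.0           # n-body code performance
PERF_PER_WATT_GAIN = 300.0   # performance per watt
PERF_PER_SQFT_GAIN = 65.0    # performance per square foot

power_growth = implied_resource_growth(PERF_GAIN, PERF_PER_WATT_GAIN)
space_growth = implied_resource_growth(PERF_GAIN, PERF_PER_SQFT_GAIN)

print(f"Implied growth in power draw:  {power_growth:.1f}x")
print(f"Implied growth in floor space: {space_growth:.1f}x")
```

Under that assumption, the figures imply roughly a 7-fold increase in power draw and a 31-fold increase in floor space over the same period.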

With the above discussion in mind, our first attempt at achieving "Supercomputing in Small Spaces" involved a novel cluster systems architecture called the Bladed Beowulf, which leveraged commodity parts from RLX Technologies and World Wide Packets. This Bladed Beowulf cluster consisted of compute nodes made from commodity parts mounted on motherboard blades called RLX ServerBlades. Each motherboard blade (node) originally contained a 633-MHz Transmeta TM5600 CPU, 256 MB of memory, a 10-GB hard disk, and three 100-Mb/s Fast Ethernet network interfaces. Twenty-four such ServerBlades, mounted in a rack-mountable 3U "RLX System 324" chassis, formed our first "Bladed Beowulf," called MetaBlade.
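As a rough illustration of what one fully populated chassis adds up to, the sketch below totals the per-blade figures given above across 24 blades. The `Blade` class and `chassis_totals` helper are hypothetical names introduced here, and the memory total assumes 1 GB = 1024 MB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blade:
    """Per-node resources of one ServerBlade, as listed in the text."""
    cpu_mhz: int
    memory_mb: int
    disk_gb: int
    fast_ethernet_ports: int


SERVERBLADE = Blade(cpu_mhz=633, memory_mb=256, disk_gb=10, fast_ethernet_ports=3)
BLADES_PER_CHASSIS = 24  # one 3U RLX System 324 chassis


def chassis_totals(blade: Blade, count: int) -> dict:
    """Sum memory, disk, and network ports over `count` identical blades."""
    return {
        "memory_gb": blade.memory_mb * count / 1024,  # assumes 1 GB = 1024 MB
        "disk_gb": blade.disk_gb * count,
        "fast_ethernet_ports": blade.fast_ethernet_ports * count,
    }


metablade = chassis_totals(SERVERBLADE, BLADES_PER_CHASSIS)
print(metablade)
```

With these inputs, a single 3U chassis holds 6 GB of memory, 240 GB of disk, and 72 Fast Ethernet interfaces.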

In April 2002, we unveiled our next-generation Bladed Beowulf, dubbed "Green Destiny," a 240-node version of MetaBlade that upgraded each CPU to a 933-MHz/1-GHz Transmeta TM5800 with customized high-performance code-morphing software (HP-CMS). The increase in clock frequency, along with an increase in memory bandwidth, improved overall performance by 33%. The customized HP-CMS system software improved floating-point performance by an additional 42%.

More importantly, Green Destiny consumed only 3.2 kilowatts of power (roughly the draw of two hairdryers) while occupying only five square feet and delivering over 100 Gflops in LINPACK performance. For two years, it provided reliable supercomputing cycles while sitting in a dusty 85-degree-Fahrenheit warehouse at 7,400 feet above sea level.
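These figures translate directly into the efficiency metrics argued for earlier. The sketch below is a minimal calculation that treats the quoted values as exact; since the text says "over 100 Gflops," the results should be read as lower bounds. The variable names are illustrative.

```python
# Figures quoted in the text for Green Destiny.
LINPACK_GFLOPS = 100.0   # "over 100 Gflops", so treated here as a lower bound
POWER_KW = 3.2           # total power draw
FOOTPRINT_SQFT = 5.0     # floor space occupied

# Convert to Mflops and watts so that the ratio is Mflops per watt.
mflops_per_watt = (LINPACK_GFLOPS * 1000.0) / (POWER_KW * 1000.0)

# Compute density: Gflops per square foot of floor space.
gflops_per_sqft = LINPACK_GFLOPS / FOOTPRINT_SQFT

print(f"Performance per watt:        at least {mflops_per_watt:.2f} Mflops/W")
print(f"Performance per square foot: at least {gflops_per_sqft:.1f} Gflops/sq ft")
```

On these numbers, Green Destiny delivered at least 31.25 Mflops per watt and 20 Gflops per square foot.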


Last updated: Oct 29, 2010