I’ve been busy all day talking, listening, and maybe even learning a thing or two at the 2010 GPU Technology Conference. The speaker from the NOAA session (topic of the blog below) put the move toward GPUs into perspective near the end of his talk today with two key points.

His first point was that big HPC advances tend to come at roughly 10-year intervals. Citing the moves from vector processing to MPP to COTS (Commercial Off-The-Shelf) systems to GPU-hybrid systems as evidence, he pointed out that each jump required significant changes in mindset, tools, management, and code. However, the benefits from each jump were profound, with cost savings and performance increases on the order of 20x to 50x.

To illustrate this, he compared the current state-of-the-art 2.3 PFlop/s DOE Jaguar system to what a 1 PFlop/s GPU-based system would look like. The difference (see table below) is amazing:

By every measure, GPU + CPU hybrids are a better deal in terms of price/performance, facilities, availability, and even the initial price tag for the hardware. What this table doesn’t take into account is the time, labor, and heartache involved in making code “GPU-riffic.” (Yeah, that’s my own term for it; hoping it catches on, but doubt it will.) This task is getting easier over time as the ecosystem around CUDA becomes more robust and varied.
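To give a concrete sense of what that porting effort involves, here is a minimal CUDA sketch (my own illustration, not from the talk) of the classic SAXPY operation. Even for this trivial loop, the programmer takes on new responsibilities: writing a kernel, picking a thread layout, and moving data between host and device memory.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// The CPU loop "for (i) y[i] = a*x[i] + y[i]" becomes a kernel:
// each GPU thread handles one array element.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    float *hx = (float *)malloc(bytes), *hy = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { hx[i] = 1.0f; hy[i] = 2.0f; }

    // Explicit device allocation and host-to-device copies -- this
    // data-movement bookkeeping is a big part of the porting effort.
    float *dx, *dy;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

    // Launch one thread per element, in blocks of 256.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    saxpy<<<blocks, threads>>>(n, 2.0f, dx, dy);

    cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %f\n", hy[0]);  // 2*1 + 2 = 4

    cudaFree(dx); cudaFree(dy);
    free(hx); free(hy);
    return 0;
}
```

Real scientific codes, of course, are far messier than SAXPY, which is exactly why the maturing CUDA ecosystem of libraries and tools matters so much.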

| | Jaguar | GPU Alternative |
|---|---|---|
| Performance | 2.3 PFlop/s | 1.0 PFlop/s |
| Cores | 250,000 CPU cores | 1,000 Fermi GPUs |
| Size | 284 cabinets | 10 cabinets |
| Power Draw | 7-10 MW | 0.5 MW |
| Cost | $50 to $100 million | $5 to $10 million |
| Reliability | Measured in hours | Measured in weeks |
