You’ve pimped out your fancy new HPC cluster with the latest technology, each node packed to the gills with shiny new CPUs, loads of memory, and maybe even a double brace of GPUs or other accelerators.
The installation work is done, and users are getting their first shot at the system. The performance boost on the new box is so profound that you’re halfway expecting a mob of ecstatic users to hoist you on their shoulders and parade you around the building. Indeed, a mob of users is gathering… but they’re carrying weapons crudely fashioned from their desktops.
As you’re being beaten about the head by a postdoc wielding a used toner cartridge, you find yourself wondering whether this situation could have been avoided.
Parallelizing user code, or better yet, helping users parallelize their own code, is the answer. Code that’s well behaved and scales on smaller clusters can have a nervous breakdown on larger systems.
This is where Allinea comes in. Their tool suite is designed to make it faster and easier to develop, debug, and optimize code that will scale from the smallest closet cluster to the largest supercomputers in the world.
At GTC 2014 I spent a few minutes with Allinea’s Beau Paisley. With real-world experience in large-scale applied physics, climate modeling, and image processing, Beau knows what he’s talking about when it comes to making big systems perform.
In the video, we talk about the challenges of parallelizing code and making the jump from traditional clustering to hybrid systems. It’s a brief, but meaty, conversation that’s worth a look.