We spent a bit of quality time with Deepak Khosla, CEO of X-ISS, Inc., a Houston-based IT consulting firm. X-ISS has been around for 15 years and has done a lot of work with HPC clusters – installing more than 40,000 nodes in the last five years and more than 5,000 nodes since 2006. While this is impressive on its own, what really caught our attention is their new managed services offering for clusters. First, a little background…
While much of the story around supercomputing centers on the massive installations that make up the Top500 list, most of the growth is in the low-end and mid-range cluster market. In fact, this segment is the fastest-growing market in all of IT, according to industry bean counters. By some estimates, 20% of all x86-based systems are sold as part of an HPC-style cluster. It’s a big market.
However, there are a couple of problems. First, many of these clusters are landing in places that don’t have huge data center infrastructures. These sites don’t have enough personnel to take on the additional management load, they don’t know how to diagnose cluster problems, and they don’t know how to optimize cluster performance. Second, they don’t realize that HPC clusters are different animals from standard IT. Sure, they’re all computers, but the skills needed to properly manage, maintain, and operate a cluster aren’t in the comfort zone of most IT administrators – and the people who do know how to do this stuff are neither common nor cheap. Without proper management, monitoring, and configuration, users will see more downtime and lower performance, and it’ll take longer to get the results they need to reach their research or business goals.
X-ISS anticipated this problem and has developed an innovative solution: the first (that we can find, anyway) cluster management service. Their services run the gamut from initial design and deployment to ongoing monitoring, management, and optimization for each site’s workload and user needs. Just as valuable are the alert and reporting mechanisms that can monitor and track every aspect of the cluster – giving admins a heads-up when a problem occurs or giving managers up-to-the-minute utilization figures for the system. More details are at the X-ISS web site and in their press release.
