Sun Microsystems publicly detailed their new Niagara II processor yesterday. Sun announced the processor with an intro presentation titled “ULTRASPARC T2: The Highest Performing, Most Energy Efficient Processor”. Typical understatement from Sun; when will these guys develop some self-esteem? Niagara II is an 8-core, 64-thread CPU that shows considerable improvement over their Niagara I processor. Each core now has its own floating point unit, which bumps FP performance by 10x, according to Sun. They also state that the 1.4GHz Niagara II provides double the throughput of Niagara I at the same clock speed. In addition, Sun highlighted Niagara II’s meager energy requirements, benchmark performance, and integrated I/O. All the details are here.
Sun’s overarching theme (and value proposition) is that Niagara II offers industry-leading power per core and power per thread ratios. Both of these claims are true, but they are not necessarily the entire story. This is certainly an interesting processor, and it will be the foundation of interesting systems. However, we’re not so sure that even a much improved Niagara II processor is truly a good fit for general purpose computing needs. In our view, the processor is limited by Sun’s design choices. Sun decided to integrate both 10Gb Ethernet and 8x PCIe interfaces directly on the processor. There are some good reasons for doing this, with energy efficiency and simplified packaging leading the pack. However, integrating I/O on the processor imposes some limitations on maximum throughput, which loom large when you compare Niagara to competitive offerings from other vendors.
Here are the processor I/O numbers for Niagara II:
- PCIe max throughput: 2.0 GB/sec (provided by Sun)
- 10Gb Ethernet throughput: 1.25 GB/sec (GCG estimate)
- Total Niagara II processor I/O = 2.0 + 1.25 = 3.25 GB/sec

For comparison, here are numbers for IBM and HP:
- POWER6 (dual core): total processor-to-system I/O = 20 GB/sec max throughput
- Itanium2 (dual core): total processor-to-system I/O = 8.5 GB/sec max throughput
We went straight to the vendors to get the numbers, just to be sure and to have someone to blame if the stats are wrong. Intel didn’t get back to us in time; we will post their Xeon dual-core and quad-core numbers when and if we get them.
The differences in processor I/O are profound. Even though Niagara II has many more cores and threads than either POWER6 or Itanium2, it has much less I/O to the system. In other words, POWER6, at 20 GB/sec, can potentially move roughly six times as much data in and out of the CPU as Niagara II's 3.25 GB/sec, while Itanium2, at 8.5 GB/sec, can transfer more than 2.5x as much.
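For anyone who wants to check the arithmetic, here is a quick Python sketch using only the figures quoted above. The per-core split is our own back-of-the-envelope extension based on the stated core counts (8 for Niagara II, 2 each for POWER6 and Itanium2), and the names in the script are purely illustrative. The Ethernet figure assumes the full 10 Gb/sec line rate with no protocol overhead, which is how the 1.25 GB/sec estimate works out.

```python
# Processor I/O figures in GB/sec, as quoted in this post.
NIAGARA2_PCIE = 2.0          # provided by Sun
NIAGARA2_ETHERNET = 10 / 8   # 10 Gb/sec line rate converted to GB/sec (estimate, no overhead)

io_gb_per_sec = {
    "Niagara II": NIAGARA2_PCIE + NIAGARA2_ETHERNET,
    "POWER6": 20.0,
    "Itanium2": 8.5,
}

# Core counts as stated in this post.
cores = {"Niagara II": 8, "POWER6": 2, "Itanium2": 2}


def ratio_vs_niagara(name):
    """How many times Niagara II's total processor I/O the named chip offers."""
    return io_gb_per_sec[name] / io_gb_per_sec["Niagara II"]


def io_per_core(name):
    """Total processor I/O divided evenly across cores, in GB/sec."""
    return io_gb_per_sec[name] / cores[name]


for name in io_gb_per_sec:
    print(
        f"{name}: {io_gb_per_sec[name]:.2f} GB/sec total, "
        f"{ratio_vs_niagara(name):.2f}x Niagara II, "
        f"{io_per_core(name):.2f} GB/sec per core"
    )
```

The totals come out to about 6.2x for POWER6 and 2.6x for Itanium2. Divided evenly per core, the gap looks even wider: roughly 0.4 GB/sec per core for Niagara II versus 10 and 4.25 for the other two. Treat the per-core view as a rough illustration only, since real workloads do not spread I/O evenly across cores.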
How big a deal are these differences in chip I/O? Could be a very big deal indeed, but, like everything else, the degree of importance depends heavily on how the systems are used and on the characteristics of the particular workloads. With light I/O loads and applications that are highly threaded, Niagara II should be a screamer. However, if customers try to use this box as they would any midrange x86 server or as a consolidation target, piling on lots of general purpose workloads (databases, enterprise applications, etc.), they run the risk of outstripping the I/O capacity of the chip. This is why we generally agree with Sun’s competitors when they say that the Niagara-based systems are niche products. However, the niche in question is pretty damn big and is growing.

Summary: Sun’s Niagara II is well-suited to specific workloads and should perform very well on those workloads, but it probably isn’t the right choice for a general purpose consolidation platform.
