Woven released a technical note this week summarizing tests that compared DDR InfiniBand and 10 GbE (using Woven’s 144-port EFX 1000 Ethernet Fabric Switch) on a 36-node system built from quad-socket, dual-core AMD nodes. You can find the full report here.
Despite IB’s theoretical advantages in latency and throughput, Woven’s HPL test shows nearly identical performance and efficiency for IB and for 10 GbE with their switch out to 36 nodes (288 cores). The company is cautious about being too optimistic based on these results, though:
Clearly not every application will perform as well on 10 GE as on InfiniBand, but it should be pointed out that the majority of applications in the HPC space do indeed have a payload profile of moderate packet sizes that is not unfavorable to 10 GE, including industry standard applications such as LSDyna, Fluent, Abaqus, and others. A few applications, such as Amber and Gamess, operate in a communications regime dominated by small packets, and for these, InfiniBand may be a better choice.
Woven goes on to assert that the ubiquity of Ethernet and broad experience with the technology make it the technology of choice as long as there is no performance penalty:
When the multiuse capability of 10 GE for storage and system backbone functions are considered, along with the inherent ease of design, construction and maintenance of a cluster, it should be clear to any system architect that they should be using 10 GE as the standard scenario. 10 GE is a safe and natural starting point for new cluster designs, and moving away from it is indicated only when there is a specific and compelling need.
I don’t actually design systems, but I’m willing to give them the benefit of the doubt on that argument. I suppose what they needed to do with this paper was show that there isn’t a performance disadvantage to using 10 GbE, which is the first step in getting into the market.
Sadly, “hey, we’re not worse than the other guys” isn’t a great marketing strategy, so they can’t stop here.
To my mind they have a much better story to tell about their adaptive routing capabilities, which, as demonstrated in the earlier Sandia tests, can give their Ethernet switch technology real advantages over statically routed fabrics.
The tests in this most recent report were too small to stress the network enough for the adaptive component to matter. I’d really like to see both tests run at real scale; thousands of processors would be ideal. If the results they’ve demonstrated at small scale translate to large-scale systems, Woven’s technology should find its way into a whole bunch of systems.