QLogic Aims for the Fences with InfiniBand Fabric Suite

Before you mark this as “just another InfiniBand press release,” you might want to reconsider.  I had the pleasure of speaking this week with Phil Murphy, VP of QLogic’s Network Solutions Group.  The Network Solutions Group heads up the goodness that is QLogic’s TrueScale InfiniBand product suite.  Those who have been around the InfiniBand block will remember that this group was formerly its own company, a startup called PathScale.  QLogic acquired the startup and pumped it full of funding and corporate clout with the fabs.  After several years of work, what they have is a high-bandwidth, low-latency interconnect that looks like InfiniBand, smells like InfiniBand, but runs like a scalded cat.

Our conversation got off to a quick start with a bit of InfiniBand history.  InfiniBand was originally designed as a data center consolidation product; the ambitious dream of the early adopters was to carry Ethernet, Fibre Channel, and even PCI over the same PHY.  As such, the early protocol stacks reflected the idea of encapsulating multiple framed or packetized network layers over a single interconnect.  It is exactly the sort of design that most HPC network gurus cringe at.

Fast forward to 2010.  QLogic has decided to change the face of their InfiniBand network stack.  Rather than barreling down the path of “queue-pair” style InfiniBand communication [Verbs, for those in the know], they have implemented a new connectionless and stateless communication primitive.  The new software layer allows applications to send literally millions of concurrent messages without paying a heavy setup penalty.  How many millions?  According to Phil, traditional InfiniBand products peak at around 7 million messages per second, while QLogic’s new stack will hit 30 million messages per second.
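Message-rate numbers like these are usually measured with a windowed stream of tiny nonblocking sends.  Below is a minimal sketch of that style of microbenchmark in plain MPI (the layer most applications will actually sit on top of PSM with); the window size, message size, and iteration count are arbitrary illustrative choices, not QLogic’s or Phil’s methodology.

    /* Minimal MPI message-rate sketch: rank 0 streams windows of small
     * nonblocking sends to rank 1 and reports messages per second.
     * WINDOW, MSG_SIZE, and ITERS are illustrative choices. */
    #include <mpi.h>
    #include <stdio.h>

    #define WINDOW   64
    #define MSG_SIZE 8         /* tiny messages stress message rate, not bandwidth */
    #define ITERS    100000

    int main(int argc, char **argv)
    {
        int rank;
        char sbuf[WINDOW][MSG_SIZE] = {{0}}, rbuf[WINDOW][MSG_SIZE];
        MPI_Request reqs[WINDOW];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double t0 = MPI_Wtime();
        if (rank < 2) {
            for (int it = 0; it < ITERS; it++) {
                for (int w = 0; w < WINDOW; w++) {
                    if (rank == 0)
                        MPI_Isend(sbuf[w], MSG_SIZE, MPI_CHAR, 1, 0,
                                  MPI_COMM_WORLD, &reqs[w]);
                    else
                        MPI_Irecv(rbuf[w], MSG_SIZE, MPI_CHAR, 0, 0,
                                  MPI_COMM_WORLD, &reqs[w]);
                }
                MPI_Waitall(WINDOW, reqs, MPI_STATUSES_IGNORE);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("%.1f million messages/sec\n",
                   ITERS * (double)WINDOW / (t1 - t0) / 1e6);

        MPI_Finalize();
        return 0;
    }

Run it with two ranks (for example, mpirun -np 2 ./msgrate).  The point of a connectionless layer like PSM is to keep the per-message software cost low enough that this kind of rate keeps climbing instead of stalling on connection state.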

QLogic accomplishes all this by going down into the guts of InfiniBand routing and QoS metrics in order to tune the fabric for a myriad of different message classes.  Hammering a disk subsystem with large blocks?  They can do that.  Hitting a neighboring node with billions of small messages?  They do that too.  With IFS 6.0 they’ve wrapped up the following additional features (a short sketch of how classes of service surface at the Verbs layer follows the list):

  • Virtual Fabrics combined with application-specific CoS, which automatically dedicates classes of service within the fabric to ensure the desired level of bandwidth and appropriate priority is applied to each application. In addition, the virtual fabrics capability helps eliminate manual provisioning of application services across the fabric, significantly reducing management time and costs.
  • Adaptive Routing continually monitors application messaging patterns and selects the optimum path for each traffic flow, eliminating slowdowns caused by pathway bottlenecks.
  • Dispersive Routing, which load-balances traffic among multiple pathways and uses QLogic® Performance Scaled Messaging (PSM) to automatically ensure that packets arrive at their destination for rapid processing. Dispersive Routing leverages the entire fabric to ensure maximum communications performance for all jobs, even in the presence of other messaging-intensive applications.
  • Full leverage of vendor-specific message passing interface (MPI) libraries to maximize MPI application performance. All supported MPIs can take advantage of IFS’s pipelined data transfer mechanism, which was specifically designed for MPI communication semantics, as well as additional enhancements such as Dispersive Routing.
  • Full support for additional HPC network topologies, including torus and mesh as well as fat tree, with enhanced capabilities for failure handling. Alternative topologies like torus and mesh help users reduce networking costs as clusters scale beyond a few hundred nodes, and IFS 6.0 ensures that these users have full access to advanced traffic management features in these complex networking environments.
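For context on where those classes of service ultimately land: in standard InfiniBand, each flow carries a service level (SL) that the fabric maps onto a virtual lane and its QoS policy.  The hedged sketch below shows the Verbs-level spot where an SL would be pinned to a reliable connection as it is brought to the ready-to-receive state; the MTU, port, LID, and timer values are placeholders, and the point of the Virtual Fabrics feature above is that IFS provisions this per application rather than leaving it to hand-written code like this.

    /* Illustrative Verbs snippet: the service level (SL) rides in the
     * address vector applied when a QP transitions to RTR.  All values
     * here (MTU, port, timers) are placeholders, not QLogic settings. */
    #include <infiniband/verbs.h>
    #include <string.h>

    int bring_qp_to_rtr_with_sl(struct ibv_qp *qp, uint16_t dest_lid,
                                uint32_t dest_qpn, uint8_t sl)
    {
        struct ibv_qp_attr attr;
        memset(&attr, 0, sizeof(attr));

        attr.qp_state              = IBV_QPS_RTR;
        attr.path_mtu              = IBV_MTU_2048;
        attr.dest_qp_num           = dest_qpn;
        attr.rq_psn                = 0;
        attr.max_dest_rd_atomic    = 1;
        attr.min_rnr_timer         = 12;
        attr.ah_attr.is_global     = 0;
        attr.ah_attr.dlid          = dest_lid;
        attr.ah_attr.sl            = sl;    /* the per-flow class of service */
        attr.ah_attr.src_path_bits = 0;
        attr.ah_attr.port_num      = 1;

        return ibv_modify_qp(qp, &attr,
                             IBV_QP_STATE | IBV_QP_AV | IBV_QP_PATH_MTU |
                             IBV_QP_DEST_QPN | IBV_QP_RQ_PSN |
                             IBV_QP_MAX_DEST_RD_ATOMIC |
                             IBV_QP_MIN_RNR_TIMER);
    }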

QLogic has gone well out of their way to make InfiniBand even more HPC-friendly.  So much so that Dell, IBM, HP and SGI have already signed up to resell or OEM the new gear.  Keep an eye on the continued changes across the QLogic InfiniBand landscape.  This could prove to change HPC interconnects as we know them.

Correction: SGI remains a Voltaire customer for InfiniBand products.

Comments

  1. There are a few missing points and some wrong data. “Traditional” InfiniBand does not peak at 7M messages per second at all; the numbers demonstrated are much, much higher. Furthermore, the approach described here requires many CPU cycles to drive the network, so in real application workloads you will see many issues. You can refer to http://www.hpcwire.com/features/Network-based-Processing-Versus-Host-based-Processing-93961524.html for more info.

    Second, the queue pair is the InfiniBand advantage. It allows creating 16M virtual NICs with full protection and isolation between them. But if you avoid that, you put a big load on the CPU, and it becomes more like using TCP for HPC. Not very efficient. InfiniBand's target was to provide the most efficient interconnect, and this is what other vendors provide.
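For readers less familiar with the queue-pair model the comment describes, here is a minimal Verbs sketch of the per-connection state an application sets up before it can send anything over a reliable connection; the device choice, queue depths, and completion-queue size are arbitrary, and a real program would still have to exchange addresses out of band and walk the QP through its connection states.

    /* Illustrative sketch of per-connection Verbs state: device context,
     * protection domain, completion queue, and a reliable-connected QP.
     * Sizes and the choice of device are arbitrary; error checks omitted. */
    #include <infiniband/verbs.h>
    #include <stddef.h>

    struct ibv_qp *open_one_connection(void)
    {
        struct ibv_device **devs = ibv_get_device_list(NULL);
        struct ibv_context *ctx  = ibv_open_device(devs[0]);
        struct ibv_pd      *pd   = ibv_alloc_pd(ctx);
        struct ibv_cq      *cq   = ibv_create_cq(ctx, 256, NULL, NULL, 0);

        struct ibv_qp_init_attr init = {
            .send_cq = cq,
            .recv_cq = cq,
            .cap     = { .max_send_wr = 128, .max_recv_wr = 128,
                         .max_send_sge = 1,  .max_recv_sge = 1 },
            .qp_type = IBV_QPT_RC,   /* one reliable-connected QP per peer */
        };
        struct ibv_qp *qp = ibv_create_qp(pd, &init);

        ibv_free_device_list(devs);
        /* Still to do: exchange LIDs/QPNs with the peer and drive the QP
         * through INIT -> RTR -> RTS before any message can be posted. */
        return qp;
    }

Whether that per-peer state is a cost (setup time and memory footprint at scale) or a benefit (hardware-enforced isolation and offload, as the commenter argues) is exactly the debate between the two approaches described in this article.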