<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: User Response: 10GbE Cluster Interconnect</title>
	<atom:link href="http://insidehpc.com/2009/01/09/user-response-10gbe-cluster-interconnect/feed/" rel="self" type="application/rss+xml" />
	<link>http://insidehpc.com/2009/01/09/user-response-10gbe-cluster-interconnect/</link>
	<description>HPC News Without the Noise for Supercomputing Professionals &#124; insideHPC</description>
	<lastBuildDate>Sun, 09 Jun 2013 01:54:13 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.1</generator>
	<item>
		<title>By: iggy</title>
		<link>http://insidehpc.com/2009/01/09/user-response-10gbe-cluster-interconnect/#comment-137492</link>
		<dc:creator>iggy</dc:creator>
		<pubDate>Tue, 13 Jan 2009 20:05:06 +0000</pubDate>
		<guid isPermaLink="false">http://insidehpc.com/?p=3452#comment-137492</guid>
		<description>Scott..

Gilad is the VP of Technical Marketing @ Mellanox.  That&#039;s all.</description>
		<content:encoded><![CDATA[<p>Scott..</p>
<p>Gilad is the VP of Technical Marketing @ Mellanox.  That&#8217;s all.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Scott Atchley</title>
		<link>http://insidehpc.com/2009/01/09/user-response-10gbe-cluster-interconnect/#comment-137476</link>
		<dc:creator>Scott Atchley</dc:creator>
		<pubDate>Tue, 13 Jan 2009 19:02:57 +0000</pubDate>
		<guid isPermaLink="false">http://insidehpc.com/?p=3452#comment-137476</guid>
		<description>@Gilad

&lt;blockquote&gt;You can run MX over 10G link layer to get lower latency, but this require MX on both sides,&lt;/blockquote&gt;

As opposed to running IB on one-side? What is on the other side?

&lt;blockquote&gt;and still, it will be higher than the 2u stated above (the best Eth switches are 600ns per switch hop,&lt;/blockquote&gt;

You should try a Fujitsu switch. They are less than 450 ns.

&lt;blockquote&gt;and MX itself is not lower than 2us…,&lt;/blockquote&gt;

Latest NICs, latest CPUs, just under 2 us...

&lt;blockquote&gt;by the way Cisco switch latency is around 3us per hop…. definitely not good for MPI….).&lt;/blockquote&gt;

You said it was fine when you were selling SDR NICs...

&lt;blockquote&gt;When you run 8 jobs at the same time, you will still get the 1us per job, but with 10G the latency will increase with job count. Bottom line, InfiniBand is still the best performance interconnect, and probably will stay like that in the next years.&lt;/blockquote&gt;

Best micro-benchmark performance, perhaps. Let&#039;s talk alltoall on a large fabric. :-)

Scott</description>
		<content:encoded><![CDATA[<p>@Gilad</p>
<blockquote><p>You can run MX over 10G link layer to get lower latency, but this require MX on both sides,</p></blockquote>
<p>As opposed to running IB on one-side? What is on the other side?</p>
<blockquote><p>and still, it will be higher than the 2u stated above (the best Eth switches are 600ns per switch hop,</p></blockquote>
<p>You should try a Fujitsu switch. They are less than 450 ns.</p>
<blockquote><p>and MX itself is not lower than 2us…,</p></blockquote>
<p>Latest NICs, latest CPUs, just under 2 us&#8230;</p>
<blockquote><p>by the way Cisco switch latency is around 3us per hop…. definitely not good for MPI….).</p></blockquote>
<p>You said it was fine when you were selling SDR NICs&#8230;</p>
<blockquote><p>When you run 8 jobs at the same time, you will still get the 1us per job, but with 10G the latency will increase with job count. Bottom line, InfiniBand is still the best performance interconnect, and probably will stay like that in the next years.</p></blockquote>
<p>Best micro-benchmark performance, perhaps. Let&#8217;s talk alltoall on a large fabric. <img src='http://insidehpc.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>Scott</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: john casu</title>
		<link>http://insidehpc.com/2009/01/09/user-response-10gbe-cluster-interconnect/#comment-136839</link>
		<dc:creator>john casu</dc:creator>
		<pubDate>Sun, 11 Jan 2009 20:59:48 +0000</pubDate>
		<guid isPermaLink="false">http://insidehpc.com/?p=3452#comment-136839</guid>
		<description>For now, you&#039;re absolutely right.  Infiniband is going to be the cost leader for at least the next 18 months.

But once 10GbE over CAT takes hold (and it is a question of when, not if, imho), it&#039;ll happen very quickly, because 10GbaseT is a natural evolution of the overwhelmingly ubiquitous network technology whose use is driven by so many things outside HPC.

Actually, I&#039;m going to change my point slightly, because I think 10GbE dominance is also dependent on when Intel &amp; Broadcom really decide to get into the market, and when Dell decides it&#039;s time for 10GbE.
  
Also, in my experience, there&#039;s an innate resistance to 10GbaseT, among the 10GbE startups, precisely because it will drive prices way down.

The real question, in my mind, is that when 10GbE does become ubiquitous, will it still be considered a high speed interconnect technology?  Will Mellanox, Myricom and others have moved on to the next great thing?  I hope so.</description>
		<content:encoded><![CDATA[<p>For now, you&#8217;re absolutely right.  Infiniband is going to be the cost leader for at least the next 18 months.</p>
<p>But once 10GbE over CAT takes hold (and it is a question of when, not if, imho), it&#8217;ll happen very quickly, because 10GbaseT is a natural evolution of the overwhelmingly ubiquitous network technology whose use is driven by so many things outside HPC.</p>
<p>Actually, I&#8217;m going to change my point slightly, because I think 10GbE dominance is also dependent on when Intel &amp; Broadcom really decide to get into the market, and when Dell decides it&#8217;s time for 10GbE.</p>
<p>Also, in my experience, there&#8217;s an innate resistance to 10GbaseT, among the 10GbE startups, precisely because it will drive prices way down.</p>
<p>The real question, in my mind, is that when 10GbE does become ubiquitous, will it still be considered a high speed interconnect technology?  Will Mellanox, Myricom and others have moved on to the next great thing?  I hope so.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jeff Layton</title>
		<link>http://insidehpc.com/2009/01/09/user-response-10gbe-cluster-interconnect/#comment-136762</link>
		<dc:creator>Jeff Layton</dc:creator>
		<pubDate>Sun, 11 Jan 2009 14:45:44 +0000</pubDate>
		<guid isPermaLink="false">http://insidehpc.com/?p=3452#comment-136762</guid>
		<description>@John Casu - I hope you&#039;re right, I really really do. I would love to have a high-speed network like 10GigE on my systems at GigE prices. But for the coming future I can get inexpensive SDR on my systems at a price point that 10GigE can&#039;t touch.

I&#039;ve been waiting for almost 5 years for inexpensive 10GigE. Every year, the vendors keep saying, &quot;it&#039;s here, it&#039;s here!&quot; and the costs just aren&#039;t dropping fast enough.

I lived through the GigE price drop and that was fairly easy to see coming. But I just can&#039;t see inexpensive 10GigE coming. The NICs are still too expensive and the switch costs are just too high (As I mentioned before, looking at 24-port switch prices for 10GigE is misleading at best. Building multi-tiered Ethernet switches from 24-port switches will just kill performance).

So I&#039;m hoping someone comes along and sprinkles magic pixie dust on 10GigE and the prices drop to an acceptable level. Until then I just don&#039;t see it being competitive with IB in HPC.

BTW - in the company I work for, we are seeing a resurgence in IB on the enterprise side because of the performance for systems with lots of VM&#039;s and the fact that 10GigE is just not coming down in price like people would like.</description>
		<content:encoded><![CDATA[<p>@John Casu &#8211; I hope you&#8217;re right, I really really do. I would love to have a high-speed network like 10GigE on my systems at GigE prices. But for the coming future I can get inexpensive SDR on my systems at a price point that 10GigE can&#8217;t touch.</p>
<p>I&#8217;ve been waiting for almost 5 years for inexpensive 10GigE. Every year, the vendors keep saying, &#8220;it&#8217;s here, it&#8217;s here!&#8221; and the costs just aren&#8217;t dropping fast enough.</p>
<p>I lived through the GigE price drop and that was fairly easy to see coming. But I just can&#8217;t see inexpensive 10GigE coming. The NICs are still too expensive and the switch costs are just too high (As I mentioned before, looking at 24-port switch prices for 10GigE is misleading at best. Building multi-tiered Ethernet switches from 24-port switches will just kill performance).</p>
<p>So I&#8217;m hoping someone comes along and sprinkles magic pixie dust on 10GigE and the prices drop to an acceptable level. Until then I just don&#8217;t see it being competitive with IB in HPC.</p>
<p>BTW &#8211; in the company I work for, we are seeing a resurgence in IB on the enterprise side because of the performance for systems with lots of VM&#8217;s and the fact that 10GigE is just not coming down in price like people would like.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: john casu</title>
		<link>http://insidehpc.com/2009/01/09/user-response-10gbe-cluster-interconnect/#comment-136598</link>
		<dc:creator>john casu</dc:creator>
		<pubDate>Sat, 10 Jan 2009 20:46:46 +0000</pubDate>
		<guid isPermaLink="false">http://insidehpc.com/?p=3452#comment-136598</guid>
		<description>Here&#039;s the thing:  While Infiniband is here to stay, the ubiquity of 10GbE is inevitable... it&#039;s just a question of time.   And the tipping point is going to be 10GbE over RJ-45, and multi-speed NICs in servers.

And no-one should underestimate the importance of being able to run your interconnect over CAT-6/7, if only for the reason that you can cut your own cables with CAT, as opposed to having to buy expensive fibre/cx-4 cabling, that cannot easily be cut to length nor repaired in the field.  Hell, cables are the single main reason the blade business exists.

Remember, we&#039;ve seen this before with regular GbE.  One minute, each NIC is $700, and the switches are hugely expensive, and the next the NICs are embedded on servers, and we&#039;re buying 24port switches from Dell at less than $200/port.   Quanta is already building low cost 24-port 10GbE switches for their OEMs.

Those who forget their history are doomed to repeat it.</description>
		<content:encoded><![CDATA[<p>Here&#8217;s the thing:  While Infiniband is here to stay, the ubiquity of 10GbE is inevitable&#8230; it&#8217;s just a question of time.   And the tipping point is going to be 10GbE over RJ-45, and multi-speed NICs in servers.</p>
<p>And no-one should underestimate the importance of being able to run your interconnect over CAT-6/7, if only for the reason that you can cut your own cables with CAT, as opposed to having to buy expensive fibre/cx-4 cabling, that cannot easily be cut to length nor repaired in the field.  Hell, cables are the single main reason the blade business exists.</p>
<p>Remember, we&#8217;ve seen this before with regular GbE.  One minute, each NIC is $700, and the switches are hugely expensive, and the next the NICs are embedded on servers, and we&#8217;re buying 24port switches from Dell at less than $200/port.   Quanta is already building low cost 24-port 10GbE switches for their OEMs.</p>
<p>Those who forget their history are doomed to repeat it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Gilad</title>
		<link>http://insidehpc.com/2009/01/09/user-response-10gbe-cluster-interconnect/#comment-136564</link>
		<dc:creator>Gilad</dc:creator>
		<pubDate>Sat, 10 Jan 2009 17:14:32 +0000</pubDate>
		<guid isPermaLink="false">http://insidehpc.com/?p=3452#comment-136564</guid>
		<description>There are items here that need to be fixed, since they are wrong. I need to say that I am working in a company that sells both InfiniBand and 10G Ethernet.

- Price - switch prices depend on the switch configuration and the margin that the vendor takes. You can find 24 port IB DDR switches from $3000 to $4000 to $5000. The statement about a 24 port IB DDR switch at $16K seems unrealistic (else it is a golden switch....). The 10G 24 port switches are in the range of $16k-$24k, so if you look at price, IB is still much cheaper. Even IB QDR is cheaper than that. 

- Performance - 10G is 10Gb/s real data rate, IB DDR is 16Gb/s and QDR is 32Gb/s. Latency - if you use standard IB vs standard 10G, it is 1us on IB vs 8-10us with 10G. You can run MX over 10G link layer to get lower latency, but this requires MX on both sides, and still, it will be higher than the 2us stated above (the best Eth switches are 600ns per switch hop, and MX itself is not lower than 2us…, by the way Cisco switch latency is around 3us per hop…. definitely not good for MPI….). When you run 8 jobs at the same time, you will still get the 1us per job, but with 10G the latency will increase with job count. Bottom line, InfiniBand is still the best performance interconnect, and probably will stay like that for the next few years.

- Woven - unlike 10G NICs, you can find a variety of IB adapters, from SDR to QDR, and from PCIe x4 Gen1 to PCIe x8 Gen2. There are multiple flavors, each with different latency capabilities as well. If you take the lowest performance DDR card (and the cheapest of course), and run it in a PCIe x4 interface, you will get lower bandwidth than 10G.... marketing tricks to show that 10G is not worse than IB. 

- RDMA - not all applications use RDMA, but some do for good reasons. Not many people know that you get zero copy with InfiniBand if you are using RDMA OR Send/Receive. The difference between RDMA and Send/Receive on IB is in the CPU overhead on the remote side (with RDMA the remote CPU is not involved in the data transaction). 

- 10GBaseT - this was the promise of 10G - use the &quot;same&quot; cables as 1G. Good motivation, the cables will be cheaper, but the switches cost much, much, much more. Do your own math. If you need the cat6 cables, go with the 10GBaseT, but the switch/NICs will be more expensive than the SFP+ ones. 

- Cables - most installations will go SFP+ for 10G, CX4 for IB DDR and QSFP for IB QDR. At the end of the day, the cable cost is the same for both technologies.


Enough for now ..... :-)</description>
		<content:encoded><![CDATA[<p>There are items here that need to be fixed, since they are wrong. I need to say that I am working in a company that sells both InfiniBand and 10G Ethernet.</p>
<p>- Price &#8211; switch prices depend on the switch configuration and the margin that the vendor takes. You can find 24 port IB DDR switches from $3000 to $4000 to $5000. The statement about a 24 port IB DDR switch at $16K seems unrealistic (else it is a golden switch&#8230;.). The 10G 24 port switches are in the range of $16k-$24k, so if you look at price, IB is still much cheaper. Even IB QDR is cheaper than that. </p>
<p>- Performance &#8211; 10G is 10Gb/s real data rate, IB DDR is 16Gb/s and QDR is 32Gb/s. Latency &#8211; if you use standard IB vs standard 10G, it is 1us on IB vs 8-10us with 10G. You can run MX over 10G link layer to get lower latency, but this requires MX on both sides, and still, it will be higher than the 2us stated above (the best Eth switches are 600ns per switch hop, and MX itself is not lower than 2us…, by the way Cisco switch latency is around 3us per hop…. definitely not good for MPI….). When you run 8 jobs at the same time, you will still get the 1us per job, but with 10G the latency will increase with job count. Bottom line, InfiniBand is still the best performance interconnect, and probably will stay like that for the next few years.</p>
<p>- Woven &#8211; unlike 10G NICs, you can find a variety of IB adapters, from SDR to QDR, and from PCIe x4 Gen1 to PCIe x8 Gen2. There are multiple flavors, each with different latency capabilities as well. If you take the lowest performance DDR card (and the cheapest of course), and run it in a PCIe x4 interface, you will get lower bandwidth than 10G&#8230;. marketing tricks to show that 10G is not worse than IB. </p>
<p>- RDMA &#8211; not all applications use RDMA, but some do for good reasons. Not many people know that you get zero copy with InfiniBand if you are using RDMA OR Send/Receive. The difference between RDMA and Send/Receive on IB is in the CPU overhead on the remote side (with RDMA the remote CPU is not involved in the data transaction). </p>
<p>- 10GBaseT &#8211; this was the promise of 10G &#8211; use the &#8220;same&#8221; cables as 1G. Good motivation, the cables will be cheaper, but the switches cost much, much, much more. Do your own math. If you need the cat6 cables, go with the 10GBaseT, but the switch/NICs will be more expensive than the SFP+ ones. </p>
<p>- Cables &#8211; most installations will go SFP+ for 10G, CX4 for IB DDR and QSFP for IB QDR. At the end of the day, the cable cost is the same for both technologies.</p>
<p>Enough for now &#8230;.. <img src='http://insidehpc.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Terry Hulett</title>
		<link>http://insidehpc.com/2009/01/09/user-response-10gbe-cluster-interconnect/#comment-136556</link>
		<dc:creator>Terry Hulett</dc:creator>
		<pubDate>Sat, 10 Jan 2009 16:01:28 +0000</pubDate>
		<guid isPermaLink="false">http://insidehpc.com/?p=3452#comment-136556</guid>
		<description>Latency on a quiesced system with one active connection is 2-3X higher today on iWARP (RDMA/TCP/Ethernet) than it is on IB.  However, with an active system and 8 simultaneously active connections the latency difference is indistinguishable.  This observation is backed by the fact that many (if not most) applications have similar run times on identical clusters with the two different interconnects.

It is the case today that 10GbE is more expensive to deploy than IB.  Therefore, it is incumbent on the DC manager to decide between deployment costs, TCO and manageability.</description>
		<content:encoded><![CDATA[<p>Latency on a quiesced system with one active connection is 2-3X higher today on iWARP (RDMA/TCP/Ethernet) than it is on IB.  However, with an active system and 8 simultaneously active connections the latency difference is indistinguishable.  This observation is backed by the fact that many (if not most) applications have similar run times on identical clusters with the two different interconnects.</p>
<p>It is the case today that 10GbE is more expensive to deploy than IB.  Therefore, it is incumbent on the DC manager to decide between deployment costs, TCO and manageability.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Scott Atchley</title>
		<link>http://insidehpc.com/2009/01/09/user-response-10gbe-cluster-interconnect/#comment-136378</link>
		<dc:creator>Scott Atchley</dc:creator>
		<pubDate>Fri, 09 Jan 2009 23:34:51 +0000</pubDate>
		<guid isPermaLink="false">http://insidehpc.com/?p=3452#comment-136378</guid>
		<description>&quot;Also, aren’t the latencies for even 10GbE considerably higher than IB &quot;

You are confusing Ethernet and TCP/IP over Ethernet. With MX over Ethernet and a good low-latency switch, you can get 2 us.</description>
		<content:encoded><![CDATA[<p>&#8220;Also, aren’t the latencies for even 10GbE considerably higher than IB&#8221;</p>
<p>You are confusing Ethernet and TCP/IP over Ethernet. With MX over Ethernet and a good low-latency switch, you can get 2 us.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Serge Polevitzky</title>
		<link>http://insidehpc.com/2009/01/09/user-response-10gbe-cluster-interconnect/#comment-136371</link>
		<dc:creator>Serge Polevitzky</dc:creator>
		<pubDate>Fri, 09 Jan 2009 23:11:29 +0000</pubDate>
		<guid isPermaLink="false">http://insidehpc.com/?p=3452#comment-136371</guid>
		<description>... isn&#039;t Infiniband (IB) full duplex? ... so even if you have to pay the 10-to-8 reduction penalty (and there are plenty of preamble and postamble ethernet overhead bits you need to push, too),  you can potentially get data flowing simultaneously in both directions.  So this would seem to be a plus for IB.  Also, aren&#039;t the latencies for even 10GbE considerably higher than IB ?  -- FWIW, Serge</description>
		<content:encoded><![CDATA[<p>&#8230; isn&#8217;t Infiniband (IB) full duplex? &#8230; so even if you have to pay the 10-to-8 reduction penalty (and there are plenty of preamble and postamble ethernet overhead bits you need to push, too),  you can potentially get data flowing simultaneously in both directions.  So this would seem to be a plus for IB.  Also, aren&#8217;t the latencies for even 10GbE considerably higher than IB ?  &#8212; FWIW, Serge</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: john casu</title>
		<link>http://insidehpc.com/2009/01/09/user-response-10gbe-cluster-interconnect/#comment-136347</link>
		<dc:creator>john casu</dc:creator>
		<pubDate>Fri, 09 Jan 2009 21:55:18 +0000</pubDate>
		<guid isPermaLink="false">http://insidehpc.com/?p=3452#comment-136347</guid>
		<description>I used to do technical marketing @ Woven, and during that time did some benchmarks comparing 10GbE and DDR Infiniband.

Running HPL over 10GbE (NetEffect NICs + Woven Switch), and HPL over 4X DDR produced results that were indistinguishable from each other, at a reasonable node count.

That&#039;s not to say that Infiniband isn&#039;t faster than 10GbE, but it does suggest that when you take everything into account (PCI-E bus, driver overhead/efficiency, etc...), with real or near-real applications, using moderate packet sizes, the actual difference between the two is minimal for most real-world cases.

Of course, that should change somewhat as PCI-Express 2.0 and QDR Infiniband gain market traction, but I&#039;m guessing that in real applications, unless you&#039;re running something like Amber, it won&#039;t make much difference, performance wise.</description>
		<content:encoded><![CDATA[<p>I used to do technical marketing @ Woven, and during that time did some benchmarks comparing 10GbE and DDR Infiniband.</p>
<p>Running HPL over 10GbE (NetEffect NICs + Woven Switch), and HPL over 4X DDR produced results that were indistinguishable from each other, at a reasonable node count.</p>
<p>That&#8217;s not to say that Infiniband isn&#8217;t faster than 10GbE, but it does suggest that when you take everything into account (PCI-E bus, driver overhead/efficiency, etc&#8230;), with real or near-real applications, using moderate packet sizes, the actual difference between the two is minimal for most real-world cases.</p>
<p>Of course, that should change somewhat as PCI-Express 2.0 and QDR Infiniband gain market traction, but I&#8217;m guessing that in real applications, unless you&#8217;re running something like Amber, it won&#8217;t make much difference, performance wise.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
