Video: Scalable Informatics Steps Up IO at SC14

In this video from SC14, Joe Landman from Scalable Informatics showcases the company’s innovative IO solutions for high performance computing.

“By working with ThinkParQ, we have been able to leverage one of the best and highest-performance storage systems for scale-out deployment,” said Dr. Joseph Landman, CEO of Scalable Informatics. “When testing a write-dominated workload using fio, IOR, and io-bm, a single rack of FastPath Unison with BeeGFS running on spinning disks sustained in excess of 40 GB/s for multi-terabyte-sized writes, far outside of cache. This level of performance comes from the combination of the FastPath Unison hardware design, the Scalable Informatics Operating System (SIOS), and the excellent BeeGFS filesystem. Individual spinning-disk storage nodes in a Unison system provide similar performance to a rack full of other vendors’ products, while flash-based nodes in FastPath Unison provide much higher performance at similar densities.”
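
For readers who want a sense of what a write-dominated test of this kind looks like in practice, here is a minimal fio sketch. The job parameters below (block size, file size, thread count, and the BeeGFS mount point) are illustrative assumptions, not the configuration Scalable Informatics actually ran.

```python
import subprocess

# Minimal sketch of a write-dominated streaming test driven from Python.
# All parameters are assumptions for illustration, not the vendor's job file.
fio_cmd = [
    "fio",
    "--name=seq-write",
    "--rw=write",                      # sequential writes only
    "--bs=1m",                         # 1 MiB blocks
    "--size=2t",                       # multi-terabyte per job, far outside cache
    "--numjobs=16",                    # 16 writer threads
    "--iodepth=32",
    "--ioengine=libaio",
    "--direct=1",                      # O_DIRECT, bypass the page cache
    "--directory=/mnt/beegfs/bench",   # hypothetical BeeGFS mount point
    "--group_reporting",
]
subprocess.run(fio_cmd, check=True)    # aggregate bandwidth is reported at the end
```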

Full Transcript:

insideHPC: What have you guys got going here in your booth?

Joe Landman: We have a lot of very good, very interesting stuff. What we have is something we’re jokingly, but not so jokingly, calling our portable petabyte. Could you imagine a portable petabyte? It’s this very large rack here – not so large, really – and you can put a handle on it and roll it up to the plane and put it in the overhead. Well, maybe not really, because it’s 1,000 pounds.

insideHPC: Well, maybe on Southwest.

Joe Landman: Exactly. And it’ll only cost you 25 bucks, too. But what we have in this is a petabyte of storage, and this petabyte is interconnected with FDR InfiniBand and will sustain 20 gigabytes a second in this little rack. Its big brother is at a site in California where these guys are sustaining 40 gigabytes a second, and they’re doing it by grabbing images from large cameras and dumping them out as terabyte-sized I/Os at 40 gigabytes a second. The small guy is 20 gigabytes a second, and we’re selling this as a special right now – a fully supported, fully delivered parallel file system, turn-key operation, for $250,000.

insideHPC: For a quarter million, what is inside there? Is there spinning rust, or is there SSD, or what?

Joe Landman: This is spinning rust. We use SSD for metadata, but it is 100% spinning rust in the data storage pipeline. It’s a fantastic system. We’ll talk about the flash systems in a second.

insideHPC: You said there is Lustre in there, and what else?

Joe Landman: There’s a parallel file system. We’re not using Lustre in the base system, but we can use Lustre. What we’re actually using is BeeGFS. It’s a fantastic parallel file system – gives us much better single thread performance and the scale-out is phenomenal.

insideHPC: Having done this myself, when you try to Google BeeGFS, you have a lot of trouble. Fraunhofer FS, which is the previous name of it, is easier to find. Anyway – BeeGFS.

Joe Landman: Yes, BeeGFS. And curiously enough, these guys are located right next to us in the booth space. It’s been very good for cross-fertilization – a great discussion. We send people back and forth. The BeeGFS guys say, “Hey, go over there and look at a petabyte of BeeGFS.” And we say, “Hey, go over there and talk about BeeGFS. Get the T-shirt – and they have awesome coffee, too.”

insideHPC: What else is going on? Can you show us the other stuff in your booth?

Joe Landman: Yes, certainly. We have a next-generation version of what we call our siRouter. The siRouter is this little device down here. In this unit, we have 108 1-gig ports, 16 10-gig ports, and, I believe, two 40-gig ports on the rear of it. Its purpose in life is actually not to be a switch or a router – it’s an extraordinarily fast NAT device. This device is basically used by Cloud folks, by anyone who has a large cage where you want to bring in fibers from other organizations and have them go into your fabric in a software-defined manner. This is truly, at the most basic level, a software-defined networking product. In this unit, we have a fair amount of computing and memory as well as storage capability. We light up VMs and containers with specific functions. It’s similar in concept to what the folks at [POJO?] data have done, but we were doing this a little bit before they were. This is in use by our friends over at Lucera Financial Services as the base of their SDN system. It’s a fantastic product and gives them absolute control over where packets go.

insideHPC: Joe, when I think of Scalable Informatics, I think about fast I/O, but I don’t really think of switches. Why is this here? Why do you need that?

Joe Landman: Well, it turns out our friends over at Lucera had asked us many years ago, “Can you build something that NATs packets into our Cloud as fast as possible?” My first reaction was, “Let’s just go and buy a switch from Cisco, or go buy something from Arista.” And then when we looked at the specs with the customer, the customer said, “These aren’t going to work. This is what I need.” And I said, “No, it’s impossible to do.” And their CEO said, “No, it’s not impossible. Figure out how to do it.” Our first version of it was actually a gargantuan box, a 4U box, this ugly thing. We brought it in, we started playing with it, and these guys go away for a while, they come back, and they say, “Performance on that is better than anything we can get on the market.” We’re like, “Really? Okay, that’s good. Let’s try the next version.”

The second version was actually a 2U unit with everything rear-mounted, and we had 44 ports, and they played around with it. They were getting four microseconds port-to-port on the NAT, full wire speed for as many connections as they needed, and they were doing this as an SNAT, as a DNAT, and a bunch of other things. Now we’ve got a 108-port unit, and this 108-port unit is a much denser configuration. We’re not trying to build a switch here. We’re not trying to build a router, even though we can run routing processes in it – we can run BGP and RIP and everything else. What we’re trying to do is just build an awesomely fast on-ramp for Clouds. This is basically a cross-connection box, a very dense cross-connection box. Right now, we have 1G and 10G and 40G, but we’ve got customers asking us to do 10G, 40G, and 100G. And 10G, 40G, and InfiniBand, and all these other sorts of very interesting cross-connection capabilities.
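
To make the SNAT/DNAT terminology above concrete, the translation can be pictured as a lookup table keyed on the flow. The sketch below is purely conceptual – the addresses and table layout are invented for illustration, and the actual siRouter does this at wire speed far below the level of Python.

```python
# Toy model of source/destination NAT -- conceptual only, not the siRouter's
# data path. Addresses come from documentation/private ranges (RFC 5737/1918).

# DNAT: rewrite where an inbound packet is going (public service -> internal VM).
dnat_table = {
    ("203.0.113.10", 443): ("10.20.0.5", 443),
}

# SNAT: rewrite where an outbound packet appears to come from.
snat_table = {
    ("10.20.0.5", 443): ("203.0.113.10", 443),
}

def dnat(packet):
    """Destination NAT: map the public destination to the internal one."""
    key = (packet["dst_ip"], packet["dst_port"])
    if key in dnat_table:
        packet["dst_ip"], packet["dst_port"] = dnat_table[key]
    return packet

def snat(packet):
    """Source NAT: present the internal source under its public address."""
    key = (packet["src_ip"], packet["src_port"])
    if key in snat_table:
        packet["src_ip"], packet["src_port"] = snat_table[key]
    return packet

inbound = {"src_ip": "198.51.100.7", "src_port": 50312,
           "dst_ip": "203.0.113.10", "dst_port": 443}
print(dnat(inbound))  # destination now points at the internal container
```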

insideHPC: In the case of Lucera, this is one of your customers. Are they selling this as a solution that they just plug into in their Cloud? How do they deliver it to their customers?

Joe Landman: Certainly. What they do is they light up the virtual machines and containers that end users will be able to connect to from their cages themselves. Their cages may be in NY4 or NY2. They run a fiber into this, and then from there it goes into the Lucera backbone network, and that is NATed directly over to their virtual machines or containers. So they get a virtual private Cloud with full security and–

insideHPC: Right, because security and latency are king for those guys, right?

Joe Landman: Absolutely. They want their processes running in the Cloud within four microseconds and being able to hit that market. So that’s what they’ve got with that, and they want to be able to access those processes at very high speed from their own cages. That’s what this gives them.

insideHPC: Excellent. What else have we got here? We’ve got some storage device here – the siFlash.

Joe Landman: The siFlash storage device. This is actually the basis of our FastPath Cadence appliance as well as several of our other appliances. What siFlash is, is basically a very large– think of it as a flash array, but it’s not an array in the traditional sense. As with everything else we do, we have a very tight coupling between a computing system in the back and the I/O system in the front. And the I/O system here, we have up to– take the cover off and you can see the inside.

insideHPC: Yeah, let’s look inside. Whoa.

Joe Landman: We have 64 solid-state drives, and the solid-state drives can be pretty much any size you’d like. We can go from very small systems at 100 gigabytes per drive up through, currently, four terabytes per drive, which gives us 240 terabytes of solid-state disk in a 4U container. The beautiful part about this is that last year we actually showed this to you – I believe you have a great video of it. With last year’s version of this, we sustained 30 gigabytes a second.

insideHPC: Yeah, we had the speedometer.

Joe Landman: We had the speedometer. We’re not showing it on this version because we actually have the wrong type of SSDs in there right now – we didn’t get the 12G SSDs. But this is our 12G chassis; last year’s chassis was our 6G chassis. With our 12G chassis, we can drive each drive at a gigabyte a second. With 64 drives each being driven at a gigabyte a second, it’s not that hard to figure out that this is a very fast box.
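
The arithmetic behind that claim is simple; here is the back-of-the-envelope version, using only the drive count and per-drive rate quoted above:

```python
# Back-of-the-envelope aggregate bandwidth for the 12G flash chassis.
drives = 64                # SSD slots in the 4U chassis
per_drive_gb_s = 1.0       # ~1 GB/s per drive on the 12G backplane (as quoted)

raw_aggregate = drives * per_drive_gb_s
print(f"Raw aggregate: {raw_aggregate:.0f} GB/s")
# 64 GB/s of raw drive bandwidth; what an application actually sees depends on
# the controllers, network links, and filesystem sitting in front of the drives.
```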

insideHPC: There’s no spinning rust here at all, is there?

Joe Landman: There’s no spinning rust here at all, and we can put up to 36 processor cores and up to a terabyte and a half of memory in the back, as well as multiple InfiniBand and 40-gig ports. And we even have a customer asking us to build it with a 100 GigE NIC. Not a 100 Gig IB NIC, but a 100 GigE NIC, which is tremendous.

insideHPC: Joe, hang on a second. Who needs this kind of I/O and this kind of density? Who are they?

Joe Landman: There are two groups that come immediately to mind. One of our most recent customers for this type of system, this performance system, is the Broad Institute, the genomics guys in Cambridge – a world-class genomics institute. These guys have very large-scale computing processes that are extraordinarily seeky by their very nature. Whenever you’re doing assembly operations or other operations on very large numbers of small files, those sorts of operations tend to go very well on SSDs. So if we can provide a compact system with tremendous efficiency and tremendous performance, these guys can actually use fewer of these to get much better performance than they’re able to get from much larger systems. That works out tremendously well.

The other aspect of this is we have folks in financial services. Again, folks like Lucera – their entire Cloud is made up of exactly this type of design. Their entire Cloud is a flash-based Cloud leveraging 40 GigE as an interconnect. And because of that, they have, very simply, the world’s fastest Cloud. They just did a STAC-M3 benchmark and published the results yesterday. We were very happy with STAC-M3 for a year – we held two-thirds of the records for it – and we just got beat yesterday, on our own hardware. So now, on the hardware side, I think we own 15 of 17 benchmarks.

insideHPC: For those that might not know, what is STAC-M3? Is it some kind of financial services benchmark?

Joe Landman: Yes. It’s actually a trading-analytics benchmark where people take a huge number of ticker symbols and perform time-series analytics on them. They perform queries, they perform analytical operations. The issue you discover very rapidly is that your code is bound by the performance of your I/O system. If your I/O system is slow, you’re not going to perform well. STAC-M3 lets people explore their full system stack. You’re not doing just a conjugate-gradient solve. You’re not doing an LU decomposition. You’re actually doing something that traders will do, that analysts will do, on a day-to-day basis. And because of that, it’s a fairly meaningful benchmark for a lot of people in financial services. This system dominated last year, taking 10 of 17 benchmark results. And with the new results, we’re now, I think, at 14 or 15 of 17.
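
STAC-M3 itself is a formal, audited benchmark suite, but the flavor of the workload Landman describes – scanning large volumes of tick history per symbol and computing time-series statistics – can be sketched roughly as below. The file layout and column names are assumptions for illustration, not part of the benchmark.

```python
import csv
from collections import defaultdict

def vwap_by_symbol(tick_file):
    """Volume-weighted average price per symbol over one file of tick data.

    The query logic is trivial; on multi-terabyte tick histories the run time
    is dominated by how fast storage can feed the scan, which is why this kind
    of analytics is so sensitive to the I/O system.
    """
    notional = defaultdict(float)
    volume = defaultdict(float)
    with open(tick_file, newline="") as f:
        for row in csv.DictReader(f):   # assumed columns: symbol, price, size
            size = float(row["size"])
            notional[row["symbol"]] += float(row["price"]) * size
            volume[row["symbol"]] += size
    return {sym: notional[sym] / volume[sym]
            for sym in notional if volume[sym] > 0}

# Hypothetical usage against a day's tick capture:
# print(vwap_by_symbol("ticks-2014-11-17.csv"))
```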

insideHPC: Joe, you and I have talked about the topic of stupid benchmarking tricks before, and the difference, and how you guys apply what I would call integrity to that. I mean, help me out with this. How does that work here?

Joe Landman: Certainly. There are folks that will show you tremendous benchmark numbers, which we’ve just seen, just out of this world. You see–

insideHPC: 40 quintillion IOPS! Okay?

Joe Landman: Yeah. Somewhere on the show floor, say within 100 feet of me right now, you will see speedometers which show you tremendous numbers. But the moment you put an application on those systems, you see much lower numbers. The reason is that people are trying to game those benchmarks to show these really high numbers. The way we approach things, we start at the application – at the end-user results that they want to achieve and the runs that they want to use to achieve them – and we work backwards from there. We try to understand: how does the system behave? How does it need to behave? What knobs do we need to turn, both on the design side and on the software stack side? Those changes are absolutely the most important things you can do.

insideHPC: Well, you got anything else here? Because I remember seeing something about a desktop Hadoop. What’s that about?

Joe Landman: Let’s take a look at the desktop Hadoop box from Basement Supercomputing. This is related to our Concert appliance – our Concert appliance is the bigger brother of this – but the desktop Hadoop system is basically a big data system in a small package. It’s tremendous. This system allows you to start doing your development and testing on your big data applications, your Hadoop-type queries. Everything you want to do, you can start here. And when you need to go to a much larger data lake – our Unison systems, when we run Hadoop on them, we call those a Concert appliance – it uses exactly the same software stack, so you can migrate seamlessly. You can migrate all of your queries seamlessly over to the bigger system when you need to. But before that, you can start developing on this and using it right away.
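
As a small illustration of that “develop small, migrate unchanged” idea, a classic Hadoop Streaming job is just a pair of scripts that read stdin and write stdout; the same mapper and reducer run on a desktop unit or a larger cluster, with only the input and output paths and cluster configuration changing. The word-count example below is generic – it is not a Scalable Informatics or Basement Supercomputing workload.

```python
#!/usr/bin/env python
# mapper.py -- emit (word, 1) for every word on stdin (Hadoop Streaming).
import sys

for line in sys.stdin:
    for word in line.split():
        print(word + "\t1")
```

```python
#!/usr/bin/env python
# reducer.py -- sum counts per word; Hadoop delivers keys to the reducer
# sorted, so a running total per key is sufficient.
import sys

current, total = None, 0
for line in sys.stdin:
    word, count = line.rstrip("\n").split("\t")
    if word != current:
        if current is not None:
            print(current + "\t" + str(total))
        current, total = word, 0
    total += int(count)
if current is not None:
    print(current + "\t" + str(total))
```

The job is submitted the same way on either system, e.g. `hadoop jar hadoop-streaming.jar -files mapper.py,reducer.py -mapper mapper.py -reducer reducer.py -input <in> -output <out>`, which is what makes the step from a desktop box up to a larger cluster straightforward.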

insideHPC: I love this, because Hadoop has this tremendous reputation for being just a nightmare to implement, and you can’t even find people to hire who know how to do this right. But here, you’ve got it. It shows up, it sits next to your desk, and it’s ready to go?

Joe Landman: Yes. It’s completely turn-key. As the Basement Supercomputing guys tell you, you own the power switch. You own the reset switch [chuckles]. And that is phenomenal – it means you have complete control over your environment, which allows you to do the things you need to do quickly and easily. Again, it’s a turn-key system, and the environment here is exactly the same as the environment on our Concert series. It’s a seamless step upward, and you may not even need that step upward. I cannot sing the praises of this enough. This is a fantastic box.

insideHPC: So you guys are distributing this through Douglas Eadline and Basement Supercomputing. I see how it kind of ties into the whole ecosystem of what you’re doing, but it just speaks to me, Joe, that you guys are not just delivering fast I/O boxes, you’re delivering solutions.

Joe Landman: Absolutely. We’re designing and building large-scale appliances that handle a number of use cases that our customers need and have talked to us about for years. That difference– I mean, you can buy boxes from anyone on the show floor.

insideHPC: It’s just a big, big PC with a lot of slots installed.

Joe Landman: Yeah. It’s so much more than that. That’s the beauty of it, because there was care taken in the design and the implementation. And when you get the box, you plug it in, you turn it on, you enter a couple of things, and it just works. Collectively – the Basement Supercomputing folks, us, all of our friends and partners at the booth – we want to make the supercomputing and high-performance big data experience as simple and as straightforward as possible, and as rapid to start as possible, so you’re not spending time waiting for something you don’t need to wait for.
