Entries filed under “InsideTrack”

News from the HPC community before anyone else has it.

Inside Track: Runtime Design Automation to Enter HPC Market at SC11

Next week at SC11, Runtime Design Automation (RTDA) plans to announce their entry into the HPC marketplace with their Distributed Resource Management (DRM) suite of tools.

You might be familiar with RTDA from their EDA software, including NetworkComputer and WorkloadAnalyzer. According to the company, these tools have increased the production capacity of compute farms by as much as 10x, translating into faster time-to-market and significantly higher profits for customers.

RTDA will be sharing booth #6006 with Aeon Computing at SC11 in Seattle, November 14-17.

Also posted in Events, HPC, HPC Software, SC11 | Leave a comment

Inside Track: Blue Gene Architect Alan Gara Leaving IBM for Intel

insideHPC has confirmed that IBM Fellow Dr. Alan Gara is leaving the company to join Intel. As the chief system architect of the three generations of Blue Gene supercomputers, Gara was an IBM Fellow at T.J. Watson Research Center and was leading exascale system research at IBM.

As the 2010 recipient of the Seymour Cray Award, Gara has had a storied history in HPC. His Blue Gene/L system was the #1 system on the top500 list for 5 consecutive lists. The Blue Gene/P and the latest Blue Gene/Q systems both debuted as #1 in terms of energy efficiency on the green500 list. Alan Gara received his PhD in physics from the University of Wisconsin, Madison in 1987. Dr. Gara has received two Gordon Bell Awards, in 1998 and 2006, for his scientific work in supercomputing.

Update: An Intel spokesperson confirming this story said that Dr. Gara will not be working in HPC for the first year. This is interesting in that the company has been staffing up with top-level supercomputing talent including John Gustafson (Sun) and more recently Dr. Bill Feiereisen (Lockheed Martin) and Mark Seager (LLNL). One thing is for sure: the loss of Alan Gara is a significant setback for IBM as it readies its plans to deploy several 10 Petaflop Blue Gene/Q systems in the near future.

Also posted in HPC, HPC People | 9 Comments

Inside Track: Mark Seager Resigns from Livermore, On to Intel

Mark Seager, leader of the Hyperion project at Livermore and one of the most vocal figures in the Lustre community, has reportedly resigned from LLNL and is headed for a new position at Intel.

Dr. Seager received his B.S. degree in Mathematics and Astrophysics at the University of New Mexico at Albuquerque in 1979 and received his PhD in Numerical Analysis from the University of Texas at Austin in 1984. Mark started working at Lawrence Livermore National Laboratory in 1983 and has been working in the field of parallel processing ever since. He manages the Platforms Program for the Advanced Simulation and Computing (ASCI) Program at LLNL and has managed partnerships to successfully deploy architectures such as ASCI Blue Pacific (3.9 TF/s in 1998), ASCI White (12.3 TF/s in 2000) and the powerful LLNL Linux clusters (MCR at 11.3 TF/s in 2002 and Thunder at 23 TF/s in 2004). He then went on to manage the IBM contract for ASCI Purple (100 TF/s in 2H05) and BlueGene/L (180/360 TF/s in 1H2005). In his most recent position as principal investigator for ASCI platforms, Seager was the head of the Hyperion project at Livermore.

Mark is a Board member of OpenSFS, a technical organization focused on Lustre and other high-end open-source file system technologies. As to what Seager’s departure means, my guess is that we will see some changes in OpenSFS, which has been criticized by members of the community for not being open enough.

Where will Seager go next? Word has it that he will take on the role as Supercomputing Architect at Intel, which would seem to be a great fit. We wish him well with his new opportunity.

You can check out my interview with Mark Seager from September, 2010.

Also posted in HPC, HPC People, HPC Software | 1 Comment

Inside Track: Oracle has Kicked Lustre to the Curb

The Bumble. He's mean. He's nasty. And he hates everything to do with Christmas!

Companies usually wind down the week before Christmas, so you don’t usually see them make a lot of strategic moves or announcements. And so it was with some marked astonishment that I received an anonymous tip that Oracle ceased development of Lustre right before the holidays. Not out of a job quite yet, Lustre engineers have reportedly been encouraged to apply for other positions within the company.

You might question the timing of this move, but what better way to bury a story than to pull the plug when everyone else is home for ten days singing Kumbaya?

No one at Oracle responded to requests for comment on this story, but that’s how the company works. When I went poking around the Lustre community for some kind of confirmation, not one individual was surprised about this development. And while most were reluctant to go on the record, they all had heard the same story.

Careful What You Ask For: Short Term Support of Lustre a Key Issue

As Oracle exits the Lustre business, support of current Lustre installations is a real short term issue that will need to be resolved quickly.

The National Labs may be able to hold their own, but they cannot provide software support to other institutions. That will be the job of the many vendors that sell or support Lustre on their storage products. Open source organizations probably have no role to perform in providing support; they can perform a role in hosting open source code repositories, bug databases, and providing an open “forum” for customers and vendors to collaborate.

The good news is that, in 2010, market forces responded to the opportunity presented by the Lustre Limbo, with new companies forming (Whamcloud) and others like Xyratex staffing up to fill in the gaps.

Norman Morse, CEO of OpenSFS, had this to say:

As your article points out, there is a huge commitment to and dependence on the Lustre file system in the Supercomputing community. Because of the requirement for a vibrant Lustre system in the future and given speculation about possible changes in Lustre support, members of the community, including some who caused Lustre to be created, founded OpenSFS specifically to ensure Lustre continues to meet community requirements and remains the preeminent open source parallel file system for high performance computing.  We are moving forward with this as our mission and we encourage all members of the Lustre community to join us – see www.OpenSFS.org.

Again as your article points out we don’t wish to comment or speculate on Oracle’s plans but OpenSFS have always stated our intention to fully cooperate with Oracle as they develop their business plan for Lustre. Through member dues, OpenSFS has resources at hand to make a major contribution to continued support and development of Lustre by funding Lustre support organizations.

The Lustre Silent Auction?

We can look forward to more Limbo for a while, but what happens next with stewardship of Lustre? Will Oracle quietly kill it like they did with OpenSolaris? Will they set the legacy code base free like they did with Grid Engine? Or will they just cash in and sell it?

Chances are that Lustre is being shopped around and we won’t hear a peep from the Dark Tower until a sale is announced. That could be months, and that kind of prolonged uncertainty would not be good for the Lustre community.

Now, if you’re wondering who might buy Lustre, you probably should look at who stands to make a lot of money supporting Lustre or selling disk storage systems that run it. I’m thinking DDN, Xyratex, or even Whamcloud. At last count, there were something like 100 Lustre engineers within Sun, so continued development is going to require deep pockets.

A Great Opportunity for the Lustre Community

As I wrote in a recent story about Lustre engineers joining Whamcloud, Oracle’s Lustre bumbling has rallied the Lustre community in a way that might not have been possible before. With the formation of HPCFS, OpenSFS, and the European Open Filesystems Group, the community has done a remarkable job of organizing to ensure that the popular open source file system remains viable for their pending supercomputer plans.

I know what you’re thinking; since Lustre is open source, maybe it doesn’t need a corporate holding company. The National Labs helped spawn Lustre and they can just take it back.

Why There isn’t an Easy Answer

Herein lies the rub: according to the published Lustre roadmaps, the future of Lustre on Linux is the incorporation of ZFS. Who owns that code? Oracle. So this divorce will come with strings.

Speculation and second-guessing aside, Lustre remains important for a lot of reasons. Half the systems on the TOP500 run it, and there’s no open source replacement out there at the moment that is ready for prime time. And while no one in the Lustre community would probably cry if Oracle fell off a cliff, any kid can tell you that Bumbles bounce.

Also posted in HPC, HPC Software | 16 Comments

Video: Why Supercomputing Needs to Come Back to Portland

Our Video Sunday feature continues with this hilarious showcase of my hometown — Portland, Oregon, “the place where young people go to retire.”

While I had a great time at SC10 in New Orleans, I think there’s a reason that SC09 in Portland broke the attendance records with 12,000 conference goers. Let’s bring it back!

Here is the unofficial list of future SC shows:

  • SC11: Seattle
  • SC12: Salt Lake City
  • SC13: Denver
  • SC14: New Orleans (so I heard anyway)

Also posted in Events, HPC, SC10, Video | Leave a comment

And now the Verari website is off the air [UPDATED]

I don’t want to read too much into this, since it could be that their Apache server has just gone tits up and will be back, but following up on comments left at this site and on Twitter it is apparent that Verari’s website is indeed down. Coming as it does on the heels of a forced two week employee furlough, this is a curious sign indeed.

Of course it could just be that the employee who is in charge of the server is on furlough, making the site down a meaningless coincidence.

[UPDATE] Mike LaPan, who sports a verari.com email address and is therefore credible, has left comments saying that the site is just down. Evidently GoDaddy is working on it. Good luck getting it fixed guys…someone is definitely having a bad day. Comments below if you want to read them yourself. At ten past four central, they are still down.

Also posted in Business of HPC | 7 Comments

InsideTrack: NVIDIA Fermi Performance with CULA

Hot off the presses this morning are some real benchmarks on the latest NVIDIA Fermi gear. We’ve all heard the technical news about the latest silicon goodies from NVIDIA, but not a whole lot about real workloads. We were tipped off this morning to a new blog post from the nice folks at EM Photonics. They’re in the business of packaging mathematics libraries, called CULA, geared toward the NVIDIA platform. They released some performance bits with their latest release of CULA, version 1.3a. Now that they have their release out in the wild, they have focused some engineering time on beginning to port and adapt CULA to NVIDIA’s Fermi platform. They posted the first series of benchmarks on their company blog.

Hot off the heels of a 1.3a service release, we’ve got some brand new information on the future directions of CULA.  Today we’ll be talking about Fermi, NVIDIA’s next-generation GPU architecture that was announced in September at the GPU Technology Conference.  At that time, we shared our thoughts on the new and exciting performance we hoped Fermi would bring.  After 6 months of anticipation, we’re very proud today to debut the first performance results for CULA running on Fermi.  To our knowledge, these results are the first published double-precision performance results for Fermi running real-world code. [Kyle Spagnoli]

With only a few compiler flags and some driver upgrades, the engineers at EM Photonics were able to achieve some very tasty speedups on traditional linear algebra solvers.  Specifically, they posted numbers for LU decomposition using DGETRF and QR decomposition using DGEQRF.
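For readers who have not met these routines: DGETRF is the LAPACK routine for double-precision LU factorization with partial pivoting (row interchanges). The sketch below shows that technique in plain Python on a tiny matrix. It is only an illustration of what the factorization computes; it is not CULA's or LAPACK's implementation, and the function names `lu_partial_pivot` and `reconstruct` are made up for this example.

```python
def lu_partial_pivot(a):
    """Factor a square matrix A so that P*A = L*U.

    Returns (perm, lu): perm[i] is the original index of the row that
    ended up in position i, and lu packs L (below the diagonal, with an
    implicit unit diagonal) and U (on and above the diagonal).
    """
    n = len(a)
    lu = [row[:] for row in a]          # work on a copy
    perm = list(range(n))
    for k in range(n):
        # Partial pivoting: pick the largest entry in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        # Eliminate entries below the pivot, storing the multipliers.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return perm, lu


def reconstruct(lu):
    """Multiply the packed L and U back together (gives P*A)."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else lu[i][k]
                total += l_ik * lu[k][j]
            out[i][j] = total
    return out


a = [[2.0, 1.0, 1.0],
     [4.0, 3.0, 3.0],
     [8.0, 7.0, 9.0]]
perm, lu = lu_partial_pivot(a)
pa = reconstruct(lu)
```

Row i of `pa` should match row `perm[i]` of the original matrix, up to rounding. Production libraries do the same arithmetic in blocked form so that most of the work lands in matrix-matrix multiplies, which is where GPUs earn their keep.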

As you can see, Fermi is no slouch!  We’re reporting performance gains for doubles up to 3x over the previous generation of Tesla GPUs.  It’s also very important to note that these gains are achieved with no Fermi-specific optimizations added — these are practically plug-and-play performance enhancements.  We have every expectation that with a little time and effort we can improve significantly upon these already impressive numbers.

Rest assured that the folks from EM Photonics will be tweaking the latest 1.3a release and optimizing performance for the latest in NVIDIA silicon.  Check out the original blog post here.

Also posted in GPUs, HPC, HPC Hardware, HPC Software | 1 Comment

Inside Track: Employees at new Verari head for furlough as company struggles in recovery

It’s been a while since we wrote about Verari. You’ll recall that Verari Systems went out of business back in December and had to sell assets to pay creditors, laying off all their employees in the process. Then co-founder Dave Driggers managed to put together some financing and buy some of what was left to try and make another go of it as Verari Technologies. The company didn’t say much after it started getting systems back on line, but we assumed things were clicking along.

Evidently not.

I have been hearing via Twitter and a few other outlets (and on this site; see the more recent comments on this article) that the company is not doing well, and even that a furlough might be in the offing. That was confirmed yesterday via an email from someone who attended a company-wide conference call on the 15th of April.

According to that source, the news on the call was grim. Sales are much lower than expected, and evidently they cannot find partners willing to license and sell Verari IP (this partner strategy was part of the early press release the company issued right after it re-started). Verari has lined up an outside manufacturer (Celestica), but they haven’t gotten any sizable orders from Verari and, according to my source, also haven’t put any direct sales staff in place to move Verari hardware as they had originally planned to do. Other sources within the company also say that Verari’s current sales staff does not include any of the top performers from the old Verari Systems, and that seems to be hurting them as well.

The furlough was also discussed on the call, and “many” current Verari employees are headed for a two week furlough which could become permanent if things don’t improve.

I’m sorry to hear that the company is still struggling, and sincerely hope they are able to turn it around. If you have additional insight into what’s going on, leave a comment or drop me an email.

Also posted in Business of HPC | 9 Comments

UV login seen in the wild

During his presentation today at the Newport HPC conference Eng Lim Goh, SGI’s CTO and the brains behind SGI’s x86-based shared memory future, logged into an SGI UV and gave it a little exercise for the audience. He mentioned that it was 1,000 cores in one 2 TB shared memory space. Was good to see things are on track to start shipping in summer.

Also posted in Compute, HPC Hardware | 1 Comment

InsideTrack: SGI about to launch a new product for SMB

This morning’s RSS feed from SGI has provided what I’m assuming is an unintentional early peek at SGI’s next product. The headline in the RSS feed is

SGI Announces Origin 400 Blade System for SMB and Enterprise Markets

There isn’t an article linked to the headline, though, and no word of the product is currently on SGI’s web site. Of course Origin 400 is a decade-old product name, so this could have just been a server glitch that pulled an old article.

Except it isn’t. While I haven’t heard back from SGI officially, sources inside the company have confirmed for me that SGI is indeed launching a new product, and recycling the name (as they did with the recent Octane announcement).

SGI’s Origin 400 is a new project that was developed within the company under the codename ‘Clearbay.’ The product isn’t scheduled for launch until March 16, so I’m guessing that SGI’s PR machine or web monkey is going to get a talking to about this.

The information we have right now is that the Origin 400 is a 6U form factor with 6 blades each having 2 sockets, and 2 RAID storage pools integrated as a NAS.

We’ll update the story as more information comes in.

Also posted in Business of HPC, Enterprise HPC | 4 Comments

InsideTrack: TotalView Technologies sold to Rogue Wave Software [CONFIRMED]

It seems that TotalView Technologies, maker of the eponymous debugger and several other tools to help with development of large scale parallel applications, may have gotten married over the past several days.

It turns out that TotalView trades over the counter on the Norwegian stock exchange under the ticker symbol TVTI (click here and scroll down to the “t’s”). A vigilant international man of mystery sent me a link to a press release posted at the exchange that indicates TotalView is being acquired by Rogue Wave Software. The release is in Norwegian, but Google translates it as follows:

Rogue Wave Software, Inc. signs agreement to acquire Total View Technologies, Inc. (TVTI)

Total View Technologies, Inc. (TVTI) announces that the company on 14 December 2009 has entered into an agreement regarding the Rogue Wave Software, Inc’s acquisition of TVTI. The owners of the majority of the outstanding shares of TVTI approved in writing, shortly after the conclusion of the agreement, the transaction.

If sales are consumed each shareholder will be entitled to receive U.S. $ 1.42 per share, less approximately U.S. $ 0.10 per share in closing costs associated with the sale. In addition, this transaction subject to certain common escrows and adjustments are explained in an information document (Information Statement) which is sent to all registered shareholders in TVTI per 15 December 2009. Copy this information document (the Information Statement) are available upon request to Adam Schauer on adam.schauer @ totalviewtech.com. Information Document (Information Statement) gives details of the acquisition (referred to as a “merger (merger)” in the document), including common questions and answers, summaries of financial information and reasons why the Board of TVTI approved the transaction and recommended it to shareholders . The consideration per share in the transaction represents a premium of approximately U.S. $ 1.33 over the last registered trade in the shares on 16 December 2009 (based on current exchange rate).

The acquisition is structured as a merger of a subsidiary of Rogue Wave with TVTI and is expected to be carried out on 4 January 2010 or earlier. If the transaction is completed, all shareholders, shortly after the transaction is completed, receive instruction on how to receive their merger consideration in exchange for their shares. Registration date (record date) to shareholders will be entitled to merger consideration will be 31 December 2009. For those shareholders who have TVTI shares registered in the VPS, the registration date (record date) in the VPS to be 30 December 2009 (in other words, shareholders who appears as such in the VPS by the end of December 30, 2009 will be entitled to the merger consideration). Registration of transactions in TVTI shares in VPS will be blocked by the end of 23 December 2009.

TVTI will ask for deregistration from the OTC-list as of the date of closing of the transaction if it is implemented.

With 5,983,502 shares outstanding, the acquisition values TotalView at $8,496,572.84 US. I’ve contacted the company for more information and I’ll pass it along when/if I hear back.
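For anyone who wants to check the math, here is a quick sketch of the arithmetic. The share count and the $1.42 and roughly $0.10 per-share figures come from the release above; the variable names are just for illustration, and the net figure treats the "approximately $0.10" closing cost as exact, which it may not be.

```python
from decimal import Decimal

# Figures quoted in the translated release.
SHARES_OUTSTANDING = 5_983_502
GROSS_PER_SHARE = Decimal("1.42")
# The release says "approximately"; treated as exact here (assumption).
CLOSING_COSTS_PER_SHARE = Decimal("0.10")

# Headline valuation: shares times the gross per-share price.
gross_value = SHARES_OUTSTANDING * GROSS_PER_SHARE

# What shareholders would see after closing costs, before any escrow
# adjustments mentioned in the release (those are not quantified there).
net_value = SHARES_OUTSTANDING * (GROSS_PER_SHARE - CLOSING_COSTS_PER_SHARE)

print(f"Gross: ${gross_value:,.2f}")
print(f"Net of closing costs: ${net_value:,.2f}")
```

The gross figure matches the valuation quoted above. Net of the closing costs, shareholders would collectively see closer to $7.9 million, less whatever ends up held in escrow.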

[UPDATE] I was able to get in touch with the President and CEO of TotalView Technologies. He did confirm the sale, but is holding off on providing further details until after closing. When we know more, we’ll let you know.

Also posted in Business of HPC, Featured Stories | 5 Comments

insideHPC exclusive with Verari CEO: “The doors are open.”

I just got off the phone with Verari CEO David Wright, who took a few minutes to talk with me this morning about what his company is going through. It was an interesting conversation.

Wright describes Verari as being in a “controlled reorganization,” which is consistent with what employees were told at the all hands meeting last week (going by the comments on the original story, anyway). According to him, this is not a Chapter 7 or 11 proceeding, but it is a reorganization, and he expects it will take about six weeks to resolve.

Also, Wright was at pains to point out that the doors are not closed. Although he did say that most of the staff is gone today, there are people in the office working on a way forward — “some on payroll, and some not.” These folks are working on a support plan for existing customers which Wright hopes to have on the web site “today or tomorrow,” as well as on how to fill existing orders. Verari’s support comes largely from business partners, so they may actually be able to come up with something there.

Although he isn’t able to comment on much at this point, when I asked about how the business got into this shape he ascribed it to a lot of factors, including a credit market problem. Evidently Verari’s is a capital intensive business, especially the container business, and according to Wright they couldn’t get the cash to service their “substantial backlog.”

I also asked about a comment left on GigaOm about a potential new business rising from Verari

A rumor is circulating that a bunch of “key employees for the container business” will be rehired next week and be part of a New Company started by one of the key investors in Verari… the current stockholders, investors, and so on are out of luck, but key members of Verari management are going to that new company. Seems like a deal for those involved, if true.

Wright declined to comment, but did commit to staying in touch with us.

Also posted in Business of HPC, Featured Stories | 15 Comments

InsideTrack: Verari Systems out of business [UPDATED]

During SC09 a well-connected friend told me that Verari Systems was in trouble for lack of access to capital, essentially the same immediate cause as SiCortex. Then in response to my post a couple weeks ago asking if anyone had seen Verari at SC09, I got a tip that they had registered for booth space but never showed up. I really don’t like posting speculation about going out of business, since a good rumor like that can actually make itself come true. So I waited.

Today I’m hearing that Verari is locking (or has locked) the doors. First, VerariGuy has this Twitter stream over the past 22 hours

These execs should be arrested
about 4 hours ago from web

Conference call at 9am PST to let everyone know we’re done. If I knew I wasn’t talking to myself I’d have half a mind to post the phone #
about 6 hours ago from web

Well everyone, #Verari Systems is now dead. The doors are locked, and people have taken out anything not nailed down
about 22 hours ago from web

Then on my blog just a few minutes ago, a comment on the SC09 post left by reader “foo”

Verari is out of business s of today at 5:00 pm cst!

Also I’ve tried contacting folks I knew at the company with no success. No one is in their offices, and the corporate headquarters in San Diego was routed to a phone message that says I’ve called outside “normal business hours” when I tried to call at 1:00 pm Pacific. That’s enough evidence for me to put them in the deadpool, at least tentatively. If you have more info, leave a comment. Meanwhile I’m continuing to try to contact the company.

[UPDATES]

1. VerariAlumni.com also indicates that the company is out of business

Since Verari is no longer around, I’ll be using this area to communicate anything I believe my peers may need to know about, or may find of interest. This section will be built out more and more as time progresses. Of all the pages on this site, this will be the one updated most often so check back in.

2. Subsequent calls to the company by a friend did raise a person who said “I have nothing to say.” Which isn’t a denial.

Also posted in Business of HPC, Deadpool, Featured Stories | 34 Comments

InsideTrack: Former employees confirm Quadrics officially out of business last week

In late May we reported on rumors at The Reg and the New York Times (here) that interconnect maker Quadrics was heading for a shutdown of operations in June. Subsequent Googling turned up a big fat zero, except that the rumors hadn’t yet been confirmed.

I started sending some emails around, and heard back from former Quadrics employees, who confirmed that four Quadrics staff transferred to Vega Ltd., another company owned by the parent of Quadrics, Finmeccanica UK. Vega acquired Quadrics’ outstanding support contracts, including the flagship Tera10 at CEA in France, and they also got the remaining Quadrics hardware stock to supply spares for existing customers. There is also some word that Vega has acquired some of the Quadrics IP, but no clear indication yet on what will happen to that.

I was also told that last Monday (June 29) was the last day that the Quadrics office was open and the last day through which people were paid.

Evidently the company was very close to coming to market with its QsNet III when things shut down. Here is the narrative as it was related to me

QsNet II, its predecessor came onto the general market in early 2004. The last significant QsNet II system was installed in Oct 2008 at British Aerospace in the UK, having beaten off 2 other vendors who offered Infiniband solutions — the Quadrics based solution won hands down on the customers own benchmarks.

Quadrics has been working on QsNet III since the design work on QsNet II was finished in 2003. Quadrics taped out the new chips in late 2008, and got as far as demonstrating sending messages from one node to another via several QsNet III switches before the plug was pulled (no pun intended) in April 2009. QsNet III had 25 Gbit/s in each direction per link. Each PCIe card had dual links, such that the theoretical peak comfortably surpassed QDR IB. The Elan5 ASIC that formed the basis of QsNet III had an incredible seven cpu cores, each a 500 MHz dual issue RISC with 8 loads and 8 DMAs pending, and 4 outstanding DMA writes. Each had a 16 Kbyte instruction cache and a 9 Kbyte DMA buffer (see paper here).

All this complexity made the Elan5 a very powerful communications processor indeed. However this complexity, coupled with the budget constraints put on the staffing the design team led to the project seriously overrunning. As recently as 2006, Quadrics were showing a roadmap (slide 9) that had the Elan5 being ready in early 2007. Another blow to Quadrics was when a group of the core design team left to form a new company at the end of 2007. During the course of 2008, several other Quadrics staff left to join them. One of the final nails in the coffin was when the last of the original founders of Quadrics, Duncan Roweth, left in early 2009 after 13 years with the company. He is now a principal engineer at Cray.
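To put the "comfortably surpassed QDR IB" claim in numbers, here is a back-of-the-envelope sketch. The QsNet III figures are the ones quoted in the narrative above. The InfiniBand side assumes a standard 4x QDR port (four lanes at 10 Gbit/s signalling with 8b/10b encoding); the narrative does not say which port width was meant, nor whether the 25 Gbit/s figure is a raw or payload rate, so treat the comparison as rough.

```python
# QsNet III figures from the narrative above.
QSNET3_LINK_GBPS = 25          # per direction, per link
QSNET3_LINKS_PER_CARD = 2      # "each PCIe card had dual links"

# 4x QDR InfiniBand (assumed port width for the comparison).
QDR_LANE_SIGNAL_GBPS = 10      # signalling rate per lane
QDR_LANES = 4
ENCODING_NUM, ENCODING_DEN = 8, 10   # 8b/10b line encoding

# Theoretical peak per direction for one QsNet III card.
qsnet3_card_gbps = QSNET3_LINK_GBPS * QSNET3_LINKS_PER_CARD

# QDR: raw signalling rate, then the usable data rate after encoding.
qdr_signal_gbps = QDR_LANE_SIGNAL_GBPS * QDR_LANES
qdr_data_gbps = qdr_signal_gbps * ENCODING_NUM // ENCODING_DEN

print(f"QsNet III card: {qsnet3_card_gbps} Gbit/s per direction")
print(f"4x QDR IB:      {qdr_signal_gbps} Gbit/s signalling, "
      f"{qdr_data_gbps} Gbit/s data")
```

Under those assumptions a dual-link QsNet III card would have peaked at 50 Gbit/s per direction against 40 Gbit/s of signalling (32 Gbit/s of data) for a single 4x QDR port, which is consistent with the claim in the narrative.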

Quadrics, 1996-2009, now officially and firmly in the deadpool.

Also posted in Business of HPC, Deadpool, HPC Hardware, Network | 3 Comments

NVIDIA updates us on the Tesla shortage

As commented on by Joe Landman at his blog, Tesla C1060s that plug into desktops and workstations and form the cornerstone of the Personal Supercomputing initiative have been hard to come by so far this summer (note that this is a different unit than the rack-mountable S1070). Last year the company was announcing partnerships around the C1060 to promote the Personal Supercomputing idea, including this announcement with Cray for the CX-1, and as recently as May announced that Dell would sell the cards in their high-end products.

What hasn’t been as widely discussed is the fact that the units can be hard to come by. From Joe’s post in mid-June

If you haven’t heard, Tesla’s are hard to come by. We have several Pegasus systems that customers have purchased, that we can’t get the units for. All of the distributors and resellers we have spoken to indicate that they are getting a small fraction of their orders filled. We have had units on order over a month. Several more orders, and a hard deadline to get units filled.

Joe recently updated his blog to report that things were loosening up a little

They are getting backlog serviced as fast as they can. Looks like they are clearing it out pretty quickly. This is good.

I reached out to NVIDIA’s Andrew Humber early this week to get the official word on what’s going on with these units. Here’s what he had to say

Due to the large demand we are seeing across the board — a number of large cluster installations, additional demand from OEMs, and the success of a developer promotion we are currently running, we are left in a position of ramping up production to meet demand. This is naturally a good problem for NVIDIA to have, but we understand that it might leave our customers waiting in some cases. Please note that product is flowing through at high volume and customer backlog is being fulfilled worldwide.

As Andrew points out, this is a high class problem to have, and one shared by (for example) Apple with its iPhone and iPod lines. I don’t run a manufacturing business, but I imagine it’s pretty hard to forecast demand for a new category of product, and guessing too high is probably very expensive.

Also posted in GPUs, HPC Hardware | 3 Comments

insideHPC.com is a production of insideHPC, LLC. © 2006-2011