Entries filed under “Datacenter operations”

News items related to the configuration of datacenters or the equipment in them, innovation in the deployment of power and cooling infrastructure, monitoring, and the operation of large-scale datacenters

Webinar: Eliminating Complexity in Managing GPU Clusters

This week Bright Computing announced that James River Technical will sell and support Bright Cluster Manager. James River focuses on the education space, and this looks like a good move for Bright Computing as the company seeks to gain traction for its Linux cluster management software.

As part of this new collaboration, James River will host a Nov. 10 webinar on Eliminating Complexity in Managing GPU Clusters. Topics include:

  • Key challenges in managing GPU clusters
  • Two models of GPU management: Tool-kit vs. Single Solution
  • How to configure a GPU system from “bare-metal” to functioning, high-performance cluster in less than one hour
  • How to scale to thousands of nodes, including systems comprising a mixture of GPUs and CPUs
  • How to seamlessly switch between CUDA versions

For more information on how to register, check out the James River webinar page.

Also posted in Cloud HPC, GPUs, HPC, System Management

Rocky Mountain Super Center Welcomes New Board Member

The Rocky Mountain Supercomputing Center announced a new member to their Board of Directors.  Susan L. Baldwin, Executive Director of Compute Canada, was named to the board.

“Susan is an excellent addition to the RMSC Board because she shares our vision of bringing High Performance Computing technology to small- and medium-sized businesses,” said Earl J. Dodd, RMSC Executive Director.

“RMSC and the State of Montana have seized the leadership position in harnessing HPC technology as an economic driver that makes businesses more competitive in the global market place,” said Susan Baldwin.

Under Baldwin’s direction, Compute Canada has integrated HPC resources from seven partner consortia across Canada to create a powerful and dynamic computational resource.  Compute Canada and the university-based regional HPC consortia provide for overall architecture and planning, software integration, operations and management, and coordination of user support for the national HPC platform.

Congrats to RMSC and Susan Baldwin.  For more info, read their full release here.

Also posted in Collaborations, Enterprise HPC

University of Arkansas Receives $1.7 million for HPC Improvements

The University of Arkansas at Fayetteville announced that it has received a $1.7 million grant from the National Science Foundation to improve the facility that houses the university’s supercomputers.  The funds will be used to purchase air conditioners for the supercomputers and equipment that keeps electricity available around the clock, since many research projects require programs to run for several days and cannot survive a power outage.

“This National Science Foundation grant will allow Arkansas to move forward substantially in the area of research computing. It will provide the infrastructure that we need to house large-scale computers and storage that support several areas of science, and will improve our ability to compete nationally. We are delighted that the National Science Foundation has chosen to support this project,” said Amy Apon, Ph.D., University of Arkansas computer science professor and Director of the Arkansas High Performance Computing Center.

The funding will also help enhance the network capabilities available to UA researchers from the state’s high speed optical network.  The high speed optical network, the Arkansas Research and Education Optical network, allows researchers using computers for advanced scientific research at Arkansas’s four-year universities to have better access to the state’s supercomputing resources.  For more info, read their full release here.

Also posted in HPC Hardware, Network

Purdue Throttles Power Based on Heat Loads

According to an article in CampusTechnology online, Purdue University IT staff have developed methodologies in order to control the operational performance of machines based on the current data center heat load.  Over the course of a sweltering summer, the Rosen Center for Advanced Computing at Purdue experienced several planned and unplanned outages.  As a result, the IT staff had to develop a method to allow users access throughout the chiller downtime.

Power outages are actually infrequent at the data center, [Patrick] Finnegan said. But he added that this summer, “due to some planned cooling system maintenance, coupled with the unusually hot summer, we have had some brief cooling outages.

“In both instances, the cause was a temporary capacity reduction in the campus chilled water supply.”

Temperature sensors in the data center kick off scripts that throttle the relative power utilization of the machines.  The article doesn’t go into technical specifics, but my geek radar says they’re likely tickling the CPU stepping via ACPI.
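Since the article gives no implementation details, the following is only a rough sketch of what a sensor-driven throttling script could look like. It assumes Linux hosts that expose the standard cpufreq sysfs files; the temperature thresholds, the frequency values, and the function names are all hypothetical and are not taken from Purdue's setup.

```python
import glob

# Hypothetical policy: room temperature thresholds (degrees C) mapped to a
# CPU frequency cap in kHz, hottest threshold first. Illustrative values only.
THROTTLE_STEPS = [
    (35.0, 1_200_000),  # very hot room: clamp hard
    (30.0, 1_800_000),  # warm room: clamp moderately
]

# Standard Linux cpufreq location; assumed to exist on the target hosts.
CPUFREQ_PATTERN = "/sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_max_freq"


def frequency_cap_khz(room_temp_c):
    """Return the frequency cap for a room temperature, or None for no cap."""
    for threshold_c, cap_khz in THROTTLE_STEPS:
        if room_temp_c >= threshold_c:
            return cap_khz
    return None


def apply_cap(cap_khz, pattern=CPUFREQ_PATTERN):
    """Write the cap to every matching scaling_max_freq file (needs root).

    Returns the number of files written.
    """
    written = 0
    for path in glob.glob(pattern):
        with open(path, "w") as handle:
            handle.write(str(cap_khz))
        written += 1
    return written
```

A polling loop would call `frequency_cap_khz` with each sensor reading and call `apply_cap` only when the result is not `None`. Lifting the cap again would mean writing each CPU's `cpuinfo_max_freq` value back, which is left out here for brevity.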

The program worked, and the datacenter didn’t overheat, so the process was a success. “We actually were a bit surprised it worked so seamlessly,” said [Mike] Shuey. “It’s much better to have jobs run slowly for an hour than to throw away everyone’s work in progress and mobilize staff to try to fix things.”

For more info, check out their full article here.


Microsoft parallel runtime expected to go commercial next year

ZDNet wrote last week about a new HPC offering that Microsoft is evidently planning on moving from research to commercial product in the coming months. The platform is called “Dryad.”

Dryad is an ongoing Microsoft Research project dedicated to developing ways to write parallel and distributed programs that can scale from small clusters to large datacenters. There’s a DryadLINQ compiler and runtime that is related to the project. Microsoft released builds of Dryad and DryadLINQ code to academics for noncommercial use in the summer 2009.

As you can see from the diagram, there is a lot of technology in the platform, including a compiler, a runtime, a new file system (TidyFS), and a scheduler (Quincy). “Nectar” is a set of data management tools:

“In a Nectar-managed data center, all access to a derived dataset is mediated by Nectar. At the lowest level of the system, a derived dataset is referenced by the LINQ program fragment or expression that produced it. Programmers refer to derived datasets with simple pathnames that contain a simple indirection (much like a UNIX symbolic link) to the actual LINQ programs that produce them.”
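To make the idea in that quote more concrete, here is a toy sketch of a store in which a derived dataset is identified by the program text that produces it, and a pathname is just an indirection to that program. This is not Nectar's API: the class name, the method names, and the use of a SHA-256 fingerprint are assumptions made for illustration, and Python stands in for LINQ.

```python
import hashlib


class DerivedDatasetStore:
    """Toy store: derived datasets are keyed by the program that made them."""

    def __init__(self):
        self._paths = {}     # pathname -> program fingerprint (the indirection)
        self._programs = {}  # program fingerprint -> callable that computes it
        self._results = {}   # program fingerprint -> cached dataset

    @staticmethod
    def fingerprint(program_text):
        """Identify a program fragment by a hash of its source text."""
        return hashlib.sha256(program_text.encode("utf-8")).hexdigest()

    def register(self, pathname, program_text, compute):
        """Point a pathname at the program that produces the dataset."""
        key = self.fingerprint(program_text)
        self._paths[pathname] = key
        self._programs[key] = compute
        return key

    def read(self, pathname):
        """Resolve the pathname; compute the dataset only if not cached."""
        key = self._paths[pathname]
        if key not in self._results:
            self._results[key] = self._programs[key]()
        return self._results[key]
```

Because the key is derived from the program rather than the pathname, two pathnames registered with identical program text resolve to the same cached result in this sketch, which is one plausible benefit of mediating all access through such a layer.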

According to ZDNet, Dryad was outed in a presentation this month with plans to offer a Community Technology Preview in November 2010 (announced with SC10, I’m guessing), with a final release for Windows HPC Server by next year.

Also posted in Computing Research, HPC Software, System Management, Tools

Sun GridEngine, now 100% less free

Oracle continues its drive to do away with Sun’s strategy of making money by adding value on top of open source tools provided free to the community. In fairness to Oracle, it didn’t work all that well as a business model for Sun, which hemorrhaged money and never turned the strategy into any significant share of the HPC market. The latest victim is Sun’s popular job scheduling system, GridEngine.

This change actually happened back in June as far as we can tell, but it’s just popped up as a topic on various discussion boards. Randall Hand at Vizworld wrote up a nice summary of the change, which moves the free Sun GridEngine to the for-pay Oracle GridEngine plus a 90-day evaluation trial.

Oracle has “absorbed” Sun GridEngine internally and renamed it “Oracle GridEngine” (OGE) and placed it under a new license that restricts it to only 90-days of free usage in a “trial” arrangement.  From the 6.2U6 EULA:

As selected in your Entitlement, one or more of the following Permitted Uses will apply to your use of Software. Unless you have an Entitlement that expressly permits it, you may not use Software for any of the other Permitted Uses. If you don’t have an Entitlement, or if your Entitlement doesn’t cover additional software delivered to you, then such software is for your Evaluation Use.

(a) Evaluation Use. You may evaluate Software internally for a period of 90 days from your first use.

So HPC centers large and small using Sun GridEngine are going to have to start ponying up, or move to something else. Or both.

Also posted in Business of HPC, System Management

Verari Changes the Sign Out Front

Verari has announced that they have changed the sign on the front of the building.  Why, you ask?  They’re focusing their business model specifically on providing hardware for cloud-like environments.  You mean big datacenters?  Yeah, those too.

“Being able to base our cloud storage and compute products on Verari’s world class BladeRack® 2 Series technology and FOREST containerized data center infrastructure puts us at the front of the pack to serve the demanding cloud customer,” said Marc Brown, President and COO, Cirrascale. “These products, based on Verari’s patented Vertical Cooling Technology, generated over $500 Million in installed systems in the high performance computing and enterprise markets; these customer segments are the foundation of the burgeoning cloud market of today. This technology is a winning formula for the cloud customer.”

Cirrascale was actually organized under the “Verari Technologies” name while acquiring the intellectual property and other assets of Verari Systems back in January 2010.

“Technology innovation is only half the story at Cirrascale; we must also innovate with our business model,” said Dave Driggers, Chairman and CEO, Cirrascale. “Cloud and Web 2.0 businesses are placing new demands on their suppliers. Unlike the enterprise data center customer served by traditional computer companies with established product lines and large IT consulting businesses, the agile, self-sufficient cloud and web 2.0 customers want to collaborate to define their platforms and create a purpose-built data center infrastructure that addresses their unique requirements.”

Quoting their release: “Cirrascale will focus on customers buying at the data center and rack infrastructure level, across a range of storage and computing models including low-power micro-servers, high density storage, scale-out multi-core, HPC cluster and GP/GPU computing. Customers are served by the same physical rack infrastructure that accommodates the customer-defined power, density and cooling requirements.”  This sounds surprisingly like the previous Verari business model. It also sounds very much like the business model of Rackable (now SGI) and of portions of Dell’s business.  Ultimately, this is a very tough market niche.

For more info, read their full press release here.

Also posted in Enterprise HPC

Final Holyoke Site Announced

Governor Deval Patrick, joined by UMass President Jack Wilson and Holyoke Mayor Elaine Pluta, today announced that the Holyoke High Performance Computing Center will be located at the Mastex site.  Huh?  The eventual site of the new Massachusetts supercomputing center is located between Cabot and Appleton Streets in the downtown canal district.

“We are on track to deliver new jobs and tech innovation to all of western Massachusetts,” said Governor Patrick. “This project will anchor a vibrant new growth district in the Pioneer Valley.”

“Selecting this site is a major step forward for the development of the Holyoke High Performance Computing Center,” said Lieutenant Governor Timothy Murray. “This project, which will lead to downtown redevelopment and growth in the City of Holyoke, is another example of our administration strategically investing in jobs and innovation in all regions of the Commonwealth.”

The Patrick-Murray administration has pledged $25 million toward the construction of the new site.  This, combined with the contributions of university partners and the University Consortium, brings the grand build total to $75 million.

“The announcement of a final location for the Holyoke High Performance Computing Center brings us one significant step forward towards the economic development and jobs that this project will mean for Holyoke. I’m proud to stand with Governor Patrick as he makes this announcement, and am grateful that his Administration continues to invest in projects, such as this, that create jobs now and lay the foundation for a strong economy for years to come,” said Representative Michael Kane.

For more info on the new HPC digs in Holyoke, read the full article here.

Also posted in Collaborations

University of Florida Buys ScaleMP for BioTech

ScaleMP announced today that it was chosen by the University of Florida’s Interdisciplinary Center for Biotechnology Research [ICBR], a research center dedicated to providing biotechnology research services to the UF community.  The ICBR will use the vSMP technology alongside existing infrastructure in order to run both legacy and proprietary software packages.  It will also allow researchers to submit larger interactive jobs.

“Many organizations have a sufficient amount of computational power and enough CPUs, but they are simply unable to leverage their existing infrastructure for larger compute intensive workloads,” said Shai Fultheim, founder and CEO of ScaleMP. “vSMP Foundation for SMP enables biotechnology organizations like ICBR to aggregate existing hardware and to create a virtual SMP for next generation sequence processing and other biotechnology computing needs that need large amounts of processing power as well as shared memory.”

ICBR’s IT team supports research at UF and abroad in various biotechnology fields such as proteomics, genomics, bioinformatics and cellomics. ICBR needed to be able to run legacy software as well as proprietary software packages requiring large shared memory systems. Because of the high price point of traditional SMP systems, the team tried to find other ways to perform these jobs. They ended up stretching their virtual infrastructure to accommodate these large shared memory workloads, resulting in a loss of virtualization benefits.

For more info on the University of Florida’s use of vSMP, check out the ScaleMP website here.

Also posted in Compute, HPC, HPC Hardware

More computing in a big metal box

After years of no one wanting (or willing) to talk about their trailer sales, HP now has two releases pretty close together. Maybe this means that there is something here after all. Everyone with a container offering has always said to me that the sales cycles are much longer than systems sales cycles, because a container is more like a datacenter than a system. Maybe that wasn’t just marketing hoohah.

Following close on the heels of the iVEC deployment in Australia, Purdue has announced that it, too, has signed up to put part of its computing resources in a trailer:

Known for its world-leading research in nanotechnology, structural biology and atmospheric chemistry, Purdue is committed in its strategic plan to doubling current research efforts. To this end, Purdue has been adding server clusters to its data center every summer for the last three years.

Constrained by budget, power and space limitations, Purdue has now turned to the HP POD to deliver a cost-efficient, containerized environment that can be quickly deployed. HP POD also integrates multiple vendors’ hardware into interoperable pools of resources that can be tapped on demand.

By implementing the HP POD, Purdue estimates it can expand its research capabilities by 50 percent within a matter of months for less than one-third the cost of building a new data center. Furthermore, the portability of the HP POD enabled the university to place it in front of a power plant, eliminating the possibility of power transfer and capacity issues.

The system in the trailer is reasonably sized:

To permit Purdue’s faculty to conduct leading-edge research, including modeling climate change and designing next-generation nanoscale electronics, Purdue’s Rosen Center for Advanced Computing also is building a new supercomputer. “Rossmann” is composed of a 1,000-node HP Cluster Platform 4000 based on HP ProLiant DL165z G7 servers with dual 12-core AMD Opteron 6100 series processors.


Also posted in New Installations

The Economist spots wave of innovation in cooling systems

The Economist is reporting on a trend toward more innovation in the air conditioning business, driven by energy costs and the desire to be more environmentally responsible with the energy we do use. Some ideas the article calls out? Blowing air over ice, thermal coolers that use waste hot water to cool spaces, and (in drier parts of the planet) evaporative coolers.

And then there’s this:

However, researchers at the National Renewable Energy Laboratory (NREL) in Colorado have designed an evaporative system that sprays ambient-temperature water into warm air to cool it, but in a way that also lowers the humidity. NREL uses syrupy liquids which contain salty desiccants to soak up the humidity. Hot water is used to heat the syrups and dry them out. NREL’s technology, known as “desiccant-evaporative cooling”, is still being developed, but it requires little power, not least because the hot water can be obtained from solar panels. Ron Judkoff of NREL thinks the process will consume only about a fifth of the energy of conventional air-conditioners, depending how dry the climate is to begin with.


Also posted in Green HPC

The Green Grid juices PUE datacenter measure

The Green Grid has recently updated its PUE (Power Usage Effectiveness) metric in an attempt to wrangle some of the uncertainty in the prior definition of the measure. Ted Samson has a nice analysis:

One of the greatest strengths of the PUE metric, the industry standard for measuring data center energy efficiency, is its simplicity: Calculate how much energy your data center is consuming overall, then divide that number by how much energy your IT equipment alone consumes.

…At the same time, the simplicity has its shortcomings. For example, it gives operators much flexibility as to where to measure consumption — at the PDU or at the point of connection of IT devices — as well as how often to take measurements.

…In an effort to overcome this drawback, The Green Grid has unveiled four categories of PUE, ranging from Category 0 to Category 3, in a new white paper, “Recommendations for Measuring and Reporting Overall Data Center Efficiency” [PDF]. With each level, the measurements become more granular and the results more precise. Thus, a data center operator may choose to go with Category 0, which requires the least effort and fewest resources — but then those results won’t be viewed in the same light as a rival’s Category 3 PUE figure.

More in Samson’s article; you can also read what The Green Grid itself has to say about the topic in the related whitepaper.
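The calculation Samson describes fits in a few lines. The sketch below is only an illustration: the function name and the energy figures are made up, and it does not attempt to model The Green Grid's category definitions. It does show why the measurement point matters, since metering IT energy upstream (for example at the PDU) folds distribution losses into the denominator and produces a lower PUE for the same facility.

```python
def pue(total_facility_kwh, it_equipment_kwh):
    """Power Usage Effectiveness: total facility energy / IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total facility energy cannot be less than IT energy")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical facility drawing 1,500 kWh over the measurement period.
TOTAL_KWH = 1500.0
IT_KWH_AT_DEVICE = 1000.0  # metered at the IT devices' point of connection
IT_KWH_AT_PDU = 1050.0     # metered at the PDU, so it includes PDU losses

pue_at_device = pue(TOTAL_KWH, IT_KWH_AT_DEVICE)
pue_at_pdu = pue(TOTAL_KWH, IT_KWH_AT_PDU)
```

With these made-up numbers the device-level measurement gives a PUE of 1.5, while the PDU-level measurement gives a slightly lower figure for the same building, which is the kind of ambiguity the new categories are meant to expose.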


Amazon EC2 Cluster Workload Management

We posted an article earlier today about the latest service offering from Amazon’s EC2 cloud resource.  Amazon has added what amounts to pseudo-tightly-coupled cluster platform support such that applications have some notion of quality of service [QOS].  MPI is generally not very friendly to link/node failures.  As such, this is directed specifically towards enterprise-level high performance computing applications.

That being said, how do you manage such a resource on the cloud?  Theoretically speaking, the cloud offers loosely coupled compute “agents” in a distributed manner such that you really don’t care.  However, we, as HPC technologists, DO care.  The folks over at Clustercorp have created a series of integrated solutions such that one can now manage “clusters” of virtual nodes in the cloud the same way one manages a local set of physical nodes: Rocks+.

From their website:

Clustercorp’s Rocks+ and Amazon EC2 combine to form the ideal environment for running large sets of heterogeneous servers in the cloud for any number of general data center use-cases. Rocks+ provides users a ready-to-launch Rocks+ AMI, which is subsequently used to launch 10s, 100s, or 1,000s of servers with a single point of management and control.

The Rocks+ AMI comes pre-loaded with “Rolls,” which allow users to quickly build web servers, database servers, compute servers, and more, with pre-packaged software including Apache Web Server, MySQL, Java, and more, running on CentOS Linux.

Whoa!? Now you’re telling me the same management tools used on my local machines can be used in the cloud?  Yea buddy.  Management headaches are usually the problems that most integration teams easily overlook and often spend the most time mitigating at the end of the day.  Clustercorp’s ability to manage local resources in the same manner as remote, virtual resources is incredibly powerful.

If you’re interested in utilizing the Clustercorp Rocks+ packages, check out their website here.




Also posted in System Management

OCF upgrades U of Edinburgh with iDataPlex

UK HPC integrator OCF has announced that it has completed an upgrade to the University of Edinburgh’s HPC system (“Eddie”, cute huh?). The upgrade doubles the computing power of the previous system while reducing the facility demands:

Despite immediately doubling the compute power available, the HPC system will generate less heat than its predecessor and have minimal energy consumption. There are several reasons for the reduction in heat emissions. Firstly, there are efficiency improvements contained in Intel’s Westmere platform. Second, heat emissions are reduced by the HPC system’s use of IBM System x iDataPlex servers, which are custom engineered for excellent energy efficiency. In addition, the University’s system is fitted with iDataplex water-cooling features to remove 100 per cent of heat generated by the system close to the source, which when combined with the use of Scottish air to cool the water, provides almost free cooling for much of the year.

The new system uses Westmere E5620 quad-core processors (1,024 cores) with a measured Linpack of 15 TFLOPS, incorporates a 90 TB GPFS file system, and connects compute nodes via an IB network. A second upgrade in 2011 will double the number of cores.

You can learn more about the University’s compute resources at the website.

Also posted in New Installations

HP publishes first TPC-Energy results

Today the Transaction Processing Performance Council announced that HP submitted the first published results for the new TPC-Energy benchmark, a set of measures that augment the existing TPC-C, TPC-E and TPC-H benchmarks (we talked about TPC-Energy here).

The Transaction Processing Performance Council (TPC) today announced the first results for its TPC-Energy specification. Hewlett-Packard Company has published TPC-Energy results on all three TPC benchmarks: TPC-E, which simulates the online transaction processing (OLTP) workload of a brokerage firm; TPC-C, which simulates the OLTP order-entry workload; and TPC-H, which simulates a decision support workload. Each of these benchmark publications also includes the optional Watts per performance metrics.

The TPC isn’t the only bear in the energy benchmark woods, though; the Standard Performance Evaluation Corp. (SPEC) has been working with the EPA to develop its own test for servers that will be part of the next iteration of the Energy Star for servers program, and The Green Grid has been looking at energy use in the datacenter as a whole for a couple of years now as well.

According to coverage in the EE Times, HP’s efforts provided key feedback that will be used in the 1.2 rev of the TPC-Energy measurements.

“It took us seven days to audit four results, and those were very long days,” said Mike Nikolaiev, chairman of the TPC-Energy committee and manager of an x86 server performance group at HP.

HP’s work helped clarify aspects of the spec and the TPC’s software testing package for it. The group expects to post on its Web site soon the resulting version 1.2 of the energy spec and an upgraded version of the software tool.

As we’ve pointed out before, though, these are definitely not technically-oriented computing measurements. Still, understanding the approach may be useful to those looking for ways to extend our benchmarks to include an energy dimension.

Also posted in Green HPC


insideHPC.com is a production of insideHPC, LLC. © 2006-2013