Search Results for: “paris”

Bitcoin Network Aggregates More Cycles than the TOP500

Over at The Genesis Block, “Phillip Archer” writes that the bitcoin network is now eight times more powerful than the TOP500 supercomputers combined.

While aggregated compute cycles on a network are a far cry from a supercomputer, the comparison does show the remarkable growth of the bitcoin network.

Interestingly, the estimate may still be useful for gauging how well other supercomputers and distributed networking projects would be able to mine bitcoins. Their speed is measured in FLOPS, but they also have the capability of performing the integer operations used in hashing. What would happen if the top 10 supercomputers all switched to bitcoin mining? How much would that affect the network? Let’s reverse the equation, and say that they would receive 1 hash for every 12.7k FLOP. The fastest computer, Sequoia, would measure at about 1.6% of the bitcoin network. The top 10’s combined speed is 48 petaFLOPS, roughly equivalent to 5% of the bitcoin network. In fact, the top 500 supercomputers have a combined speed of 12% of the bitcoin network.
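The conversion in the passage above is easy to check by hand. The sketch below applies the quoted figure of 1 hash per 12.7k FLOP to the 48 petaFLOPS combined speed and backs out the network hash rate implied by the 5% claim. The function names are illustrative, and the Sequoia speed of 16.32 petaFLOPS is an assumption taken from the June 2012 TOP500 ranking rather than from the bitcoin post, so the resulting share is only approximate.

```python
FLOP_PER_HASH = 12_700  # conversion quoted in the post: 1 hash per 12.7k FLOP


def flops_to_hashrate(flops):
    """Convert a FLOPS figure to hashes per second using the quoted ratio."""
    return flops / FLOP_PER_HASH


def implied_network_hashrate(hashrate, share):
    """Back out the total network hash rate from one participant's share."""
    return hashrate / share


top10 = flops_to_hashrate(48e15)                 # 48 petaFLOPS, per the post
network = implied_network_hashrate(top10, 0.05)  # "roughly 5%" of the network
sequoia = flops_to_hashrate(16.32e15)            # assumed Sequoia speed (TOP500, June 2012)

print(f"Top 10 combined: {top10 / 1e12:.2f} TH/s")
print(f"Implied network: {network / 1e12:.1f} TH/s")
print(f"Sequoia share:   {sequoia / network:.1%}")
```

With these inputs Sequoia comes out at roughly 1.7% of the network, close to the 1.6% quoted; the small gap presumably reflects rounding or a slightly different speed figure in the original estimate.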

According to Wikipedia, Bitcoin is accepted in trade by merchants and individuals in many parts of the world. The processing of bitcoin transactions is secured by servers called Bitcoin miners, which communicate over an internet-based network and confirm transactions by adding them to a ledger which is updated and archived periodically. In addition to archiving transactions, each new ledger update creates some newly-minted bitcoins.

Read the Full Story.


Posted in HPC, TOP500

Green Graph 500 Launches to Boost Energy Efficient Big Data Computing

In this special guest feature, Torsten Hoefler from ETH Zurich writes that the new Green Graph500 aims to boost energy-efficient Big Data Computing.

“Big Data” can be analyzed in various ways. The most successful and prevalent programming model, MapReduce, is convincing in its flexibility to adapt to hardware performance variations and faults. However, even though MapReduce covers a huge majority of use-cases, it has its limits for graph computations. Complex graph algorithms become more important as our analysis capabilities grow. For example, problems such as finding hubs in social network graphs are routinely answered today. The underlying algorithm, betweenness centrality, utilizes a graph traversal similar to breadth first search or shortest path search. Systems such as Google’s Pregel, Apache’s Giraph, the (Parallel) Boost Graph Library, and Stanford’s GPS are just some examples of emerging frameworks to handle large-scale graph computations. In order to efficiently compare architectures and possibly programming frameworks, the Graph 500 benchmark strives to establish a database for performance of a standardized breadth first search on various platforms.

As energy is becoming a bigger concern than hardware purchasing costs in large-scale data centers and supercomputing centers, it becomes mandatory to not only consider the performance of such computations but also their exact energy consumption. In fact, if the current cost trends continue, then energy consumption will soon be more important than absolute performance. Such discussions are highly relevant for operators of large data centers such as Google, Amazon, and Yahoo, as well as large supercomputing centers operated by the DOE (e.g., LLNL, Sandia, LANL, ORNL) and the NSF (e.g., NCSA, SDSC, PSC). We are thus looking forward to interesting future developments targeting exascale as well as Big Data architectures and programming frameworks.

We introduce the Green Graph 500 list, which fulfills a variety of purposes. First and foremost, it is to establish the practice of competing not only for the highest performance but also for the highest energy efficiency, directly benefiting society. It is also set out to collect historical data about developments that may allow us to predict future trends, very similar to what the TOP500 list has achieved in the past (who doesn’t like to put up a TOP500 slide to project out FLOP rate for the next 10 years?). The list will also allow us to compare the energy efficiency of a specific computer for certain tasks, e.g., dense linear algebra (a problem mainly limited by memory size and CPU peak floating point performance) versus graph search (a problem mainly limited by memory access rates and global system bandwidth). Those two metrics together may serve as a measure to generate more efficient balanced systems as well as special-purpose systems for one of those tasks.

Finally, the new Green Graph 500 list is not meant to compete with any of the existing lists. It is indeed complementary, filling an important gap in the field. In fact, the rules are designed to be similar to the established Green 500 rules (similar, not identical, for example with regards to the network) so that comparisons can easily be made in the future. It also directly integrates with the Graph 500 list and submission system to guarantee one-to-one comparisons (a submission record may be in the Green Graph 500 as well as the Graph 500 even though the lists are ranked by different indices).

The Green Graph 500 list is soliciting submissions from everyone through the Graph 500 submission system. To submit to the list, simply start a normal Graph 500 submission and select “Submit to Green Graph 500” or “Submit to both lists”. The only additional data you need for a Green Graph 500 submission is the actual power draw of your system during the benchmark.

Another small difference between Graph 500 and its Green peer is the measurement methodology. Since most power meters are not accurate enough to measure the rather short actual BFS run (not including the post-check etc.), we offer a slightly modified version of the reference benchmark which runs the BFS in a tight loop long enough for an energy meter with low time resolution to measure the exact energy consumption. This benchmark will also report a Graph 500 number valid for submission. For runs with a custom implementation, this would need to be ensured manually (4-5 lines of C code suffice for this). The submission opens together with the official Graph 500 submission.
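The tight-loop idea can be sketched in a few lines. The example below is not the reference benchmark: it is a toy illustration in Python, with hypothetical function names, that repeats a breadth first search until a minimum measurement window has elapsed and then derives traversed edges per second per watt. It counts every edge scanned, which is a simplification of how the Graph 500 counts traversed edges, and it assumes the average power draw over the window is read from an external meter.

```python
import time
from collections import deque


def bfs_edges(adj, root):
    """Run one breadth first search and return the number of edges scanned."""
    seen = {root}
    queue = deque([root])
    edges = 0
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            edges += 1
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return edges


def timed_bfs_loop(adj, root, min_seconds, clock=time.perf_counter):
    """Repeat the BFS until at least min_seconds have elapsed.

    Returns (runs, total_edges, elapsed_seconds) so that a slow power
    meter has a long enough window to average over.
    """
    start = clock()
    runs = 0
    edges = 0
    while True:
        edges += bfs_edges(adj, root)
        runs += 1
        elapsed = clock() - start
        if elapsed >= min_seconds:
            return runs, edges, elapsed


def teps_per_watt(edges, seconds, avg_watts):
    """Traversed edges per second, divided by average power in watts."""
    return edges / seconds / avg_watts


if __name__ == "__main__":
    triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}  # tiny stand-in graph
    runs, edges, elapsed = timed_bfs_loop(triangle, 0, min_seconds=0.05)
    # 100.0 W is a made-up meter reading used only for illustration.
    print(runs, edges, teps_per_watt(edges, elapsed, avg_watts=100.0))
```

Passing the clock as a parameter keeps the loop testable; a real run would simply use the default timer and the meter's reported average power.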

As a sneak peek, we prepared a sample list from March 2013’s energy submissions (which may not have followed all the official rules, thus, the list is not official).

The Green Graph 500 list is maintained by Torsten Hoefler from ETH Zurich in collaboration with the Graph 500 executive committee. For questions or comments please contact [email protected]

Posted in Green HPC, HPC, inside-BigData

Atipa to Build 3.4 Petaflop Super for DOE Environmental Molecular Sciences Lab

The Department of Energy’s Environmental Molecular Sciences Laboratory has ordered up a 3.4-petaflop supercomputer from Atipa Technologies, the HPC division of Microtech Computers. The new system will replace the Chinook supercomputer which aids energy, environment and basic science missions important to DOE.

The 42-rack machine will boast a total of 195,840 cores, consisting of 23,000 conventional Intel Xeon processors tied to 184,000 gigabytes of memory. The 1,440 compute nodes will also have an undisclosed number of Xeon Phi coprocessing cards alongside the Xeons, allowing the system to parallelize up to 120 extra calculations. A shared parallel filesystem will offer 2.7 petabytes of usable storage across an FDR InfiniBand network. Each node will have 128 GB of memory. What sets the new supercomputer apart, Atipa said, is the amount of memory devoted to each CPU, allowing the models that scientists run to operate more efficiently. For comparison, the recently completed “Stampede” supercomputer at the University of Texas also relies on just over 184,000 gigabytes of memory, with 204,900 cores split between a number of 8-core Intel Xeon E5-2680 microprocessors.
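The headline figures above can be cross-checked against each other. This short sketch uses only the node count, per-node memory and total core count quoted in the paragraph; the variable names are illustrative.

```python
nodes = 1_440            # compute nodes, per the article
mem_per_node_gb = 128    # memory per node, per the article
total_cores = 195_840    # total cores, per the article

total_mem_gb = nodes * mem_per_node_gb
cores_per_node, remainder = divmod(total_cores, nodes)

print(f"Total memory:   {total_mem_gb:,} GB")
print(f"Cores per node: {cores_per_node} (remainder {remainder})")
```

The memory total matches the rounded 184,000 gigabytes, and the core count divides evenly into 136 per node. That would be consistent with 16 Xeon cores plus 120 coprocessor cores per node, which may be what the “120 extra calculations” refers to, but the article does not state that split, so treat it as an inference.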

Read the Full Story.


Posted in Business of HPC, Co-processors, Compute, HPC, HPC Hardware

Adaptive Computing Enhances Moab HPC Suite with Version 7.2 at SC12

In this video from SC12, Brady Kimball from Adaptive Computing describes enhancements to the Moab Compute Manager 7.2 suite including:

  • Support for Intel Xeon Phi coprocessors
  • Dual Domain Scheduling for Cray systems
  • Streamlined RPM experience
  • Allocation Updates
  • Enhanced Viewpoint GUI for HPC

Read the Full Story.

Posted in Co-processors, Compute, Events, HPC, HPC Hardware, HPC Software, SC12, System Management, Video

Energy Efficiency Focus in the SC12 Technical Program

by Natalie Bates, Co-chair Energy Efficient HPC Working Group (EE HPC WG)

Energy efficiency will again be a hot topic at SC12, with at least 38 Technical Program sessions focused on energy efficiency.  A complete list of these sessions organized both chronologically and by topic can be found on the Energy Efficient HPC Working Group website.  SC12, the annual International Conference for High Performance Computing, Networking, Storage and Analysis, will be held Nov. 10-16 in Salt Lake City, Utah. For more information, see the SC12 website.

BROAD SCOPE SESSIONS

“The Third Annual Workshop on Energy Efficient High Performance Computing – Redefining System Architecture and Data Centers” promises to be interesting to a broad audience. Some of the featured speakers include: Peter Kogge, University of Notre Dame, who will look at the historical trends of power, energy and supercomputing; John Shalf, Lawrence Berkeley National Laboratory, whose talk will focus on the energy requirements for applications; as well as Herbert Huber, Leibniz Supercomputing Center, and Steve Hammond, National Renewable Energy Laboratory, who will speak about energy efficient data centers.

There are four other technical programs that will cover the topic of energy efficiency at a high level.  Kirk Cameron, Virginia Tech is on the slate to give two talks, both of which have clever and enticing titles with phrases about a “Growing Power Struggle” and “Energy Oddities.”  Prohibitive energy costs motivated Thomas Ludwig, German Climate Computing Center to consider the cost and benefits of “HPC-Based Science in the Exascale Era.”  Finally, there is a “Cool Supercomputing” Birds of Feather (BoF) organized by Pacific Northwest National Laboratory that covers tools and techniques for optimizing energy consumption at all levels.

“Setting Trends for Energy Efficiency” is a BoF representing a collaborative effort by the Top500, Green500, the Energy Efficient HPC Working Group and The Green Grid to standardize the power measurement methodology used when running system workloads for architectural comparison, such as High Performance Linpack. This is one of seven sessions that cover energy efficiency measures and metrics. The Green500, Top500 and now the Graph500 have their own BoFs and will report power consumption and energy efficiency as well as performance for their Lists. The High Performance Group at the Standard Performance Evaluation Corporation (SPEC) has also organized a BoF that will discuss a new OpenMP benchmark suite with an optional energy metric that scales to 512 threads. From the home of the Green500 at Virginia Tech, Balaji Subramaniam will present his doctoral showcase on metrics for energy efficiency. Finally, an Intel team will present a paper on tuning for the Graph500 Traversal which includes both performance and energy efficiency results.

SESSIONS FOCUSSED ON SYSTEM HARDWARE

Thirteen of the sessions are exploring system hardware energy efficiency. Of these thirteen, seven focus on alternative processors like GPU and ARM that are continuing the trend towards aggregating low-power processors and using accelerators. There are three BoFs that explore alternative processors and all three are organized by Europeans. The Partnership for Advanced Computing in Europe (PRACE) explores a set of prototypes to test and evaluate promising new technologies for future multi-Petaflop/s systems that include GPUs, ARM processors, DSPs and FPGAs. The Barcelona Supercomputing Center is heading up an ARM-based exascale demonstration and will review their research results and plans at two BoFs: “Energy Efficient HPC” and “Exascale Research – The European Approach.” Besides these BoFs, there is a session as part of Broader Engagement where Calxeda, an ARM-based server provider, will present their products and roadmaps. NEC is presenting an exhibitor forum on “Hybrid Solutions with a Vector-Architecture for Efficiency.” There is also a paper on “Multi-Core DSP” and a poster on modeling “Power-Performance Efficiency” for GPUs.

A new topic for SC this year is a focus on memory technologies, which was presaged by a keynote at the International Supercomputing Conference held in Hamburg, Germany last June, when Dr. Byungse So, Samsung Senior Vice President, gave a talk on “Advanced Memory Technology – #1 Factor for Energy Efficient HPC”. Two papers, RAMZzz and Mage, both explore novel memory system designs. Samsung and Micron, respectively, are presenting exhibitor forums on “How Memory and SSDs can Optimize Data Center Operations” and “Hybrid Memory Cube (HMC)”.

Whereas memory is on the upswing, the focus on liquid cooling has waned, with only two sessions this year compared to six last year at SC’11. Eurotech will present an exhibitor forum on “Differences Between Cold and Hot Water Cooling on CPU and Hybrid Supercomputers” and Green Revolution Cooling will present on “100% Server Heat Recapture in Data Centers is Now a Reality.”

DATA CENTER SESSIONS

Kimberly Cupps, Lawrence Livermore National Laboratory will present on “The Sequoia System and Facilities Integration Story”.  It appears that she will be giving the same talk at two different sessions; on Monday during Broader Engagement as well as on Tuesday as an Invited Speaker.  Also, the M+W Group will present an exhibitor forum on “Reducing First Costs and Improving Future Flexibility in the Construction of High Performance Computing Facilities.”

APPLICATION TUNING AND JOB SCHEDULING

There are nine sessions that describe research on tuning applications for energy efficiency and various aspects of energy efficient job scheduling. Seven of the nine sessions are doctoral showcases, papers or posters. There is a BoF on “Power and Energy Measurement Modeling”. In this BoF, members of the research community and industry will present the current state of the art and limitations in measuring and modeling power and energy consumption and their effect on HPC application performance. An open discussion about future directions for such work will follow, with the intention of creating a “wish list” of feature requests to HPC vendors. Another BoF of interest is the SLURM User Group Meeting, which covers the open source SLURM job scheduler. Also, Charles Lively, ORNL, will give a talk during Broader Engagement on “Heading Towards Exascale – Techniques to Improve Application Performance and Energy Consumption Using Application-Level Tools”.

Following is a list of the titles for the doctoral showcases:

Following is a list of the titles for the papers:

Following is a list of the titles for the posters:

OTHER SESSIONS

Two other sessions that will cover energy efficiency include an all day workshop on “High Performance Computing, Networking and Analytics for the Power Grid” and a poster on “Pay as You Go in the Cloud: One Watt at a Time.”

Although this is a list of sessions with a specific focus on energy efficiency, many more sessions will include energy efficiency as part of a broader focus.

ClusterVision Manages Murex’s Cluster

The French financial analysis solutions provider, Murex, has chosen ClusterVision to supply cluster management software and professional support services for its new HPC cluster in Paris, which is used to develop risk and asset management solutions for financial organisations worldwide.

After the completion of a successful trial, Murex will use Bright Cluster Manager, supported by ClusterVision, as part of an overall HPC management service for their existing GPU powered compute cluster. The cluster is primarily used as an in-house development and testing platform for Murex’s commercial software.

Murex is one of Europe’s largest software developers, and one of the world’s leading providers of software and services to the financial sector. Its flagship product, Murex MX-3, is used for risk analysis in financial market trading, and has over 36,000 users at 200 institutions in 65 countries worldwide.

Risk analysis is both computationally and data intensive, and is one of the most rapidly growing application areas for high performance computing. Murex’s new HPC capability now allows them to manage complex financial products in high precision, and in near real time, compared to a previous capability of only computing analytics once or twice a day.

The system comprises a single head node, with redundant power supply and hard disk drives, and 16 IBM iDataplex dx360 M3 server compute nodes. Each server incorporates two of Intel’s X5675 six-core, 3.06GHz processors and two Nvidia Tesla M2090 accelerators. System interconnect is managed via Gigabit Ethernet and Mellanox QDR InfiniBand, and storage is provided by IBM’s high performance clustered file system, GPFS. Large volume data analytics applications, such as the financial analysis solutions developed by Murex, are particularly suited to acceleration using GPUs.

As a relative newcomer to cluster computing, Murex needed cluster management that was easy to install and operate, and that would create a robust reference platform for their own software development process. After an evaluation of their requirements, ClusterVision supplied Murex with Bright Cluster Manager, initially as a four-node, six-month, proof-of-concept demonstration. Murex has now implemented full cluster management, using 17 Bright Cluster Manager licences, together with a three-year provision of associated professional services. ClusterVision will also assist the team at Murex in their evaluation of the latest release of Bright Cluster Manager, version 6.0, where the Amazon EC2 cloud bursting capabilities are of particular interest.

‘IT resources are rare these days and the Bright Computing cluster management suite enabled us to avoid investing a huge number of man days rediscovering what others have already done, or re-developing something internally that we would have had to support for years. The possibility to extend our cluster with EC2 is perfectly suited to our usage since, from time to time, we need to run some benchmarks on huge test cases which exceed our own compute capacity,’ said Pierre Spatz, head of quantitative analysis at Murex.

This story originally appeared on HPC Projects. It appears here as part of a cross-publishing agreement with Scientific Computing World.


Posted in HPC, HPC Software, New Installations, System Management

The Missing Amazon Glacier Cost-Estimator

As reported here, storage pundits have been dubious as to claims by Amazon that their new AWS Glacier cloud archiving service is a tape killer. At a penny per gigabyte per month, the press release had some journalists eating the dog food with speculations that Glacier could make tape silos obsolete.

But wait. Was Amazon leaving out crucial pieces of the comparison puzzle? Now J. Brandt Buckley has posted an Amazon Glacier Cost-Estimator Calculator to help elucidate the relationship between cost, data retention periods, and recovery scenarios.
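The storage side of that relationship is straightforward arithmetic. The sketch below is not the calculator mentioned above; it is a minimal illustration, with a hypothetical function name, that uses only the penny-per-gigabyte-per-month price quoted in this post. Retrieval and transfer fees are deliberately left out because the post gives no figures for them, and they are exactly the pieces a full estimator would need to add.

```python
PRICE_CENTS_PER_GB_MONTH = 1  # "a penny per gigabyte per month", per the post


def storage_cost_usd(gigabytes, months):
    """Storage-only cost in US dollars; retrieval fees are not modeled."""
    return gigabytes * months * PRICE_CENTS_PER_GB_MONTH / 100


archive_gb = 100_000   # hypothetical 100 TB archive
retention_months = 36  # hypothetical three-year retention period

print(f"${storage_cost_usd(archive_gb, retention_months):,.2f}")
```

Because the storage term grows linearly with both size and retention period, any comparison with tape hinges on the terms this sketch omits.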

Posted in Cloud HPC, HPC, HPC Hardware, Storage

Video: Graphics in the Cloud

In this video from SIGGRAPH 2012, Nvidia’s Ian Williams presents on the VGX Hypervisor, the company’s move to bring graphics to the Cloud.

The new NVIDIA VGX technology allows for true hardware virtualization of the GPU, enabling a true PC and Workstation experience in a virtual desktop environment. This session will cover a comparison of graphics virtualization technologies available in the industry (both SW and HW methods) as well as accelerated remoting solutions.


Posted in Cloud HPC, Events, GPUs, HPC, HPC Hardware, Video, Visualization

Call for Papers: Performance Modeling, Benchmarking & Simulation of HPC Systems Workshop

The Performance Modeling, Benchmarking & Simulation of HPC Systems Workshop at SC12 has issued its Call for Papers. The event will take place Nov. 12 in Salt Lake City.

This workshop is concerned with the comparison of high-performance computing systems through performance modeling, benchmarking or through the use of tools such as simulators. We are particularly interested in research which reports the ability to measure and make tradeoffs in software/hardware co-design to improve sustained application performance. We are also keen to capture the assessment of future systems, for example through work that ensures continued application scalability through peta- and exa-scale systems.

Papers are due Sept. 9, 2012. Read the Full Story.


Posted in Events, Exascale, HPC, SC12

Infographic: Sequoia – The Fastest Supercomputer on the Planet

The HPC 4 Energy Blog brings us this infographic on the #1-ranked Sequoia supercomputer.

Lawrence Livermore National Laboratory’s Sequoia supercomputer, an IBM BlueGene/Q system, was ranked as the world’s fastest supercomputer on June 18, 2012. Sequoia boasts 16.32 petaflops using 1,572,864 cores, but how fast can it complete calculations? This infographic puts its speed into perspective, demonstrating the potential of American HPC resources to save organizations time and money.

While the comparison of supercomputer flops vs. handheld calculators is pretty much tired, I think it helps the layman understand how immense 16.3 petaflops is compared to machine capabilities of just a few years ago. Download the infographic.


Posted in HPC

DOE Doles Out Cash to AMD, Whamcloud for Exascale Research

By Timothy Prickett Morgan

The US Department of Energy used its massive budget to push supercomputers to gigaflops, teraflops, and petaflops in the prior three decades and it is being tasked to put the pedal to the exaflops metal before the end of this decade.

To get there, the DOE has to fund primary research at IT vendors who might otherwise not get around to it until it suited their own commercial needs. It has to also foster collaboration across vendors who might otherwise rather not share ideas, because no one vendor is going to be able to solve the exascale problem by itself.

The main vehicle for funding exascale computing is called the Extreme-Scale Computing Research and Development program, which is being funded by both halves of the DOE. That would be the Office of Science, which funds scientific research in the nuke labs, and the National Nuclear Security Administration, which runs simulations to make sure the US military’s nuclear warheads work since Uncle Sam can’t set one off thanks to the Nuclear Test Ban Treaty. There is talk that the supers at the DOE labs aren’t just making sure existing nukes work, but also helping to redesign them.

The first phase of the DOE’s exascale system funding is called FastForward, which is being administered by Lawrence Livermore National Laboratory in conjunction with the six other primary DOE nuke labs (some of which dislike being called nuke labs even though they do nuclear physics research).

Those other DOE labs, along with LLNL, are the name brands in high performance computing in the United States: Argonne National Laboratory, Lawrence Berkeley National Laboratory, Los Alamos National Laboratory, Oak Ridge National Laboratory, Pacific Northwest National Laboratory, and Sandia National Laboratories.

The FastForward exascale research program issued its request for proposals on March 29, and asked that they be submitted by May 11. The program seeks to fund basic research in exascale computing as it relates to three areas: Memory, processors, and storage and I/O.

It has an explicit goal of trying to solicit cooperation across multiple companies, much like the US Defense Advanced Research Project Agency’s Ubiquitous High Performance Computing program. In a way, the UHPC program at DARPA is the trailblazer for the FastForward program at DOE.

DARPA always first to fight
The UHPC program was announced in March 2010 with the goal of creating an HPC system that by 2018 can do 50 gigaflops per watt (BlueGene/Q, the current top performer and most efficient super in the world, can do a little more than 2 gigaflops per watt), pack 10 petabytes of storage, and do around 3 petaflops of number crunching in a slightly larger server rack than is standard and within a 57 kilowatt power budget.

Building an exascale system would seem easier, by comparison, since there is, in theory, no limit on the size of the machine or its power budget. But in reality, there are big-time power limits on exascale supers because no one is going to build a 20 megawatt nuclear or coal power station to keep one fed and cooled.

In August 2010, two teams were awarded UHPC ExtremeScale contracts with a total of $74m: one led by Nvidia and the other by Intel. Nvidia got a $25m grant and has teamed up with Cray, Oak Ridge National Lab, and six universities. Intel teamed up with three universities, SGI, Lockheed Martin, Cray, Reservoir Labs, and ET International to take down a $49m grant.

In three related UHPC grants, Sandia National Lab has teamed up with LexisNexis and two universities, MIT has its own grant, and so does Georgia Tech, apparently. Total funding for the UHPC effort is said to be on the order of $100m, but DARPA has never confirmed that figure.

Three steps to DOE-sponsored exascale computing
With the FastForward program, the DOE is setting a cap of $20m on any proposals to try to encourage focused work on specific problems, and said at the get-go that what it was looking for was more like two $10m proposals in each of the three areas of primary research.

It is not clear how many awards have been made yet – the vendors are not notified of who was bidding and who won, but rather that they won. At the moment, AMD has been awarded a FastForward contract for processor and memory research and Whamcloud has one contract for storage and I/O research. There could be – and probably will be – others getting grants. Uncle Sam likes to hedge its HPC bets.

Once the primary research on possible exascale technologies is completed over the next two years, DOE will be looking at funding vendors to put together prototypes – this is tentatively called the system design phase – and then, by 2020, to build full exascale systems based on those prototypes – known as the system build phase at the moment. DOE will no doubt come up with other names later on.

According to the statement of work (PDF) for the FastForward contract, the issues that vendors face on the exascale challenge are daunting.

On a petaflops-class system today, it costs somewhere between $5m and $10m to power and cool the machine, and extrapolating to an exascale machine using current technology, even with efficiency improvements, you would be in for $2.5bn a year just to power an exascale beast, and you would need something on the order of 1,000 megawatts to power it up. That’s 50 nuclear reactors, more or less. The DOE has set a target for top juice consumption of 20 megawatts for an exascale system.

Using DDR3 main memory today, a 2 petaflop machine with 2PB of main memory burns about 1.25 megawatts, and assuming that we can get to DDR5 main memory by 2020, we’re talking about needing 260 megawatts just for the memory subsystem in an exascale box. Even if you cut the memory-to-flops ratio by a factor of five, which many people don’t think is a good idea, you are above 50 megawatts just for the memory subsystems across a cluster.
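The memory figures above can be laid out as a quick calculation. This sketch uses only numbers quoted in the article; the linear scale-up of today's DDR3 figure to an exaflops machine is a naive assumption added here for comparison, and the names are illustrative.

```python
DDR3_MEMORY_MW = 1.25         # memory power of a 2 petaflop machine today, per the article
MACHINE_PFLOPS = 2.0          # size of that machine
EXASCALE_PFLOPS = 1_000.0     # 1 exaflops expressed in petaflops
EXASCALE_MEMORY_MW = 260.0    # article's estimate with DDR5-era memory
POWER_TARGET_MW = 20.0        # DOE target for the whole exascale system


def reduced_memory_power_mw(ratio_cut):
    """Memory power after cutting the memory-to-flops ratio by ratio_cut."""
    return EXASCALE_MEMORY_MW / ratio_cut


# Naive assumption: memory power scales linearly with machine size.
naive_ddr3_mw = DDR3_MEMORY_MW * EXASCALE_PFLOPS / MACHINE_PFLOPS
reduced_mw = reduced_memory_power_mw(5)

print(f"DDR3 scaled linearly:     {naive_ddr3_mw:.0f} MW")
print(f"Article's 2020 estimate:  {EXASCALE_MEMORY_MW:.0f} MW")
print(f"After a 5x ratio cut:     {reduced_mw:.0f} MW")
print(f"Multiple of 20 MW target: {reduced_mw / POWER_TARGET_MW:.1f}x")
```

Even the reduced figure of 52 megawatts is 2.6 times the power target for the entire system, which is the point the article is making.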

In addition to power consumption, memory components are not getting as cheap as CPU components, and memory bandwidth is not keeping up with the ever-increasing core count on processors and thus memory latencies are increasing.

There are resiliency issues with all of the components in an exascale system, which will have large numbers of components frying all the time. And then you are going to have billions of compute elements, and there has to be a hierarchy of memory and interconnects to keep them all fed and communicating with each other as simulations run.

Worse still, programming these petaflops machines is a complete bitch, and an exaflops system will be in the range of old battle-axe mother-in-law. Beyond that, you are programming against Death.

On the processor front, during the FastForward phase, the DOE is looking to better measure and control the power use in processors and integration with memory, network, and optics from the CPU or hybrid CPU-GPU chip, as the case may be. On its wish list, the DOE wants automatic rollback after faults or synchronization errors and better fault detection and correction.

Boosting the movement of data onto and off of the chip is also key, as is handling collective operations across compute elements, and software-controlled placement of data on the chip and its memory hierarchy is also penciled in. Putting network interfaces on the processor is a requirement, and so is boosting the concurrency across many cores and many threads on the cores.

The compute elements of the FastForward portion of the project have to provide 50 gigaflops per watt at scale – that’s the same level of performance per watt that DARPA is looking for in its ExtremeScale UHPC project. The system has to have a mean time between application failure of six days or longer.
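The 50 gigaflops per watt goal lines up with the power figures quoted elsewhere in the article, which a short calculation makes visible. The sketch below uses the article's numbers (1 exaflops, 50 and roughly 2 gigaflops per watt, and the UHPC rack's 3 petaflops in 57 kilowatts); the function names are illustrative, and it considers compute efficiency alone, ignoring cooling and other overheads.

```python
def power_megawatts(flops, gflops_per_watt):
    """Power needed to sustain `flops` at a given efficiency, in megawatts."""
    watts = flops / (gflops_per_watt * 1e9)
    return watts / 1e6


def gflops_per_watt(flops, watts):
    """Efficiency in gigaflops per watt for a machine of given speed and power."""
    return flops / watts / 1e9


EXAFLOPS = 1e18

target_mw = power_megawatts(EXAFLOPS, 50)      # FastForward efficiency goal
bluegene_mw = power_megawatts(EXAFLOPS, 2)     # roughly BlueGene/Q efficiency
uhpc_rack = gflops_per_watt(3e15, 57e3)        # 3 petaflops in a 57 kW rack

print(f"Exaflops at 50 GF/W: {target_mw:.0f} MW")
print(f"Exaflops at 2 GF/W:  {bluegene_mw:.0f} MW")
print(f"UHPC rack target:    {uhpc_rack:.1f} GF/W")
```

At 50 gigaflops per watt an exaflops machine draws 20 megawatts, matching the DOE's stated power target, while the UHPC rack specification works out to a little over 52 gigaflops per watt.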

This doesn’t sound so great until you realize the system will have trillions of components and that today, with petaflops-class machines, it is on the order of one to five days and, without check-pointing or other resilience mechanisms, will drop to about six hours by 2020.

DOE would like to have compute nodes with more than 10 teraflops of double-precision number-crunching performance, 4TB/sec of aggregate memory bandwidth and more than 100GB of main memory; something on the order of 32GB to 640GB is preferred. Total bandwidth between a node and the interconnect that lashes them together should be in excess of 400GB/sec.

The burden of memory
On the memory front, DRAM failure rates are higher than expected and density improvements in memory chips are not coming fast enough. So DOE wants researchers to explore the use of in-memory processing – literally putting tiny compute elements in the memory to do vector math or scatter/gather operations – as well as the integration of various forms of non-volatile storage into exascale systems.

The nuke labs are thinking that 500GB of NVRAM of some sort per socket will do the trick. While 4TB/sec of bandwidth is a baseline, DOE really wants 10TB/sec.

Parallel storage subsystems generally hold up better than compute nodes on exascale systems these days, with the DOE estimating that the mean time between application failure due to a storage issue is around 20 days. Without any substantial changes to storage architectures, that will drop to 14 days by 2020. Disk capacity is increasing at a decent clip, but disk performance is not. Solid state drives are fast, but they ain’t cheap.

If availability is not as big of an issue for exabyte-class storage, then scale surely is. That exascale system in 2020 will have between 100,000 and 1 million nodes, and will have somewhere between 100 million and 1 billion computing elements, with somewhere between 30PB and 60PB of memory, and across which some sort of concurrency will have to be provided to run applications.
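The sizing figures above can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope illustration: it assumes decimal units (1PB = 10^15 bytes), takes "exascale" to mean exactly 10^18 flops, and the helper names are hypothetical rather than anything from the DOE documents.

```python
# Back-of-the-envelope check of the exascale sizing figures quoted above.
# Assumptions (not from the article): decimal units, exascale = 1e18 flops.

EXAFLOPS = 1e18
TERAFLOPS = 1e12
PB = 1e15
GB = 1e9


def per_node_flops(total_flops, nodes):
    """Flops each node must deliver if the total is spread evenly."""
    return total_flops / nodes


def per_node_memory_gb(total_memory_pb, nodes):
    """Memory per node in GB if the total (in PB) is spread evenly."""
    return total_memory_pb * PB / nodes / GB


node_counts = (100_000, 1_000_000)   # range of node counts from the article
memory_pb = (30, 60)                 # range of total memory from the article

for nodes in node_counts:
    tf = per_node_flops(EXAFLOPS, nodes) / TERAFLOPS
    lo = per_node_memory_gb(memory_pb[0], nodes)
    hi = per_node_memory_gb(memory_pb[1], nodes)
    print(f"{nodes:>9,} nodes: {tf:.0f} TF/node, {lo:.0f}-{hi:.0f} GB/node")
```

Under these assumptions, 100,000 nodes implies 10 teraflops and 300GB to 600GB per node, while 1 million nodes implies 1 teraflops and 30GB to 60GB per node, which sits in the same ballpark as the per-node targets DOE quotes.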

This behemoth will require from 600PB to 3,000PB of disk capacity. In effect, the disk array for an exascale compute farm will be an exascale system in its own right, with peak I/O burst rates on the order of 200TB/sec and metadata transaction rates on the order of 100MB/sec.

For the FastForward storage research projects, DOE wants a storage system that can keep the fully running exascale system fed, without crashing, for 30 days or more, and the mean time between unrecoverable data loss should be 120 days or higher – and do so with the storage array crammed to 80 per cent of capacity and performing full memory dumps from the system every hour.
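To get a feel for what an hourly full memory dump means, here is a rough sketch that combines the 30PB to 60PB memory range and the 200TB/sec peak burst rate quoted earlier. It assumes decimal units and that the dump runs at the full burst rate for its whole duration, which is optimistic; the function names are made up for illustration.

```python
# Rough estimate of how long a full memory dump takes at peak burst rate.
# Assumptions (not from the article): decimal units, sustained peak rate.

PB = 1e15
TB = 1e12


def dump_seconds(memory_pb, burst_tb_per_sec):
    """Seconds needed to write memory_pb petabytes at the given burst rate."""
    return memory_pb * PB / (burst_tb_per_sec * TB)


def fraction_of_interval(seconds, interval_seconds=3600):
    """Share of the dump interval (default: one hour) spent dumping."""
    return seconds / interval_seconds


for memory_pb in (30, 60):
    secs = dump_seconds(memory_pb, 200)
    share = fraction_of_interval(secs)
    print(f"{memory_pb}PB at 200TB/sec: {secs:.0f} s ({share:.1%} of each hour)")
```

On these assumptions a dump takes 150 to 300 seconds, so roughly 4 to 8 per cent of every hour would go to check-pointing even at peak I/O rates.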

Data integrity algorithms for storage can impose no more than 10 per cent overhead on the metadata servers at the heart of the storage array. Metadata insert rates are expected to be on the order of 1 million to 100 million per second, and lookup and retrievals are expected to be on the order of 100,000 to 10 million per second out of the metadata servers.

During peak system writing and reading operations, the metadata servers can’t take any more than a 25 per cent performance degradation hit, and DOE would really like it to be 10 per cent.

No big deal, right?

So, good luck, AMD, Whamcloud, and friends.

The winners
AMD received research grants under the FastForward portion of the DOE Extreme-Scale Computing program for both processing and memory research, and according to Alan Lee, corporate vice president for advanced research and development at the chip maker, the reason is that the two are interrelated.

Lee was not able to elaborate much on the research plans AMD has put together, but he did confirm to El Reg that AMD would be focusing on research to push its hybrid CPU-GPU processors, what the company calls its Accelerated Processing Units or APUs. On the memory side, AMD is looking at different types of memory, different structures and hierarchies of memory, and different relationships between these memories and the APUs, and this will, of course, necessarily involve system interconnect work.

“Moving data around to feed the beast is critical for exascale,” explained Lee, adding that the SeaMicro acquisition earlier this year was not done for this DOE work, but the interconnect expertise that AMD gained through that acquisition would be put to good use.

AMD researchers have already identified a subset of key memory technologies that they think will be applicable to exascale-class systems, and this is what the research will focus on. AMD is not throwing the whole kitchen sink of possible volatile and non-volatile memories into the mix.

Lee was not at liberty to say what memory technologies AMD was looking at – that would be helping its inevitable competition. AMD has received a grant of $3m for the memory research and $12.6m for the processor research. It is interesting that AMD was able to bag these contracts all by its lonesome specifically after the DOE said that it wanted multiple companies cooperating on the work.

On the storage front, Whamcloud, the company that was formed in July 2010 to support and extend the open source Lustre file system, is the leading contractor and is soliciting help from a bunch of others.

Whamcloud is managing the project and lending its Lustre file system expertise and is relying on HDF Group for application I/O expertise, EMC for system I/O and I/O aggregation skills, and Cray for scale-out testing of the storage systems. This exascale storage system will have a mix of flash and disk drives.

The word on the street is that Whamcloud received around $8m for its FastForward grant. ®

This article originally appeared in The Register. It appears here in its entirety as part of a cross-publishing agreement.

Posted in Computing Research, HPC

Andrew Jones: A Preview of ISC’12

Andrew Jones from NAG in the UK gives us a preview of what to expect at ISC’12 in Hamburg.

As at ISC’11 last year (and SC11), I think there will be a strong fight for attention in the key area of manycore/GPU devices – and a matching search for evidence of real progress. So far the loudest voice has been NVidia and CUDA, especially following NVidia’s successful GTC event recently. However, interest in Intel’s MIC (Knights Corner) is strong and growing – MIC has often been a big discussion topic in workshops, conferences and meetings over the last year. As the MIC product launch gets closer, people will be making obvious comparisons with NVidia’s Kepler announced at the GTC.

Posted in Accelerators, Events, GPUs, HPC, HPC Hardware, ISC12

Interview: Bull to Showcase HPC Leadership at ISC’12

Our series of features on European HPC vendors continues with this interview with Pascale Bernier-Bruna from Bull. The company recently deployed the hybrid 2 Petaflop CURIE supercomputer, the first French Tier0 system open to scientists through the French participation in the PRACE research infrastructure.

insideHPC: What is the current status of CURIE, the 2 Petaflop supercomputer designed by Bull for GENCI? Is it fully deployed and running customer workloads?

Pascale Bernier-Bruna: The CURIE supercomputer – which was implemented in two stages between late 2010 and the end of 2011 – is now completely installed and has been fully available to the scientific community since the beginning of March 2012.

In its final testing phase from December 2011 to February 2012, CURIE’s effective operation has been verified by running a number of very large-scale simulations using virtually all of its components. This so-called “Grand Challenges” phase also enabled researchers to achieve major scientific advances.

For example, this was the case with the work carried out in December 2011 by the team led by Michel Caffarel from the quantum chemistry and physics laboratory at CNRS/Paul Sabatier University in Toulouse. In order to gain a better understanding of the chemical phenomena at work in the process of neuro-degeneration, particularly in Alzheimer’s Disease, which currently affects over 20 million people worldwide, researchers were looking to model the behaviour of metallic ions that are particularly involved in these processes.

The simulations carried out using virtually all CURIE’s processing cores with the QMC=Chem code proved to be far more accurate than those obtained so far using classical methods.

In astrophysics, a team of researchers from the Paris Observatory, coordinated by Jean-Michel Alimi, has performed the first-ever computer model simulation of the structuring of the entire observable universe, from the Big Bang to the present day. The simulation has made it possible to follow the evolution of 550 billion particles. This is the first of three runs which are part of an exceptional project called “Deus: full universe run”, carried out using CURIE.

This simulation, along with the two additional runs expected by late May 2012, will provide outstanding support for future projects dedicated to the observation and mapping of the universe. They will shed light on the nature of dark energy and its effects on cosmic structure formation, and hence on the distribution of dark matter and galaxies in the universe.

The implementation of “Deus: full universe run” represents a new stage in the development of supercomputing. The first simulation in the project has largely outperformed the most advanced cosmological simulations carried out over the past few years by a number of international collaborations at the largest supercomputing facilities around the world. The entire project will use more than 30 million hours (about 3500 years) of computing time on virtually all CPUs of CURIE.
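The "about 3500 years" figure is a rounded conversion of the 30 million hours, as the sketch below shows. It assumes the hours are CPU-core hours and uses a 365-day year; the core count in the second call is a purely hypothetical placeholder, not CURIE's actual configuration.

```python
# Convert a compute-time budget into years and into wall-clock days.
# Assumptions (not from the interview): core-hours, 365-day year.

HOURS_PER_YEAR = 24 * 365


def core_hours_to_years(core_hours):
    """Years one core would need to work through core_hours."""
    return core_hours / HOURS_PER_YEAR


def wall_clock_days(core_hours, cores):
    """Days needed if the work is spread perfectly over `cores` cores."""
    return core_hours / cores / 24


budget = 30_000_000  # core-hours quoted for the whole project

print(f"{core_hours_to_years(budget):,.0f} single-core years")

# 80_000 is a hypothetical core count chosen only for illustration.
print(f"{wall_clock_days(budget, 80_000):.1f} days on 80,000 cores")
```

With a 365-day year the conversion gives roughly 3,425 years, which the interview rounds to about 3500.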

Other research teams have high hopes of CURIE, including those from the CEA working on nuclear fusion, with the aim of scoping the future prototype of ITER (the International Thermonuclear Experimental Reactor). Researchers at CORIA and CERFACS are planning to use the system to optimize the combustion processes in turbines and piston engines. And teams from the Pierre Simon Laplace Institute (IPSL) will be creating multi-level climate models, to study cyclones in the Indian Ocean.

insideHPC: What would you say Bull’s unique strengths are as a key vendor in the worldwide HPC market?

Pascale Bernier-Bruna: In its HPC strategy, Bull has three major components. The first one is that Bull has proved with the Tera-100, Curie and Helios systems that it can successfully design and implement petaflops-scale supercomputers integrating thousands of servers, tens of thousands of processor cores and complex storage architectures. Now when you have installed some of the most powerful supercomputers in the world, you are obviously more than capable of handling all types of projects.

And the second key point is that Bull masters every aspect of global HPC solutions design: compute node design, interconnect optimisation, appropriate software stack development, application performance monitoring and optimisation.

The third and main component is the quality of Bull’s HPC technical team. The development, engineering and support are mainly based in Paris and Grenoble (France). They are ideally located to support our European customers, in the same time zone, which improves our capacity to develop closer relationships with our customers.

Merchants of doom predicted that investing all our R&D resources on European soil would be an obstacle to Bull’s expansion in HPC, but it has in fact proved to be an advantage in the current context.

Quite rightly, the European Commission recently reaffirmed the strategic importance of HPC to the continent, for the competitiveness of its businesses and the creation of employment. So Europe is planning to double its investment in HPC between now and 2020. And Bull – as Europe’s only manufacturer of supercomputers and a leading player with a global reputation – is determined to play a pre-eminent role in this coming drive.

insideHPC: What will you be showcasing this year at ISC’12?

Pascale Bernier-Bruna: Bull will be exhibiting its complete range of Extreme Computing solutions, based on the bullx family of systems designed specifically for HPC. The latest evolution of the bullx blade system, the B510 blades, will be in the limelight, since they are at the heart of the two petascale systems installed by Bull recently. The bullx B510 blades are suited for configurations of all sizes, not just for large-scale supercomputers, and will successfully meet the performance requirements of a large variety of HPC users, as did the previous generation of bullx blades.

Bull will also be exhibiting its coming addition to the bullx family, the bullx B700 Direct Liquid Cooling blades, which deliver drastic savings in energy consumption. Their revolutionary technology brings cooling to the heart of the blades themselves, and makes it possible to use warm water for cooling, while the systems remain extremely easy to maintain.

insideHPC: How important is the ISC conference to your HPC marketing plans on an annual basis?

Pascale Bernier-Bruna: Bull is the only European-based HPC manufacturer, ISC is the largest European-based HPC event… It is a natural fit for Bull to be a Platinum sponsor of ISC!

Posted in Compute, GPUs, HPC, HPC Hardware

Supercomputing: From Candlelit Dinners to the House of Lords

Exascale computers will be here by 2019, according to Hans Meuer, chair of the ISC’12 Supercomputing Conference – although it is currently unclear what technologies they will employ.

In an invited talk in one of the committee rooms of the House of Lords on 18 April, Professor Meuer gave British Peers, and luminaries of the UK computer community, a tour-de-force presentation on the development of supercomputing from the Cray 1 in the 1970s to the advent of exascale.

He emphasised that the demands on high-performance computing are changing and that data crunching is becoming as important a topic as number crunching. However, he said, the conventional tools for assessing the performance of supercomputers – in particular the Linpack benchmark upon which the Top500 listing is based – may not necessarily be the most appropriate measures in such data analysis applications. He stressed that alternative metrics, including Jack Dongarra’s HPC Challenge benchmarks and the Graph500 initiative, were important in assessing machines for specific purposes.

The value of the Top500 benchmark is that it has been applied consistently over a period of nearly 20 years (celebrations of the 20th anniversary will take place in Salt Lake City in November this year). When plotted on a logarithmic scale, the increase in supercomputing power over that period has been a remarkably straight line and he saw no reason to doubt that the trend would continue into the future.

The consistency of the growth in compute power over the period is all the more remarkable as the underlying technologies have changed significantly in that period, he pointed out. ‘For me, the first real supercomputer was the Cray 1 vector supercomputer in 1976,’ he said. But the technology changed to massively parallel architectures, more conventional processor chips and, recently, to include GPU type chips.

Professor Meuer recalled that the Cray 2 was the most powerful supercomputer in the world in 1986. The price tag of $22M was so high that when one was purchased for Stuttgart, the deal was allegedly signed only after ‘a candlelit dinner’ between the Minister-President of Baden-Wurttemberg and the then CEO of Cray Research, John Rollwagen. For comparison, Professor Meuer said, the Apple iPad2 in 2011 had two-thirds of the processing power of the Cray 2 at a price tag of only $500 – a reduction in price by a factor of 44,000.
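The factor of 44,000 is the plain price ratio. A short sketch, using only the figures quoted in the talk and ignoring inflation, also shows what the ratio becomes once the iPad2's lower performance is taken into account; the function names are illustrative.

```python
# Price comparison between the Cray 2 (1986) and the iPad2 (2011),
# using the figures quoted in the talk. Inflation is ignored.

CRAY2_PRICE = 22_000_000   # dollars
IPAD2_PRICE = 500          # dollars
IPAD2_RELATIVE_PERF = 2 / 3  # iPad2 performance as a fraction of the Cray 2


def price_ratio(old_price, new_price):
    """How many times cheaper the new machine is."""
    return old_price / new_price


def price_performance_ratio(old_price, new_price, new_relative_perf):
    """Improvement in price per unit of performance (old perf = 1.0)."""
    old_cost_per_perf = old_price / 1.0
    new_cost_per_perf = new_price / new_relative_perf
    return old_cost_per_perf / new_cost_per_perf


print(f"price ratio: {price_ratio(CRAY2_PRICE, IPAD2_PRICE):,.0f}")
print("price/performance ratio: "
      f"{price_performance_ratio(CRAY2_PRICE, IPAD2_PRICE, IPAD2_RELATIVE_PERF):,.0f}")
```

The raw price ratio comes out at 44,000; adjusted for the iPad2 delivering two-thirds of the performance, the improvement in price per unit of performance is closer to 29,000.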

He raised the radical question as to whether we need new computer architectures to cope with ‘Big Data’. In traditional computational sciences, he said, the problems fit into memory; the methods require high precision arithmetic; and the computation is based on static data. Recently, interest has grown in data intensive sciences where the problems do not fit into memory; variable precision or integer based arithmetic is required; and the computations are based on dynamic data structures. Such problems arise as a result of experiments such as the Large Hadron Collider at CERN, the European Laboratory for Particle Physics, where the task is analysis (data mining) of raw data from the high throughput instruments.

Looking to the future, Professor Meuer reminded his audience of the perennial problem that to increase the number of transistors per chip, the transistors must become smaller and smaller and so the manufacturing process must be able to define ever-smaller feature sizes year after year. He conceded that the ultimate limits of conventional silicon technology would be reached within the next few decades. Perhaps, he speculated, it would soon be time to turn to more exotic technologies, such as quantum computing. He concluded by citing Mark B. Ketchen, manager of the physics of information group at IBM’s Thomas J. Watson Research Centre in Yorktown Heights, New York, on quantum computing: ‘In the past, people have said, “maybe it’s 50 years away, it’s a dream, maybe it’ll happen sometime”. I used to think it was 50. Now I’m thinking like it’s 15 or a little more. It’s within reach. It’s within our lifetime. It’s going to happen.’

This story originally appeared on HPC Projects. It appears here as part of a cross-publishing agreement with Scientific Computing World.

Posted in Exascale, HPC

OSU Grad Student Explores Strengths, Challenges of Cray, IBM, and Nvidia

When it comes to benchmarks, your performance mileage may vary. Now an Ohio State University researcher has established some side-by-side performance comparisons that survey the wide range of parallel system architectures offered in the supercomputer market.

“We explore the parallelization of the subset-sum problem on three contemporary but very different architectures, a 128-processor Cray massively multithreaded machine, a 16-processor IBM shared memory machine, and a 240-core NVIDIA graphics processing unit,” said Bokhari. “These experiments highlighted the strengths and weaknesses of these architectures in the context of a well-defined combinatorial problem.”
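For readers unfamiliar with subset-sum, the problem asks whether some subset of a list of integers adds up to a given target. Below is a minimal serial sketch of the textbook dynamic-programming formulation, assuming non-negative integers; it is offered only as background and is not claimed to be the formulation or code used in the study.

```python
# Textbook dynamic-programming solution to subset-sum (serial sketch).
# Assumes non-negative integer values and a non-negative integer target.


def subset_sum(values, target):
    """Return True if some subset of `values` sums exactly to `target`."""
    # reachable[s] is True when some subset seen so far sums to s.
    reachable = [True] + [False] * target
    for v in values:
        # Walk downwards so each value is used at most once.
        for s in range(target, v - 1, -1):
            if reachable[s - v]:
                reachable[s] = True
    return reachable[target]


print(subset_sum([3, 34, 4, 12, 5, 2], 9))   # 4 + 5 = 9
print(subset_sum([3, 34, 4, 12, 5, 2], 30))  # no subset reaches 30
```

In this formulation every update for a given value depends only on the table as it stood before that value was considered, so the inner loop offers a lot of independent work, which is one reason the problem is a common test case for parallel architectures.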

Posted in Compute, GPUs, HPC, HPC Hardware

insideHPC.com is a production of insideHPC, LLC. © 2006-2013