Over at the Texas Advanced Computing Center, Aaron Dubrow writes that researchers are using a specialized cluster at TACC to do experimental Hadoop-style studies on a current production system.
This system offers researchers a total of 48 eight-processor nodes on TACC’s Longhorn cluster to run Hadoop in a coordinated way with accompanying large-memory processors. A user on the system can request all 48 nodes for a maximum of 96 terabytes (TB) of distributed storage. What’s special about the Longhorn cluster at TACC isn’t simply the beefed-up hardware for running Hadoop; rather, it’s the ability for researchers to leverage the vast compute capabilities of the center, including powerful visualization and data analysis systems, to further their investigations. The end-to-end research workflow enabled by TACC could not be done anywhere else, and as a bonus, researchers get access to the full suite of tools available at the center to do computational research.
According to TACC Research Associate Weijia Xu, the best part is that Hadoop is easy to use without requiring users to be experts. It handles a lot of the low-level computing behavior, so people don’t need to have a lot of knowledge about I/O or memory structures to get started.
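To give a sense of that simplicity, here is a minimal word-count sketch in the Hadoop Streaming style, with purely illustrative input and output paths. The framework splits the input, shuffles and sorts keys between stages, and handles failures, so the user code is just a pair of scripts that read stdin and write stdout:

```python
#!/usr/bin/env python
import sys

# mapper.py -- Hadoop feeds this script one split of the input on stdin and
# takes care of all file handling; it just emits "word<TAB>1" pairs.
def run_mapper():
    for line in sys.stdin:
        for word in line.split():
            print("%s\t1" % word)

# reducer.py -- the framework delivers mapper output sorted by key, so the
# reducer only needs to sum consecutive counts for the same word.
def run_reducer():
    current, total = None, 0
    for line in sys.stdin:
        word, count = line.rsplit("\t", 1)
        if word != current:
            if current is not None:
                print("%s\t%d" % (current, total))
            current, total = word, 0
        total += int(count)
    if current is not None:
        print("%s\t%d" % (current, total))

# In practice the two functions live in separate scripts and are submitted
# with the standard streaming jar (paths here are hypothetical):
#   hadoop jar hadoop-streaming.jar \
#     -mapper mapper.py -reducer reducer.py \
#     -input /data/books -output /data/wordcount
```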
In this video from the AWS Summit 2013 in New York, Jafar Shameem and David Pellerin from Amazon present: Best Practices for HPC in the Cloud.
More and more, the scalable, on-demand infrastructure provided by AWS is being used by researchers, scientists and engineers in Life Sciences, Finance and Engineering to solve bigger problems, answer complex questions and run larger simulations. In this session, we start by talking about the supercomputing-class performance and high-performance storage available to scientists and engineers at their fingertips. We then go over examples of how startups are innovating and how large enterprises are extending their HPC environments. Finally, we walk through some of the common questions that come up as organizations start leveraging AWS for their high performance computing needs.
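As a rough illustration of how little setup that involves, here is a hedged sketch using the boto library of that era to launch a small cluster-compute group inside a single placement group; the region, AMI ID, instance count, and key pair are placeholders rather than anything from the talk:

```python
import boto.ec2

# Placeholder region -- substitute your own.
conn = boto.ec2.connect_to_region("us-east-1")

# A "cluster" placement group keeps the instances on a low-latency,
# full-bisection-bandwidth network segment, which matters for MPI jobs.
conn.create_placement_group("hpc-demo", strategy="cluster")

reservation = conn.run_instances(
    image_id="ami-00000000",       # hypothetical HPC-ready AMI
    min_count=8,
    max_count=8,
    instance_type="cc2.8xlarge",   # a cluster-compute instance type of the period
    placement_group="hpc-demo",
    key_name="my-keypair",         # hypothetical key pair
)

for instance in reservation.instances:
    print(instance.id, instance.state)
```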
The eXtreme Science and Engineering Discovery Environment (XSEDE) is accepting proposals for computationally intensive challenges facing industry that could be addressed with access to XSEDE expertise and resources. Faculty and researchers from accredited universities are eligible, as are researchers from industry and government laboratories.
Proposals must describe a computationally challenging problem hindering industry today while demonstrating how the cooperative efforts of academia, industry and XSEDE, and potentially government research laboratories, will provide a framework for solving the problem.
Two-page letters of intent are due July 19, 2013, and full proposals are due Sept. 13, 2013. Read the Full Story.
“We are thrilled to be recognized by Red Herring with this honor,” said Rob Clyde, CEO of Adaptive Computing. “There is massive waste throughout the world due to low compute utilization rates, which currently stand at about eight percent, meaning over 90 percent of dollars spent on servers are just being wasted. With Adaptive’s patented technologies, we are radically increasing these utilization rates, and being named to this list is a great validation of our efforts.”
One of the criticisms of FPGA computing for HPC has been a lack of usable programming tools. In this video, Russell Stern from Solarflare describes how the company leveraged its experience with Altera FPGA tools for its AOE ApplicationOnload Engine.
Computers have always been able to perform specific tasks with the addition of application software. Solarflare’s ApplicationOnload™ Engine – AOE – transforms this model by enabling compute-intensive host software to be embedded in network adapter hardware. This greatly accelerates application performance, improves server utilization by reducing CPU, memory and I/O loads, and provides an ultra-low latency interface to the network. This is Custom Compute. The transformation of network data processing can be beneficial to many high-compute networking environments, including electronic/high-frequency trading, government and financial security, scientific research, bioinformatics/healthcare, military, analytics, oil and gas exploration, plus digital media production and broadcast, among others.
Over at John McCalpin’s Blog, “Dr. Bandwidth” has posted an in-depth look at memory-mapped IO (MMIO).
For tightly-coupled acceleration, it would be nice to have the option of having the processor directly read and write to memory locations on the IO device. The fundamental capability exists in all modern processors through the feature called “Memory-Mapped IO” (MMIO), but for historical reasons this provides the desired functionality without the desired performance. As discussed below, it is generally possible to set up an MMIO mapping that allows high-performance writes to IO space, but setting up mappings that allow high-performance reads from IO space is much more problematic.
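McCalpin’s post works mostly at the level of processor mappings and memory types, but the user-space side of MMIO is straightforward to show: on Linux, a PCIe device’s BAR can be mapped through its sysfs resource file. The sketch below is an assumption-laden illustration (the device address is hypothetical, root privileges are required, and poking real registers is device-specific), with the read/write asymmetry he describes noted in comments:

```python
import mmap
import os
import struct

# Hypothetical PCIe device; "resource0" exposes BAR0 for memory-mapped I/O.
BAR0_PATH = "/sys/bus/pci/devices/0000:03:00.0/resource0"

fd = os.open(BAR0_PATH, os.O_RDWR | os.O_SYNC)
size = os.fstat(fd).st_size
bar = mmap.mmap(fd, size)  # MAP_SHARED mapping of the device BAR

# A store through the mapping becomes an MMIO write; with a write-combining
# mapping these can be streamed to the device at high rates.
bar[0:4] = struct.pack("<I", 0xDEADBEEF)

# A load through the mapping is an uncached MMIO read: the core stalls for a
# full round trip to the device on every access, which is why reads are the
# problematic direction.
(value,) = struct.unpack("<I", bar[0:4])
print(hex(value))

bar.close()
os.close(fd)
```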
The Cray User Group wrapped up in Napa Valley earlier this month with a team from NERSC winning best paper: Trillion Particles, 120,000 cores, and 350 TBs: Lessons Learned From a Hero I/O Run on Hopper.
An unprecedented trillion-particle simulation, which utilized more than 120,000 processors and generated approximately 350 terabytes of data, pushed the performance capability of the National Energy Research Scientific Computing Center’s (NERSC’s) Cray XE6 “Hopper” supercomputer to its limits. In addition to shedding new light on a long-standing astrophysics mystery, the successful run also allowed a team of computational researchers from the Lawrence Berkeley National Laboratory (Berkeley Lab) and Cray Inc. to glean valuable insights that will help thousands of scientists worldwide make the most of current petascale systems like Hopper, which are capable of computing quadrillions of calculations per second, and future exascale supercomputers, which will compute quintillions of calculations per second.
Blue Waters' 380 petabyte High Performance Storage System (HPSS)
Today NCSA announced that its 380 Petabyte High Performance Storage System is now in full service production as part of the Blue Waters project. Described as the world’s largest automated near-line data repository for open science, the HPSS environment comprises multiple automated tape libraries, dozens of high-performance data movers, a large 40 Gigabit Ethernet network, hundreds of high-performance tape drives, and about 100,000 tape cartridges.
This “big data” capacity is available to scientists and engineers using the sustained petascale Blue Waters supercomputer. The storage system can be easily expanded and extended to accommodate the extreme data needs of other science, engineering, or industry projects.
“With the world’s largest HPSS now in production, Blue Waters truly is the most data-focused, data-intensive system available to the U.S. science and engineering community,” said Blue Waters deputy project director Bill Kramer.
The HPSS hierarchical file system software is designed to efficiently manage the access and storage of hundreds of petabytes of data at high data rates. HPSS manages the life cycle of data by moving inactive data to tape and retrieving it the next time it is referenced. The highly scalable HPSS is the result of two decades of collaboration among five Department of Energy laboratories and IBM, with significant contributions by universities and other laboratories worldwide.
NCSA joined forces with the HPSS Collaboration’s Department of Energy labs and IBM to develop an HPSS capability for Redundant Arrays of Independent Tapes (RAIT)—tape technology similar to RAID for disk. RAIT dramatically reduces the total cost of ownership and the energy used to store data, while its generated parity blocks remove the danger of single or dual points of failure. It also improves the performance of data storage and retrieval, since data is written and read in parallel.
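The parity idea behind RAIT is the same XOR arithmetic that RAID uses across disks, only spread across tape cartridges. The toy sketch below (tiny block sizes, purely illustrative, not HPSS code) shows how a single generated parity block lets a stripe survive the loss of any one member:

```python
from functools import reduce

def parity_block(blocks):
    """XOR the corresponding bytes of every block to produce one parity block."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Toy stripe: four "tapes", each holding a 4-byte block (real RAIT stripes
# use far larger blocks; this only demonstrates the arithmetic).
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40",
          b"\xaa\xbb\xcc\xdd", b"\x00\xff\x00\xff"]
parity = parity_block(stripe)

# Simulate losing tape 2 and rebuilding its block from the survivors + parity.
survivors = stripe[:2] + stripe[3:] + [parity]
rebuilt = parity_block(survivors)
assert rebuilt == stripe[2]
print("recovered block:", rebuilt)
```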
As readers of this blog well know, Xeon Phi coprocessors require some advanced parallel programming techniques. Colfax took it upon themselves to create a complete course for their clients — and anyone else who wants to program the Xeon Phi coprocessor. As someone who has worked in the publishing industry for 20 years as a side job from my programming work, I can say they’ve done a fine job.
In this NPR podcast, Tom Ashbrook looks at how the sequester and budget cuts are affecting American science and research.
American leadership in science has been a given for most of the last century. About a third of science research and development in this country has been supported by the federal government. Funding for about 60 percent of basic research in science comes from Washington. We’ve celebrated the results, from moon shots to the Internet. Now Washington’s cutting back, and it’s hitting American science — just when competitor nations are plowing more cash into the science frontier.
Eight student teams from universities in the United States, Germany, China and Australia have been selected to compete in the Standard Track of the Student Cluster Competition to be held at the SC13 conference, Nov. 17-22, 2013, in Denver, Colorado. This real-time, 48-hour non-stop challenge will feature teams of undergraduate and/or high school students building, tuning and racing HPC clusters of their own design on the SC13 exhibit floor.
“Cluster computing has arrived and is now an important part of the computing science curriculum at leading universities,” said Brent Gorda, General Manager of the High Performance Data Division at Intel and originator of the competition. “The participants are the future rock stars of HPC. By showcasing the work of these students, we are seeding the computing community with new talent.”
This year’s teams include the first-ever team from Australia, which will be traveling nearly 9,000 miles from Perth for the competition. The following institutions will field teams:
Boston University (USA)
IVEC, a joint venture between CSIRO, Curtin University, Edith Cowan University, Murdoch University and the University of Western Australia (Australia)
National University of Defense Technology (China)
The University of Colorado, Boulder (USA)
The University of the Pacific (USA)
The University of Tennessee, Knoxville (USA)
The University of Texas, Austin (USA)
Friedrich-Alexander University of Erlangen-Nuremberg (Germany)
The Student Cluster Competition is part of SC13 HPC Interconnections, which provides programs for everyone interested in building a stronger HPC community, including students, educators, researchers, international attendees and underrepresented groups.
In this slidecast, Marty Czekalski from the SCSI Trade Association presents: Extending the SCSI Platform of Innovation.
SCSI Express is the robust and proven SCSI protocol combined with PCIe that creates an industry-standard path to PCIe-based storage. SCSI Express combined with SAS-based solutions provides unprecedented performance and low latency that enterprises demand.
This week Mellanox announced the new SX1012 Ethernet switch. Based on the company’s SwitchX-2 technology, the SX1012 is a cost-effective solution for small-scale high-performance computing, storage and database deployments.
“The Mellanox SX1012 Ethernet switch is a great fit into small-scale storage and database applications, providing very high-throughput capacity in a compact enclosure,” said Gilad Shainer, vice president of marketing at Mellanox Technologies. “We are seeing increased market demand for small-scale 40GbE switches that can enable the creation of small high-performance clusters, storage solutions and database appliances. The new SX1012 switch delivers the right solution for these environments, removing the need for larger, more expensive switch platforms.”
The SX1012 system provides 12 ports of 40GbE in the smallest footprint in the industry, with two systems fitting into a single 1U slot of a standard 19-inch rack. Each port can be configured to run at up to 56GbE line rate to further enhance the performance capabilities of the switch, and can also be split into four standalone 10GbE interfaces. This allows the SX1012 to be used in migration scenarios from 10GbE to 40/56GbE server and storage connectivity.