Adaptive Computing, a cloud management and high performance computing outfit in Utah, needed something really cool to bring to their trade shows. Something that makes order out of chaos, and demonstrates their attention to detail in the midst of miles of wiring. They decided building the largest non-commercial LED cube would be a good project, and thus the 16x16x16 All Spark Cube was born. The All Spark Cube was constructed from 10 mm RGB LEDs wired together with three-foot lengths of 16-gauge pre-tinned copper wire. In this video, [Kevin] shows off the process of constructing a single row: first the LEDs are placed in a jig, the leads are bent down, and a bus wire is soldered to the 16 individual anodes in each row.
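For readers wondering how a structure wired this way gets addressed in software, here is a minimal illustrative sketch, assuming a simple 16x16x16 frame buffer where each soldered row of 16 LEDs shares one bus. This is not the All Spark Cube's actual firmware, and all names in it are hypothetical.

```python
# Illustrative sketch only -- not the All Spark Cube's actual firmware.
# Models a 16x16x16 RGB frame where each row of 16 LEDs shares a common
# bus wire, mirroring the row construction shown in the video.

SIZE = 16  # LEDs per row, rows per layer, layers per cube

# frame[z][y][x] holds an (r, g, b) tuple for one LED
frame = [[[(0, 0, 0) for _ in range(SIZE)]
          for _ in range(SIZE)]
         for _ in range(SIZE)]

def set_led(x, y, z, color):
    """Set one LED; (x, y) picks the LED within a layer, z picks the layer."""
    frame[z][y][x] = color

def row(y, z):
    """Return the 16 LEDs that share a single soldered bus wire."""
    return frame[z][y]

# Example: light the corner LED of the top layer red
set_led(0, 0, SIZE - 1, (255, 0, 0))
```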
In this video from the Adaptive Computing booth at SC12, Andrew Howard from Purdue discusses how the community cluster program at the Rosen Center for Advanced Computing has moved forward with the help of Moab software.
In this video from SC12, Nick Ihli from Adaptive Computing demonstrates how the company’s Torque resource manager works with Intel Xeon Phi. By relaying Intel Xeon Phi instrumentation such as memory availability to the company’s Moab workload manager, the system is able to schedule coprocessor resources efficiently.
“With the amazing capabilities of the latest supercomputing coprocessors such as the Intel Xeon Phi coprocessor, it’s vital to make it as simple as possible to integrate them into existing supercomputers,” noted Robert Clyde, CEO of Adaptive Computing. “The latest iteration of Moab was designed to maximize the investment being made by today’s HPC providers.”
One of the many product announcements out of SC12 last month was the release of StackIQ Enterprise HPC, a comprehensive cluster management suite powered by Rocks+ software.
“We are thrilled to bring this major update to our HPC customers in time for the annual SC12 conference,” said Tim McIntire, President and co-founder of StackIQ. “By bringing the enterprise features of our Enterprise Data product to the HPC products, we’ve improved the HPC product, while making it easier for those building hybrid HPC/Hadoop clusters to get their work done.”
Administrators will find it easier to track cluster health using new advanced cluster diagnostics tools, while developers will find it easier than ever to develop and debug Rolls using features like the filtered “profiles” tab in the GUI. StackIQ also added advanced firewall configuration to enhance the security of HPC clusters, making them more robust and easier to integrate into today’s enterprise data center environments. Read the Full Story.
In this video from the Adaptive Computing booth at SC12, Andrei Kaliazin from the University of Cambridge presents: COSMOS – Fundamental Cosmology, Dark Energy, and the Cosmic Microwave Sky.
The COSMOS Supercomputer Consortium, founded by Stephen Hawking and part of the Science and Technology Facilities Council DiRAC High Performance Computing facility, has chosen Moab HPC Suite 7.2 to manage its groundbreaking scientific computing workloads. Moab will coordinate jobs and allocate computing resources for research in cosmology and astrophysics, including simulations of the origins of the Universe and science exploitation of satellite experiments. This research will utilize a new SGI UV 2000 supercomputer with 1,856 Intel Xeon E5 cores and 1,891 Intel Xeon Phi cores. Adaptive Computing has worked closely with Intel and SGI to enable Moab to manage and schedule this cutting-edge system.
Read the Full Story.
The latest version of Moab was designed to recognize and work with the new Intel Xeon Phi coprocessors, based on Intel Many Integrated Core (MIC) technology. This ability to automatically detect Intel Xeon Phi coprocessors, and to determine their location and availability, improves utilization by scheduling jobs more intelligently and removes the need for extensive reprogramming to integrate Intel Xeon Phi coprocessors into existing systems. It also allows for policy-based scheduling that optimizes the choice of accelerators and coprocessors. As Intel Xeon Phi coprocessors are introduced into existing systems, this keeps costs and management effort to a minimum while maximizing utilization for the most efficient job processing, using metrics such as the number of cores and hardware threads, physical memory available (total and free), maximum frequency, architecture, and load.
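To make the idea of policy-based coprocessor placement concrete, here is a minimal sketch in Python. It is not Moab's implementation; it simply ranks hypothetical devices using the same metrics the article lists (cores, hardware threads, free memory, maximum frequency, load).

```python
# Hypothetical illustration of policy-based coprocessor selection.
# This is NOT Moab code; it only mirrors the metrics the article says
# are collected for each Intel Xeon Phi coprocessor.

from dataclasses import dataclass

@dataclass
class Coprocessor:
    node: str
    cores: int
    hw_threads: int
    mem_total_mb: int
    mem_free_mb: int
    max_freq_mhz: int
    load: float  # current utilization, 0.0 - 1.0

def eligible(c: Coprocessor, mem_needed_mb: int) -> bool:
    """A device is a candidate only if it has enough free memory."""
    return c.mem_free_mb >= mem_needed_mb

def score(c: Coprocessor) -> float:
    """Prefer lightly loaded, faster devices with more parallelism."""
    return (1.0 - c.load) * c.hw_threads * c.max_freq_mhz

def pick_coprocessor(devices, mem_needed_mb):
    candidates = [c for c in devices if eligible(c, mem_needed_mb)]
    return max(candidates, key=score) if candidates else None

# Example inventory as a workload manager might see it
devices = [
    Coprocessor("node01-mic0", 61, 244, 8192, 6000, 1100, 0.20),
    Coprocessor("node02-mic0", 61, 244, 8192, 1500, 1100, 0.05),
]
print(pick_coprocessor(devices, mem_needed_mb=4000).node)  # node01-mic0
```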
Read the Full Story.
What is the best way to manage an HPC cluster serving a multi-tenant user base? We asked David Gignac, Senior Systems Administrator at the Texas Advanced Computing Center (TACC). David is responsible for managing “Alamo,” a 96-node cluster that’s part of FutureGrid, a high-performance grid test bed for new approaches to distributed computing. Funded by the National Science Foundation, FutureGrid comprises 920 nodes distributed across eight clusters at sites in the U.S. and Germany, including TACC. Gignac has managed Alamo for three years as part of the five-year FutureGrid study, giving him unique insight into the challenges of managing an advanced multi-tenant HPC cluster.
insideHPC: What do you do for FutureGrid?
David Gignac: The FutureGrid Project is a distributed test bed for software developers and systems administrators focused on grid and cloud computing. It is designed to better understand the behavior of various cloud computing approaches, and to allow researchers to tackle complex projects. Anyone interested in testing code can join the effort and request FutureGrid resources online. Researchers may request up to five nodes configured with a specific kernel to test distributed file systems. A single request may specify 20 different components of software. To meet their specific requirements, I generate a new image with each request. The crucial part of my job is capturing an image of each configuration, so the user can get back to the place they started when the system is rebooted.
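As an illustration of the per-request record this workflow implies, here is a hypothetical sketch of capturing a researcher's requested configuration so the same image can be rebuilt after a reboot. This is not FutureGrid's or TACC's actual tooling; every name in it is made up for illustration.

```python
# Hypothetical sketch of recording an image request so the exact
# configuration can be reproduced later; not TACC's actual tooling.

import json
from pathlib import Path

def save_image_manifest(request_id, kernel, packages, nodes, out_dir="manifests"):
    """Write the requested configuration to disk so the image can be
    regenerated and the researcher returned to their starting point."""
    manifest = {
        "request_id": request_id,
        "kernel": kernel,      # e.g. a specific kernel for file-system tests
        "packages": packages,  # the software components named in the request
        "nodes": nodes,        # node count requested
    }
    Path(out_dir).mkdir(exist_ok=True)
    path = Path(out_dir) / f"{request_id}.json"
    path.write_text(json.dumps(manifest, indent=2))
    return path

# Example mirroring the interview: five nodes, a custom kernel,
# and a distributed file system under test
save_image_manifest(
    request_id="fg-0042",
    kernel="3.2.0-custom",
    packages=["glusterfs", "openmpi", "hdf5"],
    nodes=5,
)
```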
insideHPC: What do you do when you’re not managing FutureGrid?
David Gignac: In addition to FutureGrid, I am also responsible for managing more than 2,600 servers for a variety of other research projects at TACC. As with any network administrator, there are only a certain number of boxes I can realistically manage effectively. When you talk about clusters, the management requirement goes through the roof. I need to have a good solution to help me manage this complexity.
insideHPC: How do you keep up?
David Gignac: I depend on good cluster management applications. When I took on administration for Alamo, I reviewed a number of advanced management suites. With all of my other responsibilities, my top criterion was minimizing the amount of time I spend managing each cluster. I looked at cluster management software from all the major vendors, including Bright Cluster Manager, Cobbler/LOSF, Platform Computing products, Rocks and xCAT.
insideHPC: How did you choose the cluster management solution for Alamo?
David Gignac: My decision was based on minimizing the time required to manage the cluster: automating time-consuming tasks and reducing complexity, balanced with providing a high level of service to our users. Drilling down, I needed a solution that would minimize the number of custom scripts I was required to write and something that would provide maximum ‘at a glance’ visibility into the health and operations of each cluster. In addition, I looked for something that would integrate seamlessly with Alamo’s job schedulers: Moab, Torque, Slurm and SGE; yet be nimble enough to accommodate simultaneous requests from researchers. In the end, I selected Bright Cluster Manager.
insideHPC: Three years later, how’s it going?
David Gignac: It’s been a great run. Bright and Fedora EPEL distros have saved a tremendous amount of time for me. Bright’s image-based provisioning lets me reconfigure Alamo on the fly to meet the specific needs of each researcher’s compute jobs. I click on a check box and the cluster management suite installs a server, sets up a client and I’m done. Further, Bright’s ease of use and full integration with job schedulers have produced major time savings. I don’t need to spend hours writing and maintaining scripts because everything just works. I get dedicated product support, so I don’t waste time searching forums and message boards for answers. In addition, I can easily reproduce any testing environment in minutes and rapidly deploy a new environment.
insideHPC: What’s next?
David Gignac: Cloud bursting. I think there’s an opportunity to experiment with hybrid cluster solutions. Bright lets me manage on-premise and remote cloud-based clusters seamlessly. It all looks the same through the management suite portal. I want to work with FutureGrid participants to test it in the program’s next two years.
insideHPC: And in your free time?
David Gignac: I certainly have more of that now, in spite of all the clusters I manage. Because of the time savings, I am spending more time making improvements on the clusters.
Key features in the Moab HPC Suite 7.2 release include:
- Support for Intel Xeon Phi coprocessors
- Dual Domain Scheduling for Cray systems
- Streamlined RPM experience
- Allocation Updates
- Enhanced Viewpoint GUI for HPC
Read the Full Story.
Today Univa announced that Archimedes is using Grid Engine distributed resource management software to operationalize a mission-critical Hadoop application and reduce operating and deployment costs by 50 percent. Archimedes is a healthcare modeling organization that takes publicly available clinical data and uses it to answer complex, vital healthcare questions for researchers, pharmaceutical companies and government agencies. Through Univa Grid Engine’s unique and scalable solution, Archimedes was able to run its Hadoop application on its existing compute infrastructure, without the need for additional resources or hardware.
“Up until the time we had big data analytics, research using healthcare data took a significant amount of time and effort to analyze,” said Katrina Montinola, VP of Engineering at Archimedes. “The volume involved with big data is immense, and with the advancement of mathematics and computers, we are able to make analytical connections between data points, which may have otherwise been overlooked or minimized. With Univa Grid Engine, the complex analysis being completed by Archimedes’ solutions can be done quickly and made available to researchers and physicians in a convenient format that is informative and efficient.”
Read the Full Story.