In this video, Blake Caldwell from ORNL presents: Best Practices for Scalable Administration of Lustre. Recorded at LUG 2012 in Austin.
Note: Most of the videos from LUG 2012 are now posted at the OpenSFS site.
This week the OpenACC standards group announced growing support for OpenACC in development tools, along with initial results from programmers who have been using the recently released OpenACC compilers to accelerate research. Designed to enable scientists to take advantage of heterogeneous CPU/GPU computing systems, the OpenACC programming standard is now available in compiler products from the OpenACC founding members Cray, The Portland Group (PGI) and CAPS entreprise. It is also gaining increasing support in other programming tools, including recently released solutions by Allinea and RogueWave, which provide visual debugging of OpenACC directives on Cray XK6 systems.
“Using PGI’s OpenACC compiler, we ported a computational fluid dynamics (CFD) application benchmark to a general purpose GPU-based system,” reported NASA researchers in an upcoming research paper. “OpenACC is a much easier way to accelerate applications than other programming approaches, and we saw an immediate speed up of the benchmark on multiple tests, up to 10X faster compared with a single CPU core-based system.”
Read the Full Story.
In a move to greatly expand the number of programming languages that can take advantage of GPU acceleration, Nvidia today announced that the LLVM open source compiler now supports CUDA. The company has worked with LLVM developers to provide the CUDA compiler source code changes to the LLVM core and parallel thread execution backend. As a result, programmers can develop applications for GPU accelerators using a broader selection of programming languages, making GPU computing more accessible and pervasive than ever before.
“The code we provided to LLVM is based on proven, mainstream CUDA products, giving programmers the assurance of reliability and full compatibility with the hundreds of millions of NVIDIA GPUs installed in PCs and servers today,” said Ian Buck, general manager of GPU computing software at NVIDIA. “This is truly a game-changing milestone for GPU computing, giving researchers and programmers an incredible amount of flexibility and choice in programming languages and hardware architectures for their next-generation applications.”
LLVM supports a wide range of programming languages and front ends, including C/C++, Objective-C, Fortran, Ada, Haskell, Java bytecode, Python, Ruby, ActionScript, GLSL and Rust. It is also the compiler infrastructure NVIDIA uses for its CUDA C/C++ architecture, and it has been widely adopted by leading companies such as Apple, AMD and Adobe.
Read the Full Story. To download the latest version of the LLVM compiler with NVIDIA GPU support, visit the LLVM site.
Comparing various aspects of Lustre and PanFS, Anil Patrick R from TechTarget India looks at object-based file systems as the basis for Exascale supercomputers.
PanFS does compete with Lustre in the research and university HPC arenas, but Panasas seems to have its crosshairs on public sector and commercial applications. “Our approach is to take object-storage architecture into areas that use the product to solve common problems in design and discovery, as well as place a value on manageability, high availability and reliability features—not just on performance,” said Welch. “This allows us to easily support demanding big data applications in the bioscience, energy, government, finance and manufacturing as well as other core research and development sectors.” Panasas has partnerships with major clustered compute solution vendors such as Dell, HP and SGI for PanFS.
Read the Full Story.
In this slidecast, Shaun Walsh from Emulex and Nan Boden from Myricom present: High Performance Networking Solutions. Today Emulex announced it is bringing its new family of OneConnect 10Gb Ethernet Network Xceleration solutions to market in partnership with Myricom.
“Partnering with Emulex allows Myricom to bring its unique high performance networking software solutions for specific vertical market applications to a broader market,” said Dr. Nan Boden, chief executive officer, Myricom. “Emulex’s market strength and industry-leading Ethernet road map combined with Myricom’s ultra-low latency performance, lossless packet capture/injection and traffic shaping technology enables the two companies to provide best-of-breed HPC networking.”
Read the Full Story * Download the MP3 * Subscribe on iTunes * If Dropbox is blocked, download from this Google page.
In this slidecast, Todd Wilde from Mellanox presents: Scalable HPC New Accelerations for Parallel Programming Languages over InfiniBand.
“This presentation will explore new advancements Mellanox has developed in increasing the performance and scalability of parallel programs over InfiniBand. These include Fabric Collectives Accelerator (FCA), and Mellanox Messaging Accelerations (MXM). In addition, the webinar will provide an overview of the parallel programming libraries that are using these accelerations, including the new ScalableSHMEM and ScalableUPC PGAS libraries that Mellanox has recently introduced to run over InfiniBand.”
Read the Full Story * Download the MP3 * Subscribe on iTunes * If Dropbox is blocked, download from this Google page.
The Parallel Programming Conference has issued its Call for Papers. The event will be held in the Netherlands on June 21, 2012. Sponsored by Stream Computing, the event aims to help researchers and professionals share their experiences with GPGPU, OpenCL, CUDA and similar techniques.
Are you researching on/using OpenCL, CUDA, LLVM, OpenMP or alike, and you want to share this information to get feedback, cooperation partners and most of all appreciation? You are very welcome to speak at the small conference. Focus is on sharing information in the Benelux, and researchers who are looking for connections within the Benelux.
Read the Full Story.
In this video, Jason Rappley from NASA presents: Lustre Performance Analysis with SystemTap. Recorded at LUG 2012 in Austin.
Note: Most of the videos from LUG 2012 are now posted at the OpenSFS site.
In this video, Robert Stober from Bright Computing presents: Bright Cluster Manager: Lustre Cluster Management Made Easy. Recorded at LUG 2012 in Austin.
Note: Most of the videos from LUG 2012 are now posted at the OpenSFS site.
Uri Tal from Rocketick writes that GPUs offer great potential for speeding EDA processing.
There’s a lot of potential in GPUs for those EDA applications that have parallelism potential. The GPU architecture is ideal for data-parallel processing; it is an incredible throughput-machine, if you give it the right code to run. However, a major effort is needed to redesign not only the software, but the underlying algorithms as well. For us at Rocketick, this redesign effort paid off. We are able today to simulate the largest chip designs in the world 10 to 30 times faster, compared to the leading simulators in the market.
Read the Full Story.
In this video, Paul Kolano from NASA presents: Optimizing Lustre Performance Using Stripe-Aware Tools. Recorded at LUG 2012 in Austin.
Note: Most of the videos from LUG 2012 are now posted at the OpenSFS site.
This week HPC compiler-maker CAPS entreprise announced support for OpenACC in its HMPP Workbench 3.1, a move designed to make many-core programming easier.
“The GPU computing breakthrough has allowed many users to propose new massively parallel codes to advance many scientific fields. With OpenACC we are simplifying the use of accelerators and leveraging legacy applications. We are very confident that this will help to further broaden the community taking advantage of many-core technologies,” said François Bodin, CAPS CTO.
Read the Full Story.
In this video, Mike Barry from Terascala presents: Accelerating Applications Through Storage Optimization. Recorded at LUG 2012 in Austin.
Note: Most of the videos from LUG 2012 are now posted at the OpenSFS site.
By Dan Olds, Gabriel Consulting
Multicore processors drive everything these days from the biggest HPC cluster to the lowliest tablet – even smartphones. While parallel programming has come quite a way, there are still many apps that aren’t well-behaved at all.
They’re the worst kind of guests – acting like they own the whole damned house while paying absolutely no attention to the needs of other residents.
They’ll grab more memory than they need and never let it go. They’ll spawn enough threads to crowd out everyone else; it’s like inviting their deadbeat friends over to watch the Super Bowl at your house and eat your snacks. Operating systems and virtualization mechanisms attempt to control unruly apps, but they don’t have the ability to completely control and prioritize system resources.
Enter exLudus, and what they’re calling the industry’s first micro-virtualization solution, intuitively named MCOpt. What we’re talking about is a suite of software packages that provide dynamic workload containers, workload characterization, and performance monitoring/management for Linux operating system instances. It works at the node level, as a layer between the Linux kernel and the applications running on top of it.
With MCOpt, users can dictate the priority of apps and jobs, and the system will automatically adjust core and memory shares to ensure that the priorities are satisfied. It’s not a static fair-share scheduler; it works dynamically to constantly adjust resource shares and job timing so that SLAs are met and the system achieves maximum throughput.
It can do this because it’s monitoring how each job is using cores and memory. It can spot when a job isn’t using all of its allocated memory or core shares (or if it’s trying to use too much of either) and make adjustments on the fly to keep everything running smoothly and according to business priorities.
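The article doesn't describe MCOpt's actual algorithm, so what follows is only a rough sketch of the general idea: split cores by priority, but hand back whatever a job isn't using to the jobs that want more. The `Job` fields, the `allocate_cores` name and the weighted "water-filling" approach are all assumptions made for illustration, not anything exLudus has published.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: float  # hypothetical weight: larger means more important
    demand: float    # cores the job is observed to want right now


def allocate_cores(jobs, total_cores):
    """Split total_cores by priority weight, never giving a job more than it demands.

    Share left unused by low-demand jobs is redistributed, again by priority,
    among the jobs that still want more (weighted water-filling).
    """
    shares = {job.name: 0.0 for job in jobs}
    active = [job for job in jobs if job.priority > 0 and job.demand > 0]
    remaining = float(total_cores)

    while active and remaining > 0:
        weight_sum = sum(job.priority for job in active)
        # Jobs whose whole demand fits inside their weighted slice get it in full.
        capped = [job for job in active
                  if job.demand <= remaining * job.priority / weight_sum]
        if not capped:
            # Everyone left wants more than their slice: split what remains by weight.
            for job in active:
                shares[job.name] = remaining * job.priority / weight_sum
            break
        for job in capped:
            shares[job.name] = job.demand
            remaining -= job.demand
        capped_names = {job.name for job in capped}
        active = [job for job in active if job.name not in capped_names]

    return shares
```

With a high-priority job that only needs 2 cores and a low-priority batch job that would happily take 16, an 8-core node ends up giving the batch job the 6 cores the important job isn't using. A real dynamic scheduler would re-run something like this continuously as measured demand changes.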
In our discussion, the exLudus folks talked about the Linux scheduler and how it can unpredictably cause job priorities to change during execution – which isn’t necessarily a bad thing. But it can make it difficult to pinpoint when resource contention is hindering overall performance. It also means that subsequent re-runs of the same set of jobs will result in different contention behaviors.
MCOpt can also save important work from falling victim to the Linux Angel of Death – the Out-Of-Memory Killer. When system RAM is oversubscribed, there’s a risk that the OOM Killer can swoop in (well, it doesn’t really swoop) and kill processes to free up memory.
MCOpt helps in two ways. First, it keeps apps from oversubscribing memory and thus prevents the OOM Killer from coming into play in the first place. Second, it can steer OOM Killer behavior to protect high-priority workloads. With MCOpt, subsequent re-runs of the same set of tasks will behave exactly the same way – meaning predictable application performance even under stress. (System stress, not personal stress.)
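How MCOpt steers the OOM Killer isn't spelled out, but stock Linux already exposes a knob for this: each process has a `/proc/<pid>/oom_score_adj` file that accepts values from -1000 (never kill) to 1000 (kill first). The sketch below maps a made-up 0 to 100 priority scale onto that range; the scale, the function names and the `proc_root` parameter are assumptions for illustration and are not MCOpt's interface.

```python
import os


def oom_adj_for_priority(priority):
    """Map a hypothetical 0-100 priority to an oom_score_adj value.

    Priority 100 maps to -500 (unlikely to be killed), 0 maps to +500
    (likely to be killed). The extremes of -1000/+1000 are deliberately
    avoided so no job is fully exempt or always first in line.
    """
    priority = max(0, min(100, int(priority)))
    return 500 - 10 * priority


def set_oom_score_adj(pid, value, proc_root="/proc"):
    """Write value to <proc_root>/<pid>/oom_score_adj.

    On a real system, lowering the value usually needs root
    (CAP_SYS_RESOURCE). proc_root is a parameter only so the function
    can be exercised against a scratch directory.
    """
    if not -1000 <= value <= 1000:
        raise ValueError("oom_score_adj must be between -1000 and 1000")
    path = os.path.join(proc_root, str(pid), "oom_score_adj")
    with open(path, "w") as handle:
        handle.write(str(value))
```

Calling `set_oom_score_adj(pid, oom_adj_for_priority(90))` on a critical job makes the kernel far more likely to pick something else when memory runs out.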
This high level of control can really help overall throughput. The company says that their tests show a 20 to 50 per cent increase in total throughput with MCOpt versus a stock Linux. This is a measure that includes the low overhead load of MCOpt, of course. They have some white papers and stuff here, plus free trial versions of their software too.
exLudus also made sure to point out that MCOpt doesn’t require any application or OS modifications – the MCOpt layer is transparent to both. Better yet, MCOpt can work with other cluster management and virtualization suites if needed – sort of as a subcontractor.
exLudus brings an interesting set of capabilities to the Linux workload management table. It’s like taking a previously unmanageable city traffic plan (Boston? The Bay Area?) and adding synchronized lights and a set of maniacally focused traffic managers. It’s definitely worth a look if you’re seeing signs of road rage between competing apps on your Linux systems. ®
This article originally appeared in The Register. It appears here in its entirety as part of a cross-publishing agreement.
In this video, Scott Michael from Indiana University presents: How to Tune Your Wide Area File System for a 100 Gbps Network. Recorded at LUG 2012 in Austin.
Note: Most of the videos from LUG 2012 are now posted at the OpenSFS site.
insideHPC.com is a production of insideHPC, LLC. © 2006-2011