In this special guest feature, John Kirkley writes that code modernization is helping organizations large and small prepare for a new era in parallel computing.
Although the drive to upgrade existing legacy code has been a staple of IT organizations for some time, it’s only recently that broader efforts under the banner of “code modernization” have taken on a new urgency. A major impetus is the development of multicore and manycore processor architectures, which perform best with code written to run in parallel.
Understandably, Intel, with its Intel® Xeon® and Intel® Xeon Phi™ processors and coprocessors, has a major interest in everything to do with code modernization. The company has created a broad range of programs and initiatives to foster the movement.
In fact, Intel has a long history of working with open source and industry leaders to advance parallel computing. That work includes supporting scientific research in key areas such as cancer research, physics, genomics, energy and climate modeling, to name just a few of the many disciplines that demand constantly increasing computational resources. The upshot is that modern systems and the applications they run will evolve to fully exploit the performance of today’s highly parallel architectures.
Here are two perspectives from a pair of Intel software experts deeply engaged in bringing the tools, techniques and intellectual property associated with code modernization to the company’s customers and the developer community.
Robert Geva is a principal engineer in Intel’s Software and Services Group. His code modernization efforts center on helping IT organizations and developers within large organizations prepare for this new era in advanced computing.
Scott Apeland is also addressing the needs of developers, but from another point of view based on his role as Director of Intel’s Developer Program. He is also responsible for the Intel® Developer Zone and broad developer outreach, training and support.
Robert Geva and the Customer
Geva and his team are working with large organizations representing leaders in the fields of oil and gas, weather, financial services and more.
“We are working on helping our customers do what they need to do to change their algorithms to take advantage of the processor technology,” he says.
Now, with the pending introduction of the latest versions of the Intel Xeon processor family, including the Intel Xeon Phi processor code-named Knights Landing, along with advances in code modernization, these customers have the potential to realize major performance gains.
This, Geva says, will require hard work, especially in rewriting existing code to take full advantage of three things: multithreading, SIMD technology and memory efficiency.
- Multithreading, especially in combination with multiprocessing, allows updated, parallelized code to take full advantage of the processing power of multicore processors – a capability beyond the reach of sequential code. Multithreading with multiprocessing also scales efficiently to handle the ever-larger datasets that have become a fact of life in today’s computational environment.
- SIMD technology, short for single instruction, multiple data, is a type of parallel architecture that allows a single instruction to perform identical actions on multiple pieces of data – the essence of parallelism. “Very few applications take full advantage of SIMD technology,” comments Geva. “This is another level of parallelism inside the core that gives you a large performance boost. So, for large organizations, we find ourselves focusing on SIMD more than anything else. The need is there as is the performance potential, especially when SIMD technology is combined with the APx Accelerator.”
This approach, he says, helps developers program the SIMD hardware, a capability that was absent until recently. “So combine the performance potential with reduced efforts provided by programmability improvement and you have a nice sweet spot combination of capabilities.”
- Memory efficiency is another key component of code modernization. Customers who are parallelizing their code find that memory inefficiency is a much bigger problem than it is for sequential applications. Parallel applications have to be redesigned to incorporate memory-efficient and cache-efficient algorithms, another situation that is a direct result of advances in manycore and multicore technologies.
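The three techniques in the list above can be illustrated together in a minimal sketch. It is not taken from Geva’s team or from any Intel tool: the names `values`, `weights`, `partial_sum` and `weighted_total` are hypothetical, and Python is used only for readability. The data is stored as two contiguous typed arrays (the cache-friendly layout), the index range is split into contiguous chunks handled by separate threads (the multithreading step), and the simple multiply-and-add loop inside each chunk is the kind of loop a vectorizing compiler would map onto SIMD instructions in a compiled language. Note that standard CPython threads do not run this arithmetic in parallel, so the sketch shows the structure of the decomposition rather than an actual speedup.

```python
from array import array
from concurrent.futures import ThreadPoolExecutor

# Hypothetical input data. Each field lives in its own contiguous,
# typed buffer (a "struct of arrays" layout), so a loop over one field
# walks memory sequentially and uses every byte of each cache line.
values = array("d", (100.0 + i for i in range(1000)))
weights = array("d", (1.0 for _ in range(1000)))


def partial_sum(bounds):
    """Sum value * weight over one contiguous chunk [start, stop).

    In a compiled language this inner loop is the part a vectorizing
    compiler would turn into SIMD instructions.
    """
    start, stop = bounds
    return sum(v * w for v, w in zip(values[start:stop], weights[start:stop]))


def weighted_total(num_workers=4):
    """Split the index range into contiguous chunks, one task per chunk."""
    n = len(values)
    step = (n + num_workers - 1) // num_workers  # ceiling division
    chunks = [(s, min(s + step, n)) for s in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Each worker reduces its own chunk; the partial results are
        # combined at the end, so no shared state is written concurrently.
        return sum(pool.map(partial_sum, chunks))


if __name__ == "__main__":
    print(weighted_total())
```

Giving each thread its own contiguous chunk is a deliberate choice: threads then touch disjoint regions of memory, which avoids contention over the same cache lines.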
Intel helps developers within these large organizations in a variety of ways. In some cases, Intel seeds them with samples of hardware and provides them with development tools such as Intel compilers and/or libraries. For example, the Vectorization Advisor in Intel® Advisor explicitly instructs programmers in how to vectorize their code. Intel’s Developer Relations Division also provides a framework for joint engineering projects with customers – code modification is high on the list.
Geva’s team has considerable domain knowledge in financial services. So the team members work jointly with their financial services customers to take existing sequential applications and subject them to multithreading and vectorization. Not only does the customer end up with enhanced applications, but they also gain code modification skills by doing real work on their own applications. “After doing it a couple of times, they reach a tipping point where they no longer need us to do the work for them. That’s a big benefit,” says Geva.
At the November STAC summit in London, Thomas Trenner, a quantitative analyst with Citigroup, presented results achieved jointly with the Intel team. The effort started from a fully functional application that the bank runs daily. The Intel engineers analyzed the computational profile and identified a strategy to modernize the code. Using the Intel® Math Kernel Library, the memory allocator in Intel® Threading Building Blocks (Intel® TBB), and vectorization, they achieved a per-core speedup of 8.8X, and they parallelized using Intel TBB to achieve an overall speedup of more than 180X. To illustrate what that means to Citi, a computation that took 90 seconds before the modernization is now done in half a second.
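The figures quoted for the Citi project are consistent with one another, as a quick check shows. The sketch below uses only the numbers given in the article; the assumption that the per-core gain and the threading gain multiply to give the overall speedup is ours, and the variable names are illustrative.

```python
# Figures quoted in the article.
per_core_speedup = 8.8    # from the math library, memory allocation and vectorization
overall_speedup = 180.0   # a lower bound: the article says more than 180X
runtime_before_s = 90.0   # runtime before modernization, in seconds

# Assumption: the two stages multiply, so the threading stage
# accounts for whatever the per-core stage does not.
implied_threading_factor = overall_speedup / per_core_speedup
runtime_after_s = runtime_before_s / overall_speedup

print(f"implied threading factor: about {implied_threading_factor:.1f}x")
print(f"runtime after modernization: about {runtime_after_s:.1f} s")
```

Under that assumption, threading contributed a further factor of roughly 20 on top of the per-core gain, and a 180X overall speedup turns a 90-second run into the half second reported.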
According to Geva, there is no shortage of applications and code that need modernization – Geva and his counterparts within Intel have their work cut out for them for some time to come.
Initiatives for Developers
Scott Apeland is also focused on helping developers, but he takes a different tack. He is the Director of Intel’s Developer Program and responsible for the Intel Developer Zone and broad developer outreach, training and support.
One very visible initiative is the Intel® Modern Code Developer Community that was launched in July 2015. Since then, the program has reached nearly 400,000 developers and partners. The program provides tools, training, knowledge and support to developers worldwide.
One of the unique features of the initiative is the expertise offered by a network of elite developers in parallelism and HPC known as “Black Belts.” This effort leverages an online community portal that offers training, support and technical resources, as well as remote access to hardware. Black Belts and Intel experts are available to educate developers on code modernization techniques such as vectorization, memory and data layout, multithreading and multi-node programming. Hot topics include “Parallelism in modern and upcoming Intel architecture,” “Intel Xeon Phi coprocessors: today and tomorrow,” and “Essentials of data parallelism and vectorization with Intel C and C++ compilers.”
Training sessions include a hands-on coding lab using remotely accessed HPC clusters driven by Intel Xeon processors and Intel Xeon Phi coprocessors. Just a few of the topics covered include architecture overviews, memory optimization, multithreading, vectorization, MKL and tools. The sessions also include a comprehensive set of live and on-demand webinars that provide an in-depth, intensive course on efficient parallel programming.
A self-study part of the program provides attendees with remote access over SSH to Linux-based training servers with a virtualized Intel Xeon processor, an Intel Xeon Phi coprocessor and Intel software development tools.
Apeland notes, “Participants can use these training servers to run exercises in the course or experiment with their own applications. This helps your organization drive faster breakthroughs through faster code. You are able to get more results on your hardware today and carry your code forward into the future.”
He says that the Modern Code community builds on the efforts of the Intel Parallel Computing Centers initiative, which was launched in 2014. The program is chartered with driving the modernization of technical computing community codes. One of the primary ways it will meet this goal is by collaborating with and funding universities, institutions and labs to develop curricula that train students, scientists and researchers in parallel programming techniques. This, in turn, will accelerate the movement to actively modernize key community codes to run on the latest industry-standard parallel architectures.
Impact on the HPC Community
The work being done by Apeland, Geva and many others at Intel and in the HPC community will advance scientific research in a host of critical applications, ranging from cancer research to climatology. These efforts can only be successful if the constantly increasing demand for advanced computational capabilities is met.
Says Apeland, “To meet this demand, modern systems will continue to grow in scale, and applications must evolve to fully exploit the performance of these systems. While today’s HPC developers are aware of code modernization, many are not yet taking full advantage of the environment and hardware capabilities available to them. Intel is committed to helping the HPC community develop modern code that can fully leverage today’s hardware and carry forward to the future. This requires a multi-year effort complete with all the necessary training, tools and support. The customer training we provide and the initiatives and programs we have launched and will continue to create all support that effort.”