Intel DAAL is a high-performance library specifically optimized for big data analysis on the latest Intel platforms, including Intel Xeon® and Intel Xeon Phi™. It provides the algorithmic building blocks for all stages of data analysis in offline, batch, streaming, and distributed processing environments. It was designed for efficient use across the popular data platforms and APIs in use today, including MPI, Hadoop, Spark, R, MATLAB, Python, C++, and Java.
In this week’s Sponsored Post, Katie Garrison of One Stop Systems explains how GPUs and flash solutions are used in radar simulation and anti-submarine warfare applications. “High-performance compute and flash solutions are not just used in the lab anymore. Government agencies, particularly the military, are using GPUs and flash for complex applications such as radar simulation, anti-submarine warfare and other areas of defense that require intensive parallel processing and large amounts of data recording.”
“As with all new technology, developers will have to create processes in order to modernize applications to take advantage of any new feature. Rather than randomly trying to improve the performance of an application, it is wise to be very familiar with the application and use available tools to understand bottlenecks and look for areas of improvement.”
High-performance computing (HPC) tools are helping financial firms survive and thrive in this highly demanding and data-intensive industry. As financial models grow in complexity and greater amounts of data must be processed and analyzed on a daily basis, firms are increasingly turning to HPC solutions to exploit the latest technology performance improvements. Suresh Aswani, Senior Manager, Solutions Marketing, at Hewlett Packard Enterprise, shares how to overcome the learning curve of new processor architectures.
“By implementing popular Python packages such as NumPy, SciPy, and scikit-learn to call the Intel Math Kernel Library (Intel MKL) and the Intel Data Analytics Acceleration Library (Intel DAAL), Python applications are automatically optimized to take advantage of the latest architectures. These libraries have also been optimized for multithreading through calls to the Intel Threading Building Blocks (Intel TBB) library. This means that existing Python applications will perform significantly better merely by switching to the Intel distribution.”
“Many of the libraries developed in the 70s and 80s for core linear algebra and scientific math computation, such as BLAS, LAPACK, FFT, are still in use today with C, C++, Fortran, and even Python programs. With MKL, Intel has engineered a ready-to-use, royalty-free library that implements these numerical algorithms optimized specifically to take advantage of the latest features of Intel chip architectures. Even the best compiler can’t compete with the level of performance possible from a hand-optimized library. Any application that already relies on the BLAS or LAPACK functionality will achieve better performance on Intel and compatible architectures just by downloading and re-linking with Intel MKL.”
“OpenMP, Fortran 2008 and TBB are standards that can help to create parallel areas of an application. MKL could also be considered to be part of this family, because it uses OpenMP within the library. OpenMP is well known, has been used for quite some time, and continues to be enhanced. Some estimates suggest that as much as 75% of cycles used are for Fortran applications. Thus, in order to modernize some of the most significant number crunchers today, Fortran 2008 should be investigated. TBB is for C++ applications only, and does not require compiler modifications. An additional benefit to using OpenMP and Fortran 2008 is that these are standards, which allows code to be more portable.”
While HPC developers worry about squeezing out the ultimate performance while running an application on dedicated cores, Intel TBB tackles a problem that HPC users never worry about: “How can you make parallelism work well when you share the cores that you run upon?” This is more of a concern if you’re running that application on a many-core laptop or workstation than on a dedicated supercomputer, because who knows what else will be running on those shared cores. Intel Threading Building Blocks reduces the delays caused by other applications by utilizing a revolutionary task-stealing scheduler. This is the real magic of TBB.
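To make the idea of a task-stealing scheduler (commonly called work stealing) more concrete, here is a minimal sketch in Python. It is not Intel TBB's API or implementation, which is a C++ library and far more sophisticated. The names `Worker`, `run_pool`, and `slow_square` are hypothetical, and the sketch assumes that tasks never spawn new tasks, so a worker can stop as soon as every queue is empty. Each worker takes the newest task from its own queue, and an idle worker takes the oldest task from another worker's queue.

```python
import threading
import time
from collections import deque
from functools import partial


class Worker:
    """One worker with its own task queue (hypothetical helper, not a TBB class)."""

    def __init__(self, index):
        self.index = index
        self.tasks = deque()
        self.lock = threading.Lock()
        self.results = []  # return values of the tasks this worker ran
        self.stolen = 0    # how many tasks this worker took from other queues

    def pop_local(self):
        # The owner takes the most recently added task from its own queue.
        with self.lock:
            return self.tasks.pop() if self.tasks else None

    def steal_oldest(self):
        # A thief takes the oldest task from the opposite end of the queue.
        with self.lock:
            return self.tasks.popleft() if self.tasks else None


def run_worker(me, workers):
    while True:
        task = me.pop_local()
        if task is None:
            # Local queue is empty: look for work in the other queues.
            for victim in workers:
                if victim is not me:
                    task = victim.steal_oldest()
                    if task is not None:
                        me.stolen += 1
                        break
        if task is None:
            return  # Assumption: tasks never spawn tasks, so empty means done.
        me.results.append(task())


def run_pool(task_lists):
    """Run one thread per task list and return the workers for inspection."""
    workers = [Worker(i) for i in range(len(task_lists))]
    for worker, tasks in zip(workers, task_lists):
        worker.tasks.extend(tasks)
    threads = [threading.Thread(target=run_worker, args=(w, workers))
               for w in workers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return workers


def slow_square(n):
    time.sleep(0.001)  # stand-in for real work; sleeping lets other threads run
    return n * n


if __name__ == "__main__":
    # Deliberately unbalanced: every task starts on worker 0.
    tasks = [partial(slow_square, n) for n in range(40)]
    for w in run_pool([tasks, [], [], []]):
        print(f"worker {w.index}: ran {len(w.results)} tasks, stole {w.stolen}")
```

How many tasks each worker ends up running varies from run to run, which is the point: the load rebalances itself at run time rather than being fixed in advance. A worker that is delayed, for example by another application competing for its core, simply has its queued tasks picked up by the others.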
In this week’s Sponsored Post, Nicolas Dube of Hewlett Packard Enterprise outlines the future of HPC and the role and challenges of exascale computing in this evolution. The HPE approach to exascale is geared to breaking the dependencies that come with outdated protocols. Exascale computing will allow users to process data, run systems, and solve problems at a totally new scale, which will become increasingly important as the world’s problems grow ever larger and more complex.
Each year, the OpenFabrics Alliance (OFA) hosts a workshop devoted to advancing the state of the art in networking. “One secret to the enduring success of the workshop is the OFA’s emphasis on hosting an interactive, community-driven event. To continue that trend, we are once again reaching out to the community to create a rich program that addresses topics important to the networking industry. We’re looking for proposals for workshop sessions.”