ARM has taken a step into the artificial intelligence market with the announcement of a new micro-architecture – DynamIQ – specifically designed for artificial intelligence (AI). “DynamIQ technology is a monumental shift in multi-core microarchitecture for the industry and the foundation for future ARM Cortex-A processors. The flexibility and versatility of DynamIQ will redefine the multi-core experience across a greater range of devices from edge to cloud across a secure, common platform.”
With the release of Intel Parallel Studio XE 2017, the focus is on making applications perform better on Intel architecture-based clusters. Intel MPI Library 2017, a fully integrated component of Intel Parallel Studio XE 2017, implements the high-performance MPI-3.1 specification on multiple fabrics. It enables programmers to quickly deliver the best parallel performance, even if they change or upgrade to new interconnects, without requiring changes to the software or operating environment.
Amazon Web Services chief evangelist Jeff Barr announced in a recent blog post that the company was adding Xilinx FPGAs to its Amazon Elastic Compute Cloud (Amazon EC2). The addition of this new hardware will allow users to create accelerated FPGA applications, and AWS will also let users sell these applications on the AWS Marketplace. “We are giving you the ability to design your own logic, simulate and verify it using cloud-based tools, and then get it to market in a matter of days,” said Barr.
Libraries that are tuned to the underlying hardware architecture can increase performance tremendously. Higher-level libraries such as the Intel Data Analytics Acceleration Library (Intel DAAL) can assist the developer with highly tuned algorithms for data analysis as well as machine learning. Intel DAAL functions can be called within other, more comprehensive frameworks that deal with the various types of data and storage, increasing the performance and lowering the development time of a wide range of applications.
The Euro-Par 2017 conference has issued its Call for Papers. The conference takes place Aug. 28 – Sept. 1, 2017 in Santiago de Compostela, Spain. Euro-Par is the prime European conference covering all aspects of parallel and distributed processing, ranging from theory to practice, from small to the largest parallel and distributed systems and infrastructures, from […]
FPGAs will become increasingly important for organizations that have a wide range of applications that can benefit from performance increases. Rather than taking a brute-force approach to increasing data center performance by purchasing and maintaining racks of hardware, with the associated costs, organizations may find that FPGAs can equal or exceed the performance of additional servers while also reducing costs.
SC16 returns to Salt Lake City on Nov. 13-18. The six-day supercomputing event features internationally known expert speakers, cutting-edge workshops and sessions, a non-stop student competition, the world’s largest supercomputing exhibition, panel discussions and much more. “No other annual event showcases the revolutionary advances and possibilities of high performance computing than the annual ACM/IEEE International Conference for High Performance Computing, Networking, Data Storage Analysis. From the impact of HPC on the future of medicine, to its transformative power in developing countries and “smart cities.” SC is the premiere venue for presenting leading-edge HPC research.”
With the introduction of the Intel Scalable System Framework, the Intel Xeon Phi processor can speed up Finite Element Analysis significantly. Using highly tuned math libraries such as the Intel Math Kernel Library (Intel MKL), FEA applications can execute math routines in parallel on the Intel Xeon Phi processor.
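To make the "math routines" concrete: much of an FEA run is spent solving the linear system K u = f produced by assembling element stiffness matrices. The following is a minimal sketch in plain Python of that step for a one-dimensional bar fixed at one end and loaded at the tip. It makes no Intel MKL calls; the function names and the test problem are hypothetical, chosen only for illustration. In a production code the solve would be handed to a tuned library rather than written by hand.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    lower[i-1] is the entry in row i, column i-1;
    upper[i-1] is the entry in row i-1, column i.
    """
    n = len(diag)
    d = list(diag)
    r = list(rhs)
    # Forward elimination: remove the sub-diagonal entries.
    for i in range(1, n):
        m = lower[i - 1] / d[i - 1]
        d[i] -= m * upper[i - 1]
        r[i] -= m * r[i - 1]
    # Back substitution.
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x


def bar_displacements(n_elements, length, ea, tip_load):
    """Nodal displacements of a bar fixed at x = 0 with a point load at the tip.

    Uses n_elements equal linear elements; ea is the axial stiffness E*A.
    The fixed node is eliminated, so the result covers nodes 1..n_elements.
    """
    k = ea * n_elements / length  # stiffness of one element, E*A / h
    diag = [2.0 * k] * (n_elements - 1) + [k]
    off = [-k] * (n_elements - 1)
    rhs = [0.0] * (n_elements - 1) + [tip_load]
    return solve_tridiagonal(off, diag, off, rhs)


u = bar_displacements(n_elements=4, length=2.0, ea=1000.0, tip_load=10.0)
print(u)  # tip displacement should match P*L/(E*A) up to rounding
```

For this problem the analytic tip displacement is P·L/(E·A), which gives a quick check on the assembled system. Real FEA models produce far larger sparse systems, which is where parallel, hardware-tuned solvers matter.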
James Reinders presented this talk at the 2016 Argonne Training Program on Extreme-Scale Computing. Reinders is the author of multiple books on parallel programming. His most recent book, Intel Xeon Phi Processor High Performance Programming: Knights Landing Edition (2nd edition), was co-authored with James Jeffers and Avinash Sodani.
“With up to 72 processing cores, the Intel Xeon Phi processor x200 can accelerate applications tremendously. Each core contains two Advanced Vector Extensions 512 (Intel AVX-512) vector processing units, which speed up floating-point performance. This is important for machine learning applications, which in many cases use the Fused Multiply-Add (FMA) instruction.”
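For readers unfamiliar with FMA: a fused multiply-add computes a*b + c with a single rounding at the end, rather than rounding the product and then the sum. The sketch below emulates that behavior in plain Python with exact rational arithmetic. It does not invoke the hardware instruction, and the helper names `fma_emulated` and `dot_fma` are hypothetical. It also shows why the operation matters for machine learning, where dot products reduce to a chain of multiply-adds.

```python
from fractions import Fraction


def fma_emulated(a: float, b: float, c: float) -> float:
    """Return a*b + c rounded once, emulating a fused multiply-add."""
    # Fractions hold the product and the sum exactly; float() rounds once.
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def dot_fma(xs, ys):
    """Dot product written as a chain of multiply-adds, one per element pair."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc = fma_emulated(x, y, acc)
    return acc


a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27

separate = a * b - 1.0             # product is rounded, then the sum is rounded
fused = fma_emulated(a, b, -1.0)   # one rounding of the exact result

print(separate)  # the small term is lost when the product is rounded
print(fused)     # the small term survives the single rounding
print(dot_fma([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

The exact product here is 1 - 2^-54, which is not representable as a double, so the two-step version loses it while the fused version keeps it. On hardware, the additional benefit is throughput: one instruction does the work of two.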