Nikolay Malitsky from Brookhaven National Laboratory presented this talk at the Spark Summit East conference. “This talk will present a MPI-based extension of the Spark platform developed in the context of light source facilities. The background and rationale of this extension are described in the paper “Bringing the HPC reconstruction algorithms to Big Data platforms.” which highlighted a gap between two modern driving forces of the scientific discovery process: HPC and Big Data technologies. As a result, it proposed to extend the Spark platform with inter-worker communication for supporting scientific-oriented parallel applications.”
Intel DAAL is a high-performance library specifically optimized for big data analysis on the latest Intel platforms, including Intel Xeon®, and Intel Xeon Phi™. It provides the algorithmic building blocks for all stages in data analysis in offline, batch, streaming, and distributed processing environments. It was designed for efficient use over all the popular data platforms and APIs in use today, including MPI, Hadoop, Spark, R, MATLAB, Python, C++, and Java.
“We have enhanced Bright Cluster Manager 7.3 so our customers can quickly and easily deploy new deep learning techniques to create predictive applications for fraud detection, demand forecasting, click prediction, and other data-intensive analyses,” said Martijn de Vries, Chief Technology Officer of Bright Computing. “Going forward, customers using Bright to deploy and manage clusters for deep learning will not have to worry about finding, configuring, and deploying all of the dependent software components needed to run deep learning libraries and frameworks.”
Today NERSC announced plans to host a new, data-centric event called Data Day. The main event will take place on August 22, followed by a half-day hackathon on August 23. The goal: to bring together researchers who use, or are interested in using, NERSC systems for data-intensive work.
“This talk will present RDMA-based designs using OpenFabrics Verbs and heterogeneous storage architectures to accelerate multiple components of Hadoop (HDFS, MapReduce, RPC, and HBase), Spark and Memcached. An overview of the associated RDMAenabled software libraries being designed and publicly distributed as a part of the HiBD project.”
With the launch of Univa Small Jobs add-on for Univa Grid Engine, the company, the company offers “the world’s most efficient processing and lowest latency available for important tasks like real-time trading, transactions, and other critical applications.” To learn more, we caught up with Univa President & CEO Gary Tyreman.