This technology guide will help government thought leaders determine how best to use data analytics to manage, and derive value from, a growing dependence on data. “This guide provides an in-depth overview of the use of data analytics technology in the public sector. Focus is given to how data analytics is being used in the government setting with a number of high-profile use case examples, how the Internet-of-Things is taking a firm hold in helping government agencies collect and find insights in a broadening number of data sources, how government sponsored healthcare and life sciences are expanding, as well as how cyber security and data analytics are helping to secure government applications.”
“This talk will describe Monotasks, a new architecture for the core of Spark that makes performance easier to reason about. In Spark today, pervasive parallelism and pipelining make it difficult to answer even simple performance questions like “what is the bottleneck for this workload?” As a result, it’s difficult for developers to know what to optimize, and it’s even more difficult for users to understand what hardware to use and what configuration parameters to set to get the best performance.”
Nikolay Malitsky from Brookhaven National Laboratory presented this talk at the Spark Summit East conference. “This talk will present an MPI-based extension of the Spark platform developed in the context of light source facilities. The background and rationale of this extension are described in the paper “Bringing the HPC reconstruction algorithms to Big Data platforms,” which highlighted a gap between two modern driving forces of the scientific discovery process: HPC and Big Data technologies. As a result, it proposed to extend the Spark platform with inter-worker communication for supporting scientific-oriented parallel applications.”
Today ISC 2017 announced that data scientist Prof. Dr. Jennifer Tour Chayes from Microsoft Research will give the opening keynote at the conference. “I’ll discuss in some detail two particular applications: the very efficient machine learning algorithms for doing collaborative filtering on massive sparse networks of users and products, like the Netflix network; and the inference algorithms on cancer genomic data to suggest possible drug targets for certain kinds of cancer,” explains Chayes.
Intel DAAL is a high-performance library specifically optimized for big data analysis on the latest Intel platforms, including Intel Xeon® and Intel Xeon Phi™. It provides the algorithmic building blocks for all stages of data analysis in offline, batch, streaming, and distributed processing environments. It was designed for efficient use across the popular data platforms and APIs in use today, including MPI, Hadoop, Spark, R, MATLAB, Python, C++, and Java.
Over at TACC, Faith Singer-Villalobos writes that researchers are using the Rustler supercomputer to tackle Big Data from self-driving connected vehicles (CVs). “The volume and complexity of CV data are tremendous and present a big data challenge for the transportation research community,” said Natalia Ruiz-Juri, a research associate with The University of Texas at Austin’s Center for Transportation Research. While there is uncertainty in the characteristics of the data that will eventually be available, the ability to efficiently explore existing datasets is paramount.
In this week’s Sponsored Post, Nicolas Dube of Hewlett Packard Enterprise outlines the future of HPC and the role and challenges of exascale computing in this evolution. The HPE approach to exascale is geared toward breaking the dependencies that come with outdated protocols. Exascale computing will allow users to process data, run systems, and solve problems at an entirely new scale, which will become increasingly important as the world’s problems grow ever larger and more complex.
In this video, Jonathan Allen from LLNL describes how Lawrence Livermore’s supercomputers are playing a crucial role in advancing cancer research and treatment. “A historic partnership between the Department of Energy (DOE) and the National Cancer Institute (NCI) is applying the formidable computing resources at Livermore and other DOE national laboratories to advance cancer research and treatment. Announced in late 2015, the effort will help researchers and physicians better understand the complexity of cancer, choose the best treatment options for every patient, and reveal possible patterns hidden in vast patient and experimental data sets.”
“For many urban questions, however, new data sources will be required with greater spatial and/or temporal resolution, driving innovation in the use of sensors in mobile devices as well as embedding intelligent sensing infrastructure in the built environment. Collectively, these data sources also hold promise to begin to integrate computational models associated with individual urban sectors such as transportation, building energy use, or climate. Catlett will discuss the work that Argonne National Laboratory and the University of Chicago are doing in partnership with the City of Chicago and other cities through the Urban Center for Computation and Data, focusing in particular on new opportunities related to embedded systems and computational modeling.”
“Atos is determined to solve the technical challenges that arise in life sciences projects, to help scientists to focus on making breakthroughs and forget about technicalities. We know that one size doesn’t fit all and that is the reason why we studied carefully The Pirbright Institute’s challenges to design a customized and unique architecture. It is a pleasure for us to work with Pirbright and to contribute in some way to reduce the impact of viral diseases,” says Natalia Jiménez, WW Life Sciences lead at Atos.