SC17 Panel Preview: How Serious Are We About the Convergence Between HPC and Big Data?

SC17 will feature a panel discussion entitled “How Serious Are We About the Convergence Between HPC and Big Data?” “The possible convergence between the third and fourth paradigms confronts the scientific community with both a daunting challenge and a unique opportunity. The challenge resides in the requirement to support both heterogeneous workloads with the same hardware architecture. The opportunity lies in creating a common software stack to accommodate the requirements of scientific simulations and big data applications productively while maximizing performance and throughput.”

RCE Podcast Looks at NetCDF Network Common Data Form

In this RCE Podcast, Brock Palen and Jeff Squyres speak with the authors of NetCDF, whose data formats are self-describing as well as machine-independent. “Unidata’s Network Common Data Form (netCDF) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. It is also a community standard for sharing scientific data.”

DOE Helps Tackle Biology’s Big Data

Six proposals have been selected to participate in a new partnership between two U.S. Department of Energy (DOE) user facilities through the “Facilities Integrating Collaborations for User Science” (FICUS) initiative. The expertise and capabilities available at the DOE Joint Genome Institute (JGI) and the National Energy Research Scientific Computing Center (NERSC) – both at the Lawrence Berkeley National Laboratory (Berkeley Lab) – will help researchers explore the wealth of genomic and metagenomic data generated worldwide through access to supercomputing resources and computational science experts to accelerate discoveries.

Video: ddR – Distributed Data Structures in R

“A few weeks ago, we revealed ddR (Distributed Data-structures in R), an exciting new project started by R-Core, Hewlett Packard Enterprise, and others that provides a fresh new set of computational primitives for distributed and parallel computing in R. The package sets the seed for what may become a standardized and easy way to write parallel algorithms in R, regardless of the computational engine of choice.”

Teradata Acquires StackIQ

Today Teradata announced the acquisition of StackIQ, developer of one of the industry’s fastest bare metal software provisioning platforms, which has managed the deployment of cloud and analytics software on millions of servers in data centers around the globe. The deal will leverage StackIQ’s expertise in open source software and large cluster provisioning to simplify and automate the deployment of Teradata Everywhere. Offering customers the speed and flexibility to deploy Teradata solutions across hybrid cloud environments allows them to innovate quickly and build new analytical applications for their business. “Teradata prides itself on building and investing in solutions that make life easier for our customers,” said Oliver Ratzesberger, Executive Vice President and Chief Product Officer for Teradata. “Only the best, most innovative and applicable technology is added to our ecosystem, and StackIQ delivers with products that excel in their field. Adding StackIQ technology to IntelliFlex, IntelliBase and IntelliCloud will strengthen our capabilities and enable Teradata to redefine how systems are deployed and managed globally.”

Alan Turing Institute to Acquire Cray Urika-GX Graph Supercomputer

Today Cray announced that it will provide a Cray Urika-GX system to the Alan Turing Institute. “The rise of data-intensive computing – where big data analytics, artificial intelligence, and supercomputing converge – has opened up a new domain of real-world, complex analytics applications, and the Cray Urika-GX gives our customers a powerful platform for solving this new class of data-intensive problems.”

Accelerating Innovation with HPC and AI at HPE Discover

In this special guest feature, Bill Mannel from Hewlett Packard Enterprise writes that this week’s HPE Discover meeting in Las Vegas will bring together leaders from across the HPC community to collaborate, share, and investigate the impact of IT modernization, big data analytics, and AI. “Organizations across all industries must adopt solutions that allow them to anticipate and pursue future innovation. HPE is striving to be your best strategic digital transformation partner.”

Overview of Panasas Storage for HPC & Big Data

Dale Brantley presented this talk at the PBS Works User Group meeting. “Panasas storage solutions drive industry and research innovation by accelerating workflows and simplifying data management. Our ActiveStor appliances leverage the patented PanFS storage operating system and DirectFlow protocol to deliver performance and reliability at scale from an appliance that is as easy to manage as it is fast to deploy. Panasas storage is optimized for the most demanding workloads in life sciences, manufacturing, media and entertainment, energy, government as well as education environments, and has been deployed in more than 50 countries worldwide.”

Transaction Processing Performance Council Launches TPCx-HS Big Data Benchmark

Today the Transaction Processing Performance Council (TPC) announced the immediate availability of TPCx-HS Version 2, extending the original benchmark’s scope to include the Spark execution framework and cloud services. “Enterprise investment in Big Data analytics tools is growing exponentially, to keep pace with the rapid expansion of datasets,” said Tariq Magdon-Ismail, chairman of the TPCx-HS committee and staff engineer at VMware. “This is leading to an explosion in new hardware and software solutions for collecting and analyzing data. So there is enormous demand for robust, industry standard benchmarks to enable direct comparison of disparate Big Data systems across both hardware and software stacks, either on-premise or in the cloud. TPCx-HS Version 2 significantly enhances the original benchmark’s scope, and based on industry feedback, we expect immediate widespread interest.”

Lorena Barba Presents: Data Science for All

“In this new world, every citizen needs data science literacy. UC Berkeley is leading the way on broad curricular immersion with data science, and other universities will soon follow suit. The definitive data science curriculum has not been written, but the guiding principles are computational thinking, statistical inference, and making decisions based on data. ‘Bootcamp’ courses don’t take this approach, focusing mostly on technical skills (programming, visualization, using packages). At many computer science departments, on the other hand, machine-learning courses with multiple pre-requisites are only accessible to majors. The key of Berkeley’s model is that it truly aims to be ‘Data Science for All.’”