Video: ddR – Distributed Data Structures in R

“A few weeks ago, we revealed ddR (Distributed Data-structures in R), an exciting new project started by R-Core, Hewlett Packard Enterprise, and others that provides a fresh new set of computational primitives for distributed and parallel computing in R. The package sets the seed for what may become a standardized and easy way to write parallel algorithms in R, regardless of the computational engine of choice.”

Teradata Acquires StackIQ

Today Teradata announced the acquisition of StackIQ, developer of one of the industry’s fastest bare-metal software provisioning platforms, which has managed the deployment of cloud and analytics software on millions of servers in data centers around the globe. The deal will leverage StackIQ’s expertise in open source software and large cluster provisioning to simplify and automate the deployment of Teradata Everywhere. Giving customers the speed and flexibility to deploy Teradata solutions across hybrid cloud environments allows them to innovate quickly and build new analytical applications for their business. “Teradata prides itself on building and investing in solutions that make life easier for our customers,” said Oliver Ratzesberger, Executive Vice President and Chief Product Officer for Teradata. “Only the best, most innovative and applicable technology is added to our ecosystem, and StackIQ delivers with products that excel in their field. Adding StackIQ technology to IntelliFlex, IntelliBase and IntelliCloud will strengthen our capabilities and enable Teradata to redefine how systems are deployed and managed globally.”

Alan Turing Institute to Acquire Cray Urika-GX Graph Supercomputer

Today Cray announced the Company will provide a Cray Urika-GX system to the Alan Turing Institute. “The rise of data-intensive computing – where big data analytics, artificial intelligence, and supercomputing converge – has opened up a new domain of real-world, complex analytics applications, and the Cray Urika-GX gives our customers a powerful platform for solving this new class of data-intensive problems.”

Accelerating Innovation with HPC and AI at HPE Discover

In this special guest feature, Bill Mannel from Hewlett Packard Enterprise writes that this week’s HPE Discover meeting in Las Vegas will bring together leaders from across the HPC community to collaborate, share, and investigate the impact of IT modernization, big data analytics, and AI. “Organizations across all industries must adopt solutions that allow them to anticipate and pursue future innovation. HPE is striving to be your best strategic digital transformation partner.”

Overview of Panasas Storage for HPC & Big Data

Dale Brantley presented this talk at the PBS Works User Group meeting. “Panasas storage solutions drive industry and research innovation by accelerating workflows and simplifying data management. Our ActiveStor appliances leverage the patented PanFS storage operating system and DirectFlow protocol to deliver performance and reliability at scale from an appliance that is as easy to manage as it is fast to deploy. Panasas storage is optimized for the most demanding workloads in life sciences, manufacturing, media and entertainment, energy, government as well as education environments, and has been deployed in more than 50 countries worldwide.”

Transaction Processing Performance Council Launches TPCx-HS Big Data Benchmark

Today the Transaction Processing Performance Council (TPC) announced the immediate availability of TPCx-HS Version 2, extending the original benchmark’s scope to include the Spark execution framework and cloud services. “Enterprise investment in Big Data analytics tools is growing exponentially, to keep pace with the rapid expansion of datasets,” said Tariq Magdon-Ismail, chairman of the TPCx-HS committee and staff engineer at VMware. “This is leading to an explosion in new hardware and software solutions for collecting and analyzing data. So there is enormous demand for robust, industry standard benchmarks to enable direct comparison of disparate Big Data systems across both hardware and software stacks, either on-premise or in the cloud. TPCx-HS Version 2 significantly enhances the original benchmark’s scope, and based on industry feedback, we expect immediate widespread interest.”

Lorena Barba Presents: Data Science for All

“In this new world, every citizen needs data science literacy. UC Berkeley is leading the way on broad curricular immersion with data science, and other universities will soon follow suit. The definitive data science curriculum has not been written, but the guiding principles are computational thinking, statistical inference, and making decisions based on data. ‘Bootcamp’ courses don’t take this approach, focusing mostly on technical skills (programming, visualization, using packages). At many computer science departments, on the other hand, machine-learning courses with multiple pre-requisites are only accessible to majors. The key of Berkeley’s model is that it truly aims to be ‘Data Science for All.’”

Dr. Eng Lim Goh presents: HPC & AI Technology Trends

Dr. Eng Lim Goh from Hewlett Packard Enterprise gave this talk at the HPC User Forum. “SGI’s highly complementary portfolio, including its in-memory high-performance data analytics technology and leading high-performance computing solutions will extend and strengthen HPE’s current leadership position in the growing mission critical and high-performance computing segments of the server market.”

Accelerating Apache Spark with RDMA

Yuval Degani from Mellanox presented this talk at the OpenFabrics Workshop. “In this talk, we present a Java-based, RDMA network layer for Apache Spark. The implementation optimized both the RPC and the Shuffle mechanisms for RDMA. Initial benchmarking shows up to 25% improvement for Spark Applications.”

Accelerating Hadoop, Spark, and Memcached with HPC Technologies

“This talk will present RDMA-based designs using OpenFabrics Verbs and heterogeneous storage architectures to accelerate multiple components of Hadoop (HDFS, MapReduce, RPC, and HBase), Spark and Memcached. An overview of the associated RDMA-enabled software libraries (being designed and publicly distributed as a part of the HiBD project for Apache Hadoop.”