Apache Spark Archives - High-Performance Computing News Analysis

GigaOm Radar for Evaluating Data Warehouse Platforms

July 6, 2020 by staff

This new GigaOm Radar report, “GigaOm Radar for Evaluating Data Warehouse Platforms,” provided by our friends over at Vertica, examines the leading platforms in the data warehouse marketplace. It describes the fundamentals of the technology, identifies key criteria and evaluation metrics by which organizations can evaluate competing platforms, outlines potential technology developments to watch for, and classifies platforms across those criteria and metrics.

Filed Under: Business of HPC, Datacenter, Enterprise HPC, Featured, Google News Feed, HPC Hardware, HPC Software, Industry Segments, News, Sponsored Post, Storage, Uncategorized, White Papers Tagged With: AI, Apache Spark, data warehouse, Hadoop, Vertica, Weekly featured Newsletter Articles, Weekly Featured Newsletter Post

GigaOm Radar for Evaluating Data Warehouse Platforms

July 1, 2020

This new GigaOm Radar report, provided by our friends over at Vertica, examines the leading platforms in the data warehouse marketplace. It describes the fundamentals of the technology, identifies key criteria and evaluation metrics by which organizations can evaluate competing platforms, outlines potential technology developments to watch for, and classifies platforms across those criteria and metrics.

Tagged With: Apache Spark, Data Lake, data warehouse, Hadoop, Spark, Vertica

Podcast: HPC & AI Convergence Enables AI Workload Innovation

August 25, 2019 by Doug Black

In this Conversations in the Cloud podcast, Esther Baldwin from Intel describes how the convergence of HPC and AI is driving innovation. “On the topic of HPC & AI converged clusters, there’s a perception that if you want to do AI, you must stand up a separate cluster, which Esther notes is not true. Existing HPC customers can do AI on their existing infrastructure with solutions like HPC & AI converged clusters.”

Filed Under: Compute, Enterprise HPC, HPC Hardware, HPC Software, Industry Segments, Machine Learning, News, Podcast, Research / Education, Resources Tagged With: AI, Apache Spark, BigDL, HPC AI convergence, Inferencing, Intel, Intel Select Solutions, Slurm

Accelerate Your Apache Spark with Intel Optane DC Persistent Memory

July 28, 2019 by Doug Black

Piotr Balcer and Cheng Xu from Intel gave this talk at the 2019 Spark+AI Summit. “Intel Optane DC persistent memory breaks the traditional memory/storage hierarchy and scales up the computing server with higher capacity persistent memory. Also it brings higher bandwidth & lower latency than storage like SSD or HDD. And Apache Spark is widely used in the analytics like SQL and Machine Learning on the cloud environment.”

Filed Under: Enterprise HPC, Events, High Performance Analytics, HPC Hardware, HPC Software, Industry Segments, Machine Learning, Main Feature, News, Research / Education, Resources, Storage, Videos Tagged With: AI, Apache Spark, big data, Intel, Intel Optane DC Persistent Memory, Weekly Newsletter Articles

NEC Embraces Open Source Frameworks for SX-Aurora Vector Computing

July 10, 2019 by staff

In this video from ISC 2019, Dr. Erich Focht from NEC Deutschland GmbH describes how the company is embracing open source frameworks for the SX-Aurora TSUBASA Vector Supercomputer. “Until now, with the existing server processing capabilities, developing complex models on graphical information for AI has consumed significant time and host processor cycles. NEC Laboratories has developed the open-source Frovedis framework over the last 10 years, initially for parallel processing in Supercomputers. Now, its efficiencies have been brought to the scalable SX-Aurora vector processor.”

Filed Under: Compute, Enterprise HPC, Events, High Performance Analytics, HPC Hardware, HPC Software, Industry Segments, Machine Learning, Main Feature, News, Research / Education, Resources, Videos Tagged With: AI, Apache Spark, Frovedis, NEC, NEC SX-Auroroa, NEC-X, Vector computing, Weekly Newsletter Articles

Deep Learning Open Source Framework Optimized on Apache Spark*

July 9, 2018 by Richard Friedman

Intel recently released BigDL. It’s an open source, highly optimized, distributed, deep learning framework for Apache Spark*. It makes Hadoop/Spark into a unified platform for data storage, data processing and mining, feature engineering, traditional machine learning, and deep learning workloads, resulting in better economy of scale, higher resource utilization, ease of use/development, and better TCO.

Filed Under: HPC Software, Machine Learning, Parallel Programming, Sponsored Post Tagged With: Apache Spark, Deep Learning, Intel BigDL, intel mkl, Intel TEC, Weekly Newsletter Articles

SpaRC: Scalable Sequence Clustering using Apache Spark

February 26, 2018 by Doug Black

Zhong Wang from the Genome Institute at LBNL gave this talk at the Stanford HPC Conference. “Whole genome shotgun based next generation transcriptomics and metagenomics studies often generate 100 to 1000 gigabytes (GB) sequence data derived from tens of thousands of different genes or microbial species. Here we describe an Apache Spark-based scalable sequence clustering application, SparkReadClust (SpaRC) that partitions reads based on their molecule of origin to enable downstream assembly optimization.”

Filed Under: Compute, Events, Government, High Performance Analytics, HPC Hardware, HPC Software, Industry Segments, Main Feature, News, Research / Education, Resources Tagged With: Apache Spark, genomics, hpc advisory council, lbnl, SPARC, Stanford HPC Conference, Weekly Newsletter Articles

Accelerating Apache Spark with RDMA

April 2, 2017 by Doug Black

Yuval Degani from Mellanox presented this talk at the OpenFabrics Workshop. “In this talk, we present a Java-based, RDMA network layer for Apache Spark. The implementation optimized both the RPC and the Shuffle mechanisms for RDMA. Initial benchmarking shows up to 25% improvement for Spark Applications.”

Filed Under: Compute, Datacenter, Enterprise HPC, Events, High Performance Analytics, HPC Hardware, HPC Software, Industry Segments, Main Feature, Network, Resources, Videos Tagged With: Apache Spark, big data, InfiniBand, Mellanox, OFA, OpenFabrics Workshop, Weekly Newsletter Articles

Accelerating Hadoop, Spark, and Memcached with HPC Technologies

March 31, 2017 by Doug Black

“This talk will present RDMA-based designs using OpenFabrics Verbs and heterogeneous storage architectures to accelerate multiple components of Hadoop (HDFS, MapReduce, RPC, and HBase), Spark and Memcached. An overview of the associated RDMA-enabled software libraries (being designed and publicly distributed as a part of the HiBD project for Apache Hadoop.”

Filed Under: Compute, Datacenter, Enterprise HPC, Events, HPC Hardware, HPC Software, Industry Segments, Main Feature, Network, Research / Education, Resources, Videos Tagged With: Apache Spark, big data, Hadoop, InfiniBand, Memcached, OFA, OpenFabrics Workshop, Weekly Newsletter Articles

Introduction to Data Science with Spark

March 11, 2017 by Doug Black

The Data Science with Spark Workshop addresses high-level parallelization for data analytics workloads using the Apache Spark framework. Participants will learn how to prototype with Spark and how to exploit large HPC machines such as Piz Daint, the CSCS flagship system.

Filed Under: Compute, Education / Training, Government, High Performance Analytics, HPC Hardware, HPC Software, Industry Segments, Main Feature, Research / Education, Resources, Videos Tagged With: Apache Spark, cscs, Data Science
