Compressing Software Development Cycles with Supercomputer-based Spark

“Do you need to compress your software development cycles for services deployed at scale and accelerate your data-driven insights? Are you delivering solutions that automate decision-making and manage model complexity using analytics and machine learning on Spark? Find out how a pre-integrated analytics platform that’s tuned for memory-intensive workloads and powered by the industry-leading interconnect will empower your data science and software development teams to deliver amazing results for your business. Learn how Cray’s supercomputing approach in an enterprise package can help you excel at scale.”

Re-Architecting Spark For Performance Understandability

“This talk will describe Monotasks, a new architecture for the core of Spark that makes performance easier to reason about. In Spark today, pervasive parallelism and pipelining make it difficult to answer even simple performance questions like “what is the bottleneck for this workload?” As a result, it’s difficult for developers to know what to optimize, and it’s even more difficult for users to understand what hardware to use and what configuration parameters to set to get the best performance.”
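The Monotasks work re-architects Spark’s internals rather than adding a user-facing API, but the ambiguity it targets is easy to reproduce. A minimal PySpark sketch (the dataset path and column name are illustrative assumptions):

```python
# Minimal PySpark sketch: end-to-end timing of a shuffle-heavy job.
# The single wall-clock number below conflates CPU, disk, and network
# time, which is exactly the ambiguity Monotasks aims to eliminate.
import time
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("timing-demo").getOrCreate()

# Hypothetical input path; any large dataset shows the same effect.
df = spark.read.parquet("hdfs:///data/events.parquet")

start = time.time()
counts = df.groupBy("user_id").count().collect()  # triggers a shuffle
elapsed = time.time() - start

# Is `elapsed` dominated by CPU, disk I/O, or the network? The
# aggregate number alone cannot say; per-resource monotasks could.
print(f"groupBy+count took {elapsed:.1f}s for {len(counts)} groups")
```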

Programming for High Performance Processors

“Managing the work on each node can be referred to as domain parallelism. During the run of the application, the work assigned to each node can generally be isolated from that of other nodes: each node works on its own and needs little communication with others to complete its share. The primary tool for this is MPI, though developers can also take advantage of frameworks such as Hadoop and Spark (for big data analytics). Managing the work for each core or thread requires control one level further down. This kind of work typically involves a large number of independent tasks that must then share data among themselves.”
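As a rough illustration of that node-level isolation, here is a minimal domain-parallel sketch using mpi4py (the article names MPI but no specific binding, so the Python binding is an assumption): each rank computes on its own slice of the domain and communicates only once, in a final reduction.

```python
# Domain parallelism with mpi4py: each rank owns an independent slab
# of a 1-D domain, does its work locally, and the only inter-node
# communication is a single collective reduction at the end.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Partition a global 1-D domain evenly across ranks.
n_global = 1_000_000
n_local = n_global // size
local = np.arange(rank * n_local, (rank + 1) * n_local, dtype=np.float64)

# Node-local work: no communication needed here.
local_sum = np.sum(np.sqrt(local))

# The one point of inter-node communication.
total = comm.allreduce(local_sum, op=MPI.SUM)

if rank == 0:
    print(f"global result: {total:.3e} across {size} ranks")
```

Run with, e.g., `mpirun -n 4 python domain.py`; the thread-level layer the excerpt mentions would sit inside each rank’s local computation.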

Building a Platform for Collaborative Scientific Research on AWS

“The pharmaceutical industry trend toward joint ventures and collaborations has created a need for new platforms in which to work together. We’ll dive into architectural decisions for building collaborative systems. Examples include how such a platform allowed Human Longevity, Inc. to accelerate software deployment to production in a fast-paced research environment, and how Celgene uses AWS for research collaboration with outside universities and foundations.”

Cray Urika-GX System to Tackle Big Data Analytics

“We took the Aries system interconnect from our supercomputers, the industry-standard architecture of our clusters, the scalable graph engine from the Urika-GD appliance, and the pre-integrated, open infrastructure of our Urika-XA system and combined them into one agile analytics platform. The Urika-GX gives our customers the tool they need to overcome their most advanced analytics challenges today, and the platform to bridge to tomorrow.”

Learn Apache Hadoop with Spark in One Day

Hadoop and Spark clusters have a reputation for being extremely difficult to configure, install, and tune, but help is on the way. The good folks at Cluster Monkey are hosting a crash course entitled Apache Hadoop with Spark in One Day. “After completing the workshop, attendees will be able to use and navigate a production Hadoop cluster and develop their own projects by building on the workshop examples.” A sketch of the kind of starter exercise such workshops build on appears below.
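For a taste of that kind of starter exercise, here is the classic word-count job in PySpark (the HDFS input path is an assumption):

```python
# Word count against HDFS with PySpark: the canonical first Hadoop/Spark
# exercise, counting word occurrences across a text file.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("wordcount").getOrCreate()

lines = spark.sparkContext.textFile("hdfs:///user/demo/input.txt")
counts = (lines.flatMap(lambda line: line.split())   # split lines into words
               .map(lambda word: (word, 1))          # pair each word with 1
               .reduceByKey(lambda a, b: a + b))     # sum counts per word

# Print the ten most frequent words.
for word, n in counts.takeOrdered(10, key=lambda kv: -kv[1]):
    print(word, n)
```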

Changes Afoot from the HPC Crystal Ball

In this special guest feature from Scientific Computing World, Andrew Jones from NAG looks ahead at what 2016 has in store for HPC and finds people, not technology, to be the most important issue. “A disconcertingly large proportion of the software used in computational science and engineering today was written for friendlier and less complex technology. An explosion of attention is needed to drag software into a state where it can effectively deliver science using future HPC platforms.”

IBM Ramps Up Apache Spark at SC15

“What we’re previewing here today is a capability to have an overarching software resource scheduler and workflow manager that takes all of these disparate sources and unifies them into a single view, making hundreds or thousands of computers look like one, and allowing you to run multiple instances of Spark. We have a very strong Spark multitenancy capability, so you can run multiple instances of Spark simultaneously, and you can run different versions of Spark, so you don’t obligate your organization to upgrade in lockstep.”
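IBM’s scheduler itself is proprietary and not shown here, but the idea of running different Spark versions side by side, without a lockstep upgrade, can be sketched generically (the installation paths and application scripts below are assumptions):

```python
# Generic illustration (not IBM's scheduler): two Spark installations
# living side by side, so different teams can run different versions
# concurrently without forcing an organization-wide upgrade.
import subprocess

jobs = [
    ("/opt/spark-1.5.2/bin/spark-submit", "legacy_etl.py"),  # older version
    ("/opt/spark-1.6.0/bin/spark-submit", "new_model.py"),   # newer version
]

# Launch both jobs concurrently; a real resource manager would also
# arbitrate cores and memory between the two Spark instances.
procs = [subprocess.Popen([submit, app]) for submit, app in jobs]
for p in procs:
    p.wait()
```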

Berkeley Lab to Optimize Spark for HPC

Today LBNL announced that a team of scientists from Berkeley Lab’s Computational Research Division has been awarded a grant by Intel to support their goal of enabling data analytics software stacks—notably Spark—to scale out on next-generation high performance computing systems.

Intel Invests in BlueData for Spinning Up Spark Clusters on the Fly

Today Intel Corporation and BlueData announced a broad strategic technology and business collaboration, as well as an additional equity investment in BlueData from Intel Capital. BlueData is a Silicon Valley startup that makes it easier for companies to install Big Data infrastructure, such as Apache Hadoop and Spark, in their own data centers or in the cloud.