In this contributed article, Gou Rao, Co-founder and CTO at Portworx, explains why running the open source distributed NoSQL database Cassandra in containers requires some special considerations. Solve the problems he describes, and you’ll be most of the way to a successful Cassandra deployment in containers.
How Charliecloud Simplifies Big Data Supercomputing at LANL
“Los Alamos has lots of supercomputing power, and we do lots of simulations that are well supported here. But we’ve found that Big Data analysis projects need to use different frameworks, which often have dependencies that differ from what we have already on the supercomputer. So, we’ve developed a lightweight ‘container’ approach that lets users package their own user defined software stack in isolation from the host operating system.”
BlueData, Intel Compare Bare-Metal & Containers for Big Data Workloads
Has your business ever tried to decide between a bare-metal environment and a container-based environment for its Big Data needs? BlueData and Intel collaborated on a benchmark study of Big Data workload performance that addresses this very question.
Bare-Metal Performance for Big Data Workloads on Docker Containers
In a benchmark study, Intel compared the performance of Big Data workloads running on a bare-metal deployment versus running in Docker containers with the BlueData EPIC software platform. The study found that it is possible to run Big Data workloads in a container-based environment without sacrificing performance. The benefits include agility, flexibility, and cost efficiency. Data science teams can get on-demand Hadoop and Spark clusters, while leveraging enterprise-grade security in a multi-tenant architecture. Get the white paper to learn about this breakthrough benchmark study.
ISC 2017 Workshop Preview: Optimizing Linux Containers for HPC & Big Data Workloads
Christian Kniep is hosting a half-day Linux Container Workshop on Optimizing IT Infrastructure and High-Performance Workloads on June 23 in Frankfurt. “Docker as the dominant flavor of Linux Containers continues to gain momentum within datacenters all over the world. It is able to benefit legacy infrastructure by leveraging the lower overhead compared to traditional, hypervisor-based virtualization. But there is more to Linux Containers – and Docker in particular, which this workshop will explore.”
HPC Workflows Using Containers
“In this talk we will discuss a workflow for building and testing Docker containers and their deployment on an HPC system using Shifter. Docker is widely used by developers as a powerful tool for standardizing the packaging of applications across multiple environments, which greatly eases the porting efforts. On the other hand, Shifter provides a container runtime that has been specifically built to fit the needs of HPC. We will briefly introduce these tools while discussing the advantages of using these technologies to fulfill the needs of specific workflows for HPC, e.g., security, high-performance, portability and parallel scalability.”
New Univa Grid Engine Release Doubles Performance over Previous Versions
“Grid Engine 8.5’s significant performance improvement for submitting jobs and reduced scheduling times will have a profound impact to our customers’ bottom line as they can now get more work done in the same amount of time. By reducing the wait time for end-users, our customers can save significant costs. Managing the purchase of more servers and getting higher throughput means deadlines can be met with confidence and on budget. Our goal was to increase the value Univa provides over previous versions of Grid Engine, including the popular open source version 6.2U5,” said Bill Bryce, Vice President of Products at Univa.
Designing HPC & Deep Learning Middleware for Exascale Systems
DK Panda from Ohio State University presented this deck at the 2017 HPC Advisory Council Stanford Conference. “This talk will focus on challenges in designing runtime environments for exascale systems with millions of processors and accelerators to support various programming models. We will focus on MPI, PGAS (OpenSHMEM, CAF, UPC and UPC++) and Hybrid MPI+PGAS programming models by taking into account support for multi-core, high-performance networks, accelerators (GPGPUs and Intel MIC), virtualization technologies (KVM, Docker, and Singularity), and energy-awareness. Features and sample performance numbers from the MVAPICH2 libraries will be presented.”
Video: Singularity – Containers for Science, Reproducibility, and HPC
“Explore how Singularity liberates non-privileged users and host resources (such as interconnects, resource managers, file systems, accelerators …) allowing users to take full control to set up and run in their native environments. This talk explores how Singularity combines software packaging models with minimalistic containers to create very lightweight application bundles which can be simply executed and contained completely within their environment or be used to interact directly with the host file systems at native speeds. A Singularity application bundle can be as simple as containing a single binary application or as complicated as containing an entire workflow and is as flexible as you will need.”
Microsoft Releases Batch Shipyard on GitHub for Docker on Azure
“Available on GitHub as Open Source, the Batch Shipyard toolkit enables easy deployment of batch-style Dockerized workloads to Azure Batch compute pools. Azure Batch enables you to run parallel jobs in the cloud without having to manage the infrastructure. It’s ideal for parametric sweeps, Deep Learning training with NVIDIA GPUs, and simulations using MPI and InfiniBand.”