Cray Urika-GX System to Tackle Big Data Analytics

Today Cray rolled out the Cray Urika-GX system, the first agile analytics platform that fuses supercomputing technologies with an open, enterprise-ready software framework for big data analytics. The Cray Urika-GX system gives customers unprecedented versatility for running multiple analytics workloads concurrently on a single platform that leverages the speed of a Cray supercomputer.

“The Urika-GX is a dynamic analytics solution that brings out the best of Cray’s decades of expertise in providing our customers with world-class systems for data-intensive computing,” said Peter Ungaro, president and CEO of Cray. “Customers have asked us to blend the unique features of our product lines into a single platform for data analytics. We took the Aries system interconnect from our supercomputers, the industry-standard architecture of our clusters, the scalable graph engine from the Urika-GD appliance, and the pre-integrated, open infrastructure of our Urika-XA system and combined them into one agile analytics platform. The Urika-GX gives our customers the tool they need to overcome their most advanced analytics challenges today, and the platform to bridge to tomorrow.”

The Cray Urika-GX system features Intel Xeon Broadwell cores, 22 terabytes of memory, 35 terabytes of local SSD storage capacity, and the Aries supercomputing interconnect, which provides the unmatched network performance necessary to solve the most demanding big data problems.

The size, scope, and complexity of big data analytics are exploding, creating problems for customers who are already struggling with cluster sprawl, a torrent of new applications, and increasing pressure to deliver faster insights. The Cray Urika-GX system is designed to eliminate these challenges. Cray’s new agile analytics platform combines the unique scale and throughput capabilities of Cray supercomputers with the convenience of an appliance, the flexibility of industry-standard hardware, and an open software framework that enables customers to innovate as they run existing and emerging analytics workloads. The Cray Urika-GX system gives customers a powerful tool for delivering high-frequency insights.

Optimized for demanding analytics workloads, the Cray Urika-GX system is pre-tested and pre-integrated with the Hortonworks Data Platform, which provides Hadoop and Apache Spark, as well as the Cray Graph Engine, designed for solving the largest and most complex graph analytics problems. The system includes enterprise tools, such as OpenStack for management and Apache Mesos for dynamic configuration, all designed to protect customers’ investments in the rapidly changing big data software landscape.

Cray Urika-GX systems are currently being used by multiple Cray customers across the life sciences, healthcare, and cybersecurity industries. The Broad Institute of MIT and Harvard, a non-profit research institute aimed at advancing the understanding and treatment of disease, is currently using the Cray Urika-GX system for analyzing high-throughput genome sequencing data.

“With the Cray Urika-GX, we had quality score recalibration results from our Genome Analysis Toolkit (GATK4) Apache Spark pipeline in nine minutes instead of forty minutes,” said Adam Kiezun, GATK4 Project Lead at the Broad Institute. “This highlights the potential to accelerate delivery of genomic insights to researchers who are making breakthroughs in the fight against disease.”

An exclusive feature of the Cray Urika-GX system is the Cray Graph Engine for fast, complex iterative discovery. Graph analytics has long been understood to pose some of the most difficult scaling and performance challenges for modern analytics systems. The Cray Graph Engine on the Urika-GX system, originally developed for the Cray Urika-GD Graph Discovery appliance, is typically 10 to 100 times faster than current graph solutions for complex analytics operations. The Cray Graph Engine can run at any scale from a single processor to thousands of processors without compromising performance. With the Cray Graph Engine, customers can tackle multi-terabyte datasets composed of billions of objects. The Cray Graph Engine can run in conjunction with open analytics tools such as Hadoop and Spark, enabling customers to build complete end-to-end analytics workflows and avoid unnecessary data movement.

“Analytics workflows are becoming increasingly sophisticated with businesses looking to integrate analytics such as streaming, graph, and interactive,” says James Curtis, Senior Analyst, Data Platforms & Analytics at 451 Research. “An agile analytics platform that can eliminate many of the challenges data scientists face, as well as reduce the time it takes to get an integrated environment up and running, has become a requirement for many enterprises.”

Three initial Cray Urika-GX configurations featuring 16, 32, or 48 nodes will be available in Q3 2016, and larger configurations will be available in the second half of 2016.
