In this video from SC15, SYSTAP CEO Brad Bebee presents: Blazegraph GPU and DASL.
“Analytics applied over complex, many-to-many data relationships hit the ‘Graph Cache Thrash’ bottleneck and grind to a halt, failing to deliver good performance or to operate at scale,” said Brad Bebee, SYSTAP CEO. “GPU hardware provides a compelling performance increase for data-intensive, predictive analytic applications. With Blazegraph and our new GPU products, users can harness the computing power comparable to what was only available from supercomputers, such as a Cray, at a fraction of the cost.”
This week SYSTAP introduced two new products with GPU acceleration into the Blazegraph family: Blazegraph GPU and Blazegraph DASL (pronounced “Dazzle”). These products leverage the company’s patent-pending technology for making NVIDIA GPUs accessible for graph and machine learning applications to accelerate graph queries and simplify creation of sophisticated algorithms, respectively. Both solutions will be available as an on-demand SaaS offering in 2016.
Blazegraph GPU is eagerly anticipated by a number of organizations, including the British Museum, according to Dr. Barry Norton, ResearchSpace Development Management at the museum. A current Blazegraph customer, the museum anticipates using Blazegraph GPU to boost the power of its ResearchSpace program, which is a collaborative environment for humanities and cultural heritage research.
Exploding data volumes and the need to uncover new insights from multi-source data streams are driving the growth of applications in domains such as drug discovery, recommendation systems, cyber network analysis and social networking, and are enabled by machine learning and predictive analytics. However, the increasing levels of sophistication mean the analytics must process and develop insights over data with complex dependencies from multiple, unstructured sources.
Traditional SQL data structures are not adequate for researchers and data scientists who need to explore huge data sets with complex dependencies. Modern graph databases offer a powerful and efficient way to represent diverse entities and the relationships between them. However, the in-memory (cache) analytical techniques of popular graph databases are nearly incapacitated as the size of the data sets and the number of relationships between them grow exponentially. In essence, graph databases are bound not by “compute” issues but by memory bandwidth, or “cache thrashing,” issues, which arise as the system swaps data in and out of the cache while working to uncover new sets of relationships.
The standard measure of graph analytic performance is traversed edges per second (TEPS). SYSTAP compared the hardware cost of its new Blazegraph GPU per billion traversed edges per second (one GTEPS) with that of other approaches. Based on a calculation of Graph 500 results, the widely used Cray XMT supercomputer costs $180,000 per GTEPS. The new Blazegraph GPU solution, including NVIDIA GPUs, delivers comparable performance at $18,000 per GTEPS, giving users a 10x cost advantage over conventional technology.
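To make the metric concrete, here is a minimal Python sketch (not SYSTAP's benchmark code) of how a traversed-edges-per-second figure can be derived from a breadth-first traversal, and how the 10x figure follows from the two quoted costs. The toy graph, the function name `bfs_traversed_edges`, and the timing approach are illustrative assumptions; Graph 500 defines its own, stricter measurement rules.

```python
import time
from collections import deque


def bfs_traversed_edges(adjacency, source):
    """Breadth-first search that returns how many edges it examined."""
    visited = {source}
    queue = deque([source])
    traversed = 0
    while queue:
        vertex = queue.popleft()
        for neighbor in adjacency.get(vertex, []):
            traversed += 1  # every examined edge counts as traversed here
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return traversed


# Hypothetical toy graph (directed adjacency lists), for illustration only.
toy_graph = {0: [1, 2], 1: [3], 2: [3], 3: []}

start = time.perf_counter()
edges = bfs_traversed_edges(toy_graph, 0)
elapsed = time.perf_counter() - start
teps = edges / elapsed if elapsed > 0 else float("inf")

# Cost comparison using the two per-GTEPS figures quoted in the text.
cray_cost_per_gteps = 180_000
blazegraph_cost_per_gteps = 18_000
cost_advantage = cray_cost_per_gteps / blazegraph_cost_per_gteps

print(f"traversed {edges} edges, about {teps:,.0f} TEPS on this toy run")
print(f"cost advantage: {cost_advantage:.0f}x")
```

A run on a four-vertex graph says nothing about real hardware; the point is only that the rate is edges examined divided by wall-clock time, and that the cost advantage is a simple ratio of the two dollar figures.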
“Prior to the introduction of Blazegraph GPU there was no affordable product on the market that could analyze very large graphs with appropriate performance,” said Philip Howard, research director at Bloor. “From this perspective, Blazegraph GPU is in a class of one.”
About Blazegraph GPU and Blazegraph DASL
Blazegraph GPU provides drop-in GPU acceleration, with full support for existing RDF/SPARQL and Property Graph (TinkerPop) applications. Blazegraph GPU will deliver 200x to 300x graph query acceleration, requiring virtually no changes to an application. Leveraging NVIDIA’s General Purpose Graphics Processing Unit (GP-GPU) technology, Blazegraph GPU solves the “Graph Cache Thrash” challenge by exploiting the superior bandwidth to main memory and effective parallelism of GPUs. Blazegraph can run on a single GPU or a cluster of GPUs. With Blazegraph high-performance computing on a cluster of 64 NVIDIA K40 GPUs, users can traverse a scale-free graph of 4.3 billion directed edges in 0.15 seconds for a throughput of 32 billion traversed edges per second (32 GTEPS).
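The throughput figure is edges traversed divided by elapsed time. The short Python sketch below works through that arithmetic with the numbers quoted above; the helper name `gteps` is hypothetical. Because the quoted inputs are rounded, the computed rate comes out somewhat below the 32 GTEPS headline, which presumably reflects unrounded measurements.

```python
def gteps(edges, seconds):
    """Throughput in billions of traversed edges per second."""
    return edges / seconds / 1e9


edges = 4.3e9            # directed edges in the graph, as quoted
quoted_seconds = 0.15    # traversal time, as quoted (rounded)
stated_gteps = 32        # headline throughput, as quoted

throughput = gteps(edges, quoted_seconds)
# Traversal time that the headline throughput would imply for this graph.
implied_seconds = edges / (stated_gteps * 1e9)

print(f"throughput from quoted figures: {throughput:.1f} GTEPS")
print(f"time implied by {stated_gteps} GTEPS: {implied_seconds:.3f} s")
```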
Blazegraph DASL is a new domain-specific language that enables analytic experts to write algorithms for large-scale machine learning and other complex applications that run efficiently on GPUs, without the need for parallel programming expertise or low-level device optimization. By combining the ease of Spark with the speed of CUDA and GPUs, Blazegraph DASL lets their applications operate up to 1,000x faster than Spark without GPUs. It provides a Scala-based language for writing graph and big data analytics and is complementary to the Spark and Hadoop ecosystems. Projects that take months to code and optimize directly on the GPU can be done in a matter of hours with Blazegraph DASL.
Blazegraph GPU is available now as licensed software. In early 2016, Blazegraph GPU will be introduced as an on-demand SaaS offering and as a hardware appliance. Blazegraph DASL will be generally available in Q1 2016. Pricing is based on the number and type of servers and includes a support subscription. Special pricing is available for non-profits and research organizations.