In this video from the 2016 Blue Waters Symposium, George Slota from Pennsylvania State University presents "Extreme-scale Graph Analysis on Blue Waters."
“In recent years, many graph processing frameworks have been introduced with the goal to simplify analysis of real-world graphs on commodity hardware. However, these popular frameworks lack scalability to modern massive-scale datasets. This work introduces a methodology for graph processing on distributed HPC systems that is simple to implement, generalizable to broad classes of graph algorithms, and scales to systems with hundreds of thousands of cores and graphs of billions of vertices and trillions of edges. We demonstrate our approach to be orders-of-magnitude faster than other distributed graph processing frameworks for several graph analytics. Additionally, we show how our methods aren’t limited to simple algorithms: We also implement a distributed version of the PuLP graph partitioner and use it to partition a real-world graph with over a hundred billion edges and synthetic graphs with over a trillion edges on Blue Waters; each partition computation completes in only minutes. This work opens the door for the complex study of the large and irregular interaction datasets that are ubiquitous throughout the social and physical sciences.”
George Slota recently started an Assistant Professor position in the Computer Science Department at Rensselaer Polytechnic Institute. He previously worked at Sandia National Labs in the Scalable Algorithms Department. Slota graduated with his Ph.D. in computer engineering from Penn State in 2016 after working in the Scalable Computing Lab with his advisor, Kamesh Madduri. He was partially supported by a Blue Waters Fellowship. His research interests are in graph and network mining, big data analytics, machine learning, bioinformatics, and their relation to parallel, scientific, and high-performance computing.