Video: High Performance Clustering for Trillion Particle Simulations

In this video from the Intel HPC Developer Conference at SC15, Prabhat from UC Berkeley presents: High Performance Clustering for Trillion Particle Simulations.

“Modern Cosmology and Plasma Physics codes are capable of simulating trillions of particles on petascale systems. Each time step generated from such simulations is on the order of 10s of TBs. Summarizing and analyzing raw particle data is challenging, and scientists often focus on density structures for follow-up analysis. We develop a highly scalable version of the clustering algorithm DBSCAN and apply it to the largest particle simulation datasets. Our system, called BD-CATS, is the first one to perform end-to-end clustering analysis of trillion particle simulation output. We demonstrate clustering analysis of a 1.4 trillion particle dataset from a plasma physics simulation, and a 10,240^3 particle cosmology simulation utilizing ~100,000 cores in 30 minutes. BD-CATS has enabled scientists to ask novel questions about acceleration mechanisms in particle physics, and has demonstrated qualitatively superior results in cosmology. Clustering is an example of one scientific data analytics problem. This talk will conclude with a broad overview of other leading data analytics challenges across scientific domains, and joint efforts between NERSC and Intel Research to tackle some of these challenges.”
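For readers unfamiliar with DBSCAN, the density-based clustering algorithm named in the abstract, here is a minimal serial sketch in Python. It is the textbook algorithm with a brute-force neighbor search, not the distributed BD-CATS implementation described in the talk; the function names, the `eps` and `min_pts` values, and the sample points are all illustrative assumptions.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Textbook DBSCAN with an O(n^2) neighbor search (illustrative only)."""
    labels = [None] * len(points)  # None means "not yet visited"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: expand
    return labels


# Hypothetical sample data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

In this toy input the two dense groups each form a cluster and the isolated point is labeled as noise. The quadratic neighbor search is what makes a naive version impractical at the scale the abstract describes, which is presumably why a scalable variant was needed.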

See more talks in the Intel HPC Developer Conference Video Gallery
