Sifting Through a Trillion Electrons

June 27, 2012 by Doug Black

What happens when your scientific instrument generates more data than you can decipher? Linda Vu writes that Berkeley researchers are designing new strategies for extracting interesting data from massive scientific datasets.

In a recent case, a team ran a state-of-the-art plasma physics code called VPIC on NERSC’s Cray XE6 “Hopper” supercomputer, generating a 3D magnetic reconnection dataset of a trillion particles.

“Although our VPIC runs typically generate two types of data, grid and particle, we never did a whole lot with the particle data because it was really hard to extract information from a trillion-particle dataset, and there was no way to sift out the useful information,” said Homa Karimabadi, head of the space physics group at UCSD.

Using their new tools, the researchers wrote each 32 TB file to disk in about 20 minutes, at a sustained rate of 27 gigabytes per second. By applying an enhanced version of the FastQuery tool, the team indexed this massive dataset in about 10 minutes, then queried the dataset in three seconds for interesting features to visualize.
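As a quick sanity check, the reported write time follows from the file size and the sustained rate. The sketch below is only back-of-the-envelope arithmetic, not part of the researchers' tooling; it assumes decimal units (1 TB = 1000 GB), which the article does not specify, and the function name is made up for illustration.

```python
# Back-of-the-envelope check of the reported write performance.
# Assumption: decimal units (1 TB = 1000 GB); the article does not state the convention.
FILE_SIZE_TB = 32          # size of each file, from the article
WRITE_RATE_GB_PER_S = 27   # sustained write rate, from the article


def write_time_minutes(size_tb: float, rate_gb_per_s: float) -> float:
    """Return the minutes needed to write size_tb terabytes at rate_gb_per_s."""
    size_gb = size_tb * 1000          # convert TB to GB (decimal assumption)
    seconds = size_gb / rate_gb_per_s
    return seconds / 60


minutes = write_time_minutes(FILE_SIZE_TB, WRITE_RATE_GB_PER_S)
print(f"{minutes:.1f} minutes")  # prints "19.8 minutes"
```

The result of just under 20 minutes agrees with the "about 20 minutes" quoted above.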

Read the Full Story.
