In this video from EuroPython 2014, Travis Oliphant from Continuum Analytics presents: Python’s Role in Big Data Analytics: Past, Present, and Future.
In this special guest feature, Ferhat Hatay from Fujitsu writes that supercomputing technologies developed for data-intensive scientific computing can be a powerful tool for taking on the challenges of Big Data. We all feel it: data use and growth are explosive. Individuals and businesses are consuming — and generating — more data every day. The […]
“Moore’s Law got deflected in 2004, when it became no longer practical to ramp up the clock speed of CPUs to improve performance. So the chip industry improved CPU performance by adding more processors to a chip in concert with miniaturization. This was extra power, but you could not leverage it easily without building parallel software. Virtual machines could use multicore chips for server consolidation of light workloads, but large workloads needed parallel architectures to exploit the power. So, the software industry and the hardware industry moved toward exploiting parallelism in ways they had not previously done. This is the motive force behind Big Data.”
“For those who haven’t been following the details of one of DOE’s more recent procurement rounds, the NERSC-8 and Trinity request for proposals (RFP) explicitly required that all vendor proposals include a burst buffer to address the capability of multi-petaflop simulations to dump tremendous amounts of data in very short order. The target use case is for petascale checkpoint-restart, where the memory of thousands of nodes (hundreds of terabytes of data) needs to be flushed to disk in an amount of time that doesn’t dominate the overall execution time of the calculation.”
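The checkpoint-restart pattern described above can be sketched in miniature: periodically persist the computation's state to storage so the run can resume after a failure instead of starting over. This is a toy single-process illustration, not the DOE burst-buffer implementation; the function name, checkpoint interval, and the trivial "compute kernel" are all invented for the example. (At petascale, the challenge the RFP targets is doing this flush fast enough that it does not dominate runtime.)

```python
import os
import pickle

def run_simulation(steps, checkpoint_every, ckpt_path):
    """Toy checkpoint-restart loop: resume from the latest checkpoint
    if one exists, otherwise start from scratch."""
    if os.path.exists(ckpt_path):
        with open(ckpt_path, "rb") as f:
            step, state = pickle.load(f)
    else:
        step, state = 0, 0.0

    while step < steps:
        state += step  # stand-in for a real compute kernel
        step += 1
        if step % checkpoint_every == 0:
            # Write atomically: dump to a temp file, then rename, so a
            # crash mid-write never leaves a corrupted checkpoint behind.
            tmp = ckpt_path + ".tmp"
            with open(tmp, "wb") as f:
                pickle.dump((step, state), f)
            os.replace(tmp, ckpt_path)
    return state
```

If the process dies between checkpoints, rerunning `run_simulation` with the same `ckpt_path` repeats only the steps since the last flush. The atomic-rename trick matters in practice: real checkpoint systems likewise never overwrite the last good checkpoint in place.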
In this video from the DDN User Group at ISC’14, Satoshi Matsuoka from the Tokyo Institute of Technology presents: A Look at Big Data in HPC. “HPC has been dealing with big data for all of its existence. But it turns out that the recent commercial emphasis on big data has coincided with a fundamental change in the sciences as well. As scientific instruments and facilities produce large amounts of data at an unprecedented rate, the HPC community is reacting by revisiting architecture, tools, and services to address this growth in data.”