What can NAND flash do now for high-performance computing (HPC) storage, and how will it evolve? Garth Gibson, co-founder and chief technology officer of Panasas, the HPC storage provider, has definite views on the subject. Here’s a snapshot of them.
El Reg: How can solid state technology benefit HPC in general?
Garth Gibson: The most demanding science done in HPC is high-resolution simulation. This manifests as distributed shared memory: the science is limited by memory size. Memory is typically 40 per cent of capital costs and 40 per cent of power costs, and can be said to be the limiting technology in future HPC systems.
Solid state promises new choices in larger, lower-power memory systems, possibly enabling better and faster advances in science. More narrowly, solid state technology does not have mechanical positioning delays, so small random accesses can have latencies that are two orders of magnitude shorter.
El Reg: Does Panasas have any involvement with NAND in its products? If so, how and why?
Garth Gibson: Panasas uses NAND flash to accelerate small random accesses. In HPC storage, the bulk of the data is sequentially accessed, so this means that the primary use of small random access acceleration is file system metadata (directories, allocation maps, file attributes) and small files.
But we also use this space for small random accesses into large files, which, although rare, can lead to disproportionately large fragmentation and read access slowdown.
El Reg: What are your views on SLC NAND and MLC NAND in terms of speed (IOPS, MB/sec), endurance, form factor and interfaces?
Garth Gibson: Our experience is that the NAND flash technologies are becoming more mature, and we can increasingly trust in the reliability mechanisms provided. This means that enterprise MLC is sufficiently durable and reliable to be used, although SLC continues to be faster when that extra speed can be fully exploited.
El Reg: Where in the HPC server-to-storage ‘stack’ could NAND be used and why?
Garth Gibson: The driving use of NAND flash in HPC by the end of this decade is likely to be so-called “burst buffers”. These buffers are the target of memory-to-memory copies, enabling checkpoints (defensive IO enabling a later “restart from checkpoint” after a failure) to be captured faster.
The compute can then resume while the burst buffer drains to less expensive storage, typically on magnetic hard disk. But shortly after that use is established, I expect scientists to want to do data analytics on multiple sequential checkpoints while these are still held in the burst buffer, because the low-latency random access of NAND flash will allow brute-force analysis computations not effective in main memory or on magnetic disk.
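To put rough numbers on the burst buffer idea, here is a minimal Python sketch of the arithmetic. It is our illustration, not anything Gibson or Panasas supplied: the function names and the bandwidth figures are hypothetical, and it assumes compute resumes as soon as the checkpoint has been captured in the buffer, with the drain to disk running in the background.

```python
def transfer_seconds(size_gb, bandwidth_gb_per_s):
    """Seconds needed to move size_gb at a sustained bandwidth."""
    return size_gb / bandwidth_gb_per_s


def compare_checkpoint(size_gb, disk_gb_per_s, buffer_gb_per_s):
    """Compare compute stall time with and without a burst buffer.

    All inputs are hypothetical; the model ignores contention and
    assumes the buffer drains to disk while compute continues.
    """
    return {
        # Without a buffer, compute waits for the full write to disk.
        "direct_stall_s": transfer_seconds(size_gb, disk_gb_per_s),
        # With a buffer, compute waits only for the copy into flash.
        "buffered_stall_s": transfer_seconds(size_gb, buffer_gb_per_s),
        # The drain still takes disk time, but in the background.
        "background_drain_s": transfer_seconds(size_gb, disk_gb_per_s),
    }


if __name__ == "__main__":
    # Made-up example: 1,000 GB checkpoint, 10 GB/s disk, 100 GB/s buffer.
    print(compare_checkpoint(1000, 10, 100))
```

Under these made-up figures the stall drops from 100 seconds to 10. The catch is that the next checkpoint cannot start until the previous drain has finished, so the checkpoint interval has to be longer than the background drain time.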
El Reg: Does Panasas develop its own NAND controller technology? If yes or no – why?
Garth Gibson: Panasas is using best-in-class NAND flash controller technology today. But changes in NAND flash technology and vendors are rapid and important and we continue to track this technology closely, with an open mind to changing the way we use solid state.
El Reg: What does Panasas think of the merits and demerits of TLC NAND (3-bit MLC)?
Garth Gibson: TLC NAND flash is a new technology, not yet ready for use in Panasas equipment. As it evolves, it might become appropriate for burst buffers … hard to say now.
El Reg: How long before NAND runs out of steam?
Garth Gibson: As usual, technologists can point to challenges with current technology that seem to favor alternative technologies in a timeframe of 2 to 4 generations in the future. I’m told in such discussions that 2024 looks quite challenging for NAND flash, and much better for its competitors.
However, with that much time, the real issue is how much quality investment is made in the technology. The market impact of NAND flash is large enough now to ensure that significant effort will go into continued advances in NAND flash. This is not as clear for its competitors.
El Reg: What do you think of the various post-NAND technology candidates such as Phase Change Memory, STT-RAM, memristor, Racetrack and the various Resistive-RAMs?
Garth Gibson: I am totally enamored of STT-RAM because it promises infinite rewrite and DRAM-class speeds. Totally magic! I just hope the technology pans out, because it has a long way to go. Phase change is much more real, but has suffered from disappointing endurance improvement so far.
El Reg: Any other pertinent points?
Garth Gibson: Magnetic disk bits are small compared to solid state bits, and solid directions are available to continue to make them smaller. As long as society’s appetite for online data continues to grow, I would expect magnetic disk to continue to play an important role. However, I would expect that the memory hierarchy – on-chip to RAM to disk – will become deeper, with NAND flash and its competitors between RAM and disk.
Not such good news in his views on memristor technology. Maybe HP will surprise us all. ®