When users, systems operators, and IT managers consider the purchase of a large storage system, there are many factors that may be part of the specification and thus the architecture of the entire system. As a purchaser and user of a large HPC system, it is important to understand how much actual work will get done per dollar of expense.
The components selected to meet the computing demands would probably include the CPUs, the amount and speed of the memory, the storage speed and capacity, the network bandwidth and latency, and the power and cooling requirements. Together, and through experience, the general performance and cost, both CAPEX and OPEX, can be quantified and reported. However, one important piece of information is left out of many calculations: the efficiency of the various subsystems that together make up the overall cost and performance of the larger system.
When looking at the compute side of the system, although the specifications of the CPU are well understood, the mapping of an application's actual performance to the theoretical performance must be considered. The same calculation is frequently glossed over when looking at the storage subsystem for large installations.
While the capacity of storage systems continues to grow yearly with new generations of both hard disk and solid state drives, so too does the raw performance per drive. In the past, the storage system was typically described (in terms of performance advancement) as not keeping up with Moore’s Law, which CPU manufacturers were always discussing. However, in recent years, raw storage systems have begun to exceed Moore’s Law in both capacity and performance.
Modern (late 2016) 4TB drives can deliver about 300 megabytes per second (MB/s). It is important to remember that large installations that contain petabytes of storage can scale their total I/O to many terabytes per second, in aggregate. However, even though the raw performance of the latest generation of hard disk drives continues to improve, it is critical that the efficiency of these systems continues to improve as well. Efficiency could be described as the expected performance divided by the cost of the component: Efficiency = P/$. If the efficiency remains constant as the performance increases, then the cost increases as well. If the performance remains the same as the cost decreases, then the efficiency increases. However, for large HPC organizations that need to speed their time to results, the delivered performance needs to increase faster than just the raw performance that a supplier can deliver.
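The two statements about Efficiency = P/$ can be checked with a few lines of arithmetic. The following is a minimal sketch only: the function name and all performance and cost figures are hypothetical values chosen for illustration, not measurements of any product.

```python
def cost_efficiency(performance, cost):
    """Return performance per dollar (Efficiency = P/$)."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return performance / cost


# Hypothetical baseline: 100 units of performance for 50 dollars.
baseline = cost_efficiency(performance=100.0, cost=50.0)

# Case 1: performance doubles while efficiency stays constant,
# so the cost needed to deliver it doubles as well.
cost_at_double_performance = 200.0 / baseline

# Case 2: performance stays the same while cost drops,
# so efficiency rises.
cheaper = cost_efficiency(performance=100.0, cost=40.0)

print(baseline, cost_at_double_performance, cheaper)
```

Under these assumed numbers, doubling performance at constant efficiency doubles the cost, while lowering the cost at constant performance raises the efficiency.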
The efficiency of the storage system is important to understand. As the raw performance increases, the ability of an application, or set of applications, to get even more performance from the disk drive needs to increase as well. For example, let’s assume that a storage device’s raw performance is “X”. If the efficiency is 0.5, then the maximum sustained performance would be 0.5*X. If the following generation of storage devices increases the raw performance to 2X but the efficiency remains the same, then the realized performance is 0.5*2X = X. While the overall gain is impressive, the realized performance has only kept pace with the raw performance. However, if the efficiency increases to 0.6 (from 0.5), then the new realized performance is 0.6*2X = 1.2X. Compared to the previous generation, the realized performance has gone from 0.5X to 1.2X, a 2.4 times gain, which is greater than the 2 times gain in raw performance.
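The generational comparison above can be written out as a short calculation. This is a sketch under the assumptions of the example itself: raw performance is normalized so that X = 1, and the helper name `realized_performance` is introduced here purely for illustration.

```python
def realized_performance(raw, efficiency):
    """Sustained performance = efficiency * raw performance."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return efficiency * raw


X = 1.0  # normalized raw performance of the first generation

gen1 = realized_performance(X, 0.5)                 # 0.5 * X
gen2_same_eff = realized_performance(2 * X, 0.5)    # 0.5 * 2X
gen2_better_eff = realized_performance(2 * X, 0.6)  # 0.6 * 2X

raw_gain = (2 * X) / X                  # gain in raw performance
realized_gain = gen2_better_eff / gen1  # gain in realized performance

print(f"raw gain: {raw_gain:.1f}x, realized gain: {realized_gain:.1f}x")
```

The realized gain exceeds the raw gain only because the efficiency term improved between generations; with a fixed efficiency the two gains are identical.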
Seagate has been at the forefront of delivering storage systems that have the highest raw performance in the industry as well as increasing the efficiency of the raw performance. Measurements have shown that Seagate disk drives have gone from being about 20 % efficient to almost 85 % efficient in the past 5 years. Coupled with the raw performance increasing from about 125 MB/sec to over 3 GB/sec, the overall realized performance per disk drive has gone from .20 * 125 = 25 MB/sec to .83 * 3,000 = 2,490 MB/sec. This is an astounding improvement of roughly 100 times in the realized performance of storage in just 5 years.
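The arithmetic behind that comparison can be reproduced directly. The sketch below simply plugs in the figures quoted in the paragraph above (efficiencies of .20 and .83, raw rates of 125 MB/sec and 3,000 MB/sec); the variable names are illustrative and no additional measurements are assumed.

```python
# Figures quoted in the text, in MB/sec and as fractions of raw performance.
old_raw_mb_s, old_efficiency = 125.0, 0.20
new_raw_mb_s, new_efficiency = 3000.0, 0.83

old_realized = old_efficiency * old_raw_mb_s  # about 25 MB/sec
new_realized = new_efficiency * new_raw_mb_s  # about 2,490 MB/sec

raw_gain = new_raw_mb_s / old_raw_mb_s        # raw performance alone
realized_gain = new_realized / old_realized   # raw performance and efficiency

print(f"raw gain: {raw_gain:.0f}x")
print(f"realized gain: {realized_gain:.1f}x")
```

Splitting the result this way shows how much of the improvement comes from raw performance (24 times) and how much from the efficiency rising from .20 to .83 (a further factor of about 4).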
Read more about the latest Seagate products which can transform how you think about storage systems.