Data, the gold that today’s organizations spend significant resources to acquire, is ever-growing and drives significant innovation in technologies for storing and accessing it. In this technology guide, Modernizing and Future-Proofing Your Storage Infrastructure, we’ll see that different applications and workflows will always have different data storage and access requirements, so planners must understand that a fully functioning organization needs a heterogeneous storage infrastructure. Most organizations require both High Performance Computing (HPC) technologies and other IT functionality.
Industry Innovators
A number of industries benefit from the vast amounts of data generated today and require fast, reliable access to that data. The following industries need high-density, high-performing storage devices located close to the computing systems.
Energy
By using HPC technologies to find reserves faster, energy companies can get their product to market sooner. When searching for new energy deposits and simulating the extraction of oil and gas, the more data that is used, the more accurate the modeling and simulation will be. Fast and reliable access to new and historical data will lead to more confidence in drilling and fewer environmental effects during extraction. Energy delivery can also be planned better with real-time monitoring of energy usage, where massive amounts of data arrive from sources such as IoT devices.
Financial Tech
Nanoseconds matter when determining when to trade stocks, bonds and other financial instruments. With extremely fast simulations of where prices may move, based on current and historical patterns, a financial services company can gain an advantage over its competitors, increasing profit and reducing losses. Using more of the available historical data can increase confidence when trading financial instruments. Having the data close to the CPU can shave microseconds off a simulation, leading to faster decisions and the ability to increase returns or decrease the amount of money lost in a downturn.
Genomics
Decoding a genome takes significant computing power and fast access to significant amounts of data. As has been documented, the time and cost of sequencing genomes fell by a factor of 1 million over the past 10 years. This is the result of a combination of faster CPUs, better algorithms and faster storage. Because a single human genome takes up 100 gigabytes of storage space and an increasing number of genomes are being sequenced, storage needs will grow from gigabytes to terabytes to petabytes and to exabytes. By 2025, an estimated 40 exabytes of near-line storage capacity will be required for human genomic data. The new frontier in personalized medicine is to understand each individual’s genetic makeup and deliver medicines accordingly.
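As a rough sanity check on the two figures above, the short sketch below converts the estimated 40 exabytes into a genome count at 100 gigabytes per genome. It assumes decimal units (1 GB = 10^9 bytes, 1 EB = 10^18 bytes) and ignores compression, replication and derived data. The function and constant names are illustrative only.

```python
# Decimal storage units (an assumption; binary units would shift the result slightly).
GB = 10**9
EB = 10**18

def genomes_that_fit(capacity_bytes: int, bytes_per_genome: int) -> int:
    """Return how many whole genomes fit in the given capacity."""
    return capacity_bytes // bytes_per_genome

GENOME_BYTES = 100 * GB        # per-genome footprint quoted in the text
PROJECTED_CAPACITY = 40 * EB   # 2025 estimate quoted in the text

count = genomes_that_fit(PROJECTED_CAPACITY, GENOME_BYTES)
print(f"{count:,} genomes")
```

Under these assumptions the estimate corresponds to roughly 400 million genomes, which gives a sense of the scale behind the move from gigabytes to exabytes.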
New Space
As space commercialization heats up, faster simulations for many types of activities will be required. From designing new space suits and life support systems to determining flight paths and mining asteroids, the amount of data that is generated and consumed will explode. The possible uses of outer space are unlimited and unknown at this time. However, most scenarios will require massive amounts of data to be accessed in real time as well as significant amounts of computing power.
Storage Considerations
Storage plays a key role in all industries, and especially where HPC is used to solve complex problems. While many IT decision makers may choose to select CPUs first and make storage an afterthought, the storage system should be selected at the same time as the overall infrastructure is being architected. Without fast and reliable access to all of the data, workflows will suffer, eventually affecting product development and business decisions. By selecting and integrating the right storage system up front, organizations can avoid incompatibilities and performance shortfalls.
Storage for demanding applications must be more than Just a Bunch Of Disks (JBOD). The storage system that is chosen must enhance the computing tasks at hand. Simply throwing data at a disk drive is not enough. The storage system must have the intelligence to maximize performance in all slots, as well as deliver high performance across various types of data, from random small reads and writes to large sequential files.
As mentioned earlier, an effective storage system responds quickly to concurrent applications that use the same data. If a storage system responds quickly only to reads, applications that must store data will suffer in performance. If it is optimized only for writes, applications that require fast access to the data will suffer instead. A well-chosen storage system reads and writes quickly for multiple applications running at the same time, so that integrated workflows get the right data at the right time. The figure above shows an example of how multiple applications need to access the same storage system.
Data from all sources is growing, as has been well documented. How will the amount of data that your application requires grow in the future? Will you need to store and access 2X the data in 12 months? 3X? How will you scale to what you need without a massive change to your existing investment? These are critical decisions that must be made when selecting your storage system. Selecting a vendor and system that can handle higher-density or faster drives, both HDDs and SSDs, will ensure that your storage system can grow with your needs without replacing the chassis or other internal components. As organizations purchase new systems, these systems must integrate seamlessly and must not disrupt ongoing operations.
Choosing the right mix of HDDs and SSDs depends on a number of factors. While there has been a lot of hype around the emergence of SSDs, the need will always exist for a mix of the two. For specific workloads, it is important to understand and compare the relative performance per dollar and per unit of capacity of the different technologies.
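The growth questions and the per-dollar comparison above can be turned into a small planning calculation. The sketch below is one possible reading, not a sizing tool: the drive names, capacities, throughputs and prices are placeholder values invented for illustration, and the calculation ignores redundancy overhead, spare drives and chassis limits.

```python
import math
from dataclasses import dataclass

@dataclass
class DriveOption:
    """A hypothetical drive type; every figure here is a placeholder, not vendor data."""
    name: str
    capacity_tb: float
    throughput_mb_s: float
    price_usd: float

    def usd_per_tb(self) -> float:
        # Cost per unit of capacity.
        return self.price_usd / self.capacity_tb

    def mb_s_per_usd(self) -> float:
        # Sequential throughput bought per dollar.
        return self.throughput_mb_s / self.price_usd

def drives_needed(current_tb: float, growth_factor: float, drive_tb: float) -> int:
    """Whole drives required to hold current_tb after it grows by growth_factor."""
    return math.ceil(current_tb * growth_factor / drive_tb)

options = [
    DriveOption("example-hdd", capacity_tb=16.0, throughput_mb_s=250.0, price_usd=320.0),
    DriveOption("example-ssd", capacity_tb=4.0, throughput_mb_s=3000.0, price_usd=400.0),
]

current_tb = 500.0  # hypothetical data set size today
for growth in (2, 3):  # the 2X and 3X scenarios from the text
    for opt in options:
        n = drives_needed(current_tb, growth, opt.capacity_tb)
        print(f"{growth}X on {opt.name}: {n} drives, "
              f"${opt.usd_per_tb():.2f}/TB, {opt.mb_s_per_usd():.2f} MB/s per $")
```

With real quotes substituted for the placeholders, the same two ratios show where each technology is the better fit: the drive type with the lower cost per terabyte suits bulk capacity, while the one with the higher throughput per dollar suits performance-sensitive tiers.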
The security of stored data is of primary importance for any organization. Numerous services can be installed to protect data from outside cyber attacks. For an additional level of security, the storage device can be fully encrypted so that the data remains secure even after the drive is removed from a system. This capability is available from leading manufacturers.
Another critical but often overlooked performance measurement is how long it takes to rebuild a disk when needed. For critical business operations, rebuild time matters a great deal. Using innovative techniques such as Seagate ADAPT data protection, disk rebuild time can be reduced by up to 95% compared with traditional RAID implementations.
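To put the rebuild figure in concrete terms, the sketch below estimates a baseline rebuild time from drive capacity and sustained rebuild rate, then applies a reduction of up to 95%. It is arithmetic only and does not model how ADAPT or any RAID implementation works. The 16 TB capacity and 100 MB/s rebuild rate are hypothetical inputs chosen for illustration.

```python
def rebuild_hours(capacity_tb: float, rebuild_mb_s: float) -> float:
    """Hours to rewrite a full drive at a constant rebuild rate (decimal units)."""
    seconds = (capacity_tb * 1e12) / (rebuild_mb_s * 1e6)
    return seconds / 3600

def reduced_hours(baseline_hours: float, reduction_fraction: float) -> float:
    """Rebuild time after cutting the baseline by the given fraction (0.95 = 95%)."""
    return baseline_hours * (1.0 - reduction_fraction)

# Hypothetical inputs, not measured values.
baseline = rebuild_hours(capacity_tb=16.0, rebuild_mb_s=100.0)
best_case = reduced_hours(baseline, 0.95)

print(f"baseline rebuild: {baseline:.1f} h")
print(f"with a 95% reduction: {best_case:.1f} h")
```

Under these assumed inputs, a rebuild that would otherwise take most of two days drops to a little over two hours, which shortens the window during which the array runs with reduced redundancy.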
By understanding the options, you can future-proof your organization’s storage infrastructure and be ready for the increase in storage needs. A storage system can be scaled in a number of ways, including:
- Purchasing HDDs or SSDs from a vendor that continually innovates in capacity and performance and is recognized as a supplier to the most demanding enterprises and research organizations
- Adding or removing capacity without affecting running applications
- Choosing a chassis that is designed to house a significant number of devices without requiring more power supplies or internal communication paths
- Choosing chassis that can work together without requiring additional hardware from third-party vendors
Over the next few weeks we will explore these topics surrounding storage infrastructure:
- Introduction, Mixing Workloads, Components/Storage
- Challenges with HPC Today and How it is Changing, Industry Trends
- Industry Innovators, Storage Considerations
- Value, Seagate as a Trusted Partner, Next Steps
Download the complete insideHPC Special Research Report: Modernizing and Future-Proofing Your Storage Infrastructure, courtesy of Seagate.