In the late 1980s, genomic sequencing began to shift from wet lab work to a computationally intensive science; by the end of the 1990s this trend was in full swing. The application of computer science and high performance computing (HPC) to these biological problems became the normal mode of operation for many molecular biologists.
The Open Compute Project, initiated by Facebook as a way to increase computing power while lowering the costs associated with hyper-scale computing, has gained a significant industry following. This guide to Open Computing is designed to help organizations optimize their HPC environments to achieve higher performance at a lower operating cost.
Advances in computational biology as applied to NGS workflows have led to an explosion of sequencing data. All that data has to be sequenced, transformed, analyzed, and stored. The machines capable of performing these computations at one point cost millions of dollars, but today the price tag has dropped into the hundreds of thousands of dollars range.
Clusters that are purchased for specific applications tend not to be flexible as workloads change. What is needed is an infrastructure that can expand or contract as the workload changes. IBM, a recognized leader in High Performance Computing, is applying its expertise in both HPC and cloud computing to bring these technologies together and create the HPC Cloud.
Demands from users running scientific, technical, financial, or research applications can easily outstrip the capabilities of in-house server clusters. IT departments have to anticipate compute and storage needs for their most demanding users, which can lead to extra spending on both CAPEX and OPEX once the workload changes.
The term next generation sequencing (NGS) is really a misnomer. NGS implies a single methodology, but the fact is that over the past 10 to 15 years there have been multiple generations, and the end is nowhere in sight. Technological advances in the field are continuing to emerge at a record-setting pace.
The ClusterStor SDA is built on Seagate’s successful ClusterStor family of high performance storage solutions for HPC and Big Data, providing unmatched file system performance, optimized productivity and the HPC industry’s highest levels of efficiency, reliability, availability and serviceability. Taking full advantage of the Lustre file system, Seagate ClusterStor is designed for massive scale-out performance and capacity. It can support from hundreds to tens of thousands of client compute nodes, delivering data-intensive workload throughput from several GB/sec to over 1 TB/sec.
The Seagate ClusterStor Secure Data Appliance (SDA) is the HPC industry’s first scale-out secure storage system officially ICD-503 certified to consolidate multiple previously isolated systems, maintain data security, enforce security access controls, segregate data at different security levels, and provide audit trails, all in a single scale-out file system with proven linear performance and storage scalability.
There are always different levels of importance assigned to the various data files in a computer system, especially a very large system that stores petabytes of data. To make the best use of the highest-speed storage, Hierarchical Storage Management (HSM) was developed to move and store data within easy reach of users, yet at the appropriate speed and price.