Interview: Terascala and High Performance Data Movement

In the HPC world, the need for speed is often coupled with requirements for robustness and reliability over long time scales. Terascala tackles this problem head on via an OS that manages Lustre-based storage and application workflow. To learn more, we caught up with Steve Butler, CEO of Terascala.

insideHPC: Terascala bills itself as “The Fast Data Company.” What does this mean for the HPC crowd?

Steve Butler: Terascala is helping accelerate and standardize the way data is managed and monitored in HPC application workflows. We provide the operating system and analytics that transform block storage and controllers from Dell and NetApp into an easy-to-manage storage appliance that delivers multiple gigabytes per second of throughput for HPC workloads.

Terascala’s intelligent operating system, TeraOS, simplifies managing Lustre®-based storage and optimizes workflows, providing the high throughput storage HPC users need to solve bigger problems faster. For the HPC folks, this means that Terascala-powered storage appliances can reduce run times to hours instead of days or weeks.

Our recently announced Intelligent Storage Bridge (ISB) is another example of why we call ourselves the Fast Data Company. The ISB is a high-performance workflow manager designed to transfer large data sets quickly and automatically between fast scratch storage and enterprise storage. The ISB is the fastest way to move data between different storage tiers – a critical capability in both the HPC and Big Data worlds.

insideHPC: What does this mean for the Big Data side of things?

Steve Butler: We see the future of Big Data in terms of optimizing data flow via high performance data movement across an entire organization. Organizations that can optimize their data flow so that vital research and analysis can be performed quickly will thrive in today’s competitive environment.

Today, most Big Data is placed in storage that is far too slow for data-intensive applications. Once moved to HPC storage, data needs to move incredibly fast to the compute nodes that are running the application itself. The processed data then needs to move back from HPC to infrastructure (NAS or Cloud) where permanent storage exists. To keep Big Data moving quickly and constantly, we believe that organizations need a purpose-built, end-to-end, high performance data movement solution.

Terascala can be the connection between Big Data and HPC. For moving data to compute clusters, our storage appliances scale from 10s to 100s of GB/s. And our Intelligent Storage Bridge (ISB) scales to 10s of GB/s in order to move data throughout an entire organization quickly and efficiently, connecting Lustre, NFS, CIFS, and cloud file systems.

insideHPC: You recently made an announcement surrounding a new release of the Intelligent Storage Bridge. You touched on this in your first answer, but can you provide more details?

Steve Butler: Terascala announced the availability of the latest version of our groundbreaking data movement solution, the Intelligent Storage Bridge (ISB). The ISB enhances the throughput and reliability of large data transfers, increasing fast scratch efficiency and overall application workflow performance. Our latest version of the ISB includes vendor-agnostic support of Lustre solutions, allowing organizations to bring together a wider range of HPC and enterprise storage solutions.

Because many of today’s HPC infrastructures are not optimized for efficient and daily movement of data, vast amounts of unstructured data are typically stranded, limiting valuable analysis opportunities. The ISB is the first solution of its kind to solve this “stranded data” challenge by automating the transfer of large data sets between high-performance fast scratch storage and a cost-effective enterprise storage system.

insideHPC: This solution includes vendor-agnostic support of Lustre users. What does this mean ultimately for your Lustre customers?

Steve Butler: The open source Lustre file system is the fastest in the world – approximately 2/3 of the fastest file systems in the world use it. Yet, at times, Lustre can be challenging and requires expertise to manage. Terascala solves this problem with the ISB. Since we started shipping the ISB last year, it’s been helping customers make better use of their high performance resources. With this new release and the vendor-agnostic support for Lustre, the ISB is poised to become an indispensable part of any HPC infrastructure.

insideHPC: Do you have any recent success stories that you may want to share?

Steve Butler: One of Terascala’s first customers to deploy the ISB after it became available last year was the Translational Genomics Research Institute (TGen). They’re a healthcare organization focused on using genomic sequencing research to treat disease and bring the fight against childhood cancer to the forefront of the medical research community.

TGen has been using the ISB for more than half a year, moving genome data sets to and from the Dell-Terascala HSS on a regular basis. Using the ISB, their researcher productivity is higher and their genome analysis timeline is reduced. TGen told us recently that using the ISB, they’ve been able to cut the time required to run their computations from about seven to nine days down to four to ten hours.
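To put those figures in perspective, the implied speedup can be worked out with simple arithmetic. The sketch below is only an illustration: the function name is hypothetical, and it assumes the quoted ranges (seven to nine days before, four to ten hours after) describe comparable workloads, which the interview does not state.

```python
HOURS_PER_DAY = 24

def speedup_range(before_days, after_hours):
    """Return (lowest, highest) speedup implied by two run-time ranges.

    before_days: (shortest, longest) run time before, in days
    after_hours: (shortest, longest) run time after, in hours
    """
    before_lo, before_hi = (d * HOURS_PER_DAY for d in before_days)
    after_lo, after_hi = after_hours
    # Lowest speedup: shortest old run against longest new run.
    # Highest speedup: longest old run against shortest new run.
    return before_lo / after_hi, before_hi / after_lo

# Ranges quoted in the interview; pairing them this way is an assumption.
low, high = speedup_range(before_days=(7, 9), after_hours=(4, 10))
print(f"Implied speedup: roughly {low:.1f}x to {high:.1f}x")
```

Under these assumptions, the quoted reduction corresponds to runs finishing somewhere between about 17 and 54 times faster.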

insideHPC: Things happen super-quickly in the HPC world–what’s next for Terascala?

Steve Butler: Our vision for the future is to continue to deliver a Terascala product suite that gives organizations the tools they need to optimize their HPC workflows and big data analysis.

As organizations acquire more and more data, a big challenge will be storing, moving and optimizing that data, so it can be used more effectively.

For example, as the cost of genomic sequencing continues to fall, the challenge has become how to effectively find and use the information contained in a tsunami of data. And a big part of the solution involves effective data movement. We believe that in record time Terascala, its partners and high performance computing in general will play a vital role in finding better treatments and more effective medications in the fight against disease.
