Long-time IBM HPC Strategist Dave Turek Named CTO of DNA-based Storage Start-up

Long-time IBM HPC strategist David Turek has left the company after 25 years and has joined Catalog, which calls itself the world’s first DNA-based platform for massive digital storage and computation. He will serve as CTO of the start-up.

Catalog also announced today $10 million in Series A funding in a round led by Horizons Ventures. In total, Catalog has raised $21 million; additional investors include NEA, OS Fund, Data Collective and Day One Ventures, among others.

At IBM, Turek led the overall high-performance computing strategy that, over his years there, included the SP2 program, Roadrunner, Blue Gene and CORAL systems. He also helped launch IBM’s Grid Computing business, and started and ran IBM’s Linux Cluster business. Turek told us he left IBM in June.

“Bringing David Turek on-board as our CTO is game-changing for the company and is an important milestone for the field of DNA-based data storage and computation technology,” said Catalog CEO and Co-founder Hyunjun Park. “The new capital will be used to reach commercialization.”

The basis of Catalog’s technology is what’s known as “synthetic DNA,” which the company says “is ideal for data storage purposes for reasons that go beyond its longevity. It can hold a million times the data in the same volume as what is offered by magnetic and solid-state media. It takes almost no energy to store, can be replicated easily and inexpensively once encoded, and is trivial to transport, as a thousand petabytes of data (one exabyte) in DNA form will be roughly the size of a sugar cube.”

According to the company, current research efforts to develop DNA-based data storage struggle with high costs and low synthesis speeds, which limit their near-term economic viability for storing meaningful quantities of digital information. Catalog’s breakthrough was in realizing that unique DNA sequences do not need to be chemically synthesized base-by-base to encode digital information. Rather, a collection of prefabricated DNA sequences can be assembled in different combinations to represent any digital data object much more quickly and inexpensively, with the same bit-level precision, the company said.
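To make the combinatorial idea concrete, here is a toy sketch in Python. It is not Catalog's actual scheme, whose details are not given here. It assumes, purely for illustration, that one component is picked from each of several "layers" of prefabricated sequences to form an identifier, and that the presence or absence of an identifier in the pool stands for one bit. The component sequences, the layer sizes, and the names `LAYERS`, `encode` and `decode` are all hypothetical.

```python
from itertools import product

# Hypothetical library of prefabricated components: 3 layers of 4 pieces each.
# Real component sequences and library sizes are not described in the article.
LAYERS = [
    ["AAC", "AGT", "ATG", "ACA"],
    ["CCA", "CGT", "CTG", "CAC"],
    ["GGA", "GCT", "GTC", "GAG"],
]

# Picking one component per layer yields one identifier: 4**3 = 64 in total,
# built from only 12 prefabricated pieces.
IDENTIFIERS = ["".join(parts) for parts in product(*LAYERS)]


def encode(data: bytes) -> set:
    """Return the identifiers to assemble: one for every 1-bit in the data."""
    bits = "".join(f"{byte:08b}" for byte in data)
    if len(bits) > len(IDENTIFIERS):
        raise ValueError("data does not fit in the identifier space")
    return {IDENTIFIERS[i] for i, bit in enumerate(bits) if bit == "1"}


def decode(pool: set, n_bytes: int) -> bytes:
    """Rebuild the bytes by checking which identifiers are present in the pool."""
    bits = "".join(
        "1" if IDENTIFIERS[i] in pool else "0" for i in range(n_bytes * 8)
    )
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    pool = encode(b"HPC")
    print(len(IDENTIFIERS), "identifiers available;", len(pool), "assembled")
    print(decode(pool, 3))
```

The point of the sketch is the scaling: the number of identifiers grows as the product of the layer sizes, while the number of pieces that must be manufactured ahead of time grows only as their sum.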

The Boston-based company’s first-generation DNA writer and data storage system, Mobius, can write at a speed of over 10 Mb/s and can generate over a trillion identifiers in a single run, enough to store up to 1.63 Tb of compressed data, according to Catalog. The inaugural run on Mobius wrote the English text version of Wikipedia, comprising 14 GB of information, which is more than all other efforts to store information in DNA to date combined. “Even more, it was written at 4-5 orders of magnitude faster and cheaper than competitive technologies in DNA,” the company said in today’s announcement.

Turek told us that Catalog’s vision is to pursue breakthroughs in fields such as search, inference and digital signal processing.

“I am thrilled to join the team at Catalog,” he said. “While there is justifiable excitement about the innovations pioneered by Catalog to efficiently write data in DNA, we believe this is just the beginning as we vigorously push towards product commercialization. Our intention is to be the marketplace leader for the application of synthetic DNA for both storage and computation.”

Founded in 2016 by MIT scientists, Catalog said it is the first company to develop a solution to make DNA data storage commercially viable. In 2019, Catalog was named by Fast Company as one of the year’s Most Innovative Companies in biotech and was selected as one of the Best Inventions of 2019 by Time Magazine. The company said it is in talks with potential pilot users of the technology at government agencies, multinational organizations and corporations in various sectors including media and entertainment, banking and finance, oil and gas, and others.
