A strategic partnership between the University of Michigan and software company Yottabyte promises to unleash a new wave of data-intensive research by providing a flexible computing cloud for complex computational analyses of sensitive and restricted data.
“With the Yottabyte Research Cloud, researchers will be able to ask more questions, faster, of the ever-expanding and massive sets of data collected for their work,” said Yottabyte CEO Paul Hodges. “We are very pleased to be a part of the diverse and challenging research environment at U-M. This partnership is a great opportunity to develop and refine computing tools that will increase the productivity of U-M’s world-class researchers.”
The Yottabyte Research Cloud will provide scientists with high-performance, secure and flexible computing environments that enable the analysis of sensitive data sets restricted by federal privacy laws, proprietary access agreements or confidentiality requirements. Previously, the complexity of building secure and project-specific IT platforms often made the computational analysis of sensitive data prohibitively costly and time-consuming.
The system is built on $5.5 million worth of hardware and software donated to the university by Yottabyte. U-M will provide $2 million to support delivery of services to researchers and general operations.
Brahmajee Nallamothu, professor of internal medicine, tested a pilot installation of the Yottabyte Research Cloud at the U-M Institute of Healthcare Policy and Innovation for his research on such topics as predictors of opioid use after surgery and the costs and uses of cancer screenings under the Affordable Care Act.
“We recently moved a health care claims database, which is multiple terabytes in size and requires a great deal of memory and fast storage to process, onto the pilot platform,” Nallamothu said. “The platform allows us to immediately increase or decrease computing resources to meet demand while permitting multiple users to access the data safely and remotely. Our previous setup relied on network storage and self-managed hardware, which was extremely inefficient compared to what we can do now.”
“The Yottabyte Research Cloud will improve research productivity by reducing the cost and time required to create the individualized, secure computing platforms that are increasingly necessary to support scientific discovery in the age of Big Data,” said Eric Michielssen, associate vice president for advanced research computing at U-M.
Many U-M scientists are working on a variety of research projects that could benefit from use of the Yottabyte Research Cloud:
- Health care research, for example in precision medicine, often requires working with sensitive patient information and large volumes of diverse data types. This research can yield results that positively impact patients’ lives, but often involves the analysis of millions of clinical observations that can include genomic, hospital, outpatient, pharmaceutical, laboratory and cost data. This requires a secure high-performance computing ecosystem coupled to massive amounts of multitiered storage.
- In the social sciences, U-M research requires secure, remote access to sensitive research data about substance abuse, mental health and other topics.
- Transportation researchers who mine large and sensitive datasets (for example, a 24-terabyte dataset that includes videos of drivers’ faces and GPS traces of their journeys) also stand to benefit from the security features and computing power.
- In learning analytics, studies of the persistence of teacher effects on student learning could benefit from the enclaves to store and analyze data that includes observational measures scored from classroom videos and elementary and middle school students’ scores on standardized tests.
- Researchers in brain science will be able to use the Yottabyte Research Cloud to investigate a wide range of topics including the effects of aging on brain function and structure and how we focus our attention in the presence of distraction.
The Yottabyte Research Cloud is U-M’s first foray into software-defined infrastructure for research, allowing on-the-fly personalized configuration of any-scale computing resources, which promises to change the way traditional IT infrastructure systems are deployed across the research community.