IBM has teamed with UNSW Australia and a Brazilian health institute to examine big genomic data using IBM’s World Community Grid. The project seeks to make about 20 quadrillion comparisons among 200 million proteins from a wide variety of organisms.
According to IBM, that herculean effort would normally take a single PC 40,000 years of continuous calculation, but the computing power of World Community Grid will reduce the task to months.
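The figures above can be sanity-checked with a little arithmetic. The sketch below assumes the comparison is all-against-all (each protein compared once with every other), which the announcement does not state, and it idealizes volunteer machines as always-on PCs. The helper name `pcs_needed` and the six-month target are illustrative only.

```python
# Back-of-the-envelope check of the figures quoted by IBM.
PROTEINS = 200_000_000            # "200 million proteins" (from the article)
PC_YEARS = 40_000                 # years of continuous work on one PC (from the article)
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# Assumption: every protein is compared once with every other protein,
# giving n * (n - 1) / 2 unordered pairs.
pairwise_comparisons = PROTEINS * (PROTEINS - 1) // 2

# Sustained rate a single PC would need under that assumption.
comparisons_per_second = pairwise_comparisons / (PC_YEARS * SECONDS_PER_YEAR)


def pcs_needed(target_months: float) -> float:
    """Always-on PCs required to finish in target_months (hypothetical helper)."""
    return PC_YEARS * 12 / target_months


print(f"{pairwise_comparisons:.2e} pairwise comparisons")
print(f"{comparisons_per_second:,.0f} comparisons per second on one PC")
print(f"{pcs_needed(6):,.0f} always-on PCs for a six-month run")
```

Under these assumptions the pair count comes out at roughly 20 quadrillion, in line with the quoted figure. Real volunteer devices contribute only idle time, so the actual number of participating machines would have to be higher than this idealized estimate.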
World Community Grid taps into the goodwill and computing power of thousands of volunteers around the globe. Each has downloaded an app that borrows a device’s unused processing power whenever its owner does not need it, such as during a brief or extended break from the computer. The scalability of this virtual supercomputer gives scientists nearly limitless capacity to work with large amounts of data at no cost to them.
While the project will process protein sequences from various forms of life, it will pay special attention to microorganisms because of their ubiquity and importance. For example, the microorganisms living in and on the human body are commonly estimated to outnumber human cells by about 10 to 1. Microorganisms control a huge variety of natural processes involved in human health (gut bacteria aid digestion and reduce allergies), food production (baker’s yeast increases yields, speeds preparation and improves taste), and agriculture and aquaculture (bacteria remove impurities). They have been used to clean water in sewage treatment plants and even to help consume oil spills. Microorganisms in exotic tropical plants show promise as efficient, sustainable fuel sources.
However, most of these discoveries were made through time-consuming trial and error. A better understanding of microbial genes and their corresponding proteins might speed the development of practical technologies and solutions. Despite their importance for our planet’s health, microorganisms are hard to analyze because of their tiny size, great numbers, and dizzying variety. Scientists who want to search for useful genes in unknown organisms face a daunting task: a small sample of water or soil can contain tens of thousands of organisms, and each organism may have thousands of genes. The acceleration of climate change and the disappearance of habitat have made the identification and analysis of DNA a race against time.
World Community Grid is enabled by the Berkeley Open Infrastructure for Network Computing (BOINC), software developed in 2002 at the University of California, Berkeley, with support from the National Science Foundation. BOINC choreographs the technical aspects of volunteer computing.