Google and IBM announce program to train next generation of parallel specialists


[This expands on Michael’s post below, Ed.]

This is cool news, and complements relationships announced by other major companies in recent months. Companies are realizing that if the chip roadmap unfolds as it's currently laid out, then we're already behind in training computer scientists to deal with the shift.

Google and IBM announced today that they're creating a new initiative to teach software development methods that will help students address the challenges of "internet-scale applications in the future."

The goal of this initiative is to improve computer science students' knowledge of highly parallel computing practices to better address the emerging paradigm of large-scale distributed computing. IBM and Google are teaming up to provide hardware, software and services to augment university curricula and expand research horizons. With their combined resources, the companies hope to lower the financial and logistical barriers for the academic community to explore this emerging model of computing.

The University of Washington has signed on to pilot the program, along with Carnegie Mellon University, the Massachusetts Institute of Technology, Stanford University, the University of California at Berkeley, and the University of Maryland.

For this project, the two companies have dedicated a large cluster of several hundred computers (a combination of Google machines and IBM BladeCenter and System x servers) that is planned to grow to more than 1,600 processors. Students will access the cluster over the Internet to test their parallel programming course projects. The servers will run open source software including the Linux operating system, Xen virtualization, and Apache's Hadoop project, an open source implementation of Google's published computing infrastructure, specifically MapReduce and the Google File System (GFS).
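For a sense of the programming model students will be working with, here is a minimal sketch of the canonical word-count job written against the Hadoop MapReduce Java API. The class and method names follow the org.apache.hadoop.mapreduce interfaces as they exist today (the exact API has shifted across Hadoop versions), so treat this as illustrative rather than a drop-in example: the map step emits a (word, 1) pair per token, and the reduce step sums the counts for each word.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in its input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sums the counts produced for each word across all mappers.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local pre-aggregation
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

The appeal for teaching is that the framework handles partitioning, scheduling and fault tolerance across the cluster, so students can focus on expressing the parallel computation itself.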

According to the release, IBM and Google are tossing these resources into the effort:

  • A cluster of processors running an open source implementation of Google’s published computing infrastructure (MapReduce and GFS from Apache’s Hadoop project).
  • A Creative Commons licensed university curriculum developed by Google and the University of Washington focusing on massively parallel computing techniques available at: http://code.google.com/edu/content/parallel.html.
  • Open source software designed by IBM to help students develop programs for clusters running Hadoop. The software works with Eclipse, an open source development platform. The plugin is currently available at: http://lucene.apache.org/hadoop/.
  • Management, monitoring and dynamic resource provisioning of the cluster by IBM using IBM Tivoli systems management software.
  • A website to encourage collaboration among universities in the program. This will be built on Web 2.0 technologies from IBM’s Innovation Factory.

Read the whole story at HPCwire.