In this video, Maciej Besta and Torsten Hoefler describe their work on “Slim Fly: A Cost Effective Low-Diameter Network Topology”, which was honored as the Best Student Paper at SC14.
In an 1883 lecture on “The Practical Applications of Electricity”, Scottish physicist Lord Kelvin stated: “… when you can measure what you are speaking about, and express it in numbers, you know something about it …” High Performance Computing (HPC) has inherited a healthy predisposition towards monitoring in that same spirit. Fast-forward to the present, and monitoring HPC clusters remains topical. And while I expect we can all agree on its ongoing relevance, there are clearly very different perspectives on how monitoring should be modernized. Whereas passive monitoring using meta-toolkits may address needs temporarily, unified solutions that combine monitoring with provisioning and management deliver value on an ongoing, sustainable basis.
System availability has become a growing concern as high-capacity drives have come to market in recent years. While availability has not traditionally been a chief concern in HPC, it is now of paramount importance. With traditional hardware-based RAID, large-capacity drives expose systems to multiple-day rebuild times and extended periods of vulnerability to data loss.
From Wall Street to the Great Wall, enterprises and institutions of all sizes are faced with the benefits – and challenges – promised by ‘Big Data’. But before users can take advantage of the near-limitless potential locked within their data, they must have affordable, scalable and powerful software tools to manage that data.
A new computational method has made it possible to detect the genetic changes responsible for the onset and progression of tumors simply, quickly and precisely. The SMUFIN (Somatic Mutations Finder) method can analyze the complete genome of a tumor and identify its mutations in a few hours. It can also identify alterations that had not previously been revealed, even by methods that require the use of supercomputers over several weeks.
As the countdown to Exascale continues, Exascale-like storage problems are already showing up in today’s massively parallel, heterogeneous HPC systems. Historically, storage and I/O have kept pace with growing system demands, but because of the limitations of spinning media and the cost of solid-state storage technologies, storage performance improvements have come at a disproportionately higher cost and lower efficiency than their compute counterparts.
This week we look at various attributes of Lustre, including how easy it is to scale Lustre file systems. Lustre is inherently scalable: it aggregates storage capacity across many servers. I/O bandwidth also scales as more storage servers are added, and can be dynamically adjusted as needs change and demands for more storage capacity and bandwidth grow.