In this video, Sarah Tariq from Nvidia presents "Lessons learned in improving scaling of applications on large GPU clusters." The talk was recorded at the HPC Advisory Council Stanford Conference 2013. Download the slides (PDF).
Ethernet wasn’t built with AI in mind. While cost-effective and ubiquitous, its best-effort, packet-based nature creates challenges in AI clusters… But fabric-scheduled Ethernet transforms Ethernet into a predictable, lossless, scalable fabric, ideal for AI. It uses cell spraying and virtual output queuing…
In this insideHPC Guide, our friends over at WEKA offer 10 important questions to ask when starting with AI, specifically planning for success beyond the initial stages of a project. Reasons given for these failures include not having a plan ahead of time, not getting executive or business leadership buy-in, or failing to find the proper team to execute the project. Chasing the hot technology trend without having a proper strategy often leads companies down the path of failure.