This is the fourth article in a series taken from The insideHPC Guide to The Industrialization of Deep Learning.
With the infrastructure portion of an industrial-grade solution in place, the software development and deployment environment becomes paramount. Perhaps even more than the infrastructure, this environment is evolving rapidly, yet it must still meet the needs of a typical production deployment.
Deep learning solutions are typically part of a broader high-performance analytics function in for-profit enterprises, and must deliver a fusion of business and data requirements. In addition to supporting large-scale deployments, industrial solutions typically require portability, support for a range of development environments, and ease of use.
Ease of use is a significant factor in production environments and is especially relevant to deep learning solutions. These are massively parallel applications, and developing and optimizing parallelized code is a non-trivial exercise. Code development for GPU deployment remains challenging even with the continuing improvements in environments such as CUDA; in most situations, memory management between the host system and GPU-based code still needs to be handled explicitly by the developer.

Code portability and optimization are also important for production deployments, where support for multiple accelerator types will be required. Currently, even GPU models within the same product family can require different configuration and optimization parameters, a problem that will only grow when multiple vendors' accelerators are added to the mix. Despite their importance for commercial enterprise deployments, these factors are frequently not addressed by toolkits developed by university research groups or by hyperscale companies such as Google or Facebook, which are primarily focused on their own specific requirements.
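To make that burden concrete, the following is a minimal CUDA sketch (not drawn from any particular toolkit; the kernel and array sizes are purely illustrative) showing the explicit allocation, host-to-device copies, launch configuration, and cleanup that a developer must manage by hand:

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// Illustrative kernel: scale each element of a vector by a constant.
__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main(void) {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    // Host-side allocation and initialization.
    float *h_data = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) h_data[i] = 1.0f;

    // The developer must explicitly allocate device memory...
    float *d_data;
    cudaMalloc(&d_data, bytes);

    // ...copy inputs from the host to the device...
    cudaMemcpy(d_data, h_data, bytes, cudaMemcpyHostToDevice);

    // ...choose a launch configuration suited to the target GPU...
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads>>>(d_data, 2.0f, n);

    // ...then copy results back and release the device copy.
    cudaMemcpy(h_data, d_data, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_data);

    printf("h_data[0] = %f\n", h_data[0]);
    free(h_data);
    return 0;
}
```

Every one of these steps is device-specific bookkeeping rather than analytics logic, and the launch parameters often need re-tuning for each GPU model; this is exactly the kind of work that higher-level, write-once environments aim to remove.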
By comparison, the HPE Cognitive Computing Toolkit (CCT), the result of over five years of development at Hewlett Packard Labs, focuses specifically on the requirements of production deep learning environments and HPC software enablement. HPE CCT is a domain-specific embedded language with an associated optimizing compiler and runtime libraries. Its primary goal is a high-level, write-once development environment that provides performance-portable, high-productivity programming for accelerators.
HPE CCT capabilities include:
- General high-performance analytics integrating deep learning
- Automatic decomposition to deliver highly parallel code
- Elimination of the need for developers to handle GPU memory management
- Portability and optimization across multiple accelerator types and vendors
- Support for Scala, C++, Python, and OpenCL
- Plug-ins to support environments such as TensorFlow
- Support for defining custom operations
Over the coming weeks, this series will consist of articles that explore:
- Deep Learning and Getting There
- Technologies for Deep Learning
- Components for Deep Learning
- Software Framework for Deep Learning (this article)
- Examples of Deep Learning
- HPE Solutions for Deep Learning / HPE Cognitive Computing Toolkit
If you prefer, you can download the complete insideHPC Guide to The Industrialization of Deep Learning, courtesy of Hewlett Packard Enterprise.