Today NVIDIA announced that NVIDIA Nsight Systems 2019.1 is now available for download. Nsight Systems is a system-wide performance analysis tool that lets developers visualize application algorithms, identify large optimization opportunities, and tune and scale efficiently across CPUs and GPUs.
In this release, we introduce a wide range of new features, refinements, and fixes. The enhancements aim to improve a user’s ability to analyze neural network performance, locate graphical stutter, and increase pattern discoverability.
NVIDIA Nsight Systems is a low-overhead performance analysis tool designed to provide the insights developers need to optimize their software. Unbiased activity data is visualized within the tool to help users investigate bottlenecks, avoid inferring false positives, and pursue optimizations with a higher probability of performance gains. Users can identify issues such as GPU starvation, unnecessary GPU synchronization, insufficient CPU parallelization, and even unexpectedly expensive algorithms across the CPUs and GPUs of their target platform. The tool is designed to scale across a wide range of NVIDIA platforms, such as large Tesla multi-GPU x86 servers, Quadro workstations, Optimus-enabled laptops, DRIVE devices with Tegra+dGPU multi-OS, and Jetson. NVIDIA Nsight Systems can even provide valuable insight into the behaviors and load of deep learning frameworks such as PyTorch and TensorFlow, allowing users to tune their models and parameters to increase overall single or multi-GPU utilization.
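To make "GPU starvation" concrete, here is a minimal Python sketch of the kind of arithmetic a timeline view lets you do by eye: given the intervals during which the GPU was busy, compute overall utilization and list the idle gaps. The function name `summarize_gpu_timeline` and the `(start, end)` tuple format are hypothetical, chosen only for illustration; this is not Nsight Systems' API or export format.

```python
def summarize_gpu_timeline(busy_intervals, window_start, window_end):
    """Return (utilization, idle_gaps) for a list of (start, end) busy intervals.

    Hypothetical helper for illustration only; the input format is an
    assumption, not something Nsight Systems produces in this shape.
    """
    if window_end <= window_start:
        raise ValueError("window_end must be greater than window_start")

    # Clip each interval to the window and merge overlapping ones, so that
    # concurrent kernels are not counted twice.
    merged = []
    for start, end in sorted(busy_intervals):
        start, end = max(start, window_start), min(end, window_end)
        if start >= end:
            continue  # interval lies entirely outside the window
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    busy_time = sum(end - start for start, end in merged)

    # Idle gaps are the stretches of the window not covered by any interval.
    idle_gaps = []
    cursor = window_start
    for start, end in merged:
        if start > cursor:
            idle_gaps.append((cursor, start))
        cursor = end
    if cursor < window_end:
        idle_gaps.append((cursor, window_end))

    return busy_time / (window_end - window_start), idle_gaps


if __name__ == "__main__":
    # Made-up timestamps (arbitrary units) purely to exercise the function.
    utilization, gaps = summarize_gpu_timeline(
        [(0, 10), (5, 20), (50, 60)], window_start=0, window_end=100
    )
    print(utilization, gaps)
```

With these made-up intervals, the GPU is busy for 30 of 100 time units, and the two long gaps are where a starved GPU would be waiting on the CPU or on data. A real investigation would read such gaps directly from the profiler's timeline rather than compute them by hand.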
In this video, John Stone from the NIH Center for Macromolecular Modeling and Bioinformatics at the University of Illinois at Urbana-Champaign describes how he achieved a performance increase of more than 3x in VMD, a popular tool for analyzing large biomolecular systems.
“We noticed that our new Quadro P6000 server was ‘starved’ during training and we needed experts for supporting us,” said Felix Goldberg, Chief AI Scientist at Tracxpoint. “NVIDIA Nsight Systems helped us to achieve over 90 percent GPU utilization. A deep learning model that previously took 600 minutes to train, now takes only 90.”
Nsight Systems is part of a larger family of Nsight tools. A developer can start with Nsight Systems to see the big picture and avoid picking less efficient optimizations based on assumptions and false-positive indicators. If the system-wide view of CPU-GPU interactions indicates large GPU workloads are a bottleneck, then Nsight Graphics and Nsight Compute can further assist in deeper analysis.