Nsight Compute 2020.2, now available for download, gives developers more control over profiling, enabling new usage models and new opportunities to get the best performance. For many use cases, this also means getting results faster, so you can spend less time profiling and more time optimizing. The latest release adds the highly requested Application Replay feature and additional collection knobs that give you more control over what data you collect and how you collect it.
To collect all of the detailed performance data for CUDA kernels, Nsight Compute runs each kernel multiple times and restores the GPU memory context between passes. Now, with Application Replay, Nsight Compute can instead rerun the entire application, without needing to restore the memory context. This can decrease collection time significantly for applications with large GPU memory allocations.
Some workloads also rely on CPU memory for flow control. With Application Replay, those workloads can be profiled and their control flow stays consistent across passes. This feature is enabled with the command-line option “--replay-mode=application” or by setting the Replay Mode in the Common tab of the Activity Configuration in the GUI.
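As a rough illustration of the command-line route, the following Python sketch assembles an Application Replay invocation. It assumes the Nsight Compute CLI is installed under the name ncu; the helper build_replay_command, the target ./my_app, and the report name are hypothetical, and the -o output option is included only for illustration.

```python
import shlex

# Replay modes this sketch accepts: per-kernel replay and Application Replay.
REPLAY_MODES = ("kernel", "application")


def build_replay_command(app_argv, replay_mode="application", report="profile"):
    """Return an argv list that profiles app_argv with the given replay mode.

    `ncu` as the executable name and `-o` for the report file are assumptions
    about the local installation; adjust them to match your setup.
    """
    if replay_mode not in REPLAY_MODES:
        raise ValueError(f"unsupported replay mode: {replay_mode!r}")
    return ["ncu", "--replay-mode", replay_mode, "-o", report, *app_argv]


# Hypothetical target application and arguments.
cmd = build_replay_command(["./my_app", "--size", "1024"])
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd) on a machine where the tool and the target application are available. Building the command as a list rather than a single string avoids shell-quoting problems in the application's own arguments.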
Several new collection control flags are introduced in the Nsight Compute 2020.2 CLI, including:
--kill: Terminate the application once all requested kernels have been profiled.
--log-file: Set the output stream for printing tool output.
--check-exit-code: Control whether the child application exit code should be checked.
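The collection controls listed above might be combined as in the following Python sketch. It assumes the three flags are spelled --kill, --log-file, and --check-exit-code, as in the ncu command-line help, and that the boolean ones accept 1 and 0 as values; the value spelling in particular is an assumption, so check ncu --help for your version. The helper build_collection_command and the target ./my_app are hypothetical.

```python
import shlex


def build_collection_command(app_argv, launch_count=None, kill=False,
                             log_file=None, check_exit_code=True):
    """Assemble an ncu argv list using the collection control flags.

    Flag value spellings ("1"/"0") are assumptions; verify with `ncu --help`.
    """
    cmd = ["ncu"]
    if launch_count is not None:
        # Limit how many kernel launches are profiled.
        cmd += ["--launch-count", str(launch_count)]
    if kill:
        # Stop the application once the requested kernels were profiled.
        cmd += ["--kill", "1"]
    if log_file is not None:
        # Redirect tool output to a file.
        cmd += ["--log-file", log_file]
    if not check_exit_code:
        # Do not treat a non-zero application exit code as an error.
        cmd += ["--check-exit-code", "0"]
    return cmd + list(app_argv)


# Hypothetical example: profile five launches, then terminate the application.
collect_cmd = build_collection_command(
    ["./my_app"], launch_count=5, kill=True, log_file="ncu.log"
)
print(shlex.join(collect_cmd))
```

Pairing a launch limit with early termination is useful for long-running applications where only the first few kernels are of interest.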
The Nsight developer tools are available as individual downloads or as optional components of the CUDA Toolkit. The new CUDA 11.1, Nsight Compute 2020.2, and Nsight Visual Studio Edition 2020.2 releases introduce support for NVIDIA GeForce RTX 30 Series and Quadro RTX Series GPU platforms.
There are multiple improvements to the CUDA debugger tools in the CUDA Toolkit 11.1 release. Performance enhancements make the debugger faster, particularly when loading modules. There are also new options to “Break on API Errors” and “Break on Launch” for finer-grained control of your debugging session.
In addition to the new GPU platforms and CUDA 11.1 support, Nsight Visual Studio Edition introduces support for Microsoft Windows 10 Hardware Scheduling.
If you haven’t tried the Nsight suite of developer tools yet, or you’ve been needing some of these new controls, now is the time to check them out. Learn more about Nsight Compute, Nsight Systems, Nsight Visual Studio Edition, and Nsight Graphics.
You can access all of the NVIDIA developer tools from our download center.