Developer Blog: Introducing Low-Level GPU Virtual Memory Management

CUDA applications increasingly need to manage memory as quickly and as efficiently as possible. Before CUDA 10.2, developers were limited to the malloc-like abstractions that CUDA provides.

CUDA 10.2 introduces a new set of API functions for virtual memory management that enable you to build more efficient dynamic data structures and have better control of GPU memory usage in applications.

In this post, we explain how to use the new API functions and go over some real-world application use cases.

In plenty of applications, it is hard to guess how big your initial allocation should be. You need a larger allocation, but you can't afford the performance and development cost of pointer-chasing through a specialized dynamic data structure from the GPU. What you really want is to grow the allocation as you need more memory, yet maintain the contiguous address range that you always had.

If you have ever used libc's realloc function or C++'s std::vector, you have probably run into this problem yourself: growing the buffer can mean allocating a new block and copying the old contents into it.
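As a host-side illustration of that growth problem, here is a minimal C++ sketch. It does not use the new CUDA API functions; it only shows the std::vector behavior mentioned above. The names GrowthReport and grow_and_report are hypothetical helpers introduced for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type: records what happened when a vector grew.
struct GrowthReport {
    bool moved;                // true if the storage address changed
    std::size_t old_capacity;  // capacity before the resize
    std::size_t new_capacity;  // capacity after the resize
};

// Grow `v` to `new_size` elements and report whether its storage was
// relocated. Addresses are captured as integers so that no pointer to
// freed storage is compared afterwards.
GrowthReport grow_and_report(std::vector<int>& v, std::size_t new_size) {
    assert(new_size >= v.size());  // this sketch only grows, never shrinks
    const auto before = reinterpret_cast<std::uintptr_t>(v.data());
    const std::size_t old_capacity = v.capacity();
    v.resize(new_size);  // reallocates and copies if capacity is exceeded
    const auto after = reinterpret_cast<std::uintptr_t>(v.data());
    return GrowthReport{after != before, old_capacity, v.capacity()};
}
```

Calling grow_and_report with a size larger than the current capacity should report that the storage moved, which means every element was copied and any pointer into the old buffer is now stale. That copy and the lost address stability are the costs that growing an allocation in place, within a contiguous address range, is meant to avoid.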

Read the full post, Introducing Low-Level GPU Virtual Memory Management, on the NVIDIA Developer Blog.
