Maximizing Unified Memory Performance in CUDA

Many of today’s applications process large volumes of data. While GPU architectures have very fast HBM or GDDR memory, they have limited capacity. Making the most of GPU performance requires the data to be as close to the GPU as possib … Read more

Simplify the Deployment of HPC Applications with NVIDIA GPU Cloud

NVIDIA announced the availability of HPC application and visualization containers on NVIDIA GPU Cloud (NGC). Together, these offerings make NGC a single source for researchers seeking access to deep learning frameworks, HPC applications, and visualization tools essential for their scientific workflows … Read more

A New STAC-A2 Record

The results are in, and GPUs are still the fastest solution on the planet for financial risk management. This is according to the latest STAC-A2 audited test results … Read more

Connect with NVIDIA at Supercomputing 2017

Join NVIDIA at SC17 in Denver, Colorado to learn how GPU-accelerated computing is changing the very definition of the word "possible." Humanity’s moonshots, like understanding the most fundamental laws of physics, breakthroughs in drug development, and sustainable energy, are being achieved right now … Read more

7 Things You Might Not Know about Numba

Numba is a Python compiler from Anaconda that can compile Python code for execution on CUDA-capable GPUs or multicore CPUs. Numba allows automatic just-in-time (JIT) compilation of Python functions, which can provide orders-of-magnitude speedups for Python and NumPy data processing … Read more

Flexible CUDA Thread Programming

In efficient parallel algorithms, threads cooperate and share data to perform collective computations. To share data, the threads must synchronize. The granularity of sharing varies from algorithm to algorithm, so thread synchronization should be flexible. Making synchronization an explicit part of the program ensures safety, maintainability, and modularity. CUDA 9 introduces Cooperative Groups, which aims to … Read more

PGI 17.7 Delivers OpenACC and CUDA Fortran for Volta GPUs

PGI compilers & tools are used by scientists and engineers who develop applications for high-performance computing (HPC) systems. They deliver world-class multicore CPU performance, an easy on-ramp to GPU computing with OpenACC directives, and performance portability across all major HPC platforms … Read more