Fast INT8 Inference for Autonomous Vehicles with TensorRT 3

Autonomous driving demands safety and a high-performance computing solution that processes sensor data with extreme accuracy. Researchers and developers creating deep neural networks (DNNs) for self-driving must optimize their networks to ensure low-latency inference and energy efficiency. Thanks to a new Python API in NVIDIA TensorRT, this process just became easier … Read more

CUTLASS: Fast Linear Algebra in CUDA C++

Matrix multiplication is a key computation within many scientific applications, particularly those in deep learning. Many operations in modern deep neural networks are either defined as matrix multiplications or can be cast as suc … Read more

RESTful Inference with the TensorRT Container and NVIDIA GPU Cloud

Once you have built, trained, tweaked, and tuned your deep learning model, you need an inference solution that you can deploy to a datacenter or to the cloud, and you need to get the maximum possible performance. You may have heard that NVIDIA TensorRT can maximize inference performance on NVIDIA GPUs, but … Read more


NVIDIA is headed to NIPS (Neural Information Processing Systems) and we can’t wait to show you our latest AI innovations. Visit our booth (#109) to see cutting-edge technology in action and meet with 40+ members of our AI team focused on research, applied engineering, and solutions engineering … Read more