TensorRT is a high-performance deep learning inference platform that delivers low latency and high throughput on NVIDIA GPUs for applications such as recommender systems, speech recognition, and image/video analysis. It includes parsers to import models and plugins to support novel ops and layers before applying optimizations for inference. Today NVIDIA is open sourcing the parsers and plugins in TensorRT so that the deep learning community can customize and extend these components and take advantage of powerful TensorRT optimizations in their applications.
NVIDIA is a strong supporter of the open source community, with over 120 repositories available on our GitHub page, over 1,500 contributions to deep learning projects by our deep learning frameworks team, and many large-scale open source projects such as RAPIDS, NVIDIA DIGITS, NCCL, TensorRT Inference Server, and now TensorRT.
Examples of how you can contribute:
- Extend the ONNX and Caffe parsers to import models with novel ops into TensorRT
- Plugins enable you to run custom ops in TensorRT. Use the open-sourced plugins as references, or build new plugins to support new layers and share them with the community
- Samples provide a starting point for your inference apps; contribute samples that cover new workflows and pipelines
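As a concrete illustration of the parser workflow described above, the sketch below imports an ONNX model with the now open-sourced ONNX parser and builds an engine. It is a minimal sketch, not an official sample: it assumes the `tensorrt` Python package is installed on a machine with an NVIDIA GPU, and `"model.onnx"` is a placeholder path you would replace with your own network.

```python
def build_engine_from_onnx(onnx_path):
    """Parse an ONNX file with TensorRT's ONNX parser and build an engine (sketch)."""
    # Imported lazily so the sketch can be read/loaded without TensorRT installed;
    # actually calling this function requires the `tensorrt` package and a GPU.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network()
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            # With the parser sources now open, unsupported-op errors like these
            # are something the community can fix or extend directly.
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            return None

    return builder.build_cuda_engine(network)
```

A plugin for a custom op would slot into the same flow: the parser maps an unrecognized layer to your plugin, and the builder optimizes the network around it.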
The TensorRT GitHub repo is located here and includes contribution guidelines describing how you can get involved. We welcome contributions from the community to all of these components. NVIDIA will merge and ship the latest code with each TensorRT release.
Our goal is to get new features into the community's hands quickly and to make it easier for you to contribute back.
Learn how to get started with TensorRT in the new NVIDIA Developer Blog post, “How to Speed Up Deep Learning Inference Using TensorRT”.
You can get TensorRT from the TensorRT product page, download the newly released source from GitHub, or get the compiled solution in a ready-to-deploy container from the NGC container registry.