Two of the biggest challenges in deploying an AI-based application are model accuracy and the ability to extract insights in real time. There is a trade-off between the two: making a model more accurate typically makes it larger, which reduces inference throughput.
This post series addresses both challenges. In part 1, you train an accurate deep learning model using a large public dataset and PyTorch. Then, you optimize the RetinaNet model and run inference with TensorRT and NVIDIA DeepStream.
In this post, you learn how to train a RetinaNet network with a ResNet34 backbone for object detection. ResNet34 provides good accuracy while remaining small enough for real-time inference at the edge. We provide a step-by-step guide covering pulling the container, preparing the dataset, tuning the hyperparameters, and training the model. Training uses mixed precision, a technique that dramatically accelerates training without compromising accuracy.
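The mixed-precision training pattern mentioned above can be sketched with PyTorch's automatic mixed precision (AMP) API. This is a minimal illustration of the technique, not the training code from the post; the tiny model and random data are stand-ins for RetinaNet and the real dataset, and the example falls back to CPU when no GPU is present.

```python
import torch
from torch import nn

# Stand-in model and data; the actual post trains RetinaNet with a ResNet34 backbone.
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# GradScaler rescales the loss to avoid FP16 underflow; it is a no-op on CPU.
use_cuda = torch.cuda.is_available()
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)

device_type = "cuda" if use_cuda else "cpu"
x = torch.randn(8, 4)
y = torch.randn(8, 2)

optimizer.zero_grad()
# autocast runs the forward pass in a lower-precision dtype where it is safe.
with torch.autocast(device_type=device_type,
                    dtype=torch.float16 if use_cuda else torch.bfloat16):
    loss = nn.functional.mse_loss(model(x), y)

scaler.scale(loss).backward()  # backward on the (possibly) scaled loss
scaler.step(optimizer)         # unscales gradients, then steps the optimizer
scaler.update()                # adjusts the scale factor for the next iteration
```

The key point is that only the forward pass runs under `autocast`; the optimizer step goes through the scaler so gradients are unscaled back to FP32 before the weight update.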
Read the blog, Building a Real-time Redaction App Using NVIDIA DeepStream, Part 1: Training in its entirety here.
Part 2 discusses how to build and deploy a real-time, AI-based application. The model is deployed on an NVIDIA Jetson AGX Xavier edge device using the DeepStream SDK to redact faces on multiple video streams in real time.
In this post, you take the trained ONNX model from part 1 and deploy it on an edge device. We explain how to deploy on a Jetson AGX Xavier device using the DeepStream SDK, but you can deploy on any NVIDIA-powered device, from embedded Jetson modules to large data center GPUs such as the T4.
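In DeepStream, pointing the pipeline at a trained ONNX model is done through the `nvinfer` element's configuration file. The fragment below is a minimal sketch of such a config; the file names are placeholders (not from the original posts), and `network-mode=2` selects FP16 inference via TensorRT.

```ini
# Sketch of an nvinfer (primary inference) config for an ONNX detector.
# File names below are assumed placeholders, not the series' actual files.
[property]
gpu-id=0
onnx-file=retinanet_resnet34.onnx
model-engine-file=retinanet_resnet34.onnx_b1_gpu0_fp16.engine
labelfile-path=labels.txt
batch-size=1
network-mode=2   ; 0=FP32, 1=INT8, 2=FP16
num-detected-classes=1
```

On first run, DeepStream invokes TensorRT to build an optimized engine from the ONNX file and caches it at `model-engine-file`, so subsequent launches skip the (slow) engine build.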
The purpose of this post is to acquaint you with the available NVIDIA resources on training and deploying deep learning applications.
Read the blog, Building a Real-time Redaction App Using NVIDIA DeepStream, Part 2: Deployment in its entirety here.
For more technical how-to’s and tutorials, visit the NVIDIA Developer Blog.