Object detection and classification in imagery using deep neural networks (DNNs) and convolutional neural networks (CNNs) is a well-studied area. For some applications, these AI approaches are considered reliable enough for production use with minimal intervention. Popular methods include YOLO, SSD, Faster R-CNN, MobileNet, RetinaNet, and others.
In most application contexts, imagery is collected from an egocentric viewpoint (such as a mobile phone camera), with most objects aligned vertically (a person) or horizontally (a car). Most objects in the image can therefore be treated as axis-aligned and described by four bounding box parameters: xmin, ymin, width, and height.
However, there are many cases where objects or features are not aligned to the image axes. In those cases, the four parameters cannot describe the object outline precisely.
For example, try to describe a square that has been rotated by 45° using the four bounding box parameters. The tightest axis-aligned bounding box has twice the area of the square you are attempting to describe, so half of the box covers background rather than the object.
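The factor of two can be checked numerically. The following is a minimal sketch in plain Python, not code from the toolkit; the function names (`rotated_square_corners`, `axis_aligned_bbox`, `bbox_to_object_area_ratio`) are hypothetical and chosen only for illustration. It rotates a square about its center, fits the tightest axis-aligned box around the corners, and compares the two areas.

```python
import math


def rotated_square_corners(side, angle_deg, center=(0.0, 0.0)):
    """Return the four corners of a square rotated about its center."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half = side / 2.0
    cx, cy = center
    corners = []
    for dx, dy in [(-half, -half), (half, -half), (half, half), (-half, half)]:
        # Standard 2D rotation of the corner offset, then shift to the center.
        x = cx + dx * cos_t - dy * sin_t
        y = cy + dx * sin_t + dy * cos_t
        corners.append((x, y))
    return corners


def axis_aligned_bbox(points):
    """Return (xmin, ymin, width, height) of the tightest axis-aligned box."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    xmin, ymin = min(xs), min(ys)
    return xmin, ymin, max(xs) - xmin, max(ys) - ymin


def bbox_to_object_area_ratio(side, angle_deg):
    """Ratio of the axis-aligned box area to the area of the square itself."""
    _, _, width, height = axis_aligned_bbox(rotated_square_corners(side, angle_deg))
    return (width * height) / (side * side)


if __name__ == "__main__":
    for angle in (0, 15, 30, 45):
        print(angle, round(bbox_to_object_area_ratio(1.0, angle), 3))
```

For a square of side s rotated by 45°, the box has width and height s√2, so its area is 2s². At 0° the ratio is 1, and it grows steadily as the rotation approaches 45°.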
Read the full Developer Blog, Detecting Rotated Objects Using the NVIDIA Object Detection Toolkit.