AI Enables Markerless Animal Tracking

Researchers from Harvard University and collaborators at other academic institutions developed DeepLabCut, a deep learning-based method that automatically tracks and labels the body parts of moving animals with human-like accuracy.

“Videography provides easy methods for the observation and recording of animal behavior in diverse settings, yet extracting particular aspects of a behavior for further analysis can be highly time-consuming,” the researchers stated in their paper. “We present an efficient method for markerless pose estimation based on transfer learning with deep neural networks that achieves excellent results with minimal training data.”

Using NVIDIA GeForce GTX 1080 Ti and NVIDIA TITAN Xp GPUs with the cuDNN-accelerated TensorFlow deep learning framework, the team took neural networks pretrained on the ImageNet dataset and fine-tuned them to perform pose estimation and body part detection, using only a few hundred labeled frames.

“We demonstrate the versatility of this framework by tracking various body parts in multiple species across a broad collection of behaviors. Remarkably, even when only a small number of frames are labeled (~200), the algorithm achieves excellent tracking performance on test frames that is comparable to human accuracy,” the team said.
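The article does not say how "comparable to human accuracy" is measured. One common choice for pose estimation is the average distance, in pixels, between the predicted position of a body part and the position a human labeler marked. The sketch below illustrates that idea only; the function name `mean_pixel_error` and the coordinates are hypothetical and are not taken from the paper or the DeepLabCut code.

```python
import math


def mean_pixel_error(predicted, labeled):
    """Average Euclidean distance (in pixels) between matching (x, y) points.

    Both arguments are lists of (x, y) tuples in the same order,
    one entry per labeled body part (hypothetical data layout).
    """
    if len(predicted) != len(labeled):
        raise ValueError("predicted and labeled must have the same length")
    if not predicted:
        raise ValueError("need at least one point")
    distances = [math.dist(p, q) for p, q in zip(predicted, labeled)]
    return sum(distances) / len(distances)


# Made-up coordinates for illustration only (not data from the paper).
human_labels = [(10.0, 20.0), (30.0, 40.0)]
network_output = [(13.0, 24.0), (30.0, 40.0)]

error = mean_pixel_error(network_output, human_labels)
print(error)  # 2.5: the first point is off by 5 pixels, the second by 0
```

Under this kind of metric, "human-level" would mean the network's error on held-out frames is about as small as the disagreement between two people labeling the same frames.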

Rat skilled reaching assay from Dr. Daniel Leventhal’s group at the University of Michigan. The data were collected during an automated pellet reaching task and labeled by Dr. Daniel Leventhal; 180 labeled frames were used for training.

The toolbox has been demonstrated on mice and Drosophila. However, the researchers said the framework has no built-in limitations and can be applied to other organisms.

Tracking animals via motion capture can reveal new clues about their biomechanics and offer a glimpse into how their brains work. In humans, motion capture and tracking can aid in physical therapy and help athletes achieve records that were unimaginable in the past.

“This solution requires no computational body-model, stick figure, time-information, or sophisticated inference algorithm,” the researchers said. “Thus, it can also be quickly applied to completely different behaviors that pose qualitatively distinct challenges to computer vision, like skilled reaching or egg-laying in Drosophila.”

One case study shows the project implemented on a horse.

For this video, DeepLabCut was first trained on a different horse, then briefly retrained after adding only 11 labeled frames of Justify on a race track, and finally used to label the full video automatically. Video from Byron Rogers of Performance Genetics.

The code is available on GitHub and the paper was recently published in Nature.  
