To help automate the creation of dance videos, NVIDIA researchers, in collaboration with the University of California, Merced, developed a deep learning-based model that can compose new dance moves that are diverse, style-consistent, and matched to the beat.
“This is a challenging but interesting generative task with the potential to assist and expand content creations in arts and sports, such as a theatrical performance, rhythmic gymnastics, and figure skating,” the NVIDIA researchers stated in a paper presented this week at the 2019 Conference on Neural Information Processing Systems (NeurIPS 2019) in Vancouver, Canada.
At the core of the work is a decomposition-to-composition framework that first learns how to move, and then learns how to compose those movements into a dance.
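The article does not spell out the mechanics of the framework, so the following is only a toy Python sketch of the decompose-then-compose idea. It assumes a dance is stored as a list of pose frames and is cut into short units at beat frames. The names `Pose`, `Unit`, `decompose`, `compose`, and `beat_frames` are hypothetical, and random sampling merely stands in for the learned generative model; none of this is the researchers' code.

```python
import random
from typing import List, Sequence

Pose = List[float]  # hypothetical: one frame of flattened keypoint coordinates
Unit = List[Pose]   # hypothetical: a short run of consecutive frames


def decompose(poses: Sequence[Pose], beat_frames: Sequence[int]) -> List[Unit]:
    """Cut a pose sequence into units at the given beat frame indices."""
    # Keep only unique cut points strictly inside the sequence, in order.
    inner = sorted(b for b in set(beat_frames) if 0 < b < len(poses))
    cuts = [0] + inner + [len(poses)]
    return [list(poses[a:b]) for a, b in zip(cuts, cuts[1:])]


def compose(units: Sequence[Unit], n_units: int, seed: int = 0) -> List[Pose]:
    """Stand-in for a learned composer: sample units and concatenate them."""
    rng = random.Random(seed)
    dance: List[Pose] = []
    for _ in range(n_units):
        dance.extend(rng.choice(units))
    return dance


if __name__ == "__main__":
    # Twelve dummy one-value "poses" with assumed beats at frames 4 and 8.
    poses = [[float(i)] for i in range(12)]
    units = decompose(poses, beat_frames=[4, 8])
    print([len(u) for u in units])        # [4, 4, 4]
    print(len(compose(units, n_units=5)))  # 20
```

In a real system the composer would be a trained model conditioned on the music rather than a random sampler; the sketch only shows how the two stages hand data to each other.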
To train the generative adversarial network used in the system, the team collected dance videos from three representative dance categories: Ballet, Zumba, and Hip-Hop. In total, the team acquired more than 361,000 clips, or approximately 71 hours, of dancing footage.
For the pose processing, the team used OpenPose, an open-source, real-time multi-person system developed by Carnegie Mellon University that can jointly detect human body, hand, facial, and foot keypoints on single images.
The model was trained using the PyTorch deep learning framework on NVIDIA V100 GPUs. Inference runs on the same GPUs used during training. In future iterations of the work, the team plans to add more dancing styles, such as pop dance and partner dance.
“Extensive qualitative and quantitative evaluations demonstrate that the synthesized dances by the proposed method are not only realistic and diverse but also style-consistent and beat-matching,” the researchers stated in their paper.
The source code and models will be published on GitHub after the conference.
This paper is one of several research projects being presented by NVIDIA Research at the NeurIPS conference this week. Overall, the NVIDIA Research team consists of more than 200 scientists around the globe, focusing on areas including AI, computer vision, self-driving cars, robotics, and graphics.