Transforming Paintings and Photos Into Animations With AI

Researchers from the University of Washington and Facebook recently released a paper describing a deep learning-based system that can transform still images and paintings into animations. The algorithm, called Photo Wake-Up, uses a convolutional neural network to animate a person or character in 3D from a single still image.

“Our method works with a large variety of whole-body, fairly frontal photos, ranging from sports photos to art, and posters,” the researchers stated in their paper. “In addition, the user is given the ability to edit the human in the image, view the reconstruction in 3D, and explore it in AR.”

To demonstrate the power of the algorithm, the team used images of graffiti, cartoon characters, NBA star Stephen Curry, and Picasso paintings.

At the crux of the work is a unique approach that allows the researchers to more closely warp a 2D cutout of a person in a still image, enabling the algorithm to produce a realistic 3D animated mesh that matches the character in the image.

Using NVIDIA TITAN GPUs and the cuDNN-accelerated PyTorch deep learning framework, the researchers based their software on a parametric human body model called SMPL, which was first developed by a team at the Max Planck Institute for Intelligent Systems in Germany.

Overview of the method. Given a photo, person detection, 2D pose estimation, and person segmentation are performed using off-the-shelf algorithms. Then, an SMPL template model is fit to the 2D pose and projected into the image as a normal map and a skinning map. The core of our system is: find a mapping between the person’s silhouette and the SMPL silhouette, warp the SMPL normal/skinning maps to the output, and build a depth map by integrating the warped normal map. This process is repeated to simulate the model’s back view, and the depth and skinning maps are combined to create a complete, rigged 3D mesh. The mesh is then textured and animated using motion capture sequences on an inpainted background.
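The "build a depth map by integrating the warped normal map" step can be illustrated with a minimal sketch. It assumes a dense normal map with the surface facing the camera and uses naive path integration (across the top row, then down each column). The paper's actual solver and its boundary handling are not described here, so the function name and approach below are illustrative assumptions only.

```python
import numpy as np

def depth_from_normals(normals):
    """Naively integrate an (H, W, 3) normal map into an (H, W) depth map.

    Hypothetical helper for illustration; not the paper's solver.
    Assumes each normal is (nx, ny, nz) with nz > 0 (facing the camera).
    """
    nx, ny, nz = normals[..., 0], normals[..., 1], normals[..., 2]
    nz = np.clip(nz, 1e-6, None)   # avoid division by zero at grazing angles
    p = -nx / nz                   # surface slope dz/dx
    q = -ny / nz                   # surface slope dz/dy

    depth = np.zeros(p.shape)
    # Integrate dz/dx along the top row, fixing depth[0, 0] = 0.
    depth[0, 1:] = np.cumsum(p[0, 1:])
    # Integrate dz/dy down every column, starting from the top row.
    depth[1:, :] = depth[0, :] + np.cumsum(q[1:, :], axis=0)
    return depth

# Usage: a tilted plane z = 0.5*x - 0.25*y has the constant normal (-0.5, 0.25, 1).
n = np.array([-0.5, 0.25, 1.0])
n /= np.linalg.norm(n)
plane_normals = np.tile(n, (4, 5, 1))
plane_depth = depth_from_normals(plane_normals)
```

Path integration like this accumulates error on noisy normals, which is why practical systems usually solve a least-squares problem over the whole image instead.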

As shown in the graphic above, the software first segments the human body from the image and superimposes a 3D mesh onto the shape. The mesh can then be animated to bring the photo or painting to life.
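To give a sense of how a rigged mesh is animated, here is a minimal sketch of linear blend skinning, the standard scheme that SMPL builds on. Each vertex is moved by a weighted mix of per-joint rigid transforms. The function name, array layouts, and toy data are assumptions for illustration, not the researchers' code.

```python
import numpy as np

def linear_blend_skinning(vertices, weights, transforms):
    """Pose a mesh with linear blend skinning (illustrative sketch).

    vertices:   (N, 3) rest-pose vertex positions
    weights:    (N, J) skinning weights, each row summing to 1
    transforms: (J, 4, 4) homogeneous transform for each joint
    """
    ones = np.ones((len(vertices), 1))
    homo = np.concatenate([vertices, ones], axis=1)          # (N, 4)
    # Apply every joint transform to every vertex: result is (N, J, 4).
    per_joint = np.einsum("jab,nb->nja", transforms, homo)
    # Blend the per-joint results using the skinning weights: (N, 4).
    blended = np.einsum("nj,nja->na", weights, per_joint)
    return blended[:, :3]

# Usage: two joints; the second joint translates by 2 units along z.
rest = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
joint_transforms = np.stack([np.eye(4), np.eye(4)])
joint_transforms[1, :3, 3] = [0.0, 0.0, 2.0]
skin_weights = np.array([[1.0, 0.0], [0.5, 0.5]])
posed = linear_blend_skinning(rest, skin_weights, joint_transforms)
```

Driving the joint transforms from a motion capture sequence, frame by frame, is what turns the static mesh into an animation.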

“We believe the method not only enables new ways for people to enjoy and interact with photos but also suggests a pathway to reconstructing a virtual avatar from a single image while providing insight into the state of the art of human modeling from a single photo,” the researchers said.

The work was recently published on arXiv and is also available on the team’s website.
