Researchers from the University of California, Berkeley have developed a deep learning-based method that creates a 3D reconstruction of an object from a single 2D color image.
“Humans have the ability to effortlessly reason about the shapes of objects and scenes even if we only see a single image,” said Christian Häne of the Berkeley Artificial Intelligence Research lab. “The question which immediately arises is how are humans able to reason about geometry from a single image? And in terms of artificial intelligence: how can we teach machines this ability?”
The researchers exploit the fact that surfaces are inherently two-dimensional: using convolutional neural networks, they hierarchically predict fine-resolution voxels only where the low-resolution prediction suggests a surface is present. The key difference in their method, called hierarchical surface prediction (HSP), is that it separates the voxels into three categories: occupied space, free space, and boundary. This lets them analyze the output at low resolution and predict higher resolutions only for the parts of the volume where there is evidence that they contain the surface.
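To make the idea concrete, here is a minimal, hypothetical sketch of the coarse-to-fine refinement scheme. It is not the authors' implementation: the `classify` function stands in for a learned CNN prediction head, and the thresholds and grid sizes are illustrative assumptions. The point it demonstrates is that only cells labeled as boundary are carried forward for fine-resolution prediction, while fully free or fully occupied cells are resolved at coarse resolution.

```python
import numpy as np

# Three-way voxel labels used by hierarchical surface prediction (HSP).
FREE, BOUNDARY, OCCUPIED = 0, 1, 2

def classify(block):
    """Label a voxel block by its mean occupancy.
    (Stand-in for a learned CNN classifier; thresholds are illustrative.)"""
    occ = block.mean()
    if occ < 0.1:
        return FREE
    if occ > 0.9:
        return OCCUPIED
    return BOUNDARY

def hierarchical_predict(volume, coarse=4):
    """Split the volume into a coarse grid of cells, classify each cell,
    and keep fine-resolution data only for cells labeled BOUNDARY."""
    n = volume.shape[0]
    step = n // coarse
    labels = np.empty((coarse, coarse, coarse), dtype=int)
    refined = {}
    for i in range(coarse):
        for j in range(coarse):
            for k in range(coarse):
                block = volume[i*step:(i+1)*step,
                               j*step:(j+1)*step,
                               k*step:(k+1)*step]
                lab = classify(block)
                labels[i, j, k] = lab
                if lab == BOUNDARY:
                    # Only boundary cells get a fine-resolution prediction.
                    refined[(i, j, k)] = block.copy()
    return labels, refined

# Toy ground-truth volume: a solid sphere of occupied voxels.
n = 16
idx = np.indices((n, n, n)) - (n - 1) / 2.0
volume = (np.sqrt((idx ** 2).sum(axis=0)) < 6.0).astype(float)

labels, refined = hierarchical_predict(volume)
print("boundary cells refined:", len(refined), "of", labels.size)
```

Because the sphere's interior and the empty corners are decided at coarse resolution, only the shell of boundary cells is refined, which is where the memory and compute savings of the hierarchical approach come from.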
Using Quadro M6000, Tesla K80, and TITAN X GPUs with the cuDNN-accelerated Torch deep learning framework, the researchers trained their neural networks on the synthetic ShapeNet dataset, which consists of computer-aided design (CAD) models of objects such as airplanes, chairs, and cars.