Drawing inspiration from how humans interact with objects through touch, University of California, Berkeley researchers developed a deep learning-based perception framework that can recognize 98 different objects by touch alone. According to the team, this is the first project to address this type of robot-object interaction using only touch at such a large scale.
“When we see a soft toy, we imagine what our fingers would feel touching the soft surface; when we feel the edge of the scissors, we can picture them in our mind,” the scientists stated in their paper. “In this work, we study how similar multi-modal associations can be learned by a robotic manipulator. We frame this problem as one of cross-modality instance recognition: recognizing that a tactile observation and a visual observation correspond to the same object instance.”
Using high-resolution touch sensing together with NVIDIA TITAN X and GeForce GTX 1080 GPUs and the cuDNN-accelerated TensorFlow deep learning framework, the team trained and tested a convolutional neural network for multi-modal association on more than 33,000 images.
“We train a convolutional network to take in the tactile readings from two GelSight sensors which are mounted on the fingers of a parallel jaw gripper, as well as an image of an object from a camera, and predict whether these inputs come from the same object or not,” the researchers stated.
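The setup above can be sketched in code. The snippet below is a minimal, illustrative stand-in, not the authors' actual network: the learned convolutional encoders are replaced by fixed random projections, and the shapes, dimensions, and function names are assumptions made up for the example. It only demonstrates the input/output contract the researchers describe: two tactile readings plus one camera image go in, and a same-object score comes out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative shapes and sizes (not taken from the paper).
TACTILE_SHAPE = (64, 64, 3)   # one GelSight reading, treated as an image
VISUAL_SHAPE = (64, 64, 3)    # camera image of the object
EMBED_DIM = 32

# Stand-ins for learned convolutional encoders: fixed random linear
# projections mapping a flattened image into a shared embedding space.
W_tactile = rng.normal(size=(int(np.prod(TACTILE_SHAPE)), EMBED_DIM))
W_visual = rng.normal(size=(int(np.prod(VISUAL_SHAPE)), EMBED_DIM))

def encode(image, weights):
    """Flatten an image and project it into the embedding space."""
    return image.reshape(-1) @ weights

def match_score(tactile_left, tactile_right, visual):
    """Score whether the two tactile readings (one per gripper finger)
    and the camera image come from the same object: cosine similarity
    between the averaged tactile embedding and the visual embedding,
    squashed through a sigmoid into (0, 1)."""
    t = (encode(tactile_left, W_tactile) + encode(tactile_right, W_tactile)) / 2
    v = encode(visual, W_visual)
    cos = t @ v / (np.linalg.norm(t) * np.linalg.norm(v) + 1e-8)
    return 1.0 / (1.0 + np.exp(-cos))

# Dummy inputs in place of real sensor data.
left = rng.random(TACTILE_SHAPE)
right = rng.random(TACTILE_SHAPE)
img = rng.random(VISUAL_SHAPE)
score = match_score(left, right, img)
print(f"same-object score: {score:.3f}")  # a value in (0, 1)
```

In the actual system, the random projections would be convolutional networks trained end-to-end on labeled same/different pairs, with the GPUs accelerating that training.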
The GPU-accelerated network also helps determine whether a grasp was successful, the researchers said.
The model can be used to confirm if an object image corresponds to a tactile reading, or to recognize object instances by touch.
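Given a same-object score of this kind, recognizing an instance by touch reduces to comparing one tactile observation against an image of each candidate object and picking the best match. The sketch below illustrates that ranking step only; `pseudo_embed`, `same_object_score`, and all shapes are hypothetical stand-ins for the trained model, invented for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
EMBED_DIM = 16

def pseudo_embed(obs, seed):
    """Stand-in for a learned encoder: a fixed random projection."""
    w = np.random.default_rng(seed).normal(size=(obs.size, EMBED_DIM))
    return obs.reshape(-1) @ w

def same_object_score(tactile, image):
    """Toy analogue of the network's same-object probability."""
    t = pseudo_embed(tactile, seed=42)
    v = pseudo_embed(image, seed=43)
    cos = t @ v / (np.linalg.norm(t) * np.linalg.norm(v) + 1e-8)
    return 1.0 / (1.0 + np.exp(-cos))

def recognize_by_touch(tactile, candidate_images):
    """Return the index of the candidate image with the highest
    same-object score for the given tactile reading."""
    scores = [same_object_score(tactile, img) for img in candidate_images]
    return int(np.argmax(scores)), scores

# Dummy data: one tactile reading, five candidate object images.
tactile = rng.random((32, 32, 3))
candidates = [rng.random((32, 32, 3)) for _ in range(5)]
best, scores = recognize_by_touch(tactile, candidates)
print(f"best-matching candidate: {best}")
```

The same scoring function supports the reverse check too: given one tactile reading and one image, a score near 1 confirms that they correspond to the same object.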
The researchers hope to extend their framework so that, one day, it can help robots in warehouses retrieve objects from product images by feeling for them on shelves. Robots in a home environment could also use it to retrieve objects from hard-to-reach places.
The work was recently published on arXiv.