Computer vision systems decipher outdoor scenes
A new method devised by computer scientists at Carnegie Mellon University's Robotics Institute enables computers to gain a deeper understanding of an image by reasoning about the physical constraints of the scene. In much the same way that a child might use a set of toy building blocks to assemble something that looks like a building depicted on the cover of the toy set, the computer would analyze an outdoor scene by using virtual blocks to build a 3-D approximation of the image that makes sense based on volume and mass.
“When people look at a photo, they understand that the scene is geometrically constrained,” says Abhinav Gupta, a post-doctoral fellow at the Robotics Institute. “We know that buildings are not infinitely thin, that most towers do not lean, and that heavy objects require support. It might not be possible to know the 3-D size and shape of all the objects in the photo, but we can narrow the possibilities. In the same way, if a computer can replicate an image, block by block, it can better understand the scene.”
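The kind of physical reasoning Gupta describes can be illustrated with a toy example. The sketch below is not the Robotics Institute's system; it is a minimal, hypothetical stability check over a stack of blocks, rejecting configurations in which a block's center of mass is not supported by the block beneath it. All names (`Block`, `is_stable`) are invented for illustration.

```python
# Illustrative sketch only -- NOT the CMU method. It shows the flavor of
# physical-plausibility reasoning described above: a candidate stack of
# blocks is rejected if any block's center of mass is unsupported.
from dataclasses import dataclass

@dataclass
class Block:
    x: float       # left edge of the block's footprint
    width: float

    @property
    def center(self) -> float:
        # Horizontal position of the block's center of mass
        return self.x + self.width / 2.0

def is_stable(stack: list) -> bool:
    """Return True if each block's center of mass lies over the
    footprint of the block below it. The bottom block rests on the
    ground and is always considered supported."""
    for below, above in zip(stack, stack[1:]):
        if not (below.x <= above.center <= below.x + below.width):
            return False
    return True

# A plausible tower: every block is supported by the one beneath it.
tower = [Block(0, 4), Block(1, 2), Block(1.5, 1)]
# An implausible one: the top block overhangs its support entirely.
leaning = [Block(0, 4), Block(5, 2)]

print(is_stable(tower))    # True
print(is_stable(leaning))  # False
```

A full system would of course reason about 3-D volume, mass, and occlusion jointly rather than applying a single geometric test, but the same idea applies: candidate interpretations that violate physical constraints are pruned away.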
Gupta’s approach to automated scene analysis could eventually help computers understand the objects in a scene, the spaces between them, and what might lie behind areas obscured by foreground objects. That level of detail would be important, for instance, if a robot needed to plan a walking route.
Gupta presented the research, which he conducted with Alexei A. Efros, associate professor of robotics and computer science, and Robotics Professor Martial Hebert, at the European Conference on Computer Vision, Sept. 5-11 in Crete, Greece.
SOURCE: Carnegie Mellon University
Posted by Conard Holton, Vision Systems Design