Coherency-based stereo adds new dimension to machine vision

May 1, 2005
Machine-based stereovision relies on the identification of image points that correspond to one another in left and right stereo images.

Andrew Wilson, Editor, [email protected]

Machine-based stereovision relies on identifying image points that correspond to one another in the left and right stereo images. Once these points are identified, their shifts, or disparities, can be measured with precision. In many cases, however, lighting variations, sensor background noise, camera characteristics, and optical distortions give the two stereo images different brightness distributions. Search algorithms that match image points between the left and right images must therefore be extremely fault-tolerant. Classical stereo methods fall into three groups: characteristic-based, phase-based, and correlation-based.
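
The disparities recovered this way translate directly into depth: for a rectified stereo pair, depth equals focal length times baseline divided by disparity. A minimal sketch in Python; the focal length, baseline, and disparity values are illustrative assumptions, not figures from the article:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Triangulate depth Z = f * B / d for a rectified stereo pair.

    focal_px     -- focal length in pixels
    baseline_m   -- distance between the two cameras in meters
    disparity_px -- measured shift of the point between the images, in pixels
    """
    return focal_px * baseline_m / disparity_px

# Hypothetical setup: 800-pixel focal length, 12-cm baseline, 10-pixel disparity.
z = depth_from_disparity(800.0, 0.12, 10.0)  # 9.6 m
```

Because depth varies inversely with disparity, a one-pixel matching error costs far more accuracy for distant objects than for near ones, which is why the fault tolerance of the matching step matters so much.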

In characteristic-based methods, features such as light-dark edges are extracted from the raw image data of each camera. Sometimes these corresponding characteristics cannot be matched: without prior knowledge, each light-dark edge in one image could potentially be assigned to any light-dark edge found in the other image. Feature-based methods therefore provide distance information only where such characteristics can be found.

Phase-based stereo methods are inherently subpixel-precise and are based on the Fourier transform. By Fourier-transforming the left and right images, the phase difference between them can be determined and, from it, the shift between the two images. However, when there is insufficient signal energy in any of the frequency bands lying between coarse and fine resolutions, phase-based stereo methods produce only sparse disparity maps similar to those of characteristic-based methods.
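
The Fourier shift theorem behind these methods can be sketched in a few lines: shifting a signal multiplies each DFT coefficient by a phase factor proportional to the shift, so the shift can be read back from a phase difference. A one-dimensional sketch under my own naming (real implementations combine many frequency bands rather than reading a single coefficient):

```python
import cmath
import math

def dft_coeff(signal, k):
    """k-th discrete Fourier coefficient of a real-valued signal."""
    n = len(signal)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(signal))

def shift_from_phase(left, right, k=1):
    """Estimate the shift between two signals from the phase difference of
    their k-th DFT coefficients (Fourier shift theorem). Valid only while
    the true shift stays below n / (2 * k), where the phase wraps around."""
    n = len(left)
    # Phase of the ratio is the wrapped phase difference between the signals.
    dphi = cmath.phase(dft_coeff(right, k) / dft_coeff(left, k))
    return -dphi * n / (2 * math.pi * k)
```

Because the phase of a near-zero coefficient is meaningless, the estimate is usable only at frequencies that carry enough signal energy, which is exactly the sparsity limitation described above.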

Correlation-based stereo methods perform considerably better, moving a pixel window across each image and finding the position of maximum correlation, or match, between the first and second images. To calculate a disparity for every image point, relatively large window sizes must be used. This leads to inaccuracies, especially at object edges, and is computationally expensive; as a result, correlation-based stereo methods are slower than characteristic-based ones.
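
A bare-bones version of such a correlation search, here with a sum-of-squared-differences cost along one scanline of a rectified pair (the function and variable names are my own sketch, not code from the article):

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length windows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def block_match(left_row, right_row, x, win, max_disp):
    """Find the disparity d that minimizes the SSD between a window around
    column x in the left row and the window around x - d in the right row."""
    patch = left_row[x - win : x + win + 1]
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        if x - d - win < 0:           # candidate window would leave the image
            break
        cand = right_row[x - d - win : x - d + win + 1]
        cost = ssd(patch, cand)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

The window size `win` embodies the trade-off mentioned above: larger windows stabilize the match on weakly textured regions but smear disparities across object edges.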

Rolf Henkel, a former executive director of 3D-IP (Bavaria, Germany; www.3d-ip.com), has developed a coherency-based stereo technique modeled on human perception. “People do not experience their visual surroundings from the perspective of the left or the right eye, but instead from the perspective of a virtual, cyclopean eye that lies midway between the two true eyes.” At the Center for Cognition Research at the University of Bremen (Bremen, Germany; www.iu-bremen.de), Henkel examined the neuronal processes that lead to the construction of this cyclopean eye. One by-product of his theory was the coherency-based technique, which is founded on the assumption that a single neuron in the visual cortex “sees” only a small section of the image, known as its receptive field. If the receptive fields of the neurons that see in stereo do not match, no disparities can be calculated. Unfortunately, the neurons that can calculate disparities change from image to image (see image).

Neurons obtain information about only a small area of the outside world, the receptive field. If the receptive fields of the neurons that see in stereo do not match (red), no disparities can be calculated. Neurons that can calculate disparities (green) change from image to image. In dynamic coherence detection, correct data are filtered and the distance to the objects calculated.

Because of this, each group of neurons that calculates distances splits into two subgroups. The first determines the distance to the object more or less correctly. The second produces some random estimated value because of its restricted receptive fields. The correctly encoding neurons can be found easily, because they agree on the estimated value, which is the calculated distance to the object. This agreement makes them coherent, unlike their colleagues.
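
The idea of isolating the coherent subgroup can be sketched as a simple voting scheme: among many disparity estimates, most of them noise, find the value on which the largest subgroup agrees within some tolerance. This is my own minimal illustration of the principle, not Henkel's actual network dynamics:

```python
def coherent_estimate(estimates, tol=0.5):
    """Return (value, support): the mean of the largest cluster of estimates
    that agree within `tol`, and the size of that cluster. Estimates outside
    the cluster are treated as incoherent noise."""
    best_val, best_support = None, 0
    for center in estimates:
        cluster = [e for e in estimates if abs(e - center) <= tol]
        if len(cluster) > best_support:
            best_support = len(cluster)
            best_val = sum(cluster) / len(cluster)
    return best_val, best_support

# Four estimators agree near a disparity of 2; the rest are random outliers.
val, support = coherent_estimate([2.1, 2.0, 1.9, 7.3, -4.0, 2.05, 9.9])
```

In the neural reading, the cluster members are the coherently responding neurons; everything outside the tolerance belongs to the second, randomly estimating subgroup.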

In dynamic coherence detection, the correct data are filtered out and the distance to the objects in the scene is calculated. If the input signals from the coherent neuron groups are displayed as a function of solid angle, the result is a new view of the scene. Unlike conventional methods, all synaptic connections between the neurons and their detailed mode of operation are known and can be simulated and verified. Henkel and his colleagues have implemented the coherency-based stereo technique on an FPGA-based add-in board used in conjunction with the microEnable 2 from Silicon Software (Mannheim, Germany; www.silicon-software.com), which provides a PC interface for transferring the stereo disparity-map data to the PC.
