Many machine-vision, medical-imaging, and security-imaging systems use edge-detection algorithms to locate parts, isolate areas of interest, or track moving objects. To perform these tasks, several available software packages use standard algorithms and techniques such as the Hough transform or least-squares error-fitting methods. Although these algorithms are capable of finding edges within images, they are computationally intensive, and the detected edge data may require further processing to obtain the required results.
In contrast, the human visual system recognizes shapes easily and tracks moving objects seemingly without effort. Indeed, recent research from the University of California (Berkeley, CA) has shown that in the human retina, one group of ganglion cells sends signals only when a moving edge is detected, another group responds to large uniform areas, and yet another detects only the area surrounding a figure. Each representation emphasizes a different feature of the visual world, such as an edge, a blob, or a movement, without any prior knowledge of the scene's content (see Vision Systems Design, May 2001, p. 8).
Using a 500-MHz PC, sequential 314 × 231-pixel images are captured and processed, and an image showing both the curve partitioning points and the generic curve segments is displayed. The moving object is then defined by grouping all the moving generic curve segments. The motion-detection process executes at a speed equivalent to a camera rate of 500 frames/s.
"To extract such perceptual features," says Alan Parslow, chief executive officer of Deep Vision Inc. (Dartmouth, Nova Scotia, Canada), "it is necessary to understand the shape consistency, continuation, and similarity of regions within an image." In motion-detection and tracking applications, algorithms must also understand which image properties are conserved between consecutive frames and what mechanisms can infer motion. Parslow and his colleagues have used a mathematically elegant approach that uses perceptual processing and grouping to create perceptual edge segments and group them into traces. "In operation, the method used only needs to process a small part of the image where edges are most likely to be found," says Parslow.
In conventional image partitioning, curves are segmented at tangent discontinuities. In the method used by Deep Vision, partitions are instead based on discontinuities in tangent monotonicity. Partitioning in this way yields a number of simple generic curve segments (GCSs), each with its own unique descriptive characteristics, that can be represented by a set of binary functions. To derive shape descriptors, both the GCSs and the curve partitioning points (CPPs) that define them are modeled mathematically.
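Deep Vision has not published its GCS rules or binary functions, but the underlying idea of splitting a curve wherever the tangent angle stops rising or stops falling can be sketched briefly. The Python below is an illustrative approximation only; the function name, the trend test, and the treatment of zero steps are assumptions, not the company's implementation.

```python
import numpy as np

def partition_curve(points):
    """Split an ordered edge curve into generic curve segments (GCSs)
    at points where the tangent angle's monotonicity changes.

    points: (N, 2) array-like of ordered (x, y) edge coordinates.
    Returns a list of (start, end) index ranges, one per GCS; the
    shared boundary indices act as curve partitioning points (CPPs).
    """
    points = np.asarray(points, dtype=float)
    d = np.diff(points, axis=0)                      # steps along the curve
    theta = np.unwrap(np.arctan2(d[:, 1], d[:, 0]))  # tangent angle per step
    trend = np.sign(np.diff(theta))                  # +1 rising, -1 falling

    segments, start = [], 0
    for i in range(1, len(trend)):
        # A reversal of trend breaks tangent monotonicity -> CPP here
        if trend[i] != 0 and trend[i - 1] != 0 and trend[i] != trend[i - 1]:
            segments.append((start, i + 1))
            start = i + 1
    segments.append((start, len(points)))
    return segments
```

On a smooth curve, the CPPs produced this way land at inflection points, where the curve switches from turning left to turning right, which is what distinguishes monotonicity-based partitioning from the conventional approach of cutting only at sharp tangent discontinuities.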
Captured images are raster scanned to find an initial edge point. Once an edge point is found, the algorithm tracks the curve for as long as the edge direction and edge-pixel proximity increase or decrease monotonically, as governed by one of the GCS rules. When tracking encounters a CPP, the current GCS is complete and is recorded, and tracking resumes on the next segment. Scanning and tracking continue until no more edge points remain and the complete image has been covered.
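The scan-and-track loop can be approximated in the same illustrative spirit. Here `breaks_monotonicity` is a stand-in for the unpublished GCS rules: it merely flags a reversal in the trend of the step direction, and the neighbour scan assumes 8-connected edge pixels.

```python
import numpy as np

# 8-connected neighbour offsets, scanned in a fixed order (an assumption)
NBRS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
        (0, 1), (1, -1), (1, 0), (1, 1)]

def breaks_monotonicity(seg, nxt):
    """Stand-in for the GCS rules: stop when the trend of the step
    direction reverses (angle wrap-around is ignored in this sketch)."""
    if len(seg) < 3:
        return False
    d0 = np.arctan2(seg[-2][0] - seg[-3][0], seg[-2][1] - seg[-3][1])
    d1 = np.arctan2(seg[-1][0] - seg[-2][0], seg[-1][1] - seg[-2][1])
    d2 = np.arctan2(nxt[0] - seg[-1][0], nxt[1] - seg[-1][1])
    return np.sign(d2 - d1) * np.sign(d1 - d0) < 0   # trend reversal

def trace_segments(edges):
    """Raster-scan a binary edge map and track each curve into GCSs."""
    visited = np.zeros_like(edges, dtype=bool)
    segments = []
    for r, c in zip(*np.nonzero(edges)):             # raster order
        if visited[r, c]:
            continue
        seg = [(r, c)]                               # start a new curve
        visited[r, c] = True
        while True:
            y, x = seg[-1]
            nxt = next(((y + dr, x + dc) for dr, dc in NBRS
                        if 0 <= y + dr < edges.shape[0]
                        and 0 <= x + dc < edges.shape[1]
                        and edges[y + dr, x + dc]
                        and not visited[y + dr, x + dc]), None)
            if nxt is None:                          # curve exhausted
                break
            if breaks_monotonicity(seg, nxt):        # CPP reached
                segments.append(seg)                 # record the GCS
                seg = [nxt]                          # start the next one
            else:
                seg.append(nxt)
            visited[nxt] = True
        segments.append(seg)
    return segments
```

Because tracking visits only edge pixels rather than the full frame, a loop of this shape touches the small part of the image where edges are most likely to be found, which is the property Parslow credits for the method's speed.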
To detect motion, the gradient differences between corresponding GCSs in consecutive images are calculated piece by piece. Moving objects are then extracted by retaining only those GCSs whose gradient difference exceeds a predefined threshold.
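As a minimal sketch of that thresholding step, the fragment below compares the mean gradient of matched GCSs across two frames. The article does not describe how Deep Vision establishes the correspondence between segments, so matching by index and the threshold value are assumptions.

```python
import numpy as np

def moving_segments(prev_gcs, curr_gcs, threshold=0.15):
    """Flag GCSs whose mean gradient (tangent angle) changed between
    consecutive frames by more than a threshold.

    prev_gcs, curr_gcs: lists of (N_i, 2) point arrays, assumed matched
                        by index across the two frames.
    threshold: assumed, application-dependent difference in radians.
    """
    def mean_gradient(seg):
        d = np.diff(np.asarray(seg, dtype=float), axis=0)
        return np.arctan2(d[:, 1], d[:, 0]).mean()

    moving = []
    for a, b in zip(prev_gcs, curr_gcs):
        if abs(mean_gradient(a) - mean_gradient(b)) > threshold:
            moving.append(b)          # segment belongs to a moving object
    return moving
```

Grouping the segments this function returns would then yield the moving object itself, as described in the benchmark results above.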
"At present," says Parslow, "Deep Vision does not plan to offer the patented algorithm as a stand-alone product." Instead, the company will integrate it into a number of image-processing and machine-vision applications specifically targeted at industrial, medical, scientific, and military markets.