Understanding region-based segmentation
Several variations of threshold-based approaches are available for region-based segmentation (see Vision Systems Design, Oct. 1998, p. 20). This month we investigate region-based segmentation operators that are based on clustering and morphological techniques.
The region-based segmentation method looks for similarities between adjacent pixels. That is, pixels that possess similar attributes are grouped into unique regions. As with all segmentation techniques, using gray-level intensity is the most common means of assigning similarity, but other possibilities exist, such as variance, color, and multispectral features (see Vision Systems Design, Sept. 1998, p. 32).
Region-growing techniques cluster the pixels that represent homogeneous areas in an image. Regions are grown by grouping adjacent pixels whose properties, such as intensity, differ by less than some specified amount. Each grown region is assigned a unique integer label in the output image. This class of algorithms tends to work well for difficult imagery, as it is adaptive and less susceptible to the effects of partial occlusion, adjacency, noise, and ambiguous boundaries.
In its simplest form, the region-growing operator performs connected-components analysis on gray-scale pixel values. At each pixel in an image, neighboring pixels are compared to a reference pixel value for the region. If the difference is less than the Difference Threshold, the neighboring pixel is added to the current region. In other words, for every pixel p(i) in a region, there exists another pixel p(j) in the region such that
abs (I[p(i)] - I[p(j)]) < Threshold
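As a rough illustration of the operator described above, the following Python sketch labels regions in a small gray-scale image. It assumes the "there exists another pixel" reading of the criterion, in which each candidate is compared with the adjacent pixel already in the region, and it uses a four-connected neighborhood. The function name grow_regions and the sample image are hypothetical, chosen only for this example.

```python
from collections import deque

def grow_regions(image, threshold):
    """Label regions of a 2-D gray-scale image given as a list of rows.

    A neighbor joins the current region when its absolute difference from
    the adjacent pixel already in the region is less than `threshold`.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]  # 0 means "not yet labeled"
    next_label = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if labels[r0][c0]:
                continue  # pixel already belongs to a region
            next_label += 1
            labels[r0][c0] = next_label
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                # Four-connected neighbors: north, south, west, east.
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not labels[nr][nc]
                            and abs(image[nr][nc] - image[r][c]) < threshold):
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
    return labels

image = [
    [10, 11, 12, 50],
    [10, 12, 13, 52],
    [90, 91, 14, 51],
]
labels = grow_regions(image, threshold=5)
for row in labels:
    print(row)
```

Each grown region receives its own integer label, so the three intensity groups in the sample image come out as three labeled regions.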
This algorithm works well with multiband data, such as color images. If there are multiple input images (I1...In), the threshold criterion must hold for all input images:
(abs (I1[p(i)] - I1[p(j)]) < Threshold)
&...&
(abs (In[p(i)] - In[p(j)]) < Threshold)
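The multiband test can be sketched as a single predicate that requires the difference criterion to hold in every band. The function name similar_in_all_bands and the two-band sample data are hypothetical; pixels are addressed as (row, column) pairs.

```python
def similar_in_all_bands(bands, p, q, threshold):
    """Return True when pixels p and q differ by less than `threshold`
    in every band. `bands` is a list of 2-D images; p and q are (row, col)."""
    return all(abs(band[p[0]][p[1]] - band[q[0]][q[1]]) < threshold
               for band in bands)

# Two hypothetical bands of a 2 x 2 color image.
red = [[200, 202], [198, 60]]
green = [[100, 101], [40, 99]]
bands = [red, green]

print(similar_in_all_bands(bands, (0, 0), (0, 1), threshold=5))
print(similar_in_all_bands(bands, (0, 0), (1, 0), threshold=5))
```

The second pair of pixels is close in the red band but far apart in the green band, so it fails the combined test.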
The neighborhood used for growing can be defined as four-connected, whereby only the north, south, east, and west neighbors are considered for growing the region.
It can also be defined as eight-connected, in which case the NW, NE, SW, and SE neighbors are additionally considered for growing the region.
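The two neighborhood definitions differ only in the set of offsets that are visited around each pixel. A minimal sketch, with hypothetical names FOUR_CONNECTED, EIGHT_CONNECTED, and neighbors, might look like this:

```python
# Offsets are (row change, column change).
FOUR_CONNECTED = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # N, S, W, E
EIGHT_CONNECTED = FOUR_CONNECTED + [
    (-1, -1), (-1, 1), (1, -1), (1, 1),                # NW, NE, SW, SE
]

def neighbors(r, c, rows, cols, offsets):
    """Return the in-bounds neighbor coordinates of (r, c) for the offsets."""
    return [(r + dr, c + dc) for dr, dc in offsets
            if 0 <= r + dr < rows and 0 <= c + dc < cols]

print(neighbors(0, 0, 3, 3, FOUR_CONNECTED))
print(neighbors(0, 0, 3, 3, EIGHT_CONNECTED))
```

With the eight-connected set, a region can grow through a diagonal contact that the four-connected set would treat as a break.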
Region growing can be inhibited over background areas of the image by defining a lower limit on the acceptable pixel values. A typical use for this parameter would be the case where there is a clear foreground/ background threshold, and the only areas of interest are in the foreground.
Region growing can also be constrained so that the region-growing process does not cross over obvious edges in the image. In this mode, a logical edge map could be derived from a previous segmentation, a rough segmentation technique, or a desired object template.
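Both constraints, the background limit and the edge map, amount to an eligibility test applied before a pixel is considered for any region. The sketch below is one possible form, assuming the edge map is a logical (True/False) image of the same size as the input. The name may_grow_into and the sample values are hypothetical.

```python
def may_grow_into(image, edge_map, r, c, lower_limit):
    """Return True if pixel (r, c) is eligible to join a region.

    A pixel is excluded when its value lies below the background limit
    or when the logical edge map marks it as an edge.
    """
    return image[r][c] >= lower_limit and not edge_map[r][c]

image = [[5, 120], [130, 125]]
edge_map = [[False, False], [True, False]]

print(may_grow_into(image, edge_map, 0, 0, lower_limit=50))  # background pixel
print(may_grow_into(image, edge_map, 0, 1, lower_limit=50))  # foreground pixel
print(may_grow_into(image, edge_map, 1, 0, lower_limit=50))  # edge pixel
```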
The region-growing algorithm can also be made more adaptive to the data by modifying the threshold dynamically according to the mean and standard deviation of the region as it is being grown. One method of making the threshold adapt to the local data is
(1 - minimum(limit, standard_deviation/mean)) *Threshold
Therefore, the adaptive threshold will never be larger than the value of the Threshold parameter, but can be much smaller. The value given to the limit parameter controls the range of adaptiveness. Using the adaptive threshold can help prevent bleeding across wide image gradients.
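The adaptive formula can be written out directly. This sketch assumes the population standard deviation of the pixel values gathered so far and a limit between 0 and 1. It also falls back to the unmodified threshold when the region mean is zero, a guard that is an assumption of this example and not part of the formula. The name adaptive_threshold is hypothetical.

```python
from statistics import mean, pstdev

def adaptive_threshold(region_values, threshold, limit):
    """Scale `threshold` by (1 - min(limit, standard_deviation / mean))."""
    m = mean(region_values)
    if m == 0:
        return threshold  # assumption: skip the adaptation for a zero mean
    return (1 - min(limit, pstdev(region_values) / m)) * threshold

# A perfectly uniform region keeps the full threshold.
print(adaptive_threshold([100, 100, 100], threshold=20, limit=0.5))
# A region with some spread (std/mean = 0.1) tightens the threshold.
print(adaptive_threshold([90, 110], threshold=20, limit=0.5))
# A small limit caps how far the threshold can shrink.
print(adaptive_threshold([90, 110], threshold=20, limit=0.05))
```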
Another variant of region-growing techniques involves the use of "seeds." In this approach, markers, called seed points, are used to start the growth of regions, rather than a process where each pixel in the image is used as a region candidate (see Fig. 1).
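A seeded variant can be sketched by starting the growth only from user-supplied markers and leaving everything the seeds cannot reach unlabeled. The function name grow_from_seeds, the seed dictionary format, and the sample image are hypothetical. The interpixel difference test and the four-connected neighborhood are assumptions carried over for the sake of a short example.

```python
from collections import deque

def grow_from_seeds(image, seeds, threshold):
    """Grow one region per seed; `seeds` maps a label to a (row, col) point.

    Pixels that no seed can reach keep the label 0.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    queue = deque()
    for label, (r, c) in seeds.items():
        labels[r][c] = label
        queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and labels[nr][nc] == 0
                    and abs(image[nr][nc] - image[r][c]) < threshold):
                labels[nr][nc] = labels[r][c]  # inherit the seed's label
                queue.append((nr, nc))
    return labels

image = [
    [10, 11, 80, 81],
    [10, 12, 82, 80],
    [40, 40, 81, 83],
]
seeded = grow_from_seeds(image, {1: (0, 0), 2: (0, 3)}, threshold=5)
for row in seeded:
    print(row)
```

The two mid-gray pixels in the bottom row match neither seed and therefore remain unlabeled.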
The region-growing process might also be constrained to give more ideal results. Interpixel difference thresholds can be used to control growth, or a more adaptive method can be used that considers contrast. In this method, pixels are added to the region only if they serve to increase the average contrast of the region against the surrounding background. This method works well where there is ambiguity in the data.
A final method uses expectations such as shape or size to control the region-growing process. This works well where the objects of interest may be adjacent or overlapping.
Watershed-based segmentation
For descriptive purposes, an image can be equated to a topographic surface, where the brighter areas correspond to hills or peaks and the darker areas to dips or valleys. If a method of "raining" on the image surface were possible, the image could be segmented via a watershed technique. The segmentation operator works by growing (flooding) the darkest areas of the image, the local minima, and by marking where each growing local minimum runs into its neighboring local minima, which are being grown at the same time.
However, in most images, the edges between objects are not generally characterized topographically as ridges or valleys, but are more likely to be step edges, such as a light object on a dark background. In this type of image, a single watershed would be formed. This concept leads to the idea of applying watershed algorithms to edge images. Edge images, derived from gradient operators (see Vision Systems Design, Sept. 1998, p. 33), produce images with topographical ridges around objects. Therefore, when the watershed operator is applied to an edge magnitude image, the boundaries between the regions extracted correspond to the crest lines of the gradient. These crest lines usually represent a good approximation of the object's contours. Watershed-segmentation techniques can also be used for calculation of equidistant neighborhoods about objects, sometimes called the "zones of influence" (see Fig. 2).
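The way a gradient operator turns a step edge into a ridge can be seen on a one-dimensional intensity profile. The sketch below uses a simple central difference as the gradient operator, which is an assumption made for brevity and not necessarily the operator used in the cited column. The name gradient_magnitude is hypothetical.

```python
def gradient_magnitude(profile):
    """Central-difference edge magnitude of a 1-D intensity profile.

    End samples reuse their own value for the missing neighbor.
    """
    n = len(profile)
    return [abs(profile[min(i + 1, n - 1)] - profile[max(i - 1, 0)]) / 2
            for i in range(n)]

step = [10, 10, 10, 200, 200, 200]   # dark background, then a light object
edges = gradient_magnitude(step)
print(edges)
```

The flat areas on both sides of the step become low ground, and the step itself becomes the crest that a watershed operator would mark as the boundary.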
Applied directly to an image, the watershed operation usually oversegments it; that is, it extracts too many regions. The reason is that, in general, there are many regional minima in an image, and the watershed operation extracts one catchment basin for each regional minimum. This tendency to oversegment can be controlled and reduced by preprocessing steps, such as low-pass, median, or morphological filtering, that eliminate many of the unwanted local minima.
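The effect of prefiltering on the number of catchment basins can be illustrated on a one-dimensional profile. This sketch counts regional minima (including flat-bottomed ones) before and after a three-sample median filter. The function names and the sample profile are hypothetical, and the filter width is an arbitrary choice for the example.

```python
from statistics import median

def count_regional_minima(profile):
    """Count runs of equal values whose neighbors on both sides are higher."""
    count, i, n = 0, 0, len(profile)
    while i < n:
        j = i
        while j + 1 < n and profile[j + 1] == profile[i]:
            j += 1  # extend over a flat run
        left_higher = i == 0 or profile[i - 1] > profile[i]
        right_higher = j == n - 1 or profile[j + 1] > profile[i]
        if left_higher and right_higher:
            count += 1
        i = j + 1
    return count

def median_filter3(profile):
    """Three-sample median filter; the two end samples are left unchanged."""
    out = list(profile)
    for i in range(1, len(profile) - 1):
        out[i] = median(profile[i - 1:i + 2])
    return out

noisy = [9, 5, 6, 4, 7, 9, 8, 9, 3, 2, 3, 9]
smoothed = median_filter3(noisy)
print(count_regional_minima(noisy), count_regional_minima(smoothed))
```

Each regional minimum would seed one catchment basin, so halving the count of minima halves the number of regions the watershed operator extracts from this profile.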
Whereas prefiltering steps might help prevent unwanted oversegmentation, the trade-off is that the finer-shape details of the resulting segmentations may be lost. The alternative is to allow the watershed operator to grow only from seeds, rather than from all the local minima. These seeds serve to initiate the flooding process. They can be derived from a variety of sources, such as a filtered regional-minima image, other segmentation techniques, previously known coordinates, or interactively via manual input.
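A seeded watershed can be sketched as a priority flood: pixels are visited in order of increasing height, starting from the seeds, and each unlabeled neighbor inherits the label of the pixel that reaches it first. This is a simplified illustration, not a full watershed implementation. Ridge pixels are assigned to whichever basin arrives first instead of being marked as watershed lines. The name seeded_watershed, the seed format, and the sample surface are hypothetical.

```python
import heapq

def seeded_watershed(surface, seeds):
    """Flood a 2-D surface from `seeds`, a dict mapping label -> (row, col)."""
    rows, cols = len(surface), len(surface[0])
    labels = [[0] * cols for _ in range(rows)]
    heap = []
    order = 0  # insertion counter, used to break ties between equal heights
    for label, (r, c) in seeds.items():
        labels[r][c] = label
        heapq.heappush(heap, (surface[r][c], order, r, c))
        order += 1
    while heap:
        _, _, r, c = heapq.heappop(heap)  # lowest pending pixel floods next
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == 0:
                labels[nr][nc] = labels[r][c]
                heapq.heappush(heap, (surface[nr][nc], order, nr, nc))
                order += 1
    return labels

# Two valleys separated by a ridge (the column of 9s).
surface = [
    [1, 2, 9, 2, 1],
    [0, 2, 9, 3, 0],
    [1, 2, 9, 2, 1],
]
basins = seeded_watershed(surface, {1: (1, 0), 2: (1, 4)})
for row in basins:
    print(row)
```

Because only the two seeds start the flooding, exactly two basins result, regardless of how many small local minima the surface contains.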
Multiresolution projections
Image pyramids can be formed by recursively subsampling an image to produce multiple-resolution projections of the data (see Vision Systems Design, June 1998, p. 22). If a region segmentation is performed at one of the coarser levels, the boundaries it creates can then be projected back up to the higher-resolution data. This technique is useful when the data are wispy or specular, and it can be applied to a derivative of the data and multispectral images as well. Images in which the application of traditional region-segmentation techniques would oversegment the data can often be successfully segmented with a multiresolution approach (see Fig. 3).
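One level of this approach can be sketched as follows: reduce the image by averaging 2 x 2 blocks, segment the coarse result, and replicate each coarse label back over the block it came from. The simple threshold used for the coarse segmentation is only a stand-in for whatever region segmentation is applied at that level. The function names and sample data are hypothetical, and even image dimensions are assumed.

```python
def reduce_by_two(image):
    """Average each 2 x 2 block; assumes even numbers of rows and columns."""
    return [[(image[r][c] + image[r][c + 1]
              + image[r + 1][c] + image[r + 1][c + 1]) / 4
             for c in range(0, len(image[0]), 2)]
            for r in range(0, len(image), 2)]

def project_labels_up(coarse_labels):
    """Replicate each coarse label over the 2 x 2 block it was derived from."""
    fine = []
    for row in coarse_labels:
        wide = [label for label in row for _ in range(2)]
        fine.append(wide)
        fine.append(list(wide))
    return fine

# A textured bright area (left) next to a textured dark area (right).
image = [
    [200, 180, 20, 40],
    [180, 200, 40, 20],
    [200, 180, 20, 40],
    [180, 200, 40, 20],
]
coarse = reduce_by_two(image)
# Stand-in segmentation at the coarse level: bright = 1, dark = 2.
coarse_labels = [[1 if value > 100 else 2 for value in row] for row in coarse]
fine_labels = project_labels_up(coarse_labels)
for row in fine_labels:
    print(row)
```

The pixel-to-pixel variation that could split the full-resolution data into many small regions is averaged away at the coarse level, so each textured area comes back as a single region.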
Shape-based segmentation
Occasionally, objects to be segmented from an image are confined to some well-defined shape. Such is the case for fibers, spheres, and rectangles. It is useful in this case to again visualize the image surface as a topographical map, where the objects have a three-dimensional shape. Fibers can be ridges or valleys, spheres can be hills, and so forth. If an example of the shape, called a template, is created, it can be "slid under" the topographical surface. Where it "fits," it will then slide up into the shape.
The template can also be slid under the surface of the entire image to create an output image that contains gray-scale values corresponding to how high the shape can go. Where the shape does not fit, large contiguous areas of high or low values will appear in the output image. Where the shape does fit, small local "peaks" will emerge. These peaks can be detected with local-maximum operators and then used to mark the existence of desired shapes in the image, thereby "segmenting out" the desired objects. This technique can be implemented with morphological gray-scale erosion and dilation techniques (see Fig. 4).
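A one-dimensional sketch shows how erosion and dilation combine for this purpose. Eroding and then dilating with a flat template (a morphological opening) removes bright features narrower than the template, and subtracting the opened profile from the original leaves only those narrow features, an operation commonly called a white top-hat. The flat template, the function names, and the sample profile are assumptions made for the example.

```python
def erode(profile, width):
    """Flat gray-scale erosion: minimum over a window centered on each sample."""
    half = width // 2
    return [min(profile[max(0, i - half):i + half + 1])
            for i in range(len(profile))]

def dilate(profile, width):
    """Flat gray-scale dilation: maximum over a window centered on each sample."""
    half = width // 2
    return [max(profile[max(0, i - half):i + half + 1])
            for i in range(len(profile))]

def white_top_hat(profile, width):
    """Original minus its opening; keeps bright features narrower than `width`."""
    opened = dilate(erode(profile, width), width)
    return [p - o for p, o in zip(profile, opened)]

# A narrow peak at index 3 and a wide plateau at indices 7 through 11.
profile = [10, 10, 10, 50, 10, 10, 10, 40, 40, 40, 40, 40, 10, 10]
peaks = white_top_hat(profile, width=3)
print(peaks)
```

Only the narrow peak survives in the output; the wide plateau, into which the template fits, is removed along with the background.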
PETER EGGLESTON is senior director of business development, Imaging Products Division, Amerinex Applied Imaging Inc., Northampton, MA; e-mail: [email protected].
FIGURE 1. A multispectral, seeded, region-growing algorithm created this masking plug-in for Adobe Photoshop. A user selects sample background (red) and foreground (green) seeds. Color information derived from the seeds is then used to grow a region corresponding to the object of interest--the sunflower. Once a mask is derived, the extracted image portion can then be color corrected, used in the creation of a new image, or have special effects applied.
FIGURE 2. The watershed segmentation operator can measure object dispersion using the Skeletonization by Influence Zones (SKIZ) approach. In this example, an inorganic pigment in a polymer material is imaged as black dots (pigment) on a gray background (polymer). First, the background is extracted by thresholding the original image. Next, a distance image is created by setting every background pixel value to the shortest distance to the nearest pigment. This distance image is then thresholded using the watershed technique, creating equidistant zones about each pigment particle.
FIGURE 3. Image pyramids and multiresolution processing techniques are useful in the segmentation of wispy objects, such as the clouds in this satellite photograph. These techniques can be used to avoid the oversegmentation of images that have a good deal of variation in the areas of interest.
FIGURE 4. Morphological templates support the extraction of specific shapes. In a top-hat segmentation, templates are slid over or under the image to mark areas of interest. Dilation and erosion can remove dark or light areas, respectively, that do not correspond to shapes of interest.