MEDICAL DIAGNOSIS: Signature mapping software speeds tuberculosis diagnosis
According to the World Health Organization, more than 9 million cases of tuberculosis (TB) occurred worldwide in 2008, with 30% of these cases appearing in South Africa. Principally caused by mycobacteria, the disease usually attacks the lungs and is spread through coughing or sneezing. To identify whether a patient has the disease, sputum smears are first obtained, stained with a rhodamine-auramine stain in a centrifuge, and examined under a fluorescence microscope.
"To ensure that these tests are performed as effectively as possible," says Tom Ramsay, chief scientist with Guardian Technologies International (GTI; Herndon, VA, USA;www.guardiantechintl.com), "two slides are prepared for every individual patient and 100 unique areas, called fields of view, within each slide are examined to look for the presence of mycobacteria." With more than 3.5 million cases of TB occurring each year in South Africa alone, this necessitates the examination of 700 million different images. "As one can imagine," says Ramsay, "this process is tedious, labor intensive, time consuming, and, worse, prone to human error."
To overcome this, Ramsay and his colleagues at GTI have developed an automated slide analysis system based on Guardian's Signature Mapping image-analysis software and specialized off-the-shelf laboratory hardware (see figure). By integrating control software from the individual suppliers, GTI combined the BX41 fluorescence microscope and XC10 digital camera from Olympus America (Center Valley, PA, USA; http://olympusamerica.com) with an OptiScan microscope stage and PL-200 automatic slide loader from Prior Scientific (Rockland, MA, USA; www.prior.com). The result is a digital virtual TB detection system that allows up to 200 slides to be scanned automatically.
After slides are loaded into each of the four 50-slide racks, they are transferred to the microscope stage automatically. Each slide is then illuminated with a mercury vapor light that is color corrected to provide a flatter white response. The slides are automatically focused under the microscope and moved on an x-y stage so that 100 unique fields of view on each slide are captured by the system's CCD camera. These images are then transferred over a USB interface to the host PC for Signature Mapping computer-aided image analysis.
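To give a sense of the scanning workload, the following sketch generates 100 stage positions for one slide and counts the images produced by a full load of four 50-slide racks. The article does not describe the actual scan pattern, so the 10 x 10 serpentine layout, the step sizes, and the function name are assumptions made purely for illustration.

```python
def serpentine_positions(rows, cols, step_x_um, step_y_um):
    """Return (x, y) stage offsets in micrometres, snaking row by row
    so the stage never has to travel back across the whole slide."""
    positions = []
    for r in range(rows):
        col_order = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        for c in col_order:
            positions.append((c * step_x_um, r * step_y_um))
    return positions

RACKS = 4             # from the article
SLIDES_PER_RACK = 50  # from the article

# Hypothetical grid and step sizes; only the total of 100 fields is from the article.
fields = serpentine_positions(rows=10, cols=10, step_x_um=200.0, step_y_um=150.0)
images_per_load = RACKS * SLIDES_PER_RACK * len(fields)
print(len(fields), "fields per slide,", images_per_load, "images per full load")
```

With these numbers, one fully loaded system captures 20,000 images before the racks need to be changed.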
"Unlike machine-vision systems that may have to measure a product's shape to within specific tolerances, detecting the presence of TB bacteria is more complex,” says Ramsay. "Not only must the bacteria be correctly identified in the presence of foreign debris such as food, the system must also return a low false positive rate to reduce the number of slides that may later have to be examined individually and manually."
To automatically focus the slides under the microscope in the presence of debris, specialized segmentation algorithms developed by Guardian are first used to isolate the bacteria. A histogram analysis of these segmented image data is then used to focus the slides correctly.
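Guardian's segmentation and focus algorithms are not described in detail, but the general idea of scoring focus from the histogram of segmented pixels can be sketched as follows. Here a plain brightness threshold stands in for the proprietary segmentation, and the variance of the histogram is assumed as the focus measure; both choices, and all function names, are illustrative rather than the system's actual method.

```python
def segment(image, threshold):
    """Stand-in segmentation: keep only pixels brighter than a threshold."""
    return [p for row in image for p in row if p > threshold]

def histogram(pixels, bins=256):
    """Count how many segmented pixels fall at each 8-bit gray level."""
    counts = [0] * bins
    for p in pixels:
        counts[p] += 1
    return counts

def histogram_spread(counts):
    """Variance of the intensity distribution described by the histogram."""
    n = sum(counts)
    if n == 0:
        return 0.0
    mean = sum(level * c for level, c in enumerate(counts)) / n
    return sum(c * (level - mean) ** 2 for level, c in enumerate(counts)) / n

def best_focus(z_stack, threshold=50):
    """Return the index of the z-plane with the highest focus score, plus all scores."""
    scores = [histogram_spread(histogram(segment(img, threshold))) for img in z_stack]
    return max(range(len(scores)), key=scores.__getitem__), scores

# Three tiny synthetic frames taken at different focus heights (made-up values).
blurry = [[60, 70, 80], [70, 80, 70]]
sharp = [[20, 240, 20], [20, 120, 20]]
defocused = [[55, 65, 55], [60, 62, 58]]
best_index, focus_scores = best_focus([blurry, sharp, defocused])
```

In this toy stack the second frame scores highest, because its segmented pixels span a much wider range of intensities than those of the two blurred frames.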
This segmentation, which identifies all TB bacterial candidates, is only part of the process by which the Signature Mapping software operates. It must then identify the presence of any and all individual bacteria in each field of view on each slide. Since the mycobacteria may overlap, for example, a morphological algorithm alone cannot be used to measure the size or number of cells. Similarly, after staining, different parts of the bacterium may fluoresce differently; thus, a simple color analysis cannot detect the presence or number of cells. The detection challenge is to find even one bacterium in 100 fields of view with enough precision to provide an accurate diagnosis, without labeling an artifact that merely looks like a bacterium as "mycobacteria."
Different statistical values regarding the image can be obtained by combining a number of image-processing operators. For example, if a morphological operator is first used to detect the size and shape of the bacteria, then determining the variance of the color or gray-level patterns within the object would provide an added level of information about the presence of bacteria.
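The example of pairing a size-and-shape measurement with a gray-level variance can be sketched in a few lines. This is a minimal illustration, not Guardian's implementation: connected-component labeling stands in for the morphological operator, and the feature names, the threshold, and the test image are all assumptions.

```python
from statistics import pvariance

def connected_components(mask):
    """Group 4-connected foreground pixels; return one list of (row, col) per object."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                stack, region = [(r, c)], []
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    region.append((y, x))
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                regions.append(region)
    return regions

def describe(region, gray):
    """Combine shape measurements with the gray-level variance inside the object."""
    ys = [y for y, _ in region]
    xs = [x for _, x in region]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    return {
        "area": len(region),
        "elongation": max(height, width) / min(height, width),
        "gray_variance": pvariance([gray[y][x] for y, x in region]),
    }

# Made-up 4 x 6 gray-level image: one rod-like object and one isolated speck.
gray = [
    [0, 0, 0, 0, 0, 0],
    [0, 200, 180, 220, 200, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 150, 0],
]
mask = [[p > 100 for p in row] for row in gray]
features = [describe(region, gray) for region in connected_components(mask)]
```

The rod-like object yields an area of 4, an elongation of 4.0, and a non-zero gray-level variance, while the single-pixel speck has an elongation of 1.0 and zero variance. Neither value alone separates the two as clearly as the combination does.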
Combining neighborhood-based transforms, RGB to HIS analysis, contextual analysis, morphological analysis, boundary transforms, topology, and frequency-based transforms in this manner yields millions of features that can be extracted from a single image, providing a unique set of classifiers that are highly correlated with TB bacteria.
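As one concrete example of such a transform, the sketch below converts an RGB pixel to hue, saturation, and intensity using the standard textbook formulas. It assumes that the color-space analysis mentioned above refers to the hue-saturation-intensity model (often abbreviated HSI); the article does not say how Guardian computes it, and the function name is illustrative.

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert RGB components in [0, 1] to (hue in degrees, saturation, intensity)."""
    intensity = (r + g + b) / 3.0
    if intensity == 0:
        return 0.0, 0.0, 0.0  # black pixel: hue and saturation reported as 0
    saturation = 1.0 - min(r, g, b) / intensity
    denominator = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if denominator == 0:
        return 0.0, saturation, intensity  # gray pixel: hue undefined, reported as 0
    ratio = 0.5 * ((r - g) + (r - b)) / denominator
    theta = math.degrees(math.acos(max(-1.0, min(1.0, ratio))))
    hue = theta if b <= g else 360.0 - theta
    return hue, saturation, intensity

red = rgb_to_hsi(1.0, 0.0, 0.0)
green = rgb_to_hsi(0.0, 1.0, 0.0)
blue = rgb_to_hsi(0.0, 0.0, 1.0)
gray_pixel = rgb_to_hsi(0.5, 0.5, 0.5)
```

Separating hue from intensity in this way lets later stages examine color independently of how brightly an object fluoresces.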
To determine how these features properly classify the presence of mycobacteria, they are fed into a support vector machine (SVM). This is used to classify the data according to specific rules and weightings as sets of vectors represented in a multidimensional vector space.
After image data are classified in this way, the SVM generates a hyperplane that maximizes the margin between the data sets. Output from this SVM is then fed to a rules engine that uses a set of if-then rules to determine the final result of image data analysis.
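The two stages can be sketched as a linear decision function followed by a small set of if-then rules. The weights, bias, feature vectors, threshold, and rule set below are hypothetical placeholders, not values or rules from Guardian's system; in practice the hyperplane would be learned from training slides, and the classifier need not be linear.

```python
import math

# Hypothetical hyperplane parameters; a trained SVM would supply these.
WEIGHTS = [0.8, -0.5, 1.2]
BIAS = -1.0

def signed_distance(features):
    """Signed distance of a feature vector from the separating hyperplane.
    Positive values fall on the 'bacterium' side, negative on the 'artifact' side."""
    score = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    return score / math.sqrt(sum(w * w for w in WEIGHTS))

def slide_result(field_distances, confident=0.5):
    """Hypothetical if-then rules that turn per-field SVM outputs into a slide result."""
    if any(d >= confident for d in field_distances):
        return "positive"        # at least one field is well clear of the margin
    if any(d > 0 for d in field_distances):
        return "manual review"   # only borderline detections were found
    return "negative"

clear_field = signed_distance([0.0, 0.0, 0.0])
borderline_field = signed_distance([1.0, 0.0, 0.3])
strong_field = signed_distance([2.0, 0.5, 1.0])

print(slide_result([clear_field]))
print(slide_result([clear_field, borderline_field]))
print(slide_result([clear_field, borderline_field, strong_field]))
```

Keeping the rules separate from the classifier means the threshold for sending a slide to manual review can be adjusted without retraining the SVM.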
"Of course," says Ramsay, "this method requires that the system be trained using a large sample of slides so that the image-processing algorithms and the SVM data that are produced provide the optimum result." Once trained, however, the system can be used to reduce the false positive rate and thus reduce the number of slides that need to be manually inspected.
In independent tests, the system detected 92% of positive cases with a false positive rate of 3.75%. "In countries such as South Africa, where more than 3.5 million patients are tested for TB each year," says Ramsay, "deployment of such systems could decrease the number of slides that require examination by at least 80%."