Hand gesture recognition system targets medical applications
Researchers at Ben-Gurion University (BGU; Beer-Sheva, Israel; www.bgu.ac.il) have developed a hand-gesture recognition system that enables doctors to manipulate digital images during medical procedures using hand gestures instead of touch screens or computer keyboards. The system was recently tested during a neurosurgical brain biopsy at Washington Hospital Center (WHC; Washington, DC, USA; www.whcenter.org).
“Sterile human-machine interfaces are important, as they allow the surgeon to control information without the contamination associated with sealed touch screens that must be cleaned after each surgical procedure,” says Juan Wachs, a recent Ph.D. recipient from the Intelligent System Program in the Department of Industrial Engineering and Management at BGU. The hand-gesture control system, Gestix, allows doctors to remain in place during an operation, without the need to operate traditional computer-based interfaces.
Helman Stern, a principal investigator on the project and a professor in the department of industrial engineering and management, explains how Gestix functions in two stages: “There is an initial calibration stage where the machine recognizes the surgeons’ hand gestures, and a second stage where surgeons must learn and implement eight navigation gestures, rapidly moving the hand away from a ‘neutral area’ and back again. Gestix users even have the option of zooming in and out by moving the hand clockwise or counterclockwise.”
To digitize images of the surgeon’s hand, the system employs a VC-C4 camera from Canon (Lake Success, NY, USA; www.usa.canon.com), whose pan/tilt/zoom mode is set using an infrared (IR) remote. This camera is placed over a flat-screen monitor and interfaced to an Intel Pentium 4-based PC using a frame grabber from Matrox Imaging (Dorval, QC, Canada; www.matrox.com/imaging).
Before the surgeon can use the system, a calibration process is required. This involves building a probabilistic color model of the hand or glove from captured images of the surgeon’s hand gestures. The hand is then tracked in each image by an algorithm that segments it from the background using the color model and motion cues. After the images are thresholded, a sequence of morphological operations reduces each image to a set of blobs. The location of the hand is then represented by the 2-D coordinates of the centroid of the largest blob.
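The last two steps of this pipeline, thresholding and locating the largest blob, can be illustrated with a minimal sketch. It is not the Gestix implementation: the function names, the 0.5 cutoff, and the use of 4-connectivity are assumptions made for illustration, the input is assumed to be a per-pixel hand-probability map already produced by a color model, and the morphological clean-up and motion cues are omitted.

```python
from collections import deque

def threshold(prob_map, cutoff=0.5):
    """Turn a per-pixel hand-probability map into a binary mask.
    The cutoff value is an illustrative assumption."""
    return [[1 if p >= cutoff else 0 for p in row] for row in prob_map]

def largest_blob_centroid(mask):
    """Return the (x, y) centroid of the largest 4-connected blob,
    or None if the mask contains no foreground pixels."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill collects one connected blob.
            blob = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(blob) > len(best):
                best = blob
    if not best:
        return None
    cx = sum(x for _, x in best) / len(best)
    cy = sum(y for y, _ in best) / len(best)
    return (cx, cy)
```

Given a small probability map containing one isolated noisy pixel and one 2 x 2 hand-colored region, the sketch ignores the single pixel and reports the center of the larger region as the hand position.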
On the flat-panel monitor, MRI, CT, and X-ray images are arranged over a 3-D cylinder. To display images of interest, this 3-D cylinder can be rotated in four directions. Information such as the centroid and orientation of the hand is interpreted by the PC’s software and used to rotate these images on the graphical user interface of the flat-panel monitor (see figure).
To browse the image database, the hand is moved rapidly out of a neutral screen area in any of four directions, and then back again. When this movement is detected, the displayed image is moved off the screen and replaced by a neighboring image. To zoom the image, the open palm of the hand is rotated within the neutral area. To keep the system from tracking unintentional gestures, the surgeon drops his or her hand, and the system enters a sleep mode. To reinitiate the system, the doctor waves the hand in front of the camera. In short, left/right/up/down gestures page through the images in the corresponding direction, while a rotation gesture zooms the image in or out.
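The out-and-back browsing gesture can be sketched as a small state machine fed with one hand centroid per video frame. This is a hypothetical illustration rather than the Gestix code: the class name `FlickDetector`, the square neutral area, and the `max_frames` limit that stands in for "rapidly" are all assumptions. Directions are expressed in image coordinates, with y increasing downward.

```python
class FlickDetector:
    """Detects a quick move out of a square neutral area and back again."""

    def __init__(self, center, half_size, max_frames=15):
        self.cx, self.cy = center      # center of the neutral area (pixels)
        self.half = half_size          # half the side length of the area
        self.max_frames = max_frames   # longest excursion still counted as a flick
        self.direction = None          # direction in which the hand left the area
        self.frames_out = 0            # frames spent outside so far

    def _in_neutral(self, x, y):
        return abs(x - self.cx) <= self.half and abs(y - self.cy) <= self.half

    def update(self, x, y):
        """Feed one centroid; return a direction when a flick completes, else None."""
        if self._in_neutral(x, y):
            # Back in the neutral area: report the gesture only if it was quick.
            quick = 0 < self.frames_out <= self.max_frames
            result = self.direction if quick else None
            self.direction, self.frames_out = None, 0
            return result
        if self.direction is None:
            # First frame outside: the dominant axis decides the direction.
            dx, dy = x - self.cx, y - self.cy
            if abs(dx) >= abs(dy):
                self.direction = "right" if dx > 0 else "left"
            else:
                self.direction = "down" if dy > 0 else "up"
        self.frames_out += 1
        return None
```

A caller would invoke `update` once per frame with the tracked centroid and page the display whenever a direction is returned. Slow excursions return nothing, which is one simple way to ignore drifting hands; the sleep mode and the rotation-based zoom described above are not modeled here.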
In addition to allowing sterile interaction with image data, the Gestix hand-gesture interface responds to the surgeon’s gesture commands in real time without requiring the surgeon to attach a microphone, wear head-mounted (body-contact) sensing devices, or use foot pedals to control the operation of the display system.
At BGU, several master’s theses, supervised by Helman Stern and Yael Edan, have used hand-gesture recognition as part of an interface to evaluate how different aspects of interface design affect performance in a variety of telerobotic and teleoperated systems. Ongoing research is aimed at expanding this work to include additional control modes (for example, voice) to create a multimodal telerobotic control system. Further research, based on video motion capture, is being conducted by Stern and Tal Oren of the department of industrial engineering and management and Amir Shapiro of the department of mechanical engineering. This system, combined with a tactile body display, is intended to help the visually impaired sense their surroundings.