IMAGE PROCESSING: FFT processor performs phase correlation
One of the most important transforms in image and signal processing is the Fourier transform. Attributed to French mathematician Jean Baptiste Joseph Fourier, the Fourier transform, and variations of it, can be used to transform images from the spatial domain to the frequency domain. This is useful in many image processing functions such as image filtering. For example, transforming image data into the frequency domain, applying simple point processing functions, and then performing an inverse fast Fourier transform (FFT) can implement a low-pass filter that removes high-frequency noise.
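The filtering sequence described above (forward transform, point-wise operation, inverse transform) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `lowpass_filter`, the ideal circular mask, the cutoff value, and the synthetic test image are all hypothetical choices and are not taken from any product described in this article.

```python
import numpy as np

def lowpass_filter(image, cutoff):
    """Suppress spatial frequencies above `cutoff` (in cycles/pixel).

    Hypothetical helper for illustration only.
    """
    spectrum = np.fft.fft2(image)                  # spatial -> frequency domain
    fy = np.fft.fftfreq(image.shape[0])[:, None]   # vertical frequencies
    fx = np.fft.fftfreq(image.shape[1])[None, :]   # horizontal frequencies
    # Point-wise processing: keep bins inside a circle, zero the rest.
    mask = np.hypot(fy, fx) <= cutoff
    return np.fft.ifft2(spectrum * mask).real      # back to the spatial domain

# Synthetic 64 x 64 test image: a slow sinusoid plus broadband noise.
rng = np.random.default_rng(0)
y, x = np.mgrid[0:64, 0:64]
clean = np.sin(2 * np.pi * x / 32.0)               # 1/32 cycles/pixel
noisy = clean + 0.5 * rng.standard_normal(clean.shape)
filtered = lowpass_filter(noisy, cutoff=0.1)

print("noisy error   :", np.mean((noisy - clean) ** 2))
print("filtered error:", np.mean((filtered - clean) ** 2))
```

An ideal (brick-wall) mask is used here only because it is the simplest point-wise function to write down; a smoother mask such as a Gaussian would reduce ringing in real images.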
The Fourier transform has also recently been demonstrated in a system to remove atmospheric turbulence effects from a temporal sequence of images. In this application, the power spectrum and bi-spectrum of each image are first computed using a Fourier transform to provide information about the amplitude and phase of the signal. Averaging this amplitude and phase information, recombining this image data, and then computing an inverse FFT (IFFT) subsequently results in a corrected image (see “FPGAs make real-time atmospheric compensation a reality,” Vision Systems Design, June 2010).
Perhaps the best-known application of the FFT, however, is image correlation. To accomplish this, 2-D FFTs are performed on an unknown image and an image template. Then, multiplying one transform by the complex conjugate of the other and applying an IFFT results in a correlation image where the value of each pixel is a measure of how well the template matches the searched image at each point.
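A minimal NumPy sketch of FFT-based phase correlation follows. It assumes the standard formulation, in which one transform is multiplied by the complex conjugate of the other and the product is normalized to unit magnitude so that only phase remains; the function name `phase_correlate` and the random test data are hypothetical, and the code is not a model of any particular hardware.

```python
import numpy as np

def phase_correlate(image, template):
    """Return the correlation surface and the location of its peak.

    Hypothetical helper for illustration only.
    """
    F = np.fft.fft2(image)
    G = np.fft.fft2(template)
    cross = F * np.conj(G)                       # cross-power spectrum
    cross /= np.maximum(np.abs(cross), 1e-12)    # keep phase, discard magnitude
    surface = np.fft.ifft2(cross).real           # peak marks the best match
    peak = np.unravel_index(np.argmax(surface), surface.shape)
    return surface, tuple(int(i) for i in peak)

# A 64 x 64, 8-bit style template and a circularly shifted copy of it.
rng = np.random.default_rng(1)
template = rng.integers(0, 256, size=(64, 64)).astype(float)
image = np.roll(template, shift=(5, 12), axis=(0, 1))

surface, peak = phase_correlate(image, template)
print("estimated (row, column) shift:", peak)
```

Because the FFT treats images as periodic, the recovered shift is circular; shifts larger than half the image size appear wrapped around.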
Today, the FFT and its variations can be performed in software, distributed across multicore processors, and embedded in FPGAs to increase performance. Taking the latter approach, Touit (San Diego, CA, USA; www.touit.com) has developed an x1 PCI Express real-time image co-processor named Echelon. Incorporating an ECP3-150 FPGA from Lattice Semiconductor (Hillsboro, OR, USA; www.latticesemi.com) and 2 GB of DDR2 SODIMM memory, the board can perform image correlation of 64 × 64, 8-bit images at rates as fast as 140,000 frames/s (see figure).
To allow developers to digitize images into the board, the company offers an external daughter board that connects to the image co-processor. The daughter board uses four 30-MHz, 8-bit ADCS930 ADCs, and captured 64 × 64-pixel images are transferred to a DDR2 image FIFO on the image co-processor. In operation, the co-processor then performs a 1-D FFT on eight rows or columns of the image simultaneously.
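Processing rows and then columns with 1-D FFTs is the usual row-column decomposition of a 2-D FFT. The sketch below is a software model of that idea only: the function name `fft2_row_column` and the `lanes` parameter are hypothetical, and the batch of eight merely mirrors the eight simultaneous rows or columns mentioned above rather than reproducing the FPGA design.

```python
import numpy as np

def fft2_row_column(image, lanes=8):
    """Compute a 2-D FFT as batches of 1-D FFTs over rows, then columns.

    Hypothetical helper for illustration only; `lanes` is the number of
    rows or columns transformed together in each step.
    """
    data = image.astype(complex)
    # Pass 1: 1-D FFT along each row, `lanes` rows at a time.
    for start in range(0, data.shape[0], lanes):
        rows = slice(start, start + lanes)
        data[rows, :] = np.fft.fft(data[rows, :], axis=1)
    # Pass 2: 1-D FFT along each column, `lanes` columns at a time.
    for start in range(0, data.shape[1], lanes):
        cols = slice(start, start + lanes)
        data[:, cols] = np.fft.fft(data[:, cols], axis=0)
    return data

rng = np.random.default_rng(2)
frame = rng.integers(0, 256, size=(64, 64))      # 64 x 64, 8-bit style image
result = fft2_row_column(frame)
print("matches np.fft.fft2:", np.allclose(result, np.fft.fft2(frame)))
```

For a 64 × 64 image with eight lanes, each pass takes eight batches, which is why the amount of parallel hardware directly sets the cycle count.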
Because the Lattice ECP3-150 features 32 complex multipliers, clocked at 125 MHz, the time to perform an FFT on a single 64 × 64, 8-bit image is 768 clock cycles or approximately 6 µs. Image correlation requires that a known template first be written into the board’s DDR2 memory. This is accomplished using software tools from Touit that allow templates from the PC’s hard drive to be loaded into memory. Once loaded, these templates are converted into 8-bit grayscale images and a 2-D FFT is performed on each.
After this FFT is performed, the same memory and complex multipliers within the FPGA are used to multiply the two Fourier transforms together. Because the co-processor board requires 768 clock cycles for the 2-D FFT and 128 clock cycles for the multiplication, the resulting throughput is approximately 140,000 phase correlations per second on 64 × 64, 8-bit images.
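The timing figures quoted above can be checked with simple arithmetic. The sketch below uses only the numbers given in the text (a 125-MHz clock, 768 cycles for the 2-D FFT, and 128 cycles for the multiplication); the variable names are illustrative, and the calculation assumes, as the text does, that these two steps make up the per-frame cycle budget.

```python
CLOCK_HZ = 125e6          # multiplier clock rate quoted in the text
FFT_CYCLES = 768          # cycles for one 64 x 64 2-D FFT
MULTIPLY_CYCLES = 128     # cycles to multiply the two transforms

# Time for a single 2-D FFT, in microseconds.
fft_time_us = FFT_CYCLES / CLOCK_HZ * 1e6

# Cycles and rate for one complete correlation.
cycles_per_correlation = FFT_CYCLES + MULTIPLY_CYCLES
correlations_per_second = CLOCK_HZ / cycles_per_correlation

print(f"2-D FFT time: {fft_time_us:.3f} us")
print(f"Cycles per correlation: {cycles_per_correlation}")
print(f"Correlations per second: {correlations_per_second:,.0f}")
```

The result, roughly 139,500 correlations per second, is consistent with the rounded figure of 140,000.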
“To perform the 2-D FFT on a larger sized image using the Lattice ECP3-150 requires additional control logic,” says Colin Hankins, president of Touit. “The implementation then loses a lot of its parallelism and the number of cycles required to compute the 2-D FFT begins to increase significantly. Also, at a certain point the 2-D FFT architecture runs into an upper resource limit inside the Lattice ECP3-150 and can no longer perform in-place matrix rotation.”
One way to overcome this is to use more complex (and expensive) FPGAs to achieve higher throughputs with high-resolution images. According to Hankins, these may be used in future generations of the board depending on market demand.
Today, Touit offers both Windows and Linux support for the Echelon co-processor that includes device drivers, an API, and a Python module that allows developers to interface to the board through the Python command interpreter. Also included are some sample GUI control programs in Python as well as free libraries (NumPy and SciPy) for matrix operations and signal processing.