Embedded Vision

Image processing: A burgeoning vision opportunity

April 27, 2016
4 min read

The enhancement of photographs and videos by means of digital computation, versus the traditional darkroom, has historically required offline processing on a substantial standalone computer. Witness, for example, the longstanding popularity of multiple generations' worth of software such as Adobe's Photoshop and Premiere Pro. Now, however, this processing is increasingly being done by the same device that originally captured the still and video images. And increasingly, too, that device isn't a standalone camera; instead, it's a smartphone or tablet computer.

The integration of robust image capture capabilities into mobile electronics devices, with the resultant contraction of the traditional camera market, is a trend that began nearly a decade ago with the first "camera phones." On-phone image processing is a more recent phenomenon, however, enabled by the increasingly capable CPUs and other processors, memory, and storage facilities of modern devices, in combination with the "fusion" of data from other sensor technologies, and further augmented by "cloud" processing and archival facilities supported by pervasive, high-speed connectivity.

Consider, for example, that newly introduced smartphones now ship with up-to-ten-core CPUs, each core touting 64-bit instruction set capabilities, not to mention the SoCs' equally robust graphics processor cores (usable, too, for other massively parallel computational tasks such as image processing), DSP cores, imaging cores, and even dedicated vision cores. These same phones come with up to 4 GBytes of RAM and 128 GBytes of flash memory (the latter further supplemented in some cases by memory cards), plus speedy LTE and Wi-Fi communications subsystems. And their image sensors, increasingly found in dual-camera arrays, are augmented by silicon accelerometers, gyroscopes, compasses, altimeters, and GPS receivers, all capable of measuring and logging location, orientation, and motion information.

The harnessing of this hardware and software potential first appeared in the ability to accurately "stitch" together multiple images into a wider (horizontal and/or vertical) panorama, perspective-correcting each source image in the process to create a more seamless end result. Another now-common capability is HDR (high dynamic range) imaging, which automatically blends together a series of shots taken in rapid sequence at different exposure settings. And, when a person is detected in the frame, the camera can automatically delay taking the photograph until that person is in focus and, if desired, smiling.
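To make those two operations concrete, here is a minimal desktop-side sketch using OpenCV's high-level stitching and exposure-fusion APIs. The file names are hypothetical placeholders, and phone vendors' shipping pipelines use proprietary, hardware-accelerated implementations rather than this exact code.

```python
import cv2

# Hypothetical burst of overlapping shots; any ordered set of
# overlapping images of the same scene will do.
frames = [cv2.imread(f) for f in ("left.jpg", "center.jpg", "right.jpg")]

# Panorama: OpenCV's Stitcher estimates per-image perspective
# corrections (homographies), then warps and blends the sources.
stitcher = cv2.Stitcher_create()
status, panorama = stitcher.stitch(frames)
if status == cv2.Stitcher_OK:
    cv2.imwrite("panorama.jpg", panorama)

# HDR-style blending: Mertens exposure fusion merges a bracketed
# sequence (same scene, different exposures) without needing the
# exposure times, returning a float image scaled to [0, 1].
bracket = [cv2.imread(f) for f in ("under.jpg", "normal.jpg", "over.jpg")]
fused = cv2.createMergeMertens().process(bracket)
cv2.imwrite("fused.jpg", (fused * 255).clip(0, 255).astype("uint8"))
```

Mertens fusion is a convenient stand-in here because it approximates the HDR "look" without the radiometric calibration that true HDR merging requires.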

Even more exotic image enhancement techniques can now be performed seemingly effortlessly by modern mobile electronics devices, at near-real-time speeds. They include the ability to subtract content (such as a "photobomber") from a frame or, conversely, to add a human being or other object to a scene different from the original one. Post-capture refocus, i.e., "plenoptic imaging," enables you to selectively sharpen some (or all) of the depth planes in an image or, depending on the exact capture technique employed, in a merged series of rapidly snapped images taken at different focal plane settings.
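As a rough illustration of content subtraction, classical inpainting fills a masked region from its surroundings. The sketch below uses OpenCV's inpainting API with hypothetical input files; shipping phone features typically rely on far more sophisticated (often learning-based) fill techniques, so treat this as an illustration of the concept, not the production method.

```python
import cv2

# Hypothetical inputs: the captured frame, plus a binary mask that is
# white (255) over the pixels to remove (e.g., the photobomber).
img = cv2.imread("scene.jpg")
mask = cv2.imread("photobomber_mask.png", cv2.IMREAD_GRAYSCALE)

# Telea's algorithm propagates surrounding structure and color into
# the masked region; inpaintRadius sets the neighborhood considered
# around each pixel being filled.
cleaned = cv2.inpaint(img, mask, inpaintRadius=5, flags=cv2.INPAINT_TELEA)
cv2.imwrite("scene_cleaned.jpg", cleaned)
```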

High-quality digital zoom interpolates detail beyond that originally captured by the image sensor and lens, while software stabilization corrects for unintended motion during capture. Slow-motion features leverage fast image sensors, along with abundant processing and memory, to record video at faster-than-real-time frame rates, while multi-shot combining can also be used to communicate object motion within a resulting still frame. Computer vision can also be employed to automatically pick out the best photo in a series, or to create a "highlight" clip from a lengthier video sequence. And object recognition (and face recognition, for human subjects), in combination with GPS and other location services, finds use in "tagging" images as well as in creating common-theme photo and video albums.
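One simple proxy for "best photo in a burst" is per-frame sharpness. The sketch below, with hypothetical file names, ranks a burst by the variance of the Laplacian, a common focus measure; real implementations also weigh factors such as faces, open eyes, exposure, and composition.

```python
import cv2

def sharpness(path: str) -> float:
    """Variance of the Laplacian: higher means more in-focus detail."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    return cv2.Laplacian(gray, cv2.CV_64F).var()

# Hypothetical burst captured in rapid succession.
burst = ["burst_00.jpg", "burst_01.jpg", "burst_02.jpg", "burst_03.jpg"]
best = max(burst, key=sharpness)
print(f"Sharpest frame: {best}")
```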

These and other concepts will, along with other emerging computer vision topics, be covered at next week's Embedded Vision Summit. Example technical presentations include:

Alliance member companies will also demonstrate image processing and other computer vision concepts and products at the two-day Technology Showcase.

The Embedded Vision Summit, an educational forum for product creators interested in incorporating visual intelligence into electronic systems and software, takes place May 2-4 in Santa Clara, California. Register without further delay, as space is limited and seats are filling up! I'll see you there.

Regards,

Brian Dipert
Editor-in-Chief, Embedded Vision Alliance
[email protected]

