The following blog post is from Jeff Bier, Founder of the Embedded Vision Alliance and Co-Founder & President of Berkeley Design Technology, Inc. (BDTI).
On a recent vacation, I was struck by how indispensable smartphones have become for travelers. GPS-powered maps enable us to navigate unfamiliar cities. Language translation apps help us make sense of unfamiliar languages. Looking for a train, taxi, museum, restaurant, shop or park? A few taps of the screen and you’ve found it.
And yet, there’s a vast amount of useful information that isn’t at our fingertips. Where’s the nearest available parking space? How crowded is that bus, restaurant, or museum right now?
This got me thinking about the potential for computer vision to enable cities – and the people in them – to operate more efficiently. The term “smart cities” is often used to describe cities that adopt modern processes and technology to enhance efficiency, convenience, safety and livability. Most of these improvements require lots of data about what’s going on throughout the city. Embedded vision – the combination of image sensors, processors, and algorithms to extract meaning from images – is uniquely capable of producing this data.
For example, consider street lights. Today, street lights are very simple; they use ambient light detectors to switch lights on at sunset and off at sunrise. But what if we exchanged the light sensor for an embedded vision module? Then street lights could reduce brightness when no people or vehicles are present, saving energy. And they could monitor parking space occupancy to enable drivers to quickly find the nearest vacant space – without requiring installation of a sensor in each parking space. They could spot potholes that need filling and blocked storm drains that need clearing. A network of such sensors could provide data about pedestrian and vehicle traffic flows to enable optimization of traffic signals.
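To make the idea concrete, here is a minimal sketch (not from the post) of the kind of logic such a vision-enabled street light might run. It uses OpenCV's stock HOG pedestrian detector only, so it covers the "people present" case rather than vehicles, and the camera index, brightness levels, and set_brightness function are hypothetical placeholders for real lamp control.

```python
# A minimal sketch of adaptive street light logic, assuming an attached camera
# and OpenCV with its stock HOG pedestrian detector. set_brightness() and the
# brightness levels are hypothetical placeholders for real lamp control.
import cv2

def set_brightness(level):
    # Placeholder: in a real fixture this would drive the lamp's dimming interface.
    print(f"brightness -> {level}%")

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

cap = cv2.VideoCapture(0)  # camera index is a placeholder
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Detect pedestrians in the current frame; rects is a list of bounding boxes.
    rects, _ = hog.detectMultiScale(frame, winStride=(8, 8))
    # Dim the lamp when the scene is empty, brighten when people are present.
    set_brightness(100 if len(rects) > 0 else 30)
```

A production system would of course add vehicle detection, temporal smoothing so the lamp doesn't flicker on momentary detections, and the additional analytics (parking occupancy, potholes, blocked drains) described above.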
Three key technologies are required for these capabilities to proliferate. First is embedded hardware: we need inexpensive but powerful microprocessors to run complex computer vision algorithms, along with cheap image sensors and wireless modems. Second, we need robust algorithms that can reliably extract the needed information from images that are often noisy and cluttered (for example, in low light, or with raindrops on the lens). And third, we need ubiquitous wireless connectivity so that the valuable insights extracted by these devices can be shared.
To me, the really exciting thing about this opportunity is that these technologies are all available today.
Deep learning techniques are making it possible to create robust algorithms for challenging visual recognition tasks with much less engineering effort than traditional, special-purpose computer vision algorithms typically required.
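As an illustration of how little application code an off-the-shelf pretrained detector requires today, here is a minimal sketch (not from the post), assuming a recent torchvision installation; the weights download on first use, and the image filename is a placeholder.

```python
# A minimal sketch of running a pretrained deep learning object detector,
# assuming a recent torchvision; the image path is a placeholder.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = to_tensor(Image.open("street_scene.jpg").convert("RGB"))  # placeholder image
with torch.no_grad():
    pred = model([image])[0]  # dict with 'boxes', 'labels', 'scores'

# Keep only confident detections.
keep = pred["scores"] > 0.5
print(pred["labels"][keep].tolist(), pred["boxes"][keep].tolist())
```

The heavy lifting (feature design, training data, architecture) is baked into the pretrained model, which is exactly the reduction in engineering effort described above; adapting it to a specific deployment still requires domain-specific data and tuning.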
And, thanks to the increased focus on computer vision and machine learning by processor designers, processor performance and efficiency for computer vision and deep learning algorithms are improving fast – not by 10 or 20% per year, but by an order of magnitude or more in two or three years.
I don’t mean to suggest that creating sophisticated computer-vision-based solutions is simple. It’s not. But increasingly, creating such solutions is becoming possible for those with an idea and a skilled engineering team. For example, companies like ParkAssist and Sensity Systems are already deploying camera-based systems to improve parking.
Companies that succeed in getting such systems widely deployed early will enjoy growing opportunities over time. This is because images contain huge amounts of data, enabling a single embedded vision system to collect many diverse types of information – that is, to be a software-defined sensor. So, over time, improved algorithms, processors and sensors will allow more capabilities to fit into the same cost, size and power envelope. For example, a system initially deployed for managing parking spaces might later be upgraded to also monitor pedestrian and vehicle traffic, trash and road surface problems.
And this opportunity isn’t limited to outdoor environments. Inside shops, for example, companies like RetailNext and GfK are already using vision-based systems to provide retailers with insights to optimize merchandise layout and staffing. And a start-up called Compology is even monitoring the contents of trash receptacles to optimize collection schedules, reducing costs and pollution.
Wherever there are people, or things that people care about, today we have unprecedented opportunities to add value by extracting useful information from images. Already, we’re seeing a few pioneering examples of innovative products delivering on this promise. But they are just the tip of the iceberg.
Speaking of algorithms: both their traditional and ascendant deep learning-based variants are among the topics I'll be covering in my upcoming webinar, "Embedded Vision: The Four Key Trends Driving the Proliferation of Visual Perception," delivered in partnership with Vision Systems Design and taking place on March 27, 2019 at 11 am ET (8 am PT). I encourage you to visit the event page for more information and to register.
And for even more information and one-on-one conversations on algorithms and other computer vision topics, register now for the 2019 Embedded Vision Summit, taking place May 20-23 in Santa Clara, California. Over the past six years, the Summit has become the preeminent event for people building products incorporating vision. The keynotes, presentations and demonstrations of the main two-day conference are bookended by in-depth full-day OpenCV and TensorFlow trainings on May 20, along with multiple workshops on May 23. Mark your calendar and plan to be there. Registration is now open on the Summit website.
Jeff Bier | Founder, Embedded Vision Alliance
Jeff Bier is the founder of the Embedded Vision Alliance, a partnership of 90+ technology companies that works to enable the widespread use of practical computer vision. The Alliance’s annual conference, the Embedded Vision Summit (May 20-23, 2019 in Santa Clara, California) is the preeminent event where engineers, product designers, and business people gather to make vision-based products a reality.
When not running the Alliance, Jeff is the president of BDTI, an engineering services firm: for over 25 years BDTI has helped hundreds of companies select the right technologies and develop optimized, custom algorithms and software for demanding applications in audio, video, machine learning and computer vision. If you are choosing between processor options for your next design, need a custom algorithm to solve a unique visual perception problem, or need to fit demanding algorithms into a small cost/size/power envelope, BDTI can help.