Computer vision and deep learning technology at the heart of Amazon Go
Amazon has unveiled its first convenience store called "Amazon Go," which uses computer vision and deep learning algorithms to enable shoppers to get what they want without having to go through checkout lines.
Amazon Go is currently in private beta testing in Seattle but will reportedly open to the public early this year. The shopping experience, according to Amazon, is made possible by the same types of technologies used in self-driving cars: computer vision, sensor fusion, and deep learning. With "Just Walk Out" technology, shoppers enter the store with the Amazon Go app, shop for products, and walk out of the store without lines or checkout. The technology automatically detects when products are taken from or returned to shelves and keeps track of them in a virtual cart. When the shopping trip is finished, users simply leave the store and their Amazon account is charged shortly thereafter.
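To make that flow concrete, here is a minimal sketch of a per-shopper virtual cart that is updated as pick and return events arrive and totaled when the shopper exits. The class and method names are hypothetical illustrations, not Amazon's actual system.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class VirtualCart:
    """Hypothetical per-shopper cart, updated as pick/return events arrive."""
    account_id: str
    items: Counter = field(default_factory=Counter)

    def record_pick(self, item_id: str) -> None:
        # An item detected leaving a shelf is added to the cart.
        self.items[item_id] += 1

    def record_return(self, item_id: str) -> None:
        # An item detected going back onto a shelf is removed, if present.
        if self.items[item_id] > 0:
            self.items[item_id] -= 1

    def checkout_total(self, prices: dict) -> float:
        # On exit, the shopper's account is charged for whatever remains.
        return sum(prices[item] * qty for item, qty in self.items.items())

cart = VirtualCart("shopper-123")
cart.record_pick("ketchup")
cart.record_pick("mustard")
cart.record_return("mustard")
print(cart.checkout_total({"ketchup": 3.49, "mustard": 2.99}))  # 3.49
```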
While details are not provided, Amazon's patent filings show that the cameras used in Amazon Go may include RGB cameras, depth-sensing cameras, and infrared sensors. The filings also contain details suggesting that entering the store may involve more than simply scanning the app. They note that upon detecting a user entering and/or passing through a transition area, the user is identified, and that various techniques may be used to do so. These include a camera that captures an image that is processed using facial recognition, and, "in some implementations, one or more input devices may collect data that is used to identify when the user enters the materials handling facility."
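As a rough illustration of the identification step the filing describes, the sketch below matches a face embedding captured at the transition area against embeddings enrolled with shopper accounts. The function, threshold, and embeddings are hypothetical; the filing does not specify how the matching is done.

```python
import numpy as np

def identify_user(entry_embedding: np.ndarray,
                  enrolled: dict[str, np.ndarray],
                  threshold: float = 0.8) -> str | None:
    """Return the account whose enrolled face embedding best matches the
    embedding captured at the entrance, or None if no match is confident.
    Threshold and embeddings are illustrative assumptions."""
    best_account, best_score = None, -1.0
    for account_id, emb in enrolled.items():
        # Cosine similarity between the captured and enrolled embeddings.
        score = float(np.dot(entry_embedding, emb) /
                      (np.linalg.norm(entry_embedding) * np.linalg.norm(emb)))
        if score > best_score:
            best_account, best_score = account_id, score
    return best_account if best_score >= threshold else None
```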
In addition, the position of the user may also be monitored as the user moves about the "materials handling facility," which is how the store is referred to in the filings. When the user enters or passes through a transition area, the user's identity is known because both the position and the identity of the user have been maintained.
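A minimal sketch of that idea, assuming a hypothetical tracker that resolves a shopper's identity once, at the transition area, and then simply carries it along with the tracked position:

```python
from dataclasses import dataclass

@dataclass
class TrackedShopper:
    track_id: int
    identity: str | None = None              # account ID, once established
    position: tuple[float, float] = (0.0, 0.0)

class FacilityTracker:
    """Hypothetical tracker: identity is resolved once, at the transition
    area, and thereafter follows the tracked position through the store."""

    def __init__(self, transition_area):
        self.transition_area = transition_area   # (x0, y0, x1, y1)
        self.shoppers: dict[int, TrackedShopper] = {}

    def update(self, track_id: int, position, identify=None):
        shopper = self.shoppers.setdefault(track_id, TrackedShopper(track_id))
        shopper.position = position
        x0, y0, x1, y1 = self.transition_area
        in_transition = x0 <= position[0] <= x1 and y0 <= position[1] <= y1
        if shopper.identity is None and in_transition and identify is not None:
            # Identify (e.g. via the app or facial recognition) only when the
            # shopper enters or passes through the transition area.
            shopper.identity = identify(track_id)
        return shopper
```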
The filings also show that, in order for the system to "automatically detect" when items are taken from a shelf, cameras are used to "capture a series of images of a user's hand before the hand crosses a plane into the inventory location and also capture a series of images of the user's hand after it exits the inventory location." Based on a comparison of the images, it can be determined whether the user picked an item up or placed an item back down. Software is also used to identify whether the user is holding an item in their hand before the hand crosses into the inventory location but is not holding an item when the hand is removed from the inventory location.
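The logic described amounts to a before-and-after comparison at the shelf plane. Here is a rough sketch, assuming a hypothetical hand_holds_item() classifier that decides from a single image whether the hand contains an item (the filings do not describe the underlying model):

```python
def classify_shelf_event(images_before, images_after, hand_holds_item) -> str:
    """Infer a pick or place from images of the hand captured before it
    crosses the plane into the inventory location and after it exits.
    `hand_holds_item` is a hypothetical model returning True/False per image."""
    # Majority vote over each image series to smooth out noisy frames.
    holding_before = sum(map(hand_holds_item, images_before)) > len(images_before) / 2
    holding_after = sum(map(hand_holds_item, images_after)) > len(images_after) / 2

    if not holding_before and holding_after:
        return "pick"      # empty hand went in, item came out
    if holding_before and not holding_after:
        return "place"     # item went in, empty hand came out
    return "no_change"
```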
A shopper’s past purchase history may also be used to help identify an item when it is picked up, according to the filings:
"For example, if the inventory management system cannot determine if the picked item is a bottle of ketchup or a bottle of mustard, the inventory management system may consider past purchase history and/or what items the user has already picked from other inventory locations. For example, if the user historically has only picked/purchased ketchup, that information may be used to confirm that the user has likely picked ketchup from the inventory location."
The filings also say that, in some cases, data from other input devices may be used to assist in determining the identity of items picked from or placed in inventory locations.
"For example, it is determined that an item is placed into an inventory location, in addition to image analysis, a weight of the item may be determined based on data received from a scale, pressure sensor, load cell, etc., located at the inventory location. The image analysis may be able to reduce the list of potentially matching items down to a small list. The weight of the placed item may be compared to a stored weight for each of the potentially matching items to identify the item that was actually placed in the inventory location. By combining multiple inputs, a higher confidence score can be generated increasing the probability that the identified item matches the item actually picked from the inventory location and/or placed at the inventory location. In another example, one or more radio frequency identifier ("RFID") readers may collect or detect an RFID tag identifier associated with a RFID tag included in the item."
Although the patent applications were filed in September 2014, what the company has shown so far does appear similar to what is described in the filings, GeekWire noted.
As of now, Amazon Go is only open to Amazon employees, but customers can sign up via the Amazon site to be notified when the store opens to the public.
It is worth noting that those who are already wary of things like facial recognition and the image analysis techniques used on Facebook and the like may not like the concept of this many cameras tracking and following them through the store as they shop. I could see how this might cause a slight feeling of unease for some, but personally, I'd love to try it out and I am curious to see if this trend takes off.
About the Author
James Carroll
Former VSD Editor James Carroll joined the team in 2013. Carroll covered machine vision and imaging from numerous angles, including application stories, industry news, market updates, and new products. In addition to writing and editing articles, Carroll managed the Innovators Awards program and webcasts.