Academic researchers developing new activity recognition algorithm

Hamed Pirsiavash, a postdoc at MIT, and his former thesis advisor, Deva Ramanan of the University of California at Irvine, have developed a new activity recognition algorithm that uses techniques from natural language processing to let computers search video for actions more efficiently.

James Carroll

May 14, 2014

While previous algorithms that perform similar tasks have been developed, the new algorithm reportedly has a number of advantages over its predecessors. According to the MIT news release, these include:

Execution time. The new algorithm’s execution time scales linearly with the size of the video file it’s searching, meaning that if one file is 10 times larger than another, the algorithm will take 10 times as long to search it, not 1,000 times longer, as with earlier algorithms.
Predicting actions. The algorithm is able to see a partially completed action and issue a probability that the action is of the type that it is looking for. It may revise this probability as the video continues, but does not have to wait until the action is complete to assess it.
Fixed memory. Regardless of how many frames of video the algorithm has reviewed, the amount of memory it requires is fixed, meaning that, unlike many of its predecessors, it can handle video streams of any length or size.
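
The three properties above (linear time, early probability estimates, and fixed memory) are characteristic of streaming algorithms. As a rough illustration only, and not the researchers' actual method, the Python sketch below runs a simple forward filter over a left-to-right chain of subaction states. It assumes a hypothetical per-frame classifier that supplies one likelihood per subaction; the function name `stream_progress` and the `stay` parameter are invented for this example. Each frame is visited once, the only state kept between frames is one belief vector, and an updated estimate is available after every frame.

```python
from typing import Iterable, Iterator, List


def stream_progress(
    frame_likelihoods: Iterable[List[float]], stay: float = 0.5
) -> Iterator[List[float]]:
    """Yield a belief over subaction states after each frame.

    Assumptions (illustrative, not from the article):
    - subactions occur in a fixed left-to-right order;
    - at each frame the action either stays in its current subaction
      (probability `stay`) or advances to the next one;
    - `frame_likelihoods` gives one likelihood per subaction per frame.
    """
    belief = None  # the only state carried across frames: one short vector
    for lik in frame_likelihoods:
        k = len(lik)
        if belief is None:
            # Before any evidence, assume the action begins in the first subaction.
            predicted = [1.0] + [0.0] * (k - 1)
        else:
            predicted = [0.0] * k
            for i, b in enumerate(belief):
                if i + 1 < k:
                    predicted[i] += stay * b
                    predicted[i + 1] += (1.0 - stay) * b
                else:
                    predicted[i] += b  # the final subaction is absorbing
        # Weight the prediction by this frame's evidence and renormalize.
        unnorm = [p * l for p, l in zip(predicted, lik)]
        total = sum(unnorm)
        if total == 0.0:
            raise ValueError("frame is impossible under the current belief")
        belief = [u / total for u in unnorm]
        yield belief


# Three frames whose evidence favors subactions 0, 1, and 2 in turn.
frames = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1], [0.05, 0.05, 0.9]]
for t, b in enumerate(stream_progress(frames)):
    print(t, [round(x, 3) for x in b])
```

The last entry of each belief vector can be read as a running estimate that the action has reached its final subaction, which is available while the video is still playing. Doubling the number of frames doubles the work, and memory use does not grow with video length.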

Pirsiavash and Ramanan’s algorithm borrows from a type of algorithm used in natural language processing, a field of computer science concerned with the interactions between computers and human (natural) languages. In the MIT news release, Pirsiavash explains how the natural language processing approach applies to activity prediction.

"One of the challenging problems they try to solve is, if you have a sentence, you want to basically parse the sentence, saying what is the subject, what is the verb, what is the adverb," Pirsiavash said. "We see an analogy here, which is, if you have a complex action — like making tea or making coffee — that has some subactions, we can basically stitch together these subactions and look at each one as something like verb, adjective, and adverb."
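
To make the analogy concrete, the toy Python sketch below treats a complex action as a small grammar whose rules expand into an ordered sequence of subactions, then checks whether a stream of per-frame labels fits that sequence. It is a simplified illustration under stated assumptions, not the algorithm described in the release: the grammar, the subaction names, and the helper functions are all hypothetical, and the per-frame labels are assumed to come from some upstream classifier.

```python
from itertools import groupby
from typing import Dict, List, Sequence

# Hypothetical grammar: each rule rewrites an action into ordered subactions.
GRAMMAR: Dict[str, List[str]] = {
    "make_tea": ["boil_water", "steep", "pour"],
    "boil_water": ["fill_kettle", "heat_kettle"],
}


def expand(symbol: str, grammar: Dict[str, List[str]]) -> List[str]:
    """Recursively rewrite a symbol into its sequence of primitive subactions."""
    if symbol not in grammar:
        return [symbol]  # a primitive subaction, analogous to a word
    out: List[str] = []
    for child in grammar[symbol]:
        out.extend(expand(child, grammar))
    return out


def segments(frame_labels: Sequence[str]) -> List[str]:
    """Collapse runs of identical per-frame labels into one segment each."""
    return [label for label, _ in groupby(frame_labels)]


def is_complete(frame_labels: Sequence[str], action: str) -> bool:
    """True if the labels form the full subaction sequence for `action`."""
    return segments(frame_labels) == expand(action, GRAMMAR)


def is_partial(frame_labels: Sequence[str], action: str) -> bool:
    """True if the labels seen so far are a valid beginning of `action`."""
    seen = segments(frame_labels)
    return seen == expand(action, GRAMMAR)[: len(seen)]


labels = ["fill_kettle", "fill_kettle", "heat_kettle", "steep", "steep", "pour"]
print(is_complete(labels, "make_tea"))
print(is_partial(labels[:3], "make_tea"))
```

A real system would score noisy, overlapping candidate segments rather than match exact labels, but the structure is the same: subactions play the role of words, and the grammar says which orderings make up a valid action.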

About the Author

James Carroll

Former VSD Editor James Carroll joined the team in 2013. Carroll covered machine vision and imaging from numerous angles, including application stories, industry news, market updates, and new products. In addition to writing and editing articles, Carroll managed the Innovators Awards program and webcasts.
