Amazon: Face and object recognition in videos via Deep Learning for everyone

[11:03 Mon, 4 December 2017 by Thomas Richter]

Amazon has extended its image analysis service Rekognition with deep-learning-based object recognition in videos. The video analysis can track people and identify activities, objects, celebrities, and inappropriate content, both in real time in live video streams and in videos stored in Amazon's S3 cloud.

Faces can be matched against collections of several tens of millions of faces. Facial expression analysis can provide additional information, such as whether the eyes are open, whether someone is smiling, or which emotion is being expressed. The exact position of faces in individual frames can be output via the API and used, for example, to track and edit individual faces throughout a video, such as applying a special effect filter to them.
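As a rough illustration of how such face positions could drive an effect filter: the service reports each face as a bounding box whose values are ratios of the frame size, so they have to be scaled to pixels first. The sketch below assumes a box dictionary with `Left`, `Top`, `Width` and `Height` keys in that ratio form; the function name and the sample numbers are made up for illustration.

```python
def box_to_pixels(box, frame_width, frame_height):
    """Scale a ratio-based bounding box to integer pixel coordinates.

    Returns (left, top, right, bottom), clamped to the frame, because
    detected boxes can extend slightly past the frame edges.
    """
    left = round(box["Left"] * frame_width)
    top = round(box["Top"] * frame_height)
    right = round((box["Left"] + box["Width"]) * frame_width)
    bottom = round((box["Top"] + box["Height"]) * frame_height)

    def clamp(value, upper):
        return max(0, min(value, upper))

    return (clamp(left, frame_width), clamp(top, frame_height),
            clamp(right, frame_width), clamp(bottom, frame_height))


# Hypothetical face box in a 1920x1080 frame
face = {"Left": 0.25, "Top": 0.5, "Width": 0.125, "Height": 0.25}
print(box_to_pixels(face, 1920, 1080))
```

The resulting pixel rectangle could then be handed to whatever image library applies the actual blur or effect to that region of the frame.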

Video Analysis

Identified persons and faces, objects (such as vehicles, animals, or a beach), and complex activities (such as weddings, sports, dancing, a package being delivered, a candle being blown out, a fire being extinguished, or a man walking to a car) are marked in scenes with specific labels. For celebrities, there is a dedicated recognition function that even provides additional information such as the respective IMDb profile. This makes it possible, for example, to automatically generate detailed search indexes for entire media archives, or to build and offer Rekognition-based services for online videos.

Anyone with an AWS (Amazon Web Services) account can use the new feature. The prerequisite is that the videos are in H.264 format and have been uploaded to Amazon's S3 cloud. Single images in the usual formats can of course also be processed.
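A minimal sketch of how such an analysis might be started from Python for a video that is already stored in S3. It assumes the boto3 SDK is installed and AWS credentials are configured; the bucket name, file name and the helper function names are placeholders, not values from the article.

```python
def label_detection_request(bucket, key, min_confidence=50.0):
    """Build the arguments for a video label detection job on an S3 object."""
    return {
        "Video": {"S3Object": {"Bucket": bucket, "Name": key}},
        "MinConfidence": min_confidence,
    }


def start_label_detection(bucket, key):
    """Start an asynchronous analysis job and return its job ID."""
    import boto3  # imported here so the request builder works without the SDK

    client = boto3.client("rekognition")
    response = client.start_label_detection(**label_detection_request(bucket, key))
    return response["JobId"]


# Placeholder bucket and file name; the video must be H.264-encoded.
request = label_detection_request("example-bucket", "clips/music-video.mp4")
print(request)
```

The job runs asynchronously, so the returned job ID would be used later to fetch the results, for instance with the client's `get_label_detection` call.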

Here is a music video showing how object and face recognition works over the net. The detected objects are output together with the timestamp as JSON data for further processing.
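To give an idea of what "further processing" might look like, the sketch below parses a small, hand-written JSON sample and builds a simple index from label name to the timestamps at which it was seen. The sample mimics the shape of a label detection result (a `Labels` list whose entries carry a `Timestamp` in milliseconds and a `Label` with `Name` and `Confidence`); the values themselves are invented, and the confidence threshold is an arbitrary choice.

```python
import json
from collections import defaultdict

# Invented sample data in the assumed result shape
SAMPLE_JSON = """
{
  "Labels": [
    {"Timestamp": 0,    "Label": {"Name": "Person", "Confidence": 98.1}},
    {"Timestamp": 0,    "Label": {"Name": "Guitar", "Confidence": 61.4}},
    {"Timestamp": 500,  "Label": {"Name": "Person", "Confidence": 97.3}},
    {"Timestamp": 1000, "Label": {"Name": "Car",    "Confidence": 88.0}}
  ]
}
"""


def index_labels(result, min_confidence=80.0):
    """Map each label name to the sorted timestamps (ms) where it appears."""
    index = defaultdict(list)
    for entry in result["Labels"]:
        label = entry["Label"]
        # Skip detections the service itself was not very sure about
        if label["Confidence"] >= min_confidence:
            index[label["Name"]].append(entry["Timestamp"])
    return {name: sorted(times) for name, times in index.items()}


index = index_labels(json.loads(SAMPLE_JSON))
for name, times in sorted(index.items()):
    print(name, times)
```

An index of this kind, built per video and merged across an archive, is one simple way the search indexes mentioned above could be approached.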

Object detection
