Detecção de objetos em vídeos usando misturas de modelos baseados em partes deformáveis obtidas de um conjunto de imagens

Leissi Margarita Castaneda Leon

Full text
Author(s):	Leissi Margarita Castaneda Leon Total Authors: 1
Document type:	Master's Dissertation
Press:	São Paulo.
Institution:	Universidade de São Paulo (USP). Instituto de Matemática e Estatística (IME/SBI)
Defense date:	2012-10-23
Examining board members:	Roberto Hirata Junior; Carlos Hitoshi Morimoto; Eduardo Alves do Valle Junior
Advisor:	Roberto Hirata Junior
Abstract
The problem of detecting objects that belong to a specific class of objects, in videos is a widely studied activity due to its potential applications. For example, for videos that have been taken from a stationary camera, we can mention applications such as security and traffic surveillance; when the video have been taken from a dynamic camera, a possible application is autonomous driving. The literature, presents several different approaches to treat indiscriminately with each of the cases mentioned, and only consider images obtained from a stationary or dynamic camera to train the detectors. These approaches can lead to poor performaces when the tecniques are used in sequences of images from different types of camera. The state of the art in the detection of objects that belong to a specific class shows a tendency to the use of histograms, supervised training and basically follows the structure: object class model construction, detection of candidates in the image/frame, and application of a distance measure to those candidates. Another disadvantage is that some approaches use several models for each point of view of the car, generating a lot of models and, in some cases, one classifier for each point of view. In this work, we approach the problem of object detection, using a model of the object class created with a dataset of static images and we use the model to detect objects in videos (sequence of images) that were collected from static and dynamic cameras, i.e., in a totally different setting than used for training. The creation of the model is done by an off-line learning phase, using an image database of cars in several points of view, PASCAL 2007. The model is based on a mixture of deformable part models (MDPM), originally proposed by Felzenszwalb et al. (2010b) for detection in static images. We do not limit the model for any specific viewpoint. A set of experiments was elaborated to explore the best number of components of the integration, as well as the number of parts of the model. In addition, we performed a comparative study of symmetric and asymmetric MDPMs. We evaluated the proposed method to detect people and cars in videos obtained by a static or a dynamic camera. Our results not only show good performance of MDPM and better results than the state of the art approches in object detection on videos obtained from a stationary, or dynamic, camera, but also show the best number of components of the integration and parts or the created object. Finally, results show differences between symmetric and asymmetric MDPMs in the detection of objects in different videos. (AU)

FAPESP's process:	10/04914-9 - Object retrieval in videos based on models obtained from a set of images
Grantee:	Leissi Margarita Castañeda Leon
Support Opportunities:	Scholarships in Brazil - Master

Short URL