Funding entity: Ministerio de Ciencia e Innovación
Principal Investigators: Pedro Miguel Núñez Trujillo and Jorge Manuel Miranda Dias
Type: Technological Development
Key reference:
Year: 2011

Active Perception for Scene Understanding and Behaviour Analysis (APSUBA): An Application for Social Robotics

The aim of this project is to develop an active perception system of the environment, acting as an agent inside a heterogeneous sensor network, for scene understanding and behaviour analysis. The proposed perception system is composed of two different mechanisms: an active visual perception system and a metric perception system, both acting inside a network of heterogeneous sensors such as a 3D laser scanner, an Inertial Measurement Unit (IMU) and a camera. Both systems will be calibrated.

Sensor fusion allows the regions of interest to be detected using active vision (e.g. changes in the scene, human-robot interaction, human/robot motion, etc.). Metric information will be used for the later segmentation and 3D modelling stages. Moreover, data fusion will be applied to both heterogeneous sensor calibration and perception.

Finally, the models will be used for scene recognition and behaviour understanding. To obtain the relevant elements in the scene, a perception-based grouping process will be employed, performed by a hierarchical irregular pyramid. Using the information given by the visual mechanism, the metric perception system will provide 3D information about the sector of interest by means of a multi-layer homography-based reconstruction approach. Segmentation of large datasets will be achieved using clusters provided by Gaussian Mixture Models (GMM). These segments will be modelled using high-level geometric features (superquadric surfaces), which will feed the last stage of the system: scene understanding and behaviour analysis using Bayesian rules.
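As an illustration of the Bayesian reasoning mentioned for the last stage, the following minimal sketch applies Bayes' rule to update the belief over two behaviours after one observation. The behaviour labels, the observation and all probabilities are hypothetical values chosen for the example; they are not taken from the project.

```python
def bayes_update(prior, likelihood):
    """Return P(behaviour | observation) from P(behaviour) and P(observation | behaviour)."""
    # Numerator of Bayes' rule for each behaviour.
    unnormalised = {b: prior[b] * likelihood[b] for b in prior}
    # Evidence P(observation): sum of the numerators over all behaviours.
    evidence = sum(unnormalised.values())
    if evidence == 0:
        raise ValueError("observation has zero probability under every behaviour")
    return {b: p / evidence for b, p in unnormalised.items()}


# Hypothetical behaviour classes with a uniform prior.
prior = {"approaching": 0.5, "leaving": 0.5}
# Hypothetical likelihood of observing "distance to robot decreasing".
likelihood = {"approaching": 0.8, "leaving": 0.2}

posterior = bayes_update(prior, likelihood)
print(posterior)
```

Feeding the posterior back in as the prior for the next observation gives a simple recursive estimate over time, which is one way the temporal tracking of the geometric models could be turned into a behaviour hypothesis.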

The proposed project will contribute to topics such as mobile-structure sensor networks, heterogeneous sensor calibration, localization, scene recognition, 3D reconstruction, active perception, sensor fusion and human behaviour understanding. The results of this project are also of interest to other research fields (e.g. smart environments).

A social robot is an autonomous robot capable of interacting and communicating with humans or other autonomous physical agents by following social behaviours and rules. Perceiving and understanding scenes and human behaviours is a key issue for these robots, as they are involved in human-robot interaction processes and/or carry out their activities in dynamic scenarios. The aim of this proposal is to develop a multimodal perception system for social robotics, based on the fusion of two different perceptual mechanisms: an active visual perception system, which will be used to determine the scene regions likely to be relevant for the robot (e.g. human faces, novelty or changes in the robot's working area, etc.), and a metric perception system, a 3-dimensional (3D) laser, which will be employed to acquire a dense map of the perceived surface of the Region of Interest (ROI).

Besides using data from mobile sensors, the idea is to use a heterogeneous structure sensor network to perform a homography-based multi-layer 3D reconstruction, in order to obtain more information about the objects reported as "of interest" by the active visual perception system. This will provide full 3D information rather than 2½D, which will be useful in the segmentation and shape retrieval stages. Moreover, the structure sensor network will be used synergistically with the odometry sensors to localize the robot within the scene.
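The basic operation behind any homography-based reconstruction is mapping a 2D point through a 3x3 homography matrix in homogeneous coordinates. The sketch below shows only that step, using plain Python lists; the function name and the matrix values are made up for illustration and do not describe the project's actual multi-layer method.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography H (list of three rows)."""
    x, y = point
    # Multiply H by the homogeneous point (x, y, 1).
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to 2D.
    return (xh / w, yh / w)


# Hypothetical homography with a perspective term in the last row.
H_layer = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 1.0],
]

mapped = apply_homography(H_layer, (2.0, 4.0))
print(mapped)
```

In a multi-layer setting one would typically hold one such matrix per layer (for example, per parallel plane) and apply the same function with each of them; that arrangement is an assumption here, not something stated in the proposal.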

This dense map will be used for segmenting and retrieving the shape of the ROI. At the segmentation stage, the mathematical space of the Gaussian Mixture Model (GMM) will be studied to provide a feature space that enables subsequent data compression and effective processing. In turn, high-level geometric models, such as superquadric surfaces, will be used to retrieve the shape of the segmented regions. From the temporal tracking of these high-level geometric models, the last stage will extract information about the scene and/or human behaviour. An overview of the proposal is shown in Figure 1.

Figure 1. Overview of the proposal
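To make the superquadric models mentioned above more concrete, the following sketch evaluates the standard inside-outside function of a superquadric centred at the origin and aligned with the axes. The function and parameter names are chosen for this example, and the numeric values are arbitrary; fitting the parameters to real range data is outside the scope of the sketch.

```python
def superquadric_inside_outside(point, scale, eps):
    """Evaluate the superquadric inside-outside function F at a 3D point.

    scale = (a1, a2, a3) are the semi-axis lengths.
    eps = (e1, e2) are the shape exponents (e1 = e2 = 1 gives an ellipsoid).
    F < 1: point inside, F == 1: on the surface, F > 1: outside.
    """
    x, y, z = point
    a1, a2, a3 = scale
    e1, e2 = eps
    # Absolute values keep the fractional powers real for negative coordinates.
    xy_term = (abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
    z_term = abs(z / a3) ** (2.0 / e1)
    return xy_term + z_term


# Arbitrary ellipsoid with semi-axes 1, 2 and 3.
scale = (1.0, 2.0, 3.0)
eps = (1.0, 1.0)

f_centre = superquadric_inside_outside((0.0, 0.0, 0.0), scale, eps)
f_surface = superquadric_inside_outside((1.0, 0.0, 0.0), scale, eps)
f_outside = superquadric_inside_outside((2.0, 0.0, 0.0), scale, eps)
print(f_centre, f_surface, f_outside)
```

A shape retrieval step would usually search for the scale and exponent values that bring F close to 1 for the points of a segmented region, so a compact set of parameters can then be tracked over time.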

The aim of this project is to establish a forum where the research groups of the Mobile Robotic Laboratory (MRL) of the Instituto de Sistemas e Robotica (ISR, University of Coimbra) and the Robotic and Artificial Vision group (Robolab, University of Extremadura) can exchange ideas and work together a) to develop a multimodal perception module which fuses two different perceptual sources mounted on the robot, vision and range, with the information given by a sensor network (composed of IMUs and cameras), and b) to integrate this module into a scene and behaviour understanding system. The Robolab group has experience in the development of active visual perception systems for mobile robots, acquired under the project ACROSS (TSI-020301-2009-27), funded by the Spanish Government, and the regional projects PRI09A037 and PDT09A059.

The MRL group, in turn, has broad experience in social robotics, smart environments and human behaviour understanding. It is working on related projects funded by the Portuguese Government or by the European Union (IRPS, PROMETHEUS, etc.). Both research teams are therefore very interested in generating in-depth knowledge about fusing different sensors, which will be applied to a novel multimodal perception system. This system will be used for scene recognition and behaviour analysis in the context of social robotics. Finally, for the purposes of this project, we rely on the collaboration of some members of the Ingeniería de Sistemas Integrados group (ISIS, University of Málaga), who have extensive experience in robot navigation based on laser sensors and active perception.