Visual Perception in Robotics

Detection, segmentation, tracking, identification of humans and/or objects


Visual Perception for Robotics

The aim is to detect, segment, track and/or recognize structures of interest (humans, objects, plants, etc.) in images or video sequences acquired in robotic contexts. Such contexts involve uncontrolled environments and perceptual conditions, uncertainty, real-time processing requirements, reproducibility of results and, in some cases, embeddability. The underlying techniques draw on computer vision, machine learning, image processing, 3D perception, etc.


Detection of rare events by vision

Visual inspection of objects

The application is the detection of defects on aircraft fuselages in high-resolution images captured by drones. The defining feature of the problem is that the classes are highly imbalanced (e.g., lightning strikes vs. paint defects on rivets or screws).

  • The first step extracts salient regions of interest.
  • Classification is then performed within these regions using various computer vision and machine learning techniques, which yield progressively higher accuracy scores.
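One common way to cope with highly imbalanced classes in the classification step is to weight each class inversely to its frequency. The sketch below is only an illustration of that general idea, not the method used in the thesis, which the text above does not specify. The label names and counts are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, proportional to 1 / class frequency.

    Uses the usual "balanced" heuristic n_samples / (n_classes * count),
    so a perfectly balanced dataset gives a weight of 1.0 to every class.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical label distribution, for illustration only:
# many paint defects, very few lightning strikes.
labels = ["paint_defect"] * 95 + ["lightning_strike"] * 5
weights = inverse_frequency_weights(labels)

for cls, w in sorted(weights.items()):
    print(f"{cls}: {w:.3f}")
```

Such weights can then be passed to a weighted loss or a weighted sampler, so that errors on the rare class count more during training.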

Doctoral thesis of Julien Miranda (CIFRE with the company DONECLE). Contact: Ariane Herbulot.




Detection and tracking of patterns and obstacles in outdoor environments

Multispectral vision

During the taxi phase, aircraft pilots have only a limited view of their surroundings, and adverse weather conditions make their task harder. The objective is to detect and track lines or obstacles in video sequences delivered by on-board visible or infrared sensors.

DGA TOUCANS project, in collaboration with the companies AIRBUS and OKTAL-SE. Contact: Ariane Herbulot.







Fast (data-driven) 2D tracking-by-detection and Person counting -- Multi-target tracking

People tracking and counting under embeddability constraints

The objective is to detect, count, and track humans in public transport from azimuthal cameras, under embeddability constraints. An evaluation of several detectors based on deep neural networks (taken from the literature or designed ad hoc) makes it possible to select a good trade-off between precision, frame rate and embeddability. Multi-target tracking, based on Siamese convolutional networks, is evaluated on the MOT17 challenge. The video rate is 25, 15 or 10 frames per second depending on whether 1, 2 or 3 cameras are used simultaneously.
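The trade-off between precision, frame rate and embeddability can be illustrated as a multi-criteria selection. The sketch below keeps only the non-dominated (Pareto-optimal) candidates and then applies a frame-rate requirement. It is a minimal illustration, not the evaluation protocol of the thesis: the detector names, the figures, and the use of model size as a proxy for embeddability are all assumptions.

```python
from typing import NamedTuple


class Detector(NamedTuple):
    name: str
    precision: float  # higher is better
    fps: float        # higher is better
    size_mb: float    # lower is better (assumed proxy for embeddability)


def dominates(a, b):
    """True if a is at least as good as b on every criterion and strictly better on one."""
    at_least = a.precision >= b.precision and a.fps >= b.fps and a.size_mb <= b.size_mb
    strictly = a.precision > b.precision or a.fps > b.fps or a.size_mb < b.size_mb
    return at_least and strictly


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical candidates and figures, for illustration only.
candidates = [
    Detector("A", 0.92, 8.0, 240.0),
    Detector("B", 0.88, 27.0, 35.0),
    Detector("C", 0.85, 20.0, 60.0),  # worse than B on all three criteria
    Detector("D", 0.80, 45.0, 12.0),
]

front = pareto_front(candidates)

# Example requirement: sustain 25 frames per second with a single camera.
shortlist = [d for d in front if d.fps >= 25.0]
print([d.name for d in shortlist])
```

The final choice among the shortlisted candidates remains application-dependent, for example favouring precision when the hardware budget allows it.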

Doctoral thesis of Claire Labit-Bonis (CIFRE with the company ACTIA). Contact: Frédéric Lerasle.





Combination of Operational Research and Computer Vision for person re-identification in camera networks

People re-identification in camera networks

This work, carried out in collaboration with the ROC team of LAAS-CNRS, builds on earlier work (with CEA-List, ROC...). The objective is to address large-scale tracking and re-identification problems by combining operational research methods (graphs, discrete optimization) with computer vision (machine learning, video analysis).
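As a toy illustration of how discrete optimization and appearance-based vision can be combined, re-identification between two cameras can be posed as a minimum-cost one-to-one assignment, where the cost is an appearance distance between descriptors. This is a sketch under stated assumptions, not the formulation used in this work: the descriptors are hypothetical, cosine distance is one possible choice of cost, and the exhaustive search only suits very small instances (a dedicated assignment solver would be needed at scale).

```python
from itertools import permutations
from math import sqrt


def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def best_assignment(cost):
    """Exhaustive minimum-cost one-to-one assignment for a small square cost matrix.

    Returns a tuple p where row i is matched to column p[i].
    """
    n = len(cost)
    return min(
        permutations(range(n)),
        key=lambda p: sum(cost[i][p[i]] for i in range(n)),
    )


# Hypothetical appearance descriptors of three people seen by two cameras.
cam_a = [[1.0, 0.0, 0.1], [0.0, 1.0, 0.2], [0.5, 0.5, 1.0]]
cam_b = [[0.1, 0.9, 0.2], [0.4, 0.6, 1.1], [0.9, 0.1, 0.0]]

cost = [[cosine_distance(u, v) for v in cam_b] for u in cam_a]
match = best_assignment(cost)

for i, j in enumerate(match):
    print(f"camera A person {i} -> camera B person {j} (cost {cost[i][j]:.3f})")
```

Solving the assignment jointly, rather than matching each person to their nearest neighbour independently, prevents two people from being mapped to the same identity.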

Doctoral thesis of Cyrille Equoy (funded by Toulouse III Paul Sabatier University via the EDSYS Doctoral School; co-supervised with the ROC team of LAAS-CNRS). Contact: Frédéric Lerasle.