Convex optimization for cosegmentation (Optimisation convexe pour la cosegmentation)

People and most animals have a natural ability to see the world and understand it effortlessly. The apparent simplicity of this task suggests that this ability is, to some extent, mechanical, i.e., that it does not require high-level thinking or profound reasoning. It follows that visual perception of the world should be reproducible on a mechanical device such as a computer. Computer vision is the field of research dedicated to creating a form of visual perception on computers. The first work on computer vision dates from the 1950s, but the computing power needed to process and analyze visual data was not available at that time. Only recently have improvements in computing power and storage capacity allowed this field to really emerge. On the one hand, constant progress in computer vision has made it possible to develop dedicated solutions to practical or industrial problems. Detecting human faces, tracking people in crowded areas, and detecting defects on production lines are industrial applications where computer vision is used. On the other hand, when it comes to creating a general visual perception for computers, it is probably fair to say that less progress has been made, and the community is still struggling with fundamental problems. One of these problems is to reproduce our ability to group the visual input data recorded by an optical device into meaningful regions. This procedure, called segmentation, separates a scene into meaningful entities (e.g., objects or actions). Segmentation seems not only natural but essential for people to fully understand a given scene, yet it is still very challenging for a computer. One reason is the difficulty of clearly identifying what ``meaningful'' should mean: depending on the scene or the situation, a region may have different interpretations.
In this thesis, we focus on the segmentation task and try to avoid this fundamental difficulty by treating segmentation as a weakly supervised learning problem. Instead of segmenting images according to some predefined definition of ``meaningful'' regions, we develop methods to segment multiple images jointly into entities that appear repeatedly across the set of images. In other words, we define ``meaningful'' regions from a statistical point of view: they are regions that appear frequently in a dataset, and we design procedures to discover them. This leads us to design models whose scope goes beyond this application to vision. Our approach is rooted in the field of machine learning, whose goal is to design efficient methods to retrieve and/or learn common patterns in data. The field of machine learning has also gained popularity in recent decades, owing to improvements in computing power and the ever-growing size of available databases. In this thesis, we focus on methods tailored to retrieving hidden information from poorly annotated data, i.e., data with incomplete or partial annotations. In particular, given a specific segmentation task defined by a set of images, we aim at segmenting the images and learning a related model so as to segment unannotated images. Finally, our research leads us to explore the field of numerical optimization so as to design algorithms tailored to our problems. In particular, many numerical problems considered in this thesis cannot be solved by off-the-shelf software because of the complexity of their formulation. We use and adapt recently developed tools to approximate these problems by solvable ones. We illustrate the promise of our formulations and algorithms on other general applications in fields besides computer vision. In particular, we show that our work may also be used in text classification and in the discovery of cell configurations.

Additional Info

Field Value
Source https://theses.hal.science/tel-00826236
Author Joulin, Armand
Maintainer CCSD
Last Updated May 11, 2026, 00:27 (UTC)
Created May 11, 2026, 00:27 (UTC)
Identifier NNT: 2012DENS0086
Language en
Rights https://about.hal.science/hal-authorisation-v1/
contributor Laboratoire d'informatique de l'école normale supérieure (LIENS) ; Département d'informatique - ENS-PSL (DI-ENS) ; École normale supérieure - Paris (ENS-PSL) ; Université Paris Sciences et Lettres (PSL)-Université Paris Sciences et Lettres (PSL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-École normale supérieure - Paris (ENS-PSL) ; Université Paris Sciences et Lettres (PSL)-Université Paris Sciences et Lettres (PSL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)
creator Joulin, Armand
date 2012-12-17T00:00:00
harvest_object_id ea5ed360-2906-4649-ad82-f7e30e574637
harvest_source_id 3374d638-d20b-4672-ba96-a23232d55657
harvest_source_title test moissonnage SELUNE
metadata_modified 2026-03-31T00:00:00
set_spec type:THESE