Representation of documents combining text and image: application to categorization and multimedia information retrieval

Exploiting multimedia documents raises the problem of how to represent the textual and visual information they contain. Our goal is to propose a model that represents both kinds of information and combines them for two tasks: categorization and information retrieval. This model represents documents as bags of words, which requires defining suitable vocabularies. The textual vocabulary, usually very large, corresponds to the words of the documents, while the visual vocabulary is created by extracting low-level features from images. We study the different steps of its creation and the application to visual words in images of the tf.idf weighting usually used for textual words. In the context of text categorization, we introduce a criterion to select the most discriminative words for each category in order to reduce the vocabulary size without degrading classification results. We also present, in the multilabel context, a method for selecting the number of categories that must be associated with a document. In multimedia information retrieval, we propose an analytical approach based on machine learning techniques to linearly combine the results obtained from textual and visual information, which significantly improves retrieval results. Our model has shown its efficiency on several large collections and was evaluated in several international competitions such as XML Mining and ImageCLEF.
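
As an illustration of the two ideas named in the abstract (tf.idf weighting applied to bags of textual or visual words, and a linear combination of textual and visual scores), here is a minimal sketch. It is not the thesis implementation. It assumes each document has already been reduced to a list of word identifiers (for images, visual word identifiers obtained from low-level features). The function names `tf_idf`, `cosine` and `fuse` and the parameter `alpha` are hypothetical. The abstract states that the combination is obtained with machine learning techniques, whereas this sketch simply takes `alpha` as a given value.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {word: weight} dict per document.

    docs is a list of lists of word identifiers; the same code applies to
    textual words and to visual words (assumption: images are already
    mapped to visual word identifiers).
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each word.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append(
            {w: (c / total) * math.log(n_docs / df[w]) for w, c in counts.items()}
        )
    return weights


def cosine(a, b):
    """Cosine similarity between two sparse {word: weight} vectors."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fuse(text_score, visual_score, alpha):
    """Linear combination of a textual and a visual score.

    alpha is a hypothetical weight in [0, 1], taken as given here.
    """
    return alpha * text_score + (1.0 - alpha) * visual_score


# Toy collection of two documents described by word identifiers.
docs = [["a", "b", "a"], ["b", "c"]]
weights = tf_idf(docs)
```

A word that occurs in every document, such as `"b"` in the toy collection, receives a weight of zero, which is the behavior that makes tf.idf favor discriminative words.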

Additional Info

Field Value
Source https://theses.hal.science/tel-00630438
Author Moulin, Christophe
Maintainer CCSD
Last Updated May 20, 2026, 05:05 (UTC)
Created May 20, 2026, 05:05 (UTC)
Identifier NNT: 2011STET4007
Language fr
Rights https://about.hal.science/hal-authorisation-v1/
contributor Laboratoire Hubert Curien (LabHC) ; Institut d'Optique Graduate School (IOGS)-Université Jean Monnet - Saint-Étienne (UJM) ; Université Jean Monnet (EPSCPE) (UJM EPE)-Université Jean Monnet (EPSCPE) (UJM EPE)-Centre National de la Recherche Scientifique (CNRS)
creator Moulin, Christophe
date 2011-06-22T00:00:00
harvest_object_id 224abd69-b0bf-45d1-875c-938bbcdb8b37
harvest_source_id 3374d638-d20b-4672-ba96-a23232d55657
harvest_source_title test moissonnage SELUNE
metadata_modified 2026-04-23T00:00:00
set_spec type:THESE