Improving information retrieval by combining user profile and document segmentation

Because users must scan an ever-increasing quantity of available information to find relevant items, noise has become a major issue in the implementation and use of information retrieval systems. The aim of this study was to design an information retrieval system that permits the "personalization" of search by taking the user profile into account. A pre-orientation system was first developed to give access to a personalized sub-corpus.

To limit noise, the textual material offered to the user is reduced to only those sections (units) of a document that are of interest and significance to that user. Here, textual material means the document units to be processed by content analysis in order to build descriptions of the documents. In this way, the documents are structured on the basis of utility functions. The selected document units are part of the sub-corpus defined by the pre-orientation system.

Next, the profile of each user is characterized by determining competence in a given field and at different levels. Each user is characterized by:

* stable information, related to the person rather than to a particular search. This information provides a general description of the user and the user's habits;
* variable information, related to a specific search. The priority here is to describe the objective of the search (a search may be exhaustive or non-exhaustive; it may concern specialized or popular publications, etc.).

The function of the pre-orientation system is to associate a given user profile with a set of characteristics applying to document units. Search is then applied only to the subset of selected document units that are relevant to the user, as established from that profile. Document units are characterized not by thematic criteria related to content, but by criteria relating to utility.
The objective was to propose a hypothesis on the parameters that determine the user profile and the document unit characteristics, and to test this hypothesis using an existing information retrieval system incorporating full-text natural language processing tools.
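
The two-stage process described in the abstract (pre-orientation to a personalized sub-corpus, then search restricted to that sub-corpus) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the study: the utility labels ("summary", "method", "results"), the two-level competence scale, the mapping rules in `pre_orient`, and the plain substring match that stands in for the full-text natural language processing tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentUnit:
    doc_id: str
    text: str
    unit_type: str  # hypothetical utility label, e.g. "summary", "method", "results"
    audience: str   # hypothetical utility label: "specialized" or "popular"


@dataclass(frozen=True)
class UserProfile:
    competence: str   # stable information (assumed scale: "novice" or "expert")
    exhaustive: bool  # variable information: objective of this particular search
    audience: str     # variable information: "specialized" or "popular"


def pre_orient(profile: UserProfile) -> set[str]:
    """Associate a profile with the unit types judged useful (illustrative rules only)."""
    unit_types = {"summary"}
    if profile.competence == "expert":
        unit_types.add("method")
    if profile.exhaustive:
        unit_types.add("results")
    return unit_types


def personalized_search(query: str, units: list[DocumentUnit],
                        profile: UserProfile) -> list[DocumentUnit]:
    """Search only the sub-corpus selected by utility criteria, not by theme."""
    allowed = pre_orient(profile)
    sub_corpus = [u for u in units
                  if u.unit_type in allowed and u.audience == profile.audience]
    term = query.lower()
    # Substring matching is a placeholder for a real full-text retrieval engine.
    return [u for u in sub_corpus if term in u.text.lower()]


UNITS = [
    DocumentUnit("d1", "Overview of indexing noise", "summary", "popular"),
    DocumentUnit("d1", "Noise measured with a recall protocol", "method", "popular"),
    DocumentUnit("d2", "Noise in specialized corpora", "summary", "specialized"),
]

novice = UserProfile(competence="novice", exhaustive=False, audience="popular")
expert = UserProfile(competence="expert", exhaustive=False, audience="popular")

novice_hits = personalized_search("noise", UNITS, novice)
expert_hits = personalized_search("noise", UNITS, expert)
```

In this sketch the same query returns different units for the two profiles, because the filtering step runs before the search and relies only on utility labels, never on the topic of the unit.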

Additional Info

Field Value
Source ISSN: 0306-4573
Author Lainé-Cruzel, Sylvie, Lafouge, Thierry, Lardy, Jean-Pierre, Ben Abdallah, Nabil
Maintainer CCSD
Last Updated May 5, 2026, 21:14 (UTC)
Created May 5, 2026, 21:14 (UTC)
Identifier hal-00096510
Language en
contributor Représentation des connaissances et documentation (RECODOC) ; Université Claude Bernard Lyon 1 (UCBL) ; Université de Lyon-Université de Lyon
creator Lainé-Cruzel, Sylvie
date 1999-05-05T00:00:00
harvest_object_id 93ea7c50-3428-4b8f-b8ea-9ae4128a95e9
harvest_source_id 3374d638-d20b-4672-ba96-a23232d55657
harvest_source_title test moissonnage SELUNE
metadata_modified 2025-04-11T00:00:00
relation info:eu-repo/semantics/altIdentifier/doi/10.1016/0306-4573(95)00062-3
set_spec type:ART