The Application of Co-clustering in Exploratory Data Analysis

Co-clustering is a clustering technique that simultaneously partitions the rows and the columns of a data matrix. Among the existing approaches, MODL is suited to processing huge data sets with several continuous or categorical variables, and we use it as the baseline approach in this thesis. We discuss the reliability of applying such an approach to data mining problems such as graph partitioning, temporal graph segmentation, and curve clustering. MODL tracks very fine patterns in huge data sets, which makes the results difficult to study. To help the user interpret them, we define exploratory analysis tools that simplify the results to allow an overall interpretation, track the most interesting patterns, determine the most representative values of the clusters, and visualize the results. We investigate the asymptotic behavior of these exploratory analysis tools in order to relate them to existing approaches. Finally, we demonstrate the value of MODL and the exploratory analysis tools through an application to call detail records from the telecom operator Orange, collected in Ivory Coast.
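
To make the opening definition concrete, here is a minimal sketch of what a co-clustering produces: once rows and columns have each been assigned to clusters, the data matrix collapses into a small grid of block totals. This is only an illustration of the row/column partition idea, not the MODL criterion, which the abstract does not detail. The function name `block_counts`, the toy matrix, and the cluster labels are all hypothetical and chosen by hand.

```python
from collections import defaultdict


def block_counts(matrix, row_labels, col_labels):
    """Aggregate a data matrix into co-cluster (block) totals.

    matrix     -- list of rows, each a list of non-negative counts
    row_labels -- row_labels[i] is the cluster assigned to row i
    col_labels -- col_labels[j] is the cluster assigned to column j
    """
    totals = defaultdict(int)
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            # Each cell contributes to the block formed by its row
            # cluster and its column cluster.
            totals[(row_labels[i], col_labels[j])] += value
    return dict(totals)


# Made-up 4x4 matrix with two dense blocks on the diagonal.
toy = [
    [5, 4, 0, 0],
    [6, 5, 0, 1],
    [0, 0, 7, 8],
    [1, 0, 6, 9],
]

# Hand-picked partitions (an assumption): rows 0-1 vs 2-3, columns 0-1 vs 2-3.
row_labels = [0, 0, 1, 1]
col_labels = [0, 0, 1, 1]

blocks = block_counts(toy, row_labels, col_labels)
for key in sorted(blocks):
    print(key, blocks[key])
```

In this toy case the diagonal blocks carry almost all of the mass, which is the kind of structure a co-clustering method searches for automatically instead of receiving the labels as input.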

Additional Info

Field Value
Source https://theses.hal.science/tel-00935278
Author Guigourès, Romain
Maintainer CCSD
Last Updated May 7, 2026, 07:08 (UTC)
Created May 7, 2026, 07:08 (UTC)
Identifier tel-00935278
Language fr
Rights https://about.hal.science/hal-authorisation-v1/
contributor Orange Labs [Lannion] ; France Télécom
creator Guigourès, Romain
date 2013-12-04T00:00:00
harvest_object_id 9a230d33-c557-44ab-951a-3aef2af3ab46
harvest_source_id 3374d638-d20b-4672-ba96-a23232d55657
harvest_source_title test moissonnage SELUNE
metadata_modified 2024-06-03T00:00:00
set_spec type:THESE