Finding homogeneous collections of dense subgraphs using constraint-based data mining approaches

The work presented in this thesis deals with data mining approaches for the analysis of attributed graphs. An attributed graph is a graph in which properties, encoded as attributes, are associated with each vertex. In such data, our objective is to discover subgraphs formed by several dense groups of vertices that are homogeneous with respect to the attributes. More precisely, we define the constraint-based extraction of collections of densely connected subgraphs whose vertices share enough attributes. To this end, we propose two new classes of patterns, along with sound and complete algorithms that compute them efficiently using constraint-based approaches. The first class of patterns, named Maximal Homogeneous Clique Set (MHCS), contains patterns satisfying constraints on the number of dense subgraphs, on the size of these subgraphs, and on the number of shared attributes. The second class of patterns, named Collection of Homogeneous k-clique Percolated components (CoHoP), is based on a relaxed notion of density in order to handle missing values. Both approaches are applied to the analysis of scientific collaboration networks and protein-protein interaction networks. The extracted patterns exhibit structures that are useful in a decision support process. In a scientific collaboration network, the analysis of such structures may suggest new collaborations between researchers working on the same subjects. In a protein-protein interaction network, the extracted patterns can be used to study the relationships between modules of proteins involved in similar biological situations. A performance analysis on real and synthetic data, with respect to different attributed graph characteristics, shows that the proposed approaches scale well to large datasets.
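To make the three MHCS constraints named in the abstract concrete (number of dense subgraphs, size of each subgraph, number of shared attributes), here is a minimal Python sketch. It is an illustration only and not the algorithm from the thesis: it enumerates attribute subsets by brute force, takes "dense subgraph" to mean a maximal clique among the vertices carrying all the shared attributes, and does not enforce any maximality condition on the patterns themselves. The function names, the parameters `min_attrs`, `min_size`, `min_cliques`, and the toy graph are all assumptions introduced for this example.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def homogeneous_cliques(adj, attrs, shared, min_size):
    """Maximal cliques (size >= min_size) among vertices having all `shared` attributes."""
    keep = {v for v in adj if shared <= attrs[v]}
    sub = {v: adj[v] & keep for v in keep}
    return [c for c in maximal_cliques(sub) if len(c) >= min_size]


def find_patterns(adj, attrs, min_attrs, min_size, min_cliques):
    """Brute-force search (exponential in the number of attributes; illustration only)."""
    all_attrs = sorted(set().union(*attrs.values()))
    patterns = []
    for k in range(min_attrs, len(all_attrs) + 1):
        for shared in combinations(all_attrs, k):
            cliques = homogeneous_cliques(adj, attrs, set(shared), min_size)
            if len(cliques) >= min_cliques:
                patterns.append((frozenset(shared), set(cliques)))
    return patterns


# Hypothetical toy collaboration graph: two triangles joined by the edge c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f"), ("f", "g")]
attrs = {
    "a": {"mining", "graphs"}, "b": {"mining", "graphs"},
    "c": {"mining", "graphs", "bio"}, "d": {"mining", "graphs"},
    "e": {"mining", "graphs"}, "f": {"mining", "graphs"},
    "g": {"bio"},
}
adj = build_adjacency(edges)
patterns = find_patterns(adj, attrs, min_attrs=2, min_size=3, min_cliques=2)
```

On this toy input, the only attribute set that yields at least two cliques of size three is the pair "graphs" and "mining", which matches the two triangles. A practical miner would prune the search space using the constraints rather than testing every attribute subset.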

Additional Info

Field Value
Source https://theses.hal.science/tel-00858751
Author Mougel, Pierre-Nicolas
Maintainer CCSD
Last Updated May 9, 2026, 20:42 (UTC)
Created May 9, 2026, 20:42 (UTC)
Identifier tel-00858751
Language en
Rights https://about.hal.science/hal-authorisation-v1/
contributor Laboratoire d'InfoRmatique en Image et Systèmes d'information (LIRIS) ; Université Lumière - Lyon 2 (UL2)-École Centrale de Lyon (ECL) ; Université de Lyon-Université de Lyon-Université Claude Bernard Lyon 1 (UCBL) ; Université de Lyon-Institut National des Sciences Appliquées de Lyon (INSA Lyon) ; Université de Lyon-Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)
creator Mougel, Pierre-Nicolas
date 2012-09-14T00:00:00
harvest_object_id 80cec75f-a885-4c5f-b9d7-6a254734bddc
harvest_source_id 3374d638-d20b-4672-ba96-a23232d55657
harvest_source_title test moissonnage SELUNE
metadata_modified 2025-10-24T00:00:00
set_spec type:THESE