Selection of GLM mixtures: a new criterion for clustering purpose

Model-based clustering from finite mixtures of generalized linear models is a challenging problem that has seen many recent developments. In practice, the model selection step is usually performed with the AIC or BIC penalized criteria. However, simulations show that these criteria tend to overestimate the actual dimension of the model. This evidence led us to consider a new criterion close to ICL, first introduced in Baudry (2009). Its definition requires introducing a contrast that embeds an entropic term: using concentration inequalities, we derive key properties of the convergence of the associated M-estimator. The consistency of the corresponding classification criterion then follows under some classical requirements on the penalty term. Finally, a simulation study corroborates our theoretical results and shows the effectiveness of the method from a clustering perspective.
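To make the comparison between the criteria concrete, the following is a minimal sketch of AIC, BIC, and the classical ICL approximation (BIC plus an entropy penalty), written on the "-2 log-likelihood" scale where lower values are better. It is not the criterion defined in the paper, only the textbook form that the abstract's "close to ICL" refers to. The function names, the `posteriors` argument (a list of rows of posterior membership probabilities, one row per observation), and the numeric values are assumptions made for illustration.

```python
import math


def entropy(posteriors):
    """Classification entropy: -sum_i sum_k tau_ik * log(tau_ik).

    Terms with tau_ik == 0 contribute 0 by convention.
    """
    return -sum(t * math.log(t) for row in posteriors for t in row if t > 0.0)


def aic(loglik, n_params):
    """Akaike criterion on the -2 log-likelihood scale."""
    return -2.0 * loglik + 2.0 * n_params


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion on the -2 log-likelihood scale."""
    return -2.0 * loglik + n_params * math.log(n_obs)


def icl(loglik, n_params, n_obs, posteriors):
    """Classical ICL approximation: BIC plus twice the classification entropy."""
    return bic(loglik, n_params, n_obs) + 2.0 * entropy(posteriors)


# Hypothetical posteriors for two observations and two components.
well_separated = [[1.0, 0.0], [0.0, 1.0]]   # crisp assignments, zero entropy
overlapping = [[0.5, 0.5], [0.5, 0.5]]      # maximally ambiguous assignments

# Same likelihood and dimension: only the entropy term separates the two cases.
print(icl(-10.0, 3, 100, well_separated))
print(icl(-10.0, 3, 100, overlapping))
```

With crisp assignments the entropy vanishes and ICL coincides with BIC; overlapping components are penalized further, which is why an entropy-based criterion tends to favor fewer, better-separated clusters than AIC or BIC.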

Additional Info

Field Value
Source https://hal.science/hal-00957880
Author Lopez, Olivier; Milhaud, Xavier
Maintainer CCSD
Last Updated May 6, 2026, 02:07 (UTC)
Created May 6, 2026, 02:07 (UTC)
Identifier hal-00957880
Language en
Rights https://about.hal.science/hal-authorisation-v1/
contributor Laboratoire de Statistique Théorique et Appliquée (LSTA) ; Université Pierre et Marie Curie - Paris 6 (UPMC)-Centre National de la Recherche Scientifique (CNRS)
creator Lopez, Olivier
date 2014-02-25T00:00:00
harvest_object_id 59322cd2-6deb-4c37-824f-88917a77dcde
harvest_source_id 3374d638-d20b-4672-ba96-a23232d55657
harvest_source_title test moissonnage SELUNE
metadata_modified 2026-02-07T00:00:00
set_spec type:UNDEFINED