Optimal cross-validation in density estimation

The performance of cross-validation (CV) is analyzed in two contexts: (i) risk estimation and (ii) model selection in the density estimation framework. The main focus is on one CV algorithm called leave-$p$-out (Lpo), where $p$ denotes the cardinality of the test set. Closed-form expressions are derived for the Lpo estimator of the risk of projection estimators, which makes V-fold cross-validation completely useless. From a theoretical point of view, these closed-form expressions make it possible to study the performance of Lpo in terms of risk estimation. For instance, the optimality of leave-one-out (Loo), that is Lpo with $p=1$, is proved among CV procedures. Two model selection frameworks are also considered: estimation and identification. Unlike in risk estimation, Loo is proved to be suboptimal as a model selection procedure. In the estimation framework with finite sample size $n$, optimality is achieved for $p$ large enough (with $p/n =o(1)$) to balance overfitting. A link is also identified between the optimal $p$ and the structure of the model collection. These theoretical results are strongly supported by simulation experiments. When performing identification, model consistency is also proved for Lpo with $p/n\to 1$ as $n\to +\infty$.
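
To illustrate what a closed-form Lpo expression can look like, here is a minimal sketch for the simplest projection estimator, a regular histogram on $[0,1]$ with $D$ bins of width $h = 1/D$. It assumes the quadratic ($L^2$) contrast $\|s\|^2 - 2s(x)$, which the abstract does not state explicitly. Under that assumption, averaging the hold-out criterion over all test sets of size $p$ gives $\frac{1}{n(n-p)h}\sum_{\lambda}\left[n_\lambda - \frac{n-p+1}{n-1}\, n_\lambda(n_\lambda-1)\right]$, where $n_\lambda$ is the number of observations in bin $\lambda$. The function names and the restriction to equal-width bins are illustrative choices, not taken from the paper, and the notation may differ from the author's.

```python
from itertools import combinations


def bin_counts(xs, n_bins):
    """Number of points falling in each equal-width bin of [0, 1]."""
    counts = [0] * n_bins
    for x in xs:
        # Clamp so that x == 1.0 lands in the last bin.
        counts[min(int(x * n_bins), n_bins - 1)] += 1
    return counts


def lpo_closed_form(xs, n_bins, p):
    """Lpo risk estimate from bin counts only (assumes the L2 contrast)."""
    n = len(xs)
    h = 1.0 / n_bins
    counts = bin_counts(xs, n_bins)
    s1 = sum(counts)                       # sum over bins of n_lambda
    s2 = sum(c * (c - 1) for c in counts)  # sum over bins of n_lambda (n_lambda - 1)
    return (s1 - (n - p + 1) / (n - 1) * s2) / (n * (n - p) * h)


def lpo_brute_force(xs, n_bins, p):
    """Same quantity, averaged explicitly over every test set of size p."""
    n = len(xs)
    h = 1.0 / n_bins
    total, n_splits = 0.0, 0
    for test_idx in combinations(range(n), p):
        held_out = set(test_idx)
        train = [xs[i] for i in range(n) if i not in held_out]
        test = [xs[i] for i in test_idx]
        tr = bin_counts(train, n_bins)
        te = bin_counts(test, n_bins)
        heights = [c / ((n - p) * h) for c in tr]  # histogram fitted on train
        norm_sq = sum(v * v * h for v in heights)  # squared L2 norm of the fit
        mean_on_test = sum(v * c for v, c in zip(heights, te)) / p
        total += norm_sq - 2.0 * mean_on_test
        n_splits += 1
    return total / n_splits
```

The brute-force version enumerates $\binom{n}{p}$ splits and is only usable for tiny samples, whereas the closed form needs a single pass over the data. In this sketch, comparing `lpo_closed_form(xs, D, p)` across several values of `D` and keeping the smallest would be one way to select the number of bins.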

Additional Info

Field Value
Source https://hal.science/hal-00337058
Author Celisse, Alain
Maintainer CCSD
Last Updated May 23, 2026, 00:40 (UTC)
Created May 23, 2026, 00:40 (UTC)
Identifier hal-00337058
Language en
Rights https://about.hal.science/hal-authorisation-v1/
contributor Laboratoire Paul Painlevé - UMR 8524 (LPP) ; Université de Lille-Centre National de la Recherche Scientifique (CNRS)
creator Celisse, Alain
date 2008-10-10T00:00:00
harvest_object_id 200753fb-a987-4e8d-9efb-1ea2029027c9
harvest_source_id 3374d638-d20b-4672-ba96-a23232d55657
harvest_source_title test moissonnage SELUNE
metadata_modified 2024-04-22T00:00:00
relation info:eu-repo/semantics/altIdentifier/arxiv/0811.0802
set_spec type:UNDEFINED