The restricted consistency property of leave-nV-out cross-validation for high-dimensional variable selection

Yang Feng, Yi Yu

Research output: Contribution to journal › Article

Abstract

Cross-validation (CV) methods are popular for selecting the tuning parameter in high-dimensional variable selection problems. We show that a misalignment of the CV is one possible reason for its over-selection behavior. To fix this issue, we propose using a version of leave-nv-out CV (CV(nv)) to select the optimal model from a restricted candidate model set for high-dimensional generalized linear models. By using the same candidate model sequence and a proper order for the construction sample size nc in each CV split, CV(nv) avoids potential problems when developing theoretical properties. CV(nv) is shown to exhibit the restricted model-selection consistency property under mild conditions. Extensive simulations and a real-data analysis support the theoretical results and demonstrate the performance of CV(nv) in terms of both model selection and prediction.
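The abstract's core idea can be illustrated with a minimal sketch of leave-n_v-out CV: hold out a validation set of size n_v on each split, fit every candidate model on the remaining construction sample of size n_c, and pick the model with the lowest average validation loss. The code below is a simplified illustration only, using ordinary least squares in place of the paper's restricted maximum likelihood estimators for generalized linear models; the function name, the fixed list of candidate variable subsets, and the number of random splits are all assumptions, not the authors' implementation.

```python
import numpy as np

def leave_nv_out_cv(X, y, candidate_models, n_v, n_splits=50, seed=0):
    """Sketch of leave-n_v-out CV (hypothetical helper, not the paper's code).

    candidate_models: list of variable-index subsets; the same candidate
    sequence is used in every split, as the abstract emphasizes.
    n_v: validation sample size, so the construction size is n_c = n - n_v.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    n_c = n - n_v  # construction sample size
    losses = np.zeros(len(candidate_models))
    for _ in range(n_splits):
        perm = rng.permutation(n)
        train, val = perm[:n_c], perm[n_c:]
        for j, S in enumerate(candidate_models):
            # fit the restricted model (here: OLS on the subset S)
            # on the construction sample only
            beta, *_ = np.linalg.lstsq(X[np.ix_(train, S)], y[train],
                                       rcond=None)
            pred = X[np.ix_(val, S)] @ beta
            losses[j] += np.mean((y[val] - pred) ** 2)
    # select the candidate with the smallest average validation loss
    return candidate_models[int(np.argmin(losses))]
```

With a large n_v (so that n_c grows at the proper slower order), the averaged validation loss separates under- and over-fitted candidates, which is the intuition behind the restricted model-selection consistency result.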

Original language: English (US)
Pages (from-to): 1607-1630
Number of pages: 24
Journal: Statistica Sinica
Volume: 29
Issue number: 3
State: Published - 2019

Keywords

  • Generalized linear models
  • Leave-nv-out cross-validation
  • Restricted maximum likelihood estimators
  • Restricted model-selection consistency
  • Variable selection

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

