Abstract
High-dimensional multinomial regression models are very useful in practice but have received less research attention than logistic regression models, especially from the perspective of statistical inference. In this work, we analyze the estimation and prediction error of the contrast-based $\ell_1$-penalized multinomial regression model and extend the debiasing method to the multinomial case, providing a valid confidence interval for each coefficient and $p$ value of the individual hypothesis test. We also examine cases of model misspecification and non-identically distributed data to demonstrate the robustness of our method when some assumptions are violated. We apply the debiasing method to identify important predictors in the progression into dementia of different subtypes. Results from extensive simulations show the superiority of the debiasing method compared to other inference methods.
Original language | English (US) |
---|---|
Pages (from-to) | 5711-5747 |
Number of pages | 37 |
Journal | Statistics in Medicine |
Volume | 43 |
Issue number | 30 |
State | Published - Dec 30 2024 |
Keywords
- debiased Lasso
- dementia
- high-dimensional statistics
- hypothesis testing
- model misspecification
- non-identically distributed data
- p values
- penalized multinomial regression models
- statistical inference
ASJC Scopus subject areas
- Epidemiology
- Statistics and Probability