Using LASSO to Assist Imputation and Predict Child Well-being

Diana Stanescu, Erik Wang, Soichiro Yamauchi

    Research output: Contribution to journal › Article › peer-review


    This article documents an approach to predicting children’s well-being using data from the Fragile Families and Child Wellbeing Study, which are representative of births in large U.S. cities. The authors use the least absolute shrinkage and selection operator (LASSO) to preprocess the data. They then apply the Amelia algorithm to impute missing data. Finally, they use LASSO again for prediction with the imputed data. The authors report the performance of this approach for six outcome variables. The approach achieves the best performance for the variable material hardship. The out-of-sample mean squared error of the authors’ prediction is 0.019, the lowest among all submissions in the Fragile Families Challenge. The authors find that among variables with high predictive power, variables from mother surveys dominate. Furthermore, components of material hardship in the past strongly predict current material hardship.
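The three-stage workflow described in the abstract (LASSO screening, imputation, LASSO prediction) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, column-mean imputation stands in for the Amelia algorithm (an R package), the penalty values are arbitrary, and every function name (`lasso_cd`, `mean_impute`, `pipeline`) is hypothetical.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)*||y - b0 - X b||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    b0 = sum(y) / n
    resid = [yi - b0 for yi in y]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (beta_j removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
        # The intercept is not penalized: re-center the residuals.
        shift = sum(resid) / n
        b0 += shift
        resid = [r - shift for r in resid]
    return b0, beta


def mean_impute(X):
    """Assumed stand-in for Amelia: replace None with the column mean."""
    p = len(X[0])
    means = []
    for j in range(p):
        obs = [row[j] for row in X if row[j] is not None]
        means.append(sum(obs) / len(obs) if obs else 0.0)
    return [[means[j] if row[j] is None else row[j] for j in range(p)] for row in X]


def pipeline(X, y_train, n_train, lam_screen, lam_pred, impute=mean_impute):
    # Step 1: LASSO on crudely filled data to screen candidate predictors.
    filled = mean_impute(X)
    _, b = lasso_cd(filled[:n_train], y_train, lam_screen)
    selected = [j for j, bj in enumerate(b) if bj != 0.0]
    # Step 2: impute only the selected columns (Amelia would go here).
    X_sel = impute([[row[j] for j in selected] for row in X])
    # Step 3: LASSO again on the imputed data, then predict held-out rows.
    b0, beta = lasso_cd(X_sel[:n_train], y_train, lam_pred)
    preds = [b0 + sum(bj * xj for bj, xj in zip(beta, row)) for row in X_sel[n_train:]]
    return selected, preds


# Synthetic data: the outcome depends on columns 0 and 1; 10% of cells are missing.
rng = random.Random(0)
n, p, n_train = 200, 8, 150
X_full = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2.0 * r[0] - 1.5 * r[1] + rng.gauss(0, 0.1) for r in X_full]
X = [[None if rng.random() < 0.1 else v for v in r] for r in X_full]

selected, preds = pipeline(X, y[:n_train], n_train, lam_screen=0.3, lam_pred=0.05)
y_test = y[n_train:]
mse = sum((ph - t) ** 2 for ph, t in zip(preds, y_test)) / len(y_test)
y_bar = sum(y[:n_train]) / n_train
baseline_mse = sum((y_bar - t) ** 2 for t in y_test) / len(y_test)
print("selected columns:", selected)
print("hold-out MSE:", mse, "baseline MSE:", baseline_mse)
```

Passing the imputer as the `impute` argument keeps the second stage swappable, so a multiple-imputation routine could replace the column-mean stand-in without touching the two LASSO stages.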

    Original language: English (US)
    Article number: 2378023118814623
    State: Published - 2019


    Keywords

    • Fragile Families Challenge
    • LASSO
    • material hardship
    • prediction

    ASJC Scopus subject areas

    • General Social Sciences

