TY - JOUR
T1 - Development and validation of a risk prediction model for premenopausal breast cancer in 19 cohorts
AU - Brantley, Kristen D.
AU - Jones, Michael E.
AU - Tamimi, Rulla M.
AU - Rosner, Bernard A.
AU - Kraft, Peter
AU - Nichols, Hazel B.
AU - O’Brien, Katie M.
AU - Adami, Hans Olov
AU - Aizpurua, Amaia
AU - de Gonzalez, Amy Berrington
AU - Blot, William J.
AU - Braaten, Tonje
AU - Chen, Yu
AU - DeHart, Jessica Clague
AU - Dossus, Laure
AU - Elias, Sjoerd
AU - Fortner, Renée T.
AU - Garcia-Closas, Montserrat
AU - Gram, Inger T.
AU - Håkansson, Niclas
AU - Hankinson, Susan E.
AU - Kitahara, Cari M.
AU - Koh, Woon Puay
AU - Linet, Martha S.
AU - MacInnis, Robert J.
AU - Masala, Giovanna
AU - Mellemkjær, Lene
AU - Milne, Roger L.
AU - Muller, David C.
AU - Park, Hannah Lui
AU - Ruddy, Kathryn J.
AU - Sandin, Sven
AU - Shu, Xiao Ou
AU - Tin Tin, Sandar
AU - Truong, Thérèse
AU - Vachon, Celine M.
AU - Vatten, Lars J.
AU - Visvanathan, Kala
AU - Weiderpass, Elisabete
AU - Willett, Walter
AU - Wolk, Alicja
AU - Yuan, Jian Min
AU - Zheng, Wei
AU - Sandler, Dale P.
AU - Schoemaker, Minouk J.
AU - Swerdlow, Anthony J.
AU - Eliassen, A. Heather
N1 - Publisher Copyright: © The Author(s) 2025.
PY - 2025/12
Y1 - 2025/12
AB - Background: Incidence of premenopausal breast cancer (BC) has risen in recent years, though most existing BC prediction models are not generalizable to young women due to underrepresentation of this age group in model development. Methods: Using questionnaire-based data from 19 prospective studies harmonized within the Premenopausal Breast Cancer Collaborative Group (PBCCG), representing 783,830 women, we developed a premenopausal BC risk prediction model. The data were split into training (2/3) and validation (1/3) datasets with equal distribution of cohorts in each. In the training dataset, variables were chosen from known and hypothesized risk factors: age, age at menarche, age at first birth, parity, breastfeeding, height, BMI, young adulthood BMI, recent weight change, alcohol consumption, first-degree family history of BC, and personal history of benign breast disease (BBD). Hazard ratios (HR) and 95% confidence intervals (CI) were estimated by Cox proportional hazards regression using age as time scale, stratified by cohort. Given that complete information on all risk factors was not available in all cohorts, coefficients were estimated separately in groups of cohorts with the same available covariate information, adjusted to account for the correlation between missing and non-missing variables, and meta-analyzed. Absolute risk of BC (in situ or invasive) within 5 years was determined using country-, age-, and birth cohort-specific incidence rates. Discrimination (area under the curve, AUC) and calibration (Expected/Observed, E/O) were evaluated in the validation dataset. We compared our model with a literature-based model for women < 50 years (iCARE-Lit). Results: Selected model risk factors were age at menarche, parity, height, current and young adulthood BMI, family history of BC, and personal BBD history. Predicted absolute 5-year risk ranged from 0% to 5.7%. The model overestimated risk on average [E/O risk = 1.18 (1.14–1.23)], with underestimation of risk in lower absolute risk deciles and overestimation in upper absolute risk deciles [E/O 1st decile = 0.59 (0.58–0.60); E/O 10th decile = 1.48 (1.48–1.49)]. The AUC was 59.1% (58.1–60.1%). Performance was similar to the iCARE-Lit model. Conclusion: In this prediction model for premenopausal BC, the relative contribution of risk factors to absolute risk was similar to existing models for overall BC. The discriminatory ability was nearly identical (< 1% difference in AUC) to the existing iCARE-Lit model developed in women under 50 years. The inability to improve discrimination highlights the need to investigate additional predictors to better understand premenopausal BC risk.
KW - Premenopausal breast cancer
KW - Risk prediction model
KW - Young-onset breast cancer
UR - http://www.scopus.com/inward/record.url?scp=105004482602&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=105004482602&partnerID=8YFLogxK
U2 - 10.1186/s13058-025-02031-8
DO - 10.1186/s13058-025-02031-8
M3 - Article
C2 - 40312753
AN - SCOPUS:105004482602
SN - 1465-5411
VL - 27
JO - Breast Cancer Research
JF - Breast Cancer Research
IS - 1
M1 - 67
ER -