TY - JOUR
T1 - Resource profile and user guide of the Polygenic Index Repository
AU - 23andMe Research Group
AU - Becker, Joel
AU - Burik, Casper A.P.
AU - Goldman, Grant
AU - Wang, Nancy
AU - Jayashankar, Hariharan
AU - Bennett, Michael
AU - Belsky, Daniel W.
AU - Karlsson Linnér, Richard
AU - Ahlskog, Rafael
AU - Kleinman, Aaron
AU - Hinds, David A.
AU - Agee, Michelle
AU - Alipanahi, Babak
AU - Auton, Adam
AU - Bell, Robert K.
AU - Bryc, Katarzyna
AU - Elson, Sarah L.
AU - Fontanillas, Pierre
AU - Furlotte, Nicholas A.
AU - Huber, Karen E.
AU - Litterman, Nadia K.
AU - McCreight, Jennifer C.
AU - McIntyre, Matthew H.
AU - Mountain, Joanna L.
AU - Northover, Carrie A.M.
AU - Pitts, Steven J.
AU - Sathirapongsasuti, J. Fah
AU - Sazonova, Olga V.
AU - Shelton, Janie F.
AU - Shringarpure, Suyash
AU - Tian, Chao
AU - Tung, Joyce Y.
AU - Vacic, Vladimir
AU - Wilson, Catherine H.
AU - Caspi, Avshalom
AU - Corcoran, David L.
AU - Moffitt, Terrie E.
AU - Poulton, Richie
AU - Sugden, Karen
AU - Williams, Benjamin S.
AU - Harris, Kathleen Mullan
AU - Steptoe, Andrew
AU - Ajnakina, Olesya
AU - Milani, Lili
AU - Esko, Tõnu
AU - Iacono, William G.
AU - McGue, Matt
AU - Magnusson, Patrik K.E.
AU - Mallard, Travis T.
AU - Cesarini, David
N1 - Publisher Copyright:
© 2021, The Author(s), under exclusive licence to Springer Nature Limited.
PY - 2021/12
Y1 - 2021/12
AB - Polygenic indexes (PGIs) are DNA-based predictors. Their value for research in many scientific disciplines is growing rapidly. As a resource for researchers, we used a consistent methodology to construct PGIs for 47 phenotypes in 11 datasets. To maximize the PGIs’ prediction accuracies, we constructed them using genome-wide association studies—some not previously published—from multiple data sources, including 23andMe and UK Biobank. We present a theoretical framework to help interpret analyses involving PGIs. A key insight is that a PGI can be understood as an unbiased but noisy measure of a latent variable we call the ‘additive SNP factor’. Regressions in which the true regressor is this factor but the PGI is used as its proxy therefore suffer from errors-in-variables bias. We derive an estimator that corrects for the bias, illustrate the correction, and make a Python tool for implementing it publicly available.
KW - Data Analysis
KW - Databases, Genetic
KW - Genome-Wide Association Study
KW - Humans
KW - Multifactorial Inheritance
KW - Polymorphism, Single Nucleotide
UR - http://www.scopus.com/inward/record.url?scp=85108188484&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85108188484&partnerID=8YFLogxK
U2 - 10.1038/s41562-021-01119-3
DO - 10.1038/s41562-021-01119-3
M3 - Article
C2 - 34140656
AN - SCOPUS:85108188484
SN - 2397-3374
VL - 5
SP - 1744
EP - 1758
JO - Nature Human Behaviour
JF - Nature Human Behaviour
IS - 12
ER -