Resource profile and user guide of the Polygenic Index Repository

23andMe Research Group

    Research output: Contribution to journal › Article › peer-review


    Polygenic indexes (PGIs) are DNA-based predictors. Their value for research in many scientific disciplines is growing rapidly. As a resource for researchers, we used a consistent methodology to construct PGIs for 47 phenotypes in 11 datasets. To maximize the PGIs’ prediction accuracies, we constructed them using genome-wide association studies—some not previously published—from multiple data sources, including 23andMe and UK Biobank. We present a theoretical framework to help interpret analyses involving PGIs. A key insight is that a PGI can be understood as an unbiased but noisy measure of a latent variable we call the ‘additive SNP factor’. Regressions in which the true regressor is this factor but the PGI is used as its proxy therefore suffer from errors-in-variables bias. We derive an estimator that corrects for the bias, illustrate the correction, and make a Python tool for implementing it publicly available.
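    The errors-in-variables argument in the abstract can be illustrated with a small simulation. The sketch below is not the estimator derived in the paper and not its Python tool; it is the textbook classical measurement-error correction, which divides the naive slope by the reliability ratio. It assumes the reliability (the share of PGI variance due to the latent factor) is known, and all names (`simulate`, `ols_slope`, `RELIABILITY`, `TRUE_BETA`) are hypothetical.

```python
import numpy as np

TRUE_BETA = 0.5      # assumed effect of the latent factor on the outcome
RELIABILITY = 0.4    # assumed var(factor) / var(PGI); treated as known here


def simulate(n=200_000, beta=TRUE_BETA, reliability=RELIABILITY, seed=0):
    """Draw a noisy proxy (the 'PGI') of a latent factor, plus an outcome."""
    rng = np.random.default_rng(seed)
    factor = rng.standard_normal(n)               # latent regressor, variance 1
    noise_sd = np.sqrt(1.0 / reliability - 1.0)   # gives the target reliability
    pgi = factor + noise_sd * rng.standard_normal(n)  # unbiased but noisy proxy
    y = beta * factor + rng.standard_normal(n)    # outcome depends on the factor
    return pgi, y


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))


pgi, y = simulate()
naive = ols_slope(pgi, y)          # attenuated toward zero by the proxy noise
corrected = naive / RELIABILITY    # classical errors-in-variables correction

print(f"naive slope:     {naive:.3f}")
print(f"corrected slope: {corrected:.3f}")
```

    Under these assumptions the naive slope converges to the true coefficient times the reliability, so it understates the effect of the latent factor, and dividing by the reliability recovers it. In practice the reliability has to be estimated, which this sketch does not attempt.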

    Original language: English (US)
    Pages (from-to): 1744-1758
    Number of pages: 15
    Journal: Nature Human Behaviour
    Issue number: 12
    State: Published - Dec 2021


    • Data Analysis
    • Databases, Genetic
    • Genome-Wide Association Study
    • Humans
    • Multifactorial Inheritance
    • Polymorphism, Single Nucleotide

    ASJC Scopus subject areas

    • Experimental and Cognitive Psychology
    • Social Psychology
    • Behavioral Neuroscience

