Abstract
Genotype imputation is the inference of unknown genotypes using known population structure observed in large genomic datasets; it can further our understanding of phenotype-genotype relationships and is useful for QTL mapping and GWASs. However, the compute-intensive nature of genotype imputation can overwhelm local servers for computation and storage. Hence, many researchers are moving toward using cloud services, raising privacy concerns. We address these concerns by developing an efficient, privacy-preserving algorithm called p-Impute. Our method uses homomorphic encryption, allowing calculations on ciphertext, thereby avoiding the decryption of private genotypes in the cloud. It is similar to k-nearest neighbor approaches, inferring missing genotypes in a genomic block based on the SNP genotypes of genetically related individuals in the same block. Our results demonstrate accuracy in agreement with the state-of-the-art plaintext solutions. Moreover, p-Impute is scalable to real-world applications as its memory and time requirements increase linearly with the increasing number of samples. p-Impute is freely available for download here: https://doi.org/10.5281/zenodo.5542001.
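The k-nearest-neighbor idea described in the abstract can be illustrated with a minimal plaintext sketch. This is not the authors' algorithm and it performs no encryption; the function name `impute_block`, the 0/1/2 genotype coding, the mismatch-count distance, and the majority vote are all assumptions made for illustration. In a homomorphic setting, the same comparisons and vote would have to be evaluated on ciphertext.

```python
from collections import Counter

def impute_block(target, reference, k=3):
    """Fill None entries in `target` using the k reference rows that
    agree with it most on the observed SNPs of the block.

    Hypothetical helper: genotypes are coded 0/1/2, missing as None.
    """
    observed = [i for i, g in enumerate(target) if g is not None]

    def distance(row):
        # Number of observed SNPs where the genotypes differ
        return sum(1 for i in observed if row[i] != target[i])

    # sorted() is stable, so ties keep the reference order
    neighbors = sorted(reference, key=distance)[:k]

    imputed = list(target)
    for i, g in enumerate(target):
        if g is None:
            votes = Counter(row[i] for row in neighbors)
            # Most common genotype; ties go to the smaller genotype value
            imputed[i] = min(votes, key=lambda x: (-votes[x], x))
    return imputed

# Toy reference panel: 4 individuals, one block of 5 SNPs (made-up data)
reference = [
    [0, 1, 2, 0, 1],
    [0, 1, 2, 0, 1],
    [0, 1, 1, 0, 1],
    [2, 0, 0, 1, 2],
]
target = [0, 1, None, 0, None]
result = impute_block(target, reference, k=3)
print(result)
```

Here the three reference individuals that match the target on all observed SNPs supply the votes for the two missing positions, while the dissimilar fourth individual is ignored.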
Original language | English (US) |
---|---|
Pages (from-to) | 173-182.e3 |
Journal | Cell Systems |
Volume | 13 |
Issue number | 2 |
DOIs | |
State | Published - Feb 16 2022 |
Keywords
- genome privacy
- genotype imputation
- homomorphic encryption
- Privacy
- Genome-Wide Association Study
- Cloud Computing
- Computer Security
- Genotype
ASJC Scopus subject areas
- Pathology and Forensic Medicine
- Cell Biology
- Histology