TY - JOUR
T1 - The Proteome Folding Project
T2 - Proteome-scale prediction of structure and function
AU - Drew, Kevin
AU - Winters, Patrick
AU - Butterfoss, Glenn L.
AU - Berstis, Viktors
AU - Uplinger, Keith
AU - Armstrong, Jonathan
AU - Riffle, Michael
AU - Schweighofer, Erik
AU - Bovermann, Bill
AU - Goodlett, David R.
AU - Davis, Trisha N.
AU - Shasha, Dennis
AU - Malmström, Lars
AU - Bonneau, Richard
PY - 2011/11
Y1 - 2011/11
N2 - The incompleteness of proteome structure and function annotation is a critical problem for biologists and, in particular, severely limits interpretation of high-throughput and next-generation experiments. We have developed a proteome annotation pipeline based on structure prediction, where function and structure annotations are generated using an integration of sequence comparison, fold recognition, and grid-computing-enabled de novo structure prediction. We predict protein domain boundaries and three-dimensional (3D) structures for protein domains from 94 genomes (including human, Arabidopsis, rice, mouse, fly, yeast, Escherichia coli, and worm). De novo structure predictions were distributed on a grid of more than 1.5 million CPUs worldwide (World Community Grid). We generated significant numbers of new confident fold annotations (9% of domains that are otherwise unannotated in these genomes). We demonstrate that predicted structures can be combined with annotations from the Gene Ontology database to predict new and more specific molecular functions.
UR - http://www.scopus.com/inward/record.url?scp=80555142938&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80555142938&partnerID=8YFLogxK
U2 - 10.1101/gr.121475.111
DO - 10.1101/gr.121475.111
M3 - Article
C2 - 21824995
AN - SCOPUS:80555142938
SN - 1088-9051
VL - 21
SP - 1981
EP - 1994
JO - Genome Research
JF - Genome Research
IS - 11
ER -