TY - JOUR
T1 - Parametric Bayesian priors and better choice of negative examples improve protein function prediction
AU - Youngs, Noah
AU - Penfold-Brown, Duncan
AU - Drew, Kevin
AU - Shasha, Dennis
AU - Bonneau, Richard
N1 - Funding Information: This work was supported by U.S. National Science Foundation grants 0922738, 0929338, 1158273, and IOS-1126971, and National Institutes of Health grants GM 32877-21/22, RC1-AI087266, RC4-AI092765, PN2-EY016586, IU54CA 143907-01 and EY016586-06.
PY - 2013/5/1
Y1 - 2013/5/1
AB - Computational biologists have demonstrated the utility of using machine learning methods to predict protein function from an integration of multiple genome-wide data types. Yet, even the best-performing function prediction algorithms rely on heuristics for important components of the algorithm, such as choosing negative examples (proteins without a given function) or determining key parameters. The improper choice of negative examples, in particular, can hamper the accuracy of protein function prediction. We present a novel approach for choosing negative examples, using a parameterizable Bayesian prior computed from all observed annotation data, which also generates priors used during function prediction. We incorporate this new method into the GeneMANIA function prediction algorithm and demonstrate improved accuracy of our algorithm over current top-performing function prediction methods on the yeast and mouse proteomes across all metrics tested. Code and data are available at: http://bonneaulab.bio.nyu.edu/funcprop.html
UR - http://www.scopus.com/inward/record.url?scp=84886411294&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84886411294&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btt110
DO - 10.1093/bioinformatics/btt110
M3 - Article
C2 - 23511543
AN - SCOPUS:84886411294
SN - 1367-4803
VL - 29
SP - 1190
EP - 1198
JO - Bioinformatics
JF - Bioinformatics
IS - 9
ER -