TY - JOUR
T1 - Structure-based protein function prediction using graph convolutional networks
AU - Gligorijević, Vladimir
AU - Renfrew, P. Douglas
AU - Kosciolek, Tomasz
AU - Leman, Julia Koehler
AU - Berenberg, Daniel
AU - Vatanen, Tommi
AU - Chandler, Chris
AU - Taylor, Bryn C.
AU - Fisk, Ian M.
AU - Vlamakis, Hera
AU - Xavier, Ramnik J.
AU - Knight, Rob
AU - Cho, Kyunghyun
AU - Bonneau, Richard
N1 - Publisher Copyright: © 2021, The Author(s).
PY - 2021/12/1
Y1 - 2021/12/1
AB - The rapid increase in the number of proteins in sequence databases and the diversity of their functions challenge computational approaches for automated function prediction. Here, we introduce DeepFRI, a Graph Convolutional Network for predicting protein functions by leveraging sequence features extracted from a protein language model and protein structures. It outperforms current leading methods and sequence-based Convolutional Neural Networks and scales to the size of current sequence repositories. Augmenting the training set of experimental structures with homology models allows us to significantly expand the number of predictable functions. DeepFRI has significant de-noising capability, with only a minor drop in performance when experimental structures are replaced by protein models. Class activation mapping allows function predictions at an unprecedented resolution, allowing site-specific annotations at the residue-level in an automated manner. We show the utility and high performance of our method by annotating structures from the PDB and SWISS-MODEL, making several new confident function predictions. DeepFRI is available as a webserver at https://beta.deepfri.flatironinstitute.org/.
UR - http://www.scopus.com/inward/record.url?scp=85106878177&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85106878177&partnerID=8YFLogxK
U2 - 10.1038/s41467-021-23303-9
DO - 10.1038/s41467-021-23303-9
M3 - Article
C2 - 34039967
AN - SCOPUS:85106878177
SN - 2041-1723
VL - 12
JO - Nature Communications
JF - Nature Communications
IS - 1
M1 - 3168
ER -