TY - JOUR
T1 - SUPERFAMILY - Sophisticated comparative genomics, data mining, visualization and phylogeny
AU - Wilson, Derek
AU - Pethica, Ralph
AU - Zhou, Yiduo
AU - Talbot, Charles
AU - Vogel, Christine
AU - Madera, Martin
AU - Chothia, Cyrus
AU - Gough, Julian
N1 - Funding Information:
European Union Framework Program 7 Impact grant (grant number 213037) and Medical Research Council for open access fees.
PY - 2009
Y1 - 2009
N2 - SUPERFAMILY provides structural, functional and evolutionary information for proteins from all completely sequenced genomes, and large sequence collections such as UniProt. Protein domain assignments for over 900 genomes are included in the database, which can be accessed at http://supfam.org/. Hidden Markov models based on Structural Classification of Proteins (SCOP) domain definitions at the superfamily level are used to provide structural annotation. We recently produced a new model library based on SCOP 1.73. Family level assignments are also available. From the web site users can submit sequences for SCOP domain classification; search for keywords such as superfamilies, families, organism names, models and sequence identifiers; find over- and underrepresented families or superfamilies within a genome relative to other genomes or groups of genomes; compare domain architectures across selections of genomes and finally build multiple sequence alignments between Protein Data Bank (PDB), genomic and custom sequences. Recent extensions to the database include InterPro abstracts and Gene Ontology terms for superfamilies, taxonomic visualization of the distribution of families across the tree of life, searches for functionally similar domain architectures and phylogenetic trees. The database, models and associated scripts are available for download from the ftp site.
AB - SUPERFAMILY provides structural, functional and evolutionary information for proteins from all completely sequenced genomes, and large sequence collections such as UniProt. Protein domain assignments for over 900 genomes are included in the database, which can be accessed at http://supfam.org/. Hidden Markov models based on Structural Classification of Proteins (SCOP) domain definitions at the superfamily level are used to provide structural annotation. We recently produced a new model library based on SCOP 1.73. Family level assignments are also available. From the web site users can submit sequences for SCOP domain classification; search for keywords such as superfamilies, families, organism names, models and sequence identifiers; find over- and underrepresented families or superfamilies within a genome relative to other genomes or groups of genomes; compare domain architectures across selections of genomes and finally build multiple sequence alignments between Protein Data Bank (PDB), genomic and custom sequences. Recent extensions to the database include InterPro abstracts and Gene Ontology terms for superfamilies, taxonomic visualization of the distribution of families across the tree of life, searches for functionally similar domain architectures and phylogenetic trees. The database, models and associated scripts are available for download from the ftp site.
UR - http://www.scopus.com/inward/record.url?scp=58149203228&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=58149203228&partnerID=8YFLogxK
U2 - 10.1093/nar/gkn762
DO - 10.1093/nar/gkn762
M3 - Article
C2 - 19036790
AN - SCOPUS:58149203228
SN - 0305-1048
VL - 37
SP - D380-D386
JO - Nucleic acids research
JF - Nucleic acids research
IS - SUPPL. 1
ER -