TY - JOUR
T1 - Profiling the malaria genome
T2 - A gene survey of three species of malaria parasite with comparison to other apicomplexan species
AU - Carlton, Jane M R
AU - Muller, Ralhston
AU - Yowell, Charles A.
AU - Fluegge, Michelle R.
AU - Sturrock, Kenneth A.
AU - Pritt, Jonathan R.
AU - Vargas-Serrato, Esmeralda
AU - Galinski, Mary R.
AU - Barnwell, John W.
AU - Mulder, Nicola
AU - Kanapin, Alexander
AU - Cawley, Simon E.
AU - Hide, Winston A.
AU - Dame, John B.
N1 - Funding Information:
Sequence data were obtained at the University of Florida, supported by a subcontract to J.B. Dame from National Institutes of Health contract N01-A1-65315. Data analysis was performed by J.M.-R. Carlton at the National Center for Biotechnology Information, by R. Muller and W.A. Hide at the South African National Bioinformatics Institute, and by A. Kanapin and N. Mulder at the European Bioinformatics Institute. We thank Chuong Huynh for help with the submission of data to GenBank, and Alan Christoffels for Perl scripts and parsers. The work of R. Muller was supported by a grant from the South African National Research Foundation. This paper is dedicated to the memory of R. Muller, who was a joy and delight to work with and will be greatly missed.
PY - 2001
Y1 - 2001
AB - We have undertaken the first comparative pilot gene discovery analysis of approximately 25 000 random genomic and expressed sequence tags (ESTs) from three species of Plasmodium, the infectious agent that causes malaria. A total of 5482 genome survey sequences (GSSs) and 5582 ESTs were generated from mung bean nuclease (MBN) and cDNA libraries, respectively, of the ANKA line of the rodent malaria parasite Plasmodium berghei, and 10 874 GSSs were generated from MBN libraries of the Salvador I and Belem lines of Plasmodium vivax, the most geographically widespread human malaria pathogen. These tags, together with 2438 Plasmodium falciparum sequences present in GenBank, were used to perform first-pass assembly and transcript reconstruction, and non-redundant consensus sequence datasets were created. The datasets were compared against public protein databases and more than 1000 putative new Plasmodium proteins were identified based on sequence similarity. Homologs of previously characterized Plasmodium genes were also identified, increasing the number of P. vivax and P. berghei sequences in public databases at least 10-fold. Comparative studies with other species of Apicomplexa identified interesting homologs of possible therapeutic or diagnostic value. A gene prediction program, Phat, was used to predict probable open reading frames for proteins in all three datasets. Predicted and non-redundant BLAST-matched proteins were submitted to InterPro, an integrated database of protein domains, signatures and families, for functional classification. Thus a partial predicted proteome was created for each species. This first comparative analysis of Plasmodium protein coding sequences represents a valuable resource for further studies on the biology of this important pathogen.
KW - Apicomplexa
KW - Comparative genomics
KW - Malaria
KW - Proteome
UR - http://www.scopus.com/inward/record.url?scp=0035662389&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0035662389&partnerID=8YFLogxK
U2 - 10.1016/S0166-6851(01)00371-1
DO - 10.1016/S0166-6851(01)00371-1
M3 - Article
C2 - 11738710
AN - SCOPUS:0035662389
SN - 0166-6851
VL - 118
SP - 201
EP - 210
JO - Molecular and Biochemical Parasitology
JF - Molecular and Biochemical Parasitology
IS - 2
ER -