Application of neural networks to biological data mining: A case study in protein sequence classification

Jason T.L. Wang, Qicheng Ma, Dennis Shasha, Cathy H. Wu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Biological data mining aims to extract significant information from DNA, RNA and proteins. The significant information may refer to motifs, functional sites, clustering and classification rules. This paper presents an example of biological data mining: the classification of protein sequences using neural networks. We propose new techniques to extract features from protein data and use them in combination with the Bayesian neural network to classify protein sequences obtained from the PIR protein database maintained at the National Biomedical Research Foundation. To evaluate the performance of the proposed approach, we compare it with other protein classifiers based on sequence alignment and machine learning methods. Experimental results show the high precision of the proposed classifier and the complementarity of the tools studied in the paper.

Original language: English (US)
Title of host publication: Proceeding of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Editors: R. Ramakrishnan, S. Stolfo, R. Bayardo, I. Parsa
Pages: 305-309
Number of pages: 5
State: Published - 2000
Event: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000) - Boston, MA, United States
Duration: Aug 20 2000 to Aug 23 2000

Publication series

Name: Proceeding of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Other

Other: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2000)
Country: United States
City: Boston, MA
Period: 8/20/00 to 8/23/00

Keywords

  • Bioinformatics
  • Biological data mining
  • Feature extraction from protein data
  • Machine learning
  • Neural networks
  • Sequence alignment

ASJC Scopus subject areas

  • Engineering (all)

Cite this

Wang, J. T. L., Ma, Q., Shasha, D., & Wu, C. H. (2000). Application of neural networks to biological data mining: A case study in protein sequence classification. In R. Ramakrishnan, S. Stolfo, R. Bayardo, & I. Parsa (Eds.), Proceeding of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 305-309). (Proceeding of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining).