A Gene Selection Method Based On Outliers for Breast Cancer Subtype Classification

Rayol Mendoncaneto, David Fenyo, Zhi Li, Eduardo F. Nakamura, Fabiola Guerra Nakamura, Claudio T. Silva

Research output: Contribution to journal › Article › peer-review

Abstract

Breast cancer is the second most common cancer type and is the leading cause of cancer-related deaths worldwide. Because it is a heterogeneous disease, subtyping breast cancer plays an important role in choosing a specific treatment. Gene expression data are a viable basis for cancer subtype classification because they represent the state of a cell at the molecular level, but such datasets generally have a small number of samples relative to a large number of genes. Gene selection is a promising approach to this imbalanced, high-dimensional matrix of genes versus samples and plays an important role in the development of efficient cancer subtype classification. In this work, an innovative outlier-based gene selection (OGS) method is proposed to select relevant genes for efficient and effective classification of breast cancer subtypes. Experiments show that our strategy achieves an F1 score of 1.0 for the basal subtype and 0.86 for the HER2 subtype, the two subtypes with the worst prognoses. Compared to other methods, our proposed method achieves a higher F1 score while using 80% fewer genes. In general, our method selects only a few highly relevant genes, speeding up classification and significantly improving the classifier's performance.
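The abstract does not describe how OGS works, so the following is only a minimal sketch of the general idea of outlier-based gene ranking and of the per-subtype F1 score used for evaluation. It is not the authors' algorithm. The Tukey-fence (1.5 × IQR) outlier rule, the function names `outlier_counts`, `select_genes`, and `f1_score`, and the toy data are all assumptions made for illustration.

```python
from statistics import quantiles


def outlier_counts(expression, labels, subtype):
    """For each gene, count samples of `subtype` that fall outside the Tukey
    fences (Q1 - 1.5*IQR, Q3 + 1.5*IQR) computed from all other samples.

    Assumption: this fence rule is illustrative, not the paper's definition.
    """
    n_genes = len(expression[0])
    counts = []
    for g in range(n_genes):
        rest = [row[g] for row, lab in zip(expression, labels) if lab != subtype]
        q1, _, q3 = quantiles(rest, n=4)
        iqr = q3 - q1
        low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        counts.append(sum(
            1 for row, lab in zip(expression, labels)
            if lab == subtype and not (low <= row[g] <= high)
        ))
    return counts


def select_genes(expression, labels, subtype, k):
    """Return the indices of the k genes with the most outlier samples."""
    counts = outlier_counts(expression, labels, subtype)
    # sorted() is stable, so ties keep their original gene order.
    return sorted(range(len(counts)), key=lambda g: -counts[g])[:k]


def f1_score(y_true, y_pred, positive):
    """F1 for one subtype: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy matrix: rows are samples, columns are three genes.
toy_labels = ["basal"] * 3 + ["other"] * 5
toy_expression = [
    [5.0, 2.0, 3.0],
    [5.2, 2.1, 9.0],
    [4.9, 2.0, 3.1],
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.1],
    [0.9, 1.9, 2.9],
    [1.0, 2.0, 3.0],
    [1.2, 2.2, 3.2],
]
selected = select_genes(toy_expression, toy_labels, "basal", k=2)
```

In this sketch, a gene is ranked higher when more samples of the target subtype sit outside the expression range seen in the remaining samples. The reduced gene set would then be passed to a classifier and scored per subtype with `f1_score`.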

Keywords

  • Breast cancer
  • Feature extraction
  • Gene expression
  • Logistics
  • Support vector machines
  • Task analysis
  • Tumors
  • breast cancer
  • gene selection
  • outlier genes

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics
