A Gene Selection Method Based on Outliers for Breast Cancer Subtype Classification

Rayol Mendonca-Neto, Zhi Li, David Fenyo, Claudio T. Silva, Fabiola G. Nakamura, Eduardo F. Nakamura

Research output: Contribution to journal › Article › peer-review


Breast cancer is the second most common cancer type and is the leading cause of cancer-related deaths worldwide. Because it is a heterogeneous disease, subtyping plays an important role in choosing a targeted treatment. Gene expression data are a viable basis for cancer subtype classification because they represent the state of a cell at the molecular level, but such datasets generally contain few samples relative to a large number of genes. Gene selection is a promising approach that addresses this uneven, high-dimensional matrix of genes versus samples and plays an important role in the development of efficient cancer subtype classification. In this work, an innovative outlier-based gene selection (OGS) method is proposed to select relevant genes for efficient and effective classification of breast cancer subtypes. Experiments show that our strategy achieves F1 scores of 1.0 for the basal subtype and 0.86 for the HER2 subtype, the two subtypes with the worst prognoses. Compared to other methods, our method achieves a higher F1 score while using 80% fewer genes. In general, our method selects only a few highly relevant genes, speeding up classification and significantly improving the classifier's performance.
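The abstract does not describe how OGS identifies outlier genes, so the following is only a rough sketch of what an outlier-driven gene filter could look like, not the authors' algorithm. It assumes Tukey's IQR fences as the outlier rule and scores each gene by the fraction of a subtype's samples that fall outside the fences computed from all other samples. The function names (`outlier_gene_scores`, `select_genes`), the fence multiplier, and the synthetic data are all hypothetical.

```python
import numpy as np

def outlier_gene_scores(X, y, subtype, k=1.5):
    """Per-gene fraction of `subtype` samples outside Tukey fences.

    Fences are computed from the samples that do NOT belong to `subtype`.
    Assumption: this IQR rule stands in for whatever outlier criterion OGS uses.
    """
    in_group = (y == subtype)
    ref = X[~in_group]                       # reference samples (other subtypes)
    q1, q3 = np.percentile(ref, [25, 75], axis=0)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    grp = X[in_group]
    outside = (grp < lower) | (grp > upper)  # boolean matrix: samples x genes
    return outside.mean(axis=0)

def select_genes(X, y, n_per_subtype=2):
    """Union of the top-scoring gene indices for each subtype."""
    selected = set()
    for subtype in np.unique(y):
        scores = outlier_gene_scores(X, y, subtype)
        top = np.argsort(scores)[::-1][:n_per_subtype]
        selected.update(int(g) for g in top)
    return sorted(selected)

# Synthetic toy data: 60 samples x 50 genes, three made-up subtype labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 50))
y = np.array(["basal"] * 20 + ["her2"] * 20 + ["luminal"] * 20)
X[y == "basal", 3] += 8.0   # gene 3 is strongly shifted in "basal" samples
X[y == "her2", 7] += 8.0    # gene 7 is strongly shifted in "her2" samples

selected = select_genes(X, y, n_per_subtype=1)
print(selected)
```

On this toy matrix the two planted genes (indices 3 and 7) are among the selected indices. The reduced matrix `X[:, selected]` could then be passed to any classifier and evaluated with per-subtype F1 scores, as the abstract reports.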

Original language: English (US)
Pages (from-to): 2547-2559
Number of pages: 13
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Issue number: 5
State: Published - 2022


Keywords

  • Gene expression
  • breast cancer
  • gene selection
  • outlier genes

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics

