Directional false discovery rate control in large-scale multiple comparisons

Wenjuan Liang, Dongdong Xiang, Yajun Mei, Wendong Li

Research output: Contribution to journal, Article, peer-review

Abstract

Advances in high-throughput biomedical technology have made massive measurements of gene expression levels accessible. An important statistical problem is to identify both the under-expressed and the over-expressed genes for a disease. Most existing multiple-testing procedures select only the non-null, or significant, genes without identifying their expression type. The few methods designed for the directional problem rely on a single unified index that combines all false discoveries, and therefore fail to control the numbers of falsely discovered over-expressed and under-expressed genes separately. In this paper, based on a three-classification multiple testing framework, we propose a practical data-driven procedure that controls the false discoveries in the two directions separately. The proposed procedure is theoretically valid and optimal in the sense that it maximizes the expected number of true discoveries while simultaneously controlling the false discovery rates for under-expressed and over-expressed genes. The procedure allows different nominal levels for the two directions, which makes it flexible in practice. Extensive numerical results and analyses of two large-scale genomic datasets demonstrate the effectiveness of the procedure.
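
To make the idea of separate directional control concrete, the following is a minimal illustrative sketch, not the authors' procedure. It assumes that each gene already comes with estimated posterior probabilities of being under-expressed (`p_minus`) and over-expressed (`p_plus`) under some three-group model. How those probabilities are estimated is outside the sketch, and the function name `directional_select` and its parameters are hypothetical. Within each direction, genes are ranked by their estimated probability of being a false discovery, and the largest set whose average estimated error stays at or below that direction's nominal level is selected. This is a generic step-up rule of the kind used in local-FDR-based methods.

```python
import numpy as np


def directional_select(p_minus, p_plus, alpha_minus=0.05, alpha_plus=0.05):
    """Return -1 (under-expressed), +1 (over-expressed) or 0 (no call) per gene.

    p_minus, p_plus: assumed posterior probabilities of each direction
    (hypothetical inputs; their estimation is not shown here).
    """
    p_minus = np.asarray(p_minus, dtype=float)
    p_plus = np.asarray(p_plus, dtype=float)
    decisions = np.zeros(p_plus.shape[0], dtype=int)

    directions = (
        (-1, p_minus, p_plus, alpha_minus),
        (+1, p_plus, p_minus, alpha_plus),
    )
    for sign, p_dir, p_other, alpha in directions:
        # A gene is a candidate only for its more probable direction,
        # so it can never be selected in both directions.
        candidates = np.flatnonzero(p_dir > p_other)
        if candidates.size == 0:
            continue
        # Estimated probability that calling this direction is a false discovery.
        err = 1.0 - p_dir[candidates]
        order = np.argsort(err, kind="stable")
        # Average estimated error among the k most confident candidates.
        running_mean = np.cumsum(err[order]) / np.arange(1, order.size + 1)
        passing = np.flatnonzero(running_mean <= alpha)
        if passing.size:
            k = passing[-1] + 1
            decisions[candidates[order[:k]]] = sign
    return decisions


# Toy usage with made-up probabilities and a looser level for one direction.
p_plus = [0.99, 0.97, 0.60, 0.01, 0.02, 0.00]
p_minus = [0.00, 0.01, 0.05, 0.98, 0.50, 0.00]
print(directional_select(p_minus, p_plus, alpha_minus=0.05, alpha_plus=0.15))
```

Because each direction has its own level, loosening `alpha_plus` changes only the over-expressed selections and leaves the under-expressed ones untouched, which is the flexibility the abstract refers to.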

Original language: English (US)
Pages (from-to): 3195-3214
Number of pages: 20
Journal: Journal of Applied Statistics
Volume: 51
Issue number: 15
State: Published - 2024

Keywords

  • Gene expression
  • data-driven
  • marginal FDR
  • multiple testing
  • separate control

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
