A survey on Neyman-Pearson classification and suggestions for future research

Xin Tong, Yang Feng, Anqi Zhao

Research output: Contribution to journal › Review article › peer-review

Abstract

In statistics and machine learning, classification studies how to automatically learn to make good qualitative predictions (i.e., assign class labels) based on past observations. Examples of classification problems include email spam filtering, fraud detection, and market segmentation. Binary classification, in which the class label takes one of two values, is arguably the most widely used setting in machine learning applications. Most existing binary classification methods target the minimization of the overall classification risk and may fail to serve real-world applications, such as cancer diagnosis, where users are more concerned with the risk of misclassifying one specific class than the other. The Neyman-Pearson (NP) paradigm was introduced in this context as a statistical framework for handling asymmetric type I/II error priorities: it seeks classifiers that minimize the type II error subject to a type I error constraint at a user-specified level. Although NP classification has the potential to become an important subfield of the classification literature, it has not yet received much attention in the statistics and machine learning communities. This article surveys the current status of the NP classification literature. To stimulate readers' research interests, the authors also envision a few possible directions for future research on the NP paradigm and its applications.
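For readers unfamiliar with the formulation sketched in the abstract, the NP classification problem can be written as follows (the notation is ours, not quoted from the survey). Given a classifier $\phi: \mathcal{X} \to \{0,1\}$ with type I error $R_0(\phi) = \mathbb{P}(\phi(X)=1 \mid Y=0)$ and type II error $R_1(\phi) = \mathbb{P}(\phi(X)=0 \mid Y=1)$, the NP paradigm seeks

$$\phi^*_\alpha \in \arg\min_{\phi:\; R_0(\phi) \le \alpha} R_1(\phi),$$

where $\alpha \in (0,1)$ is the user-specified upper bound on the type I error.

Below is a minimal, purely illustrative Python sketch of one way to target such a constraint in practice: threshold an arbitrary score function so that the empirical type I error on held-out class-0 data stays at or below $\alpha$. The function and variable names are hypothetical, and this simple empirical calibration does not carry the high-probability type I error guarantees studied in the NP classification literature.

```python
import numpy as np

def np_threshold(scores_class0, alpha):
    """Pick a threshold whose empirical type I error (fraction of
    held-out class-0 scores exceeding it) is at most alpha.
    Larger scores are assumed to indicate class 1."""
    s = np.sort(scores_class0)
    n = len(s)
    # Number of class-0 points allowed to fall above the threshold.
    k = int(np.floor(alpha * n))
    # Threshold at the (n - k)-th order statistic; scores above it
    # are classified as class 1.
    return s[n - k - 1] if k < n else -np.inf

# Example with synthetic scores: class 0 ~ N(0, 1), class 1 ~ N(2, 1).
rng = np.random.default_rng(0)
scores0 = rng.normal(0.0, 1.0, size=1000)
scores1 = rng.normal(2.0, 1.0, size=1000)

t = np_threshold(scores0, alpha=0.05)
type1 = np.mean(scores0 > t)    # at most 0.05 by construction
type2 = np.mean(scores1 <= t)   # type II error at this threshold
print(f"threshold={t:.2f}, type I={type1:.3f}, type II={type2:.3f}")
```

The design choice here is that the type I error constraint, not the overall risk, drives the threshold; more refined procedures discussed in the survey replace this empirical calibration with choices that control the type I error with high probability.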

Original language: English (US)
Pages (from-to): 64-81
Number of pages: 18
Journal: Wiley Interdisciplinary Reviews: Computational Statistics
Volume: 8
Issue number: 2
DOIs
State: Published - Mar 1 2016

Keywords

  • Classification
  • High Dimension
  • Neyman-Pearson Paradigm
  • Plug-In Methods

ASJC Scopus subject areas

  • Statistics and Probability
