Bilevel feature selection in nearly-linear time

Chinmay Hegde

    Research output: Chapter in Book/Report/Conference proceedingConference contribution


    Selection of a small subset of informative features from data is a basic technique in signal processing, machine learning, and statistics. Joint selection of entire groups of features is desirable if the data features exhibit shared grouping structures. Bilevel feature selection constitutes a refinement of these ideas, producing a small subset of data features that themselves belong to a small number of feature groups. However, algorithms for bilevel feature selection suffer a computational cost that can be cubic in the size of the data, hence impeding their utility. In this paper, we propose an approach for bilevel feature selection that resolves this computational challenge. The core component of our approach is a novel fast algorithm for bilevel hard thresholding for a specific non-convex, discrete optimization problem. Our algorithm produces an approximate solution to this problem, but only incurs a nearly-linear running time. We extend this algorithm into a two-stage thresholding method that performs statistically as well as the best available methods for bilevel feature selection, but that also scales extremely well to massive dataset sizes.

    Original languageEnglish (US)
    Title of host publication2016 19th IEEE Statistical Signal Processing Workshop, SSP 2016
    PublisherIEEE Computer Society
    ISBN (Electronic)9781467378024
    StatePublished - Aug 24 2016
    Event19th IEEE Statistical Signal Processing Workshop, SSP 2016 - Palma de Mallorca, Spain
    Duration: Jun 25 2016Jun 29 2016

    Publication series

    NameIEEE Workshop on Statistical Signal Processing Proceedings


    Other19th IEEE Statistical Signal Processing Workshop, SSP 2016
    CityPalma de Mallorca

    ASJC Scopus subject areas

    • Electrical and Electronic Engineering
    • Applied Mathematics
    • Signal Processing
    • Computer Science Applications


    Dive into the research topics of 'Bilevel feature selection in nearly-linear time'. Together they form a unique fingerprint.

    Cite this