TY - GEN
T1 - Bilevel feature selection in nearly-linear time
AU - Hegde, Chinmay
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2016/8/24
Y1 - 2016/8/24
N2 - Selection of a small subset of informative features from data is a basic technique in signal processing, machine learning, and statistics. Joint selection of entire groups of features is desirable if the data features exhibit shared grouping structures. Bilevel feature selection constitutes a refinement of these ideas, producing a small subset of data features that themselves belong to a small number of feature groups. However, algorithms for bilevel feature selection suffer a computational cost that can be cubic in the size of the data, hence impeding their utility. In this paper, we propose an approach for bilevel feature selection that resolves this computational challenge. The core component of our approach is a novel fast algorithm for bilevel hard thresholding for a specific non-convex, discrete optimization problem. Our algorithm produces an approximate solution to this problem, but only incurs a nearly-linear running time. We extend this algorithm into a two-stage thresholding method that performs statistically as well as the best available methods for bilevel feature selection, but that also scales extremely well to massive dataset sizes.
AB - Selection of a small subset of informative features from data is a basic technique in signal processing, machine learning, and statistics. Joint selection of entire groups of features is desirable if the data features exhibit shared grouping structures. Bilevel feature selection constitutes a refinement of these ideas, producing a small subset of data features that themselves belong to a small number of feature groups. However, algorithms for bilevel feature selection suffer a computational cost that can be cubic in the size of the data, hence impeding their utility. In this paper, we propose an approach for bilevel feature selection that resolves this computational challenge. The core component of our approach is a novel fast algorithm for bilevel hard thresholding for a specific non-convex, discrete optimization problem. Our algorithm produces an approximate solution to this problem, but only incurs a nearly-linear running time. We extend this algorithm into a two-stage thresholding method that performs statistically as well as the best available methods for bilevel feature selection, but that also scales extremely well to massive dataset sizes.
UR - http://www.scopus.com/inward/record.url?scp=84987924556&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84987924556&partnerID=8YFLogxK
U2 - 10.1109/SSP.2016.7551783
DO - 10.1109/SSP.2016.7551783
M3 - Conference contribution
AN - SCOPUS:84987924556
T3 - IEEE Workshop on Statistical Signal Processing Proceedings
BT - 2016 19th IEEE Statistical Signal Processing Workshop, SSP 2016
PB - IEEE Computer Society
T2 - 19th IEEE Statistical Signal Processing Workshop, SSP 2016
Y2 - 25 June 2016 through 29 June 2016
ER -