TY - GEN
T1 - Learning to Prune Instances of k-median and Related Problems
AU - Tayebi, Dena
AU - Ray, Saurabh
AU - Ajwani, Deepak
N1 - Publisher Copyright:
Copyright © 2022 by SIAM.
PY - 2022
Y1 - 2022
N2 - In a large number of industrial applications, combinatorial optimization problems are repeatedly solved with datasets from similar distribution. In recent years, machine learning techniques have been shown to be quite effective in speeding up such computations. However, black-box end-to-end machine learning approaches suffer from poor interpretability and the requirement for a large amount of labelled data. In this paper, we demonstrate a simple and highly effective way to incorporate the insights from the algorithmic and optimization literature on these problems into a machine learning framework to speed-up the solutions of these problems. We study the k-median problem and the following closely related combinatorial optimization problems: set cover, max coverage and uncapacitated facility location. These problems are well studied and a large number of approximation algorithms have been designed for these problems. We look at the kind of quantities these approximation algorithms employ and use these to derive useful features for training a classifier that helps quickly reduce the problem size by identifying the difficult core of the problem and pruning the remainder. The difficult core is then solved using an ILP solver. A prime advantage of using such features is that we do not require much data to train the classifier. This results in a much faster algorithm than just using the Integer Linear Programming solver. Several of the features we used for the classifier were derived from approximation algorithms that were designed for the metric instances of the k-median and the uncapacitated facility location problem. However, remarkably, the use of these features also leads to significant speed up in the non-metric instances of these problems and even the other two related problems.
AB - In a large number of industrial applications, combinatorial optimization problems are repeatedly solved with datasets from similar distribution. In recent years, machine learning techniques have been shown to be quite effective in speeding up such computations. However, black-box end-to-end machine learning approaches suffer from poor interpretability and the requirement for a large amount of labelled data. In this paper, we demonstrate a simple and highly effective way to incorporate the insights from the algorithmic and optimization literature on these problems into a machine learning framework to speed-up the solutions of these problems. We study the k-median problem and the following closely related combinatorial optimization problems: set cover, max coverage and uncapacitated facility location. These problems are well studied and a large number of approximation algorithms have been designed for these problems. We look at the kind of quantities these approximation algorithms employ and use these to derive useful features for training a classifier that helps quickly reduce the problem size by identifying the difficult core of the problem and pruning the remainder. The difficult core is then solved using an ILP solver. A prime advantage of using such features is that we do not require much data to train the classifier. This results in a much faster algorithm than just using the Integer Linear Programming solver. Several of the features we used for the classifier were derived from approximation algorithms that were designed for the metric instances of the k-median and the uncapacitated facility location problem. However, remarkably, the use of these features also leads to significant speed up in the non-metric instances of these problems and even the other two related problems.
UR - http://www.scopus.com/inward/record.url?scp=85127388254&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85127388254&partnerID=8YFLogxK
U2 - 10.1137/1.9781611977042.15
DO - 10.1137/1.9781611977042.15
M3 - Conference contribution
AN - SCOPUS:85127388254
T3 - Proceedings of the Workshop on Algorithm Engineering and Experiments
SP - 184
EP - 194
BT - SIAM Symposium on Algorithm Engineering and Experiments, ALENEX 2022
PB - Society for Industrial and Applied Mathematics Publications
T2 - 2022 SIAM Symposium on Algorithm Engineering and Experiments, ALENEX 2022
Y2 - 9 January 2022 through 10 January 2022
ER -