TY - GEN
T1 - Simple heuristics yield provable algorithms for masked low-rank approximation
AU - Musco, Cameron
AU - Musco, Christopher
AU - Woodruff, David P.
N1 - Funding Information:
Funding David P. Woodruff : David Woodruff would like to thank support from the Office of Naval Research (ONR) grant N00014-18-1-2562.
Publisher Copyright:
© Cameron Musco, Christopher Musco, and David P. Woodruff.
PY - 2021/2/1
Y1 - 2021/2/1
N2 - In the masked low-rank approximation problem, one is given data matrix A ∈ Rn×n and binary mask matrix W ∈ {0, 1}n×n. The goal is to find a rank-k matrix L for which: n n cost(L) def = X X Wi,j · (Ai,j - Li,j)2 ≤ OPT + εkAk2F , i=1 j=1 where OPT = minrank-k L cost(L) and ε is a given error parameter. Depending on the choice of W, the above problem captures factor analysis, low-rank plus diagonal decomposition, robust PCA, low-rank matrix completion, low-rank plus block matrix approximation, low-rank recovery from monotone missing data, and a number of other important problems. Many of these problems are NP-hard, and while algorithms with provable guarantees are known in some cases, they either 1) run in time nΩ(k2 /ε) or 2) make strong assumptions, for example, that A is incoherent or that the entries in W are chosen independently and uniformly at random. In this work, we show that a common polynomial time heuristic, which simply sets A to 0 where W is 0, and then finds a standard low-rank approximation, yields bicriteria approximation guarantees for this problem. In particular, for rank k0 > k depending on the public coin partition number of W, the heuristic outputs rank-k0 L with cost(L) ≤ OPT + εkAk2F . This partition number is in turn bounded by the randomized communication complexity of W, when interpreted as a two-player communication matrix. For many important cases, including all those listed above, this yields bicriteria approximation guarantees with rank k0 = k · poly(log n/ε). Beyond this result, we show that different notions of communication complexity yield bicriteria algorithms for natural variants of masked low-rank approximation. For example, multi-player number-in-hand communication complexity connects to masked tensor decomposition and non-deterministic communication complexity to masked Boolean low-rank factorization.
AB - In the masked low-rank approximation problem, one is given data matrix A ∈ Rn×n and binary mask matrix W ∈ {0, 1}n×n. The goal is to find a rank-k matrix L for which: n n cost(L) def = X X Wi,j · (Ai,j - Li,j)2 ≤ OPT + εkAk2F , i=1 j=1 where OPT = minrank-k L cost(L) and ε is a given error parameter. Depending on the choice of W, the above problem captures factor analysis, low-rank plus diagonal decomposition, robust PCA, low-rank matrix completion, low-rank plus block matrix approximation, low-rank recovery from monotone missing data, and a number of other important problems. Many of these problems are NP-hard, and while algorithms with provable guarantees are known in some cases, they either 1) run in time nΩ(k2 /ε) or 2) make strong assumptions, for example, that A is incoherent or that the entries in W are chosen independently and uniformly at random. In this work, we show that a common polynomial time heuristic, which simply sets A to 0 where W is 0, and then finds a standard low-rank approximation, yields bicriteria approximation guarantees for this problem. In particular, for rank k0 > k depending on the public coin partition number of W, the heuristic outputs rank-k0 L with cost(L) ≤ OPT + εkAk2F . This partition number is in turn bounded by the randomized communication complexity of W, when interpreted as a two-player communication matrix. For many important cases, including all those listed above, this yields bicriteria approximation guarantees with rank k0 = k · poly(log n/ε). Beyond this result, we show that different notions of communication complexity yield bicriteria algorithms for natural variants of masked low-rank approximation. For example, multi-player number-in-hand communication complexity connects to masked tensor decomposition and non-deterministic communication complexity to masked Boolean low-rank factorization.
KW - Bicriteria approximation algorithms
KW - Communication complexity
KW - Low-rank approximation
KW - Weighted low-rank approximation
UR - http://www.scopus.com/inward/record.url?scp=85108734821&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85108734821&partnerID=8YFLogxK
U2 - 10.4230/LIPIcs.ITCS.2021.6
DO - 10.4230/LIPIcs.ITCS.2021.6
M3 - Conference contribution
AN - SCOPUS:85108734821
T3 - Leibniz International Proceedings in Informatics, LIPIcs
BT - 12th Innovations in Theoretical Computer Science Conference, ITCS 2021
A2 - Lee, James R.
PB - Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing
T2 - 12th Innovations in Theoretical Computer Science Conference, ITCS 2021
Y2 - 6 January 2021 through 8 January 2021
ER -