Sampling algorithms for ℓ 2 regression and applications

Petros Drineas, Michael W. Mahoney, S. Muthukrishnan

    Research output: Contribution to conferencePaperpeer-review

    Abstract

    We present and analyze a sampling algorithm for the basic linear-algebraic problem of ℓ 2 regression. The ℓ 2 regression (or least-squares fit) problem takes as input a matrix A ∈ ℝ n×d (where we assume n ≫ d) and a target vector b ∈ ℝ n, and it returns as output cross Z sign = min x∈ℝd |b - Ax| 2. Also of interest is x opt = A +b, where A + is the Moore-Penrose generalized inverse, which is the minimum-length vector achieving the minimum. Our algorithm randomly samples r rows from the matrix A and vector b to construct an induced ℓ 2 regression problem with many fewer rows, but with the same number of columns. A crucial feature of the algorithm is the nonuniform sampling probabilities. These probabilities depend in a sophisticated manner on the lengths, i.e., the Euclidean norms, of the rows of the left singular vectors of A and the manner in which b lies in the complement of the column space of A. Under appropriate assumptions, we show relative error approximations for both cross Z sign and x opt. Applications of this sampling methodology are briefly discussed.

    Original languageEnglish (US)
    Pages1127-1136
    Number of pages10
    DOIs
    StatePublished - 2006
    EventSeventeenth Annual ACM-SIAM Symposium on Discrete Algorithms - Miami, FL, United States
    Duration: Jan 22 2006Jan 24 2006

    Other

    OtherSeventeenth Annual ACM-SIAM Symposium on Discrete Algorithms
    Country/TerritoryUnited States
    CityMiami, FL
    Period1/22/061/24/06

    ASJC Scopus subject areas

    • Software
    • General Mathematics

    Fingerprint

    Dive into the research topics of 'Sampling algorithms for ℓ 2 regression and applications'. Together they form a unique fingerprint.

    Cite this