
The all-or-nothing phenomenon in sparse linear regression


Bibliographic Details
Published in: Mathematical Statistics and Learning (Online), 2021-12, Vol. 3 (3), pp. 259-313
Main Authors: Reeves, Galen, Xu, Jiaming, Zadik, Ilias
Format: Article
Language:English
Description
Summary: We study the problem of recovering a hidden binary k-sparse p-dimensional vector \beta from n noisy linear observations Y=X\beta+W, where X_{ij} are i.i.d. \mathcal{N}(0,1) and W_i are i.i.d. \mathcal{N}(0,\sigma^2). A closely related hypothesis testing problem is to distinguish the pair (X,Y) generated from this structured model from a corresponding null model in which (X,Y) consist of purely independent Gaussian entries. In the low-sparsity k=o(\sqrt{p}) and high signal-to-noise ratio k/\sigma^2 \to \infty regime, we establish an "all-or-nothing" information-theoretic phase transition at a critical sample size n^{\ast}=2k\log(p/k)/\log(1+k/\sigma^2), resolving a conjecture of Gamarnik and Zadik (2017). Specifically, we show that if \liminf_{p\to\infty} n/n^{\ast}>1, then the maximum likelihood estimator almost perfectly recovers the hidden vector with high probability and, moreover, the true hypothesis can be detected with vanishing error probability. Conversely, if \limsup_{p\to\infty} n/n^{\ast}<1, then it becomes information-theoretically impossible to recover even an arbitrarily small but fixed fraction of the support of the hidden vector, or to test between the two hypotheses strictly better than random guessing.
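
As a quick illustration of the quantities in the summary, the following Python sketch draws data from the structured model Y = X\beta + W and evaluates the critical sample size n^{\ast}=2k\log(p/k)/\log(1+k/\sigma^2) at which the all-or-nothing transition is established. The specific parameter values (p, k, sigma2) and the factor 1.2 are illustrative assumptions, not taken from the paper.

import numpy as np

# Minimal sketch (illustrative assumptions, not the paper's code):
# structured model Y = X beta + W and critical sample size
# n* = 2 k log(p/k) / log(1 + k/sigma^2) from the summary above.

rng = np.random.default_rng(0)

p, k, sigma2 = 10_000, 20, 1.0     # dimension, sparsity (k = o(sqrt(p))), noise variance

# Critical sample size where the all-or-nothing transition occurs.
n_star = 2 * k * np.log(p / k) / np.log(1 + k / sigma2)

# Sample size slightly above n*, the regime of almost-perfect recovery.
n = int(1.2 * n_star)

beta = np.zeros(p)
beta[rng.choice(p, size=k, replace=False)] = 1.0    # hidden binary k-sparse vector

X = rng.standard_normal((n, p))                     # i.i.d. N(0,1) design
W = np.sqrt(sigma2) * rng.standard_normal(n)        # i.i.d. N(0, sigma^2) noise
Y = X @ beta + W                                    # structured model

# Null model for the hypothesis-testing formulation: purely independent Gaussian entries.
Y_null = rng.standard_normal(n)

print(f"n* = {n_star:.1f}, n = {n}")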
ISSN: 2520-2316, 2520-2324
DOI: 10.4171/msl/22