Acceleration by stepsize hedging: Silver Stepsize Schedule for smooth convex optimization
Published in: Mathematical Programming, 2024-11
Main Authors: Altschuler, Jason M.; Parrilo, Pablo A.
Format: Article
Language: English
Summary: We provide a concise, self-contained proof that the Silver Stepsize Schedule proposed in our companion paper directly applies to smooth (non-strongly) convex optimization. Specifically, we show that with these stepsizes, gradient descent computes an $\varepsilon$-minimizer in $O(\varepsilon^{-\log_{\rho} 2}) = O(\varepsilon^{-0.7864})$ iterations, where $\rho = 1+\sqrt{2}$ is the silver ratio. This is intermediate between the textbook unaccelerated rate $O(\varepsilon^{-1})$ and the accelerated rate $O(\varepsilon^{-1/2})$ due to Nesterov in 1983. The Silver Stepsize Schedule is a simple explicit fractal: the $i$-th stepsize is $1 + \rho^{\nu(i)-1}$, where $\nu(i)$ is the 2-adic valuation of $i$. The design and analysis are conceptually identical to the strongly convex setting in our companion paper, but simplify remarkably in this specific setting.
ISSN: 0025-5610, 1436-4646
DOI: 10.1007/s10107-024-02164-2
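As a toy illustration of the stepsize formula in the summary above, the sketch below generates the schedule $1 + \rho^{\nu(i)-1}$ and runs plain gradient descent with it on a randomly generated smooth convex quadratic. The test problem, the 63-step horizon, and the rescaling of the stepsizes by $1/L$ (the smoothness constant) are assumptions made for this demo only, not details taken from the paper.

```python
import numpy as np

RHO = 1 + np.sqrt(2)  # the silver ratio

def nu(i: int) -> int:
    """2-adic valuation of i: exponent of the largest power of 2 dividing i."""
    return (i & -i).bit_length() - 1

def silver_stepsizes(n: int) -> np.ndarray:
    """First n stepsizes of the Silver Stepsize Schedule, in units of 1/L:
    the i-th stepsize is 1 + RHO**(nu(i) - 1)."""
    return np.array([1.0 + RHO ** (nu(i) - 1) for i in range(1, n + 1)])

# Hypothetical smooth convex quadratic f(x) = 0.5 * x' A x (a toy problem,
# not one from the paper).
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 10))
A = M.T @ M                      # positive semidefinite, so f is smooth and convex
L = np.linalg.eigvalsh(A).max()  # smoothness constant of f

x = rng.standard_normal(10)
for h in silver_stepsizes(63):   # 63 scheduled steps (horizon chosen for the demo)
    x = x - (h / L) * (A @ x)    # gradient step: grad f(x) = A x

print("first few stepsizes:", np.round(silver_stepsizes(7), 4))
print("f(x) after 63 silver steps:", 0.5 * x @ A @ x)
```

The first few stepsizes produced by this formula are $\sqrt{2},\ 2,\ \sqrt{2},\ 2+\sqrt{2},\ \sqrt{2},\ 2,\ \sqrt{2},\ \dots$, showing the fractal pattern driven by the 2-adic valuation: larger steps appear exactly at indices divisible by higher powers of 2.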