I blogged about the Douglas-Rachford method before here and here. It’s a method to solve monotone inclusions in the form

\displaystyle 0 \in Ax + Bx

with monotone multivalued operators {A,B} from a Hilbert space into itself. Using the resolvent {J_{A} = (I+A)^{-1}} and the reflector {R_{A} = 2J_{A} - I}, the Douglas-Rachford iteration is concisely written as

\displaystyle u^{n+1} = \tfrac12(I + R_{B}R_{A})u_{n}.

The convergence of the method has been clarified is a number of papers, see, e.g.

Lions, Pierre-Louis, and Bertrand Mercier. “Splitting algorithms for the sum of two nonlinear operators.” SIAM Journal on Numerical Analysis 16.6 (1979): 964-979.

for the first treatment in the context of monotone operators and

Svaiter, Benar Fux. “On weak convergence of the Douglas–Rachford method.” SIAM Journal on Control and Optimization 49.1 (2011): 280-287.

for a recent very general convergence result.

Since {tA} is monotone if {A} is monotone and {t>0}, we can introduce a stepsize for the Douglas-Rachford iteration

\displaystyle u^{n+1} = \tfrac12(I + R_{tB}R_{tA})u^{n}.

It turns out, that this stepsize matters a lot in practice; too small and too large stepsizes lead to slow convergence. It is a kind of folk wisdom, that there is “sweet spot” for the stepsize. In a recent preprint Quoc Tran-Dinh and I investigated this sweet spot in the simple case of linear operators {A} and {B} and this tweet has a visualization.

A few days ago Walaa Moursi and Lieven Vandenberghe published the preprint “Douglas-Rachford splitting for a Lipschitz continuous and a strongly monotone operator” and derived some linear convergence rates in the special case they mention in the title. One result (Theorem 4.3) goes as follows: If {A} is monotone and Lipschitz continuous with constant {\beta} and {B} is maximally monotone and {\mu}-strongly monotone, than the Douglas-Rachford iterates converge strongly to a solution with a linear rate

\displaystyle r = \tfrac{1}{2(1+\mu)}\left(\sqrt{2\mu^{2}+2\mu + 1 +2(1 - \tfrac{1}{(1+\beta)^{2}} - \tfrac1{1+\beta^{2}})\mu(1+\mu)} + 1\right).

This is a surprisingly complicated expression, but there is a nice thing about it: It allows to optimize for the stepsize! The rate depends on the stepsize as

\displaystyle r(t) = \tfrac{1}{2(1+t\mu)}\left(\sqrt{2t^{2}\mu^{2}+2t\mu + 1 +2(1 - \tfrac{1}{(1+t\beta)^{2}} - \tfrac1{1+t^{2}\beta^{2}})t\mu(1+t\mu)} + 1\right)

and the two plots of this function below show the sweet spot clearly.


If one knows the Lipschitz constant of {A} and the constant of strong monotonicity of {B}, one can minimize {r(t)} to get on optimal stepsize (in the sense that the guaranteed contraction factor is as small as possible). As Moursi and Vandenberghe explain in their Remark 5.4, this optimization involves finding the root of a polynomial of degree 5, so it is possible but cumbersome.

Now I wonder if there is any hope to show that the adaptive stepsize Quoc and I proposed here (which basically amounts to {t_{n} = \|u^{n}\|/\|Au^{n}\|} in the case of single valued {A} – note that the role of {A} and {B} is swapped in our paper) is able to find the sweet spot (experimentally it does).