$\displaystyle \min_{x}\max_{y}F(x) + \langle Kx,y\rangle - G(y). \ \ \ \ \ (1)$

(where I omit all the standard assumptions, like convexity, continuity ans such…). Fenchel-Rockafellar duality says that solutions are characterized by the inclusion

$\displaystyle 0 \in\left( \begin{bmatrix} \partial F & 0\\ 0 & \partial G \end{bmatrix} + \begin{bmatrix} 0 & K^{T}\\ -K & 0 \end{bmatrix}\right) \begin{bmatrix} x^{*}\\y^{*} \end{bmatrix}$

Noting that the operators

$\displaystyle A = \begin{bmatrix} \partial F & 0\\ 0 & \partial G \end{bmatrix},\quad B = \begin{bmatrix} 0 & K^{T}\\ -K & 0 \end{bmatrix}$

are both monotone, we may apply any of the splitting methods available, for example the Douglas-Rachford method. In terms of resolvents

$\displaystyle R_{tA}(z) := (I+tA)^{-1}(z)$

$\displaystyle \begin{array}{rcl} z^{k+1} & = & R_{tB}(\bar z^{k})\\ \bar z^{k+1}& = & R_{tA}(2z^{k+1}-\bar z^{k}) + \bar z^{k}-z^{k+1}. \end{array}$

For the saddle point problem, this iteration is (with ${z = (x,y)}$)

$\displaystyle \begin{array}{rcl} x^{k+1} &=& R_{t\partial F}(\bar x^{k})\\ y^{k+1} &=& R_{t\partial G}(\bar y^{k})\\ \begin{bmatrix} \bar x^{k+1}\\ \bar y^{k+1} \end{bmatrix} & = & \begin{bmatrix} I & tK^{T}\\ -tK & I \end{bmatrix}^{-1} \begin{bmatrix} 2x^{k+1}-\bar x^{k}\\ 2y^{k+1}-\bar y^{k} \end{bmatrix} + \begin{bmatrix} \bar x^{k}- x^{k+1}\\ \bar y^{k}-y^{k+1} \end{bmatrix}. \end{array}$

The first two lines involve proximal steps and we assume that they are simple to implement. The last line, however, involves the solution of a large linear system. This can be broken down to a slightly smaller linear system involving the matrix ${(I+t^{2}K^{T}K)}$ as follows: The linear system equals

$\displaystyle \begin{array}{rcl} \bar x^{k+1} & = & x^{k+1} - tK^{T}(y^{k+1}+\bar y^{k+1}-\bar y^{k})\\ \bar y^{k+1} & = & y^{k+1} + tK(x^{k+1} + \bar x^{k+1}-\bar x^{k}). \end{array}$

Plugging ${\bar y^{k+1}}$ from the second equation into the first gives

$\displaystyle \bar x^{k+1} = x^{k+1} - tK^{T}(2y^{k+1}-\bar y^{k}) - tK^{T}K(x^{k+1}-\bar x^{k+1}-\bar x^{k})$

Denoting ${d^{k+1}= x^{k+1}+\bar x^{k+1}-\bar x^{k}}$ this can be written as

$\displaystyle (I+t^{2}K^{T}K)d^{k+1} = (2x^{k+1}-\bar x^{k}) - tK^{T}(2y^{k+1}-\bar y^{k}).$

and the second equation is just

$\displaystyle \bar y^{k+1} = y^{k+1} + tKd^{k+1}.$

This gives the overall iteration

$\displaystyle \begin{array}{rcl} x^{k+1} &=& R_{t\partial F}(\bar x^{k})\\ y^{k+1} &=& R_{t\partial G}(\bar y^{k})\\ d^{k+1} &=& (I+t^{2}K^{T}K)^{-1}(2x^{k+1}-\bar x^{k} - tK(2y^{k+1}-\bar y^{k}))\\ \bar x^{k+1}&=& \bar x^{k}-x^{k+1}+d^{k+1}\\ \bar y^{k+1}&=& y^{k+1}+tKd^{k+1} \end{array}$

This is nothing else than using the Schur complement or factoring as

$\displaystyle \begin{bmatrix} I & tK^{T}\\ -tK & I \end{bmatrix} = \begin{bmatrix} 0 & 0\\ 0 & I \end{bmatrix} + \begin{bmatrix} I\\tK \end{bmatrix} (I + t^{2}K^{T}K)^{-1} \begin{bmatrix} I & -tK^{T} \end{bmatrix}$

and has been applied to imaging problems by O’Connor and Vandenberghe in “Primal-Dual Decomposition by Operator Splitting and Applications to Image Deblurring” (doi). For many problems in imaging, the involved inversion may be fairly easy to perform (if ${K}$ is the image gradient, for example, we only need to solve an equation with an operator like ${(I - t^{2}\Delta)}$ and appropriate boundary conditions). However, there are problems where this inversion is a problem.

I’d like to show the following trick to circumvent the matrix inversion, which I learned from Bredies and Sun’s “Accelerated Douglas-Rachford methods for the solution of convex-concave saddle-point problems”: Here is a slightly different saddle point problem

$\displaystyle \min_{x}\max_{y,x_{p}}F(x) + \langle Kx,y\rangle + \langle Hx,x_{p}\rangle- G(y) - I_{\{0\}}(x_{p}). \ \ \ \ \ (2)$

We added a new dual variable ${x_{p}}$, which is forced to be zero by the additional indicator functional ${I_{\{0\}}}$. Hence, the additional bilinear term ${\langle Hx,x_{p}\rangle}$ is also zero, and we see that ${(x,y)}$ is a solution of (1) if and only if ${(x,y,0)}$ is a solution of (2). In other words: The problem just looks differently, but is, in essence, the same as before.

Now let us write down the Douglas-Rachford iteration for (2). We write this problem as

$\displaystyle \min_{x}\max_{\tilde y} F(x) + \langle \tilde Kx,\tilde y\rangle -\tilde G(\tilde y)$

with

$\displaystyle \tilde y = \begin{bmatrix} y\\x_{p} \end{bmatrix}, \quad \tilde K = \begin{bmatrix} K\\H \end{bmatrix}, \quad \tilde G(\tilde y) = \tilde G(y,x_{p}) = G(y) + I_{\{0\}}(x_{p}).$

Writing down the Douglas-Rachford iteration gives

$\displaystyle \begin{array}{rcl} x^{k+1} &=& R_{t\partial F}(\bar x^{k})\\ \tilde y^{k+1} &=& R_{t\partial \tilde G}(\bar{ \tilde y}^{k})\\ \begin{bmatrix} \bar x^{k+1}\\ \bar {\tilde y}^{k+1} \end{bmatrix} & = & \begin{bmatrix} I & t\tilde K^{T}\\ -t\tilde K & I \end{bmatrix}^{-1} \begin{bmatrix} 2x^{k+1}-\bar x^{k}\\ 2\tilde y^{k+1}-\bar {\tilde y}^{k} \end{bmatrix} + \begin{bmatrix} \bar x^{k}- x^{k+1}\\ \bar {\tilde y}^{k}-\tilde y^{k+1} \end{bmatrix}. \end{array}$

Switching back to variables without a tilde, we get, using ${R_{tI_{\{0\}}}(x) = 0}$,

$\displaystyle \begin{array}{rcl} x^{k+1} &=& R_{t\partial F}(\bar x^{k})\\ y^{k+1} &=& R_{t\partial \tilde G}(\bar{ y}^{k})\\ x_{p}^{k+1} &=& 0\\ \begin{bmatrix} \bar x^{k+1}\\ \bar {y}^{k+1}\\ \bar x_{p}^{k+1} \end{bmatrix} & = & \begin{bmatrix} I & tK^{T} & tH^{T}\\ -t K & I & 0\\ -t H & 0 & I \end{bmatrix}^{-1} \begin{bmatrix} 2x^{k+1}-\bar x^{k}\\ 2 y^{k+1}-\bar {y}^{k}\\ 2x_{p}^{k+1}-\bar x_{p}^{k} \end{bmatrix} + \begin{bmatrix} \bar x^{k}- x^{k+1}\\ \bar {y}^{k}-y^{k+1}\\ \bar x_{p}^{k}-x_{p}^{k+1} \end{bmatrix}. \end{array}$

First not that ${x_{p}^{k+1}=0}$ throughout the iteration and from the last line of the linear system we get that

$\displaystyle \begin{array}{rcl} -tH\bar x^{k+1} + \bar x_{p}^{k+1} = -\bar x_{p}^{k} -tH(\bar x^{k}-x^{k+1}) + \bar x_{p}^{k} \end{array}$

which implies that

$\displaystyle \bar x_{p}^{k+1} = tH\bar x^{k+1}.$

Thus, both variables ${x_{p}^{k}}$ and ${\bar x_{p}^{k}}$ disappear in the iteration. Now we rewrite the remaining first two lines of the linear system as

$\displaystyle \begin{array}{rcl} \bar x^{k+1} + tK^{T}\bar y^{k+1} + t^{2}H^{T}H\bar x^{k+1} &=& x^{k+1} + tK^{T}(\bar y^{k}-y^{k+1}) + t^{2}H^{T}H\bar x^{k}\\ \bar y^{k+1}-tK\bar x^{k+1} &=& y^{k+1} + tK(x^{k+1}-\bar x^{k}). \end{array}$

Again denoting ${d^{k+1}=x^{k+1}+\bar x^{k+1}-\bar x^{k}}$, solving the second equation for ${\bar y^{k+1}}$ and plugging the result in the first gives

$\displaystyle (I+t^{2}H^{T}H)\bar x^{k+1} +tK^{T}(y^{k+1}+tKd^{k+1}) = x^{k+1}+tK(\bar y^{k}-y^{k+1}) + t^{2}H^{T}H\bar x^{k}.$

To eliminate ${\bar x^{k+1}}$ we add ${(I+t^{2}H^{T}H)(x^{k+1}-\bar x^{k})}$ on both sides and get

$\displaystyle (I+t^{2}(H^{T}H+K^{T}K))d^{k+1} = 2x^{k+1}-\bar x^{k} -tK(y^{k+1}+\bar y^{k+1}-\bar y^{k}) + t^{2}H^{T}Hx^{k+1}.$

In total we obtain the following iteration:

$\displaystyle \begin{array}{rcl} x^{k+1} &=& R_{t\partial F}(\bar x^{k})\\ y^{k+1} &=& R_{t\partial G}(\bar y^{k})\\ d^{k+1} &=& (I+t^{2}(H^{T}H + K^{T}K))^{-1}(2x^{k+1}-\bar x^{k} - tK(2y^{k+1}-\bar y^{k}) + t^{2}H^{T}Hx^{k+1})\\ \bar x^{k+1}&=& \bar x^{k}-x^{k+1}+d^{k+1}\\ \bar y^{k+1}&=& y^{k+1}+tKd^{k+1} \end{array}$

and note that only the third line changed.

Since the above works for any matrix ${H}$, we have a lot of freedom. Let us see, that it is even possible to avoid any inversion whatsoever: We would like to choose ${H}$ in a way that ${I+t^{2}(H^{T}H + K^{T}K) = \lambda I}$ for some positive ${\lambda}$. This is equivalent to

$\displaystyle H^{T}H = \tfrac{\lambda-1}{t^{2}}I - K^{T}K.$

As soon as the right hand side is positive definite, Cholesky decomposition shows that such an ${H}$ exists, and this happens if ${\lambda\geq 1+t^{2}\|K\|^{2}}$. Further note, that we do need ${H}$ in any way, but only ${H^{T}H}$, and we can perform the iteration without ever solving any linear system since the third row reads as

$\displaystyle d^{k+1} = \tfrac{1}{\lambda}\left(2x^{k+1}-\bar x^{k} - tK(2y^{k+1}-\bar y^{k}) + ((\lambda-1)I - t^{2}K^{T}K)x^{k+1})\right).$