In sparse recovery one aims to leverage the sparsity of some vector {u} to get a good reconstruction of this vector from a noisy and indirect measurement {u^{0}}. Sparsity can be expressed in two different ways: The analysis way and the synthesis way. For both of them you need a system of other vectors {v_{k}}.

  • The analysis way: Here you consider the inner products {\langle u,v_{k}\rangle} and the sparsity assumption is, that this sequence is sparse.
  • The synthesis way: Here you assume that there is sparse sequence of coefficients {\psi_{k}} such that {u=\sum_{k} \psi_{k}v_{k}}.

The first approach is called “analysis” since here one analyzes the vector {u} with the system of vectors. In the second approach, you use the vectors {v_{k}} to synthesize the vector {u}.

We will keep things very simple here and only consider a noisy observation (and not an indirect one). The recovery approach in the analysis way is then to find {u} as a solution of

\displaystyle \tfrac12\|u-u_{0}\| + \alpha R(\langle u,v_{k}\rangle)

while for the synthesis way you write {u = \sum_{k} \psi_{k}v_{k} = V\psi} (where we collected the vectors {v_{k}} as the columns of the matrix {V}) and solve

\displaystyle \tfrac12\|V\psi-u_{0}\| + \alpha R(\psi)

The functional {R} should enforce the sparsity of the vector of coefficients in the former case and the sparsity of {\psi} in the latter case. With the matrix {V} the analysis way reads as

\displaystyle \tfrac12\|u-u_{0}\| + \alpha R(V^{T}u)

One important observation is, that both approaches coincide if the matrix {V} is orthonormal: Since {V^{-1} = V^{T}} in this case, {u=V\psi} is equivalent to {\psi = V^{T}u}, i.e. {\psi_{k} = \langle u,v_{k}\rangle}. For other sets of vectors {v_{k}}, this is not true and it is not so clear, how the analysis and the synthesis approach are related. In this post I’d like to show one instance where both approaches do lead to results that are not even close to being similar.

1. The “ROF analysis” approach to denoising

If {u^{0}} is an image, the famous Rudin-Osher-Fatemi denoising model fits under the “analysis” framework: The vectors {v_{k}} are all the finite differences of neighboring pixels such that {V^{T}u = \nabla u} with the discrete gradient {\nabla u}. As sparsifying functional you take a mixed {2}{1} norm, i.e. {R(\nabla u) = \|\nabla u\|_{2,1} = \sum_{ij}\sqrt{(\partial_{1}u_{ij})^{2} + (\partial_{2}u_{ij})^{2}}}. The denoised {u} is the solution to

\displaystyle \min_{u} \tfrac12\|u-u^{0}\|^{2} + \alpha\|\nabla u\|_{2,1}

Thus, the model will lead to some denoised {u} with a sparse gradient. This sparse gradient is partly the reason for the success of the model in that it keeps edges, but is also responsible for the drawback of this approach: The denoised image will be mostly piesewise constant.

2. The “ROF synthesis” approach

As {V^{T} = \nabla}, the corresponding synthesis approach would be to solve

\displaystyle \min_{\psi}\tfrac12\|\nabla^{T}\psi - u^{0}\|^{2} + \alpha \|\psi\|_{2,1}

and the set {u = \nabla^{T}\psi}. In this approach, the denoised image will be a sparse linear combination of particular two-pixel images. If you think about that, it does not really sound like a good idea to approximate an image by a sparse linear combination of two-pixel images. (One technicality: The operator {\nabla^{T}} is not onto – {\nabla} has a non-trivial kernel, consisting of the constant images and hence, all images in the range of {\nabla^{T}} have zero mean. Thus, to get something remotely close to {u^{0}} one should subtract the mean of {u^{0}} before the reconstruction and add it back afterwards.)

Here are some results:



Note that the results actually fit to what we would have predicted: For larger {\alpha} we get a “sparser gradient” (less edges, less variation) for the analysis approach, and “sparser two-pixel combinations” for the synthesis approach. In fact, for the synthesis approach to give something that is remotely close to the image, one needs quite a lot two-pixel images, i.e. a very low sparsity.

In conclusion, one should not consider the analysis and synthesis approach to be something similar.

Incidentally, the dual of the analysis approach

\displaystyle \min_{u} \tfrac12\|u-u^{0}\|^{2} + \alpha\|\nabla u\|_{2,1}

can be written as

\displaystyle \min_{\|\phi\|_{2,\infty}\leq\alpha}\tfrac12\|\nabla^{T}\phi-u^{0}\|^{2} = \min_{\phi}\tfrac12\|\nabla^{T}\phi-u^{0}\|^{2} + I_{\|\cdot\|_{2,\infty}\leq\alpha}(\phi)

which looks related to the synthesis approach (one basically replaces the {2,1}-norm by its conjugate function). Actually, a similar relation between the dual of the analysis approach and the synthesis approach always holds.


Yesterday I uploaded the paper An extended Perona-Malik model based on probabilistic models by Lars Mescheder and myself to the arXiv. Recently I already blogged about this work, so I do not have to add that much. The main theme of the work was that if we have an image that is a blurred and noisy version of some true image, we formulate the reconstruction via Bayesian statistics. As a prior model for images we used a Gaussian scale mixture, i.e. we have a latent variable z (in every pixel x) and the joint prior for the image u and the latent variable z is

p(u,z) \propto \exp\Big(-\sum_x \frac{z(x)}2 |\nabla u(x)|^2 + v(z(x))\Big)

where x denotes the pixels, $\nabla$ is the discrete gradient of the image and v is some non-negative function defined on the non-negative reals. Besides algorithms to approximate the MAP estimate, Lars proposed a mean-field approximation which does not calculate the most probable image, but iteratively approximates the posterior distribution by distributions which factorize over the variables u and z. Using some further approximation (since the resulting algorithm in its plain form is not implementable) one arrives at an algorithm which some keeps more uncertainty in u and, in practice, gives a point estimate for the denoised image that seems more “representative”. Here is an example:

This is the blurred and noisy image (the blur is motions blur and implemented with periodic boundary conditions for simplicity):
The next image is the approximation of the MAP estimate we got and you see the usual drawbacks. Since the MAP neglects all uncertainty in u and maximizes the posterior probability, the image is “too sharp” in the way that smooth transitions (e.g. at the lighthouse) turn into piecewise constant stairs. Also the rocks appear very blocky.
Here is the result from the mean-field approximation. Since uncertainty in u is taken into account, the image does not suffer from staircasing and also has a more natural appeal, most prominently in the part with the rocks.

The paper has some more examples and also shows another relation to Mumford-Shah denoising (loosely speaking, one uses a discrete latent variable z to serve as a binary variable to say if a pixel is an edge or not). Oh, by the way, the algorithms behind the mean-field approximation use some parts of more general duality between random variables that Lars and his co-authors develop in another paper.