Today there are several things I could blog on. The first is the planary by Rich Baraniuk on Compressed Sensing. However, I don’t think that I could reflect the content in a way which would be helpful for a potential reader. Just for the record: If you have the chance to visit one of Rich’s talk: Do it!

The second thing is the talk by Bernd Hofmann on source conditions, smoothness and variational inequalities and their use in regularization of inverse problems. However, this would be too technical for now and I just did not take enough notes to write a meaningful post.

As a third thing I have the talk by Christian Clason on inverse problems with uniformly distributed noise. He argued that for uniform noise it is much better to use an ${L^\infty}$ discrepancy term instead of the usual ${L^2}$-one. He presented a path-following semismooth Newton method to solve the problem $\displaystyle \min_x \frac{1}{p}\|Kx-y^\delta\|_\infty^p + \frac{\alpha}{2}\|x\|_2^2$

and showed examples with different kinds of noise. Indeed the examples showed that ${L^\infty}$ works much better than ${L^2}$ here. But in fact it works even better, if the noise is not uniformly distributed but “impulsive” i.e. it attains bounds ${\pm\delta}$ almost everywhere. It seems to me that uniform noise would need a slightly different penalty but I don’t know which one – probably you do? Moreover, Christian presented the balancing principle to choose the regularization parameter (without knowledge about the noise level) and this was the first time I really got what it’s about. What one does here is, to choose ${\alpha}$ such that (for some ${\sigma>0}$ which only depends on ${K}$, but not on the noise) $\displaystyle \sigma\|Kx_\alpha^\delta-y^\delta\|_\infty = \frac{\alpha}{2}\|x_\alpha^\delta\|_2^2.$

The rational behind this is, that the left hand side is monotonically non-decreasing in ${\alpha}$, while the right hand side is monotonically non-increasing. Hence, there should be some ${\alpha}$ “in the middle” which make both somewhat equally large. Of course, we do neither want to “over-regularize” (which would usually “smooth too much”) nor to “under-regularize” (which would not eliminate noise). Hence, balancing seems to be a valid choice. From a practical point of view the balancing is also nice because one can use the fixed-point iteration $\displaystyle \alpha^{n+1} = 2\sigma\frac{\|Kx_{\alpha^n}^\delta - y^\delta\|_\infty}{\|x_{\alpha_n}^\delta\|_2^2}$

which converges in a few number of iterations.

Then there was the talk by Esther Klann, but unfortunately, I was late so only heard the last half…

Last but not least we have the talk by Christiane Pöschl. If you are interested in Total-Variation-Denoising (TV denoising), then you probably have heard many times that “TV denoising preserves edges” (have a look at the Wikipedia page – it claims this twice). What Christiane showed (in a work with Vicent Caselles and M. Novaga) that this claim is not true in general but only for very special cases. In case of characteristic functions, the only functions for which the TV minimizer has sharp edges are these so-called calibrated sets, introduced by Caselles et el. Building on earlier works by Caselles and co-workers she calculated exact minimizers for TV denoising in the case that the image consists of characteristic functions of two convex sets or of a single star shaped domain, that is, for a given set $B$ she calculated the solution of $\displaystyle \min_u\int (u - \chi_B)^2dx + \lambda \int|Du|.$

This is not is as easy as it may sound. Even for the minimizer for a single convex set one has to make some effort. She presented a nice connection of the shape of the obtained level-sets with the morphological operators of closing and opening. With the help of this link she derived a methodology to obtain the exact TV denoising minimizer for all parameters. I do not have the images right now but be assured that most of the time, the minimizers do not have sharp edges all over the place. Even for simple geometries (like two rectangles touching in a corner) strange things happen and only very few sharp edges appear. I’ll keep you posted in case the paper comes out (or appears as a preprint).

Christiane has some nice images which make this much more clear: For two circles edges are preserved if they are far enough away from each other. If they are close, the area “in between” them is filled and, moreover, obey this fuzzy boundary. I remember myself seeing effects like this in the output of TV-solvers and thinking “well, it seems that the algorithm is either not good or not converged yet – TV should output sharp edges!”. For a star-shaped shape (well, actually a star) the output looks like this. The corners are not only rounded but also blurred and this is true both for the “outer” corners and the “inner” corners.

So, if you have any TV-minimizing code, go ahead and check if your code actually does the right things on images like this!
Moreover, I would love to see similar results for more complicated extensions of TV like Total Generalized Variation, I treated here.

Some time ago I picked up the phrase Ivanov regularization. Starting with an operator $A:X\to Y$ between to Banach spaces (say) one encounters the problem of instability of the solution of $Ax=y$ if $A$ has non-closed range. One dominant tool to regularize the solution is called Tikhonov regularization and consists of minimizing the functional $\|Ax - y^\delta\|_Y^p + \alpha \|x\|_Y^q$. The meaning behind these terms is as follows: The term $\|Ax -y^\delta \|_Y^p$ is often called discrepancy and it should be not too large to guarantee, that the “solution” somehow explains the data. The term $\|x\|_Y^q$ is often called regularization functional and shall not be too large to have some meaningful notion of “solution”. The parameter $\alpha>0$ is called regularization parameter and allows weighting between the discrepancy and regularization.

For the case of Hilbert space one typically chooses $p=q=2$ and gets a functional for which the minimizer is given more or less explicitly as $x_\alpha = (A^*A + \alpha I)^{-1} A^* y^\delta$.

The existence of this explicit solution seems to be one of the main reasons for the broad usage of Tikhonov regularization in the Hilbert space setting.

Another related approach is sometimes called residual method, however, I would prefer the term Morozov regularization. Here one again balances the terms “discrepancy” and “regularization” but in a different way: One solves $\min \|x\|_X\ \text{s.t.}\ \|Ax-y^\delta\|_Y\leq \delta.$

That is, one tries to find an $x$ with minimal norm which explains the data $y^\delta$ up to an accuracy $\delta$. The idea is, that $\delta$ reflects the so called noise level, i.e. an estimate of the error which is made during the measurment of $y$. One advantage of Morozov regularization over Tikhonov regularization is that the meaning of the parameter $\delta>0$ is much clearer that the meaning of $\alpha>0$. However, there is no closed form solution for Morozov regularization.

Ivanov regularization is yet another method: solve $\min \|Ax-y^\delta\|_Y\ \text{s.t.}\ \|x\|_X \leq \tau.$

Here one could say, that one wants to have the smallest discrepancy among all $x$ which are not too “rough”.

Ivanov regularization in this form does not have too many appealing properties: The parameter $\tau>0$ does not seem to have a proper motivation and moreover, there is again no closed form solution.

However, recently the focus of variational regularization (as all these method may be called) has shifted from using norms to the use of more general functionals. For example one considers Tikhonov in an abstract form as minimizing $S(Ax,y^\delta) + \alpha R(x)$

with a “general” similarity measure $S$ and a general regularization term $R$, see e.g. the dissertation of Christiane Pöschl (which can be found here, thanks Christiane) or the works of Jens Flemming. Prominent examples for the similarity measure are of course norms of differences or the Kullback-Leibler divergence or the Itakura-Saito divergence which are both treated in this paper. For the regularization term one uses norms and semi-norms in various spaces, e.g. Sobolev (semi-)norms, Besov (semi-)norms, the total variation seminorm or $\ell^p$ norms.

In all these cases, the advantage of Tikhonov regularization of having a closed form solution is not there anymore. Then, the most natural choice would be, in my opinion, Morozov regularization, because one may use the noise level directly as a parameter. However, from a practical point of view one also should care about the problem of calculating the minimizer of the respective problems. Here, I think that Ivanov regularization is important again: Often the similarity measure $S$ is somehow smooth but the regularization term $R$ is nonsmooth (e.g. for total variation regularization or sparse regularization with $\ell^p$-penalty). Hence, both Tikhononv and Morozov regularization have a nonsmooth objective function. Somehow, Tikhonov regularization is still a bit easier, since the minimization is unconstrained. Morozov regularization has a constraint which is usually quite difficult to handle. E.g. it is usually difficult (is it probably even ill posed?) to project onto the set defined by $S(Ax,y^\delta)\leq \delta$. Ivanov regularization has a smooth objective functional (at least if the similarity measure is smooth) and a constraint which is usually somehow simple (i.e. projections are not too difficult to obtain).

Now, I found, that all thee methods, Tikhonov, Morozov and Ivanov regularizazion are all treated in the book “Theory of linear ill-posed problems and its applications” by V. K. Ivanov,V. V. Vasin and Vitaliĭ Pavlovich Tanana in section 3.2, 3.3 and 3.4 respectively. Ivanov regularization goes under the name “method of quasi solutions” (section 3.2) and Morozov regularization is called “Method of residual”(section 3.4). Well, I think I should read these sections a bit closer now…