Some time ago I picked up the phrase Ivanov regularization. Starting with an operator $A\colon X\to Y$ between two Banach spaces (say), one encounters the problem of instability of the solution of $Ax=y$ if $A$ has non-closed range. One dominant tool to regularize the solution is called **Tikhonov regularization** and consists of minimizing the functional $\|Ax-y\|^p + \alpha\|x\|^q$. The meaning behind these terms is as follows: The term $\|Ax-y\|$ is often called discrepancy and it should not be too large, to guarantee that the “solution” somehow explains the data. The term $\|x\|$ is often called regularization functional and shall not be too large, to have some meaningful notion of “solution”. The parameter $\alpha>0$ is called regularization parameter and allows weighting between the discrepancy and the regularization.

For the case of Hilbert spaces one typically chooses $p=q=2$, i.e. minimizes $\|Ax-y\|^2 + \alpha\|x\|^2$, and gets a functional for which the minimizer is given more or less explicitly as

$x_\alpha = (A^*A + \alpha I)^{-1}A^*y$.

The existence of this explicit solution seems to be one of the main reasons for the broad usage of Tikhonov regularization in the Hilbert space setting.
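To make this concrete, here is a small numerical sketch (not from the original post; the operator, signal, noise and parameter value are all made up for illustration) of the closed-form Tikhonov minimizer $(A^*A+\alpha I)^{-1}A^*y$ in a finite-dimensional Hilbert space, compared with a naive unregularized least-squares solve:

```python
import numpy as np

# Illustrative only: A is a discretized Gaussian smoothing operator
# (severely ill-conditioned), x_true a smooth signal, y noisy data.
rng = np.random.default_rng(0)
n = 50
A = np.exp(-0.1 * (np.arange(n)[:, None] - np.arange(n)[None, :]) ** 2)
x_true = np.sin(np.linspace(0, np.pi, n))
y = A @ x_true + 0.01 * rng.standard_normal(n)

alpha = 1e-3
# Closed-form Tikhonov minimizer: x_alpha = (A^T A + alpha I)^{-1} A^T y
x_alpha = np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y)

# Naive least squares, for comparison: the tiny singular values of A
# amplify the data noise drastically
x_naive = np.linalg.lstsq(A, y, rcond=None)[0]
print(np.linalg.norm(x_alpha - x_true), np.linalg.norm(x_naive - x_true))
```

For such a smoothing operator the naive reconstruction is dominated by amplified noise, while the Tikhonov solution stays reasonably close to the true signal.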

Another related approach is sometimes called the **residual method**; however, I would prefer the term **Morozov regularization**. Here one again balances the terms “discrepancy” and “regularization” but in a different way: One solves

$\min_x \|x\| \quad\text{such that}\quad \|Ax-y\|\leq\delta.$

That is, one tries to find an $x$ with minimal norm which explains the data up to an accuracy $\delta$. The idea is that $\delta$ reflects the so-called noise level, i.e. an estimate of the error which is made during the measurement of $y$. One advantage of Morozov regularization over Tikhonov regularization is that the meaning of the parameter $\delta$ is much clearer than the meaning of $\alpha$. However, there is no closed-form solution for Morozov regularization.
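Although there is no closed form, in the Hilbert space case one can combine the Tikhonov formula with a one-dimensional search: the discrepancy $\|Ax_\alpha - y\|$ is increasing in $\alpha$, so one may bisect for the $\alpha$ at which it equals $\delta$ (this is the idea behind Morozov's discrepancy principle). A minimal sketch with synthetic, purely illustrative data:

```python
import numpy as np

# Synthetic example: random invertible-ish A, noise of known level delta.
rng = np.random.default_rng(1)
n = 30
A = rng.standard_normal((n, n)) / np.sqrt(n)
x_true = rng.standard_normal(n)
noise = rng.standard_normal(n)
delta = 0.1
y = A @ x_true + delta * noise / np.linalg.norm(noise)  # noise part has norm delta

def tikhonov(alpha):
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y)

def discrepancy(alpha):
    return np.linalg.norm(A @ tikhonov(alpha) - y)

# Bisect on a log scale for the alpha with ||A x_alpha - y|| = delta.
lo, hi = 1e-12, 1e6
for _ in range(200):
    mid = np.sqrt(lo * hi)
    if discrepancy(mid) < delta:
        lo = mid
    else:
        hi = mid
alpha_star = np.sqrt(lo * hi)
x_morozov = tikhonov(alpha_star)
print(alpha_star, discrepancy(alpha_star))
```

The resulting `x_morozov` has the smallest norm among all $x$ with discrepancy at most $\delta$, i.e. it solves the Morozov problem for this data.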

**Ivanov regularization** is yet another method: solve

$\min_x \|Ax-y\| \quad\text{such that}\quad \|x\|\leq\tau.$

Here one could say that one wants to have the smallest discrepancy among all $x$ which are not too “rough”, i.e. whose norm does not exceed $\tau$.

Ivanov regularization in this form does not have too many appealing properties: The parameter $\tau$ does not seem to have a proper motivation and, moreover, there is again no closed-form solution.
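Still, the constrained problem is numerically harmless in simple settings: in the Hilbert space case one can run projected gradient descent, since projecting onto the ball $\|x\|\leq\tau$ is just a rescaling. A small sketch with synthetic data (all values are made up for illustration):

```python
import numpy as np

# Projected gradient for the Ivanov problem  min ||Ax - y||^2  s.t.  ||x|| <= tau.
rng = np.random.default_rng(2)
m, n = 40, 20
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)
tau = 0.5

def project_ball(x, tau):
    # Projection onto the Euclidean tau-ball: rescale if outside.
    nx = np.linalg.norm(x)
    return x if nx <= tau else (tau / nx) * x

step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1/L, L = Lipschitz constant of the gradient
x = np.zeros(n)
for _ in range(5000):
    x = project_ball(x - step * A.T @ (A @ x - y), tau)

print(np.linalg.norm(x), np.linalg.norm(A @ x - y))
```

At convergence the iterate is a fixed point of the projected gradient step, which is exactly the optimality condition for the constrained problem.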

However, recently the focus of variational regularization (as all these methods may be called) has shifted from using norms to the use of more general functionals. For example, one considers Tikhonov in an abstract form as minimizing

$S(Ax, y) + \alpha R(x)$

with a “general” similarity measure $S$ and a general regularization term $R$; see e.g. the dissertation of Christiane Pöschl (which can be found here, thanks Christiane) or the works of Jens Flemming. Prominent examples for the similarity measure $S$ are of course norms of differences, or the Kullback-Leibler divergence and the Itakura-Saito divergence, which are both treated in this paper. For the regularization term $R$ one uses norms and semi-norms in various spaces, e.g. Sobolev (semi-)norms, Besov (semi-)norms, the total variation seminorm or $\ell^p$ norms.

In all these cases, the advantage of Tikhonov regularization of having a closed-form solution is not there anymore. Then the most natural choice would be, in my opinion, Morozov regularization, because one may use the noise level $\delta$ directly as a parameter. However, from a practical point of view one should also care about the problem of calculating the minimizer of the respective problems. Here, I think that Ivanov regularization is important again: Often the similarity measure is somehow smooth but the regularization term is nonsmooth (e.g. for total variation regularization or sparse regularization with an $\ell^1$-penalty). Hence, both Tikhonov and Morozov regularization have a nonsmooth objective function. Somehow, Tikhonov regularization is still a bit easier, since the minimization is unconstrained. Morozov regularization has a constraint which is usually quite difficult to handle: e.g. it is usually difficult (is it perhaps even ill-posed?) to project onto the set defined by $S(Ax,y)\leq\delta$. Ivanov regularization has a smooth objective functional (at least if the similarity measure is smooth) and a constraint which is usually somehow simple, i.e. projections onto it are not too difficult to obtain.
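As an example of such a “simple” constraint: even the $\ell^1$-ball, the constraint set of the Ivanov version of sparse regularization, admits an exact projection via a single sort. The sketch below follows the well-known sort-and-threshold scheme (as in Duchi et al.'s simplex projection); the test vector is made up:

```python
import numpy as np

def project_l1_ball(v, tau):
    # Exact Euclidean projection onto { x : ||x||_1 <= tau }.
    if np.abs(v).sum() <= tau:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]          # sorted magnitudes, descending
    css = np.cumsum(u)
    # largest index rho with u_rho > (css_rho - tau) / rho  (1-based logic)
    rho = np.nonzero(u * np.arange(1, u.size + 1) > (css - tau))[0][-1]
    theta = (css[rho] - tau) / (rho + 1.0)
    # soft-threshold at level theta
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

v = np.array([3.0, -1.0, 0.5])
p = project_l1_ball(v, 2.0)
print(p, np.abs(p).sum())
```

The projection costs one sort, i.e. $O(n\log n)$, so projected (or conditional) gradient methods for the Ivanov formulation of $\ell^1$-constrained problems are quite practical.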

Now I found that all three methods – Tikhonov, Morozov and Ivanov regularization – are treated in the book “Theory of linear ill-posed problems and its applications” by V. K. Ivanov, V. V. Vasin and V. P. Tanana, in Sections 3.2, 3.3 and 3.4. Ivanov regularization goes under the name “method of quasi solutions” (Section 3.2) and Morozov regularization is called “method of the residual” (Section 3.4). Well, I think I should read these sections a bit closer now…

May 23, 2011 at 9:15 am

I think the \tau parameter in Ivanov regularization has a Bayesian motivation, the same as for Tikhonov. It would be the dispersion parameter of a thick-tailed distribution. It also arises naturally in practical problems; it could be in the more general form ||x - x_0|| <= \tau.

From the numerical point of view it's also more convenient than Morozov – just apply an interior point method for nonlinear optimization.

May 23, 2011 at 9:22 am

Well, I am not familiar with the term “dispersion parameter” and Bayesian estimation is not my strength. However: Do you have any pointers to literature?

Concerning more general functionals: There is this paper “Why least squares and maximum entropy?” by Imre Csiszár, where he motivates what kind of functionals may be used as regularizers.

May 23, 2011 at 10:54 am

I meant “scale parameter”, the analog of variance in a non-normal distribution.

For Tikhonov regularization the Bayesian meaning is straightforward – it’s a maximum a posteriori estimator with a Gaussian prior.

(google gives plenty of notes on “Bayesian interpretation of Tikhonov regularization”)

For Ivanov the Bayesian prior would be the uniform distribution on the \tau-ball.

The literature – I’d like to see some too. The general direction here would probably be “robust statistics” and “Bayesian estimators”, but I haven’t seen anything specific about Ivanov. I would be especially interested in it because of the connection with interior point methods.

PS: I’m not an expert in statistics either – I’m a numerical methods guy.

August 24, 2011 at 9:52 am

[…] he introduced the three forms of variational regularization which I usually call Tikhonov, Ivanov and Morozov regularization (on slide 34) and introduced the Pareto front (under the name it usually has in discrete ill-posed […]

April 4, 2012 at 12:10 pm

[…] Nadja Worliczek and myself. I have discussed some parts of this paper already on this blog here and here. In this paper we tried to formalize the notion of “a variational method” for […]

August 22, 2012 at 5:55 pm

[…] $\|x\|_1 \ \text{such that}\ \|y\|_1 \leq \alpha,\ x + Uy = z_0$ (which I would call the Ivanov version of the Tikhonov problem (1)). This allows for precise exploitation of prior knowledge by […]

December 31, 2014 at 12:27 am

What is the purpose of using a penalty parameter $C$ in SVM? In this Stack Exchange answer [1], I give a somewhat simpler way to understand how the SVM optimization is derived. However, there is another way of deriving SVM. This comes from the ideas developed over a decade or two on Empirical Risk Minimization […]