A remark before you read this post: it is on a very specialized topic and only presents a theoretical insight which seems to be of no practical value whatsoever. Continue at your own risk of wasting your time.

Morozov’s discrepancy principle is a means to choose the regularization parameter when regularizing ill-posed inverse problems. To fix the setting, we consider two Hilbert spaces $X$ and $Y$ and a bounded linear operator $A\colon X\to Y$. We assume that the range of $A$ is a non-closed subspace of $Y$. As a consequence of the Bounded Inverse Theorem, the pseudo-inverse $A^\dagger$ (defined on $\operatorname{rg}(A)\oplus\operatorname{rg}(A)^\perp$) is unbounded.

This is the classical setting for linear inverse problems: we have a process, modeled by $A$, such that the quantity $x^\dagger$ we are interested in gives rise to an output $y^\dagger = Ax^\dagger$. We are able to measure $y^\dagger$ but, as is always the case, there is some noise introduced in the measurement process and hence we have access to a measured quantity $y^\delta$ which is a perturbation of $y^\dagger$. Our goal is to approximate $x^\dagger$ from the knowledge of $y^\delta$. Note that simply taking $A^\dagger y^\delta$ is not an option, since firstly $y^\delta$ is not guaranteed to lie in the domain of the pseudo-inverse and, even more severely and of more practical relevance, the unboundedness of $A^\dagger$ will lead to a severe (in theory, infinitely large) amplification of the noise, rendering the reconstruction useless.
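To see this noise amplification numerically, here is a small sketch with a made-up, severely ill-conditioned toy operator standing in for an operator with non-closed range; everything in it (sizes, decay rate, noise level) is an arbitrary choice for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "forward operator": orthogonal bases with rapidly decaying singular
# values, mimicking a compact operator whose pseudo-inverse is unbounded.
n = 40
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = 10.0 ** (-0.25 * np.arange(n))      # singular values from 1 down to ~1e-10
A = U @ np.diag(s) @ V.T

x_true = V[:, 0]                        # some ground truth x^dagger
y_clean = A @ x_true                    # exact data y^dagger

noise = 1e-6 * rng.standard_normal(n)   # tiny measurement noise
y_noisy = y_clean + noise               # measured data y^delta

x_naive = np.linalg.pinv(A) @ y_noisy   # naive reconstruction A^+ y^delta

# The tiny singular values amplify the noise enormously: the reconstruction
# error is many orders of magnitude larger than the noise itself.
print("noise level:         ", np.linalg.norm(noise))
print("reconstruction error:", np.linalg.norm(x_naive - x_true))
```

The components of the noise along singular vectors with tiny singular values get divided by those singular values, which is exactly the unboundedness of $A^\dagger$ in finite-dimensional disguise.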

The key idea is to approximate the pseudo-inverse $A^\dagger$ by a family of bounded operators $R_\alpha$ with the hope that one may have

$$R_\alpha y^\delta \to A^\dagger y^\dagger \quad\text{for}\quad \alpha\to 0. \qquad (1)$$

(Note that the convention $\alpha\to 0$ is just that – a convention. It says that we assume that $R_\alpha$ is a closer approximation to $A^\dagger$ the closer $\alpha$ is to zero.) Now, we have two tasks:

- Construct a good family of *regularizing operators* $R_\alpha$, and
- devise a *parameter choice*, i.e. a way to choose $\alpha$.

The famous Bakushinskii veto says that there is no parameter choice that can guarantee convergence in (1) in the worst case and only uses the given data $y^\delta$. The situation changes if one introduces knowledge about the *noise level* $\delta$, i.e. a bound $\|y^\dagger - y^\delta\|\le\delta$. (There is an ongoing debate whether it is reasonable to assume that the noise level is known – my experience when working with engineers is that they are usually very good at quantifying the noise present in their systems and hence, in my view, the assumption that the noise level is known is ok.)

One popular way to choose the regularization parameter $\alpha$ in dependence on $y^\delta$ and $\delta$ is *Morozov’s discrepancy principle*:

Definition 1. Morozov’s discrepancy principle states that one shall choose the regularization parameter $\alpha$ such that

$$\|A R_\alpha y^\delta - y^\delta\| = \tau\delta$$

holds for some fixed $\tau > 1$.

In other words: you shall choose $\alpha$ such that the reconstruction $R_\alpha y^\delta$ produces a discrepancy $\|A R_\alpha y^\delta - y^\delta\|$ which is of the order of, and slightly larger than, the noise level.
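To make this concrete, here is a small numerical sketch (the toy operator and all parameters are made up for illustration) that realizes the discrepancy principle for Tikhonov regularization by bisecting over $\alpha$, using the fact that the Tikhonov discrepancy is monotonically increasing in $\alpha$:

```python
import numpy as np

def tikhonov(A, y, alpha):
    """Tikhonov reconstruction R_alpha y = (A^T A + alpha I)^{-1} A^T y."""
    return np.linalg.solve(A.T @ A + alpha * np.eye(A.shape[1]), A.T @ y)

def discrepancy_principle(A, y, delta, tau=1.5, lo=1e-16, hi=1e3, iters=60):
    """Find alpha with ||A R_alpha y - y|| = tau * delta by bisection.

    Works because the Tikhonov discrepancy increases monotonically in alpha."""
    def disc(alpha):
        return np.linalg.norm(A @ tikhonov(A, y, alpha) - y)
    for _ in range(iters):
        mid = np.sqrt(lo * hi)          # geometric midpoint of [lo, hi]
        if disc(mid) < tau * delta:
            lo = mid                    # discrepancy too small: regularize more
        else:
            hi = mid
    return np.sqrt(lo * hi)

# made-up ill-conditioned toy problem
rng = np.random.default_rng(1)
n = 40
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = U @ np.diag(10.0 ** (-0.15 * np.arange(n))) @ V.T

x_true = V[:, 0] + 0.5 * V[:, 1]
noise = 1e-6 * rng.standard_normal(n)
y = A @ x_true + noise
delta = np.linalg.norm(noise)           # noise level assumed known

alpha = discrepancy_principle(A, y, delta, tau=1.5)
x_rec = tikhonov(A, y, alpha)
print("discrepancy:", np.linalg.norm(A @ x_rec - y), " tau*delta:", 1.5 * delta)
print("error:      ", np.linalg.norm(x_rec - x_true))
```

The bisection is done on a logarithmic grid since reasonable values of $\alpha$ span many orders of magnitude; the resulting reconstruction error is vastly smaller than what the naive pseudo-inverse would produce.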

Some years ago I wrote a paper about the use of Morozov’s discrepancy principle when using the augmented Lagrangian method (aka Bregman iteration) as an iterative regularization method (where one can view the inverse of the iteration counter as a regularization parameter). The paper is Morozov’s principle for the augmented Lagrangian method applied to linear inverse problems (written with coauthors; the arxiv link is here). In that paper we derived an estimate for the (squared) error of the iterate $x_k$ and the true solution $x^\dagger$ that behaves like

$$\|x_k - x^\dagger\|^2 \le C(\tau)\,\delta$$

for some constant $C(\tau)$ depending on the $\tau$ from Morozov’s discrepancy principle. The somewhat complicated dependence on $\tau$ was a bit puzzling to me. One can optimize over $\tau$ such that the error estimate is optimal. It turns out that $C(\tau)$ attains its minimal value for a value of $\tau$ notably larger than $1$. I blamed the already quite complicated analysis of the augmented Lagrangian method for these obviously much too complicated values (and in practice, using $\tau$ much closer to $1$ usually leads to much better results).

This term I teach a course on inverse problems and also covered Morozov’s discrepancy principle, but this time for much simpler regularization methods, namely for *linear methods* such as, for example, Tikhonov regularization, i.e.

$$R_\alpha y^\delta = (A^*A + \alpha I)^{-1}A^* y^\delta$$

(but other linear methods exist). There I arrived at an estimate for the (squared) error of $x_\alpha = R_\alpha y^\delta$ and $x^\dagger$ of the form

$$\|x_\alpha - x^\dagger\|^2 \le C(\tau)\,\delta$$

with a constant $C(\tau)$ depending on $\tau$. Surprisingly, the dependence on $\tau$ for this basic regularization method is also not very simple. Optimizing the estimate over $\tau$ again leads to an optimal value of $\tau$ notably larger than $1$ (and the respective minimal value of the constant is not particularly small either). And again, using $\tau$ closer to $1$ usually leads to much better results.
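As a quick sanity check (with made-up data), the following sketch verifies that the normal-equations form of Tikhonov regularization above coincides with its standard SVD filter representation, and that it indeed minimizes the Tikhonov functional $\|Ax - y^\delta\|^2 + \alpha\|x\|^2$:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((30, 20))   # made-up overdetermined problem
y = rng.standard_normal(30)
alpha = 0.1

# Tikhonov via the normal equations: (A^T A + alpha I) x = A^T y
x_ne = np.linalg.solve(A.T @ A + alpha * np.eye(20), A.T @ y)

# The same reconstruction via the SVD "filter" representation
# x_alpha = sum_i  s_i / (s_i^2 + alpha) * <y, u_i> * v_i
U, s, Vt = np.linalg.svd(A, full_matrices=False)
x_svd = Vt.T @ ((s / (s**2 + alpha)) * (U.T @ y))

# The gradient of x -> ||A x - y||^2 + alpha ||x||^2 vanishes at x_ne,
# i.e. x_ne minimizes the Tikhonov functional.
grad = 2 * (A.T @ (A @ x_ne - y) + alpha * x_ne)

print(np.allclose(x_ne, x_svd))     # the two representations agree
print(np.linalg.norm(grad))         # (numerically) zero gradient
```

The filter factors $s_i/(s_i^2+\alpha)$ make the regularizing effect visible: small singular values are damped instead of inverted.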

Well, these observations are not of great importance… I just found it curious to observe that the analysis of Morozov’s discrepancy principle seems to be inherently a bit more complicated than I thought…