Coming back to regularization, especially Ivanov regularization. Recall that I used the term Ivanov regularization for the minimization problem

$$\min_x\ S(Ax,y^\delta)\quad\text{s.t.}\quad R(x)\leq\tau.\qquad (1)$$
I again stumbled upon some reference: It seems that in the case that the constraint defines a *compact* set, this method is usually referred to as “method of quasi solutions”. More precisely, I found this in “Elements of the theory of inverse problems” by A.M. Denisov, Chapter 6. There he uses metric spaces and proves the following:

**Theorem 1** Let $X$, $Y$ be metric spaces with metrics $d_X$, $d_Y$, respectively, and let $A\colon X\to Y$ be continuous. Furthermore let $M\subset X$ be compact, let $y^\dagger$ be in the range of $A$ and assume that $x^\dagger$ is the unique solution of $Ax = y^\dagger$ which lies in $M$. Finally, for a $y^\delta$ with $d_Y(y^\delta,y^\dagger)\leq\delta$ define $X_\delta = \{x\in M\,:\, d_Y(Ax,y^\delta)\leq\delta\}$ and $\epsilon(\delta) = \sup_{x\in X_\delta} d_X(x,x^\dagger)$. Then it holds for $\delta\to 0$ that $\epsilon(\delta)\to 0$.

**Remark 1** Before we prove this theorem, we relate it to what I called Ivanov regularization above: The set $M$ is encoded in (1) as $M = \{x\,:\,R(x)\leq\tau\}$ and the "discrepancy measure" $S$ is simply the metric $d_Y$. Hence, let $x_\delta$ denote a solution of

$$\min_{x\in M}\ d_Y(Ax,y^\delta).$$

Because $x^\dagger$ is feasible for this problem, it follows from $d_Y(y^\delta,y^\dagger)\leq\delta$ that $d_Y(Ax_\delta,y^\delta)\leq d_Y(Ax^\dagger,y^\delta) = d_Y(y^\dagger,y^\delta)\leq\delta$. Hence, $x_\delta\in X_\delta$. In other words: Ivanov regularization produces one element in the set $X_\delta$. Now, the theorem says that every element of $X_\delta$ is a good approximation of $x^\dagger$ (at least asymptotically).
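To make this a bit more concrete, here is a minimal numerical sketch (a toy example of my own, not taken from any of the references) of Ivanov regularization in the Euclidean setting, i.e. $\min_x \|Ax-y^\delta\|_2$ subject to $\|x\|_2\leq\tau$, solved by projected gradient descent; matrix, noise level and step size are purely illustrative assumptions.

```python
import numpy as np

# Toy Ivanov regularization:  min_x ||A x - y_delta||_2  s.t.  ||x||_2 <= tau,
# solved by projected gradient descent on the smooth objective 0.5*||A x - y||^2.

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
x_true = np.array([0.5, -0.3, 0.2, 0.0, 0.1])
tau = np.linalg.norm(x_true)            # assume the constraint is scaled "right"
delta = 0.01
y_delta = A @ x_true + delta * rng.standard_normal(20)

def project_ball(x, tau):
    """Euclidean projection onto the ball ||x||_2 <= tau."""
    n = np.linalg.norm(x)
    return x if n <= tau else x * (tau / n)

x = np.zeros(5)
step = 1.0 / np.linalg.norm(A, 2) ** 2  # step size 1/L with L = ||A||^2
for _ in range(5000):
    x = project_ball(x - step * A.T @ (A @ x - y_delta), tau)

print(np.linalg.norm(A @ x - y_delta))  # small residual, so x lies in a set like X_delta
print(np.linalg.norm(x - x_true))       # x is close to the true solution
```

The projection step keeps every iterate feasible, so the output is one particular element of the set $X_\delta$ from the theorem.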

*Proof:* We take a sequence $\delta_n\to 0$ and assume to the contrary that there exists an $\epsilon>0$ such that for every $n$ there exists $x_{\delta_n}\in X_{\delta_n}$ such that $d_X(x_{\delta_n},x^\dagger)\geq\epsilon$. Since all $x_{\delta_n}$ lie in $M$, which is compact, there is a convergent subsequence (which we do not relabel) with limit $\bar x\in M$. We obtain $d_X(\bar x,x^\dagger)\geq\epsilon$. However, this contradicts the assumption that $x^\dagger$ is the unique solution of $Ax=y^\dagger$ in $M$: by continuity of $A$,

$$d_Y(A\bar x,Ax^\dagger) = d_Y(A\bar x,y^\dagger) = \lim_{n\rightarrow \infty} d_Y(Ax_{\delta_n},y^\dagger) \leq \lim_{n\rightarrow\infty}\big(d_Y(Ax_{\delta_n},y^{\delta_n}) + d_Y(y^{\delta_n},y^\dagger)\big) \leq \lim_{n\rightarrow\infty}2\delta_n = 0,$$

so $A\bar x = Ax^\dagger = y^\dagger$ and, by uniqueness, $\bar x = x^\dagger$, a contradiction. $\Box$

Coming back to the interpretation of Theorem 1 and Ivanov regularization: Instead of Ivanov regularization, one could also use the following feasibility problem: Find an $x$ such that both $x\in M$ and $d_Y(Ax,y^\delta)\leq\delta$. For the case of normed vector spaces $X$ and $Y$, a linear $A$ and a convex set $M$, this would be a convex feasibility problem which one may attack by available methods.
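One such available method is alternating projections (POCS). The following is a hedged sketch (again my own toy setup, not from the text): to keep both projections explicit I assume $A$ has orthonormal rows ($AA^T=I$), in which case the projection onto $\{x : \|Ax-y^\delta\|_2\leq\delta\}$ has a closed form.

```python
import numpy as np

# Convex feasibility: find x with ||x||_2 <= tau and ||A x - y_delta||_2 <= delta,
# via alternating projections. Assumption: A has orthonormal rows (A A^T = I).

rng = np.random.default_rng(1)
A = np.linalg.qr(rng.standard_normal((8, 3)))[0].T   # 3x8 with orthonormal rows
x_true = rng.standard_normal(8)
tau = 1.1 * np.linalg.norm(x_true)                   # ball large enough for x_true
delta = 0.05
noise = rng.standard_normal(3)
y_delta = A @ x_true + 0.5 * delta * noise / np.linalg.norm(noise)

def proj_ball(x, tau):
    """Projection onto {x : ||x||_2 <= tau}."""
    n = np.linalg.norm(x)
    return x if n <= tau else x * (tau / n)

def proj_data(x, A, y, delta):
    """Projection onto {x : ||A x - y||_2 <= delta}; exact because A A^T = I."""
    r = A @ x - y
    nr = np.linalg.norm(r)
    return x if nr <= delta else x - A.T @ r * (1 - delta / nr)

x = np.zeros(8)
for _ in range(2000):
    x = proj_ball(proj_data(x, A, y_delta, delta), tau)

print(np.linalg.norm(x), np.linalg.norm(A @ x - y_delta))  # both constraints hold
```

Since $x^\dagger$ lies in the interior of both sets here, the intersection is non-empty and the alternating projections converge to some feasible point, i.e. to an element of $X_\delta$.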

A further important remark is that we did not assume any linearity of $A$ (of course: we did not even assume a linear structure on $X$ or $Y$). Hence, the theorem seems very powerful: There is no regularization parameter involved and one still gets convergence to the true solution! However, one of the assumptions in the theorem is somewhat strong: the uniqueness of $x^\dagger$. To illustrate this we consider a special case:

**Example 1** Let $X$ and $Y$ be real (or complex) vector spaces and let $A\colon X\to Y$ be linear with non-trivial null space. Furthermore, assume that $M\subset X$ is convex and compact, and consider scaled versions $M_\tau = \tau M$ for $\tau>0$. Then the set of solutions of $Ax = y^\dagger$ is an affine subspace of $X$ and there are three cases for the intersection of this set and $M_\tau$:

- The intersection is empty.
- The intersection is a convex set and contains infinitely many elements.
- The intersection contains exactly one element.

The last case occurs precisely when the affine space of solutions is tangential to $M_\tau$. Loosely speaking, one may say that this case only occurs if the set $M$ is scaled to precisely the right size such that $M_\tau$ only touches the affine space of solutions.
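An entirely made-up two-dimensional instance of these three cases: take $Ax = x_1+x_2$ (so the null space is $\mathrm{span}\{(1,-1)\}$), $y^\dagger = 2$, and let $M_\tau$ be $\tau$ times the Euclidean unit ball. The solution set is the line $x_1+x_2=2$ at distance $\sqrt{2}$ from the origin, and the case is decided by comparing $\tau$ to that distance.

```python
import numpy as np

# Toy example: A x = x_1 + x_2 on R^2 (non-trivial null space), y_dagger = 2.
# Solutions form the line x_1 + x_2 = 2, which has distance 2/sqrt(2) = sqrt(2)
# from the origin. M_tau = tau * (Euclidean unit ball).

dist = 2 / np.sqrt(2)  # distance from the origin to the solution line

def intersection_case(tau, tol=1e-12):
    if tau < dist - tol:
        return "empty"
    if tau > dist + tol:
        return "infinitely many solutions"
    return "exactly one solution"  # the ball is tangent to the line, at (1, 1)

print(intersection_case(1.0))          # -> empty
print(intersection_case(np.sqrt(2)))   # -> exactly one solution
print(intersection_case(2.0))          # -> infinitely many solutions
```

Only for the exactly tuned radius $\tau=\sqrt{2}$ does the uniqueness assumption of Theorem 1 hold in this toy example.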

Another strong assumption in Theorem 1 is that the set $M$ is compact. First, there is a way to relax this condition somewhat. Basically, we only need compactness to obtain the convergent subsequence. Hence, one could try to work with a weaker topology on $X$ (which would result in a weaker notion of compactness) and then obtain a limit of a subsequence which converges in the weaker sense only. Then one would need some tool to deduce that the weak limit is indeed a solution. This strategy works, for example, in Banach spaces:

**Example 2** Let $X$ and $Y$ be reflexive Banach spaces and let $A\colon X\to Y$ be linear, bounded and one-to-one. We use a norm ball $M = \{x\in X\,:\,\|x\|_X\leq R\}$ as prior knowledge on the solution of $Ax = y^\dagger$. Moreover, we use the metrics induced by the norms of $X$ and $Y$, respectively: $d_X(x,x') = \|x-x'\|_X$ and $d_Y(y,y') = \|y-y'\|_Y$.

Obviously, $M$ is not compact (if $X$ is infinite dimensional) but it is weakly compact (and by the Eberlein–Šmulian theorem also weakly sequentially compact). In the situation of the proof of Theorem 1 we only get a weakly converging subsequence $x_{\delta_n}\rightharpoonup\bar x$. However, a bounded linear operator is also weak-to-weak continuous, and hence $Ax_{\delta_n}\rightharpoonup A\bar x$. While we only have a weakly converging sequence, we can still obtain the contradiction at the end of the proof of Theorem 1 since the norm is weakly lower semicontinuous.

Another way to justify the assumption that the solution is in a known compact set is that in practice we always use a representation of the solution which only uses a finite number of degrees of freedom (think of a Galerkin ansatz, for example). However, this interpretation somehow neglects that we are interested in finding the *true* solution of the *true infinite dimensional problem* and that the discretization of the problem should be treated as a separate issue. Just building on the regularizing effect of discretization will almost surely result in a method whose stability properties depend on the resolution of the discretization.

Finally: Another good reference for these somewhat ancient results in regularization theory is one of the first books on the topic: "Solutions of Ill-posed Problems" by Tikhonov and Arsenin (1977). While it took me some time to get used to the style of presentation, I have to admit that it is really worth reading this book (and other translations of Russian mathematical literature).