May 2011
Monthly Archive
May 31, 2011
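Coming back to regularization, especially Ivanov regularization. Recall that I used the term Ivanov regularization for the minimization problem

$$\min_x\ S(Ax,y^\delta)\quad\text{s.t.}\quad R(x)\leq\tau. \qquad (1)$$

I again stumbled upon some reference: It seems that in the case that the constraint $R(x)\leq\tau$ defines a compact set, this method is usually referred to as “method of quasi solutions”. More precisely, I found this in “Elements of the theory of inverse problems” by A.M. Denisov, Chapter 6. There he uses metric spaces and proves the following:
Theorem 1 Let $X$, $Y$ be metric spaces with metrics $d_X$, $d_Y$, respectively and $A:X\to Y$ continuous. Furthermore let $M\subset X$ be compact, $y^\dagger$ be in the range of $A$ and assume that $x^\dagger$ is the unique solution of $Ax=y^\dagger$ which lies in $M$. Finally for a $y^\delta$ with $d_Y(y^\delta,y^\dagger)\leq\delta$ define $X_\delta = \{x\in M\ :\ d_Y(Ax,y^\delta)\leq\delta\}$ and $\Delta_\delta = \sup_{x\in X_\delta}d_X(x,x^\dagger)$. Then it holds for $\delta\to 0$ that

$$\Delta_\delta\to 0.$$

Remark 1 Before we prove this theorem, we relate it to what I called Ivanov regularization above: The set $M$ is encoded in (1) as $M = \{x\ :\ R(x)\leq\tau\}$ and the “discrepancy measure” $S$ is simply the metric $d_Y$. Hence, let $x_M^\delta$ denote a solution of

$$\min_x\ d_Y(Ax,y^\delta)\quad\text{s.t.}\quad x\in M.$$

Because $x^\dagger$ is feasible for this problem it follows from $d_Y(Ax^\dagger,y^\delta) = d_Y(y^\dagger,y^\delta)\leq\delta$ that $d_Y(Ax_M^\delta,y^\delta)\leq\delta$. Hence, $x_M^\delta\in X_\delta$. In other words: Ivanov regularization produces one element in the set $X_\delta$. Now, the theorem says that every element in $X_\delta$ is a good approximation for $x^\dagger$ (at least asymptotically).
Proof: We take a sequence $\delta_n\to 0$ and assume to the contrary that there exists $\epsilon>0$ such that for every $n$ there exists $x_{\delta_n}\in X_{\delta_n}$ such that it holds that $d_X(x_{\delta_n},x^\dagger)\geq\epsilon$. Since all $x_{\delta_n}$ lie in $M$ which is compact, there is a convergent subsequence (again denoted by $(x_{\delta_n})$) with limit $\bar x\in M$. We obtain $d_X(\bar x,x^\dagger)\geq\epsilon$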
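. However, this contradicts the assumption:

$$\begin{aligned} d_Y(A\bar x,Ax^\dagger) &= d_Y(A\bar x,y^\dagger) = \lim_{n\rightarrow \infty} d_Y(Ax_{\delta_n},y^\dagger)\\ &\leq \lim_{n\rightarrow \infty}\bigl( d_Y(Ax_{\delta_n},y^{\delta_n}) + d_Y(y^{\delta_n},y^\dagger)\bigr) \leq \lim_{n\rightarrow\infty}2\delta_n =0. \end{aligned}$$

Hence $A\bar x = y^\dagger$, so $\bar x$ would be a solution different from $x^\dagger$, which contradicts the uniqueness.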
Coming back to the interpretation of Theorem 1 and Ivanov regularization: Instead of Ivanov regularization, one could also use the following feasibility problem: Find an
such that both
and
. For the case of vector spaces
and
and a convex set
, this would be a convex feasibility problem which one may attack by available methods.
A further important remark is that we did not assume any linearity on
(of course: we did not even assume a linear structure on
or
). Hence, the theorem seems very powerful: There is no regularization parameter involved and one still gets convergence to the true solution! However, one of the assumptions in the theorem is rather strong: The uniqueness of
. To illustrate this we consider a special case:
Example 1 Let
and
be real (or complex) vector spaces and
be linear with non-trivial null space. Furthermore, assume that
is convex and compact and consider scaled versions
for
. Then the set of solutions of
is an affine space in
and there are three cases for the intersection of this set and
:
- The intersection is empty.
- The intersection is a convex set and contains infinitely many elements.
- The intersection contains exactly one element.
The last case occurs precisely when the affine space of solutions is tangential to
. Loosely speaking, one may say that this case only occurs if the set
is scaled precisely to the right size such that it only touches the affine space of solutions.
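To make the three cases concrete, here is a minimal sketch under assumptions that are not in the text above: the space is the plane, the operator is `A x = x1 + x2` (its null space is spanned by `(1, -1)`), and the compact convex set is the Euclidean ball of radius `tau`. The solution set of `A x = y` is then a line at distance `|y|/sqrt(2)` from the origin, so comparing `tau` with this distance decides the case. The function name `intersection_case` is hypothetical.

```python
import math

def intersection_case(tau, y, tol=1e-12):
    """Classify the intersection of {x in R^2 : x1 + x2 = y} with the ball ||x||_2 <= tau."""
    dist = abs(y) / math.sqrt(2.0)  # distance of the solution line from the origin
    if tau < dist - tol:
        return "empty"
    if abs(tau - dist) <= tol:
        return "exactly one element"  # the ball only touches the line
    return "infinitely many elements"

y = 2.0
for tau in (1.0, math.sqrt(2.0), 2.0):
    print(tau, intersection_case(tau, y))
```

Only the single radius `tau = |y|/sqrt(2)` gives uniqueness, which illustrates how special the third case is.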
Another strong assumption in Theorem 1 is that the set
is compact. First, there is a way to somehow relax this condition. Basically, we only need compactness to obtain the converging subsequence. Hence, one could try to work with a weaker topology on
(which would result in a weaker notion of compactness) and then obtain a limit of a subsequence which converges in the weaker sense only. Then one would need some tool to deduce that the weak limit is indeed a solution. This strategy works, for example, in Banach spaces:
Example 2 Let
and
be reflexive Banach spaces and
be linear, bounded and one-to-one. We use the set
as prior knowledge on the solution of
. Moreover, we use the metrics induced by the norms of
and
, respectively:
and
.
Obviously,
is not compact (if
is of infinite dimension) but it is weakly compact (and by the Eberlein-Šmulian theorem also weakly sequentially compact). In the situation of the proof of Theorem 1 we only get a weakly converging subsequence
. However, a bounded linear operator
is also weak-to-weak continuous, and hence
. While we only have a weakly converging sequence, we can still obtain the contradiction as in the proof of Theorem 1, since the norm is weakly lower semicontinuous.
Another way to justify the assumption that the solution is in a known compact set is that in practice we always use a representation of the solution which only uses a finite number of degrees of freedom (think of a Galerkin ansatz for example). However, this interpretation somehow neglects that we are interested in finding the true solution to the true infinite dimensional problem and that the discretization of the problem should be treated as a different issue. Just building on the regularizing effect of discretization will almost surely result in a method whose stability properties depend on the resolution of the discretization.
Finally: Another good reference for these somewhat ancient results in regularization theory is one of the first books on this topic: “Solutions of ill-posed problems” by Tikhonov and Arsenin (1977). While it took me some time to get used to the type of presentation, I have to admit that it is really worth reading this book (and other translations of Russian mathematical literature).
May 23, 2011
Recently Arnd Rösch and I organized the minisymposium “Parameter identification and nonlinear optimization” at the SIAM Conference on Optimization. One of the aims of this symposium was to initiate more connections between the communities in optimal control of PDEs on the one hand and regularization of ill-posed problems on the other hand. To give a little bit of background, let me formulate the “mother problems” in both fields:
Example 1 (Mother problem in optimal control of PDEs) We consider a bounded Lipschitz domain
in
(or
). Assume that we are given a target (or desired state)
which is a real valued function of
. Our aim is to find a function (or control)
(also defined on
) such that the solution of the equation

Moreover, our solution (or control) shall obey some pointwise bounds

This motivates the following constrained optimization problem

Often, also the regularized problem is considered: For a small
solve:

(This problem is also extensively treated in Section 1.2.1 of the excellent book “Optimal Control of Partial Differential Equations” by Fredi Tröltzsch.)
For inverse problems we may formulate:
Example 2 (Mother problem in inverse problems) Consider a bounded and linear operator
between two Hilbert spaces and assume that
has non-closed range. In this case, the pseudo-inverse
is not a bounded operator. Consider now that we have measured data
that is basically a noisy version of “true data”
. Our aim is to approximate a solution of
by the knowledge of
. Since
does not have a closed range, it is usually the case that
is not in the domain of the pseudo-inverse and
simply does not make sense. A widely used approach, also treated in my previous post, is Tikhonov regularization, that is, solving for a small regularization parameter

Clearly both mother problems have a very similar mathematical structure: We may use the solution operator of the PDE, denote it by
, and restate the mother problem of optimal control of PDEs in a form similar to the mother problem of inverse problems. However, there are some important conceptual differences:
Desired state vs. data: In Example 1
is a desired state which, however, may not be reachable. In Example 2
is noisy data and hence should not be matched as closely as possible.
Control vs. solution: In Example 1 the result
is an optimal control. Its form is not of prime importance, as long as it fulfills the given bounds and allows for a good approximation of
. In Example 2 the result
is the approximate solution itself (which, of course, shall somehow explain the measured data
). Its properties are themselves important.
Regularization: In Example 1 the regularization is mainly for numerical reasons. The problem itself also has a solution for
. This is due to the fact that the set of admissible
forms a weakly compact set. However, in Example 2 one may not choose
: First because the functional will not have a minimizer anymore and secondly one really does not want
as small as possible since
is corrupted by noise. Especially, the people from inverse problems are interested in the case in which both
and
. However, in optimal control of PDEs,
is often seen as a model parameter which ensures that the control has somehow a small energy.
These conceptual differences sometimes complicate the dialog between the fields. One often runs into discussion dead ends like “Why should we care about decaying
—it’s given?” or “Why do you need these bounds on
? This makes your problem worse and you may not reach the state as well as possible…”. It often takes some time until the involved people realize that they really pursue different goals, that the quantities which even have similar names are something different, and that the minimization problems can be solved with the same techniques.
In our minisymposium we had the following talks:
- “Identification of an Unknown Parameter in the Main Part of an Elliptic PDE”, Arnd Rösch
- “Adaptive Discretization Strategies for Parameter Identification Problems in PDEs in the Context of Tikhonov Type and Newton Type Regularization”, Barbara Kaltenbacher
- “Optimal Control of PDEs with Directional Sparsity”, Gerd Wachsmuth
- “Nonsmooth Regularization and Sparsity Optimization”, Kazufumi Ito
- “
Fitting for Nonlinear Parameter Identification Problems for PDEs”, Christian Clason
- “Simultaneous Identification of Multiple Coefficients in an Elliptic PDE”, Bastian von Harrach
Finally, there was my own talk “Error Estimates for joint Tikhonov and Lavrentiev Regularization of Parameter Identification Problems” which is based on a paper with a similar name which is at http://arxiv.org/abs/0909.4648 and published in Applicable Analysis. The slides of the presentation are here (beware, there may be some wrong exponents in the pdf…).
In a nutshell, the message of the talk is: Bounds on both the control/solution and the state/data may also be added to a Tikhonov-regularized inverse problem. If the operator has convenient mapping properties then the bounds will eventually be inactive if the true solution has the same property. Hence, the known estimates for usual inverse problems are asymptotically recovered.
May 22, 2011
There are still some things left that I wanted to add about the issue of weak-* convergence in
, non-linear distortions and Young measures. The first is that Young measures are not able to describe all effects of weak-* convergence; namely, the notion does not handle concentrations properly. The second thing is that there is an alternative approach based on the
-function which I also find graphically appealing.
1. Concentrations and Young measures
One can distinguish several “modes” that a sequence of functions can obey: In this blog entry, Terry Tao introduces four more modes apart from oscillation:
- escape to horizontal infinity
- escape to width infinity
- escape to vertical infinity
- typewriter sequence
1.1. Escape to horizontal infinity
This mode is most easily described by the sequence
, i.e. the characteristic functions on an interval of unit length which escapes to infinity. Obviously, this sequence does not converge in any
norm and its weak convergence depends on
:
For
the sequence does converge weakly (weakly-* for
) to zero. This can be seen as follows: Assume that for some non-negative
(with
) we have
. Then we get with Hölder’s inequality that
. But this contradicts the fact that
.
For
the sequence does not converge weakly to zero, as can be seen by testing it with the function
and also does not converge weakly at all (test with
and observe that the dual pairings do not converge).
However, this type of convergence does not occur in bounded domains, and hence cannot be treated with Young measures as they have been introduced in my previous entry.
1.2. Escape to width infinity
The paradigm for this mode of convergence is
. This sequence even converges strongly in
for
but not strongly in
. However, it converges weakly to 0 in
. This mode needs, similar to the previous mode, an unbounded domain.
1.3. Escape to vertical infinity
The prime example, normalized in
, for this mode is
. By testing with continuous functions (which is enough by density) one sees that the weak limit is zero.
If one wants to assign some limit to this sequence
one can say that the measure
does converge weakly in the sense of measures to
, i.e. twice the point-mass in zero.
Now, what does the Young measure say here?
We check narrow convergence of the Young measures
by testing with a function of type
for a Borel set
and bounded continuous function
. Then we get for

Hence,

We conclude that this mode of convergence cannot be seen by Young measures. As Attouch et al. say in Section 4.3.7: “Young measures do not capture concentrations”.
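A small numerical sketch of this effect, under assumptions that are mine and not taken from the formulas above: the domain is `(-1, 1)`, the concentrating sequence is `f_n = n` on `[-1/n, 1/n]` and zero elsewhere (consistent with the limit “twice the point-mass in zero”), and the test functions `cos` and `psi` are chosen only for illustration. The first pairing tends to `2*g(0)`, while the Young-measure pairing tends to `2*psi(0)`, i.e. to what the zero function would give.

```python
import math

def midpoint(func, lo, hi, cells=20000):
    """Midpoint rule for the integral of func over [lo, hi]."""
    h = (hi - lo) / cells
    return h * sum(func(lo + (i + 0.5) * h) for i in range(cells))

def f(n, x):
    """Assumed concentrating sequence: f_n = n on [-1/n, 1/n], zero elsewhere."""
    return float(n) if abs(x) <= 1.0 / n else 0.0

def measure_pairing(n, g):
    """Integral of f_n * g over (-1, 1); f_n vanishes outside [-1/n, 1/n]."""
    return midpoint(lambda x: f(n, x) * g(x), -1.0 / n, 1.0 / n)

def young_pairing(n, psi):
    """Integral of psi(f_n(x)) over (-1, 1), split at the jumps of f_n."""
    pieces = [(-1.0, -1.0 / n), (-1.0 / n, 1.0 / n), (1.0 / n, 1.0)]
    return sum(midpoint(lambda x: psi(f(n, x)), lo, hi) for lo, hi in pieces)

def psi(y):
    """A bounded continuous test function (illustrative choice)."""
    return 1.0 / (1.0 + y * y)

for n in (1, 10, 100, 1000):
    print(n, measure_pairing(n, math.cos), young_pairing(n, psi))
```

The concentration of mass at zero is visible in the first column of numbers but leaves no trace in the second.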
1.4. Typewriter sequence
A typewriter sequence on an interval (as described in Example 4 of this blog entry of Terry Tao) is a sequence of functions which are mostly zero and the non-zero places revisit every place of the interval again, however, with smaller support and integral. This is an example of a sequence which converges in the
-norm but not pointwise at any point. However, this mode of convergence is not very interesting with respect to Young measures. It basically behaves like “Escape to vertical infinity” above.
2. Weak convergence via the
-function
While Young measures put a uniformly distributed measure on the graph of the function, and thus, are a more “graphical” representation of the function, the approach described now uses the area between the graph and the
-axis.
We consider an open and bounded domain
. Now we define the
-function as

The function looks like this:

We then associate to a given function
the function
. Graphically, this function has the value
, if the value
is positive and between zero and
and it is
, if
is negative and again between zero and
. In other words: the function
is piecewise constant of the area between zero and the graph of
encoding the sign of
. For the functions
from Example 1 in the previous post, this looks as follows:

Similar to the approach via Young measure, we now consider the sequence of the new objects, i.e. the sequence of
and use a weak form of convergence here. For Young measures we used narrow convergence and here we use simple weak-* convergence.
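The verbal description of the function above can be sketched in code. This is only an illustration: `chi` encodes the values 1, -1 and 0 as described, and the integration window `[-5, 5]` and the helper name `integrate_chi` are my own assumptions. The sketch checks the elementary identity that integrating over the first variable recovers the value `u`.

```python
def chi(xi, u):
    """1 if xi lies between 0 and a positive u, -1 if between a negative u and 0, else 0."""
    if 0.0 < xi < u:
        return 1.0
    if u < xi < 0.0:
        return -1.0
    return 0.0

def integrate_chi(u, lo=-5.0, hi=5.0, cells=100000):
    """Midpoint rule for the integral of chi(xi, u) over xi in [lo, hi] (needs lo < u < hi)."""
    h = (hi - lo) / cells
    return h * sum(chi(lo + (i + 0.5) * h, u) for i in range(cells))

for u in (1.5, -2.0, 0.0):
    print(u, integrate_chi(u))
```

So the signed area between zero and the graph carries the same information as the function value itself.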
One can show the following lemma:
Lemma 1 Assume that
converges weakly-* in
. Then, for the weak-* limit of the mappings
, denoted by
there exists a probability measure
such that

The proof (in a slightly different situation) can be found in Kinetic Formulation of Conservation Laws, Lemma 2.3.1.
Example 1 We again consider Example 1 from my previous post:

The graph of some
and the corresponding function
was shown above. Obviously the weak-* limit of these
-functions
is (in the case
) given by

This can be illustrated as

Now take the weak derivative with respect to
(which is, as the function
itself, independent of
) to get

and, comparing with Lemma 1, we see

Cool: That is precisely the same limit as obtained by the Young measure!
Well, the observation in this example is not an accident and indeed this approach is closely related to Young measures. Namely, it holds that
.
Maybe I’ll come back to the proof of this fact later (it seemed not too hard, but used a different definition of a Young measure than the one I used here).
To conclude: Both the approach via Young measures and the approach via the
-function lead to the same new understanding of weak-* limits in
. This new understanding is a little bit deeper than the usual one as it goes well with non-linear distortions of functions. And finally: Both approaches use a geometric approach: Young measures put an equidistributed measure on the graph of the function and the
-function puts
between the graph and the
-axis.
May 11, 2011
This entry is not precisely about something I stumbled upon but about something that I have wanted to learn for some time now, namely Young measures. Lately I had a train ride of several hours and I had the book Kinetic Formulation of Conservation Laws with me.
While the book is about hyperbolic PDEs and their formulation as kinetic equations, it also has some pointers to Young measures. Roughly, Young measures are a way to describe weak limits of functions and especially to describe how these weak limits behave under non-linear functions, and hence, we start with this notion.
1. Weak convergence of functions
We are going to deal with sequences of functions
in spaces
for some open bounded domain
and some
.
For
the dual space of
is
with
and the dual pairing is

Hence, a sequence
converges weakly in
to
, if for all
it holds that

We denote weak convergence (if the space is clear) with
.
For the case
one usually uses the so-called weak-* convergence: A sequence
in
converges weakly-* to
, if for all
it holds that

The reason for this is that the dual space of
is not easily accessible as it cannot be described as a function space. (If I recall correctly, this is described in “Linear Operators” by Dunford and Schwartz.) Weak-* convergence will be denoted by
.
In some sense, it is enough to consider weak-* convergence in
to understand what Young measures are about and I will stick to this kind of convergence here.
Example 1 We consider
and two values
. We define a sequence of functions which jumps between these two values with an increasing frequency:

The functions
look like this:

To determine the weak limit, we test with very simple functions, let’s say with
. Then we get

Hence, we see that the weak-* limit of the
(which is, by the way, always unique) has no other chance than being

In words: the weak-* limit is the arithmetic mean of the two values between which the functions oscillate.
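A quick numerical check of this example, as a sketch only: I assume the concrete form that the function equals `a` on the cells `[k/n, (k+1)/n)` with even `k` and `b` on those with odd `k`, and I test with the indicator function of an interval `[c, d]`. The values of `a`, `b`, `c`, `d` and the name `pair_with_indicator` are illustrative choices.

```python
def pair_with_indicator(n, c, d, a, b):
    """Integral over [c, d] (inside [0, 1]) of the oscillating function, cell by cell."""
    total = 0.0
    for k in range(n):
        left, right = k / n, (k + 1) / n
        overlap = max(0.0, min(d, right) - max(c, left))  # length of cell k inside [c, d]
        value = a if k % 2 == 0 else b
        total += value * overlap
    return total

a, b = -1.0, 3.0
c, d = 0.2, 0.7
for n in (1, 10, 100, 1000):
    print(n, pair_with_indicator(n, c, d, a, b))
print("pairing with the arithmetic mean:", 0.5 * (a + b) * (d - c))
```

As the frequency grows, the pairings approach the pairing of the test function with the constant `(a + b)/2`.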
2. Non-linear distortions
Now, the norm-limit behaves well under non-linear distortions of the functions. Let’s consider a sequence
which converges in norm to some
. That is,
. Since this means that
we see that for any bounded continuous function
we also have
and hence
.
The same is totally untrue for weak-* (and also weak) limits:
Example 2 Consider the same sequence
as in Example 1, which has the weak-* limit
. As a nonlinear distortion we take
which gives

Now we see

The example can be made a little bit more drastic by assuming
which gives
. Then, for every
with
we have
. However, with such a
we may construct any constant value
for the weak-* limit of
(take, e.g.
,
).
In fact, the relation
is only true for affine linear distortions
(unfortunately I forgot a reference for this fact…).
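The failure for non-affine distortions can be seen numerically. The following sketch again assumes the concrete oscillating function from above (value `a` on even cells of width `1/n`, `b` on odd cells) and uses the square as distortion with `a = -1`, `b = 1`; these choices and the name `pair_distorted` are mine.

```python
def pair_distorted(n, c, d, a, b, phi):
    """Integral over [c, d] of phi applied to the oscillating function, cell by cell."""
    total = 0.0
    for k in range(n):
        overlap = max(0.0, min(d, (k + 1) / n) - max(c, k / n))
        total += phi(a if k % 2 == 0 else b) * overlap
    return total

def square(s):
    return s * s

a, b = -1.0, 1.0
limit_of_distortion = pair_distorted(1000, 0.0, 1.0, a, b, square)  # close to (a^2 + b^2)/2 = 1
distortion_of_limit = square(0.5 * (a + b)) * 1.0                   # square of the weak-* limit 0
print(limit_of_distortion, distortion_of_limit)
```

The squared functions are identically one, so their weak-* limit is one, while the square of the weak-* limit is zero.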
The question arises whether it is possible to describe the weak-* limits of distortions of functions, and in fact, this will be possible with the notion of Young measures.
3. Young measures
In my understanding, Young measures are a method to view a function a little bit more geometrically by giving more emphasis to the graph of the function rather than its mapping property.
We start with defining Young measures and illustrate how they can be used to describe weak(*) limits. In what follows we use $\mathfrak{L}$ for the Lebesgue measure on the (open and bounded) set $\Omega$. A more thorough description in the spirit of this section is Variational analysis in Sobolev and BV spaces by Attouch, Buttazzo and Michaille.
Definition 1 (Young measure) A positive measure $\mu$ on $\Omega\times{\mathbb R}$ is called a Young measure if for every Borel subset $B$ of $\Omega$ it holds that

$$\mu(B\times{\mathbb R}) = \mathfrak{L}(B).$$
Hence, a Young measure is a measure such that the measure of every box
is determined by the projection of the box onto the set
, i.e. the intersection on
with
which is, of course,
:

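There are special Young measures, namely those which are associated to functions. Roughly speaking, a Young measure associated to a function $u:\Omega\to{\mathbb R}$ is a measure which is equidistributed on the graph of $u$.
Definition 2 (Young measure associated to $u$) For a Borel measurable function $u:\Omega\to{\mathbb R}$ we define the associated Young measure $\mu^u$ by defining for every continuous and bounded function $\phi:\Omega\times{\mathbb R}\to{\mathbb R}$

$$\int_{\Omega\times{\mathbb R}}\phi(x,y){\mathrm d}{\mu^u(x,y)} = \int_\Omega\phi(x,u(x)){\mathrm d}{\mathfrak{L}(x)}.$$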
It is clear that
is a Young measure: Take
and approximate the characteristic function
by smooth functions
. Then

The left hand side converges to
while the right hand side converges to
as claimed.
The intuition that a Young measure associated to a function is an equidistributed measure on the graph can be made more precise by “slicing” it:
Definition 3 (Slicing a measure) Let
be a positive measure on
and let
be its projection onto
(i.e.
). Then
is sliced into measures
, i.e. it holds:
- Each
is a probability measure.
- The mapping
is measurable for every continuous
and it holds that

The existence of the slices is, e.g. proven in Variational analysis in Sobolev and BV spaces, Theorem 4.2.4.
For the Young measure
associated to
, the measure
in Definition 3 is
and hence:

On the other hand:

and we see that
slices into

and this can be vaguely sketched:

4. Narrow convergence of Young measures and weak* convergence in
Now we ask ourselves: If a sequence
converges weakly* in
, what does the sequence of associated Young measures do? Obviously, we need a notion for the convergence of Young measures. The usual notion here, is that of narrow convergence:
Definition 4 (Narrow convergence of Young measures) A sequence
of Young measures on
converges narrowly to
, if for all bounded and continuous functions
it holds that

Narrow convergence will also be denoted by
.
One may also use the non-continuous test functions of the form
with a Borel set
and a continuous and bounded
, leading to the same notion.
The set of Young measures is closed under narrow convergence, since we may test with the function
to obtain:

The next observation is the following:
Proposition 5 Let
be a bounded sequence in
. Then the sequence
of associated Young measures has a subsequence which converges narrowly to a Young measure
.
The proof uses the notion of tightness of sets of measures and the Prokhorov compactness theorem for Young measures (Theorem 4.3.2 in Variational analysis in Sobolev and BV spaces).
Example 3 (Convergence of the Young measures associated to Example 1) Consider the functions
from Example 1 and the associated Young measures
. To figure out the narrow limit of these Young measures we test with a function
with a Borel set
and a bounded and continuous function
. We calculate
$$\begin{array}{rcl} \int_{[0,1]\times{\mathbb R}}\phi(x,y){\mathrm d}{\mu^{f_n}(x,y)} &= &\int_0^1\phi(x,f_n(x)){\mathrm d}{\mathfrak{L}(x)}\\ & = &\int_B\psi(f_n(x)){\mathrm d}{\mathfrak{L}(x)}\\ & \rightarrow &\mathfrak{L}(B)\frac{\psi(a)+\psi(b)}{2}\\ & = & \int_B\frac{\psi(a)+\psi(b)}{2}{\mathrm d}{\mathfrak{L}(x)}\\ & = & \int_{[0,1]}\int_{\mathbb R}\phi(x,y){\mathrm d}{\bigl(\tfrac{1}{2}(\delta_a+\delta_b)\bigr)(y)}{\mathrm d}{\mathfrak{L}(x)}. \end{array}$$
We conclude:

i.e. the narrow limit of the Young measures
is not the constant function
but the measure
. This expression may be easier to digest in sliced form:

i.e. the narrow limit is something like the “probability distribution” of the values of the functions
. This can be roughly put in a picture:

Obviously, this notion of convergence goes well with nonlinear distortions:

Recall from Example 1: The weak-* limit of
was the constant function
, i.e.
$$\phi\circ f_n \rightharpoonup^* \tfrac{\phi(a)+\phi(b)}{2}\chi_{[0,1]}.$$
The observation from the previous example is in a similar way true for general weakly-* converging sequences
:
Theorem 6 Let
in
with
. Then it holds for almost all
that

In other words:
is the expectation of the probability measure
.
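As a quick consistency check with Example 3 (a sketch, using the slices $\tfrac12(\delta_a+\delta_b)$ of the narrow limit found there): the expectation of this probability measure is

```latex
\int_{\mathbb R} y\,{\mathrm d}\bigl(\tfrac{1}{2}(\delta_a+\delta_b)\bigr)(y)
  = \tfrac{1}{2}\,a + \tfrac{1}{2}\,b
  = \frac{a+b}{2},
```

which is exactly the arithmetic mean that appeared as the weak-* limit in Example 1.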
May 4, 2011
Some time ago I picked up the phrase Ivanov regularization. Starting with an operator $A:X\to Y$ between two Banach spaces (say) one encounters the problem of instability of the solution of $Ax=y$ if $A$ has non-closed range. One dominant tool to regularize the solution is called Tikhonov regularization and consists of minimizing the functional $\|Ax-y^\delta\|_Y^p + \alpha\|x\|_X^q$. The meaning behind these terms is as follows: The term $\|Ax-y^\delta\|_Y^p$ is often called discrepancy and it should be not too large to guarantee that the “solution” somehow explains the data. The term $\|x\|_X^q$ is often called regularization functional and shall not be too large to have some meaningful notion of “solution”. The parameter $\alpha>0$ is called regularization parameter and allows weighting between the discrepancy and regularization.
For the case of Hilbert spaces one typically chooses $p=q=2$ and gets a functional for which the minimizer is given more or less explicitly as $x_\alpha = (A^*A+\alpha I)^{-1}A^*y^\delta$.
The existence of this explicit solution seems to be one of the main reasons for the broad usage of Tikhonov regularization in the Hilbert space setting.
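To see what the explicit solution buys, here is a minimal sketch under simplifying assumptions of my own: the operator is a diagonal matrix with entries `sigmas` (think of singular values), both norms are squared Euclidean norms, and the numbers are made up for illustration. In that case the minimizer of `||Ax - y||^2 + alpha*||x||^2` has the components `s*y/(s*s + alpha)`.

```python
def tikhonov_diagonal(sigmas, y_delta, alpha):
    """Minimizer of sum((s*x - y)^2) + alpha*sum(x^2) for A = diag(sigmas)."""
    return [s * y / (s * s + alpha) for s, y in zip(sigmas, y_delta)]

sigmas = [1.0, 1e-1, 1e-4]          # one very small singular value
x_true = [1.0, 1.0, 1.0]
noise = [1e-3, -1e-3, 1e-3]
y_delta = [s * x + e for s, x, e in zip(sigmas, x_true, noise)]

naive = [y / s for s, y in zip(sigmas, y_delta)]      # unregularized inversion
alpha = 1e-3
regularized = tikhonov_diagonal(sigmas, y_delta, alpha)
print("naive:      ", naive)
print("regularized:", regularized)
```

The naive inversion amplifies the noise in the last component by the factor `1/1e-4`, while the regularized solution damps that component instead; the price is a bias in the components with small singular values.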
Another related approach is sometimes called residual method, however, I would prefer the term Morozov regularization. Here one again balances the terms “discrepancy” and “regularization” but in a different way: One solves

$$\min_x\ \|x\|_X\quad\text{s.t.}\quad \|Ax-y^\delta\|_Y\leq\delta.$$

That is, one tries to find an $x$ with minimal norm which explains the data $y^\delta$ up to an accuracy $\delta$. The idea is that $\delta$ reflects the so-called noise level, i.e. an estimate of the error which is made during the measurement of $y$. One advantage of Morozov regularization over Tikhonov regularization is that the meaning of the parameter $\delta$ is much clearer than the meaning of $\alpha$. However, there is no closed form solution for Morozov regularization.
Ivanov regularization is yet another method: solve

$$\min_x\ \|Ax-y^\delta\|_Y\quad\text{s.t.}\quad \|x\|_X\leq\tau.$$

Here one could say that one wants to have the smallest discrepancy among all $x$ which are not too “rough”.
Ivanov regularization in this form does not have too many appealing properties: The parameter $\tau$ does not seem to have a proper motivation and moreover, there is again no closed form solution.
However, recently the focus of variational regularization (as all these methods may be called) has shifted from using norms to the use of more general functionals. One even considers Tikhonov in an abstract form as minimizing

$$S(Ax,y^\delta) + \alpha R(x)$$

with a “general” similarity measure $S$ and a general regularization term $R$, see e.g. the dissertation of Christiane Pöschl (which can be found here, thanks Christiane) or the works of Jens Flemming. Prominent examples for the similarity measure are of course norms of differences or the Kullback-Leibler divergence or the Itakura-Saito divergence which are both treated in this paper. For the regularization term one uses norms and semi-norms in various spaces, e.g. Sobolev (semi-)norms, Besov (semi-)norms, the total variation seminorm or
norms.
In all these cases, the advantage of Tikhonov regularization of having a closed form solution is not there anymore. Then, the most natural choice would be, in my opinion, Morozov regularization, because one may use the noise level directly as a parameter. However, from a practical point of view one also should care about the problem of calculating the minimizer of the respective problems. Here, I think that Ivanov regularization is important again: Often the similarity measure
is somehow smooth but the regularization term
is nonsmooth (e.g. for total variation regularization or sparse regularization with
-penalty). Hence, both Tikhonov and Morozov regularization have a nonsmooth objective function. Somehow, Tikhonov regularization is still a bit easier, since the minimization is unconstrained. Morozov regularization has a constraint which is usually quite difficult to handle. E.g. it is usually difficult (is it perhaps even ill-posed?) to project onto the set defined by
. Ivanov regularization has a smooth objective functional (at least if the similarity measure is smooth) and a constraint which is usually somehow simple (i.e. projections are not too difficult to obtain).
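To illustrate why a simple constraint helps, here is a sketch of a projected gradient iteration for an Ivanov-type problem. Everything specific is an assumption of mine: a least-squares discrepancy `0.5*||Ax - y||^2`, a diagonal operator with entries `sigmas`, and a Euclidean ball of radius `tau` as constraint (for a sparsity constraint one would need the projection onto a different ball). The function names are hypothetical.

```python
import math

def project_ball(x, tau):
    """Euclidean projection onto the ball {x : ||x||_2 <= tau}."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= tau:
        return list(x)
    return [tau * v / norm for v in x]

def ivanov_projected_gradient(sigmas, y, tau, steps=500):
    """Projected gradient for min 0.5*||Ax - y||^2 s.t. ||x||_2 <= tau, with A = diag(sigmas)."""
    step = 1.0 / max(s * s for s in sigmas)  # 1/L, L = largest eigenvalue of A^T A
    x = [0.0] * len(sigmas)
    for _ in range(steps):
        grad = [s * (s * v - t) for s, v, t in zip(sigmas, x, y)]
        x = project_ball([v - step * g for v, g in zip(x, grad)], tau)
    return x

def discrepancy(sigmas, y, x):
    return 0.5 * sum((s * v - t) ** 2 for s, v, t in zip(sigmas, x, y))

sigmas = [1.0, 0.5]
y = [2.0, 1.0]
x_small = ivanov_projected_gradient(sigmas, y, tau=1.0)   # constraint active
x_large = ivanov_projected_gradient(sigmas, y, tau=10.0)  # constraint inactive
print(x_small, discrepancy(sigmas, y, x_small))
print(x_large, discrepancy(sigmas, y, x_large))
```

Each iteration only needs a gradient of the smooth discrepancy and one cheap projection, which is the practical appeal described above.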
Now, I found that all three methods, Ivanov, Tikhonov and Morozov regularization, are treated in the book “Theory of linear ill-posed problems and its applications” by V. K. Ivanov, V. V. Vasin and Vitaliĭ Pavlovich Tanana in Sections 3.2, 3.3 and 3.4, respectively. Ivanov regularization goes under the name “method of quasi solutions” (Section 3.2) and Morozov regularization is called “Method of residual” (Section 3.4). Well, I think I should read these sections a bit closer now…
May 4, 2011
Posted by Dirk under
Math | Tags:
Math blog |
1 Comment
Will this be the start of a mathematical blog of my own? Maybe, but maybe not.
The reason for the start of this blog was basically the observation that other people use a blog in a way which helps them to do and especially organize their research. While searching things on the web, I occasionally stumble upon things I consider interesting and my usual procedure was to do one of the following things:
- Scribble down a note on a piece of paper which I put to the other small notes on my desk.
- Download some document and store it in some place on my computer (in fact I have a folder which I called “Archivieren”, which means “To be archived”).
- Do nothing special but just try to remember the place where I found the information.
None of these ways seemed to be working well and I have the feeling that in all three cases it happened frequently that I did not use the information in the best way I could. I am going to try to use this blog as another option to keep track of things I find and think about. Let’s see how this will evolve…