Today I’d like to collect some comments on a few papers I stumbled upon recently on the arXiv.
1. TGV minimizers in 1D
First, about a month ago two very similar papers appeared in the same week:
- arXiv/1309.5900: A study of the one dimensional total generalised variation regularisation problem by Konstantinos Papafitsoros and Kristian Bredies
- arXiv/1309.7152: Exact Solutions of One-Dimensional TGV by Christiane Pöschl and Otmar Scherzer
Both papers treat the recently proposed “total generalized variation” model (which is a somehow-but-not-really-higher-order generalization of total variation). The total variation of a function $u\in L^1_{loc}(\Omega)$ on a domain $\Omega\subset\mathbb{R}^d$ is defined by duality:

$$TV(u) = \sup\Big\{\int_\Omega u\,\operatorname{div}\phi\,dx\ :\ \phi\in C_c^\infty(\Omega,\mathbb{R}^d),\ |\phi|\leq 1\Big\}.$$
(Note that the demanded high regularity of the test functions is not essential here, as we take a supremum over all these functions under the only, but important, requirement that the functions are bounded. Test functions from $C_c^1(\Omega,\mathbb{R}^d)$ would also do.)
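As a quick sanity check of the definition (a standard textbook example, not taken from either paper): for a single jump of height one, say $u=\chi_{(1,2)}$ on $\Omega=(0,2)$, one gets

$$TV(u) = \sup_{|\phi|\leq 1}\int_1^2 \phi'(x)\,dx = \sup_{|\phi|\leq 1}\big(-\phi(1)\big) = 1,$$

since the compactly supported test functions vanish near the boundary; the total variation just measures the jump height.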
Several possibilities exist to extend or generalize the total variation by somehow including higher order derivatives. The “total generalized variation” is a particularly successful approach, which reads as (now using two non-negative parameters $\alpha_0,\alpha_1$ which do a weighting):

$$TGV_\alpha^2(u) = \sup\Big\{\int_\Omega u\,\operatorname{div}^2\phi\,dx\ :\ \phi\in C_c^\infty(\Omega,S^{d\times d}),\ |\phi|\leq\alpha_0,\ |\operatorname{div}\phi|\leq\alpha_1\Big\}.$$
To clarify some notation: $S^{d\times d}$ denotes the symmetric $d\times d$ matrices, and $\operatorname{div}^k$ is the negative adjoint of $\nabla^k$, the differential operator that collects all $k$-th order partial derivatives in a $k$-tensor. Moreover, $|\phi|$ is some matrix norm (e.g. the Frobenius norm) and $|\operatorname{div}\phi|$ is some vector norm (e.g. the 2-norm).
Both papers investigate so-called denoising problems with TGV penalty and $L^2$ discrepancy, i.e. minimization problems

$$\min_u\ \tfrac12\int_\Omega (u-u^0)^2\,dx + TGV_\alpha^2(u)$$

for a given $u^0$. Moreover, both papers treat the one dimensional case and investigate very special cases in which they calculate minimizers analytically. In one dimension the definition of $TGV_\alpha^2$ becomes a little more familiar:

$$TGV_\alpha^2(u) = \sup\Big\{\int_\Omega u\,\phi''\,dx\ :\ \phi\in C_c^\infty(\Omega),\ |\phi|\leq\alpha_0,\ |\phi'|\leq\alpha_1\Big\}.$$
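To get a feeling for what such minimizers look like numerically, here is a minimal sketch of discrete 1D TGV² denoising (my own toy example, not from either paper). It uses the well-known equivalent formulation of TGV² as a minimum over an auxiliary variable $w$ and solves the convex program with CVXPY; the test signal, the grid size and the weights $\alpha_0,\alpha_1$ are arbitrary choices of mine.

```python
import numpy as np
import cvxpy as cp

n = 200
x = np.linspace(0, 1, n)
# piecewise test signal with jumps and a linear ramp (arbitrary choice)
u_true = np.piecewise(x, [x < 0.3, (x >= 0.3) & (x < 0.7), x >= 0.7],
                      [0.2, lambda t: t, 0.9])
u0 = u_true + 0.05 * np.random.randn(n)       # noisy observation

D1 = np.diff(np.eye(n), axis=0)               # forward differences on u, (n-1) x n
D2 = np.diff(np.eye(n - 1), axis=0)           # forward differences on w, (n-2) x (n-1)
alpha0, alpha1 = 0.05, 0.02                   # weights, chosen by hand

u = cp.Variable(n)
w = cp.Variable(n - 1)                        # auxiliary variable of the TGV^2 reformulation
objective = (0.5 * cp.sum_squares(u - u0)
             + alpha1 * cp.norm1(D1 @ u - w)  # discrete |u' - w| term
             + alpha0 * cp.norm1(D2 @ w))     # discrete |w'| term
cp.Problem(cp.Minimize(objective)).solve()

u_denoised = u.value                          # typically piecewise affine, jumps preserved
```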
Some images in the two papers are really similar: compare the illustrations of minimizers in Papafitsoros and Bredies with the corresponding ones in Pöschl and Scherzer. Although both papers have very similar scopes, it is worth reading both. The calculations are tedious, but both papers try to make them accessible and try hard (and did a good job) to provide helpful illustrations. Curiously, the earlier paper cites the later one, but not conversely…
2. Generalized conditional gradient methods
Another paper I found very interesting was
- Duality between subgradient and conditional gradient methods by Francis Bach
This paper shows a nice duality which I hadn’t been aware of, namely the one between subgradient descent methods and conditional gradient methods. In fact, the conditional gradient method which is treated there is a generalization of the conditional gradient method which Kristian and I proposed a while ago in the context of sparse minimization in the paper Iterated hard shrinkage for minimization problems with sparsity constraints: To minimize the sum

$$F(u) + \Phi(u)$$

with a differentiable $F$ and a convex $\Phi$ for which the subgradient of $\Phi$ is easily invertible (or, put differently, for which you can minimize $v\mapsto\langle a,v\rangle + \Phi(v)$ easily), perform the following iteration:
- At iterate $u^n$, linearize $F$ but not $\Phi$ and calculate a new point $v^n$ by
  $$v^n \in \operatorname*{argmin}_v\ \langle\nabla F(u^n), v\rangle + \Phi(v).$$
- Choose a stepsize $s_n\in[0,1]$ and set the next iterate as a convex combination of $u^n$ and $v^n$:
  $$u^{n+1} = u^n + s_n(v^n - u^n).$$
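As a concrete illustration of this scheme (my own toy instance, not taken from any of the papers), here is the iteration for $F(u)=\tfrac12\|Au-b\|^2$ and $\Phi$ the indicator function of an $\ell^1$-ball, which is exactly the Frank–Wolfe case mentioned next; the step size rule $s_n = 2/(n+2)$ and all problem sizes are arbitrary choices.

```python
# Generalized conditional gradient, instantiated with
#   F(u)  = 0.5 * ||A u - b||^2                    (differentiable part)
#   Phi   = indicator of the l1-ball of radius tau (convex part),
# so that argmin_v <grad F(u^n), v> + Phi(v) is attained at a signed
# vertex of the ball -- the classical Frank-Wolfe case.
import numpy as np

def gcg_l1_ball(A, b, tau, iters=200):
    """Minimize 0.5*||Au - b||^2 subject to ||u||_1 <= tau."""
    n = A.shape[1]
    u = np.zeros(n)                          # feasible starting point
    for k in range(iters):
        grad = A.T @ (A @ u - b)             # gradient of F at u^n
        i = np.argmax(np.abs(grad))          # linear minimization over the l1-ball:
        v = np.zeros(n)                      # optimum is -tau * sign(grad_i) * e_i
        v[i] = -tau * np.sign(grad[i])
        s = 2.0 / (k + 2.0)                  # standard step size choice
        u = u + s * (v - u)                  # convex combination of u^n and v^n
    return u

# Tiny usage example with a sparse ground truth (sizes chosen arbitrarily)
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 200))
x_true = np.zeros(200)
x_true[[3, 42, 117]] = [1.0, -2.0, 0.5]
b = A @ x_true
x = gcg_l1_ball(A, b, tau=3.5)
print("residual:", np.linalg.norm(A @ x - b))
```

Each iteration needs only one gradient evaluation and one linear minimization over the constraint set (for the $\ell^1$-ball this amounts to picking the largest gradient entry), which is what makes the method attractive when linear minimization is much easier than the prox of $\Phi$.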
Note that for $\Phi$ being the indicator function of a bounded convex set you obtain the classical conditional gradient method (also known as the Frank–Wolfe method). While Kristian and I derived convergence with an asymptotic rate for the case of a quadratic $F$ and a strongly coercive $\Phi$, Francis uses the formulation $F(u)=f(Au)$ together with the assumption that the conjugate $f^*$ of $f$ has a bounded effective domain (which says that $f$ has linear growth in all directions). With this assumption he obtains explicit constants and rates, also for the primal-dual gap. It was great to see that eventually somebody really took up the idea from the paper Iterated hard shrinkage for minimization problems with sparsity constraints (and does not think that we do heuristics for $\ell^0$ minimization…).
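As a quick aside on the parenthetical claim above: assuming $f$ is proper, convex and lower semicontinuous, so that $f=f^{**}$, a bounded effective domain $\operatorname{dom} f^*\subset\{\|z\|\leq R\}$ immediately gives

$$f(x) = \sup_{z\in\operatorname{dom} f^*}\big(\langle z,x\rangle - f^*(z)\big) \;\leq\; R\,\|x\| - \inf_{z\in\operatorname{dom} f^*} f^*(z),$$

i.e. $f$ grows at most linearly in every direction.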