Today I report on two things I came across here at ISMP:
- The first is a talk by Russell Luke on Constraint qualifications for nonconvex feasibility problems. Luke treated the NP-hard problem of sparsest solutions of linear systems. In fact he did not tackle this problem but the problem to find an
-sparse solution of an
system of equations. He formulated this as a feasibility-problem (well, Heinz Bauschke was a collaborator) as follows: With the usual malpractice let us denote by
the number of non-zero entries of
. Then the problem of finding an
-sparse solution to
is:
In other words: find a feasible point, i.e. a point which lies in the intersection of the two sets. Well, most often feasibility problems involve convex sets but here, the first one given by this “
-norm” is definitely not convex. One of the simplest algorithms for the convex feasibility problem is to alternatingly project onto both sets. This algorithm dates back to von Neumann and has been analyzed in great detail. To make this method work for non-convex sets one only needs to know how to project onto both sets. For the case of the equality constraint
one can use numerical linear algebra to obtain the projection. The non-convex constraint on the number of non-zero entries is in fact even easier: For
the projection onto
consists of just keeping the
largest entries of
while setting the others to zero (known as the “best
-term approximation”). However, the theory breaks down in the case of non-convex sets. Russell treated problem in several papers (have a look at his publication page) and in the talk he focused on the problem of constraint qualification, i.e. what kind of regularity has to be imposed on the intersection of the two sets. He could shows that (local) linear convergence of the algorithm (which is observed numerically) can indeed be justified theoretically. One point which is still open is the phenomenon that the method seems to be convergent regardless of the initialization and that (even more surprisingly) that the limit point seems to be independent of the starting point (and also seems to be robust with respect to overestimating the sparsity
). I wondered if his results are robust with respect to inexact projections. For larger problems the projection onto the equality constraint
are computationally expensive. For example it would be interesting to see what happens if one approximates the projection with a truncated CG-iteration as Andreas, Marc and I did in our paper on subgradient methods for Basis Pursuit.
- Joel Tropp reported on his paper Sharp recovery bounds for convex deconvolution, with applications together with Michael McCoy. However, in his title he used demixing instead of deconvolution (which, I think, is more appropriate and leads to less confusion). With “demixing” they mean the following: Suppose you have two signals
and
of which you observe only the superposition of
and a unitarily transformed
, i.e. for a unitary matrix
you observe
Of course, without further assumptions there is no way to recover
and
from the knowledge of
and
. As one motivation he used the assumption that both
and
are sparse. After the big bang of compressed sensing nobody wonders that one turns to convex optimization with
-norms in the following manner:
This looks a lot like sparse approximation: Eliminatingone obtains the unconstraint problem \begin{equation*} \min_y \|z_0-Uy\|_1 + \lambda \|y\|_1. \end{equation*}
Phrased differently, this problem aims at finding an approximate sparse solution of
such that the residual (could also say “noise”)
is also sparse. This differs from the common Basis Pursuit Denoising (BPDN) by the structure function for the residual (which is the squared
-norm). This is due to the fact that in BPDN one usually assumes Gaussian noise which naturally lead to the squared
-norm. Well, one man’s noise is the other man’s signal, as we see here. Tropp and McCoy obtained very sharp thresholds on the sparsity of
and
which allow for exact recovery of both of them by solving (1). One thing which makes their analysis simpler is the following reformulation: The treated the related problem \begin{equation*} \min_{x,y} \|x\|_1 \text{such that} \|y\|_1\leq\alpha, x+Uy=z_0 \end{equation*} (which I would call this the Ivanov version of the Tikhonov-problem (1)). This allows for precise exploitation of prior knowledge by assuming that the number
is known.
First I wondered if this reformulation was responsible for their unusual sharp results (sharper the results for exact recovery by BDPN), but I think it’s not. I think this is due to the fact that they have this strong assumption on the “residual”, namely that it is sparse. This can be formulated with the help of
-norm (which is “non-smooth”) in contrast to the smooth
-norm which is what one gets as prior for Gaussian noise. Moreover, McCoy and Tropp generalized their result to the case in which the structure of
and
is formulated by two functionals
and
, respectively. Assuming a kind of non-smoothness of
and
the obtain the same kind of results and especially matrix decomposition problems are covered.