November 2011
Monthly Archive
November 29, 2011
November 28, 2011
I stumbled upon the notion of “time varying channels” in signal processing and after reading a bit I realized that these are really interesting objects and appear in many different parts of mathematics. In this post I collect of few of their realizations and relations.
In signal processing a “time varying channel” is a mapping which maps a signal
to an output
via

Before we explain the name “time varying channel” we fix the notation for the Fourier transform I am going to use in this post:
Definition 1 For
we define the Fourier transform by

and denote the inverse Fourier transform by
or
. For a function
we denote with
and
the Fourier transform with respect to the first and second
-component, respectively.
Remark 1 In all what follows we do formal calculations with integral not caring about integrability. All calculation are justified in the case of Schwartz-functions and often hold in a much broader context of tempered distributions (this for example happens if the integrals represent Fourier transforms of functions).
The name “time varying channel” can be explained as follows: Consider a pure frequency as input:
. A usual linear channel gives as output a damped signal and the damping depends on the frequency
:
. If we send the pure frequency in our time varying channel we get

Hence, the time varying channel also damps the pure frequencies but with a time dependent factor.
Let’s start quite far away from signal processing:
1. Pseudo-differential operators
A general linear differential operator of order
on functions on
is defined with multiindex notation as

with coefficient functions
. Using Fourier inversion we get

and hence

For a general function (usually obeying some restrictions)
we call the corresponding
a pseudo-differential operator with symbol
.
2. As integral operators
By integral operator I mean something like

We plug the definition of the Fourier transform into
and obtain

Using the Fourier transform we can express the relation between
and
as

3. As “time varying convolution”
The convolution of two functions
and
is defined as

and we write “the convolution with
” as an operator
.
Defining

we deduce from (1)

4. As superposition of time-frequency shifts
Using that iterated Fourier transforms with respect to components give the Fourier transform, i.e.
, we obtain

From (1) we get

Before plugging this into
we define time shifts
and frequency shifts (or modulations)
as

With this we get

Hence,
is also a weighted superposition of
(time-frequency shifts) with weight
.
5. Back to time varying channels
Simple substitution brings us back the situation of a time varying channel

with
.
6. As superposition of product-convolution operators
Finally, I’d like to illustrate that this kind of operators can be seen as superposition of product convolution operators.
Introducing product operators
we define a product-convolution operator as a convolution with a function
followed by a multiplication with another function
:
.
To express
with product-convolution operators we choose an orthonormal basis which consists of tensor-products of functions
and develop
into this basis as

Then

and we obtain

Remark 2 The integral operators of the form
are indeed general objects as can be seen from the Schwartz kernel theorem. Every reasonable operator mapping Schwartz functions linearly onto the tempered distributions is itself a generalized integral operator.
I tried to capture the various relation between the first five representations time varying channels in a diagram (where I went out of motivation before filling in all fields…):

Like this:
Like Loading...
November 24, 2011
If you want to have a sparse solution to a linear system of equation and have heard of compressed sensing or sparse reconstruction than you probably know what to do: Get one of the many solvers for Basis Pursuit and be happy.
Basis Pursuit was designed as a convex approximation of the generally intractable problem of finding the sparsest solution (that is, the solution with the smallest number of non-zero entries). By abuse of notation, we define for 

(Because of
some people prefer the, probably more correct but also more confusing, notation
…).
Then, the sparsest solution of
is the solution of

and Basis Pursuit replaces
with “the closest convex proxy”, i.e.

The good thing about Basis Pursuit suit is, that is really gives the sparsest solution under appropriate conditions as is widely known nowadays. Here I’d like to present two simple examples in which the Basis Pursuit solution is
- not even close to the sparsest solution (by norm).
- not sparse at all.
1. A small bad matrix
We can build a bad matrix for Basis Pursuit, even in the case
: For a small
define

Of course, the sparsest solution is

while the solution of Basis Pursuit is

The summarize: For 

(There is also a least squares solution that has three non-zero entries and a one-norm slightly larger than 2.)
Granted, this matrix is stupid. Especially, its first column has a very small norm compared to the others. Ok, let’s construct a matrix with normalized columns.
2. A small bad matrix with normalized columns
Fix an integer
and a small
. We define a
-matrix

Ok, the first two columns do not have norm 1 yet, so we normalize them by multiplying with the right constant

(which is close to
) to get

Now we take the right hand side

and see what solutions to
are there.
First, there is the least squares solution
. This has only non-zero entries, the last
entries are slightly smaller than
and the first two are between
and
, hence,
(in fact, slightly larger).
Second, there is a very sparse solution

This has two non-zero entries and a pretty large one-norm:
.
Third there is a solution with small one-norm:

We have
non-zero entries and
. You can check that this
is also the unique Basis Pursuit solution (e.g. by observing that
is an element of
and that the first two entries in
are strictly smaller than 1 and positive – put differently, the vector
is dual certificate for
).
To summarize, for
it holds that

The geometric idea behind this matrix is as follows: We take
simple normalized columns (the identity part in
) which sum up to the right hand side
. Then we take two normalized vectors which are almost orthogonal to
but have
in their span (but one needs huge factors here to obtain
).
Well, this matrix looks very artificial and indeed it’s constructed for one special purpose: To show that minimal
-norm solution are not always sparse (even when a sparse solution exists). It’s some kind of a hobby for me to construct instances for sparse reconstruction with extreme properties and I am thinking about a kind of “gallery” of these instances (probably extending the “gallery” command in Matlab).
By the way: if you want to play around with this matrix, here is the code
n = 100;
epsilon = sqrt(8/(n^2-n))+0.1;
c = 1/sqrt(2+n*epsilon^2/4);
A = zeros(n,n+2);
A(1:2,1:2) = ([1 -1;-1,1]+epsilon/2)*c;
A(3:n,1:2) = epsilon/2*c;
A(1:n,3:n+2) = eye(n);
b = ones(n,1);
Like this:
Like Loading...
November 22, 2011
Posted by Dirk under
Math | Tags:
google scholar |
Leave a Comment
On 0xDE I learned that Google has a new serivce, called Google scholar citations. You can make a profile and collect all your publications that Google scholar can find. It also calculates several of these statistics about your citations like the h-index.
Using this one realizes a few things quickly:
- Google scholar is not even close to be a reliable data base for mathematical publications. I find talks, poster, preprints multiple times,…
- Google scholar is not even close to have a proper database of the “citation network”. Some preprint series (in which I have some preprints) include a list of all paper in that series at the end of each paper. Google scholar takes this as the list of reference and hence, produces wrong results here.
- Its hard to tell different people with the same name apart.
It seems that one purpose of Google scholar citation is to improve on all three parts by letting the people doing the work (what is not a bad idea). Note that similar services exist for several disciplines. For mathematics there is MathSciNet which is maintained by the Amarican Mathematical Society. They really put effort in their database with respect to all three point above. Moreover, there is the Zentralblatt.
While it is really nice to have a search engine for science which also finds preprints and not only published material, this service makes it even easier to produce stupid things like “impact statistics”, “productivity indices”. On the positive side it is really helpful to see what paper do cite another one (this makes it easier to find the source of some idea or original proofs). On the downside, this service may lead to the impression that research output (if there is such thing) can be measured by numbers in any way. I do not think that this is true. In the end: What does it say that one paper has been cited 50 times and another one five times? Think about it. It does not mean that the first paper has been read more, had spread more interest, had more impact or is even better in any way. It plainly says that the first paper has been cited ten times more. That’s it. There are papers which introduced great ideas in an awkward way and then a follow-up explained the same idea better. Hence, the second one gets cited although the first is equally important… There are weird paper which get cited a lot because everyone does. There are paper which many people did not read but cite because the have heard or read that this paper is the reference for something.
By the way: My profile is here.
Like this:
Like Loading...
November 11, 2011
If mathematicians study an object they sometimes study the properties of the object. E.g. you can study the object “2” (that is the natural number 2) by saying that it denotes a certain cardinality, name the cardinality of two things. Well, that does not give you anything new. However, mathematicians often study objects by studying what either the object can do to other things or what other things can do the objects. In our (quite silly) example of the number two you see that you can use 2 to add it to other numbers (or, conversely, you can add other numbers to 2). One may feel that this does not add any insight but for objects which are more complicated (e.g. like sets of other objects or sets of these) this view can be particularly useful: If you consider a normed vector space
, you can ask either for properties of the space itself (e.g. completeness) or you ask how one can map the objects of the space to other sets. Using the underlying ingredients to give more structure on the mappings under consideration this gives new constructions: In the case of a normed vector space you have the underlying field
. This leads to mappings from
to
. You also have a linear structure on
. This leads to linear mappings from
to
. Finally, there is a norm on
and if there is an absolute value on
this leads you to bounded linear mappings from
to
. Any guess what: These objects form the so-called dual space.
Dual spaces of (say) Banach spaces are telling you new things about the space itself and it seems to me that if you want to understand the nature of a Banach space you need to know the dual space (or a space of which your space is the dual, i.e. a pre-dual space).
This post shall not be about duality in general but on dual spaces of a particular type of spaces, namely of spaces which consist of continuous functions. While these spaces usually form nice Banach spaces, they are not as simple as Lebesgue spaces or Sobolev spaces in that there are no Hilbert spaces among them and even no reflexive spaces.
1. What spaces of continuous functions?
For sets
and
one can study continuous functions
as soon as both
and
are topological spaces. For these functions to form a vector space we also want that
is a vector space over a field
. To get a normed space of continuous functions we endow
with a norm
and try to define a norm of
as

But wait: The supremum may not exist. We better add the assumption that
is bounded. Now this gives us a normed space and we denote:

Together with the norm
this gives a Banach space as soon as
is a Banach space. In the case that
is compact, we can omit the condition of boundedness and obtain

and again get a Banach space as soon as
is a Banach space.
Instead of assuming compactness of the whole space
we could shift this property to all functions in this space. This leads to

This does not lead to a Banach space in the case that
is not compact. Indeed, there is a last possibility by considering the closure of
with respect to
leading to the space

Informally one says that this are the continuous function “that vanish on the boundary of
”.
All these spaces admit dual spaces and all these spaces are described by some kind of “Riesz representation theorem”. These dual spaces consists of some kind of measures and the dual pairing is then defined as integrating the continuous functions with respect to these measures. Here I want to clarify (at least for myself) what these dual spaces are and how they are related.
The case in which
(i.e. the real or complex numbers) is a little easier and we omit the argument
in the spaces in this case:
,
,…
2. The dual of
for normal
We deal with real or complex valued functions on a set
and assume that
in a normal topological space (also called
-space), that is the points in
are all closed and two disjoint closed sets can be separated by open neighborhoods. Of course a metric space
is a normal topological space.
2.1. Regular bounded finitely additive measures
To see what the dual space of
could be, we have to think of integrating a bounded continuous functions against a measure
. Here the topological structure of
comes into play because it naturally leads to a
-algebra on
, namely the one which is generated by the closed sets in
(or equivalently be the open sets) which is also called the Borel algebra
on
. On a
-algebra we can define a set function (which shall be a measure) by mapping the elements in
to real (or complex) numbers. Such a set function
is called bounded if
for all
and finitely additive if for any finite number of disjoint set
it holds that

The set of all bounded and finitely additive set functions (nowadays often called bounded finitely additive measures) is denoted by
. Equipped with the variational norm
,
becomes a Banach space. The space
is a fairly large space of measures: Note that we did neither assume that the member are countably additive nor that the values
shall be non-negative. However, we assumed the all values
are finite which excludes, e.g. the Lebesgue-measure on
(while weighted Lebesgue measures can be allowed if the weight is integrable). The space
is slightly too large to be the dual of
and we need another restriction. We call an element
regular, if for any
and any
there exists a closed set
and an open set
such that for every
it holds that
. The space of regular bounded finitely additive measures is denoted by
and is closed subspace of 
2.2. Intgration with respect to
Now we can define the integral of a bounded continuous function
with respect to a regular bounded finitely additive measure
as follows: Since
is a bounded set in
we can cover is with open sets
with diameter less than a given
. Define
and
. If
is not empty, choose
(otherwise choose
). Since
is continuous, the sets
are in
and we can define

which is an
-approximation to
and indeed,
is the uniform limit of the
. For each
, the integral is naturally defined as

and the limit of this expression is used as integral for
.
2.3. Duality
Since it holds that

every
defines a continuous linear functional an
. This shows that the dual space of
contains
. Indeed both spaces are equal:
Theorem 1 For a normal topological space
it holds that

The proof is lengthy and technical and the prime reference is “Linear Operators” by Dunford and Schwartz (Theorem IV.6.2).
3. The dual of
for compact Hausdorff
Now we assume that
is a compact Hausdorff space (think of closed and bounded subsets of
). Due to compactness we can omit the boundedness assumption on our continuous functions (they are bounded anyway).
3.1. Regular bounded countably additive measures
We move from finitely additive measures to countably additive measure that is, (1) holds for countably many disjoint sets
. Together with finiteness and regularity we arrive at the space
of regular bounded countably additive measures. In our case, where
is also a topological space and the
-algebra is the Borel algebra this space is also called the space of regular Borel measures} or Radon measure} and often denoted by
.
3.2. Duality
Here we have the following representation theorem:
Theorem 2 If
is compact and Hausdorff it holds that

By

every
determines an element in
. Moreover, it is clear that
. It remains to show that every
determines a
such that for all
it holds that
.
4. The dual of
for locally compact Hausdorff
This case is basically the same as for
with compact Hausdorff
. The absence of compactness is compensated by the fact that the function “vanish on the boundary”. Well, “vanishing on the boundary” really means, that the function can be approximated by continuous function with compact support (in the
-norm) and hence, the result do not differ from the previous one:
Theorem 3 If
is locally compact and Hausdorff it holds that

Final remarks:
- It turned out that the situation is easier than I expected. Basically there are two cases: Bounded continuous functions on a normal set
. Here we get the regular bounded and finitely additive measures as the dual space. The case are continuous functions on a compact space or continuous function which “vanish at the boundary” on a locally compact space. In both we arrive at cases regular bounded countably additive measure aka Radon measures.
Like this:
Like Loading...
November 2, 2011
How many samples are needed to reconstruct a sparse signal?
Well, there are many, many results around some of which you probably know (at least if you are following this blog or this one). Today I write about a neat result which I found quite some time ago on reconstruction of nonnegative sparse signals from a semi-continuous perspective.
1. From discrete sparse reconstruction/compressed sensing to semi-continuous
The basic sparse reconstruction problem asks the following: Say we have a vector
which only has
non-zero entries and a fat matrix
(i.e.
) and consider that we are given measurements
. Of course, the system
is underdetermined. However, we may add a little more prior knowledge on the solution and ask: Is is possible to reconstruct
from
if we know that the vector
is sparse? If yes: How? Under what conditions on
,
,
and
? This question created the expanding universe of compressed sensing recently (and this universe is expanding so fast that for sure there has to be some dark energy in it). As a matter of fact, a powerful method to obtain sparse solutions to underdetermined systems is
-minimization a.k.a. Basis Pursuit on which I blogged recently: Solve

and the important ingredient here is the
-norm of the vector in the objective function.
In this post I’ll formulate semi-continuous sparse reconstruction. We move from an
-vector
to a finite signed measure
on a closed interval (which we assume to be
for simplicty). We may embed the
-vectors into the space of finite signed measures by choosing
points
,
from the interval
and build
with the point-masses (or Dirac measures)
. To a be a bit more precise, we speak about the space
of Radon measures on
, which are defined on the Borel
-algebra of
and are finite. Radon measures are not very scary objects and an intuitive way to think of them is to use Riesz representation: Every Radon measure arises as a continuous linear functional on a space of continuous functions, namely the space
which is the closure of the continuous functions with compact support in
with respect to the supremum norm. Hence, Radon measures work on these functions as
. It is also natural to speak of the support
of a Radon measure
and it holds for any continuous function
that

An important tool for Radon measures is the Hahn-Jordan decomposition which decomposes
into a positive part
and a negative part
, i.e.
and
are non-negative and
. Finally the variation of a measure, which is

provides a norm on the space of Radon measures.
Example 1 For the measure
one readily calculates that

and hence

In this sense, the space of Radon measures provides a generalization of
.
We may sample a Radon measure
with
linear functionals and these can be encoded by
continuous functions
as

This sampling gives a bounded linear operator
. The generalization of Basis Pursuit is then given by

This was introduced and called “Support Pursuit” in the preprint Exact Reconstruction using Support Pursuit by Yohann de Castro and Frabrice Gamboa.
More on the motivation and the use of Radon measures for sparsity can be found in Inverse problems in spaces of measures by Kristian Bredies and Hanna Pikkarainen.
2. Exact reconstruction of sparse nonnegative Radon measures
Before I talk about the results we may count the degrees of freedom a sparse Radon measure has: If
with some
than
is defined by the
weights
and the
positions
. Hence, we expect that at least
linear measurements should be necessary to reconstruct
. Surprisingly, this is almost enough if we know that the measure is nonnegative! We only need one more measurement, that is
and moreover, we can take fairly simple measurements, namely the monomials:
(with the convention that
). This is shown in the following theorem by de Castro and Gamboa.
Theorem 1 Let
with
,
and let
,
be the monomials as above. Define
. Then
is the unique solution of the support pursuit problem, that is of

Proof: The following polynomial will be of importance: For a constant
define

The following properties of
will be used:
-
for
-
has degree
and hence, is a linear combination of the
,
, i.e.
.
- For
small enough it holds for
that
.
Now let
be a solution of (SP). We have to show that
. Due to property 2 we know that

Due to property 1 and non-negativity of
we conclude that

Moreover, by Lebesgue’s decomposition we can decompose
with respect to
such that

and
is singular with respect to
. We get

and we conclude that
and especially
. This shows that
is a solution to
. It remains to show uniqueness. We show the following: If there is a
with support in
such that
, then
. To see this, we build, for any
, the sets
![\displaystyle \Omega_r = [-1,1]\setminus \bigcup_{i=1}^s ]x_i-r,x_i+r[. \displaystyle \Omega_r = [-1,1]\setminus \bigcup_{i=1}^s ]x_i-r,x_i+r[.](http://s0.wp.com/latex.php?latex=%5Cdisplaystyle++%5COmega_r+%3D+%5B-1%2C1%5D%5Csetminus+%5Cbigcup_%7Bi%3D1%7D%5Es+%5Dx_i-r%2Cx_i%2Br%5B.+&bg=ffffff&fg=000000&s=0)
and assume that there exists
such that
(
denoting the restriction of
to
). However, it holds by property 3 of
that

and consequently

which is a contradiction. Hence,
for all
and this implies
. Since
has its support in
we conclude that
. Hence the support of
is exactly
. and since
and hence
. This can be written as a Vandermonde system

which only has the zero solution, giving
. 
3. Generalization to other measurements
The measurement by monomials may sound a bit unusual. However, de Castro and Gamboa show more. What really matters here is that the monomials for a so-called Chebyshev-System (or Tchebyscheff-system or T-system – by the way, have you ever tried to google for a T-system?). This is explained, for example in the book “Tchebycheff Systems: With Applications in Analysis and Statistics” by Karlin and Studden. A T-system on
is simply a set of
functions
such that any linear combination of these functions has at most
zeros. These systems are called after Tchebyscheff since they obey many of the helpful properties of the Tchebyscheff-polynomials.
What is helpful in our context is the following theorem of Krein:
Theorem 2 (Krein) If
is a T-system for
,
and
are in the interior of
, then there exists a linear combination
which is non-negative and vanishes exactly the the point
.
Now consider that we replace the monomials in Theorem~1 by a T-system. You recognize that Krein’s Theorem allows to construct a “generalized polynomial” which fulfills the same requirements than the polynomial
is the proof of Theorem~1 as soon as the constant function 1 lies in the span of the T-system and indeed the result of Theorem~1 is also valid in that case.
4. Exact reconstruction of
-sparse nonnegative vectors from
measurements
From the above one can deduce a reconstruction result for
-sparse vectors and I quote Theorem 2.4 from Exact Reconstruction using Support Pursuit:
Theorem 3 Let
,
,
be integers such that
and let
be a complete T-system on
(that is,
is a T-system on
for all
). Then it holds: For any distinct reals
and
defined as

Basis Pursuit recovers all nonnegative
-sparse vectors
.
5. Concluding remarks
Note that Theorem~3 gives a deterministic construction of a measurement matrix.
Also note, that nonnegativity is crucial in what we did here. This allowed (in the monomial case) to work with squares and obtain the polynomial
in the proof of Theorem~1 (which is also called “dual certificate” in this context). This raises the question how this method can be adapted to all sparse signals. One needs (in the monomial case) a polynomial which is bounded by 1 but matches the signs of the measure on its support. While this can be done (I think) for polynomials it seems difficult to obtain a generalization of Krein’s Theorem to this case…
Like this:
Like Loading...