Let be a compact subset of
and consider the space
of continuous functions
with the usual supremum norm. The Riesz Representation Theorem states that the dual space of
is in this case the set of all Radon measures, denoted by
and the canonical duality pairing is given by
We can equip with the usual notion of weak* convergence which read as
We call a measure positive if
implies that
. If a positive measure satisfies
(i.e. it integrates the constant function with unit value to one), we call it a probability measure and we denote with
the set of all probability measures.
Example 1 Every non-negative integrable function
with
induces a probability measure via
Quite different probability measures are the
-measures: For every
there is the
-measure at this point, defined by
In some sense, the set of probability measure is the generalization of the standard simplex in
to infinite dimensions (in fact uncountably many dimensions): The
-measures are the extreme points of
and since the set
is compact in the weak* topology, the Krein-Milman Theorem states that
is the weak*-closure of the set of convex combinations of the
-measures – similarly as the standard simplex in
is the convex combination of the canonical basis vectors of
.
Remark 1 If we drop the positivity assumption and form the set
we have the
is the set of convex combinations of the measures
(
). Hence,
resembles the hyper-octahedron (aka cross polytope or
-ball).
I’ve taken the above (with almost similar notation) from the book “ A Course in Convexity” by Alexander Barvinok. I was curious to find (in Chapter III, Section 9) something which reads as a nice glimpse on semi-continuous compressed sensing: Proposition 9.4 reads as follows
Proposition 1 Let,
and suppose that the subset
of
consisting of the probability measures
such that for
![]()
is not empty. Then there exists
such that
and
are convex combinations of at most
![]()
-measures, and
- it holds that for all
we have
In terms of compressed sensing this says: Among all probability measures which comply with the data measured by
linear measurements, there are two extremal ones which consists of
-measures.
Note that something similar to “support-pursuit” does not work here: The minimization problem does not make much sense, since
for all
.