Let ${\Omega}$ be a compact subset of ${{\mathbb R}^d}$ and consider the space ${C(\Omega)}$ of continuous functions ${f:\Omega\rightarrow {\mathbb R}}$ with the usual supremum norm. The Riesz Representation Theorem states that the dual space of ${C(\Omega)}$ is in this case the space of all signed Radon measures on ${\Omega}$, denoted by ${\mathfrak{M}(\Omega)}$, and the canonical duality pairing is given by

$\displaystyle \langle\mu,f\rangle = \mu(f) = \int_\Omega fd\mu.$

We can equip ${\mathfrak{M}(\Omega)}$ with the usual notion of weak* convergence, which reads as

$\displaystyle \mu_n\rightharpoonup^* \mu\ \iff\ \text{for every}\ f\in C(\Omega):\ \mu_n(f)\rightarrow\mu(f).$

We call a measure ${\mu}$ positive if ${f\geq 0}$ implies that ${\mu(f)\geq 0}$. If a positive measure satisfies ${\mu(1)=1}$ (i.e. it integrates the constant function with unit value to one), we call it a probability measure, and we denote by ${\Delta\subset \mathfrak{M}(\Omega)}$ the set of all probability measures.

Example 1 Every non-negative integrable function ${\phi:\Omega\rightarrow{\mathbb R}}$ with ${\int_\Omega \phi(x)dx = 1}$ induces a probability measure via

$\displaystyle f\mapsto \int_\Omega f(x)\phi(x)dx.$

Quite different probability measures are the ${\delta}$-measures: For every ${x\in\Omega}$ there is the ${\delta}$-measure at this point, defined by

$\displaystyle \delta_x(f) = f(x).$
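
As a small illustration of how the weak* topology treats these point masses: if ${x_n\rightarrow x}$ in ${\Omega}$, then for every ${f\in C(\Omega)}$ continuity of ${f}$ gives

$\displaystyle \delta_{x_n}(f) = f(x_n)\rightarrow f(x) = \delta_x(f),$

so ${\delta_{x_n}\rightharpoonup^*\delta_x}$, although in the dual norm we have ${\|\delta_{x_n}-\delta_x\|_{\mathfrak{M}}=2}$ whenever ${x_n\neq x}$.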

In some sense, the set ${\Delta}$ of probability measures is the generalization of the standard simplex in ${{\mathbb R}^n}$ to infinite dimensions (in fact uncountably many dimensions): The ${\delta}$-measures are the extreme points of ${\Delta}$, and since ${\Delta}$ is convex and compact in the weak* topology, the Krein-Milman Theorem states that ${\Delta}$ is the weak*-closure of the set of convex combinations of the ${\delta}$-measures, just as the standard simplex in ${{\mathbb R}^n}$ is the convex hull of the canonical basis vectors of ${{\mathbb R}^n}$.
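
The approximation by convex combinations of ${\delta}$-measures can be checked numerically in a toy case. The following is only a minimal sketch under assumptions of my own: ${\Omega=[0,1]}$, the density ${\phi\equiv 1}$, the uniform convex combinations ${\mu_n = \tfrac1n\sum_{k=1}^n\delta_{x_k}}$ with midpoints ${x_k=(k-\tfrac12)/n}$, and the single test function ${f=\cos}$. The helper names `empirical_measure` and `pair` are hypothetical. It illustrates ${\mu_n(f)\rightarrow\int_0^1 f(x)dx}$ for this one ${f}$ and is of course no proof of weak* convergence.

```python
import math

def empirical_measure(n):
    """Atoms and weights of mu_n = (1/n) * sum_k delta_{x_k}, x_k = (k - 1/2)/n."""
    return [((k - 0.5) / n, 1.0 / n) for k in range(1, n + 1)]

def pair(measure, f):
    """Duality pairing mu(f) for a finite convex combination of delta-measures."""
    return sum(w * f(x) for x, w in measure)

f = math.cos            # assumed test function in C([0, 1])
exact = math.sin(1.0)   # integral of cos over [0, 1] against the density phi = 1

# Error |mu_n(f) - integral| for increasingly fine convex combinations
errors = [abs(pair(empirical_measure(n), f) - exact) for n in (10, 100, 1000)]
print(errors)
```

The printed errors should shrink as ${n}$ grows, and each ${\mu_n}$ is a probability measure since its weights are non-negative and sum to one.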

Remark 1 If we drop the positivity assumption and form the set

$\displaystyle O = \{\mu\in\mathfrak{M}(\Omega)\ :\ |f|\leq 1\implies |\mu(f)|\leq 1\}$

we have that ${O}$ is the weak*-closure of the set of convex combinations of the measures ${\pm\delta_x}$ (${x\in\Omega}$). Hence, ${O}$ resembles the hyper-octahedron (aka cross polytope or ${\ell^1}$-ball).

I’ve taken the above (with almost the same notation) from the book “A Course in Convexity” by Alexander Barvinok. I was intrigued to find (in Chapter III, Section 9) something which reads as a nice glimpse of semi-continuous compressed sensing. Proposition 9.4 reads as follows:

Proposition 1 Let ${g,f_1,\dots,f_m\in C(\Omega)}$, ${b\in{\mathbb R}^m}$ and suppose that the subset ${B}$ of ${\Delta}$ consisting of the probability measures ${\mu}$ such that for ${i=1,\dots,m}$

$\displaystyle \int f_id\mu = b_i$

is not empty. Then there exist ${\mu^+,\mu^-\in B}$ such that

1. ${\mu^+}$ and ${\mu^-}$ are convex combinations of at most ${m+1}$ ${\delta}$-measures, and
2. it holds that for all ${\mu\in B}$ we have

$\displaystyle \mu^-(g)\leq \mu(g)\leq \mu^+(g).$

In terms of compressed sensing this says: Among all probability measures which comply with the data ${b}$ measured by ${m}$ linear measurements, there are two extremal ones (extremal with respect to the value ${\mu(g)}$) which each consist of at most ${m+1}$ ${\delta}$-measures.
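
For ${m=1}$ the proposition suggests a naive way to compute the extremal measures: since at most two ${\delta}$-measures are needed, one can simply search over all pairs of points. The following is a minimal sketch under assumptions of my own, not an algorithm from the book: ${\Omega}$ is replaced by a finite grid in ${[0,1]}$ (a finite set is compact, so the proposition applies to the grid itself), the measurement is the mean, ${f_1(x)=x}$ with ${b_1=1/2}$, and the objective is the second moment, ${g(x)=x^2}$. The function name `extremal_two_atom` is hypothetical.

```python
from itertools import combinations

def extremal_two_atom(grid, f, g, b, tol=1e-12):
    """Brute-force the case m = 1: search all measures w*delta_x + (1-w)*delta_y
    on `grid` with w*f(x) + (1-w)*f(y) = b and return the ones with the smallest
    and the largest value of mu(g), each as (mu(g), [(atom, weight), ...])."""
    candidates = []
    # Single atoms: delta_x is feasible iff f(x) = b.
    for x in grid:
        if abs(f(x) - b) <= tol:
            candidates.append((g(x), [(x, 1.0)]))
    # Two atoms: the constraint determines the weight w uniquely if f(x) != f(y).
    for x, y in combinations(grid, 2):
        fx, fy = f(x), f(y)
        if abs(fx - fy) <= tol:
            # If f(x) = f(y) = b, every mixture is feasible, but mu(g) is linear
            # in w, so its extremes are the single atoms handled above.
            continue
        w = (b - fy) / (fx - fy)
        if -tol <= w <= 1.0 + tol:
            w = min(1.0, max(0.0, w))
            value = w * g(x) + (1.0 - w) * g(y)
            candidates.append((value, [(x, w), (y, 1.0 - w)]))
    if not candidates:
        raise ValueError("no feasible measure with at most two atoms on this grid")
    return min(candidates, key=lambda c: c[0]), max(candidates, key=lambda c: c[0])

grid = [k / 100 for k in range(101)]
mu_minus, mu_plus = extremal_two_atom(grid, lambda x: x, lambda x: x * x, 0.5)
print(mu_minus)
print(mu_plus)
```

In this toy setting the smallest second moment should be attained by the point mass at ${1/2}$ and the largest by the equal mixture of the point masses at ${0}$ and ${1}$, in line with Jensen's inequality and with ${x^2\leq x}$ on ${[0,1]}$. For larger ${m}$ the same idea becomes a linear program over the weights on the grid, and searching over all ${(m+1)}$-tuples quickly gets expensive.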

Note that something similar to “support-pursuit” does not work here: The minimization problem ${\min_{\mu\in \Delta,\ \mu(f_i)=b_i}\|\mu\|_{\mathfrak{M}}}$ does not make much sense, since every probability measure has norm one, i.e. ${\|\mu\|_{\mathfrak{M}}=1}$ for all ${\mu\in B}$.