This is the follow-up to this post on the completion of the space of Radon measures with respect to the transport norm. In that post we have seen that

\displaystyle \mathrm{KR}_0(X)^*\cong \mathrm{Lip}_0(X),

i.e. that the dual of the completion of the space of Radon measures with zero total mass with respect to the dual Lipschitz norm

\displaystyle \|\mu\|_{\mathrm{Lip}_0}^* = \sup\{\int f{\mathrm d}\mu\ :\ \mathrm{Lip}(f)\leq 1,\ f(e)=0\}

where {e\in X} is some base point in the metric space {X}.

Recall that on a compact metric space {(X,d)} we have {\mathfrak{M}(X)}, the space of Radon measures on {X}. The Kantorovich-Rubinstein norm for measures is defined by

\displaystyle \|\mu\|_{\mathrm{KR}} = \sup\{\int f{\mathrm d} \mu\ :\ \mathrm{Lip}(f)\leq 1,\ \|f\|_\infty\leq 1\}.

Theorem 1 For a compact metric space {(X,d)} it holds that

\displaystyle \mathrm{KR}(X)^* \cong \mathrm{Lip}(X).

Proof: We give an explicit identification for {\mathrm{KR}(X)^*} as follows:

  1. Define a Lipschitz function from an element of {\mathrm{KR}(X)^*}: For every {\phi\in\mathrm{KR}(X)^*} and {x\in X} we set

    \displaystyle (T\phi)(x) = \phi(\delta_x).

    Now we check, by linearity and continuity of {\phi}, that for any {x\in X} it holds that

    \displaystyle |(T\phi)(x)| = |\phi(\delta_x)|\leq \|\phi\|\|\delta_x\|_{\mathrm{KR}} = \|\phi\|.

    This shows that {T\phi:X\rightarrow {\mathbb R}} is a bounded function. Similarly for all {x,y\in X} we have

    \displaystyle |(T\phi)(x)-(T\phi)(y)| = |\phi(\delta_x-\delta_y)|\leq \|\phi\|\|\delta_x-\delta_y\|_{\mathrm{KR}}\leq \|\phi\|\min(2,d(x,y))\leq \|\phi\|d(x,y).

    This shows that {(T\phi)} is actually Lipschitz continuous, and moreover, that {T:\mathrm{KR}(X)^*\rightarrow\mathrm{Lip}(X)} is continuous with norm {\|T\|\leq 1}.

  2. Define an element in {\mathrm{KR}(X)^*} from a Lipschitz function: We just set for {f\in\mathrm{Lip}(X)} and {\mu\in\mathfrak{M}(X)}

    \displaystyle (Sf)(\mu) = \int_Xf{\mathrm d}\mu.

    By the definition of the {\mathrm{KR}}-norm we have

    \displaystyle |(Sf)(\mu)|\leq \|f\|_{\mathrm{Lip}}\|\mu\|_{\mathrm{KR}},

    which shows that {Sf} extends to a continuous linear functional {Sf:\mathrm{KR}(X)\rightarrow{\mathbb R}}, i.e. {Sf\in\mathrm{KR}(X)^*}, and we also have that {\|S\|\leq 1}.

  3. Check that {S} and {T} invert each other: We begin with {TS}: for {x\in X} and {f\in\mathrm{Lip}(X)} observe that

    \displaystyle T(Sf)(x) = Sf(\delta_x) = \int_X f{\mathrm d}\delta_x = f(x)

    i.e. {TS} is the identity on {\mathrm{Lip}(X)}. Conversely, for any {\phi\in\mathrm{KR}(X)^*} and {x\in X} we check

    \displaystyle S(T\phi)(\delta_x) = \int_X T\phi{\mathrm d}\delta_x = (T\phi)(x) = \phi(\delta_x).

    By density of the linear combinations of Dirac measures in {\mathrm{KR}(X)} we conclude that indeed {ST\phi = \phi}, i.e. {ST} is the identity on {\mathrm{KR}(X)^*}.
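
The norm identity {\|\delta_x-\delta_y\|_{\mathrm{KR}} = \min(2,d(x,y))} used in step 1 can be checked numerically. The following sketch (my own setup, not from the proof) works on {({\mathbb R},|\cdot|)} and searches over the truncated cone functions {f_c(t) = \mathrm{clip}(t-c,-1,1)}, a 1-Lipschitz family with {\|f_c\|_\infty\leq 1} that contains a maximizer for this particular measure:

```python
# Numerical sanity check (not part of the proof): on (R, |.|) approximate
# ||delta_x - delta_y||_KR = sup { f(x) - f(y) : Lip(f) <= 1, ||f||_inf <= 1 }
# by a grid search over the functions f_c(t) = clip(t - c, -1, 1).

def clip(t, lo=-1.0, hi=1.0):
    return max(lo, min(hi, t))

def kr_norm_dirac_difference(x, y, num_shifts=2001):
    """Approximate ||delta_x - delta_y||_KR by searching over shifts c."""
    lo, hi = min(x, y) - 2.0, max(x, y) + 2.0
    best = 0.0
    for k in range(num_shifts):
        c = lo + (hi - lo) * k / (num_shifts - 1)
        best = max(best, abs(clip(x - c) - clip(y - c)))
    return best
```

For nearby points this returns their distance, while for far-apart points the value saturates at {2}, matching {\|\delta_x-\delta_y\|_{\mathrm{KR}} = \min(2,d(x,y))}.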


Let {\Omega} be a compact subset of {{\mathbb R}^d} and consider the space {C(\Omega)} of continuous functions {f:\Omega\rightarrow {\mathbb R}} with the usual supremum norm. The Riesz Representation Theorem states that the dual space of {C(\Omega)} is in this case the set of all Radon measures, denoted by {\mathfrak{M}(\Omega)} and the canonical duality pairing is given by

\displaystyle  \langle\mu,f\rangle = \mu(f) = \int_\Omega fd\mu.

We can equip {\mathfrak{M}(\Omega)} with the usual notion of weak* convergence, which reads as

\displaystyle  \mu_n\rightharpoonup^* \mu\ \iff\ \text{for every}\ f\in C(\Omega):\ \mu_n(f)\rightarrow\mu(f).

We call a measure {\mu} positive if {f\geq 0} implies that {\mu(f)\geq 0}. If a positive measure satisfies {\mu(1)=1} (i.e. it integrates the constant function with unit value to one), we call it a probability measure and we denote with {\Delta\subset \mathfrak{M}(\Omega)} the set of all probability measures.

Example 1 Every non-negative integrable function {\phi:\Omega\rightarrow{\mathbb R}} with {\int_\Omega \phi(x)dx = 1} induces a probability measure via

\displaystyle  f\mapsto \int_\Omega f(x)\phi(x)dx.

Quite different probability measures are the {\delta}-measures: For every {x\in\Omega} there is the {\delta}-measure at this point, defined by

\displaystyle  \delta_x(f) = f(x).
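
To make the functional point of view concrete, here is a minimal sketch (the names and the example density are my own choices): a measure acts on functions, and the probability property {\mu(1)=1} can be checked numerically.

```python
# Measures as functionals f -> mu(f); a minimal sketch with made-up names.

def dirac(x):
    """The delta-measure at x: f -> f(x)."""
    return lambda f: f(x)

def density_measure(phi, a=0.0, b=1.0, n=10_000):
    """Measure induced by a density phi on [a, b], via a midpoint Riemann sum."""
    h = (b - a) / n
    grid = [a + (i + 0.5) * h for i in range(n)]
    return lambda f: sum(f(x) * phi(x) for x in grid) * h

one = lambda x: 1.0
mu = density_measure(lambda x: 2.0 * x)  # density 2x on [0, 1] integrates to one
```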

In some sense, the set {\Delta} of probability measures is the generalization of the standard simplex in {{\mathbb R}^n} to infinite dimensions (in fact uncountably many dimensions): The {\delta}-measures are the extreme points of {\Delta} and since the set {\Delta} is compact in the weak* topology, the Krein-Milman Theorem states that {\Delta} is the weak*-closure of the set of convex combinations of the {\delta}-measures – similarly as the standard simplex in {{\mathbb R}^n} is the set of convex combinations of the canonical basis vectors of {{\mathbb R}^n}.

Remark 1 If we drop the positivity assumption and form the set

\displaystyle  O = \{\mu\in\mathfrak{M}(\Omega)\ :\ |f|\leq 1\implies |\mu(f)|\leq 1\}

then {O} is the weak*-closure of the set of convex combinations of the measures {\pm\delta_x} ({x\in\Omega}). Hence, {O} resembles the hyper-octahedron (aka cross polytope or {\ell^1}-ball).

I’ve taken the above (with similar notation) from the book “A Course in Convexity” by Alexander Barvinok. I was curious to find (in Chapter III, Section 9) something which reads as a nice glimpse of semi-continuous compressed sensing. Proposition 9.4 reads as follows:

Proposition 1 Let {g,f_1,\dots,f_m\in C(\Omega)}, {b\in{\mathbb R}^m} and suppose that the subset {B} of {\Delta} consisting of the probability measures {\mu} such that for {i=1,\dots,m}

\displaystyle  \int f_id\mu = b_i

is not empty. Then there exist {\mu^+,\mu^-\in B} such that

  1. {\mu^+} and {\mu^-} are convex combinations of at most {m+1} {\delta}-measures, and
  2. it holds that for all {\mu\in B} we have

    \displaystyle  \mu^-(g)\leq \mu(g)\leq \mu^+(g).

In terms of compressed sensing this says: Among all probability measures which comply with the data {b} measured by {m} linear measurements, there are two extremal ones which consist of at most {m+1} {\delta}-measures.
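
A discretized sketch of the proposition for {m=1} (the grid, the moment constraint and all names are my own): the feasible set is a linear program with two equality constraints (total mass and one moment), so an optimal basic solution has at most {m+1=2} atoms, and we can find {\mu^+} by brute force over all pairs of atoms.

```python
# Brute-force sketch of the proposition for m = 1 on a grid: maximize mu(g)
# over probability measures mu on `points` subject to mu(id) = b.
# Optimal basic solutions of this LP have at most m + 1 = 2 atoms.

def extremal_measure(points, g, b):
    """Return (value, atoms, weights) of the maximizing two-atom measure."""
    best = None
    for i, x in enumerate(points):
        for y in points[i + 1:]:
            w = (y - b) / (y - x)  # weight at x so that w*x + (1-w)*y = b
            if -1e-12 <= w <= 1.0 + 1e-12:
                val = w * g(x) + (1.0 - w) * g(y)
                if best is None or val > best[0]:
                    best = (val, (x, y), (w, 1.0 - w))
    return best

grid = [k / 10 for k in range(11)]
val, atoms, weights = extremal_measure(grid, lambda t: t * t, b=0.3)
```

For the convex function {g(t)=t^2} the maximizer pushes all mass to the endpoints {0} and {1}: a two-atom measure, as the proposition predicts.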

Note that something similar to “support-pursuit” does not work here: The minimization problem {\min_{\mu\in B}\|\mu\|_{\mathfrak{M}}} does not make much sense, since {\|\mu\|_{\mathfrak{M}}=1} for all {\mu\in B}.

There are still some things left that I wanted to add about the issue of weak-* convergence in {L^\infty}, non-linear distortions and Young measures. The first is that Young measures are not able to describe all effects of weak-* convergence; namely, the notion does not handle concentrations properly. The second is that there is an alternative approach based on the {\chi}-function, which I also find graphically appealing.

1. Concentrations and Young measures

One can distinguish several “modes” that a sequence of functions can obey: In this blog entry, Terry Tao introduces four more modes apart from oscillation:

  1. escape to horizontal infinity
  2. escape to width infinity
  3. escape to vertical infinity
  4. typewriter sequence

1.1. Escape to horizontal infinity

This mode is most easily described by the sequence {f_n = \chi_{[n,n+1]}}, i.e. the characteristic functions of an interval of unit length which escapes to infinity. Obviously, this sequence does not converge in any {L^p({\mathbb R})} norm, and its weak convergence depends on {p}:

For {p>1} the sequence does converge weakly (weakly-* for {p=\infty}) to zero. This can be seen as follows: Assume that for some {\epsilon>0}, some non-negative {g\in L^q} (with {1/p + 1/q = 1}) and infinitely many {n} we have {\epsilon \leq \int g f_n = \int_n^{n+1} g}. Then Hölder's inequality gives {\epsilon^q \leq \int_n^{n+1} |g|^q} for infinitely many {n}. But this contradicts the fact that {g\in L^q}.

For {p=1} the sequence does not converge weakly to zero, as can be seen by testing it with the function {g \equiv 1}, and it also does not converge weakly at all (test with {g = \sum_n (-1)^n\chi_{[n,n+1]}} and observe that the dual pairings do not converge).
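
These pairings are easy to watch numerically (the test functions below are my own choices):

```python
import math

# Pairings <f_n, g> = int_n^{n+1} g for f_n = chi_[n, n+1], via midpoint sums.

def pairing(g, n, steps=10_000):
    h = 1.0 / steps
    return sum(g(n + (i + 0.5) * h) for i in range(steps)) * h

# g in L^q (here: exp(-x)): the pairings vanish, consistent with f_n -> 0 weakly.
decay = [pairing(lambda x: math.exp(-x), n) for n in range(6)]

# g = 1 and the alternating g exhibit the failure of weak convergence in L^1.
constant = [pairing(lambda x: 1.0, n) for n in range(4)]
alternating = [pairing(lambda x: (-1.0) ** math.floor(x), n) for n in range(4)]
```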

However, this type of convergence does not occur on bounded domains and hence cannot be treated with Young measures as they have been introduced in my previous entry.

1.2. Escape to width infinity

The paradigm for this mode of convergence is {f_n = \frac{1}{n}\chi_{[0,n]}}. This sequence even converges strongly to zero in {L^p} for {p>1}, but not strongly in {L^1}; it does not converge weakly in {L^1} to zero either, since each {f_n} has integral one (test with {g\equiv 1}). This mode needs, similar to the previous mode, an unbounded domain.

1.3. Escape to vertical infinity

The prime example, bounded in {L^2(]-1,1[)}, for this mode is {f_n = \sqrt{n}\chi_{[-1/n,1/n]}}. By testing with continuous functions (which suffices by density) one sees that the weak limit is zero.

If one wants to assign some limit to this sequence {f_n} one can say that the measure {f_n^2\mathfrak{L}} does converge weakly in the sense of measures to {2\delta_0}, i.e. twice the point-mass in zero.
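
Numerically (the test function and the discretization are my own choices) one can watch the pairings with {f_n} vanish while the pairings with {f_n^2} concentrate:

```python
import math

def avg(fun, lo, hi, steps=20_000):
    """Midpoint Riemann sum of fun over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(fun(lo + (i + 0.5) * h) for i in range(steps)) * h

def f(n, x):
    return math.sqrt(n) if -1.0 / n <= x <= 1.0 / n else 0.0

g = lambda x: 1.0 / (1.0 + x * x)  # a bounded continuous test function

# <f_n, g> -> 0 (the weak limit is zero), while <f_n^2, g> -> 2 g(0):
pair_lin = [avg(lambda x: f(n, x) * g(x), -1.0, 1.0) for n in (10, 100, 1000)]
pair_sq = [avg(lambda x: f(n, x) ** 2 * g(x), -1.0, 1.0) for n in (10, 100, 1000)]
```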

Now, what does the Young measure say here?

We check narrow convergence of the Young measures {\mu^{f_n}} by testing with a function of type {\psi(x,y) = \chi_B(x)\phi(y)} for a Borel set {B} and a bounded continuous function {\phi}. Since we may subtract a constant, we can assume {\phi(0)=0}; as {f_n} vanishes outside of {[-1/n,1/n]} and {\phi} is bounded, we get for {n\rightarrow \infty}

\displaystyle  \Bigl|\int_{\Omega\times{\mathbb R}} \psi(x,y){\mathrm d}\mu^{f_n}(x,y)\Bigr| = \Bigl|\int_B \phi(f_n(x)){\mathrm d}\mathfrak{L}(x)\Bigr| \leq \int_{-1/n}^{1/n}|\phi(\sqrt{n})|{\mathrm d}\mathfrak{L}(x) \leq \tfrac{2}{n}|\phi(\sqrt{n})|\rightarrow 0.

For general {\phi} this means that the pairings converge to {\phi(0)\mathfrak{L}(B)}, i.e.

\displaystyle  \mu^{f_n}\rightharpoonup \delta_0\otimes\mathfrak{L},

which is the same narrow limit as for the constant zero sequence. We conclude that this mode of convergence cannot be seen by Young measures. As Attouch et al. say in Section 4.3.7: “Young measures do not capture concentrations”.

1.4. Typewriter sequence

A typewriter sequence on an interval (as described in Example 4 of this blog entry of Terry Tao) is a sequence of functions which are mostly zero, while the non-zero part revisits every point of the interval again and again, with ever smaller support and integral. This is an example of a sequence which converges in the {L^1}-norm but not pointwise at any point. However, this mode of convergence is not very interesting with respect to Young measures. It basically behaves like the escape to vertical infinity above.

2. Weak convergence via the {\chi}-function

While Young measures put a uniformly distributed measure on the graph of the function, and are thus a more “graphical” representation of the function, the approach described now uses the area between the graph and the {x}-axis.

We consider an open and bounded domain {\Omega\subset {\mathbb R}^d}. Now we define the {\chi}-function {\chi:{\mathbb R}\times{\mathbb R}\rightarrow {\mathbb R}} by

\displaystyle  \chi(\xi,u) = \begin{cases} 1, & 0<\xi<u\\ -1, & u<\xi<0\\ 0, &\text{else}. \end{cases}

The function looks like this:

We then associate to a given function {u:\Omega\rightarrow{\mathbb R}} the function {U:(x,\xi) \mapsto \chi(\xi,u(x))}. Graphically, this function has the value {1}, if the value {\xi} is positive and between zero and {u(x)}, and it is {-1}, if {\xi} is negative and again between zero and {u(x)}. In other words: the function {(x,\xi)\mapsto \chi(\xi,u(x))} is piecewise constant on the area between zero and the graph of {u}, encoding the sign of {u}. For the functions {f_n} from Example 1 in the previous post this looks like this:

Similar to the approach via Young measures, we now consider the sequence of the new objects, i.e. the sequence of the functions {(x,\xi)\mapsto \chi(\xi,f_n(x))}, and use a weak form of convergence here. For Young measures we used narrow convergence; here we use plain weak-* convergence.

One can show the following lemma:

Lemma 1 Assume that {f_n} converges weakly-* in {L^\infty({\mathbb R})}. Then, for the weak-* limit of the mappings {(x,\xi)\mapsto\chi(\xi,f_n(x))}, denoted by {F}, there exists a probability measure {\nu_x} such that

\displaystyle  \partial_\xi F(x,\cdot) = \delta_0 - \nu_x.

The proof (in a slightly different situation) can be found in Kinetic Formulation of Conservation Laws, Lemma 2.3.1.

Example 1 We again consider Example 1 from my previous post:

\displaystyle  f_n(x) = \begin{cases} a & \text{for }\ \tfrac{2k}{n} \leq x < \tfrac{2k+1}{n},\ k\in{\mathbb Z}\\ b & \text{else.} \end{cases}

The graph of some {f_n} and the corresponding function {F_n:(x,\xi) \mapsto \chi(\xi,f_n(x))} were shown above. Obviously, the weak-* limit of these {\chi}-functions {F_n} is (in the case {b<0<a}) given by

\displaystyle  F(x,\xi) = \begin{cases} \tfrac12, & 0 < \xi < a\\ -\tfrac12, & b< \xi < 0\\ 0, & \text{else.} \end{cases}

This can be illustrated as

Now take the weak derivative with respect to {\xi} (which is, as the function {F} itself, independent of {x}) to get

\displaystyle  \partial_\xi F(x,\cdot) = \delta_0 - \tfrac12 (\delta_a + \delta_b)

and, comparing with Lemma 1, we see

\displaystyle  \nu_x = \tfrac12 (\delta_a + \delta_b).

Cool: That is precisely the same limit as obtained by the Young measure!
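
This computation can also be seen in code. The following sketch (with {a=1}, {b=-1} and all names my own choices) implements {\chi} as defined above and estimates the weak-* limit {F} by averaging {\chi(\xi,f_n(x))} over {x}, which is a reasonable stand-in here since {F} does not depend on {x}:

```python
# A sketch of the chi-function approach for Example 1 with a = 1, b = -1.

def chi(xi, u):
    if 0.0 < xi < u:
        return 1.0
    if u < xi < 0.0:
        return -1.0
    return 0.0

a, b = 1.0, -1.0

def f_n(n, x):
    # f_n = a on [2k/n, (2k+1)/n), b otherwise
    return a if int(x * n) % 2 == 0 else b

def F_estimate(xi, n=100, samples=10_000):
    """x-average of chi(xi, f_n(x)), approximating the weak-* limit F."""
    return sum(chi(xi, f_n(n, (i + 0.5) / samples)) for i in range(samples)) / samples

# F_estimate(0.5) is close to 1/2 and F_estimate(-0.5) close to -1/2, matching F.
# The jumps of xi -> F(xi) at b, 0, a have sizes -1/2, +1, -1/2, i.e.
# d/dxi F = delta_0 - (delta_a + delta_b)/2, in line with Lemma 1.
```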

Well, the observation in this example is no accident; indeed this approach is closely related to Young measures. Namely, it holds that {\mu^{f_n}_x\rightharpoonup \nu_x}.

Maybe I’ll come back to the proof of this fact later (it seemed not too hard, but it used a different definition of Young measures than the one I used here).

To conclude: Both the approach via Young measures and the approach via the {\chi}-function lead to the same new understanding of weak-* limits in {L^\infty}. This new understanding is a little bit deeper than the usual one, as it goes well with non-linear distortions of functions. And finally, both approaches are geometric: Young measures put an equidistributed measure on the graph of the function, and the {\chi}-function puts {\pm1} between the graph and the {x}-axis.

This entry is not precisely about something I stumbled upon but about something that I wanted to learn for some time now, namely Young measures. Lately I had a several-hour train ride and I had the book Kinetic Formulation of Conservation Laws with me.

While the book is about hyperbolic PDEs and their formulation as kinetic equations, it also has some pointers to Young measures. Roughly, Young measures are a way to describe weak limits of functions and especially how these weak limits behave under non-linear functions, and hence we start with this notion.

1. Weak convergence of functions

We are going to deal with sequences of functions {(f_n)} in spaces {L^p(\Omega)} for some open bounded domain {\Omega} and some {1\leq p\leq \infty}.

For {1\leq p < \infty} the dual space of {L^p(\Omega)} is {L^q(\Omega)} with {1/p + 1/q = 1} and the dual pairing is

\displaystyle \langle f,g\rangle_{L^p(\Omega)\times L^q(\Omega)} = \int_\Omega f\, g.

Hence, a sequence {(f_n)} converges weakly in {L^p(\Omega)} to {f}, if for all {g\in L^q(\Omega)} it holds that

\displaystyle \int f_n\, g \rightarrow \int f\, g.

We denote weak convergence (if the space is clear) with {f_n\rightharpoonup f}.

For the case {p=\infty} one usually uses the so-called weak-* convergence: A sequence {(f_n)} in {L^\infty(\Omega)} converges weakly-* to {f}, if for all {g\in L^1(\Omega)} it holds that

\displaystyle \int f_n\, g \rightarrow \int f\, g.

The reason for this is that the dual space of {L^\infty(\Omega)} is not easily accessible, as it cannot be described as a function space. (If I recall correctly, this is described in “Linear Operators” by Dunford and Schwartz.) Weak-* convergence will be denoted by {f_n\rightharpoonup^* f}.

In some sense, it is enough to consider weak-* convergence in {L^\infty(\Omega)} to understand what Young measures are about, and I will stick to this kind of convergence here.

Example 1 We consider {\Omega = [0,1]} and two values {a,b\in{\mathbb R}}. We define a sequence of functions which jumps between these two values with an increasing frequency:

\displaystyle f_n(x) = \begin{cases} a & \text{for }\ \tfrac{2k}{n} \leq x < \tfrac{2k+1}{n},\ k\in{\mathbb Z}\\ b & \text{else.} \end{cases}

The functions {f_n} look like this:

To determine the weak limit, we test with very simple functions, let's say with {g = \chi_{[x_0,x_1]}}. Then we get

\displaystyle \int f_n\, g = \int_{x_0}^{x_1} f_n \rightarrow (x_1-x_0)\tfrac{a+b}{2}.

Hence, we see that the weak-* limit of the {f_n} (which is, by the way, always unique) has no other choice than to be

\displaystyle f \equiv \frac{a+b}{2}.

In words: the {f_n} converge weakly-* to the arithmetic mean of the two values between which they oscillate.
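
This computation is easy to check numerically. The following sketch (the values of {a}, {b} and the test interval are my own choices) approximates the pairings with {g = \chi_{[x_0,x_1]}}:

```python
# Numerical check of int_{x0}^{x1} f_n -> (x1 - x0)(a + b)/2 for the
# oscillating sequence (a, b and the test interval are arbitrary choices).

a, b = 3.0, -1.0

def f_n(n, x):
    # f_n = a on [2k/n, (2k+1)/n), b otherwise
    return a if int(x * n) % 2 == 0 else b

def pairing(n, x0, x1, samples=100_000):
    h = (x1 - x0) / samples
    return sum(f_n(n, x0 + (i + 0.5) * h) for i in range(samples)) * h

limit = (0.73 - 0.2) * (a + b) / 2.0   # the predicted weak-* pairing
values = [pairing(n, 0.2, 0.73) for n in (7, 71, 701)]
```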

2. Non-linear distortions

Now, the norm-limit behaves well under non-linear distortions of the functions. Let's consider a sequence {f_n} which converges in norm to some {f}, that is, {\|f_n -f\|_\infty \rightarrow 0}. Since this means that {\sup| f_n(x) - f(x)| \rightarrow 0}, we see that for any bounded continuous function {\phi:{\mathbb R}\rightarrow {\mathbb R}} we also have {\sup |\phi(f_n(x)) - \phi(f(x))|\rightarrow 0} and hence {\phi\circ f_n \rightarrow \phi\circ f}.

The same is totally untrue for weak-* (and also weak) limits:

Example 2 Consider the same sequence {(f_n)} as in Example 1, which has the weak-* limit {f\equiv\frac{a+b}{2}}. As a nonlinear distortion we take {\phi(s) = s^2}, which gives

\displaystyle \phi\circ f_n(x) = \begin{cases} a^2 & \text{for }\ \tfrac{2k}{n} \leq x < \tfrac{2k+1}{n},\ k\in{\mathbb Z}\\ b^2 & \text{else.} \end{cases}

Now we see

\displaystyle \phi\circ f_n \rightharpoonup^* \frac{a^2 + b^2}{2} \neq \Bigl(\frac{a+b}{2}\Bigr)^2 = \phi\circ f.

The example can be made a little more drastic by assuming {b = -a}, which gives {f_n\rightharpoonup^* f\equiv 0}. Then, for every {\phi} with {\phi(0) = 0} we have {\phi\circ f\equiv 0}. However, with such a {\phi} we may produce any constant value {c} as the weak-* limit of {\phi\circ f_n} (take, e.g., {\phi(b) = 0}, {\phi(a) = 2c}).
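
The drastic case {b=-a} can be checked numerically (a minimal sketch with {a=1}; the sequence is the one from Example 1):

```python
# The oscillating sequence with b = -a under distortions: means over [0, 1]
# approximate the weak-* limits (a and the distortions are arbitrary choices).

a = 1.0
b = -a

def f_n(n, x):
    return a if int(x * n) % 2 == 0 else b

def mean_phi(n, phi, samples=100_000):
    """Pairing of phi o f_n with g = 1, i.e. the mean over [0, 1]."""
    return sum(phi(f_n(n, (i + 0.5) / samples)) for i in range(samples)) / samples

identity_mean = mean_phi(100, lambda s: s)      # -> (a + b)/2 = 0
square_mean = mean_phi(100, lambda s: s * s)    # -> (a^2 + b^2)/2 = 1, not 0^2
```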

In fact, the relation {\phi\circ f_n \rightharpoonup^* \phi\circ f} is only true for affine linear distortions {\phi} (unfortunately I forgot a reference for this fact\dots).

This raises the question whether it is possible to describe the weak-* limits of distortions of functions, and in fact this will be possible with the notion of Young measures.

3. Young measures

In my understanding, Young measures are a method to view a function a little bit more geometrically, putting more emphasis on the graph of the function rather than on its mapping properties.

We start with defining Young measures and illustrate how they can be used to describe weak(*) limits. In what follows we use {\mathfrak{L}} for the Lebesgue measure on the (open and bounded) set {\Omega}. A more thorough description in the spirit of this section is Variational Analysis in Sobolev and BV Spaces by Attouch, Buttazzo and Michaille.

Definition 1 (Young measure) A positive measure {\mu} on {\Omega\times {\mathbb R}} is called a Young measure if for every Borel subset {B} of {\Omega} it holds that

\displaystyle \mu(B\times{\mathbb R}) = \mathfrak{L}(B).

Hence, a Young measure is a measure such that the measure of every box {B\times{\mathbb R}} is determined by the projection of the box onto the set {\Omega}, which is, of course, {B}:

There are special Young measures, namely those associated to functions. Roughly speaking, the Young measure associated to a function {u:\Omega\rightarrow {\mathbb R}} is a measure which is equidistributed on the graph of {u}.

Definition 2 (Young measure associated to {u}) For a Borel measurable function {u:\Omega\rightarrow{\mathbb R}} we define the associated Young measure {\mu^u} by setting, for every continuous and bounded function {\phi:\Omega\times{\mathbb R}\rightarrow{\mathbb R}},

\displaystyle \int_{\Omega\times{\mathbb R}}\phi(x,y){\mathrm d}{\mu^u(x,y)} = \int_\Omega \phi(x,u(x)){\mathrm d} \mathfrak{L}(x).

It is clear that {\mu^u} is a Young measure: Take {B\subset \Omega} and approximate the characteristic function {\chi_{B\times{\mathbb R}}} by smooth functions {\phi_n}. Then

\displaystyle \int_{\Omega\times{\mathbb R}}\phi_n(x,y){\mathrm d}{\mu^u(x,y)} = \int_\Omega \phi_n(x,u(x)){\mathrm d} \mathfrak{L}(x).

The left hand side converges to {\mu^u(B\times{\mathbb R})} while the right hand side converges to {\int_B 1{\mathrm d}{\mathfrak{L}} = \mathfrak{L}(B)} as claimed.
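
The marginal property can also be observed empirically: sample the graph of some {u} at equispaced points and check that the mass over {B\times{\mathbb R}} is just the length of {B} (the function {u} and the set {B} below are arbitrary choices):

```python
import math

# Empirical check of mu^u(B x R) = L(B): the y-coordinates of the graph
# samples play no role for the marginal (u and B are arbitrary choices).

def u(x):
    return math.sin(7.0 * x)

samples = 100_000
graph = [((i + 0.5) / samples, u((i + 0.5) / samples)) for i in range(samples)]

B = (0.25, 0.6)
mass = sum(1 for (x, y) in graph if B[0] <= x <= B[1]) / samples
```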

The intuition that a Young measure associated to a function is an equidistributed measure on the graph can be made more precise by “slicing” it:

Definition 3 (Slicing a measure) Let {\mu} be a positive measure on {\Omega\times{\mathbb R}} and let {\sigma} be its projection onto {\Omega} (i.e. {\sigma(B) = \mu(B\times{\mathbb R})}). Then {\mu} can be sliced into measures {(\mu_x)_{x\in\Omega}}, i.e. it holds:

  1. Each {\mu_x} is a probability measure.
  2. The mapping {x\mapsto \int_{\mathbb R} \phi(x,y){\mathrm d}{\mu_x(y)}} is measurable for every continuous {\phi} and it holds that

    \displaystyle \int_{\Omega\times{\mathbb R}} \phi(x,y){\mathrm d}{\mu(x,y)} = \int_\Omega\int_{\mathbb R} \phi(x,y){\mathrm d}{\mu_x(y)}{\mathrm d}{\sigma(x)}.

The existence of the slices is, e.g. proven in Variational analysis in Sobolev and BV spaces, Theorem 4.2.4.

For the Young measure {\mu^u} associated to {u}, the measure {\sigma} in Definition 3 is {\mathfrak{L}} and hence:

\displaystyle \int_{\Omega\times{\mathbb R}}\phi(x,y){\mathrm d}{\mu^u(x,y)} = \int_\Omega\int_{\mathbb R} \phi(x,y){\mathrm d}{\mu^u_x(y)}{\mathrm d}{\mathfrak{L}(x)}.

On the other hand:

\displaystyle \int_{\Omega\times{\mathbb R}}\phi(x,y){\mathrm d}{\mu^u(x,y)} = \int_\Omega\phi(x,u(x)){\mathrm d}{\mathfrak{L}} = \int_\Omega\int_{\mathbb R} \phi(x,y) {\mathrm d}{\delta_{u(x)}(y)}{\mathrm d}{\mathfrak{L}(x)}

and we see that {\mu^u} slices into

\displaystyle \mu^u_x = \delta_{u(x)}

and this can be vaguely sketched:

4. Narrow convergence of Young measures and weak* convergence in {L^\infty(\Omega)}

Now we ask ourselves: If a sequence {(u_n)} converges weakly* in {L^\infty(\Omega)}, what does the sequence of associated Young measures do? Obviously, we need a notion of convergence for Young measures. The usual notion here is that of narrow convergence:

Definition 4 (Narrow convergence of Young measures) A sequence {(\mu_n)} of Young measures on {\Omega\times{\mathbb R}} converges narrowly to {\mu}, if for all bounded and continuous functions {\phi} it holds that

\displaystyle \int_{\Omega\times{\mathbb R}}\phi(x,y){\mathrm d}{\mu_n(x,y)} \rightarrow \int_{\Omega\times{\mathbb R}} \phi(x,y){\mathrm d}{\mu(x,y)}.

Narrow convergence will also be denoted by {\mu_n\rightharpoonup\mu}.

One may also use non-continuous test functions of the form {\phi(x,y) = \chi_B(x)\psi(y)} with a Borel set {B\subset\Omega} and a continuous and bounded {\psi}, leading to the same notion.

The set of Young measures is closed under narrow convergence, since we may test with the function {\phi(x,y) = \chi_B(x)\chi_{\mathbb R}(y)} to obtain:

\displaystyle \mathfrak{L}(B) = \lim_{n\rightarrow\infty} \int_{\Omega\times{\mathbb R}}\phi(x,y){\mathrm d}{\mu_n(x,y)} = \int_{\Omega\times{\mathbb R}}\phi(x,y){\mathrm d}{\mu(x,y)} = \mu(B\times{\mathbb R}).

The next observation is the following:

Proposition 5 Let {(u_n)} be a bounded sequence in {L^\infty(\Omega)}. Then the sequence {(\mu^{u_n})} of associated Young measures has a subsequence which converges narrowly to a Young measure {\mu}.

The proof uses the notion of tightness of sets of measures and the Prokhorov compactness theorem for Young measures (Theorem 4.3.2 in Variational analysis in Sobolev and BV spaces).

Example 3 (Convergence of the Young measures associated to Example 1) Consider the functions {f_n} from Example 1 and the associated Young measures {\mu^{f_n}}. To figure out the narrow limit of these Young measures we test with a function {\phi(x,y) = \chi_B(x)\psi(y)} with a Borel set {B} and a bounded and continuous function {\psi}. We calculate

\displaystyle \begin{array}{rcl} \int_{[0,1]\times{\mathbb R}}\phi(x,y){\mathrm d}{\mu^{f_n}(x,y)} &= &\int_0^1\phi(x,f_n(x)){\mathrm d}{\mathfrak{L}(x)}\\ & = &\int_B\psi(f_n(x)){\mathrm d}{\mathfrak{L}(x)}\\ & \rightarrow &\mathfrak{L}(B)\frac{\psi(a)+\psi(b)}{2}\\ & = & \int_B\frac{\psi(a)+\psi(b)}{2}{\mathrm d}{\mathfrak{L}(x)}\\ & = & \int_{[0,1]}\int_{\mathbb R}\phi(x,y){\mathrm d}{\bigl(\tfrac{1}{2}(\delta_a+\delta_b)\bigr)(y)}{\mathrm d}{\mathfrak{L}(x)}. \end{array}

We conclude:

\displaystyle \mu^{f_n} \rightharpoonup \tfrac{1}{2}(\delta_a+\delta_b)\otimes\mathfrak{L}

i.e. the narrow limit of the Young measures {\mu^{f_n}} is not the constant function {(a+b)/2} but the measure { \mu = \tfrac{1}{2}(\delta_a+\delta_b)\otimes\mathfrak{L}}. This expression may be easier to digest in sliced form:

\displaystyle \mu_x = \tfrac{1}{2}(\delta_a+\delta_b)

i.e. the narrow limit is something like the “probability distribution” of the values of the functions {f_n}. This can be roughly put in a picture:
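
The calculation above can be reproduced empirically: the pairing of {\mu^{f_n}} with {\chi_B(x)\psi(y)} approaches {\mathfrak{L}(B)\frac{\psi(a)+\psi(b)}{2}}, and differs from naively testing {\psi} at the weak-* limit ({a}, {b}, {B} and {\psi} below are my own choices):

```python
# Empirical version of Example 3: the pairing of mu^{f_n} with
# chi_B(x) * psi(y) tends to L(B) * (psi(a) + psi(b)) / 2
# (a, b, B = [0.1, 0.9] and psi are arbitrary choices).

a, b = 2.0, 0.0

def f_n(n, x):
    return a if int(x * n) % 2 == 0 else b

def young_pairing(n, x0, x1, psi, samples=100_000):
    h = (x1 - x0) / samples
    return sum(psi(f_n(n, x0 + (i + 0.5) * h)) for i in range(samples)) * h

psi = lambda y: 1.0 / (1.0 + y * y)
value = young_pairing(1000, 0.1, 0.9, psi)
predicted = (0.9 - 0.1) * (psi(a) + psi(b)) / 2.0   # the narrow-limit pairing
naive = (0.9 - 0.1) * psi((a + b) / 2.0)            # testing at the weak-* limit
```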

Obviously, this notion of convergence goes well with nonlinear distortions:

\displaystyle \mu^{\phi\circ f_n} \rightharpoonup \tfrac{1}{2}(\delta_{\phi(a)} + \delta_{\phi(b)})\otimes\mathfrak{L}.

Recall from Example 2: The weak-* limit of {\phi\circ f_n} was the constant function {\tfrac{\phi(a)+\phi(b)}{2}}, i.e.

\displaystyle \phi\circ f_n \rightharpoonup^* \tfrac{\phi(a)+\phi(b)}{2}\chi_{[0,1]}.

The observation from the previous example holds in a similar way for general weakly-* converging sequences {(f_n)}:

Theorem 6 Let {f_n\rightharpoonup^* f} in {L^\infty(\Omega)} with {\mu^{f_n}\rightharpoonup\mu}. Then it holds for almost all {x} that

\displaystyle f(x) = \int_{\mathbb R} y{\mathrm d}{\mu_x(y)}.

In other words: {f(x)} is the expectation of the probability measure {\mu_x}.