With this post I delve into a topic which is somewhat new to me, although I have planned to look deeper into it for quite some time already. I stumbled upon the paper Gromov-Wasserstein distances and the metric approach to object matching by Facundo Mémoli, which was a pleasure to read and motivated this post.
1. Comparing measures with norms and metrics
There are different notions in mathematics to compare two objects: think of the size of real numbers, the cardinality of sets or the length of the difference of two vectors. Here we will deal not only with comparison of objects but with “measures of similarity”. Two fundamental notions for this are norms in vector spaces and metrics. The norm is the stronger concept in that it uses more structure than a metric; also, every norm induces a metric but not the other way round. There are occasions in which both a norm and a metric are available but lead to different concepts of similarity. One of these instances occurs in sparse recovery, especially in the continuous formulation, e.g. as described in a previous post. Consider the unit interval $I = [0,1]$ and two Radon measures $\mu_1$ and $\mu_2$ on $I$ ($I$ could also be an arbitrary metric space). On the space $\mathfrak{M}$ of Radon measures there is the variation norm
\begin{equation*} \|\mu\|_{\mathfrak{M}} = \sup_\Pi \sum_{A\in\Pi} |\mu(A)| \end{equation*}
where the supremum is taken over all partitions $\Pi$ of $I$ into a finite number of measurable sets. Moreover, there are different metrics one can put on the space of Radon measures, e.g. the Prokhorov metric which is defined for two probability measures (i.e. non-negative ones with unit total mass)
\begin{equation*} d_P(\mu_1,\mu_2) = \inf\{\epsilon>0 \,:\, \mu_1(A)\leq \mu_2(A^\epsilon)+\epsilon,\ \mu_2(A)\leq \mu_1(A^\epsilon)+\epsilon\ \text{for all measurable}\ A\} \end{equation*}
where $A^\epsilon$ denotes the $\epsilon$-neighborhood of $A$. Another family of metrics are the Wasserstein metrics: For $p\geq 1$ define
\begin{equation*} d_{W,p}(\mu_1,\mu_2) = \Big(\inf_\nu \int_{I\times I} |x-y|^p\, d\nu(x,y)\Big)^{1/p} \tag{1} \end{equation*}
where the infimum is taken over all measure couplings of $\mu_1$ and $\mu_2$, that is, all measures $\nu$ on $I\times I$ such that for measurable $A$ it holds that
\begin{equation*} \nu(A\times I) = \mu_1(A) \quad\text{and}\quad \nu(I\times A) = \mu_2(A). \end{equation*}
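Example 1 We compare two Dirac measures $\mu_1 = \delta_{x_1}$ and $\mu_2 = \delta_{x_2}$ located at distinct points $x_1\neq x_2$ in the unit interval $I$.
The variation norm measures their distance as
\begin{equation*} \|\mu_1-\mu_2\|_{\mathfrak{M}} = \sup_\Pi \sum_{A\in\Pi} |\delta_{x_1}(A) - \delta_{x_2}(A)| = 2 \end{equation*}
(choose $\Pi$ such that it contains $A_1$ and $A_2$ small enough that $x_1\in A_1$, $x_2\in A_2$ but $x_1\notin A_2$ and $x_2\notin A_1$). To calculate the Prokhorov metric note that you only need to consider $A$'s which contain only one of the points $x_1$, $x_2$ and hence, it evaluates to
\begin{equation*} d_P(\mu_1,\mu_2) = |x_1-x_2|. \end{equation*}
For the Wasserstein metric we observe that there is only one possible measure coupling of $\delta_{x_1}$ and $\delta_{x_2}$, namely the measure $\nu = \delta_{(x_1,x_2)}$. Hence, we have
\begin{equation*} d_{W,p}(\mu_1,\mu_2) = \Big(\int_{I\times I} |x-y|^p\, d\delta_{(x_1,x_2)}(x,y)\Big)^{1/p} = |x_1-x_2|. \end{equation*}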
The variation norm distinguishes the two Diracs but is not able to grasp the distance of their supports. On the other hand, both metrics return the geometric distance of the supports in the underlying space $I$ as distance of the Diracs. Put in pictures: The variation norm of the difference measures the size of this object

while both metrics capture the distance of the measures like here

It should not go unmentioned that, on a compact space such as the unit interval, convergence in both the Prokhorov metric and the Wasserstein metrics is exactly the weak convergence of probability measures.
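To make Example 1 tangible, here is a minimal sketch in plain Python for finitely supported measures on the real line. The representation of a measure as a dictionary `{point: mass}` and the function names are my own choices for illustration. The Wasserstein part relies on the fact that in one dimension the monotone (quantile) coupling is optimal for $p\geq 1$, so no general optimal transport solver is needed. The Prokhorov metric is left out since it is more awkward to compute.

```python
def difference(mu1, mu2):
    """Signed measure mu1 - mu2 for measures given as {point: mass}."""
    points = set(mu1) | set(mu2)
    return {x: mu1.get(x, 0.0) - mu2.get(x, 0.0) for x in points}


def variation_norm(mu):
    """Variation norm of a finitely supported signed measure.

    The finest useful partition separates all atoms, so the supremum
    is the sum of the absolute masses.
    """
    return sum(abs(m) for m in mu.values())


def wasserstein_1d(mu1, mu2, p=1):
    """p-Wasserstein distance of two finitely supported probability
    measures on the line, computed via the monotone coupling."""
    a, b = sorted(mu1.items()), sorted(mu2.items())
    i = j = 0
    rest_a, rest_b = a[0][1], b[0][1]  # mass not yet transported
    cost, tol = 0.0, 1e-12
    while i < len(a) and j < len(b):
        moved = min(rest_a, rest_b)
        cost += moved * abs(a[i][0] - b[j][0]) ** p
        rest_a -= moved
        rest_b -= moved
        if rest_a <= tol:
            i += 1
            if i < len(a):
                rest_a = a[i][1]
        if rest_b <= tol:
            j += 1
            if j < len(b):
                rest_b = b[j][1]
    return cost ** (1.0 / p)


# Two Diracs as in Example 1 (the locations are arbitrary choices)
delta_1 = {0.25: 1.0}
delta_2 = {0.75: 1.0}
print(variation_norm(difference(delta_1, delta_2)))  # norm of the difference
print(wasserstein_1d(delta_1, delta_2, p=2))         # distance of the supports
```

Moving the two Diracs closer together changes the Wasserstein value but not the variation norm, which is exactly the point of the example.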
The above example provides a motivation to study metric structures on spaces, even if they are also equipped with a norm. Another reason to shift attention from normed spaces to metric spaces is the fact that there has emerged a body of work to build a theory of analysis in metric spaces (see, e.g. this answer on mathoverflow or the book Gradient Flows: In Metric Spaces And In The Space Of Probability Measures by Ambrosio, Gigli and Savaré (which puts special emphasis on the space of probability measures)). Yet another motivation for the study of metrics in this way is the problem of comparing shapes (without being precisely defined yet): Which of these shapes look most alike?

(Note that shapes need not be two-dimensional figures; you may also think of more complex objects like surfaces in three dimensions or Riemannian manifolds.)
One may also ask how to compare different images defined on different shapes, i.e. different “distributions of colour” on two different shapes.
2. Comparing shapes: Metric spaces
Up to now we tried to compare different measures, defined on the same set. At least to me it seems that both the Prokhorov and the Wasserstein metrics are suited to measure the similarity of measures and in fact, they do so in a somewhat finer way than the usual norm does.
Let’s try to go one step further and ask ourselves how we could compare two measures $\mu_1$ and $\mu_2$ which are defined on two different sets. While thinking about an answer one needs to balance several things:
- The setup should be general enough to allow for the comparison of a wide range of objects.
- It should include enough structure to allow meaningful statements.
- It should lead to a measure which is easy enough to handle both analytically and computationally.
For the first and second bullet: We are going to work with measures not on arbitrary sets but on metric spaces. This allows us to measure distances between points in the sets and, as you probably know, does not pose a severe restriction. Although metric spaces are much more specific than topological spaces, we still aim at quantitative measures which are not provided by topologies. With respect to the last bullet: Note that both the Prokhorov and the Wasserstein metric are defined as infima over fairly large and not too well structured sets (for the Prokhorov metric one needs to consider all measurable sets and their $\epsilon$-neighborhoods, for the Wasserstein metric one needs to consider all measure couplings). While they can be handled quite well theoretically, their computational realization can be cumbersome.
In a similar spirit to Facundo Mémoli’s paper we work our way up from comparing subsets of metric spaces to comparing two different metric spaces with two measures defined on them.
2.1. Comparing compact subsets of a metric space: Hausdorff
Let $(X,d)$ be a compact metric space. Almost a hundred years ago Hausdorff introduced a metric on the family of all non-empty compact subsets of a metric space as follows: The Hausdorff metric of two compact subsets $A$ and $B$ of $X$ is defined as
\begin{equation*} d_H(A,B) = \inf\{\epsilon>0 \,:\, A\subset B^\epsilon,\ B\subset A^\epsilon\} \end{equation*}
(again, using the notion of $\epsilon$-neighborhood). This definition seems to be much in the spirit of the Prokhorov metric.
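Proposition 2.1 in Facundo Mémoli’s paper shows that the Hausdorff metric has an equivalent description as
\begin{equation*} d_H(A,B) = \inf_R \sup_{(a,b)\in R} d(a,b) \end{equation*}
where the infimum is taken over all correspondences $R$ of $A$ and $B$, i.e., all subsets $R\subset A\times B$ such that for all $a\in A$ there is $b\in B$ such that $(a,b)\in R$ and for all $b\in B$ there is $a\in A$ such that $(a,b)\in R$. One may also say set coupling of $A$ and $B$ instead of correspondence.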
Example 2 There is always the full coupling $R = A\times B$. Three different set couplings of two subsets $A$ and $B$ of the unit interval are shown here:

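the “full one” $A\times B$ in green and two “slim” ones in red and orange. Other “slim” couplings can be obtained from surjective mappings $f:A\to B$ by $R = \{(a,f(a)) \,:\, a\in A\}$ (or with the roles of $A$ and $B$ swapped). If you couple a set $A$ with itself, there is also the trivial coupling
\begin{equation*} R = \{(a,a) \,:\, a\in A\} \end{equation*}
which is just the diagonal of $A\times A$.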
Note that the alternative definition of the Hausdorff metric is more in the spirit of the Wasserstein metric: It does not use enlarged objects (by $\epsilon$-neighborhoods) but couplings.
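For finite sets both descriptions of the Hausdorff metric are easy to compute, and one can check numerically that they agree. The following is only a sketch: the function names are made up for this post, and the “nearest point” correspondence is just one convenient coupling that happens to attain the infimum for finite sets.

```python
def hausdorff(A, B, d):
    """Hausdorff distance of finite non-empty sets A and B: the smallest
    eps such that A lies in the (closed) eps-neighborhood of B and vice versa."""
    a_to_b = max(min(d(a, b) for b in B) for a in A)
    b_to_a = max(min(d(a, b) for a in A) for b in B)
    return max(a_to_b, b_to_a)


def nearest_point_correspondence(A, B, d):
    """Couple every a with a nearest b and every b with a nearest a."""
    R = {(a, min(B, key=lambda b: d(a, b))) for a in A}
    R |= {(min(A, key=lambda a: d(a, b)), b) for b in B}
    return R


def coupling_cost(R, d):
    """sup of d(a, b) over all coupled pairs (a, b) in R."""
    return max(d(a, b) for a, b in R)


def dist(x, y):
    return abs(x - y)


A = [0.0, 1.0]
B = [0.0, 0.5]
full = {(a, b) for a in A for b in B}
slim = nearest_point_correspondence(A, B, dist)
print(hausdorff(A, B, dist))      # neighborhood formulation
print(coupling_cost(slim, dist))  # coupling formulation, slim coupling
print(coupling_cost(full, dist))  # the full coupling is usually worse
```

The full coupling only gives an upper bound (the largest distance between a point of $A$ and a point of $B$), while the slim coupling matches the value of the neighborhood formulation.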
The Hausdorff metric is indeed a metric on the set $\mathfrak{C}(X)$ of all non-empty compact subsets of a metric space $X$ and if $X$ itself is compact it even holds that $(\mathfrak{C}(X), d_H)$ is a compact metric space (a result known as the Blaschke Selection Theorem).
One may say that we went up an abstraction ladder one step by moving from $X$ to $\mathfrak{C}(X)$.
2.2. Comparing compact metric spaces: Gromov-Hausdorff
In the previous subsection we worked within one metric space $X$. In the book “Metric Structures for Riemannian and Non-Riemannian Spaces” Misha Gromov introduced a notion to compare two different metric spaces. For compact metric spaces $X$ and $Y$ the Gromov-Hausdorff metric is defined as
\begin{equation*} d_{GH}(X,Y) = \inf_{Z,f,g} d_H(f(X),g(Y)) \tag{2} \end{equation*}
where the infimum is taken over
- all metric spaces $Z$ and
- all isometric embeddings $f$ and $g$ which embed $X$ and $Y$ into $Z$, respectively.
In words: To compute the Gromov-Hausdorff metric, you try to embed both $X$ and $Y$ into a common larger space isometrically such that they are as close as possible according to the Hausdorff metric in that space.
Strictly speaking, the above definition is not well stated as one cannot form an infimum over all metric spaces since this collection does not form a set according to the rules of set theory. More precisely one should write that $d_{GH}(X,Y)$ is the infimum over all $r>0$ such that there exists a metric space $Z$ and isometric embeddings $f$ and $g$ of $X$ and $Y$, respectively, such that $d_H(f(X),g(Y)) < r$.
As the Hausdorff metric could be reformulated with set couplings there is a reformulation of the Gromov-Hausdorff metric based on metric couplings: A metric coupling of two metric spaces $(X,d_X)$ and $(Y,d_Y)$ is a metric $d$ on the disjoint union $X\sqcup Y$ of $X$ and $Y$ such that for all $x_1,x_2\in X$ and $y_1,y_2\in Y$ it holds that $d(x_1,x_2) = d_X(x_1,x_2)$ and $d(y_1,y_2) = d_Y(y_1,y_2)$.
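Example 3 We couple a metric space $(X,d)$ with itself. We denote with $(X',d')$ an identical copy of $(X,d)$ and look for a metric $D$ on $X\sqcup X'$ that respects the metrics $d$ and $d'$ in the way a metric coupling has to.
To distinguish elements from $X$ and $X'$ we put a $'$ on all quantities from $X'$. Moreover, for $x\in X$ we denote by $x'$ its identical copy in $X'$ (and similarly for $x'\in X'$, $x$ is its identical twin). Then, for any $r>0$ we can define $D_r(x,x') = D_r(x',x) = r$ (i.e. the distance between any two identical twins is $r$). By the triangle inequality we get for $x\in X$ and $y'\in X'$ that $D_r(x,y')$ should fulfill
\begin{equation*} D_r(x',y') - D_r(x',x) \leq D_r(x,y') \leq D_r(x,y) + D_r(y,y') \end{equation*}
and hence
\begin{equation*} d(x,y) - r \leq D_r(x,y') \leq d(x,y) + r. \end{equation*}
Indeed we can choose $D_r(x,y') = d(x,y) + r$ if $x\in X$ and $y'\in X'$, leading to one specific metric coupling for any $r>0$. These couplings allow us to distinguish identical twins and behave as a metric on the whole disjoint union. In the limiting case $r\to 0$ we do not obtain a metric but a semi-metric or pseudo-metric, which is just the same as a metric but without the assumption that $D(x,y) = 0$ implies that $x=y$.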
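The coupling from Example 3 can be checked by brute force on a small finite metric space. In the following sketch (all names are hypothetical) an element of the disjoint union is a pair `(x, tag)` with `tag` 0 for the original and 1 for the twin, and the distance between elements of different copies is the original distance plus $r$.

```python
from itertools import product


def twin_coupling(points, d, r):
    """Disjoint union of a finite metric space with a copy of itself,
    together with the coupling metric 'original distance plus r across copies'."""
    def D(u, v):
        (x, s), (y, t) = u, v
        return d(x, y) if s == t else d(x, y) + r

    union = [(x, tag) for tag in (0, 1) for x in points]
    return union, D


def is_metric(space, D, tol=1e-12):
    """Brute-force check of symmetry, definiteness and triangle inequality."""
    for u, v in product(space, repeat=2):
        if abs(D(u, v) - D(v, u)) > tol:
            return False
        if (D(u, v) <= tol) != (u == v):
            return False
    return all(D(u, w) <= D(u, v) + D(v, w) + tol
               for u, v, w in product(space, repeat=3))


def dist(x, y):
    return abs(x - y)


X = [0.0, 0.3, 1.0]  # a small sample space, chosen arbitrarily
union, D = twin_coupling(X, dist, r=0.1)
print(is_metric(union, D))       # a metric for r > 0
print(D((0.3, 0), (0.3, 1)))     # distance between identical twins

union0, D0 = twin_coupling(X, dist, r=0.0)
print(is_metric(union0, D0))     # only a pseudo-metric for r = 0
```

For $r = 0$ the check fails only at definiteness: twins have distance zero without being equal, which is the pseudo-metric case mentioned above.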
Example 4 The above example of a metric coupling of a metric space with itself was somehow “reproducing” the given metric as accurately as possible. There are also other couplings that put very different distances to points $x$ and $x'$, and there is also a way to visualize metric couplings: When building the disjoint union of two metric spaces $X$ and $Y$, you can imagine this as isometrically embedding both in a larger metric space $Z$ in a non-overlapping way and obtain the metric coupling $d$ as the restriction of the metric on $Z$ to $X\sqcup Y$. For $X = X' = [0,1]$ you can embed both into $\mathbb{R}^2$. A metric coupling which is similar (but not equal) to the coupling of the previous example is obtained by putting $X$ and $X'$ side by side at distance $r$ as here (one space in green, the other in blue).

A quite different coupling is obtained by putting $X$ and $X'$ side by side, but in a reversed way as here:

You may even embed them in a more weird way as here:

but remember that the embeddings have to be isometric, hence, distortions like here are not allowed.
This example illustrates that the idea of a metric coupling is in a similar spirit to that of “embedding two spaces in a common larger one”.
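With the notion of metric coupling, the Gromov-Hausdorff metric can be written as
\begin{equation*} d_{GH}(X,Y) = \inf_{R,d} \sup_{(x,y)\in R} d(x,y) \tag{3} \end{equation*}
where the infimum is taken over all set couplings $R$ of $X$ and $Y$ and all metric couplings $d$ of $(X,d_X)$ and $(Y,d_Y)$.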
In words: To compute the Gromov-Hausdorff metric this way, you look for a set coupling of the base sets $X$ and $Y$ and a metric coupling $d$ of the metrics $d_X$ and $d_Y$ such that the maximal distance of two coupled points $x$ and $y$ is as small as possible. While this may look more complicated than the original definition from (2), note that the original definition uses all metric spaces $Z$ in which you can embed $X$ and $Y$ isometrically, which seems nearly impossible to realize. Granted, the new definition also considers a lot of quantities.
Also note that this definition is in the spirit of the Wasserstein metric from (1): If there were natural measures $\mu_R$ on the set couplings $R$ we could write \begin{equation*} d_{GH}(X,Y) = \inf_{R,d} \Big(\int d(x,y)^p\,d\mu_R\Big)^{1/p} \end{equation*} and in the limit $p\to\infty$ we would recover definition (3).
Example 5 The Gromov-Hausdorff distance of a metric space $X$ to itself is easily seen to be zero: Consider the trivial coupling $R = \{(x,x') \,:\, x\in X\}$ of $X$ with its copy $X'$ from Example 2 and the family $D_r$ of metric couplings from Example 3. Then we have $d_{GH}(X,X) \leq \sup_{(x,x')\in R} D_r(x,x') = r$ for any $r>0$, showing $d_{GH}(X,X) = 0$. Let’s take one of the next more complicated examples and compute the distance of
and
, both equipped with the euclidean metric. We couple the sets
and
by
and the respective metrics by embedding
and
into
as follows: Put
at the line from
to
and
at the line from
to
:

This shows that
and actually, we have equality here.
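There is another reformulation of the Gromov-Hausdorff metric, the equivalence of which is shown in Theorem 7.3.25 in the book “A Course in Metric Geometry” by Dmitri Burago, Yuri Burago and Sergei Ivanov:
\begin{equation*} d_{GH}(X,Y) = \tfrac12 \inf_R \sup_{(x_1,y_1),(x_2,y_2)\in R} |d_X(x_1,x_2) - d_Y(y_1,y_2)| \tag{4} \end{equation*}
where the infimum is taken over all set couplings $R$ of $X$ and $Y$.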
In words: Look for a set coupling such that any two coupled pairs $(x_1,y_1)$ and $(x_2,y_2)$ have the “most equal” distance.
This reformulation may have the advantage over the form (3) in that it only considers the set couplings and the given metrics $d_X$ and $d_Y$ and no metric coupling is needed.
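Since this reformulation only involves set couplings and the two given metrics, it can be evaluated by brute force for very small finite metric spaces. The sketch below simply enumerates all subsets of $X\times Y$, keeps those which are correspondences and minimizes the distortion. The function names are made up, and the cost is exponential in $|X|\cdot|Y|$, so this is an illustration of the formula and not a practical algorithm.

```python
from itertools import product


def distortion(R, dX, dY):
    """sup over pairs of coupled pairs of |dX(x1, x2) - dY(y1, y2)|."""
    return max(abs(dX(x1, x2) - dY(y1, y2))
               for (x1, y1), (x2, y2) in product(R, repeat=2))


def is_correspondence(R, X, Y):
    """Every x and every y has to appear in at least one coupled pair."""
    return {x for x, _ in R} == set(X) and {y for _, y in R} == set(Y)


def gromov_hausdorff(X, Y, dX, dY):
    """Half the minimal distortion over all correspondences (brute force)."""
    pairs = list(product(X, Y))
    best = float("inf")
    for mask in range(1, 2 ** len(pairs)):
        R = [pq for k, pq in enumerate(pairs) if mask >> k & 1]
        if is_correspondence(R, X, Y):
            best = min(best, distortion(R, dX, dY))
    return best / 2


def dist(x, y):
    return abs(x - y)


three = [0, 1, 2]
print(gromov_hausdorff(three, three, dist, dist))     # a space and itself
print(gromov_hausdorff([0, 1], [0], dist, dist))      # two points vs. one point
print(gromov_hausdorff([0, 1], [0, 3], dist, dist))   # two two-point spaces
```

For a one-point space $Y$ the only correspondence is $X\times Y$, so the distance is half the diameter of $X$; for two two-point spaces it is half the difference of the two distances.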
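Note that, like the previous reformulation (3), it is also in the spirit of the Wasserstein metric: If there were natural measures $\mu_R$ on the set couplings $R$, we could write
\begin{equation*} d_{GH}(X,Y) = \tfrac12 \inf_R \Big(\int_{R\times R} |d_X(x_1,x_2) - d_Y(y_1,y_2)|^p\, d\mu_R(x_1,y_1)\, d\mu_R(x_2,y_2)\Big)^{1/p} \end{equation*}
and recover the formulation (4) in the limit $p\to\infty$.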
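One may say that we went up an abstraction ladder one step further by moving from a metric space $X$ to the set $\mathfrak{C}(X)$ of its non-empty compact subsets to the collection of all compact metric spaces.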
Since this post has grown pretty long already, I decided to do the next step (which is the already announced metric on metric spaces which additionally carry some measure on them, so-called metric measure spaces) in a later post.