If you are working on optimization with partial differential equations as constraints, you may be interested in the website
“OPTPDE – A Collection of Problems in PDE-Constrained Optimization”, http://www.optpde.net.
If you have developed an algorithm which can handle a certain class of optimization problems, you need to evaluate and test how well the method performs. To do so, you need well-constructed test problems: either problems where the optimal solution is known analytically, or problems where the solution is known with a rigorous error bound obtained by a bullet-proof solver. Neither is always easy to obtain, and OPTPDE shall serve as a resource for such problems. It has been designed by Roland Herzog, Arnd Rösch, Stefan Ulbrich and Winnifried Wollner.
The generation of test instances for optimization problems seems quite important to me, and indeed several things can go wrong if this is not done right. Frequently, one sees optimization routines tested on problems where the optimal solution is not known. Since there are usually different ways to express optimality conditions, it is not always clear how to check for optimality; even more so if you only check for “approximate optimality”, e.g. up to machine precision. A frequently observed effect is a kind of “trusted method bias”. By this I mean that an optimal solution is calculated by some trusted method and the outcome of the tested routine is compared with this solution. However, the trusted method uses some stopping criterion, usually based on a specific formulation of the optimality conditions, and this can be different from what the new method has been tuned to. And most often, the stopping criteria do not give a rigorous error bound for the solution or the optimal objective value.
For sparse reconstruction problems, I dealt with this issue in “Constructing test instances for Basis Pursuit Denoising” (preprint available here) but I think this methodology could be used for other settings as well.
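To illustrate the general principle behind such constructions (choose the solution and a certificate first, then build the data so that the optimality conditions hold exactly), here is a minimal toy sketch for basis pursuit denoising, i.e. $\min_x \frac12\|Ax-b\|_2^2 + \lambda\|x\|_1$, with a square invertible $A$ for simplicity. This is only my own illustration of the idea, not the construction from the paper (which treats the general case):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
A = rng.standard_normal((n, n))  # square matrix, invertible with probability 1
lam = 0.1

# choose the desired optimal solution x_star (sparse) ...
x_star = np.zeros(n)
x_star[[1, 4]] = [1.0, -2.0]

# ... and a subgradient s of the l1-norm at x_star:
# s_i = sign(x_star_i) on the support, |s_i| <= 1 off the support
s = np.sign(x_star)
off = x_star == 0
s[off] = rng.uniform(-0.9, 0.9, off.sum())

# build b so that the optimality condition A^T (A x - b) + lam * s = 0
# holds exactly at x_star: set b = A x_star + lam * A^{-T} s
b = A @ x_star + lam * np.linalg.solve(A.T, s)

# verify: the optimality residual at x_star is numerically zero
res = A.T @ (A @ x_star - b) + lam * s
print(np.linalg.norm(res))  # tiny (rounding error only)
```

By construction, x_star is a certified minimizer, so the instance can be fed to any solver and the answer checked against a known ground truth.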
Here I continue my previous post on methods to compare shapes. In that post we started with different metrics between two probability measures $\mu$ and $\nu$ defined on the same set, namely the Prokhorov metric and the Wasserstein metric. Then we also had a look at the Hausdorff metric, which measures the distance between two compact subsets $A$ and $B$ of a metric space, and finally considered the Gromov-Hausdorff metric between two metric spaces $(X,d_X)$ and $(Y,d_Y)$. Our tools have been
- set couplings of $X$ and $Y$, i.e. subsets $R\subset X\times Y$ such that for all $x\in X$ there is $y\in Y$ such that $(x,y)\in R$ and for all $y\in Y$ there is $x\in X$ such that $(x,y)\in R$,
- metric couplings of $d_X$ and $d_Y$, i.e. metrics $d$ on the disjoint union of $X$ and $Y$ such that $d(x,x') = d_X(x,x')$ if $x$ and $x'$ are in $X$ and $d(y,y') = d_Y(y,y')$ if $y$ and $y'$ are in $Y$. In fact, one could also work with semi-metrics $d$, i.e. they do not need to be positive definite, and
- measure couplings of $\mu_X$ and $\mu_Y$, i.e. measures $\mu$ on $X\times Y$ such that $\mu(A\times Y) = \mu_X(A)$ and $\mu(X\times B) = \mu_Y(B)$ for all $\mu_X$-/$\mu_Y$-measurable sets $A\subset X$ and $B\subset Y$, respectively.
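For measures with finitely many atoms, a measure coupling is just a nonnegative matrix whose row and column sums are the prescribed marginals, and the product measure always provides one. A quick sketch (toy weights of my own choosing):

```python
import numpy as np

# two discrete probability measures, given by the weights of their atoms
mu_X = np.array([0.5, 0.3, 0.2])
mu_Y = np.array([0.6, 0.4])

# the product measure mu_X ⊗ mu_Y is always a measure coupling:
# mu[i, j] is the mass put on the pair (x_i, y_j)
mu = np.outer(mu_X, mu_Y)

# marginal conditions: mu(A x Y) = mu_X(A) and mu(X x B) = mu_Y(B)
assert np.allclose(mu.sum(axis=1), mu_X)
assert np.allclose(mu.sum(axis=0), mu_Y)
```

In general there are many more couplings than the product one; they form a convex polytope with these marginal constraints.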
Now we make the next (and final) step and compare metric spaces $X$ and $Y$ which are both equipped with a measure. These objects are known as metric measure spaces or mm-spaces and are formally defined as follows:
Definition 1 (mm-space) A metric measure space is a triple $(X,d_X,\mu_X)$ consisting of a compact metric space $(X,d_X)$ and a Borel probability measure $\mu_X$ on $X$.
Note that sometimes it is additionally required in the definition of an mm-space that $\mu_X$ has full support (i.e., $\operatorname{supp}\mu_X = X$), but it seems that not everybody does it like that.
1. Comparing mm-spaces: Gromov-Wasserstein
Our question is: How do we compare two mm-spaces $(X,d_X,\mu_X)$ and $(Y,d_Y,\mu_Y)$? The plan is simply to augment the previous versions of the Gromov-Hausdorff distance defined in my previous post here and here by something which takes the measures on the respective metric spaces into account. We recall both formulations of the Gromov-Hausdorff distance: The first is

$$d_{GH}(X,Y) = \inf_{R,d}\,\sup_{(x,y)\in R} d(x,y) \tag{1}$$

where the infimum is taken over all set couplings $R$ of $X$ and $Y$ and all metric couplings $d$ of $d_X$ and $d_Y$, and the second is

$$d_{GH}(X,Y) = \tfrac{1}{2}\,\inf_{R}\,\sup_{(x,y),(x',y')\in R} |d_X(x,x') - d_Y(y,y')| \tag{2}$$

where the infimum is also taken over all set couplings $R$ of $X$ and $Y$.
Basically, we have already seen what we should do to build a metric between mm-spaces. The idea is: if there were some “natural measure” $\mu$ for any set coupling $R$ of $X$ and $Y$, then we would simply define the distance between $X$ and $Y$ as

$$\inf_{R,d,\mu} \Big(\int_R d(x,y)^p\,d\mu(x,y)\Big)^{1/p}$$

(as a generalization of (1)) and

$$\tfrac{1}{2}\,\inf_{R,\mu} \Big(\int_R\int_R |d_X(x,x') - d_Y(y,y')|^p\,d\mu(x,y)\,d\mu(x',y')\Big)^{1/p}$$

(as a generalization of (2)). In both cases we can have $1\le p<\infty$. Note that the obvious modification for $p=\infty$ leads to something very similar to the Gromov-Hausdorff metrics.
But indeed there are “natural measures” on the set couplings of $X$ and $Y$: At least for the full coupling $R = X\times Y$ there are the measure couplings of $\mu_X$ and $\mu_Y$! (One does not need to consider smaller set couplings since this can be taken into account by the measure couplings; they do not need to have full support anyway.)
Applying this idea to the version (1) of the Gromov-Hausdorff metric we arrive at the following expression, which can be called Gromov-Wasserstein metric,

$$\inf_{\mu,d} \Big(\int_{X\times Y} d(x,y)^p\,d\mu(x,y)\Big)^{1/p} \tag{3}$$

where the infimum is taken over all measure couplings $\mu$ of $\mu_X$ and $\mu_Y$ and all metric couplings $d$ of $d_X$ and $d_Y$.
Starting from the version (2) of the Gromov-Hausdorff metric we arrive at another formulation:

$$\tfrac{1}{2}\,\inf_{\mu} \Big(\int_{X\times Y}\int_{X\times Y} |d_X(x,x') - d_Y(y,y')|^p\,d\mu(x,y)\,d\mu(x',y')\Big)^{1/p} \tag{4}$$

where the infimum is taken over all measure couplings $\mu$ of $\mu_X$ and $\mu_Y$.
While both versions of the Gromov-Hausdorff metric for compact metric spaces were equal, the same is not true for the two generalizations to mm-spaces: In his paper Memoli proves that (4) is never larger than (3) and gives an example (right after Remark 5.14) where strict inequality holds.
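For finite mm-spaces the infimum in formulation (4) runs over a polytope of coupling matrices and can be approximated by brute force. A toy sketch of my own (not from the text): two two-point spaces with uniform measures, whose measure couplings form a one-parameter family that we can simply grid-search:

```python
import numpy as np

# X = {0, 1} and Y = {0, 2} on the real line, both with uniform measure
dX = np.array([[0.0, 1.0], [1.0, 0.0]])
dY = np.array([[0.0, 2.0], [2.0, 0.0]])

def objective(mu, p=1):
    # (1/2) * ( sum_{i,j,k,l} |dX[i,k] - dY[j,l]|^p mu[i,j] mu[k,l] )^(1/p)
    diff = np.abs(dX[:, None, :, None] - dY[None, :, None, :]) ** p
    return 0.5 * np.einsum('ijkl,ij,kl->', diff, mu, mu) ** (1.0 / p)

# all measure couplings of two uniform two-point measures form a segment:
# mu(a) = [[a, 1/2 - a], [1/2 - a, a]] with a in [0, 1/2]
best = min(objective(np.array([[a, 0.5 - a], [0.5 - a, a]]))
           for a in np.linspace(0.0, 0.5, 501))
print(best)  # -> 0.25, attained at the extreme couplings a = 0 and a = 1/2
```

Note that the optimum sits at a vertex of the coupling polytope here; in general the objective in (4) is a non-convex quadratic in the coupling, which is what makes Gromov-Wasserstein computations hard.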
2. Comparing mm-spaces: Gromov-Prokhorov
Instead of starting from the Gromov-Hausdorff distance between metric spaces and augmenting its definition with something that takes the measures into account, we could also start from the Prokhorov metric between probability measures and augment the definition with something that takes the metric into account. In fact there are also two possibilities to do so: In the appendix of his paper, Memoli quotes this version (from this paper by Greven, Pfaffelhuber and Winter) of a metric between mm-spaces which we call Gromov-Prokhorov metric

$$\inf_{\mu,d}\,\inf\{\varepsilon>0 \,:\, \mu(\{(x,y) : d(x,y)\ge\varepsilon\}) \le \varepsilon\} \tag{5}$$

where the infimum is taken over all measure couplings $\mu$ of $\mu_X$ and $\mu_Y$ and all metric couplings $d$ of $d_X$ and $d_Y$.
The next version (also from the same paper by Greven, Pfaffelhuber and Winter, where it was called Eurandom metric) is

$$\inf_{\mu}\,\inf\{\varepsilon>0 \,:\, \mu\otimes\mu(\{((x,y),(x',y')) : |d_X(x,x') - d_Y(y,y')| \ge \varepsilon\}) \le \varepsilon\} \tag{6}$$

where the infimum is taken over all measure couplings $\mu$ only.
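The inner infimum in (5) and (6) is a Ky-Fan-type level: the smallest $\varepsilon$ such that the mass of the set where the discrepancy is at least $\varepsilon$ does not exceed $\varepsilon$. Since the tail mass is non-increasing in $\varepsilon$, the feasible $\varepsilon$ form an up-set and a simple grid scan finds the level. A sketch for the Eurandom variant (6), on a toy pair of two-point spaces of my own with one hypothetical coupling fixed:

```python
import numpy as np

# toy mm-spaces: X = {0, 1}, Y = {0, 2}, uniform measures, and the
# "identity" measure coupling (a hypothetical choice, for illustration)
dX = np.array([[0.0, 1.0], [1.0, 0.0]])
dY = np.array([[0.0, 2.0], [2.0, 0.0]])
mu = np.array([[0.5, 0.0], [0.0, 0.5]])

def eurandom_level(mu, dX, dY, grid=np.linspace(1e-9, 2.0, 200001)):
    # smallest eps on the grid with (mu ⊗ mu)(|dX - dY| >= eps) <= eps;
    # the feasible eps form an up-set since the tail mass is non-increasing
    diff = np.abs(dX[:, None, :, None] - dY[None, :, None, :])  # index [i,j,k,l]
    w = mu[:, :, None, None] * mu[None, None, :, :]             # mu ⊗ mu
    for eps in grid:
        if w[diff >= eps].sum() <= eps:
            return eps

print(eurandom_level(mu, dX, dY))  # ~ 0.5 for this coupling
```

The outer infimum over all couplings would then be taken on top of this, e.g. by the same grid search over the coupling polytope as before.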
3. A very simple example
The calculation of the proposed metrics by hand can be quite cumbersome. Let’s look at the simplest example.
We consider the metric spaces $X = Y = \mathbb{R}$ (with the euclidean metric) accompanied with the measures $\mu_X = \delta_{x_0}$ and $\mu_Y = \delta_{y_0}$ for some points $x_0$ and $y_0$. In this case there is only one measure coupling of $\mu_X$ and $\mu_Y$, namely

$$\mu = \delta_{(x_0,y_0)}.$$
Now it is easy to calculate the variant (4) of the Gromov-Wasserstein metric: the double integral collapses to the single point $x = x' = x_0$, $y = y' = y_0$, and hence the metric is

$$\tfrac{1}{2}\,\big(|d_X(x_0,x_0) - d_Y(y_0,y_0)|^p\big)^{1/p} = 0.$$
Let’s have a look at the variant (3): Since there is only one measure coupling, the metric is

$$\inf_{d} \Big(\int_{X\times Y} d(x,y)^p\,d\delta_{(x_0,y_0)}(x,y)\Big)^{1/p} = \inf_{d}\, d(x_0,y_0)$$

where the infimum is taken over all metric couplings $d$.
As we have learned in Example 4 in the previous post, we can find a metric coupling of $d_X$ and $d_Y$ that brings the points $x_0$ in $X$ and $y_0$ in $Y$ arbitrarily close together (by embedding both $X$ and $Y$ into some $\mathbb{R}^2$ such that these points are only $\varepsilon$-far away from each other). Hence, we see that we have

$$\inf_{d}\, d(x_0,y_0) = 0,$$

similarly to the variant (4).
Now let’s look at the Gromov-Prokhorov metric from (5). Again we only have one measure coupling and we get

$$\inf_{d}\,\inf\{\varepsilon>0 : \delta_{(x_0,y_0)}(\{(x,y) : d(x,y)\ge\varepsilon\}) \le \varepsilon\}.$$

Since the measure coupling is a Dirac, we can evaluate

$$\delta_{(x_0,y_0)}(\{(x,y) : d(x,y)\ge\varepsilon\}) = \begin{cases} 1 & \text{if } d(x_0,y_0)\ge\varepsilon\\ 0 & \text{else.}\end{cases}$$

As observed previously, there are metric couplings which bring the points $x_0$ and $y_0$ arbitrarily close together, and hence for any $\varepsilon>0$ there is a metric coupling $d$ such that $d(x_0,y_0) < \varepsilon$, which shows that this metric also evaluates to $0$.
Finally, consider the second variant of the Gromov-Prokhorov metric from (6). Since we only have the one measure coupling we have

$$\inf\{\varepsilon>0 : \delta_{(x_0,y_0)}\otimes\delta_{(x_0,y_0)}(\{((x,y),(x',y')) : |d_X(x,x') - d_Y(y,y')|\ge\varepsilon\}) \le \varepsilon\}.$$

Evaluating the tensored Dirac delta is easy: its value is either one or zero, and it is one if and only if the point $((x_0,y_0),(x_0,y_0))$ is in the set that is measured. However, we have $|d_X(x_0,x_0) - d_Y(y_0,y_0)| = 0$, which is, of course, never larger than any $\varepsilon>0$. Hence, the measure of the set is always zero and hence, this version of the Gromov-Prokhorov distance also gives $0$.
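A one-line numerical sanity check of this case (with hypothetical points $x_0 = 1$ and $y_0 = 5$):

```python
# with the unique coupling delta_{(x0, y0)}, the integrand of variant (6)
# is evaluated only at x = x' = x0 and y = y' = y0
x0, y0 = 1.0, 5.0
dX = lambda a, b: abs(a - b)   # euclidean metric on the line
discrepancy = abs(dX(x0, x0) - dX(y0, y0))
print(discrepancy)  # -> 0.0, so no mass ever lands in the set {|...| >= eps}
```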
Note that all four metrics cannot see that the two Diracs are at different points. The reason seems to be that one can “deform the metric space outside of the support of the measures” arbitrarily, in some sense.
It seems that the computations of (4) and (6) were easier, since one only needs to know all measure couplings and no metric couplings are involved.
4. Next difficult example: Compare some mm-space to a point
Ok, we have seen that mm-spaces which carry their measure in just a single point all look alike in both versions of the Gromov-Wasserstein metric and also in both versions of the Gromov-Prokhorov metric. Let’s look at a slightly more difficult example: We consider some mm-space $(X,d_X,\mu_X)$ and want to calculate its distance to a single point, i.e. the mm-space $(\{p\}, 0, \delta_p)$ with the only possible metric and measure. This should somehow measure the “size” or “spread” of the mm-space $X$.
First, we need to know all measure couplings and all metric couplings between these spaces. The measure couplings are very easy: There is just one, namely

$$\mu = \mu_X\otimes\delta_p$$

(i.e. all subsets of $X\times\{p\}$ are treated as if they were subsets of $X$). Concerning metric couplings, there are a few more. We allow semi-metrics $d$ on the disjoint union $X\sqcup\{p\}$: Since $d$ should respect the metric on $X$ we see that all metric couplings are parametrized by the points $z$ in $X$ by identifying $p$ (the element in $\{p\}$) with this point $z$, i.e. all metric couplings are of the form $d_z$ defined by

$$d_z(x,x') = d_X(x,x'),\qquad d_z(x,p) = d_z(p,x) = d_X(x,z),\qquad d_z(p,p) = 0.$$

(This only gives semi-metrics since we have $d_z(z,p) = 0$ although, formally, $z\ne p$.)
Let’s calculate the first Gromov-Wasserstein metric: There is only one measure coupling and we use the parametrization of the metric couplings to deduce

$$\inf_{z\in X} \Big(\int_X d_X(x,z)^p\,d\mu_X(x)\Big)^{1/p}.$$

This quantity seems to be known as the “minimal $p$-th central moment” of $(X,d_X,\mu_X)$.
The second variant of the Gromov-Wasserstein metric is (remember, there is only one measure coupling)

$$\tfrac{1}{2}\Big(\int_X\int_X d_X(x,x')^p\,d\mu_X(x)\,d\mu_X(x')\Big)^{1/p}.$$

This quantity (without the factor $\tfrac{1}{2}$) is called the “$p$-diameter” of $(X,d_X,\mu_X)$.
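Both quantities are straightforward to evaluate for a discrete mm-space. A small sketch with toy data of my own (three weighted points on the line, $p = 2$), comparing the minimal $p$-th central moment with half the $p$-diameter:

```python
import numpy as np

# hypothetical discrete mm-space on the line: atoms x with weights w
x = np.array([0.0, 1.0, 3.0])
w = np.array([0.5, 0.3, 0.2])
p = 2

# first variant (distance to a point): minimal p-th central moment
# inf_z ( sum_i w_i d(x_i, z)^p )^(1/p), approximated by a grid search
zs = np.linspace(-1.0, 4.0, 5001)
central = min((w @ np.abs(x - z) ** p) ** (1 / p) for z in zs)

# second variant: half the p-diameter ( sum_{i,j} w_i w_j d(x_i, x_j)^p )^(1/p)
D = np.abs(x[:, None] - x[None, :]) ** p
half_diam = 0.5 * (w @ D @ w) ** (1 / p)

# for p = 2 the central moment is minimized at the mean and equals the
# standard deviation, while the squared 2-diameter is twice the variance
print(central, half_diam)
```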
Let’s turn to the Gromov-Prokhorov metrics. The first one is (remember that the metric couplings are parametrized by the points $z$ in $X$)

$$\inf_{z\in X}\,\inf\{\varepsilon>0 : \mu_X(\{x : d_X(x,z)\ge\varepsilon\}) \le \varepsilon\}.$$

If this looks familiar, then you may have encountered the Ky Fan metric already. The Ky Fan metric is a metric between random variables $\xi$ and $\eta$ defined on the same probability space $(\Omega,P)$ with values in a metric space with metric $d$. It reads as

$$d_{KF}(\xi,\eta) = \inf\{\varepsilon>0 : P(\{d(\xi,\eta) > \varepsilon\}) \le \varepsilon\}.$$

Hence, the first version of the Gromov-Prokhorov metric is

$$\inf_{z\in X} d_{KF}(\mathrm{id}, z),$$

i.e., the minimal Ky Fan distance (on the probability space $(X,\mu_X)$) between the identity mapping and the constant mappings. (In other words, it measures how far the identity is from the constant mappings in the sense of Ky Fan.)
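The Ky Fan distance is easy to evaluate for discrete random variables, since the feasible $\varepsilon$ form an up-set (the tail probability is non-increasing in $\varepsilon$). A sketch with made-up data:

```python
import numpy as np

# hypothetical sample space with three atoms, and two real-valued
# random variables X, Y given by their values on the atoms
probs = np.array([0.2, 0.3, 0.5])
X = np.array([0.0, 1.0, 2.0])
Y = np.array([0.0, 1.5, 2.0])

def ky_fan(dist, probs, grid=np.linspace(0.0, 1.0, 10001)):
    # smallest eps on the grid with P(d(X, Y) > eps) <= eps
    for eps in grid:
        if probs[dist > eps].sum() <= eps:
            return eps

d = np.abs(X - Y)            # pointwise distances: [0, 0.5, 0]
print(ky_fan(d, probs))      # -> 0.3 (up to grid resolution)
```

Here X and Y differ only on the middle atom, which has probability 0.3, so the tail probability drops from 0.3 to 0 at $\varepsilon = 0.5$ and the Ky Fan level is 0.3.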
The second variant of the Gromov-Prokhorov metric is (remember, the only measure coupling is $\mu_X\otimes\delta_p$)

$$\inf\{\varepsilon>0 : \mu_X\otimes\mu_X(\{(x,x') : d_X(x,x')\ge\varepsilon\}) \le \varepsilon\}.$$

I do not have a neat name or a good intuition for this metric yet (although it also looks like it measures “size” or “non-localization” of $(X,d_X,\mu_X)$ in some sense). If you have one, let me know!