With this post I delve into a topic which is somehow new to me, although I planned to look deeper into this for quite some time already. I stumbled upon the paper Gromov-Wasserstein distances and the metric approach to object matching by Facundo Mémoli which was a pleasure to read and motivated this post.

1. Comparing measures with norms and metrics

There are different notions in mathematics to compare two objects, think of the size of real numbers, the cardinality of sets or the length of the difference of two vectors. Here we will deal with not only comparison of objects but with “measures of similarity”. Two fundamental notions for this are norms in vector spaces and metrics. The norm is the stronger concept in that it uses more structure than a metric and also, every norm induces a metric but not the other way round. There are occasions in which both a norm and a metric are available but lead to different concepts of similarity. One of these instances occurs in sparse recovery, especially in the continuous formulation, e.g. as described in a previous post. Consider the unit interval {I = [0,1]} and two Radon measures {\mu_1} and {\mu_2} on {I} ({I} could also be an aritrary metric space). On the space of Radon measures {\mathfrak{M}(I)} there is the variation norm

\displaystyle \|\mu\|_{\mathfrak{M}}= \sup_\Pi\sum_{A\in\Pi}|\mu(A)|

where the supremum is taken over all partitions {\Pi} of {I} into a finite number of measurable sets. Moreover, there are different metrics one can put on the space of Radon measures, e.g. the Prokhorov metric which is defined for two probability measures (e.g. non-negative ones with unit total mass)

\displaystyle  \begin{array}{rcl}  d_P(\mu_1,\mu_2) & = & \inf\{\epsilon>0\ :\ \mu_1(A)\leq \mu_2(A^\epsilon) + \epsilon,\nonumber\\ & & \qquad \mu_2(A)\leq \mu_1(A^\epsilon) + \epsilon\ \text{for all measurable}\ A\} \end{array}

where {A^\epsilon} denotes the {\epsilon}-neighborhood of {A}. Another familiy of metrics are the Wasserstein metrics: For {p\geq 1} define

\displaystyle d_{W,p}(\mu_1,\mu_2) = \Big(\inf_\nu\int_{I\times I} |x-y|^p d\nu(x,y)\Big)^{1/p} \ \ \ \ \ (1)

where the infimum is taken over all measure couplings of {\mu_1} and {\mu_2}, that is, all measures {\nu} on {I\times I} such that for measurable {A} it holds that

\displaystyle \nu(A\times I) = \mu_1(A)\ \text{and}\ \nu(I\times A) = \mu_2(A).

Example 1 We compare two Dirac measures {\mu_1 = \delta_{x_1}} and {\mu_2 = \delta_{x_2}} located at distinct points {x_1\neq x_2} in {I} as seen here:

058_two_diracsThe variation norm measures their distance as

\displaystyle \|\mu_1-\mu_2\|_{\mathfrak{M}} = \sup_\Pi\sum_{A\in\Pi}|\delta_{x_1}(A) - \delta_{x_2}(A)| = 2

(choose {\Pi} such that it contains {A_1} and {A_2} small enough that {x_1\in A_1}, {x_2\in A_2} but {x_1\notin A_2} and {x_2\notin A_1}). The calculate the Prokhorov metric note that you only need to consider {A}‘s which contain only one of the points {x_{1/2}} and hence, it evaluates to

\displaystyle d_P(\mu_1,\mu_2) = |x_1-x_2|.

For the Wasserstein metric we observe that there is only one possible measure coupling of {\delta_{x_1}} and {\delta_{x_2}}, namely the measure {\nu = \delta_{(x_1,x_2)}}. Hence, we have

\displaystyle d_{W,p}(\mu_1,\mu_2) = \Big(\int_{I\times I}|x-y|^pd\delta_{(x_1,x_2)}(x,y)\Big)^{1/p} = |x_1-x_2|.

The variation norm distinguishes the two Diracs but is not able to grasp the distance of their supports. On the other hand, both metrics return the geometric distance of the supports in the underlying space {I} as distance of the Diracs. Put in pictures: The variation norm of the difference measures the size ob this object

058_two_diracs_difference

while both metrics capture the distance of the measures like here

Two diracs

It should not stay unnoted that convergence in both the Prokhorov metric and the Wasserstein metrics is exactly the weak convergence of probability measures.

The above example provides a motivation to study metric structures on spaces, even if they are also equipped with a norm. Another reason to shift attention from normed spaces to metric spaces is the fact that there has emerged a body of work to build a theory of analysis in metric spaces (see, e.g. this answer on mathoverflow or the book Gradient Flows: In Metric Spaces And In The Space Of Probability Measures by Ambrosio, Gigli and Savaré (which puts special emphasis on the space of probability measures)). Yet another motivation for the study of metrics in this way is the problem of comparing shapes (without being precisely defined yet): Which of these shapes look most alike?

058_shapes

(Note that shapes need not to be two dimensional figures, you may also think of more complex objects like surfaces in three dimensions or Riemannian manifolds.)

One may also ask the question how two compare different images defined on different shapes, i.e. different “distributions of colour” on two different shapes.

2. Comparing shapes: Metric spaces

Up to now we tried to compare different measures, defined on the same set. At least to me it seems that both the Prokhorov and the Wasserstein metrics are suited to measure the similarity of measures and in fact, they do so somehow finer than the usual norm does.

Let’s try to go one step further and ask ourselves, how we could compare two measures {\mu_1} and {\mu_2} which are defined on two different sets? While thinking about an answer one need to balance several things:

  • The setup should be general enough to allow for the comparison of a wide range of objects.
  • It should include enough structure to allow meaningful statements.
  • It should lead to a measure which is easy enough to handle both analytically and computationally.

For the first and second bullet: We are going to work with measures not on arbitrary sets but on metric spaces. This will allow to measure distances between points in the sets and, as you probably know, does not pose a severe restriction. Although metric spaces are much more specific than topological spaces, we still aim at quantitative measures which are not provided by topologies. With respect to the last bullet: Note that both the Prokhorov and the Wasserstein metric are defined as infimums over fairly large and not too well structured sets (for the Prokhorov metric and need to consider all measurable sets and their {\epsilon}-neighborhoods, for the Wasserstein metric, one need to consider all measure couplings). While they can be handled quite well theoretically, their computational realization can be cumbersome.

In a similar spirit than Facundo Memoli’s paper we work our way up from comparing subsets of metric spaces up to comparing two different metric spaces with two measures defined on them.

2.1. Comparing compact subsets of a metric space: Hausdorff

Let {(X,d)} be a compact metric space. Almost hundred years ago Hausdorff introduced a metric on the family of all non-empty compact subsets of a metric space as follows: The Hausdorff metric of two compact subsets {A} and {B} of {X} is defined as

\displaystyle d_H(A,B) = \inf\{\epsilon>0 \ :\ A\subset B_\epsilon,\ B \subset A_\epsilon\}

(again, using the notion of {\epsilon}-neighborhood). This definition seems to be much in the spirit of the Prokhorov metric.

Proposition 2.1 in Facundo Memolis paper shows that the Hausdorff metric has an equivalent description as

\displaystyle d_H(A,B) = \inf_R \sup_{(a,b) \in R} d(a,b)

where the infimum is taken over all correspondences {R} of {A} and {B}, i.e., all subset {R\subset A\times B} such that for all {a\in A} there is {b\in B} such that {(a,b) \in R} and for all {b\in B} there {a\in A} such that {(a,b)\in R}. One may also say set coupling of {A} and {B} instead of correspondence.

Example 2 There is always the full coupling {R = A\times B}. Three different set couplings of two subsets {A} and {B} of the unit interval are shown here:

058_set_coupling

the “full one” {A\times B} in green and two “slim” ones in red and orange. Other “slim” couplings can be obtained from surjective mappings {f:A\rightarrow B} by {R = \{(a,f(a))\ :\ a\in A\}} (or with the roles of {A} and {B} swapped): If you couple a set {A} with itself, there is also the trivial coupling

\displaystyle R = \{(a,a)\ : \ a\in A\}

which is just the diagonal of {A\times A}

Note that the alternative definition of the Hausdorff metric is more in the spirit of the Wasserstein metric: It does not use enlarged objects (by {\epsilon}-neighborhoods) but couplings.

The Hausdorff metric is indeed a metric on the set {\mathfrak{C}(X)} of all non-empty compact subsets of a metric space {X} and if {X} itself is compact it even holds that {(\mathfrak{C}(X),d_H)} is a compact metric space (a result, known as Blaschke Selection Theorem).

One may say that we went up an abstraction ladder one step by moving from {(X,d)} to {(\mathfrak{C}(X),d_H)}.

2.2. Comparing compact metric spaces: Gromov-Hausdorff

In the previous subsection we worked within one metric space {X}. In the book “Metric Structures for Riemannian and Non-Riemannian Spaces” Misha Gromov introduced a notion to compare two different metric spaces. For compact metric space {X} and {Y} the Gromov-Hausdorff metric is defined as

\displaystyle d_{GH}(X,Y) = \inf_{Z,f,g} d_H(f(X),g(Y)) \ \ \ \ \ (2)

where the infimum is taken over

  • all metric spaces {Z} and
  • all isometric embeddings {f} and {g} which embed {X} and {Y} into {Z} respectively.

In words: To compute the Gromov-Hausdorff metric, you try embed both {X} and {Y} into a common larger space isometrically such that they are as close as possible according to the Hausdorff metric in that space.

Strictly speaking, the above definition is not well stated as one can not form an infimum over all metric spaces since this collection does not form a set according to the rules of set theory. More precisely one should write that {d_{GH}(X,Y)} is the infimum over all {r>0} such that there exists a metric space {Z} and isometric embeddings {f} and {g} of {X} and {Y}, respectively, such that {d_H(f(X),g(Y))<r}.

As the Hausdorff metric could be reformulated with set couplings there is a reformulation of the Gromov-Hausdorff metric based on metric couplings: A metric coupling of two metric spaces {(X,d_X)} and {(Y,d_Y)} is a metric {d} on the disjoint union {X\sqcup Y} of {X} and {Y} such that for all {x,x'\in X} and {y,y'\in Y} it holds that {d(x,x') = d_X(x,x')} and {d(y,y') = d_Y(y,y')}.

Example 3 We couple a metric space {(X,d)} with itself. We denote with {(X',d')} an identical copy of {(X,d)} and look for a metric {D} on {X\times X'} that respects the metrics {d} and {d'} in the way a metric coupling has to.

To distinguish elements from {X} and {X'} we put a {'} on all quantities from {X'}. Moreover, for {x\in X} we denote by {x'} its identical copy in {X'} (and similarly for {x'\in X'}, {x} is its identical twin). Then, for any {\epsilon>0} we can define {D_\epsilon(x,x') = D_\epsilon(x',x) = \epsilon} (i.e. the distance between any two identical twins is {\epsilon}. By the triangle inequality we get for {x\in X} and {y'\in X'} that {D_\epsilon(x,y')} should fulfill

\displaystyle D_\epsilon(x',y') - D_\epsilon(x',x) \leq D_\epsilon(x,y') \leq D_\epsilon(x,y) + D_\epsilon(y,y')

and hence

\displaystyle d(x,y) - \epsilon \leq D_\epsilon(x,y') \leq d(x,y) + \epsilon.

Indeed we can choose {D_\epsilon(x,y') = d(x,y)} if {x\in X} and {y'\in Y} leading to one specific metric coupling for any {\epsilon}. This couplings allow to distinguish identical twins and behave as a metric on the whole disjoint union. In the limiting case {\epsilon\rightarrow 0} we do not obtain a metric but a semi-metric or pseudo-metric which is just the same as a metric but without the assumption that {d(x,y) = 0} implies that {x=y}.

Example 4 The above example of a metric coupling of a metric space with itself was somehow “reproducing” the given metric as accurate as possible. There are also other couplings that put very different distances to points {D(x,y')} and there is also a way to visualize metric couplings: When building the disjoint union of two metric spaces {X} and {Y}, you can imagine this as isometrically embedding both in a larger metric space {Z} in a non-overlapping way and obtain the metric coupling {D} as the restriction of the metric on {Z} to {X\sqcup Y}. For {X=Y=[0,1]} you can embed both into {Z = {\mathbb R}^2}. A metric coupling which is similar (but not equal) to the coupling of the previous example is obtained by putting {X} and {Y} side by side at distance {\epsilon} as here (one space in green, the other in blue).

058_metric_coupling_embedding1

A quite different coupling is obtained by putting {X} and {Y} side by side, but in a reversed way as here:

058_metric_coupling_embedding2

You may even embed them in a more weired way as here:

058_metric_coupling_embedding3

but remember that the embeddings has to be isometric, hence, distortions like here are not allowed.

058_metric_coupling_embedding4This example illustrate that the idea of metric coupling is in similar spirit as of “embedding two spaces in a common larger one”.

With the notion of metric coupling, the Gromov-Hausdorff metric can be written as

\displaystyle d_{GH}(X,Y) = \inf_{R,d} \sup_{(x,y)\in R} d(x,y) \ \ \ \ \ (3)

where the infimum is taken over all set couplings {R} of {X} and {Y} and all metric couplings {d} of {(X,d_X)} and {(Y,d_Y)}.

In words: To compute the Gromov-Hausdorff metric this way, you look for a set coupling of the base sets {X} and {Y} and a metric coupling {d} of the metrics {d_X} and {d_Y} such that the maximal distance of two coupled points {x} and {y} is as small as possible. While this may look more complicated than the original definition from~(2), note that the original definition uses all metric spaces {Z} in which you can embed {X} and {Y} isometrically, which seems barely impossible to realize. Granted, the new definition also considers a lot of quantities.

Also note that this definition is in spirit of the Wasserstein metric from~(1): If there were natural measures {\mu_R} on the set couplings {R} we could write \begin{equation*} d_{GH}(X,Y) = \inf_{R,d} \Big(\int d(x,y)^pd\mu_R\Big)^{1/p} \end{equation*} and in the limit {p\rightarrow\infty} we would recover definition~(3).

Example 5 The Gromov-Hausdorff distance of a metric space {(X,d_X)} to itself is easily seen to be zero: Consider the trivial coupling {R = \{(x,x)\ :\ x\in X\}} from Example~2 and the family {D_\epsilon} of metric couplings from Example~3. Then we have {d_{GH}(X,X) \leq \epsilon} for any {\epsilon >0} showing {d_{GH}(X,X) = 0}. Let’s take one of the next-complicated examples and compute the distance of {X = [0,1]} and {Y=[0,2]}, both equipped with the euclidean metric. We couple the sets {X} and {Y} by {R = \{(x,2x)\ : \ x\in X\}} and the respective metrics by embedding {X} and {Y} into {{\mathbb R}^2} as follows: Put {Y} at the line from {(0,0)} to {(2,0)} and {X} at the line from {(\tfrac12,\epsilon)} to {(1\tfrac12,\epsilon)}:

058_gromov_hausdorff_12

This shows that {d_{GH}(X,Y) \leq \tfrac12} and actually, we have equality here.

There is another reformulation of the Gromov-Hausdorff metric, the equivalence of which is shown in Theorem 7.3.25 in the book “A Course in Metric Geometry” by Dmitri Burago, Yuri Burago and Sergei Ivanov:

\displaystyle d_{GH}(X,Y) = \tfrac12\inf_R \sup_{\overset{\overset{x_{1/2}\in X}{y_{1/2}\in Y}}{(x_i,y_i)\in R}}\big| d_X(x_1,x_2) - d_Y(y_1,y_2)\big| \ \ \ \ \ (4)

where the infimum is taken over all set couplings {R} of {X} and {Y}.

In words: Look for a set coupling such that any two coupled pairs {(x_1,y_1)} and {(x_2,y_2)} have the “most equal” distance.

This reformulation may have the advantage over the form (3) in that is only considers the set couplings and the given metrics {d_X} and {d_Y} and no metric coupling is needed.

Note that, as the previous reformulation~(3), it is also in the spirit of the Wasserstein metric: If there were natural measures {\mu_R} in the set couplings {R}, we could write

\displaystyle d_{GH}(X,Y) = \tfrac12\inf_R \Big(\int_{R\times R}\big| d_X(x_1,x_2) - d_Y(y_1,y_2)\big|^p d\mu_R(x_1,y_1)d\mu_R(x_2,y_2)\Big)^{1/p}.

and recover the formulation~(4) in the limit {p\rightarrow\infty}.

One may say that we went up an abstraction ladder one step further by moving from {(X,d)} to {(\mathfrak{C}(X),d_H)} to {(\text{All compact metric spaces},d_{GH})}.

Since this post has been grown pretty long already, I decided to do the next step (which is the already announced metric on metric spaces which additionally carry some measure on them – so-called metric measure spaces) in a later post.

While this blog is intended to be totally job-related, this post will be unusual in that it will be a little bit personal. And indeed, this post is about the mix of work and non-work life.

A little more than half a year ago I started my life as a part-time mathematician. To be more precise, I am in Elternzeit, which is the German framework for parental leave. According to German law, parents have several rights related to their work when they get kids. One of them is to reduce their weekly hours of work when they have small children and this is what I did: For one year I reduced my weekly hours of work to 50%. The reason for the reduction is that my wife has a very busy year of work (for those who know: she is in her “Referendariat” to become a teacher – but I am not going to rant about the German Referendariat for teachers here…). There is also Elterngeld is Germany: That is, that you get some fraction of your salary if you stay at home to be with your kids, but this only applies in the first 14 months but my kids are older and hence, I just get about half the salary during the Elternzeit. But one of the first things I learned was that this is not totally true. In fact, my “parental-leave”-salary is higher than half of my usual salary. This has two reasons: First, there is “income-tax-progression” here, that is, for more salary, you pay a higher tax rate. Second, some parts of my salary are bonuses which are not halved.

Besides the financial issues there are other things about parental leave as a mathematician at a university which I find interesting and worth blogging about. I divided these issues into two parts: work-related and family related.

1. Work life

In this year of parental leave I am in my office for about four hours a day. I go there, when the kids are in school and at day-care and I leave early enough to prepare lunch for our schoolgirl. Four hours a day is not much and indeed, I get much less things done, than I used to do. But that was totally expected. To counteract a bit, I do a little less reviewing papers. Moreover, my teaching duties are halved. And also, blogging is a little bit reduced. But anyway, I do not make it to get all the things done, that I would like to. However, this is not much different from how it used to be. And with even less time, I get better in prioritizing.

But more important than the mere reduction of time for work is the fact that I am not able to be in the office in the afternoon, and I am not able to make appointments or to attend meetings in the afternoon. This does influence my daily work to some extend. While all my colleagues know that I am on part-time (and, as far as I experience, have no objections to that), they usually do not have it in mind when they make dates. If I am lucky, dates are found with doodle, and then it’s easy for me, but I have to remind people that I do not have time in the afternoon frequently. Moreover, some regular meetings are scheduled for the afternoon by default and I just miss them. Finally, there are exceptional dates (e.g. the ones related with job interviews and negotiations on which I have written here and here). In these cases, I also mentioned my limited possibilities for meetings and the severe restrictions. For the job interviews I even asked for a video-interview, but that did not work out. On one occasion, they tried to schedule my job interview for the weekend (which would be easier to arrange for me) but some members of the committee objected due to family reasons – which is totally OK in my opinion. For the negotiations, I tried to have them during school holidays but this did not work out either. Hence, we had to arrange additional child care (our children have four grand parents and always some of them could step in – what I didn’t try was, to get reimbursed for the train tickets for the grandmas; I think this would be reasonable but would cause serious irritation in the administration).

Another severe change in my work-life is that I can not attend conferences and meetings with the only exception when they are during school holidays. Here we arranged holidays of the kids at their grandparents and this way I could attend some nice meetings this year.

2. Family life

In my parental leave my wife and I changed roles. It is not that I did not care about family life, household and such when I was working full-time but still, I was working full-time and my wife was working half-time or less. Since she has to work a LOT, I am in charge of everything here at home now. But, as before, it is not that she is not doing anything in that respect, but the important thing is that I am in charge. And that makes more of a difference than I anticipated. The pure amount of work doing shopping, cooking, cleaning and so on is OK because I have the time to do so (although – it is work nonetheless). But to be in charge is a bit more. Especially, if there is something to be done, I can not say “Let’s wait for one day, maybe it will be done by the time…”. If I will not do it, nobody will (again: Not totally true, but this is the way of thinking which I employ). Moreover, there are some things which come with being “house husband” which did not come to my mind when I planned my part time year: organizing children’s birthday parties, arranging new afternoon activities and weekend activities or trips, taking care of seasonal decoration, thinking of Christmas presents early enough or arranging routine medical check-ups on time are just a few of them. All this is also not hard, but harder than I thought, when you are in charge.

In total, all this role change and house-keeping is a very refreshing experience. Being a house husband is not something be scared of. It is rewarding, it’s mostly fun, it’s a lot of work and it is quite some responsibility.

As a final remark, I think, that this year of my parental leave does also mean something for our children. Seeing that dad is at home and being the one which is always there, and that both parents can do this job and that, the other way round, mom is at work most of the day, comes home late and needs rest and such, is an experience for them which shows them that there are different ways to organize a family. It probably also influences them in the way they see gender roles.

In conclusion: If you think about staying at home for kids and the laws of your country give you the opportunity to do so, then I can recommend it. Even being fully in charge and with a full-time working wife – probably even more so.

There are several answers to the following question:

1. What is a convex set?

For a convex set you probably know these definitions:

Definition 1 A subset {C} of a real vector space {V} is convex if for any {x,y\in C} and {\lambda\in[0,1]} it holds that {\lambda x + (1-\lambda)y\in C}.

In other words: If two points lie in the set, then every convex combination also lies in the set.

While this is a “definition from the inside”, convex sets can also be characterized “from the outside”. We add closedness as an assumption and get:

Definition 2 A closed subset {C} of a real locally convex topological vector space {V} is convex if it is the intersection of closed halfspaces (i.e. sets of the form {\{x\in V\ :\ \langle a,x\rangle\geq c\}} for some {a} in the dual space {V^*} and {c\in {\mathbb R}}).

Moreover, we could define convex sets via convex functions:

Definition 3 A set {C\subset V} is convex if there is a convex function {f:V\rightarrow {\mathbb R}} such that {C = \{x\ :\ f(x)\leq 0\}}.

Of course, this only makes sense once we have defined convex functions. Hence, we could also ask the question:

2. What is a convex function?

We can define a convex function by means of convex sets as follows:

Definition 4 A function {f:V\rightarrow{\mathbb R}} from a real vector space into the real numbers is convex, if its epigraph {\text{epi} f = \{(x,\mu)\ :\ f(x)\leq \mu\}\subset V\times {\mathbb R}} is convex (as a subset of the vector space {V\times{\mathbb R}}).

The epigraph consists of the points {(x,\mu)} which lie above the graph of the function and carries the same information as the function.

(Let me note that one can replace the real numbers here and in the following with the extended real numbers {\bar {\mathbb R} = {\mathbb R}\cup\{-\infty,\infty\}} if one uses the right extension of the arithmetic and the obvious ordering, but we do not consider this in this post.)

Because epigraphs are not arbitrary convex sets but have a special form (if a point {(x,\mu)} is in an epigraph, then every {(x,\lambda)} with {\lambda\geq \mu} is also in the epigraph), and because the underlying vector space {V\times {\mathbb R}} comes with an order in the second component, some of the definitions for convex sets from above have a specialized form:

From the definition “convex combinations stay in the set” we get:

Definition 5 A function {f:V\rightarrow {\mathbb R}} is convex, if for all {x,y\in V} and {\lambda\in [0,1]} it holds that

\displaystyle f(\lambda x + (1-\lambda)y) \leq \lambda f(x) + (1-\lambda)f(y).

In other words: The secant of any two points on the graph lies above the graph.

From the definition by “intersection of half spaces” we get another definition. Since we added closedness in this case we add this assumption also here. However, closedness of the epigraph is equivalent to lower-semicontinuity (lsc) of the function and since lsc functions are very convenient we use this notion:

Definition 6 A function {f:V\rightarrow{\mathbb R}} is convex and lsc if it is the pointwise supremum of affine functions, i.e., for some set {S} of tuples {(a,c)\in V^*\times {\mathbb R}} it holds that

\displaystyle f(x) = \sup_{(a,c) \in S} \langle a,x\rangle + c.

A special consequence if this definition is that tangents to the graph of a convex function lie below the function. Another important consequence of this fact is that the local behavior of the function, i.e. its tangent plane at some point, carries some information about the global behavior. Especially, the property that the function lies above its tangent planes allows one to conclude that local minima of the function are also global. Probably the last properties are the ones which give convex functions a distinguished role, especially in the field of optimization.

Some of the previous definitions allow for generalizations into several direction and the quest for the abstract core of the notion of convexity has lead to the field of abstract convexity.

3. Abstract convexity

When searching for abstractions of the notion of convexity one may get confused by the various different approaches. For example there are generalized notions of convexity, e.g. for function of spaces of matrices (e.g. rank-one convexity, quasi-convexity or polyconvexity) and there are also further different notions like pseudo-convexity, invexity or another form ofquasi-convexity. Here we do not have generalization in mind but abstraction. Although both things are somehow related, the way of thinking is a bit different. Our aim is not to find a useful notion which is more general than the notion of convexity but to find a formulation which contains the notion of convexity and abstracts away some ingredients which probably not carry the essence of the notion.

In the literature one also finds several approaches in this direction and I mention only my favorite one (and I restrict myself to abstractly convex functions and not write about abstractly convex sets).

To me, the most appealing notion of an abstract convex function is an abstraction of the definition as “pointwise supremum of affine functions”. Let’s look again at the definition:

A function {f:V\rightarrow {\mathbb R}} is convex and lsc if there is a subset {S} of {V^*\times {\mathbb R}} such that

\displaystyle f(x) = \sup_{(a,c)\in S} \langle a,x\rangle + c.

We abstract away the vector space structure and hence, also the duality, but keep the real-valuedness (together with its order) and define:

Definition 7 Let {X} be a set and let {W} be a set of real valued function on {X}. Then a function {f:X\rightarrow{\mathbb R}} is said to be {W}-convex if there is a subset {S} of {W\times {\mathbb R}} such that

\displaystyle f(x) = \sup_{(w,c)\in S} w(x) + c.

What we did in this definition was simply to replace continuous affine functions {x\mapsto \langle a,x\rangle} on a vector space by an arbitrary collection of real valued functions {x\mapsto w(x)} on a set. On sees immediately that for every function {w\in W} and any real number {c} the function {w+c} is {W} convex (similarly to the fact that every affine linear function is convex).

Another nice thing about this approach is, that it allows for some notion of duality/conjugation. For {f:V\rightarrow{\mathbb R}} we define the {W}-conjugate by

\displaystyle f^{W*}(w) = \sup_{w\in W} \Big(w(x)- f(x) \Big)

and we can even formulate a biconjugate

\displaystyle f^{W**}(x) = \sup_x \Big(w(x) - f^{W*}(w)\Big).

We naturally have a Fenchel inequality

\displaystyle w(x) \leq f(x) + f^{W*}(w)

and we may even define subgradients as usual. Note that a conventional subgradient is an element of the dual space which defines a tangent plane at the point where the subgradient is taken, that is, {a\in V^*} is a subgradient of {f} at {x}, if for all {y\in V} it holds that {f(y) \geq f(x) + \langle a,y-x\rangle} or

\displaystyle f(y) - \langle a,y\rangle\geq f(x) - \langle a,x\rangle.

A {W}-subgradient is an element of {W}, namely we define: {w\in\partial^{W}f(x)} if

\displaystyle \text{for all}\ y:\quad f(y) -w(y) \geq f(x) - w(x).

Then we also have a Fenchel equality:

\displaystyle w\in\partial^W f(x)\iff w(x) = f(x) + f^{W*}(w).

One may also take dualization as the starting point for an abstraction.

4. Abstract conjugation

We could formulate the {W}-conjugate as follows: For {\Phi(w,x) = w(x)} we have

\displaystyle f^{W*}(x) = \sup_w \Big(\Phi(w,x) - f(x)\Big).

This opens the door to another abstraction: For some sets {X,W} (without any additional structure) define a coupling function {\Phi: X\times W \rightarrow {\mathbb R}} and define the {\Phi}-conjugate as

\displaystyle f^{\Phi*}(w) = \sup_x \Big(\Phi(w,x) - f(x)\Big)

and the {\Phi}-biconjugate as

\displaystyle f^{\Phi**}(x) = \sup_x \Big(\Phi(w,x) - f^{W*}(w)\Big)

I case anybody wondered: I did not make it to produce a single post from Oberwolfach last week. For me it was an unusual dense week. Not only that there were many interesting talks on a tight schedule. The “talk-free” parts of the days I had a lot of interesting and stimulating discussions and the night life did not permit the writing posts either. Sorry – but you may probably read about the outcome of some of the discussions in future blog post…

Guess where this is:

Well, I spoiled the solution in the title anyway (and of course, some of you recognized the sculpture in the last picture): I am in Oberwolfach this week at the workshop “Computational Inverse Problems”. Maybe I’ll find some time to blog about some talks but I am not sure if I can make it. Usually the Oberwolfach workshops are quite intense and hence, blogging will be too much…

Anyway, the autumn looks great here these days:

In a previous post I wrote about my experience with three job interviews for math professorships in Germany I had earlier this year. In this post I’ll tell the next part of the story. As the previous post on this topic, this post is a mixture of a description of my own experience and the general procedure. I won’t spoil any names of people or institution here but if you are interested in more details, feel free to contact me.

1. The “Ruf”

It turned out that I actually got two job offers which are called “Ruf” in Germany (which translates to “call”). The first one arrived (unofficially) by phone once the committee had formed its list. Some time later I received another phone call when the list had passed the Fakultätsrat. The official letter from the Rektor needed some additional weeks to arrive. The other offer arrived first by email (from the Rektorats office) and by paper mail a few days later. The first thing I did after each offer arrived, was to contact the speaker of my math department (by phone), to inform my colleagues (by email) and to inform the dean and the president (both by email and paper mail). I am not sure about the precise protocol here, but I thought it would be best to put the cards of the table.

Both offer-letters were short and formal and the most important thing was that they asked me to prepare a document in which I list my “Vorstellungen zur Ausgestaltung und Ausstattung” (unofficially this is called the “list of wishes”) which is the next point I am going to discuss:

2. The “List of Wishes”

I had no real clue how this document should look like. I was lucky that I could ask friends and that they were so kind to send me documents they had prepared for similar occasions. However, I do not think that there any rules here. University Rektors, Kanzlers or Presidents have negotiations with people from all kinds of disciplines and I am sure that the list of wishes from neuro-biologists, mechanical engineers or philosophers will differ considerably from the one I wrote.

I think the list of wishes should contain:

  • A section about what you plan to do. I think this one is difficult because it should be concrete but honest and also be written in layman’s terms.
  • A section about the workgroup which you have in mind. Especially important: How many associated positions for PhD students/teaching assistants do you expect the university to provide? Do you bring any third party funded people?
  • A description of the office space you need.
  • A list of what you wish for as “start-up funding” (buying computers, laptops, furniture, books, servers,…).
  • Your wishes for a yearly budget (for traveling and guests).
  • What salary do you expect? Are there special reasons for a bonus?
  • Any other circumstances which you find important.

Moreover, you should propose a date (or several) for the negotiations.

I found this list of wishes particularly difficult because it depends largely on the structure and habits of the particular department/faculty and the university. Moreover, some things will be handled by the department and some things will be handled by the Rektorat and it is not always clear a-priori who is responsible for what. In any case it is helpful to speak to the dean of the faculty and any other people you know in the department to find out how this is organized. E.g. usually the organization of secretaries is basically fixed, as well as office space. It is in general a good idea (if not even necessary) to speak to the dean prior to the official negotiation because several things will be handled by the dean and not by the Rektor and if you did not agree with the dean about these things you can get in awkward situations in the negotiations. The salary is always discussed with the Rektor and the Kanzler (not with the Dean), there are a lot of rules according to which salary of professors is organized in Germany. There are federal laws, and university laws concerning the base salary and any additional bonuses. Moreover, these bonuses can be given permanently or temporary. If they are temporary then can be linked to a “target agreement” and this target agreement can or can not contain that the bonus will turn into permanent in case of success… I’ve heard that the base rule here is that it is better to overshoot than to undershoot.

Once you’ve sent your list of wishes you have to wait again. The Rektorat has to find a date for the negotiation and since several important people are involved (to be precise: Rektor, Kanzler and Dean), this may take some time. But after some time, I was contacted (by mail or phone) to agree on a date for the next important step: The negotiation.

3. The negotiation

In both case I was a little bit nervous when the day of negotiation came. As preparation I read several “Gesetzesblätter” again and moreover, tried to get information about the negotiation partners (especially the Rektor and the Kanzler because I knew the Deans already).

The actual negotiations took something between half an hour and one hour and besides Rektor, Kanzler, Dean and myself there was somebody from personnel administration. In both cases the conversation was basically conducted by the Rektor and started with some small talk.

In my first negotiation, the Rektor opened the official part by asking me to explain my research plans in layman’s terms. He seemed honestly interested and tried to suggest connections with other areas of expertise of the university. Next we moved into the list of wishes and then the Kanzler took over. We went over my list of wishes part by part and it was basically that the Kanzler said what the university could offer and what not. Sometimes the dean of the faculty was asked if the faculty could supply something but it seems that this was also prepared in advance. When it came to positions for research assistants, I was also asked about my plans to get third-party funding and how I would like to see and organize my work group. The last point was the negotiation of the salary. Here the dean left (somehow, salary is a highly confidential thing here) and the discussion was basically between the Kanzler and me. The Kanzler tried to briefly explain how the system for extra bonuses works and why he would only offer much less than I wished. I had a few things to reply and in the end, I did not feel that the negotiation ended with a definitive decision. However, they said that I should present my other offers when I get them and that they would be willing to think about outdoing them. Finally, we agreed on a decision deadline for me. The official offer arrived (first by email than in paper form) about a week later. I forwarded the official offer to the second university for the preparation of the next negotiations. I don’t know if this is the usual practice but again, I think that playing with cards on the table in easier here.

My second negotiation directly started with the discussion of the salary (there was no question about my field of research). The Rektorat had prepared a nice tableaux in which they illustrated my current situation, the other offer, and their offer side by side and basically they just topped the others by a small but non-negligible amount. Next we went over my list of wishes as in the first negotiation: The Kanzler said what they can offer and I could comment. Again, the last thing was to agree on a decision deadline for me. Also, the Rektor suggested that I could contact him I would “conditionally accept the offer” depending on smaller things (I am not totally sure what this could be, but salary can definitely be a point here).

In conclusion: I was a bit surprised that the negotiation was kind of a pleasant and open discussion but also that not too much negotiation actually happened (but this probably due to my limited “negotiation capabilities”…).

If I had to extract a general rule of thumb: Getting start-up funding for computers and furniture is easy (at least as a mathematician), increasing the yearly budget is harder, getting more positions for research assistants is hard if not impossible.

4. Next steps

After I had informed my faculty about the job offer, the administration asked, if they should prepare for “Bleibeverhandlungen” (negotiations to stay), i.e. if there was a chance that I would like to stay rather than leave and also if I truly consider to accept an offer (my honest response was that there was a chance for both, but I’ve heard that it could be a disadvantage to give a strong tendency at this stage). Some weeks later I received an official letter from the President of my university in which he invited me to negotiations to stay and asked me to present the official offers when I receive them along with another list of wishes…

Now the final steps will be: Preparation of the next list of wishes, negotiations to stay and finally, the most difficult part, forming a well-founded decision.

I am migrating to a new laptop and, as usual, this is a bit more work than I expected…

One issue I could not resolve to my satisfaction was the migration of my RSS-archives in Akregator (especially since I plan to switch to a different feed reader). Hence, I just collect in this post the posts and articles I planned to read or wanted to keep for some reason:

From nuit-blanche we have:

Form the blog of Clément Mouhot I have

And there are some articles:

And some preprints from the arxiv:

And finally there is a review on a book I planned to buy: A Mathematician Comes of Age by Steven G. Krantz.

Moreover, there is a list of posts which I marked with “Important”:

 

Follow

Get every new post delivered to your Inbox.

Join 30 other followers