The fundamental theorem of calculus relates the derivative of a function to the function itself via an integral. A little bit more precise, it says that one can recover a function from its derivative up to an additive constant (on a simply connected domain).

In one space dimension, one can fix some {x_{0}} in the domain (which has to be an interval) and then set

\displaystyle F(x) = \int_{x_{0}}^{x}f'(t) dt.

Then {F(x) = f(x) + c} with {c= - f(x_{0})}.

Actually, a similar claim is true in higher space dimensions: If {f} is defined on a simply connected domain in {{\mathbb R}^{d}} we can recover {f} from its gradient up to an additive constant as follows: Select some {x_{0}} and set

\displaystyle F(x) = \int_{\gamma}\nabla f(x)\cdot dx \ \ \ \ \ (1)

for any path {\gamma} from {x_{0}} to {x}. Then it holds under suitable conditions that

\displaystyle F(x) = f(x) - f(x_{0}).

And now for something completely different: Convex functions and subgradients.

A function {f} on {{\mathbb R}^{n}} is convex if for every {x} there exists a subgradient {x^{*}\in{\mathbb R}^{n}} such that for all {y} one has the subgradient inequality

\displaystyle f(y) \geq f(x) + \langle x^{*}, y-x\rangle.

Writing this down for {x} and {y} with interchanged roles (and {y^{*}} as corresponding subgradient to {y}), we see that

\displaystyle \langle x-y,x^{*}-y^{*}\rangle \geq 0.

In other words: For a convex function {f} it holds that the subgradient {\partial f} is a (multivalued) monotone mapping. Recall that a multivalued map {A} is monotone, if for every {y \in A(x)} and {y^{*}\in A(y)} it holds that {\langle x-y,x^{*}-y^{*}\rangle \geq 0}. It is not hard to see that not every monotone map is actually a subgradient of a convex function (not even, if we go to “maximally monotone maps”, a notion that we sweep under the rug in this post). A simple counterexample is a (singlevalued) linear monotone map represented by

\displaystyle \begin{bmatrix} 0 & 1\\ -1 & 0 \end{bmatrix}

(which can not be a subgradient of some {f}, since it is not symmetric).

Another hint that monotonicity of a map does not imply that the map is a subgradient is that subgradients have some stronger properties than monotone maps. Let us write down the subgradient inequalities for a number of points {x_{0},\dots x_{n}}:

\displaystyle \begin{array}{rcl} f(x_{1}) & \geq & f(x_{0}) + \langle x_{0}^{*},x_{1}-x_{0}\rangle\\ f(x_{2}) & \geq & f(x_{1}) + \langle x_{1}^{*},x_{2}-x_{1}\rangle\\ \vdots & & \vdots\\ f(x_{n}) & \geq & f(x_{n-1}) + \langle x_{n-1}^{*},x_{n}-x_{n-1}\rangle\\ f(x_{0}) & \geq & f(x_{n}) + \langle x_{n}^{*},x_{0}-x_{n}\rangle. \end{array}

If we sum all these up, we obtain

\displaystyle 0 \geq \langle x_{0}^{*},x_{1}-x_{0}\rangle + \langle x_{1}^{*},x_{2}-x_{1}\rangle + \cdots + \langle x_{n-1}^{*},x_{n}-x_{n-1}\rangle + \langle x_{n}^{*},x_{0}-x_{n}\rangle.

This property is called {n}-cyclical monotonicity. In these terms we can say that a subgradient of a convex function is cyclical monotone, which means that it is {n}-cyclically monotone for every integer {n}.

By a remarkable result by Rockafellar, the converse is also true:

Theorem 1 (Rockafellar, 1966) Let {A} by a cyclically monotone map. Then there exists a convex function {f} such that {A = \partial f}.

Even more remarkable, the proof is somehow “just an application of the fundamental theorem of calculus” where we recover a function by its subgradient (up to an additive constant that depends on the basepoint).

Proof: we aim to “reconstruct” {f} from {A = \partial f}. The trick is to choose some base point {x_{0}} and corresponding {x_{0}^{*}\in A(x_{0})} and set

\displaystyle \begin{array}{rcl} f(x) &=& \sup\left\{ \langle x_{0}^{*}, x_{1}-x_{0}\rangle + \langle x_{1}^{*},x_{2}-x_{1}\rangle + \cdots + \langle x_{n-1}^{*},x_{n}-x_{n-1}\rangle\right.\\&& \qquad\left. + \langle x_{n}^{*},x-x_{n}\rangle\right\}\qquad\qquad (0)\end{array}

where the supremum is taken over all integers {m} and all pairs {x_{i}^{*}\in A(x_{i})} {i=1,\dots, m}. As a supremum of affine functions {f} is clearly convex (even lower semicontinuous) and {f(x_{0}) = 0} since {A} is cyclically monotone (this shows that {f} is proper, i.e. not equal to {\infty} everywhere). Finally, for {\bar x^{*}\in A(\bar x)} we have

\displaystyle \begin{array}{rcl} f(x) &\geq& \sup\left\{ \langle x_{0}^{*}, x_{1}-x_{0}\rangle + \langle x_{1}^{*},x_{2}-x_{1}\rangle + \cdots + \langle x_{n-1}^{*},x_{n}-x_{n-1}\rangle\right.\\ & & \qquad \left.+ \langle x_{n}^{*},\bar x-x_{n}\rangle + \langle \bar x^{*}, x-\bar x\rangle\right\} \end{array}

with the supremum taken over all integers {m} and all pairs {x_{i}^{*}\in A(x_{i})} {i=1,\dots, m}. The right hand side is equal to {f(x) + \langle \bar x^{*},x-\bar x\rangle} and this shows that {f} is indeed convex. \Box

Where did we use the fundamental theorem of calculus? Let us have another look at equation~(0). Just for simplicity, let us denote {x_{i}^{*} =\nabla f(x_{i})}. Now consider a path {\gamma} from {x_{0}} to {x} and points {0=t_{0}< t_{1}<\cdots < t_{n}< t_{n+1} = 1} with {\gamma(t_{i}) = x_{i}}. Then the term inside the supremum of~(0) equals

\displaystyle \langle \nabla f(\gamma(t_{0})),\gamma(t_{1})-\gamma(t_{0})\rangle + \dots + \langle \nabla f(\gamma(t_{n})),\gamma(t_{n+1})-\gamma(t_{n})\rangle.

This is Riemannian sum for an integral of the form {\int_{\gamma}\nabla f(x)\cdot dx}. By monotonicity of \nabla {f}, we increase this sum, if we add another point {\bar t} (e.g. {t_{i}<\bar t<t_{i+1}}, and hence, the supremum does converge to the integral, i.e.~(0) is equal to

\displaystyle f(x) = \int_{\gamma}\nabla f(x)\cdot dx

where {\gamma} is any path from {x_{0}} to {x}.