There are several answers to the following question:
1. What is a convex set?
For a convex set you probably know these definitions:
Definition 1 A subset of a real vector space is convex if for any and it holds that .
In other words: If two points lie in the set, then every convex combination also lies in the set.
While this is a “definition from the inside”, convex sets can also be characterized “from the outside”. We add closedness as an assumption and get:
Definition 2 A closed subset of a real locally convex topological vector space is convex if it is the intersection of closed halfspaces (i.e. sets of the form for some in the dual space and ).
Moreover, we could define convex sets via convex functions:
Definition 3 A set is convex if there is a convex function such that .
Of course, this only makes sense once we have defined convex functions. Hence, we could also ask the question:
2. What is a convex function?
We can define a convex function by means of convex sets as follows:
Definition 4 A function from a real vector space into the real numbers is convex, if its epigraph is convex (as a subset of the vector space ).
The epigraph consists of the points which lie above the graph of the function and carries the same information as the function.
(Let me note that one can replace the real numbers here and in the following with the extended real numbers if one uses the right extension of the arithmetic and the obvious ordering, but we do not consider this in this post.)
Because epigraphs are not arbitrary convex sets but have a special form (if a point is in an epigraph, then every with is also in the epigraph), and because the underlying vector space comes with an order in the second component, some of the definitions for convex sets from above have a specialized form:
From the definition “convex combinations stay in the set” we get:
Definition 5 A function is convex, if for all and it holds that
In other words: The secant of any two points on the graph lies above the graph.
From the definition by “intersection of half spaces” we get another definition. Since we added closedness in this case we add this assumption also here. However, closedness of the epigraph is equivalent to lower-semicontinuity (lsc) of the function and since lsc functions are very convenient we use this notion:
Definition 6 A function is convex and lsc if it is the pointwise supremum of affine functions, i.e., for some set of tuples it holds that
A special consequence if this definition is that tangents to the graph of a convex function lie below the function. Another important consequence of this fact is that the local behavior of the function, i.e. its tangent plane at some point, carries some information about the global behavior. Especially, the property that the function lies above its tangent planes allows one to conclude that local minima of the function are also global. Probably the last properties are the ones which give convex functions a distinguished role, especially in the field of optimization.
Some of the previous definitions allow for generalizations into several direction and the quest for the abstract core of the notion of convexity has lead to the field of abstract convexity.
3. Abstract convexity
When searching for abstractions of the notion of convexity one may get confused by the various different approaches. For example there are generalized notions of convexity, e.g. for function of spaces of matrices (e.g. rank-one convexity, quasi-convexity or polyconvexity) and there are also further different notions like pseudo-convexity, invexity or another form ofquasi-convexity. Here we do not have generalization in mind but abstraction. Although both things are somehow related, the way of thinking is a bit different. Our aim is not to find a useful notion which is more general than the notion of convexity but to find a formulation which contains the notion of convexity and abstracts away some ingredients which probably not carry the essence of the notion.
In the literature one also finds several approaches in this direction and I mention only my favorite one (and I restrict myself to abstractly convex functions and not write about abstractly convex sets).
To me, the most appealing notion of an abstract convex function is an abstraction of the definition as “pointwise supremum of affine functions”. Let’s look again at the definition:
A function is convex and lsc if there is a subset of such that
We abstract away the vector space structure and hence, also the duality, but keep the real-valuedness (together with its order) and define:
Definition 7 Let be a set and let be a set of real valued function on . Then a function is said to be -convex if there is a subset of such that
What we did in this definition was simply to replace continuous affine functions on a vector space by an arbitrary collection of real valued functions on a set. On sees immediately that for every function and any real number the function is convex (similarly to the fact that every affine linear function is convex).
Another nice thing about this approach is, that it allows for some notion of duality/conjugation. For we define the -conjugate by
and we can even formulate a biconjugate
We naturally have a Fenchel inequality
and we may even define subgradients as usual. Note that a conventional subgradient is an element of the dual space which defines a tangent plane at the point where the subgradient is taken, that is, is a subgradient of at , if for all it holds that or
A -subgradient is an element of , namely we define: if
Then we also have a Fenchel equality:
One may also take dualization as the starting point for an abstraction.
4. Abstract conjugation
We could formulate the -conjugate as follows: For we have
This opens the door to another abstraction: For some sets (without any additional structure) define a coupling function and define the -conjugate as
and the -biconjugate as