February 2012


In my last post I treated a model for opinion dynamics. The specific feature of this model by Canuto, Fagnani and Tilli was that it is not based on a finite number of agents, each moving around in opinion space, but on the density of agents on the opinion space. On the one hand, this makes the “agents” somehow disappear in the model and especially, it is not possible to “track an individual agent” anymore (correct me, if I’m wrong). On the other hand, this “approximation by infinity” makes the analysis a bit easier.

In this post I’d like to collect similar equations for other related phenomena, namely flocking and synchronization.

1. Flocking dynamics

Some animals (as well as humans) build flocks and usually, this happens without a kind of leader. Flocks can be visually appealing, as you can see here or here. There are several mathematical models around to describe the leaderless formation of flocks.

One of the first flocking models is due to Vicsek at al. and a very popular one is more recent and due to Cucker and Smale: At time {t} there are agents at positions {x_i(t)} with velocities {v_i(t)} and each agent moves according to its velocity and adapts its velocity in view of the velocities of its neighbors. The equations read like this:

\displaystyle  \dot x_i(t) = v_i(t),\qquad \dot v_i(t) = \lambda\sum_{j=1}^N a(|x_j(t)-x_i(t)|)(v_j(t)-v_i(t)).

In the second equation for the velocities, you recognize that the velocities of the {i}th agent is a weighted mean of the velocities of the other agents and the weight {a} for the {j}th agent depends on their mutual distance.

For the continuous counterpart of this model, we need a density function of the phase space, that is a density function {f:{\mathbb R}^3\times{\mathbb R}^3\times{\mathbb R}\rightarrow{\mathbb R}} and {f(x,v,t)} describes “how many agents are at time {t} at position {x} with velocity {v}”. This is very similar to the famous Boltzmann equation from gas dynamics. The evolution of this density function according to the above model has been derived by Ha and Tadmor. With the operator

\displaystyle  Q(f)(x,v,t) = \int\int a(|x-y|)(w-v)f(y,w,t)dwdy

the model reads as

\displaystyle  \partial_t f(x,v,t) + v\cdot\nabla_x f(x,v,t) + \lambda\nabla_v \Big(Q(f)(x,v,t)f(x,v,t))\Big) = 0.

The term “{v\cdot\nabla_x f(x,v,t)}” leads to the movement of the agent along the velocities, while the last term “{\lambda\nabla_v\Big(Q(f)(x,v,t)f(x,v,t))\Big)}” is the term which models the interaction of the velocities.

2. Synchronization

Another natural phenomenon where group behavior arises is that of synchronization. Different “oscillators” are mutually coupled and may evolve to a state in which all of them oscillate with the same frequency. This phenomenon is modelled by the Kuramoto equation: The state of the {i}th oscillator at time {t} is given by its phase {\theta_i(t)} (which is a real number but has to be considered modulo {2\pi}). If an oscillator has natural frequency {\omega_i} (or phase velocity), its phase evolves according to {\theta_i(t) = \theta_i(0) + t\omega_i}. Put differently, its phase satisfies the differential equation {\dot\theta_i = \omega_i}. The Kuromoto equation for {N} coupled oscillators is

\displaystyle  \dot\theta_i(t) = \omega_i + \tfrac{K}{N}\sum\sin(\theta_j(t)-\theta_i(t)).

The coupling strength is given by {K>0} and the coupling term says that an oscillator “slows down” if it sees some oscillators “behind him” and “speeds up” if it sees some “ahead of him”.

In the limit for {N\rightarrow\infty} we consider a distribution {\rho:{\mathbb R}\times{\mathbb R}\times {\mathbb R}\rightarrow{\mathbb R}} and {\rho(\theta,\omega,t)} described “how many oscillators at time {t} have phase {\theta} and natural frequency {\omega}”. Further, the distribution of natural frequencies is given by {g(\omega)}. Then, the equation for the evolution of the oscillators is

\displaystyle  \partial_t\rho(\theta,\omega,t) + \partial_\theta\Big[\rho(\theta,\omega,t)\Big(\omega + K \int_0^{2\pi}\int_{\mathbb R} \sin(\theta'-\theta)\rho(\theta',\omega',t)g(w')dw'd\theta'\Big)\Big].

For more information, see the Wikipedia page or this article by Strogatz.

3. Consensus

Finally, I’d like to mention another model for opinion formation, the model by Ben-Naim, Krapivsky and Redner: The density {P(x,t)} describes how many agents at time {t} have the opinion {x} and evolves according to

\displaystyle  \partial_t P(x,t) = \iint_{|x_1-x_2|\leq1} P(x_1,t)P(x_2,t)\big[\delta(x - \tfrac{x_1+x_2}{2}) - \delta(x-x_1)\big]dx_1dx_2.

The model is derived from the Deffuant-Weisbuch model in which two randomly chosen agents meet in each step, and if their opinion is close enough to each other, they agree on their mean value.

It happened again that I found a paper that I would have liked to write myself. Actually, this is not too bad this time because I did not really started to work on the subject. The paper is “An Eulerian Approach to the Analysis of Krause’s Consensus Models” by Claudio Canuto, Fabio Fagnani and Paolo Tilli. Actually, my brother Jan did his PhD thesis on “Krause’s Consensus Model” together with Ulrich Krause himself. I got a little bit involved through discussions with Jan on analytic aspects of the consensus models and we even wrote a small note the topic.

The general theme of “consensus models” is, to model, simulate and analyze the process in which opinions evolve and especially describe the phenomenon that a consensus may or may not form.

1. A time discrete consensus model with finitely many agents

The abstraction to mathematics usually goes like this: Each agent {j} has an opinion {x_j} which is a vector in {{\mathbb R}^d}. The group of agents interact in some way, i.e., they meet at a time {t} with their opinions {x_j(t)}, discuss and they adapt their opinions to each other which results in new opinions {x_j(t+1)}.

For a total number of {n} agents, the dynamics reads as

\displaystyle  x_j(t+1) = F_j(x_1(t),\dots,x_n(t),t).

In words: The option of the {j}-th agent depends of all other opinions in a time dependent manner. In the following we stack the {n} (row-)vectors of opinions into an {n\times d} matrix {x(t)} and may formulate more compactly {x(t+1) = F(x(t),t)}.

There are several simplifications one can make:

  • Time-independent models, i.e. {F} does not depend on {t}.
  • Linear models: The function {F} is linear in the first {n} variables, i.e. there is a {n\times n}-matrix {A(t)} such that

    \displaystyle  x(t+1) = A(t)x(t).

  • “Semi-linear” models where the matrix also depends of the present opinions:

    \displaystyle  x(t+1) = A(x(t),t)x(t).

    The difference between “semi-linear” and non-linear models is in fact not that large. As soon as every vector {x_j(t+1)} is in the linear span of the previous vectors {x_1(t),\dots,x_n(t)} one can find a matrix {A} (depending on {x(t)}, of course) auch that {x(t+1) = Ax(t)}. However, the semi-linear view is sometimes useful when one tries to formulate conditions on the process which imply some wanted properties.

One special goal in opinion dynamics is, to find conditions of the mappings {F} which ensure that the process converge to a consensus, i.e. to a state where all agents share the same opinion {x_1 = \cdots = x_n}.

Example 1 (Krause’s model) Krause’s model for opinion dynamics say’s that in each step, each agent forms an average of the opinions of other agents if their opinion is not too far away (one also say’s that the agent have bounded confidence). Put in formula, the models needs a confidence bound {R>0}. Then, the agent {j} is influenced by all agents in the set {I_j(t) = \{i\ |\ \|x_j(t) - x_i(t)\|\leq R\}} and takes the mean value of these agents as next opinion:

\displaystyle   x_j(t+1) = \frac{1}{|I_j(t)|}\sum_{i\in I_j(t)} x_i(t). \ \ \ \ \ (1)

We can formulate this as a semi-linear model: Define the matrix {A(x(t))} as

\displaystyle  A(x(t))_{i,j} = \begin{cases} \frac{1}{|I_j(t)|} & \text{ if } \|x_j(t) - x_i(t)\|\leq R\\ 0 & \text{ else.} \end{cases}

Then the dynamics is

\displaystyle  x(t+1) = A(x(t))x(t)

It turned out that it is quite difficult to analyze this model properly. One specific difficulty is that the “communication topology” keeps changing over time. Especially, it seems quite hard to analyze under which circumstances the system converges to a consensus.

Together with Jan we proved a theorem about non-linear models which uses a very general notion of “average” or “mean”.

A “generalized mean” of {n} real number is a mapping {g:{\mathbb R}^n\rightarrow{\mathbb R}} which such that {\min x^i \leq g(x)\leq\ max x^i} (which is called “sandwich inequality”). Of course the usual arithmetic mean, the geometric mean and so on are genealized means in this sense. But how would a generalied mean of {n} vectors in {{\mathbb R}^d} look like? What should be the sandwich inequality? One natural idea is the following: {g:({\mathbb R}^d)^n\rightarrow{\mathbb R}^d} is a generalized mean of {g(x)} is contained in the convex hull of the vectors {x^1,\dots,x^n}, in formula: {g(x)\in\text{conv}(x^1,\dots,x^n)} (could be called the “convex-hull sandwich inclusion”). However, then the “componentwise geometric mean” is not a generalized mean… Hence, it seems more appropriate to use the notion of “cube mean”, that is, {g(x)} is contained in the smallest cube (i.e. {d}-dimensional interval) which contains all {x^i}‘s. We denote this by {g(x)\in\text{cube}(x^1,\dots,x^n)}.

Now we know what a generalized mean could be:

Definition 1 An iteration map {F:({\mathbb R}^d)^n\rightarrow({\mathbb R}^d)^n} is called a cube averaging map if for all {x} it holds that:

\displaystyle  \textup{cube}(f^1(x),\dots,f^n(x)) \subset \textup{cube}(x^1,\dots,x^n).

Of cource, it is not enough to have cube-averaging maps {F(\cdot,t)} to get convergence to consensus of {x(t+1) = F(x(t),t)} (since the identity is valid…). We strengthen the notion of averaging a little bit:

Definition 2 {F} is a proper cube-averaging map if for all {x = (x^1,\dots,x^n)} which are not a consensus (i.e. not all {x^i} are equal) the inclusion

\displaystyle  \textup{cube}(f^1(x),\dots,f^n(x)) \subset \textup{cube}(x^1,\dots,x^n).

is strict.

However, this notion is still not enough: It may happen that the “cube-hulls” shrink too slow, i.e. the decrease of the cube-hull slows down so much that it will not shrink to a single point. Hence we strengthen a little bit more. With the help of the Hausdorff distance {d_H} of compact sets we formulate:

Definition 3 A family {\mathcal{F}} of cube-averaging maps is called equi-proper, if for every {x} which is not a consensus there is a {\delta>0} such that for every {F\in\mathcal{F}} it holds that

\displaystyle  d_H(\textup{cube}((f^1(x),\dots,f^n(x)), \textup{cube}(x^1,\dots,x^n))>\delta

And still this is not enough to show convergence to consensus:

Example 2 We consider {d=1} (one-dimensional opinions), {n=3} (three agents) and only one averaging map (i.e. a time independent process):

\displaystyle  f(x^1,x^2) =\left\{ \begin{array}{cl} (x^1,x^2,\frac{1}{2}x^3 + \frac{1}{2}\min\{x^1,x^2\}) &\hbox{ if } x^3<\min\{x^1,x^2\} ,\\ (x^1,x^1,x^1) & \hbox{ otherwise. } \end{array} \right.

A close inspection shows, that this is indeed a proper (cube-)averaging maps. However,if we start with {x_3<\min(x_1,x_2)} a consensus will never be reached.

What is missing is continuity. And not only continuiuty, we need that the family {F(\cdot,t)} is equicontinuous.

Theorem 4 If the maps {F(\cdot,t)} form a family of equicontinuous and equiproper cube averaging maps, the solution of {x(t+1) = F(x(t),t)} always converges to consensus.

The (technical) proof, a generalization, more examples and counterexample can be found in our paper “On conditions for convergence to consensus” (or the preprint version).

In general, Krause’s model is not covered by the above theorem, however, one could adapt the theorem to this situation and deduce a bit more.

2. The model of Canuto et al.

Canuto et al. want to analyze Krause’s model in the limit of infinitely many agents. Often this limit allows for a more detailed analysis, since the quite arbitrary large number {n} is “approximated by {\infty}” which discards all effects which rely on the number {n} itself.

First, let’s get back to Krause’s model and formulate it in “update form”:

\displaystyle  x_j(t+1) = x_j(t) + \sum_{i}A(x(t))_{ij}(x_i(t) - x_j(t))

In the paper by Canuto et al. they slightly deviate from Krause’s model. They consider the dynamics

\displaystyle   x_j(t+1) = x_j(t) + \frac{1}{n}\sum_{i=1}^n \xi(x_i(t) - x_j(t))\, (x_i(t) - x_j(t)) \ \ \ \ \ (2)

with a bounded function {\xi} which is also symmetric ({\xi(x) = \xi(-x)}). We assume that {\xi} is non-negative and bounded by one. The non-negativity says, that other opinions do not contribute negatively to the average (which would mean that one agent actually distrusts another agent so strongly that he moves away from him). The symmetry says that confidence is mutual.

Note that Krause’s model (1) can not be modelled like this: One can get close by setting

\displaystyle  \xi(x) = \begin{cases} 1, & \|x\|\leq R\\ 0, & \text{ else} \end{cases}

but this gives a different weight to the own opinion:

\displaystyle  \begin{array}{rcl}  x_j(t+1) &=& x_j(t) + \frac{1}{n}\sum_i \xi(x_i(t) - x_j(t))\, (x_i(t) - x_j(t))\\ & =& \Big(1 - \frac{1}{n}\sum_{i=1}^n \xi(x_i(t) - x_j(t))\Big) x_j(t) + \frac{1}{n}\sum_{i=1}^n \xi(x_i(t) - x_j(t))\, x_i(t)\\ & = & \sum_{i=1}^n\tilde A(x(t))_{i,j} x_i(t) \end{array}

with

\displaystyle  \tilde A(x(t))_{i,j} = \begin{cases} \frac1n & \text{ if } \|x_i(t) - x_j(t)\|\leq R, \text{ and } i\neq j\\ 1 - \frac{|I_j(t)|}{n} & i=j\\ 0 & \text{ else.} \end{cases}

In other words: The modified model of Canuto et al. gives a larger weight to the own opinion.

3. A time discrete consensus model with “infinitely” many agents

Now we want to go to infinitely many agents. Canuto et al. suggest not to assume to have a countably infinite number of agent but that we identify the set of agents and their opinions by their distribution of opinions. In mathematical terms: We consider a measure on the set of possible opinions: At each time {t} there is a normalized (i.e. probability) measure {\mu_t} which we take as a Borel measure of {{\mathbb R}^d}. Now we want to derive a dynamics for this measure which resembles the dynamics of (2).

Informally, the agents with opinion {x} move to another place {\gamma_t(x)} according to the update in (2), i.e. we have

\displaystyle   \gamma_t(x) = x + \int_{{\mathbb R}^d}(y-x)\xi(y-x)d\mu_t(y). \ \ \ \ \ (3)

Since {\gamma_t} describes how the density is moved, the dynamics of {\mu_t} is described as the “push-forward” of {\mu_t} by {\gamma_t} that is: For every Borel set {E} we have

\displaystyle   \mu_{t+1}(E) = \mu_t(\gamma_t^{-1}(E)) \ \ \ \ \ (4)

(often this is written as {\mu_{t+1} = \gamma_t\# \mu_t}).

This is a valid generalization of (2):

Example 3 Consider an opinion distribution {\mu_t = \frac{1}{n}\sum_{j=1}^n \delta_{x_j(t)}} that is, there are {n} agents with opinions {x_j(t)}. We calculate {\mu_{t+1}} according to (4): For a Borel set {E} we have

\displaystyle  \begin{array}{rcl}  \mu_{t+1}(E) & = & \mu_t(\gamma_t^{-1}(E))\\ & = & \frac1n \sum_j \delta_{x_j(t)}(\gamma_t^{-1}(E)). \end{array}

It holds that

\displaystyle  \delta_{x_j(t)}(\gamma_t^{-1}(E)) = \begin{cases} 1 & \text{ if } \gamma_t(x_j(t))\in E\\ 0 & \text{ else.} \end{cases}

Now we calculate {\gamma_t(x_j(t))} accroding to (3) (note that the integral becomes a sum since it is with respect to a singular measure consisting of {n} delta’s):

\displaystyle  \gamma_t(x_j(t)) = x + \frac1n\sum_i (x_i(t) -x_j(t))\,\xi(x_i(t) - x_j(t)).

Hence, we have that {\gamma_t(x_j(t))\in E} iff {x_j(t+1)} (according to (2)) is in {E}. We combine this with our previous finding and obtain

\displaystyle  \mu_{t+1}(E) = \frac1n\sum_j \delta_{x_j(t+1)}(E).

I found that pretty neat. Now Canuto et al. prove a few results about the dynamics of (3) and (4) which I quote without proof:

Theorem 5 Let {\xi} be non-negative, bounded by {1} and symmetric and let the initial Borel probability measure {\mu_0} have finite second moment. Then the second moments of {\mu_t} (according to (3) and (4)) are non-increasing in {t} and the sequence {\mu_t} converges in the weak-{*} sense to some probability measure {\mu_\infty}. If moreover there exist {R>0} and {\delta>0} such that {\xi(x)>\delta} for {\|x\|\leq R} then {\mu_\infty} is a purely atomic measure (i.e. a superposition of delta’s) and the atoms are at least {R} apart from each other.

In words: The system always converges to a state in which the opinions are clustered at points which are at least {R} apart. If the initial condition has a bounded support, we can conclude that in the limit there are only finitely many opinions left.

By investigating symmetry-perserving properties of the dynamics, Canuto et. al obtain the following intersting side-result:

Corollary 6 If {\xi} is a radial function and {\mu_0} has radial symmetry with respect to {x_0} then it holds that {\mu_\infty = \delta_{x_0}}.

I.e. if the initial profile is symmetric around some {x_0} then the dynamics will converge to a consensus in the point {x_0}. Note that this result has no counterpart for finitely many agents since you can not put finitely many deltas in {{\mathbb R}^d} in a radially symmetric way…

Canuto et al. also derive a numerical scheme for their modell and maybe I’ll blog about this in the future…

I learned today that Ulrich Tautenhahn passed away last year in October. Although he was not a colloborator of mine and I did not know him very well, I thought I could dedicate a small post to him. Ulrich Tautenhahn worked in inverse problems and regularization theory and I was especially interested in his work on parameter choice rules (a problem which, in my humble opinion, could get more attention, by the way). He was one of the early pioneers in inverse problems in the former GDR and wrote his PhD thesis on nonlinear inverse problems as early as 1980.If you go through his list of publications you will see that he worked on quite a lot of different aspects of inverse problems. His activity has increased lately and he also followed recent trends like regularization in Banach spaces and multiparameter regularization. Farewell, Mr. Tautenhahn.