In my previous post I illustrated why it is not possible to compute the Jordan canonical form numerically (i.e., in floating-point arithmetic). The simple reason: for every matrix {A} and every {\epsilon>0} there is a matrix {A_{\epsilon}} which differs from {A} by at most {\epsilon} (e.g., in every entry; since all norms on matrices are equivalent, the precise choice does not really matter) such that {A_{\epsilon}} is diagonalizable. So why bother with computing the Jordan canonical form at all? Or even with learning or teaching it? Well, the prime application of the Jordan canonical form is to calculate solutions of linear systems of ODEs. The equation

\displaystyle y'(t) = Ay(t),\quad y(0) = y_{0}

with matrix {A\in {\mathbb R}^{n\times n}} and initial value {y_{0}\in{\mathbb R}^{n}} (both could also be complex) has a unique solution, which can be given explicitly with the help of the matrix exponential as

\displaystyle y(t) = \exp(At)y_{0}

where the matrix exponential is

\displaystyle \exp(At) = \sum_{k=0}^{\infty}\frac{A^{k}t^{k}}{k!}.
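
If you want to experiment with this definition, here is a minimal Python sketch (assuming numpy and scipy are available) that truncates the series after a fixed number of terms and checks the result against scipy.linalg.expm; naive summation like this is fine for small, well-behaved matrices, but it is not how expm works internally.

```python
import numpy as np
from scipy.linalg import expm

def expm_series(A, t, terms=30):
    """Sum the first `terms` terms of sum_k (A t)^k / k! -- naive, for illustration."""
    X = np.eye(A.shape[0])          # k = 0 term
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ (A * t) / k   # (A t)^k / k!, built up incrementally
        X = X + term
    return X

A = np.array([[0.0, 1.0],
              [1.0, 1.0]])
print(np.allclose(expm_series(A, 1.0), expm(A)))   # True
```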

It is not always simple to work out the matrix exponential by hand. The straightforward way would be to calculate all the powers of {A}, weight them by {t^{k}/k!}, and sum the series. This may be a challenge, even for simple matrices. My favorite example is the matrix

\displaystyle A = \begin{bmatrix} 0 & 1\\ 1 & 1 \end{bmatrix}.

Its first powers are

\displaystyle A^{2} = \begin{bmatrix} 1 & 1\\ 1 & 2 \end{bmatrix},\quad A^{3} = \begin{bmatrix} 1 & 2\\ 2 & 3 \end{bmatrix}

\displaystyle A^{4} = \begin{bmatrix} 2 & 3\\ 3 & 5 \end{bmatrix},\quad A^{5} = \begin{bmatrix} 3 & 5\\ 5 & 8 \end{bmatrix}.

You may notice that the Fibonacci numbers appear (and on second thought this is pretty clear). So, finding an explicit form for {\exp(A)} leads us to finding an explicit form for the {k}-th Fibonacci number (which is possible, but I will not treat this here).
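
For the curious, here is a short check that the powers are indeed {A^{k} = \begin{bmatrix} F_{k-1} & F_{k}\\ F_{k} & F_{k+1} \end{bmatrix}} with the Fibonacci numbers {F_{k}}; a small sketch in Python:

```python
import numpy as np

A = np.array([[0, 1],
              [1, 1]])

P = np.eye(2, dtype=int)
F = [0, 1]                       # Fibonacci numbers F_0, F_1
for k in range(1, 8):
    P = P @ A                    # P is now A^k
    F.append(F[-1] + F[-2])      # F_{k+1}
    assert P[0, 1] == F[k]       # off-diagonal entry of A^k is F_k
print(P)                         # A^7 = [[8, 13], [13, 21]]
```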

Another way is diagonalization: If {A} is diagonalizable, i.e. there is an invertible matrix {S} and a diagonal matrix {D} such that

\displaystyle S^{-1}AS = D\quad\text{or, equivalently}\quad A = SDS^{-1},

you see that

\displaystyle \exp(At) = S\exp(Dt)S^{-1}

and the matrix exponential of a diagonal matrix is simply the exponential function applied to the diagonal entries.
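
In code, the diagonalization route could look as follows; a sketch using numpy.linalg.eig (which returns the eigenvalues and the eigenvector matrix {S}), so that {\exp(Dt)} is just the entrywise exponential of the scaled eigenvalues:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [1.0, 1.0]])   # diagonalizable: symmetric with distinct eigenvalues
t = 1.0

lam, S = np.linalg.eig(A)                 # columns of S are eigenvectors
expAt = S @ np.diag(np.exp(lam * t)) @ np.linalg.inv(S)
print(np.allclose(expAt, expm(A * t)))    # True
```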

But not all matrices are diagonalizable! The solution that is usually presented in the classroom is to use the Jordan canonical form instead and to compute the matrix exponential of Jordan blocks: one splits a Jordan block {J = D+N} into the sum of a diagonal matrix {D} and a nilpotent matrix {N}, and since {D} and {N} commute, one can calculate {\exp(J) = \exp(D)\exp(N)}, where both matrix exponentials are quite easy to compute.
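
Here is a quick numerical confirmation of this splitting for a single {2\times 2} Jordan block (using scipy.linalg.expm as reference); since {N^{2}=0}, the series for {\exp(Nt)} terminates after the linear term:

```python
import numpy as np
from scipy.linalg import expm

t = 2.0
D = np.diag([-1.0, -1.0])        # diagonal part
N = np.array([[0.0, 1.0],
              [0.0, 0.0]])       # nilpotent part, N @ N = 0
J = D + N                        # 2x2 Jordan block for the eigenvalue -1

# D and N commute, and exp(N t) = I + N t since N^2 = 0
expJt = expm(D * t) @ (np.eye(2) + N * t)
print(np.allclose(expJt, expm(J * t)))   # True
```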

But in light of the fact that there are diagonalizable matrices arbitrarily close to any matrix, one may ask: what about replacing a non-diagonalizable matrix {A} with a diagonalizable one (accepting a small error) and then working with that one?

Let’s try this on a simple example:

We consider

\displaystyle A = \begin{bmatrix} -1 & 1\\ 0 & -1 \end{bmatrix}

which is not diagonalizable. The linear initial value problem

\displaystyle y' = Ay,\quad y(0) = y_{0}

has the solution

\displaystyle y(t) = \exp( \begin{bmatrix} -t & t\\ 0 & -t \end{bmatrix}) y_{0}

and the matrix exponential is

\displaystyle \begin{array}{rcl} \exp( \begin{bmatrix} -t & t\\ 0 & -t \end{bmatrix}) & = &\exp(\begin{bmatrix} -t & 0\\ 0 & -t \end{bmatrix})\exp(\begin{bmatrix} 0 & t\\ 0 & 0 \end{bmatrix})\\& = &\begin{bmatrix} \exp(-t) & 0\\ 0 & \exp(-t) \end{bmatrix}\begin{bmatrix} 1 & t\\ 0 & 1 \end{bmatrix}\\ &=& \begin{bmatrix} \exp(-t) & t\exp(-t)\\ 0 & \exp(-t) \end{bmatrix}. \end{array}

So we get the solution

\displaystyle y(t) = \begin{bmatrix} e^{-t}(y^{0}_{1} + ty^{0}_{2})\\ e^{-t}y^{0}_{2} \end{bmatrix}.
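
A quick sanity check of this closed form against scipy.linalg.expm (the particular initial value and time here are arbitrary, just for illustration):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[-1.0, 1.0],
              [0.0, -1.0]])
y0 = np.array([2.0, 3.0])        # an arbitrary initial value
t = 1.5

y_closed = np.array([np.exp(-t) * (y0[0] + t * y0[1]),
                     np.exp(-t) * y0[1]])
print(np.allclose(expm(A * t) @ y0, y_closed))   # True
```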

Let us take a nearby matrix which is diagonalizable. For some small {\epsilon\neq 0} we choose

\displaystyle A_{\epsilon} = \begin{bmatrix} -1 & 1\\ 0 & -1+\epsilon \end{bmatrix}.

Since {A_{\epsilon}} is upper triangular, its eigenvalues are on the diagonal. Since {\epsilon\neq 0}, there are two distinct eigenvalues and hence {A_{\epsilon}} is diagonalizable. Indeed, with

\displaystyle S = \begin{bmatrix} 1 & 1\\ 0 & \epsilon \end{bmatrix},\quad S^{-1}= \begin{bmatrix} 1 & -\tfrac1\epsilon\\ 0 & \tfrac1\epsilon \end{bmatrix}

we get

\displaystyle A_{\epsilon} = S \begin{bmatrix} -1 & 0 \\ 0 & -1+\epsilon \end{bmatrix}S^{-1}.

The matrix exponential of {A_{\epsilon}t} is

\displaystyle \begin{array}{rcl} \exp(A_{\epsilon}t) &=& S\exp( \begin{bmatrix} -t & 0\\ 0 & -t(1-\epsilon) \end{bmatrix} )S^{-1}\\ &=& \begin{bmatrix} e^{-t} & \tfrac{e^{-(1-\epsilon)t} - e^{-t}}{\epsilon}\\ 0 & e^{-(1-\epsilon)t} \end{bmatrix}. \end{array}

Hence, the solution of {y' = A_{\epsilon}y}, {y(0) = y_{0}} is

\displaystyle y(t) = \begin{bmatrix} e^{-t}y^{0}_{1} + \tfrac{e^{-(1-\epsilon)t} - e^{-t}}{\epsilon}y^{0}_{2}\\ e^{-(1-\epsilon)t}y^{0}_{2} \end{bmatrix}.
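
As a cross-check, one can also integrate {y'=A_{\epsilon}y} numerically and compare with this closed form; a sketch using scipy.integrate.solve_ivp:

```python
import numpy as np
from scipy.integrate import solve_ivp

eps = 0.1
A_eps = np.array([[-1.0, 1.0],
                  [0.0, -1.0 + eps]])
y0 = np.array([2.0, 3.0])

def y_closed(t):
    """Closed-form solution derived above."""
    return np.array([np.exp(-t) * y0[0]
                     + (np.exp(-(1 - eps) * t) - np.exp(-t)) / eps * y0[1],
                     np.exp(-(1 - eps) * t) * y0[1]])

sol = solve_ivp(lambda t, y: A_eps @ y, (0.0, 5.0), y0,
                rtol=1e-10, atol=1e-12)
err = np.max(np.abs(sol.y - np.column_stack([y_closed(t) for t in sol.t])))
print(err)   # tiny, limited only by the integrator's tolerances
```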

How is this related to the solution of {y'=Ay}? How far away is it?

Of course, the lower right entry of {\exp(A_{\epsilon}t)} converges to {e^{-t}} for {\epsilon \rightarrow 0}, but what about the upper right entry? Note that the entry

\displaystyle \tfrac{e^{-(1-\epsilon)t} - e^{-t}}{\epsilon}

is nothing other than the (negative) difference quotient for the derivative of the function {f(a) = e^{-at}} at {a=1}. Hence

\displaystyle \tfrac{e^{-(1-\epsilon)t} - e^{-t}}{\epsilon} \stackrel{\epsilon\rightarrow 0}{\longrightarrow} -f'(1) = te^{-t}

and we get

\displaystyle \exp(A_{\epsilon}t) \stackrel{\epsilon\rightarrow 0}{\longrightarrow} \begin{bmatrix} e^{-t} & te^{-t}\\ 0 & e^{-t} \end{bmatrix} = \exp(At)

as expected.

It turns out that a fairly big {\epsilon} is already enough to get quite a good approximation and even the correct asymptotics: the blue curve is the first component of the exact solution (initialized with the second standard basis vector), the red one corresponds to {\epsilon = 0.1}, and the yellow one (pretty close to the blue one) is for {\epsilon = 0.01}.
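
The comparison is easy to reproduce; a minimal matplotlib sketch (with {y_{0}} the second standard basis vector, the exact first component is {te^{-t}}):

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0.0, 10.0, 400)
plt.plot(t, t * np.exp(-t), label="exact")          # first component of exp(At) e_2
for eps in (0.1, 0.01):
    approx = (np.exp(-(1 - eps) * t) - np.exp(-t)) / eps
    plt.plot(t, approx, label=f"eps = {eps}")
plt.xlabel("t")
plt.legend()
plt.show()
```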

[Figure e102_jordan_ode: comparison of the first component of the exact solution with the approximations for {\epsilon = 0.1} and {\epsilon = 0.01}]