I remember from my introductory class in linear algebra that my instructor said

It is impossible to calculate the Jordan canonical form of a matrix numerically.

Another quote I remember is

The Jordan canonical form does not depend continuously on the matrix.

For both quotes I did not remember the underlying reasons and since I do teach an introductory class on linear algebra this year, I got thinking about these issues again.

Here is a very simple example for the fact in the second quote:

Consider the matrix $\displaystyle A_{\varepsilon} = \begin{pmatrix}1 & \varepsilon\\ 0 & 1\end{pmatrix}$

for ${\varepsilon>0}$. This matrix has ${1}$ as a double eigenvalue, but the corresponding eigenspace is one-dimensional and spanned by ${v_{1} = e_{1}}$. To extend this vector to a basis we calculate a principle vector by solving $\displaystyle (A_{\varepsilon}-I)v_{2} = v_{1}$ $\displaystyle v_{2} = \begin{pmatrix} 0\\\varepsilon^{-1} \end{pmatrix}.$

Defining ${S = [v_{1}\, v_{2}]}$ we get the Jordan canonical form of ${A_{\varepsilon}}$ as $\displaystyle J_{\varepsilon} = S^{-1}A_{\varepsilon}S = \begin{pmatrix} 1 & 1\\ 0 & 1 \end{pmatrix}$

So we have $\displaystyle A_{\varepsilon}\rightarrow A = I\quad\text{and}\quad J_{\varepsilon} \rightarrow J = \begin{pmatrix} 1 & 1\\ 0 & 1 \end{pmatrix},$

but ${J}$ is not the Jordan canonical form of ${A}$. So, in short: The Jordan canonical form of the limit of ${A_{\varepsilon}}$ is not the limit of the Jordan canonical form of ${A_{\varepsilon}}$. In other words: Taking limits does not commute with forming the Jordan canonical form.

A side note: Of course, the Jordan canonical form is not even unique in general, so speaking of “dependence on the matrix” is an issue. What we have shown is, that there is no way to get continuous dependence on the matrix even if non-uniqueness is not an issue (like in the example above).

What about the first quote? Why is it impossible to compute the Jordan canonical form numerically? Let’s just try! We start with the simplest non-diagonalizable matrix $\displaystyle A = \begin{pmatrix} 1 & 1\\ 0 & 1 \end{pmatrix}$

If we ask MATLAB or Octave to do the eigenvalue decomposition we get

>> [V,D] = diag(A)
V =

1.00000 -1.00000
0.00000  0.00000

D =

1 0
0 1

We see that ${V}$ does not seem to be invertible and indeed we get 

>> rank(V)
ans = 1


What is happening? MATLAB did not promise the produce an invertible ${V}$ and it does not promise that the putcome would fulfill ${V^{-1}AV = D}$ (which is my definition of diagonalizability). It does promise that ${AV = VD}$ and in fact

>> A*V
ans =

1.00000  1.00000
0.00000 -0.00000

>> V*D
ans =

1.00000 -1.00000
0.00000  0.00000

>> A*V-V*D
ans =

0.0000e+00 2.2204e-16
0.0000e+00 0.0000e+00

so the promised identity is fulfilled up to machine precision (which is actually equal $\texttt{2.2204e-16}$ and we denote it by ${\epsilon}$ from here on).

How did MATLAB diagonalize this matrix? Here is the thing: The diagonalizable matrices are dense in ${{\mathbb C}^{n\times n}}$! (You probably have heard that before…) What does that mean numerically? Any matrix that you represent in floating point numbers is actually a representative of a whole bunch of matrices. Each entry is only known up to a certain precision. But this bunch of matrices does contain some matrix which is diagonalizable! This is exactly, what it means to be a dense set! So it is impossible to say if a matrix given in floating point numbers is actually diagonalizable or not. So, what matrix was diagonalized by MATLAB? Let us have closer look at the matrix ${V}$: The entries in the first row are in fact ${1}$ and ${-1}$:

>> V(1,:)
ans =
1 -1

In the second row we have 

>> V(2,1)
ans =
0
>> V(2,2)
ans = 2.2204e-16

and there we have it. The inverse of ${V}$ does exist (although the matrix has numerical rank ${1}$) and it is

>> inv(V) warning: matrix singular to machine precision, rcond = 1.11022e-16
ans =

1.0000e+00 4.5036e+15
0.0000e+00 4.5036e+15

and note that $\texttt{4.5036e+15}$ is indeed just the inverse of the machine precision, so this inverse is actually 100% accurate. Recombining gives

>> inv(V)*D*V warning: matrix singular to machine precision, rcond = 1.11022e-16
ans =

1 0
0 1


which is not even close to our original ${A}$.
How is that? Here is a solution. The matrix $\displaystyle \tilde A = \begin{pmatrix} 1 & 1\\ 0 & 1+\epsilon^{-2} \end{pmatrix}$

is indistinguishable from ${A}$ numerically. However, it has two distinct eigenvalues, so it is diagonalizable. Indeed, a basis of eigenvectors is $\displaystyle \tilde V = \begin{pmatrix} 1 & -1\\ 0 & \epsilon^{-2} \end{pmatrix}$

which is indistinguishable from ${V}$ above and it holds that $\displaystyle \tilde V^{-1}\tilde A\tilde V = \begin{pmatrix} 1 & 0\\ 0 & 1+\epsilon^{-2}. \end{pmatrix}$

which is indistinguishable from $D$.