6.3 Eigenvectors and eigenvalues
Mappings \(g\) that have the same domain and codomain allow for the notion of a fixed point. Recall that an element \(x\) of a set \(\mathcal{X}\) is called a fixed point of a mapping \(g :\mathcal{X} \to \mathcal{X}\) if \(g(x)=x,\) that is, \(x\) agrees with its image under \(g.\) In Linear Algebra, a generalisation of the notion of a fixed point is that of an eigenvector. A vector \(v \in V\) is called an eigenvector of the linear map \(g : V \to V\) if \(v\) is merely scaled when applying \(g\) to \(v,\) that is, there exists a scalar \(\lambda \in \mathbb{K},\) called an eigenvalue, such that \(g(v)=\lambda v.\) Clearly, the zero vector \(0_V\) will satisfy this condition for every choice of scalar \(\lambda.\) For this reason, eigenvectors are usually required to be different from the zero vector. In this terminology, fixed points \(v\) of \(g\) are simply eigenvectors with eigenvalue \(1,\) since they satisfy \(g(v)=v=1v.\)
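As a quick numerical illustration of the definition, here is a minimal sketch in Python with NumPy; the matrix \(\mathbf{A}\) below is an illustrative choice, not taken from the text.

```python
import numpy as np

# Illustrative matrix: f_A scales (1, 1) by 3 and fixes (1, -1).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

v = np.array([1.0, 1.0])
w = np.array([1.0, -1.0])

# v is an eigenvector with eigenvalue 3: A v = 3 v.
assert np.allclose(A @ v, 3 * v)

# w satisfies A w = w, so it is a fixed point of f_A,
# i.e. an eigenvector with eigenvalue 1.
assert np.allclose(A @ w, w)
```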
It is natural to ask whether a linear map \(g : V \to V\) always admits an eigenvector. In the remaining part of this chapter we will answer this question and further develop our theory of linear maps, specifically endomorphisms. We start with some precise definitions.
Let \(g : V \to V\) be an endomorphism of a \(\mathbb{K}\)-vector space \(V.\)
An eigenvector with eigenvalue \(\lambda \in \mathbb{K}\) is a non-zero vector \(v \in V\) such that \(g(v)=\lambda v.\)
If \(\lambda\in \mathbb{K}\) is an eigenvalue of \(g,\) the \(\lambda\)-eigenspace \(\operatorname{Eig}_{g}(\lambda)\) is the subspace of vectors \(v\in V\) satisfying \(g(v)=\lambda v.\)
The dimension of \(\operatorname{Eig}_{g}(\lambda)\) is called the geometric multiplicity of the eigenvalue \(\lambda.\)
The set of all eigenvalues of \(g\) is called the spectrum of \(g\).
For \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) we speak of eigenvalues, eigenvectors, eigenspaces and spectrum to mean those of the endomorphism \(f_\mathbf{A}: \mathbb{K}^n \to \mathbb{K}^n.\)
By definition, the zero vector \(0_V\) is not an eigenvector; it is, however, an element of the eigenspace \(\operatorname{Eig}_{g}(\lambda)\) for every eigenvalue \(\lambda.\)
The scalar \(0\) is an eigenvalue of an endomorphism \(g : V \to V\) if and only if the kernel of \(g\) is different from \(\{0_V\}.\) In this case we have \(\operatorname{Ker}g=\operatorname{Eig}_{g}(0)\) and the geometric multiplicity of \(0\) is the nullity of \(g.\)
The endomorphism \(f_\mathbf{D}: \mathbb{K}^n \to \mathbb{K}^n\) associated to a diagonal matrix with distinct diagonal entries \[\mathbf{D}=\begin{pmatrix} \lambda_1 & && \\ & \lambda_2 && \\ &&\ddots & \\ &&& \lambda_n \end{pmatrix}\] has spectrum \(\{\lambda_1,\ldots,\lambda_n\}\) and corresponding eigenspaces \(\operatorname{Eig}_{f_{\mathbf{D}}}(\lambda_i)=\operatorname{span}\{\vec{e}_i\}.\)
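This can be checked numerically, for instance with NumPy; the diagonal entries below are illustrative values.

```python
import numpy as np

# A diagonal matrix with distinct diagonal entries.
D = np.diag([2.0, 5.0, -1.0])

eigenvalues, eigenvectors = np.linalg.eig(D)
print(eigenvalues)    # [ 2.  5. -1.] -- exactly the diagonal entries
print(eigenvectors)   # the identity matrix: Eig(lambda_i) = span{e_i}
```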
Consider the \(\mathbb{R}\)-vector space \(\mathsf{P}(\mathbb{R})\) of polynomials and \(f=\frac{\mathrm{d}}{\mathrm{d}x} : \mathsf{P}(\mathbb{R}) \to \mathsf{P}(\mathbb{R})\) the derivative with respect to the variable \(x.\) The kernel of \(f\) consists of the constant polynomials and hence \(0\) is an eigenvalue of \(f.\) For a non-zero scalar \(\lambda\) there is no non-zero polynomial \(p\) satisfying \(\frac{\mathrm{d}}{\mathrm{d}x} p=\lambda p,\) since the left-hand side of this equation has strictly smaller degree than the right-hand side.
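The degree argument becomes concrete if we represent a polynomial \(p(x)=a_0+a_1x+\cdots+a_nx^n\) by its coefficient vector \((a_0,\ldots,a_n);\) a minimal sketch, where the helper `derivative` is ours, not from the text:

```python
import numpy as np

def derivative(coeffs):
    """Coefficients of p'(x), where coeffs = (a_0, a_1, ..., a_n)
    represents p(x) = a_0 + a_1 x + ... + a_n x^n."""
    if len(coeffs) <= 1:
        return np.zeros(1)
    # d/dx sends a_k x^k to k a_k x^{k-1}: drop a_0, scale by the exponent.
    return coeffs[1:] * np.arange(1, len(coeffs))

p = np.array([3.0, -2.0, 1.0])        # p(x) = x^2 - 2x + 3
print(derivative(p))                   # [-2.  2.]: p'(x) = 2x - 2, the degree drops
print(derivative(np.array([7.0])))     # [0.]: constants lie in the kernel
```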
Previously we defined the trace and determinant for an endomorphism \(g : V \to V\) by observing that the trace and determinant of the matrix representation of \(g\) are independent of the chosen basis of \(V.\) Similarly, we can consider eigenvalues of \(g\) and eigenvalues of the matrix representation of \(g\) with respect to some ordered basis of \(V.\) Perhaps unsurprisingly, the eigenvalues are the same:
Let \(g : V \to V\) be an endomorphism of a finite dimensional \(\mathbb{K}\)-vector space \(V.\) Let \(\mathbf{b}\) be an ordered basis of \(V\) with corresponding linear coordinate system \(\boldsymbol{\beta}.\) Then \(v \in V\) is an eigenvector of \(g\) with eigenvalue \(\lambda \in \mathbb{K}\) if and only if \(\boldsymbol{\beta}(v) \in \mathbb{K}^n\) is an eigenvector with eigenvalue \(\lambda\) of \(\mathbf{M}(g,\mathbf{b},\mathbf{b}).\) In particular, conjugate matrices have the same eigenvalues.
Recall that for finite dimensional \(\mathbb{K}\)-vector spaces \(V,W\) with ordered bases \(\mathbf{b}\) of \(V\) and \(\mathbf{c}\) of \(W,\) the matrix representation of a linear map \(g : V \to W\) with respect to \(\mathbf{b}\) and \(\mathbf{c}\) is the unique matrix \(\mathbf{M}(g,\mathbf{b},\mathbf{c}) \in M_{m,n}(\mathbb{K}),\) with \(n=\dim V\) and \(m=\dim W,\) such that \[f_{\mathbf{M}(g,\mathbf{b},\mathbf{c})}=\boldsymbol{\gamma}\circ g \circ \boldsymbol{\beta}^{-1},\] where \(\boldsymbol{\beta}\) and \(\boldsymbol{\gamma}\) denote the linear coordinate systems corresponding to \(\mathbf{b}\) and \(\mathbf{c},\) respectively.
Proof. Write \(\mathbf{A}=\mathbf{M}(g,\mathbf{b},\mathbf{b}),\) so that \(f_\mathbf{A}=\boldsymbol{\beta}\circ g \circ \boldsymbol{\beta}^{-1}.\) If \(v \in V\) is an eigenvector of \(g\) with eigenvalue \(\lambda,\) then \(\vec{x}=\boldsymbol{\beta}(v)\) is non-zero, since \(\boldsymbol{\beta}\) is injective, and \[f_\mathbf{A}(\vec{x})=\boldsymbol{\beta}(g(v))=\boldsymbol{\beta}(\lambda v)=\lambda\boldsymbol{\beta}(v)=\lambda\vec{x},\] so \(\vec{x}\) is an eigenvector of \(f_\mathbf{A}\) with eigenvalue \(\lambda.\) Conversely, if \(\lambda\) is an eigenvalue of \(f_\mathbf{A}\) with non-zero eigenvector \(\vec{x},\) then it follows as above that \(v=\boldsymbol{\beta}^{-1}(\vec{x}) \in V\) is an eigenvector of \(g\) with eigenvalue \(\lambda.\)
Since similarity of matrices is a symmetric relation, one can say without ambiguity that two matrices \(\mathbf{A}\) and \(\mathbf{B}\) are similar. Theorem 3.106 shows that \(\mathbf{A}\) and \(\mathbf{B}\) are similar if and only if there exists an endomorphism \(g\) of \(\mathbb{K}^n\) such that \(\mathbf{A}\) and \(\mathbf{B}\) represent \(g\) with respect to two ordered bases of \(\mathbb{K}^n.\)
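A numerical sanity check of this invariance, under the assumption that a randomly drawn matrix \(\mathbf{S}\) is invertible, which holds with probability \(1:\)

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
S = rng.normal(size=(3, 3))       # random S, assumed invertible
B = S @ A @ np.linalg.inv(S)      # B is similar (conjugate) to A

# Similar matrices have the same eigenvalues, up to round-off.
assert np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                   np.sort_complex(np.linalg.eigvals(B)))
```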
The “nicest” endomorphisms are those for which there exists an ordered basis consisting of eigenvectors:
An endomorphism \(g : V \to V\) is called diagonalisable if there exists an ordered basis \(\mathbf{b}\) of \(V\) such that each element of \(\mathbf{b}\) is an eigenvector of \(g.\)
For \(n \in \mathbb{N},\) a matrix \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) is called diagonalisable over \(\mathbb{K}\) if the endomorphism \(f_\mathbf{A}: \mathbb{K}^n \to \mathbb{K}^n\) is diagonalisable.
We consider \(V=\mathsf{P}(\mathbb{R})\) and the endomorphism \(g : V \to V\) which replaces the variable \(x\) with \(2x.\) For instance, we have \[g(x^2-2x+3)=(2x)^2-2(2x)+3=4x^2-4x+3.\] Then \(g\) is diagonalisable. The vector space \(\mathsf{P}(\mathbb{R})\) has an ordered basis \(\mathbf{b}=(1,x,x^2,x^3,\ldots).\) Clearly, for all \(k \in \mathbb{N}\cup\{0\}\) we have \(g(x^k)=2^kx^k,\) so that \(x^k\) is an eigenvector of \(g\) with eigenvalue \(2^k.\)
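In coordinates, \(g\) multiplies the \(x^k\)-coefficient of a polynomial by \(2^k;\) a short sketch, where the helper `substitute_2x` is ours, not from the text:

```python
import numpy as np

def substitute_2x(coeffs):
    """Coefficients of p(2x) from those of p(x) = a_0 + a_1 x + ...:
    the x^k-coefficient picks up a factor 2^k."""
    return coeffs * (2.0 ** np.arange(len(coeffs)))

# p(x) = x^2 - 2x + 3  ->  p(2x) = 4x^2 - 4x + 3, as in the example above.
print(substitute_2x(np.array([3.0, -2.0, 1.0])))   # [ 3. -4.  4.]

# On the basis (1, x, ..., x^4) the map acts diagonally,
# with eigenvalue 2^k on the monomial x^k:
e2 = np.zeros(5); e2[2] = 1.0                      # coordinates of x^2
assert np.allclose(substitute_2x(e2), 4.0 * e2)    # g(x^2) = 4 x^2
```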
For \(\alpha \in (0, \pi)\) consider \[\mathbf{R}_{\alpha}=\begin{pmatrix} \cos \alpha & -\sin \alpha \\ \sin\alpha & \cos \alpha \end{pmatrix}.\] Recall that the endomorphism \(f_{\mathbf{R}_{\alpha}} : \mathbb{R}^2 \to \mathbb{R}^2\) rotates vectors counter-clockwise around the origin \(0_{\mathbb{R}^2}\) by the angle \(\alpha.\) Since \(\alpha \in (0,\pi),\) the endomorphism \(f_{\mathbf{R}_{\alpha}}\) has no eigenvectors and hence is not diagonalisable.
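Numerically, the eigenvalues of \(\mathbf{R}_{\alpha}\) are the non-real complex numbers \(e^{\pm i\alpha},\) which is another way to see that no real eigenvectors exist; a quick check:

```python
import numpy as np

alpha = np.pi / 3   # any angle in (0, pi)
R = np.array([[np.cos(alpha), -np.sin(alpha)],
              [np.sin(alpha),  np.cos(alpha)]])

# The eigenvalues are exp(i*alpha) and exp(-i*alpha); for alpha in (0, pi)
# these are non-real, so f_R has no eigenvectors in R^2.
print(np.linalg.eigvals(R))    # [0.5+0.866...j  0.5-0.866...j]
print(np.exp(1j * alpha))      # (0.5+0.866...j)
```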
Recall that if \(\mathcal{X},\mathcal{Y}\) are sets, \(f : \mathcal{X} \to \mathcal{Y}\) is a mapping and \(\mathcal{Z}\subset \mathcal{X}\) is a subset of \(\mathcal{X},\) we can consider the restriction of \(f\) to \(\mathcal{Z},\) usually denoted by \(f|_\mathcal{Z},\) which is the mapping \[f|_\mathcal{Z} : \mathcal{Z} \to \mathcal{Y}, \quad z \mapsto f(z).\] So we simply take the same mapping \(f,\) but apply it only to the elements of the subset.
Closely related to the notion of an eigenvector is that of a stable subspace. Let \(v \in V\) be an eigenvector with eigenvalue \(\lambda\) of the endomorphism \(g : V \to V.\) The \(1\)-dimensional subspace \(U=\operatorname{span}\{v\}\) is stable under \(g,\) that is, \(g(U)\subset U.\) Indeed, since \(g(v)=\lambda v\) and since every vector \(u \in U\) can be written as \(u=tv\) for some scalar \(t\in \mathbb{K},\) we have \(g(u)=g(tv)=tg(v)=t\lambda v \in U.\) This motivates the following definition:
A subspace \(U\subset V\) is called stable or invariant under the endomorphism \(g : V \to V\) if \(g(U) \subset U,\) that is \(g(u) \in U\) for all vectors \(u \in U.\) In this case, the restriction \(g|_{U}\) of \(g\) to \(U\) is an endomorphism of \(U.\)
Notice that a finite dimensional subspace \(U\subset V\) is stable under \(g\) if and only if \(g(v_i) \in U\) for \(1\leqslant i\leqslant m,\) where \(\{v_1,\ldots,v_m\}\) is a basis of \(U.\)
Every eigenspace of an endomorphism \(g : V \to V\) is a stable subspace. By definition \(g|_{\operatorname{Eig}_{g}(\lambda)} : \operatorname{Eig}_{g}(\lambda) \to \operatorname{Eig}_{g}(\lambda)\) is multiplication by the scalar \(\lambda \in \mathbb{K}.\)
We consider \(V=\mathbb{R}^3\) and \[\mathbf{R}_{\alpha}=\begin{pmatrix} \cos \alpha & -\sin \alpha & 0 \\ \sin \alpha & \cos \alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}\] for \(\alpha \in (0,\pi).\) The endomorphism \(f_{\mathbf{R}_\alpha} : \mathbb{R}^3 \to \mathbb{R}^3\) is the rotation by the angle \(\alpha\) around the axis spanned by \(\vec{e}_3.\) Then the plane \(U=\{\vec{x}=(x_i)_{1\leqslant i\leqslant 3} \in \mathbb{R}^3 | x_3=0\}\) is stable under \(f=f_{\mathbf{R}_{\alpha}}.\) Here \(f|_{U} : U \to U\) is the rotation in the plane \(U\) around the origin by the angle \(\alpha.\)
Moreover, the vector \(\vec{e}_3\) is an eigenvector with eigenvalue \(1\) so that \[\operatorname{Eig}_{f}(1)=\operatorname{span}\{\vec{e}_3\}.\]
We consider again the \(\mathbb{R}\)-vector space \(\mathsf{P}(\mathbb{R})\) of polynomials and \(f=\frac{\mathrm{d}}{\mathrm{d}x} : \mathsf{P}(\mathbb{R}) \to \mathsf{P}(\mathbb{R})\) the derivative with respect to the variable \(x.\) For \(n \in \mathbb{N}\) let \(U_n\) denote the subspace of polynomials of degree at most \(n.\) Since \(f\) maps \(U_n\) into \(U_{n-1}\) and \(U_{n-1}\subset U_n,\) the subspace \(U_n\) is stable under \(f.\)
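In coordinates on \(U_n,\) stability is visible in the matrix of \(f|_{U_n};\) a sketch for \(n=3,\) anticipating the block description below:

```python
import numpy as np

n = 3
# Matrix of d/dx on the ordered basis (1, x, ..., x^n) of U_n:
# d/dx x^j = j x^{j-1}, so column j has its only entry j in row j-1.
D = np.zeros((n + 1, n + 1))
for j in range(1, n + 1):
    D[j - 1, j] = j
print(D)
# The bottom row is zero: every image lies in U_{n-1}, hence in U_n,
# which is the stability of U_n expressed in coordinates.
```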
Stable subspaces correspond to zero blocks in the matrix representation of linear maps. More precisely:
Let \(V\) be a \(\mathbb{K}\)-vector space of dimension \(n \in \mathbb{N}\) and \(g : V \to V\) an endomorphism. Furthermore, let \(U\subset V\) be a subspace of dimension \(1\leqslant m\leqslant n\) and \(\mathbf{b}\) an ordered basis of \(U\) and \(\mathbf{c}=(\mathbf{b},\mathbf{b}^{\prime})\) an ordered basis of \(V.\) Then \(U\) is stable under \(g\) if and only if the matrix \(\mathbf{A}=\mathbf{M}(g,\mathbf{c},\mathbf{c})\) has the form \[\mathbf{A}=\begin{pmatrix} \hat{\mathbf{A}} & \ast \\ \mathbf{0}_{n-m,m} & \ast \end{pmatrix}\] for some matrix \(\hat{\mathbf{A}} \in M_{m,m}(\mathbb{K}).\) In the case where \(U\) is stable under \(g,\) we have \(\hat{\mathbf{A}}=\mathbf{M}(g|_U,\mathbf{b},\mathbf{b}) \in M_{m,m}(\mathbb{K}).\)
Proof. Write \(\mathbf{b}=(v_1,\ldots,v_m)\) for vectors \(v_i \in U\) and \(\mathbf{b}^{\prime}=(w_1,\ldots,w_{n-m})\) for vectors \(w_i \in V.\) By the defining property of the matrix representation, the entries of \(\mathbf{A}=\mathbf{M}(g,\mathbf{c},\mathbf{c})\) are the unique scalars \(A_{ij}\) such that \[g(v_j)=\sum_{i=1}^m A_{ij}v_i+\sum_{i=1}^{n-m} A_{m+i,j}w_i, \qquad 1\leqslant j\leqslant m.\] Since \(\mathbf{c}\) is a basis, \(g(v_j)\) lies in \(U=\operatorname{span}\{v_1,\ldots,v_m\}\) if and only if \(A_{ij}=0\) for all \(m< i\leqslant n.\) As \(U\) is stable under \(g\) if and only if \(g(v_j) \in U\) for \(1\leqslant j\leqslant m,\) this is precisely the vanishing of the lower left block \(\mathbf{0}_{n-m,m},\) which proves the first claim. If \(U\) is stable under \(g,\) then for \(1\leqslant j\leqslant m\) we obtain \(g|_U(v_j)=g(v_j)=\sum_{i=1}^m A_{ij}v_i,\) so that \(\hat{\mathbf{A}}=(A_{ij})_{1\leqslant i,j\leqslant m}=\mathbf{M}(g|_U,\mathbf{b},\mathbf{b}).\)
Suppose \(V\) is the direct sum of subspaces \(U_1,\) \(U_2,\ldots,U_m,\) all of which are stable under the endomorphism \(g : V \to V,\) and let \(\mathbf{b}_i\) be an ordered basis of \(U_i\) for \(i=1,\ldots,m.\) Then the matrix representation of \(g\) with respect to the ordered basis \(\mathbf{c}=(\mathbf{b}_1,\ldots,\mathbf{b}_m)\) takes the block diagonal form \[\mathbf{A}=\begin{pmatrix} \mathbf{A}_1 & && \\ & \mathbf{A}_2 && \\ &&\ddots & \\ &&& \mathbf{A}_m \end{pmatrix}\] where \(\mathbf{A}_i=\mathbf{M}(g|_{U_i},\mathbf{b}_i,\mathbf{b}_i)\) for \(i=1,\ldots,m.\)
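A small numerical illustration of this block structure; the blocks \(\mathbf{A}_1,\mathbf{A}_2\) below are illustrative choices:

```python
import numpy as np

# Illustrative direct sum R^3 = U_1 + U_2 with U_1 = span{e_1, e_2}
# and U_2 = span{e_3}, both stable under f_A.
A1 = np.array([[0.0, -1.0],
               [1.0,  0.0]])    # block representing the restriction to U_1
A2 = np.array([[2.0]])          # block representing the restriction to U_2
A = np.block([[A1, np.zeros((2, 1))],
              [np.zeros((1, 2)), A2]])
print(A)

# Stability in coordinates: vectors of U_1 keep a zero third entry ...
assert np.allclose((A @ np.array([1.0, 2.0, 0.0]))[2], 0.0)
# ... and vectors of U_2 keep zero first two entries.
assert np.allclose((A @ np.array([0.0, 0.0, 1.0]))[:2], 0.0)
```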
6.4 The characteristic polynomial
The eigenvalues of an endomorphism are the solutions of a polynomial equation:
Let \(V\) be a finite dimensional \(\mathbb{K}\)-vector space and \(g : V \to V\) an endomorphism. Then \(\lambda \in \mathbb{K}\) is an eigenvalue of \(g\) if and only if \[\det\left(\lambda \mathrm{Id}_{V}-g\right)=0.\] Moreover if \(\lambda\) is an eigenvalue of \(g,\) then \(\operatorname{Eig}_{g}(\lambda)=\operatorname{Ker}(\lambda \mathrm{Id}_V-g).\)
Recall that for an endomorphism \(g : V \to V\) of a finite dimensional \(\mathbb{K}\)-vector space \(V\) the following statements are equivalent:
\(g\) is injective;
\(g\) is surjective;
\(g\) is bijective;
\(\det(g) \neq 0.\)
Indeed, \(\lambda\) is an eigenvalue of \(g\) if and only if there is a non-zero vector \(v\) with \((\lambda \mathrm{Id}_V-g)(v)=0_V,\) that is, if and only if \(\lambda \mathrm{Id}_V-g\) fails to be injective, which by this equivalence happens precisely when \(\det(\lambda \mathrm{Id}_V-g)=0.\) Moreover, the equation \(g(v)=\lambda v\) is equivalent to \(v \in \operatorname{Ker}(\lambda \mathrm{Id}_V-g),\) which gives \(\operatorname{Eig}_{g}(\lambda)=\operatorname{Ker}(\lambda \mathrm{Id}_V-g).\)
Let \(g : V \to V\) be an endomorphism of a finite dimensional \(\mathbb{K}\)-vector space \(V.\) The function \[\operatorname{char}_g : \mathbb{K}\to \mathbb{K}, \quad x\mapsto \det\left(x\mathrm{Id}_V -g\right)\] is called the characteristic polynomial of the endomorphism \(g\).
In practice, in order to compute the characteristic polynomial of an endomorphism \(g : V \to V,\) we choose an ordered basis \(\mathbf{b}\) of \(V\) and compute the matrix representation \(\mathbf{A}=\mathbf{M}(g,\mathbf{b},\mathbf{b})\) of \(g\) with respect to \(\mathbf{b}.\) We then have \[\operatorname{char}_g(x)=\det\left(x\mathbf{1}_{n}-\mathbf{A}\right).\] By the characteristic polynomial of a matrix \(\mathbf{A}\in M_{n,n}(\mathbb{K}),\) we mean the characteristic polynomial of the endomorphism \(f_\mathbf{A}: \mathbb{K}^n \to \mathbb{K}^n,\) that is, the function \(x\mapsto \det\left(x\mathbf{1}_{n}-\mathbf{A}\right).\)
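For matrices this is conveniently checked numerically: NumPy's `np.poly` returns the coefficients of \(\det(x\mathbf{1}_{n}-\mathbf{A}),\) highest power first. A quick sketch, using the matrix that is revisited in the example below:

```python
import numpy as np

A = np.array([[1.0, 5.0],
              [5.0, 1.0]])

# Coefficients of det(x 1_n - A), highest power first.
coeffs = np.poly(A)
print(coeffs)                    # [  1.  -2. -24.]  i.e. x^2 - 2x - 24

# Its roots are precisely the eigenvalues of A.
print(np.roots(coeffs))          # [ 6. -4.]
print(np.linalg.eigvals(A))      # [ 6. -4.]
```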
A zero of a polynomial \(f : \mathbb{K}\to \mathbb{K}\) is a scalar \(\lambda\in \mathbb{K}\) such that \(f(\lambda)=0.\) The multiplicity of a zero \(\lambda\) is the largest integer \(n\geqslant 1\) such that there exists a polynomial \(\hat{f} : \mathbb{K}\to \mathbb{K}\) so that \(f(x)=(x-\lambda)^n\hat{f}(x)\) for all \(x \in \mathbb{K}.\) Zeros are also known as roots.
The polynomial \(f(x)=x^3-x^2-8x+12\) can be factorised as \(f(x)=(x-2)^2(x+3)\) and hence has zero \(2\) with multiplicity \(2\) and \(-3\) with multiplicity \(1.\)
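Numerically, `np.roots` lists each zero according to its multiplicity, up to ordering and round-off:

```python
import numpy as np

# f(x) = x^3 - x^2 - 8x + 12 = (x - 2)^2 (x + 3)
print(np.roots([1.0, -1.0, -8.0, 12.0]))
# The zero 2 is listed twice, matching its multiplicity 2;
# the zero -3 is listed once.
```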
Let \(\lambda\) be an eigenvalue of the endomorphism \(g : V \to V.\) The multiplicity of the zero \(\lambda\) of \(\operatorname{char}_g\) is called the algebraic multiplicity of \(\lambda\).
We consider \[\mathbf{A}=\begin{pmatrix} 1 & 5 \\ 5 & 1 \end{pmatrix}.\] Then \[\begin{aligned} \operatorname{char}_{\mathbf{A}}(x)&=\operatorname{char}_{f_\mathbf{A}}(x)=\det\left(x\mathbf{1}_{2}-\mathbf{A}\right)=\det\begin{pmatrix} x-1 & -5 \\ -5 & x-1\end{pmatrix}\\ &=(x-1)^2-25=x^2-2x-24=(x+4)(x-6). \end{aligned}\] Hence we have eigenvalues \(\lambda_1=6\) and \(\lambda_2=-4,\) both with algebraic multiplicity \(1.\) By definition we have \[\operatorname{Eig}_{\mathbf{A}}(6)=\operatorname{Eig}_{f_\mathbf{A}}(6)=\left\{\vec{v} \in \mathbb{K}^2 | \mathbf{A}\vec{v}=6\vec{v}\right\}\] and we compute that \[\operatorname{Eig}_{\mathbf{A}}(6)=\operatorname{span}\left\{\begin{pmatrix} 1 \\ 1 \end{pmatrix}\right\}.\] Since \(\dim \operatorname{Eig}_\mathbf{A}(6)=1,\) the eigenvalue \(6\) has geometric multiplicity \(1.\) Likewise we compute \[\operatorname{Eig}_\mathbf{A}(-4)=\operatorname{span}\left\{\begin{pmatrix} -1 \\ 1 \end{pmatrix}\right\}\] so that the eigenvalue \(-4\) has geometric multiplicity \(1\) as well. Notice that we have an ordered basis of eigenvectors of \(\mathbf{A},\) hence \(\mathbf{A}\) is diagonalisable, cf. Example 3.96.
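With the eigenvectors found above as columns of a matrix \(\mathbf{S},\) passing to the eigenvector basis diagonalises \(\mathbf{A};\) a quick check:

```python
import numpy as np

A = np.array([[1.0, 5.0],
              [5.0, 1.0]])

# Columns of S: the eigenvectors (1, 1) and (-1, 1) found above.
S = np.array([[1.0, -1.0],
              [1.0,  1.0]])

# Changing to the eigenvector basis diagonalises A:
print(np.linalg.inv(S) @ A @ S)   # diag(6, -4), up to round-off
```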
We consider \[\mathbf{A}=\begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}.\] Then \(\operatorname{char}_\mathbf{A}(x)=(x-2)^2\) so that we have a single eigenvalue \(2\) with algebraic multiplicity \(2.\) We compute \[\operatorname{Eig}_\mathbf{A}(2)=\operatorname{span}\left\{\begin{pmatrix} 1 \\ 0 \end{pmatrix}\right\}\] so that the eigenvalue \(2\) has geometric multiplicity \(1.\) Notice that we cannot find an ordered basis consisting of eigenvectors, hence \(\mathbf{A}\) is not diagonalisable.
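The geometric multiplicity can be computed as \(\dim\operatorname{Ker}(2\cdot\mathbf{1}_2-\mathbf{A})=2-\operatorname{rank}(2\cdot\mathbf{1}_2-\mathbf{A});\) a quick check:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 2.0]])

# Geometric multiplicity of the eigenvalue 2:
# dim Eig_A(2) = dim Ker(2*1_2 - A) = 2 - rank(2*1_2 - A).
M = 2.0 * np.eye(2) - A
print(2 - np.linalg.matrix_rank(M))   # 1, strictly smaller than the
                                      # algebraic multiplicity 2
```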
The trace and determinant of an endomorphism appear, up to sign, as coefficients of its characteristic polynomial:
Let \(g : V \to V\) be an endomorphism of a \(\mathbb{K}\)-vector space \(V\) of dimension \(n.\) Then \(\operatorname{char}_g\) is a polynomial of degree \(n\) and \[\operatorname{char}_g(x)=x^n-\operatorname{Tr}(g)x^{n-1}+\cdots +(-1)^n\det(g).\]
Proof. We fix an ordered basis \(\mathbf{b}\) of \(V.\) Writing \(\mathbf{M}(g,\mathbf{b},\mathbf{b})=\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant n}\) and using the Leibniz formula (5.8), we have \[\operatorname{char}_g(x)=\sum_{\sigma \in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^nB_{i\sigma(i)},\] where \[B_{ij}=\left\{\begin{array}{cc} x-A_{ii}, & i=j,\\ -A_{ij}, & i\neq j.\end{array}\right.\] Therefore, \(\operatorname{char}_g\) is a finite sum of products containing \(x\) at most \(n\) times, hence \(\operatorname{char}_g\) is a polynomial in \(x\) of degree at most \(n.\) The identity permutation contributes the term \(\prod_{i=1}^nB_{ii}\) in the Leibniz formula, hence we obtain \[\operatorname{char}_g(x)=\prod_{i=1}^n(x-A_{ii})+\sum_{\sigma \in S_n,\sigma \neq 1}\operatorname{sgn}(\sigma)\prod_{i=1}^nB_{i\sigma(i)}.\] We now use induction to show that \[\prod_{i=1}^n(x-A_{ii})=x^n-\operatorname{Tr}(\mathbf{A})x^{n-1}+c_{n-2}x^{n-2}+\cdots +c_1x+c_0\] for scalars \(c_{n-2},\ldots,c_0 \in \mathbb{K}.\) For \(n=1\) we obtain \(x-A_{11},\) which settles the base case.
Inductive step: Suppose \[\prod_{i=1}^{n-1}(x-A_{ii})=x^{n-1}-\left(\sum_{i=1}^{n-1} A_{ii}\right)x^{n-2}+c_{n-3}x^{n-3}+\cdots +c_1x+c_0\] for coefficients \(c_{n-3},\ldots,c_0 \in \mathbb{K}.\) Then \[\begin{aligned} \prod_{i=1}^{n}(x-A_{ii})&=(x-A_{nn})\left[x^{n-1}-\left(\sum_{i=1}^{n-1} A_{ii}\right)x^{n-2}+c_{n-3}x^{n-3}+\cdots +c_1x+c_0\right]\\ &=x^{n}-\left(\sum_{i=1}^{n}A_{ii}\right)x^{n-1}+\text{ lower order terms in }x, \end{aligned}\] so the induction is complete.
We next argue that \(\sum_{\sigma \in S_n,\sigma \neq 1}\operatorname{sgn}(\sigma)\prod_{i=1}^nB_{i\sigma(i)}\) has degree at most \(n-2.\) Notice that each factor \(B_{i\sigma(i)}\) of \(\prod_{i=1}^nB_{i\sigma(i)}\) with \(i\neq \sigma(i)\) does not contain \(x.\) So suppose that some \(\sigma \neq 1\) contributes a product of degree at least \(n-1.\) Then there are at least \(n-1\) integers \(i\) with \(1\leqslant i\leqslant n\) such that \(\sigma(i)=i.\) Let \(j\) denote the remaining integer. Since \(\sigma\) is injective, for any \(i\neq j\) we have \(\sigma(j)\neq \sigma(i)=i.\) Therefore, \(\sigma(j)=j\) and hence \(\sigma=1,\) a contradiction.
In summary, we have shown that \[\operatorname{char}_g(x)=x^n-\operatorname{Tr}(g)x^{n-1}+c_{n-2}x^{n-2}+\cdots+c_1x+c_0\] for coefficients \(c_{n-2},\ldots,c_0 \in \mathbb{K}.\) It remains to show that \(c_0=(-1)^n\det(g).\) We have \(c_0=\operatorname{char}_g(0)=\det(-g)=\det(-\mathbf{A}).\) Since the determinant is linear in each row of \(\mathbf{A},\) extracting the factor \(-1\) from each of the \(n\) rows gives \(\det(-\mathbf{A})=(-1)^n\det(\mathbf{A})=(-1)^n\det(g),\) as claimed.
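A numerical spot check of the theorem on a sample matrix (randomly generated, hence an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))       # a sample 4x4 matrix, so n = 4

coeffs = np.poly(A)               # coefficients of char_A, highest power first
assert np.isclose(coeffs[0], 1.0)                              # monic of degree n
assert np.isclose(coeffs[1], -np.trace(A))                     # -Tr(A) at x^{n-1}
assert np.isclose(coeffs[-1], (-1.0) ** 4 * np.linalg.det(A))  # (-1)^n det(A)
```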