12 Endomorphisms, II
12.1 Properties of eigenvalues
We will argue next that an endomorphism \(g : V \to V\) of a finite dimensional \(\mathbb{K}\)-vector space \(V\) has at most \(\dim(V)\) eigenvalues. We first need:
Theorem 12.1. For a polynomial \(f \in \mathsf{P}(\mathbb{K})\) of degree \(n\geqslant 1\) and \(x_0 \in \mathbb{K},\) there exists a polynomial \(g \in \mathsf{P}(\mathbb{K})\) of degree \(n-1\) such that for all \(x \in \mathbb{K}\) we have \(f(x)=f(x_0)+g(x)(x-x_0).\)
Proof. We will give an explicit expression for the polynomial \(g.\) If one is not interested in such an expression, a proof using induction can also be given. Write \(f(x)=\sum_{k=0}^n a_kx^k\) for coefficients \((a_0,\ldots,a_n) \in \mathbb{K}^{n+1}.\) For \(0\leqslant j\leqslant n-1\) consider \[\tag{12.1} b_j=\sum_{k=0}^{n-j-1}a_{k+j+1}x_0^k\] and the polynomial \[g(x)=\sum_{j=0}^{n-1}b_jx^j\] of degree \(n-1.\) We have \[\begin{aligned} g(x)(x-x_0)&=\sum_{j=0}^{n-1}\sum_{k=0}^{n-j-1}\left(a_{k+j+1}x_0^kx^{j+1}\right)-\sum_{j=0}^{n-1}\sum_{k=0}^{n-j-1}\left(a_{k+j+1}x_0^{k+1}x^{j}\right)\\ &=\sum_{j=1}^n\sum_{k=0}^{n-j}\left(a_{k+j}x_0^kx^j\right)-\sum_{j=0}^{n-1}\sum_{k=1}^{n-j}\left(a_{k+j}x_0^kx^j\right)\\ &=a_nx^n+\sum_{j=1}^{n-1}a_jx^j+a_0-a_0-\sum_{k=1}^na_kx_0^k=f(x)-f(x_0). \end{aligned}\]
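The formula (12.1) is easy to test numerically. The following is a minimal sketch (the function names divide_off and evaluate, and the sample polynomial, are ours): it builds the coefficients \(b_j\) from \((a_0,\ldots,a_n)\) and \(x_0,\) and checks the identity \(f(x)=f(x_0)+g(x)(x-x_0)\) at a few sample points.

```python
def divide_off(a, x0):
    """Coefficients b_0, ..., b_{n-1} of g, following formula (12.1)."""
    n = len(a) - 1
    return [sum(a[k + j + 1] * x0 ** k for k in range(n - j)) for j in range(n)]

def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

a = [2.0, -3.0, 0.0, 5.0]   # f(x) = 2 - 3x + 5x^3, so n = 3
x0 = 1.5
b = divide_off(a, x0)       # g has degree n - 1 = 2
for x in [-2.0, 0.0, 1.0, 3.0]:
    assert abs(evaluate(a, x) - (evaluate(a, x0) + evaluate(b, x) * (x - x0))) < 1e-9
```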
From Theorem 12.1 we conclude:
Proposition 12.2. Let \(f \in \mathsf{P}(\mathbb{K})\) be a polynomial of degree \(n.\) Then \(f\) has at most \(n\) (distinct) zeros or \(f\) is the zero polynomial.
Proof. We use induction on \(n.\) Base case: the case \(n=0\) is clear, since a constant polynomial either has no zeros or is the zero polynomial.
Inductive step: Suppose \(f \in \mathsf{P}(\mathbb{K})\) is a polynomial of degree \(n\) with \(n+1\) distinct zeros \(\lambda_1,\ldots,\lambda_{n+1}.\) Since \(f(\lambda_{n+1})=0,\) Theorem 12.1 implies that \[f(x)=(x-\lambda_{n+1})g(x)\] for some polynomial \(g\) of degree \(n-1.\) For \(1\leqslant i\leqslant n,\) we thus have \[0=f(\lambda_i)=(\lambda_i-\lambda_{n+1})g(\lambda_i).\] Since \(\lambda_i\neq \lambda_{n+1}\) it follows that \(g(\lambda_i)=0.\) Therefore, \(g\) has \(n\) distinct zeros and must be the zero polynomial by the induction hypothesis. It follows that \(f\) is the zero polynomial as well.
This gives:
Corollary 12.3. Let \(g : V \to V\) be an endomorphism of a \(\mathbb{K}\)-vector space of dimension \(n \in \mathbb{N}.\) Then \(g\) has at most \(n\) (distinct) eigenvalues.
Proof. By Lemma 11.38 and Lemma 11.43, the eigenvalues of \(g\) are the zeros of the characteristic polynomial. The characteristic polynomial of \(g\) has degree \(n.\) The claim follows by applying Proposition 12.2.
Proposition 12.4. Let \(V\) be a finite dimensional \(\mathbb{K}\)-vector space and \(g :V \to V\) an endomorphism. Then the eigenspaces \(\operatorname{Eig}_{g}(\lambda)\) of \(g\) are in direct sum. In particular, if \(v_1,\ldots,v_m\) are eigenvectors corresponding to distinct eigenvalues of \(g,\) then \(v_1,\ldots,v_m\) are linearly independent.
Proof. We use induction on the number \(m\) of distinct eigenvalues of \(g.\) Let \(\lambda_1,\ldots,\lambda_m\) be distinct eigenvalues of \(g.\) Base case: for \(m=1\) the statement is trivially true.
Inductive step: Assume that any \(m-1\) of the eigenspaces are in direct sum. We want to show that the \(m\) eigenspaces are also in direct sum. Let \(v_i,v_i^{\prime} \in \operatorname{Eig}_{g}(\lambda_i)\) be such that \[\tag{12.2} v_1+v_2+\cdots+v_m=v_1^{\prime}+v_2^{\prime}+\cdots+v_m^{\prime}.\] Applying \(g\) to this last equation gives \[\tag{12.3} \lambda_1v_1+\lambda_2 v_2+\cdots+\lambda_m v_m=\lambda_1 v_1^{\prime}+\lambda_2 v_2^{\prime}+\cdots+\lambda_m v_m^{\prime}.\] Subtracting \(\lambda_m\) times (12.2) from (12.3) gives \[(\lambda_1-\lambda_m)v_1+\cdots+(\lambda_{m-1}-\lambda_m)v_{m-1}=(\lambda_1-\lambda_m)v_1^{\prime}+\cdots+(\lambda_{m-1}-\lambda_m)v_{m-1}^{\prime}.\] Since any \(m-1\) eigenspaces are in direct sum, this implies that \((\lambda_i-\lambda_m)v_i=(\lambda_i-\lambda_m)v_i^{\prime}\) for \(1\leqslant i\leqslant m-1.\) Since the eigenvalues are distinct, we have \(\lambda_i-\lambda_m\neq 0\) for all \(1\leqslant i\leqslant m-1\) and hence \(v_i=v_i^{\prime}\) for all \(1\leqslant i\leqslant m-1.\) Now (12.2) implies that \(v_m=v_m^{\prime}\) as well and the inductive step is complete.
Since the eigenspaces are in direct sum, the linear independence of eigenvectors with respect to distinct eigenvalues follows from Remark 11.19 (ii).
In the case where \(g\) has \(\dim(V)\) distinct eigenvalues, we conclude that \(g\) is diagonalisable.
Proposition 12.5. Let \(g : V \to V\) be an endomorphism of a finite dimensional \(\mathbb{K}\)-vector space \(V.\) If the characteristic polynomial of \(g\) has \(\dim(V)\) distinct zeros (that is, the algebraic multiplicity of each eigenvalue is \(1\)), then \(g\) is diagonalisable.
Proof. Let \(n=\dim(V)\) and let \(\lambda_1,\ldots,\lambda_n\) denote the distinct eigenvalues of \(g.\) Let \(0_V\neq v_i \in \operatorname{Eig}_g(\lambda_i)\) for \(i=1,\ldots,n.\) By Proposition 12.4, the eigenvectors \(v_1,\ldots,v_n\) are linearly independent. It follows that \((v_1,\ldots,v_n)\) is an ordered basis of \(V\) consisting of eigenvectors, hence \(g\) is diagonalisable.
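Numerically, Proposition 12.5 can be illustrated with numpy: numpy.linalg.eig returns the eigenvalues of a matrix together with a matrix \(\mathbf{C}\) whose columns are corresponding eigenvectors. The following sketch uses a sample matrix of our own with three distinct eigenvalues, so \(\mathbf{C}\) is invertible and diagonalises it.

```python
import numpy as np

# A sample matrix with three distinct eigenvalues (2, -1 and 4).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, -1.0, 3.0],
              [0.0, 0.0, 4.0]])
eigenvalues, C = np.linalg.eig(A)   # columns of C are eigenvectors
D = np.diag(eigenvalues)
# Since the eigenvalues are distinct, C is invertible and A = C D C^{-1}:
assert np.allclose(A, C @ D @ np.linalg.inv(C))
```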
Proposition 12.5 gives a sufficient condition for an endomorphism \(g : V \to V\) to be diagonalisable; it is, however, not necessary. The identity endomorphism is diagonalisable, but its spectrum consists of the single eigenvalue \(1\) with algebraic multiplicity \(\dim(V).\)
Every polynomial in \(\mathsf{P}(\mathbb{C})\) of degree at least \(1\) has at least one zero. This fact is known as the fundamental theorem of algebra. The name is well-established, but quite misleading, as there is no purely algebraic proof. You will encounter a proof of this statement in the module M07. As a consequence we obtain the following important existence theorem:
Theorem 12.7. Let \(g : V \to V\) be an endomorphism of a complex vector space \(V\) of dimension \(n\geqslant 1.\) Then \(g\) admits at least one eigenvalue. Moreover, the sum of the algebraic multiplicities of the eigenvalues of \(g\) is equal to \(n.\) In particular, if \(\mathbf{A}\in M_{n,n}(\mathbb{C})\) is a matrix, then there is at least one eigenvalue of \(\mathbf{A}.\)
Proof. By Lemma 11.38 and Lemma 11.43, the eigenvalues of \(g\) are the zeros of the characteristic polynomial and this is an element of \(\mathsf{P}(\mathbb{C}).\) The first statement thus follows by applying the fundamental theorem of algebra to the characteristic polynomial of \(g.\)
Applying Theorem 12.1 and the fundamental theorem of algebra repeatedly, we find \(k \in \mathbb{N}\) and multiplicities \(m_1,\ldots,m_k \in \mathbb{N}\) such that \[\operatorname{char}_g(x)=(x-\lambda_1)^{m_1}(x-\lambda_2)^{m_2}\cdots(x-\lambda_k)^{m_k}\] where \(\lambda_1,\ldots,\lambda_k\) are the distinct zeros of \(\operatorname{char}_g.\) Since \(\operatorname{char}_g\) has degree \(n,\) it follows that \(\sum_{i=1}^k m_i=n.\)
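As a numerical illustration of Theorem 12.7 (the sample matrix is our own): numpy.poly computes the coefficients of the characteristic polynomial of a matrix, and numpy.roots returns its \(n\) complex roots counted with multiplicity.

```python
import numpy as np

# Companion matrix of (x - 1)^2 (x - 2) = x^3 - 4x^2 + 5x - 2.
A = np.array([[0.0, 0.0, 2.0],
              [1.0, 0.0, -5.0],
              [0.0, 1.0, 4.0]])
coeffs = np.poly(A)             # characteristic polynomial, leading term first
roots = np.roots(coeffs)        # n complex roots, counted with multiplicity
assert len(roots) == A.shape[0]
print(np.sort_complex(roots))   # approximately [1, 1, 2]
```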
Recall that the discriminant of a quadratic polynomial \(x\mapsto ax^2+bx+c \in \mathsf{P}(\mathbb{K})\) is \(b^2-4ac,\) provided \(a\neq 0.\) If \(\mathbb{K}=\mathbb{C}\) and \(b^2-4ac\) is non-zero, then the polynomial \(ax^2+bx+c\) has two distinct zeros. The characteristic polynomial of a \(2\)-by-\(2\) matrix \(\mathbf{A}\) satisfies \(\operatorname{char}_\mathbf{A}(x)=x^2-\operatorname{Tr}(\mathbf{A})x+\det(\mathbf{A}).\) Therefore, if \(\mathbf{A}\) has complex entries and satisfies \((\operatorname{Tr}\mathbf{A})^2-4\det \mathbf{A}\neq 0,\) then it is diagonalisable. If \(\mathbf{A}\) has real entries and satisfies \((\operatorname{Tr}\mathbf{A})^2-4\det \mathbf{A}\geqslant 0,\) then it has at least one eigenvalue. If \((\operatorname{Tr}\mathbf{A})^2-4\det \mathbf{A}>0,\) then it is diagonalisable.
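A short numpy sketch of the discriminant test (the two sample matrices are ours):

```python
import numpy as np

def discriminant(A):
    """(Tr A)^2 - 4 det A for a 2-by-2 matrix A."""
    return np.trace(A) ** 2 - 4.0 * np.linalg.det(A)

rotation = np.array([[0.0, -1.0], [1.0, 0.0]])  # rotation by 90 degrees
shear = np.array([[1.0, 1.0], [0.0, 1.0]])
print(discriminant(rotation))   # -4.0: no real eigenvalues, diagonalisable over C
print(discriminant(shear))      #  0.0: repeated eigenvalue, the test is inconclusive
```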
Recall that, by Proposition 10.4, an upper triangular matrix \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant n}\) satisfies \(\det \mathbf{A}=\prod_{i=1}^n A_{ii}.\) It follows that \[\operatorname{char}_\mathbf{A}(x)=\prod_{i=1}^n(x-A_{ii})=(x-A_{11})(x-A_{22})\cdots(x-A_{nn}).\] Consequently, an upper triangular matrix has spectrum \(\{A_{11},A_{22},\ldots,A_{nn}\}\) and is diagonalisable if all its diagonal entries are distinct. Notice that by Example 11.42 (ii) not every upper triangular matrix is diagonalisable.
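This can again be checked numerically; np.linalg.eigvals returns the eigenvalues, which for an upper triangular matrix are exactly the diagonal entries (the sample matrix is ours):

```python
import numpy as np

A = np.array([[3.0, 5.0, -2.0],
              [0.0, 1.0, 7.0],
              [0.0, 0.0, 4.0]])
# The eigenvalues of an upper triangular matrix are its diagonal entries;
# since 3, 1 and 4 are distinct, this particular A is moreover diagonalisable.
assert np.allclose(np.sort(np.linalg.eigvals(A).real), np.sort(np.diag(A)))
```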
The Fibonacci sequence is the sequence \((F_n)_{n \in \mathbb{N}}\) defined by the relations \[F_0 = 0,\quad F_1 = 1, \quad F_n = F_{n-1} + F_{n-2}\text{ for $n \ge 2$}.\]
Consider the matrix \[\mathbf{A}= \begin{pmatrix}0 & 1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} F_0 & F_1 \\ F_1 & F_2 \end{pmatrix}.\] Then, using induction, we can show that \[\mathbf{A}^n=\begin{pmatrix} F_{n-1} & F_n \\ F_n & F_{n+1}\end{pmatrix}\] for all \(n \geqslant 1.\) We would like to give a simple formula for computing \(\mathbf{A}^n,\) and hence the Fibonacci numbers \(F_n.\)
Suppose we can find an invertible matrix \(\mathbf{C}\) so that \(\mathbf{A}=\mathbf{C}\mathbf{D}\mathbf{C}^{-1}\) for some diagonal matrix \(\mathbf{D}.\) Then \[\mathbf{A}^n=\mathbf{C}\mathbf{D}\mathbf{C}^{-1}\mathbf{C}\mathbf{D}\mathbf{C}^{-1}\cdots \mathbf{C}\mathbf{D}\mathbf{C}^{-1}=\mathbf{C}\mathbf{D}^n\mathbf{C}^{-1}\] and we can easily compute \(\mathbf{A}^n,\) as the \(n\)-th power of a diagonal matrix \(\mathbf{D}\) is the diagonal matrix whose diagonal entries are given by the \(n\)-th powers of diagonal entries of \(\mathbf{D}.\) We thus want to diagonalise the matrix \[\mathbf{A}=\begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}.\] We obtain \(\operatorname{char}_\mathbf{A}(x)=x^2-x-1\) and hence eigenvalues \(\lambda_{1}=(1+\sqrt{5})/2\) and \(\lambda_2=(1-\sqrt{5})/2.\) From this we compute \[\operatorname{Eig}_{\mathbf{A}}(\lambda_1)=\operatorname{span}\left\{\begin{pmatrix} 1 \\ \lambda_1 \end{pmatrix}\right\}\qquad \text{and} \qquad \operatorname{Eig}_{\mathbf{A}}(\lambda_2)=\operatorname{span}\left\{\begin{pmatrix} 1 \\ \lambda_2 \end{pmatrix}\right\}\]
Let \(\mathbf{e}=(\vec{e}_1,\vec{e}_2)\) denote the standard basis of \(\mathbb{R}^2\) and consider the ordered basis \[\mathbf{b}=\left(\begin{pmatrix} 1 \\ \lambda_1 \end{pmatrix},\begin{pmatrix} 1 \\ \lambda_2 \end{pmatrix}\right)\] of eigenvectors of \(f_\mathbf{A}.\) We have \[\mathbf{M}(f_\mathbf{A},\mathbf{b},\mathbf{b})=\begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2\end{pmatrix}=\mathbf{D}\] and the change of basis matrix is \[\mathbf{C}=\mathbf{C}(\mathbf{b},\mathbf{e})=\begin{pmatrix} 1 & 1 \\ \lambda_1 & \lambda_2 \end{pmatrix}\] with inverse \[\mathbf{C}^{-1} =\mathbf{C}(\mathbf{e},\mathbf{b})=\frac{1}{\lambda_2-\lambda_1}\begin{pmatrix} \lambda_2 & -1 \\ -\lambda_1 & 1 \end{pmatrix}.\] Therefore \(\mathbf{A}=\mathbf{C}\mathbf{D}\mathbf{C}^{-1}\) and hence \(\mathbf{A}^n=\mathbf{C}\mathbf{D}^n\mathbf{C}^{-1}\) so that \[\mathbf{A}^n=\frac{1}{\lambda_2-\lambda_1}\begin{pmatrix} 1 & 1 \\ \lambda_1 & \lambda_2 \end{pmatrix}\begin{pmatrix} \lambda_1^n & 0 \\ 0 & \lambda_2^n \end{pmatrix}\begin{pmatrix} \lambda_2 & -1 \\ -\lambda_1 & 1 \end{pmatrix}=\begin{pmatrix} F_{n-1} & F_n \\ F_n & F_{n+1}\end{pmatrix}.\] This yields the formula \[F_n=\frac{\lambda_1^n-\lambda_2^n}{\lambda_1-\lambda_2} = \frac{((1 + \sqrt{5})/2) ^ n - ((1 - \sqrt{5})/2)^n}{\sqrt{5}}.\]
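The closed formula can be checked directly against the defining recurrence. The following sketch (function names are ours) compares the two for small \(n,\) where floating point arithmetic is exact enough after rounding:

```python
import math

def fib_closed(n):
    """F_n via the closed formula obtained from the diagonalisation."""
    sqrt5 = math.sqrt(5.0)
    lam1 = (1.0 + sqrt5) / 2.0
    lam2 = (1.0 - sqrt5) / 2.0
    return round((lam1 ** n - lam2 ** n) / sqrt5)

def fib_by_recurrence(n):
    """The list F_0, ..., F_n via F_n = F_{n-1} + F_{n-2}."""
    fibs = [0, 1]
    for _ in range(2, n + 1):
        fibs.append(fibs[-1] + fibs[-2])
    return fibs[: n + 1]

assert [fib_closed(n) for n in range(20)] == fib_by_recurrence(19)
```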
Let \(g : V \to V\) be an endomorphism of a finite dimensional \(\mathbb{K}\)-vector space \(V\) of dimension \(n\geqslant 1.\)
(i) Let \(\lambda\) be an eigenvalue of \(g.\) Then its algebraic multiplicity is at least as big as its geometric multiplicity.
(ii) If \(\mathbb{K}=\mathbb{C},\) then \(g\) is diagonalisable if and only if for all eigenvalues of \(g,\) the algebraic and geometric multiplicity are the same.
Proof. (i) Let \(\dim \operatorname{Eig}_{g}(\lambda)=m\) and \(\mathbf{b}\) be an ordered basis of \(\operatorname{Eig}_{g}(\lambda).\) Furthermore, let \(\mathbf{b}^{\prime}\) be an ordered tuple of vectors such that \(\mathbf{c}=(\mathbf{b},\mathbf{b}^{\prime})\) is an ordered basis of \(V.\) The eigenspace \(\operatorname{Eig}_{g}(\lambda)\) is stable under \(g\) and \[\mathbf{M}(g|_{\operatorname{Eig}_{g}(\lambda)},\mathbf{b},\mathbf{b})=\lambda\mathbf{1}_{m}.\] By Proposition 11.36, the matrix representation of \(g\) with respect to the basis \(\mathbf{c}\) takes the form \[\mathbf{M}(g,\mathbf{c},\mathbf{c})=\begin{pmatrix} \lambda\mathbf{1}_{m} & * \\ \mathbf{0}_{n-m,m} & \mathbf{B}\end{pmatrix}\] for some matrix \(\mathbf{B}\in M_{n-m,n-m}(\mathbb{K}).\) We thus obtain \[\operatorname{char}_g(x)=\det\begin{pmatrix} (x-\lambda)\mathbf{1}_{m} & * \\ \mathbf{0}_{n-m,m} & x\mathbf{1}_{n-m}-\mathbf{B}\end{pmatrix}.\] Applying the Laplace expansion (9.5) with respect to the first column, we have \[\operatorname{char}_g(x)=(x-\lambda)\det\begin{pmatrix} (x-\lambda)\mathbf{1}_{m-1} & * \\ \mathbf{0}_{n-m,m-1} & x\mathbf{1}_{n-m}-\mathbf{B}\end{pmatrix}.\] Applying the Laplace expansion again with respect to the first column, \(m\) times in total, we get \[\operatorname{char}_g(x)=(x-\lambda)^m\det(x\mathbf{1}_{n-m}-\mathbf{B})=(x-\lambda)^m\operatorname{char}_\mathbf{B}(x).\] The algebraic multiplicity of \(\lambda\) is thus at least \(m.\)
(ii) Suppose \(\mathbb{K}=\mathbb{C}\) and that \(g :V \to V\) is diagonalisable. Hence we have an ordered basis \((v_1,\ldots,v_n)\) of \(V\) consisting of eigenvectors of \(g.\) Therefore, \[\operatorname{char}_g(x)=\prod_{i=1}^n(x-\lambda_i)\] where \(\lambda_i\) is the eigenvalue of the eigenvector \(v_i,\) \(1\leqslant i \leqslant n.\) For any eigenvalue \(\lambda_j,\) its algebraic multiplicity is the number of indices \(i\) with \(\lambda_i=\lambda_j.\) For each such index \(i,\) the eigenvector \(v_i\) satisfies \(g(v_i)=\lambda_iv_i=\lambda_j v_i\) and hence is an element of the eigenspace \(\operatorname{Eig}_{g}(\lambda_j).\) The geometric multiplicity of each eigenvalue is thus at least as big as the algebraic multiplicity; by part (i), it also cannot exceed the algebraic multiplicity, hence the two are equal.
Conversely, suppose that for all eigenvalues of \(g,\) the algebraic and geometric multiplicity are the same. Since \(\mathbb{K}=\mathbb{C},\) by Theorem 12.7, the sum of the algebraic multiplicities is \(n.\) The sum of the geometric multiplicities is by assumption also \(n.\) Since, by Proposition 12.4, the eigenspaces with respect to different eigenvalues are in direct sum, we obtain a basis of \(V\) consisting of eigenvectors of \(g.\)
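With sympy one can compare the two multiplicities directly (the sample matrix is ours): Matrix.eigenvals returns each eigenvalue with its algebraic multiplicity, while the number of basis vectors returned by Matrix.eigenvects per eigenvalue is the geometric multiplicity.

```python
import sympy as sp

A = sp.Matrix([[1, 1], [0, 1]])
print(A.eigenvals())            # {1: 2}: algebraic multiplicity 2
lam, alg_mult, basis = A.eigenvects()[0]
print(len(basis))               # 1: geometric multiplicity 1
print(A.is_diagonalizable())    # False, since the multiplicities differ
```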
12.2 Special endomorphisms
Involutions
A mapping \(\iota : \mathcal{X} \to \mathcal{X}\) from a set \(\mathcal{X}\) into itself is called an involution if \(\iota \circ \iota=\mathrm{Id}_\mathcal{X}.\) If \(\mathcal{X}\) is a vector space and \(\iota\) is linear, then \(\iota\) is called a linear involution.
Let \(V\) be a \(\mathbb{K}\)-vector space. Then the identity mapping \(\mathrm{Id}_V : V \to V\) is a linear involution.
For all \(n \in \mathbb{N},\) the transposition mapping \(M_{n,n}(\mathbb{K}) \to M_{n,n}(\mathbb{K}),\) \(\mathbf{A}\mapsto \mathbf{A}^T,\) is a linear involution.
For \(n \in \mathbb{N},\) let \(\mathcal{X}\) denote the set of invertible \(n\times n\) matrices. Then the matrix inverse \(^{-1} : \mathcal{X} \to \mathcal{X}\) is an involution. Notice that \(\mathcal{X}\) is not a vector space.
For any \(\mathbb{K}\)-vector space \(V,\) the mapping \(\iota : V \to V, v \mapsto -v\) is a linear involution. Considering \(\mathsf{F}(I,\mathbb{K}),\) the \(\mathbb{K}\)-vector space of functions on an interval \(I\subset \mathbb{R}\) that is symmetric about the origin, we obtain a linear involution of \(\mathsf{F}(I,\mathbb{K})\) by sending a function \(f\) to \(f\circ \iota,\) where \(\iota(x)=-x.\)
If \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) satisfies \(\mathbf{A}^2=\mathbf{1}_{n},\) then \(f_\mathbf{A}: \mathbb{K}^n \to \mathbb{K}^n\) is a linear involution.
In the rest of this section we suppose \(2 \ne 0\) in \(\mathbb{K}.\)
Proposition 12.12. Let \(V\) be a \(\mathbb{K}\)-vector space and \(\iota : V \to V\) a linear involution. Then the spectrum of \(\iota\) is contained in \(\{-1,1\}.\) Moreover \(V=\operatorname{Eig}_{\iota}(1)\oplus \operatorname{Eig}_{\iota}(-1)\) and \(\iota\) is diagonalisable.
Proof. Suppose \(\lambda \in \mathbb{K}\) is an eigenvalue of \(\iota\) so that \(\iota(v)=\lambda v\) for some non-zero vector \(v \in V.\) Then \(\iota(\iota(v))=v=\lambda \iota(v)=\lambda^2 v.\) Hence \((1-\lambda^2)v=0_V\) and since \(v\) is non-zero, we conclude that \(\lambda=\pm 1.\)
By Proposition 12.4, the eigenspaces \(\operatorname{Eig}_{\iota}(1)\) and \(\operatorname{Eig}_{\iota}(-1)\) are in direct sum (interpreting \(\operatorname{Eig}_{\iota}(1)\) as \(\{0\}\) if \(1\) is not an eigenvalue, and similarly for \(-1\)). What we need to show is that their sum is all of \(V.\) For \(v \in V\) we write \[v= \frac{1}{2}(v+\iota(v))+\frac{1}{2}(v-\iota(v)).\] We claim that \(\frac{1}{2}(v+\iota(v)) \in \operatorname{Eig}_{\iota}(1),\) and \(\frac{1}{2}(v-\iota(v)) \in \operatorname{Eig}_{\iota}(-1).\)
For the first half of the claim, we compute \[\iota\left(\frac{1}{2}(v+\iota(v))\right)= \frac{1}{2}(\iota(v) + \iota(\iota(v))) = \frac{1}{2}(\iota(v) + v).\] The second half of the claim is similar.
It follows that the sum of \(\operatorname{Eig}_{\iota}(1)\) and \(\operatorname{Eig}_\iota(-1)\) is \(V,\) so \(\iota\) is diagonalisable as required.
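For a concrete instance of the decomposition \(v=\frac{1}{2}(v+\iota(v))+\frac{1}{2}(v-\iota(v)),\) take the transposition involution on \(M_{n,n}(\mathbb{R})\) from the examples above: every square matrix splits into a symmetric part (in \(\operatorname{Eig}_{\iota}(1)\)) and an antisymmetric part (in \(\operatorname{Eig}_{\iota}(-1)\)). A quick numerical check on a sample matrix of our own:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
sym = (A + A.T) / 2.0    # (v + i(v))/2: the symmetric part, i(sym) = sym
skew = (A - A.T) / 2.0   # (v - i(v))/2: the antisymmetric part, i(skew) = -skew
assert np.allclose(A, sym + skew)
assert np.allclose(sym.T, sym) and np.allclose(skew.T, -skew)
```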
Projections
A linear mapping \(\Pi : V \to V\) satisfying \(\Pi\circ \Pi=\Pi\) is called a projection.
Consider \(V=\mathbb{R}^3\) and \[\mathbf{A}=\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.\] Clearly, \(\mathbf{A}^2=\mathbf{A}\) and \(f_\mathbf{A}: \mathbb{R}^3 \to \mathbb{R}^3\) projects a vector \(\vec{x}=(x_i)_{1\leqslant i\leqslant 3}\) onto the plane \(\{\vec{y} \in \mathbb{R}^3 | y_3=0\}.\)
In analogy with Proposition 12.12, we obtain:
Proposition 12.14. Let \(V\) be a \(\mathbb{K}\)-vector space and \(\Pi : V \to V\) a projection. Then the spectrum of \(\Pi\) is contained in \(\{0,1\}.\) Moreover \(V=\operatorname{Eig}_{\Pi}(0)\oplus \operatorname{Eig}_{\Pi}(1),\) \(\Pi\) is diagonalisable and \(\operatorname{Im}\Pi=\operatorname{Eig}_{\Pi}(1).\)
Proof. Let \(v \in V\) be an eigenvector of the projection \(\Pi\) with eigenvalue \(\lambda.\) Then \(\Pi(\Pi(v))=\lambda^2 v=\Pi(v)=\lambda v,\) equivalently, \(\lambda(\lambda-1)v=0_V.\) Since \(v\) is non-zero, it follows that \(\lambda=0\) or \(\lambda=1.\)
Using an argument similar to Proposition 12.12, one can show that \(V = \operatorname{Ker}\Pi + \operatorname{Im}\Pi:\) indeed, for \(v \in V\) we write \(v=(v-\Pi(v))+\Pi(v)\) and note that \(\Pi(v-\Pi(v))=\Pi(v)-\Pi(\Pi(v))=0_V.\) Since \(\operatorname{Ker}\Pi=\operatorname{Eig}_\Pi(0),\) the proposition will follow if we can show that \(\operatorname{Im}\Pi=\operatorname{Eig}_{\Pi}(1).\) Let \(v \in \operatorname{Im}\Pi\) so that \(v=\Pi(\hat{v})\) for some vector \(\hat{v} \in V.\) Hence \(\Pi(v)=\Pi(\Pi(\hat{v}))=\Pi(\hat{v})=v\) and \(v\) is an eigenvector with eigenvalue \(1.\) Conversely, suppose \(v \in V\) is an eigenvector of \(\Pi\) with eigenvalue \(1.\) Then \(\Pi(v)=v=\Pi(\Pi(v))\) and hence \(v \in \operatorname{Im}\Pi.\) We thus conclude that \(\operatorname{Im}\Pi=\operatorname{Eig}_{\Pi}(1).\) Choosing an ordered basis of \(\operatorname{Ker}\Pi\) and an ordered basis of \(\operatorname{Im}\Pi\) gives a basis of \(V\) consisting of eigenvectors, hence \(\Pi\) is diagonalisable.
Remark 12.15. In a sense there is only one kind of projection. It follows from Proposition 12.14 that for a projection \(\Pi : V \to V,\) we have \(V=\operatorname{Ker}\Pi \oplus \operatorname{Im}\Pi.\)
Conversely, given two subspaces \(U_1,U_2\) of \(V\) such that \(V=U_1 \oplus U_2,\) there is a projection \(\Pi : V \to V\) whose kernel is \(U_1\) and whose image is \(U_2.\) Indeed, every vector \(v\in V\) can be written as \(v=u_1+u_2\) for unique vectors \(u_i \in U_i\) for \(i=1,2.\) Hence we obtain a projection by defining \(\Pi(v)=u_2\) for all \(v \in V.\)
Denote by \(\mathcal{X}\) the set of projections from \(V\) to \(V\) and by \(\mathcal{Y}\) the set of pairs \((U_1,U_2)\) of subspaces of \(V\) satisfying \(V=U_1\oplus U_2.\) Then we obtain a mapping \(\Lambda : \mathcal{X} \to \mathcal{Y}\) defined by \(f \mapsto (\operatorname{Ker}f,\operatorname{Im}f),\) and one can check that this is a bijection (see the exercises below).
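A sketch of the construction in coordinates (the subspaces and the solving strategy are our own choices): given bases of \(U_1\) and \(U_2\) with \(\mathbb{R}^3=U_1\oplus U_2,\) the coordinates of \(v\) in the combined basis yield the two components, and \(\Pi(v)\) is the \(U_2\) component.

```python
import numpy as np

U1 = np.array([[1.0], [1.0], [0.0]])                  # basis of the kernel
U2 = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])   # basis of the image
B = np.hstack([U1, U2])     # the columns form a basis of R^3 = U1 (+) U2
v = np.array([2.0, 3.0, 5.0])
c = np.linalg.solve(B, v)   # coordinates of v in this basis
pi_v = U2 @ c[1:]           # Pi(v): keep the U2 component, drop the U1 part
assert np.allclose(v, U1 @ c[:1] + pi_v)
print(pi_v)                 # [4. 5. 5.]
```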
Exercises
Derive the formula (12.1) for the coefficients \(b_j.\)
Solution
We first note that for \(k\geqslant 1\) we have \[x^k-x_0^k = (x-x_0)\sum_{j=0}^{k-1}x^jx_0^{k-1-j},\] which can be checked by a direct computation. Using this expression we find \[\begin{aligned}f(x)-f(x_0) & = \sum_{k=0}^na_k(x^k-x_0^k)\\ & = (x-x_0)\sum_{k=0}^n\sum_{j=0}^{k-1}a_kx^jx_0^{k-1-j}\\ & = (x-x_0)\sum_{j=0}^{n-1}\sum_{k=j+1}^na_kx_0^{k-1-j}x^j\\ & = (x-x_0)\sum_{j=0}^{n-1}\bigg(\underbrace{\sum_{k=0}^{n-j-1}a_{k+j+1}x_0^{k}}_{=b_j}\bigg)x^j \end{aligned}\] and the claim follows.
Show that the map \(\Lambda\) of Remark 12.15 is a bijection.
Solution
We need to show that \(\Lambda:\mathcal X\to\mathcal Y\) is both injective and surjective. Injectivity: Let \(\Pi_1,\Pi_2\in \mathcal X\) be two distinct projections. Since \(V=\operatorname{Ker}\Pi_1\oplus \operatorname{Im}\Pi_1\) and both mappings are linear, \(\Pi_1\) and \(\Pi_2\) must already differ on \(\operatorname{Ker}\Pi_1\) or on \(\operatorname{Im}\Pi_1\): if they agreed on both subspaces, they would agree on all of \(V.\) Hence there is a non-zero vector \(v\in V\) with \(\Pi_1(v)\ne \Pi_2(v)\) such that \(v\in\operatorname{Ker}\Pi_1\) or \(v\in\operatorname{Im}\Pi_1\):
If \(v\in \operatorname{Ker}\Pi_1,\) then \(\Pi_2(v)\ne\Pi_1(v)=0_V,\) so \(v\notin \operatorname{Ker}\Pi_2.\) Therefore \(\operatorname{Ker}\Pi_1\ne\operatorname{Ker}\Pi_2\) and hence \(\Lambda(\Pi_1)\ne\Lambda(\Pi_2).\)
If \(v\in\operatorname{Im}\Pi_1,\) then \(\Pi_1(v)=v\ne \Pi_2(v).\) However, if \(v\in\operatorname{Im}\Pi_2,\) then \(\Pi_2(v)=v\) which is a contradiction. Therefore, \(v\notin\operatorname{Im}\Pi_2\) and \(\Lambda(\Pi_1)\ne\Lambda(\Pi_2).\)
Surjectivity: This is done exactly as indicated in the lecture notes. If \((U_1,U_2)\in\mathcal Y,\) then every \(v\in V\) can uniquely be written as \(v=u_1+u_2,\) where \(u_1\in U_1\) and \(u_2\in U_2.\) Let \(\Pi:V\to V\) be defined as \(\Pi(v)=u_2.\) Then \(\Pi\) is linear, \(\Pi(\Pi(v)) = \Pi(u_2) = \Pi(0_V+u_2) = u_2\) and \(\operatorname{Ker}\Pi=U_1,\) \(\operatorname{Im}\Pi=U_2.\)
Show that if \(\Pi : V \to V\) is a projection then \(\mathrm{Id}_V - \Pi : V \to V\) is a projection with kernel equal to the image of \(\Pi\) and image equal to the kernel of \(\Pi.\)
Solution
We first show that \(\mathrm{Id}_V-\Pi\) is a projection: for all \(v\in V,\) \[\begin{aligned}((\mathrm{Id}_V-\Pi)\circ(\mathrm{Id}_V-\Pi))(v) & = (\mathrm{Id}_V-\Pi)(v-\Pi(v))\\ & = v-\Pi(v)-\Pi(v-\Pi(v))\\ & = v-\Pi(v)-\Pi(v)+\Pi(\Pi(v))\\ & = v-\Pi(v)-\Pi(v)+\Pi(v)\\ & = (\mathrm{Id}_V-\Pi)(v). \end{aligned}\] We show \(\operatorname{Ker}(\mathrm{Id}_V-\Pi)=\operatorname{Im}\Pi\) by showing both inclusions. Let \(v\in \operatorname{Ker}(\mathrm{Id}_V-\Pi).\) Then \(v-\Pi(v)=0_V,\) that is, \(v=\Pi(v),\) which shows that \(v\in \operatorname{Im}\Pi.\) In order to show the reverse inclusion, let \(w\in \operatorname{Im}\Pi.\) Then there exists \(v\in V\) such that \(\Pi(v) = w,\) and we compute \[(\mathrm{Id}_V-\Pi)(w) = w-\Pi(w) = \Pi(v)-\Pi(\Pi(v))= 0_V,\] hence \(w\in \operatorname{Ker}(\mathrm{Id}_V-\Pi).\)
In order to show \(\operatorname{Im}(\mathrm{Id}_V-\Pi)=\operatorname{Ker}\Pi,\) we apply the above argument to the projection \(\widetilde \Pi = \mathrm{Id}_V-\Pi\) to conclude that \(\operatorname{Ker}(\mathrm{Id}_V-\widetilde \Pi) = \operatorname{Im}\widetilde \Pi\) and thus \[\operatorname{Im}(\mathrm{Id}_V-\Pi) = \operatorname{Im}\widetilde \Pi = \operatorname{Ker}(\mathrm{Id}_V-\widetilde\Pi) = \operatorname{Ker}\Pi.\]
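A quick numerical check of the exercise for the projection onto the plane \(x_3=0\) from the earlier example:

```python
import numpy as np

P = np.diag([1.0, 1.0, 0.0])   # projection onto the plane x3 = 0
Q = np.eye(3) - P              # the complementary projection Id - P
assert np.allclose(Q @ Q, Q)                  # Q is again a projection
assert np.allclose(Q @ P, np.zeros((3, 3)))   # Im(P) lies in Ker(Q)
assert np.allclose(P @ Q, np.zeros((3, 3)))   # Im(Q) lies in Ker(P)
```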
If \(\lambda\ne\mu\) are eigenvalues of \(f:\mathbb{R}^3\to\mathbb{R}^3,\) then \(\operatorname{Eig}_f(\lambda)\oplus \operatorname{Eig}_f(\mu)=\mathbb{R}^3.\)
- True
- False
If \(\lambda\ne\mu\) are eigenvalues of \(f:\mathbb{R}^2\to\mathbb{R}^2,\) then \(\operatorname{Eig}_f(\lambda)\oplus \operatorname{Eig}_f(\mu)=\mathbb{R}^2.\)
- True
- False
Rotations in \(\mathbb{R}^2\) do not have any eigenvalues.
- True
- False
Reflections in \(\mathbb{R}^2\) have two eigenvalues.
- True
- False
The linear endomorphism corresponding to \(\mathbf{A}\in M_{2,2}(\mathbb{R})\) can have either \(0,\) \(1,\) or \(2\) distinct eigenvalues.
- True
- False
If \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) has fewer than \(n\) distinct eigenvalues, \(\mathbf{A}\) cannot be diagonalisable.
- True
- False
If \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) is diagonalisable and invertible, so is \(\mathbf{A}^{-1}.\)
- True
- False
If \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) is diagonalisable, so is \(\mathbf{A}^T.\)
- True
- False
If \(\mathbf{A}\in M_{n,n}(\mathbb{K}),\) \(\mathbf{A}\) and \(\mathbf{A}^T\) have the same eigenvalues.
- True
- False
If \(\mathbf{A}\in M_{n,n}(\mathbb{K}),\) \(\mathbf{A}\) and \(\mathbf{A}^T\) have the same eigenvectors.
- True
- False
If \(\mathbf{A}\in M_{2,2}(\mathbb{R})\) is such that \(\operatorname{char}_\mathbf{A}(x) = x^2+bx+c\) with \(b^2-4c>0,\) then \(\mathbf{A}\) is diagonalisable.
- True
- False
Let \(\mathbf{A},\mathbf{B}\in M_{n,n}(\mathbb{R})\) and let \(\mathbf{C}\in M_{n,n}(\mathbb{R})\) be invertible such that \(\mathbf{C}^{-1}\mathbf{A}\mathbf{C}\) and \(\mathbf{C}^{-1}\mathbf{B}\mathbf{C}\) are diagonal. Then \(\mathbf{A}\mathbf{B}= \mathbf{B}\mathbf{A}.\)
- True
- False
If \(\mathbf{A}\in M_{n,n}(\mathbb{R})\) has spectrum \(\{\lambda_1,\dots,\lambda_n\},\) then \(\operatorname{Tr}(\mathbf{A})=\sum_{i=1}^n \lambda_i.\)
- True
- False
If \(\mathbf{A}\in M_{n,n}(\mathbb{R})\) has spectrum \(\{\lambda_1,\dots,\lambda_n\},\) then \(\det(\mathbf{A})=\prod_{i=1}^n \lambda_i.\)
- True
- False
Given an invertible matrix \(\mathbf{A}\in M_{n,n}(\mathbb{K}),\) \(\mathbf{A}\) and \(\mathbf{A}^{-1}\) have the same eigenvalues.
- True
- False