4 Applications of Gaussian elimination

4.1 Gaussian elimination

In the Algorithmics module M01 you learned how to use Gaussian elimination to solve a system of equations of the form \[\tag{4.1} \mathbf{A}\vec{x}=\vec{b}\] for some given matrix \(\mathbf{A}\in M_{m,n}(\mathbb{K}),\) vector \(\vec{b} \in \mathbb{K}^m\) and unknown \(\vec{x} \in \mathbb{K}^n.\) Many concrete problems in Linear Algebra lead to systems of the form (4.1). A few sample problems that can be solved with Gaussian elimination are discussed below.

Solving equations of the type (4.1) hinges on the elementary observation that a vector \(\vec{x} \in \mathbb{K}^n\) solves \(\mathbf{A}\vec{x}=\vec{b}\) if and only if it solves \(\mathbf{B}\mathbf{A}\vec{x}=\mathbf{B}\vec{b},\) where \(\mathbf{B}\in M_{m,m}(\mathbb{K})\) is any invertible \(m\)-by-\(m\) matrix.

In the Gaussian elimination algorithm, the matrix \(\mathbf{B}\) is chosen among three types of matrices:

Definition 4.1 • Elementary matrices

Let \(m \in \mathbb{N}.\) The elementary matrices of size \(m\) are the square matrices \[\begin{aligned} &\mathbf{L}_{k,l}(s)=\mathbf{1}_{m}+s\mathbf{E}_{k,l},\\ &\mathbf{D}_k(s)=\mathbf{1}_{m}+(s-1)\mathbf{E}_{k,k},\\ &\mathbf{P}_{k,l}=\mathbf{1}_{m}-\mathbf{E}_{k,k}-\mathbf{E}_{l,l}+\mathbf{E}_{k,l}+\mathbf{E}_{l,k}, \end{aligned}\] where \(1\leqslant k,l\leqslant m\) with \(k\neq l,\) \(s\in \mathbb{K}\) with \(s\neq 0,\) and \(\mathbf{E}_{k,l}\in M_{m,m}(\mathbb{K})\) denotes the matrix whose only nonzero entry is a \(1\) in position \((k,l).\)

Example 4.2

For \(m=4\) we have for instance \[\mathbf{L}_{2,3}(s)=\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & s& 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1\end{pmatrix},\qquad \mathbf{D}_{4}(s)=\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & s\end{pmatrix}\] and \[\quad \mathbf{P}_{2,4}=\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0\end{pmatrix}.\]

As an exercise in matrix multiplication, we compute the effect of left multiplication with elementary matrices.

For \(\mathbf{A}=(A_{ij})_{1\leqslant i\leqslant m,\,1\leqslant j\leqslant n} \in M_{m,n}(\mathbb{K}),\) we obtain \[\left[\mathbf{L}_{k,l}(s)\mathbf{A}\right]_{ij}=\sum_{r=1}^m\left(\delta_{ir}+s\delta_{ik}\delta_{lr}\right)A_{rj}=A_{ij}+s\delta_{ik}A_{lj}=\left\{\begin{array}{cc} A_{ij}+sA_{lj} & i=k \\ A_{ij} & i\neq k \end{array}\right.,\] where we use that \([\mathbf{1}_{m}]_{ir}=\delta_{ir}\) and \([\mathbf{E}_{k,l}]_{ir}=\delta_{ik}\delta_{lr}.\) Therefore, multiplying the matrix \(\mathbf{A}\) with \(\mathbf{L}_{k,l}(s)\) from the left adds \(s\) times the \(l\)-th row of \(\mathbf{A}\) to the \(k\)-th row of \(\mathbf{A}\) and leaves all other rows unchanged.

Likewise, we obtain \[\left[\mathbf{D}_k(s)\mathbf{A}\right]_{ij}=\sum_{r=1}^m\left(\delta_{ir}+(s-1)\delta_{ik}\delta_{kr}\right)A_{rj}=\left\{\begin{array}{cc} sA_{ij} & i=k \\ A_{ij} & i\neq k \end{array}\right..\] Therefore, multiplying the matrix \(\mathbf{A}\) with \(\mathbf{D}_{k}(s)\) from the left multiplies the \(k\)-th row of \(\mathbf{A}\) by \(s\) and leaves all other rows unchanged.

Finally, \[\begin{aligned} \left[\mathbf{P}_{k,l}\mathbf{A}\right]_{ij}&=\sum_{r=1}^m(\delta_{ir}-\delta_{ik}\delta_{kr}-\delta_{il}\delta_{lr}+\delta_{ik}\delta_{lr}+\delta_{il}\delta_{rk})A_{rj}\\ &=A_{ij}-\delta_{ik}A_{kj}-\delta_{il}A_{lj}+\delta_{ik}A_{lj}+\delta_{il}A_{kj}\\ &=A_{ij}+\delta_{ik}\left(A_{lj}-A_{kj}\right)+\delta_{il}\left(A_{kj}-A_{lj}\right)=\left\{\begin{array}{cc}A_{lj} & i=k \\ A_{kj} & i=l \\ A_{ij} & i\neq k, i\neq l \end{array}\right.. \end{aligned}\] Therefore, multiplying the matrix \(\mathbf{A}\) with \(\mathbf{P}_{k,l}\) from the left swaps the \(k\)-th row of \(\mathbf{A}\) with the \(l\)-th row of \(\mathbf{A}\) and leaves all other rows unchanged.
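To make these row operations concrete, here is a minimal numerical check, assuming NumPy is available; the test matrix, the parameter values and the helper `E` are ad hoc choices for illustration.

```python
import numpy as np

m, n, s = 4, 3, 5.0
A = np.arange(1.0, m * n + 1).reshape(m, n)        # an arbitrary 4x3 test matrix

def E(k, l):
    """E_{k,l}: the m-by-m matrix with a single 1 in position (k, l), 1-based."""
    M = np.zeros((m, m))
    M[k - 1, l - 1] = 1.0
    return M

I = np.eye(m)
L23 = I + s * E(2, 3)                              # L_{2,3}(s)
D4  = I + (s - 1) * E(4, 4)                        # D_4(s)
P24 = I - E(2, 2) - E(4, 4) + E(2, 4) + E(4, 2)    # P_{2,4}

B = A.copy(); B[1] += s * A[2]                     # add s times row 3 to row 2
assert np.allclose(L23 @ A, B)
B = A.copy(); B[3] *= s                            # multiply row 4 by s
assert np.allclose(D4 @ A, B)
B = A.copy(); B[[1, 3]] = B[[3, 1]]                # swap rows 2 and 4
assert np.allclose(P24 @ A, B)
```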

These calculations immediately imply:

Proposition 4.3

The elementary matrices are invertible with \[\mathbf{L}_{k,l}(s)^{-1}=\mathbf{L}_{k,l}(-s) \quad \text{and}\quad \mathbf{D}_k(s)^{-1}=\mathbf{D}_k(1/s)\quad \text{and}\quad (\mathbf{P}_{k,l})^{-1}=\mathbf{P}_{k,l}.\]
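A quick numerical sanity check of these inversion formulas, again assuming NumPy is available (indices are 1-based as in Definition 4.1):

```python
import numpy as np

m, k, l, s = 4, 2, 3, 5.0
I = np.eye(m)
E = lambda a, b: np.outer(I[a - 1], I[b - 1])      # E_{a,b} (1-based indices)

L = I + s * E(k, l)                                # L_{k,l}(s)
D = I + (s - 1) * E(k, k)                          # D_k(s)
P = I - E(k, k) - E(l, l) + E(k, l) + E(l, k)      # P_{k,l}

assert np.allclose(L @ (I - s * E(k, l)), I)             # L_{k,l}(s)^{-1} = L_{k,l}(-s)
assert np.allclose(D @ (I + (1 / s - 1) * E(k, k)), I)   # D_k(s)^{-1} = D_k(1/s)
assert np.allclose(P @ P, I)                             # P_{k,l}^{-1} = P_{k,l}
```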

The sceptical reader may also verify this fact by direct computation with the help of the following lemma:

Lemma 4.4

Let \(m \in \mathbb{N}.\) For \(1\leqslant k,l,p,q \leqslant m,\) we have \[\mathbf{E}_{k,l}\mathbf{E}_{p,q}=\left\{\begin{array}{cc}\mathbf{E}_{k,q} & p=l \\ \mathbf{0}_{m,m} & p\neq l \end{array}\right.\]

Proof. By definition, we have \[\mathbf{E}_{k,l}\mathbf{E}_{p,q}=\left(\sum_{r=1}^m \delta_{ik}\delta_{lr}\delta_{rp}\delta_{qj}\right)_{1\leqslant i,j\leqslant m}=\delta_{lp}\left(\delta_{ik}\delta_{qj}\right)_{1\leqslant i,j\leqslant m}=\left\{\begin{array}{cc}\mathbf{E}_{k,q} & p=l \\ \mathbf{0}_{m,m} & p\neq l \end{array}\right..\]
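The identity of Lemma 4.4 can also be checked by brute force over all index combinations, say for \(m=3\), assuming NumPy is available:

```python
import numpy as np
from itertools import product

m = 3

def E(k, l):
    """E_{k,l}: the m-by-m matrix with a single 1 in position (k, l), 1-based."""
    M = np.zeros((m, m))
    M[k - 1, l - 1] = 1.0
    return M

# E_{k,l} E_{p,q} equals E_{k,q} if p == l and the zero matrix otherwise.
for k, l, p, q in product(range(1, m + 1), repeat=4):
    expected = E(k, q) if p == l else np.zeros((m, m))
    assert np.allclose(E(k, l) @ E(p, q), expected)
```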

For each row in a matrix, if the row does not consist of zeros only, then the leftmost nonzero entry is called the leading coefficient of that row.

Definition 4.5 • Row echelon form

A matrix \(\mathbf{A}\in M_{m,n}(\mathbb{K})\) is said to be in row echelon form (REF) if

  • all rows consisting of only zeros are at the bottom;

  • the leading coefficient of a nonzero row is always strictly to the right of the leading coefficient of the row above it.

The matrix \(\mathbf{A}\) is said to be in reduced row echelon form (rREF) if furthermore

  • all of the leading coefficients are equal to \(1\);

  • in every column containing a leading coefficient, all of the other entries in that column are zero.

Gaussian elimination from the Algorithmics module M01 implies the following statement:

Theorem 4.6 • Gauss–Jordan elimination

Let \(\mathbf{A}\in M_{m,n}(\mathbb{K}).\) Then there exist \(N \in \mathbb{N}\) and an \(N\)-tuple of elementary matrices \((\mathbf{B}_1,\ldots,\mathbf{B}_N)\) such that the matrix \(\mathbf{B}_N\mathbf{B}_{N-1}\cdots \mathbf{B}_2\mathbf{B}_1\mathbf{A}\) is in reduced row echelon form.

Proof. Applying Gaussian elimination yields \(\hat{N} \in \mathbb{N}\) and elementary matrices \(\mathbf{B}_1,\ldots,\mathbf{B}_{\hat{N}}\) so that \(\mathbf{B}_{\hat{N}}\mathbf{B}_{\hat{N}-1}\cdots \mathbf{B}_2\mathbf{B}_{1}\mathbf{A}\) is in row echelon form. After possibly further multiplying this matrix from the left with elementary matrices of the type \(\mathbf{D}_k(s),\) we can assume that all leading coefficients are \(1.\) By choosing suitable further left multiplications with matrices of the type \(\mathbf{L}_{k,l}(s),\) we find a natural number \(N \geqslant \hat{N}\) and elementary matrices \((\mathbf{B}_1,\ldots,\mathbf{B}_N)\) so that \(\mathbf{B}_N\mathbf{B}_{N-1}\cdots \mathbf{B}_2\mathbf{B}_1\mathbf{A}\) is in reduced row echelon form.
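The proof is constructive and translates directly into an algorithm. The following sketch (the function name `rref` and the choice of Python with exact rational arithmetic via `fractions.Fraction` are our own) computes the reduced row echelon form using only the three elementary row operations of Definition 4.1:

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form of a matrix, given as a list of rows over Q.
    Only the three elementary row operations of Definition 4.1 are used."""
    A = [[Fraction(x) for x in row] for row in rows]
    m, n = len(A), len(A[0])
    pivot_row = 0
    for col in range(n):
        # find a row at or below pivot_row with a nonzero entry in this column
        pivot = next((r for r in range(pivot_row, m) if A[r][col] != 0), None)
        if pivot is None:
            continue
        A[pivot_row], A[pivot] = A[pivot], A[pivot_row]       # swap: P_{k,l}
        s = A[pivot_row][col]
        A[pivot_row] = [x / s for x in A[pivot_row]]          # scale: D_k(1/s)
        for r in range(m):
            if r != pivot_row and A[r][col] != 0:
                t = A[r][col]
                A[r] = [a - t * b for a, b in zip(A[r], A[pivot_row])]  # L_{r,k}(-t)
        pivot_row += 1
        if pivot_row == m:
            break
    return A
```

For instance, `rref([[1, -2], [-3, 4]])` returns the identity matrix, in line with Example 4.8 below, where the same matrix is reduced to \(\mathbf{1}_2.\)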

4.2 Applications

4.2.1 Compute the inverse of a matrix

An algorithm using Gaussian elimination for computing the inverse of an invertible matrix relies on the following fact:

Proposition 4.7

Let \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) be a square matrix. Then the following statements are equivalent:

  1. \(\mathbf{A}\) is invertible;

  2. the row vectors of \(\mathbf{A}\) are linearly independent;

  3. the column vectors of \(\mathbf{A}\) are linearly independent.

Proof. Part of an exercise sheet.

Suppose the matrix \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) is invertible. Applying Gauss–Jordan elimination to \(\mathbf{A},\) we cannot encounter a zero row, since the occurrence of a zero row corresponds to a non-trivial linear combination of row vectors which gives the zero vector. This is excluded by the above proposition. Having no zero row vectors, the Gauss–Jordan elimination applied to \(\mathbf{A}\) must give the identity matrix \(\mathbf{1}_{n}.\) Thus we can find a sequence of elementary matrices \(\mathbf{B}_1,\ldots,\mathbf{B}_N,\) \(N \in \mathbb{N},\) so that \[\mathbf{1}_{n}=\mathbf{B}_N\mathbf{B}_{N-1}\cdots \mathbf{B}_2\mathbf{B}_1\mathbf{A}.\] In other words, \(\mathbf{B}_N\mathbf{B}_{N-1}\cdots \mathbf{B}_2\mathbf{B}_1\) is the inverse of \(\mathbf{A}.\) This gives the following recipe for computing the inverse of \(\mathbf{A}\):

We write the matrix \(\mathbf{A}\) and \(\mathbf{1}_{n}\) next to each other, say \(\mathbf{A}\) on the left and \(\mathbf{1}_{n}\) on the right. We then perform Gauss–Jordan elimination on \(\mathbf{A}.\) At each step, we apply the same row operation to the matrix on the right. Once Gauss–Jordan elimination terminates, we thus obtain \(\mathbf{B}_N\mathbf{B}_{N-1}\cdots \mathbf{B}_2\mathbf{B}_1\mathbf{A}\) on the left and \(\mathbf{B}_N\mathbf{B}_{N-1}\cdots \mathbf{B}_2\mathbf{B}_1\mathbf{1}_{n}\) on the right. But since \(\mathbf{B}_N\mathbf{B}_{N-1}\cdots \mathbf{B}_2\mathbf{B}_1\mathbf{1}_{n}=\mathbf{B}_N\mathbf{B}_{N-1}\cdots \mathbf{B}_2\mathbf{B}_1\) (notice the absence of \(\mathbf{1}_{n}\) after the equality sign), the right-hand side is the inverse of \(\mathbf{A}.\)

Example 4.8 • Inverse of a matrix

We want to compute the inverse of \[\mathbf{A}=\begin{pmatrix} 1 & -2 \\ -3 & 4 \end{pmatrix}.\] Write \[\left(\begin{array}{cc} 1 & -2 \\ -3 & 4\end{array}\right| \left.\begin{array}{cc} 1 & 0 \\ 0 & 1\end{array}\right).\] Adding \(3\)-times the first row to the second row gives \[\left(\begin{array}{cc} 1 & -2 \\ 0 & -2\end{array}\right| \left.\begin{array}{cc} 1 & 0 \\ 3 & 1\end{array}\right).\] Dividing the second row by \(-2\) gives \[\left(\begin{array}{cc} 1 & -2 \\ 0 & 1\end{array}\right| \left.\begin{array}{cc} 1 & 0 \\ -\frac{3}{2} & -\frac{1}{2}\end{array}\right).\] Finally, adding the second row twice to the first row gives \[\left(\begin{array}{cc} 1 & 0 \\ 0 & 1\end{array}\right| \left.\begin{array}{cc} -2 & -1 \\ -\frac{3}{2} & -\frac{1}{2}\end{array}\right),\] so that \[\mathbf{A}^{-1}=\begin{pmatrix} -2 & -1 \\ -\frac{3}{2} & -\frac{1}{2}\end{pmatrix}.\]
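The same computation can be carried out mechanically by row-reducing the augmented matrix \((\mathbf{A}\,|\,\mathbf{1}_{2}).\) A short sketch, assuming SymPy is available (its `Matrix.rref` performs Gauss–Jordan elimination):

```python
from sympy import Matrix, eye

A = Matrix([[1, -2], [-3, 4]])       # the matrix of Example 4.8
n = A.rows

R, _ = A.row_join(eye(n)).rref()     # Gauss-Jordan elimination on (A | 1_n)
A_inv = R[:, n:]                     # left block is 1_n, right block is A^{-1}

print(A_inv)                         # Matrix([[-2, -1], [-3/2, -1/2]])
assert A * A_inv == eye(n)
```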

4.2.2 Compute a basis of a subspace

Gaussian elimination can also be used to compute a basis for a vector subspace \(U\) of a finite dimensional \(\mathbb{K}\)-vector space \(V.\) We assume that \(U=\operatorname{span}\{v_1,\ldots,v_m\}\) for some vectors \(v_i \in V,\) \(1\leqslant i\leqslant m,\) and that \(\dim U\geqslant 1,\) so that not all of the \(v_i\) are the zero vector.

We first consider the special case where \(V\) is the space \(\mathbb{K}_n\) of row vectors of length \(n\) and with entries in \(\mathbb{K}.\) Recall that we denote the row vectors by small Greek letters. We write \(\mathbb{K}_n^m\) for the \(m\)-fold Cartesian product \((\mathbb{K}_n)^m\) of \(\mathbb{K}_n.\) Clearly, we have a bijective mapping \[\Omega : \mathbb{K}_n^m \to M_{m,n}(\mathbb{K}), \quad (\vec{\nu}_1,\ldots,\vec{\nu}_m) \mapsto \begin{pmatrix} \vec{\nu}_1 \\ \vdots \\ \vec{\nu}_m\end{pmatrix}\] which simply writes the row vectors \((\vec{\nu}_1,\ldots,\vec{\nu}_m)\) into a matrix with the \(k\)-th row vector from the \(m\)-tuple of row vectors becoming the \(k\)-th row of the matrix.

Example 4.9

\[\Omega\left(\begin{pmatrix} 1 & 2 & 3 \end{pmatrix}, \begin{pmatrix} 4 & 5 & 6 \end{pmatrix}\right)=\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}.\]

We have \[\begin{aligned} \mathbf{L}_{k,l}(s)\Omega(\vec{\nu}_1,\ldots,\vec{\nu}_m)&=\Omega\left(\vec{\nu}_1,\ldots,\vec{\nu}_{k-1},\vec{\nu}_k+s\vec{\nu}_l,\vec{\nu}_{k+1},\ldots,\vec{\nu}_m\right),\\ \mathbf{D}_{k}(s)\Omega(\vec{\nu}_1,\ldots,\vec{\nu}_m)&=\Omega\left(\vec{\nu}_1,\ldots,\vec{\nu}_{k-1},s\vec{\nu}_k,\vec{\nu}_{k+1},\ldots,\vec{\nu}_m\right),\\ \mathbf{P}_{k,l}\Omega(\vec{\nu}_1,\ldots,\vec{\nu}_m)&=\Omega\left(\vec{\nu}_1,\ldots,\vec{\nu}_{k-1},\vec{\nu}_l,\vec{\nu}_{k+1},\ldots,\vec{\nu}_{l-1},\vec{\nu}_k,\vec{\nu}_{l+1},\ldots,\vec{\nu}_m\right). \end{aligned}\] Notice that none of these operations changes the span of the vectors \(\vec{\nu}_1,\ldots,\vec{\nu}_m.\) More precisely, if \((\vec{\nu}_1,\ldots,\vec{\nu}_m)\) is an \(m\)-tuple of row vectors and if \(\Omega\left(\vec{\omega}_1,\ldots,\vec{\omega}_m\right)=\mathbf{B}\Omega (\vec{\nu}_1,\ldots,\vec{\nu}_m)\) for some elementary matrix \(\mathbf{B},\) then \[\operatorname{span}\{\vec{\nu}_1,\ldots,\vec{\nu}_m\}=\operatorname{span}\{\vec{\omega}_1,\ldots,\vec{\omega}_m\}.\] Applying Gaussian elimination to the matrix \(\Omega(\vec{\nu}_1,\ldots,\vec{\nu}_m)\) gives a list of elementary matrices \(\mathbf{B}_1,\ldots,\mathbf{B}_N\) such that \[\mathbf{B}_N\mathbf{B}_{N-1}\cdots \mathbf{B}_2\mathbf{B}_1\Omega(\vec{\nu}_1,\ldots,\vec{\nu}_m)=\Omega(\vec{\omega}_1,\ldots,\vec{\omega}_{r},0_{\mathbb{K}_n},\dots,0_{\mathbb{K}_n})\] where \(1\leqslant r\leqslant m\) and \(0_{\mathbb{K}_n}\) denotes the zero vector in \(\mathbb{K}_n.\) By construction, the matrix \(\mathbf{A}=\Omega(\vec{\omega}_1,\ldots,\vec{\omega}_{r},0_{\mathbb{K}_n},\dots,0_{\mathbb{K}_n})\) is in row echelon form. Since the leading coefficient of \(\vec{\omega}_i\) is always strictly to the right of the leading coefficient of \(\vec{\omega}_{i-1},\) it follows that the vectors \(\vec{\omega}_1,\ldots,\vec{\omega}_{r}\) are linearly independent. Therefore, a basis of \(\operatorname{span}\{\vec{\nu}_1,\ldots,\vec{\nu}_m\}\) is given by \(\{\vec{\omega}_1,\ldots,\vec{\omega}_{r}\}.\)
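A sketch of this procedure for a small concrete span, assuming SymPy is available; the three row vectors are an ad hoc example:

```python
from sympy import Matrix

# Stack the spanning row vectors nu_1, ..., nu_m into Omega(nu_1, ..., nu_m).
Omega = Matrix([[1, 2, 3],
                [4, 5, 6],
                [5, 7, 9]])          # the third row is the sum of the first two

R, _ = Omega.rref()                  # Gauss-Jordan; plain row echelon form also works
basis = [R.row(i) for i in range(R.rows) if any(R.row(i))]
print(basis)                         # the nonzero rows form a basis of the row span
```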

The general case can be treated with the help of the following facts:

Proposition 4.10

Let \(V,W\) be finite dimensional \(\mathbb{K}\)-vector spaces and \(\Phi : V \to W\) an isomorphism. Then \(\mathcal{S}\subset V\) is a basis of \(V\) if and only if \(\Phi(\mathcal{S})\) is a basis of \(W.\)

Proof. \(\Rightarrow\) Since \(\mathcal{S}\) is a basis, the set \(\mathcal{S}\) is linearly independent, and since \(\Phi\) is injective, \(\Phi(\mathcal{S})\) is linearly independent as well by Lemma 3.56. Moreover, \(\mathcal{S}\) is a generating set of \(V,\) and since \(\Phi\) is surjective, the subset \(\Phi(\mathcal{S})\subset W\) is a generating set for \(W\) by Lemma 3.46.

\(\Leftarrow\) We apply the above implication to \(\Phi^{-1} : W \to V\) and the basis \(\Phi(\mathcal{S})\subset W.\)

Corollary 4.11

Let \(\hat{V},\hat{W}\) be finite dimensional \(\mathbb{K}\)-vector spaces, \(\Theta : \hat{V} \to \hat{W}\) an isomorphism and \(U\subset \hat{V}\) a vector subspace. Then \(\mathcal{S}\subset U\) is a basis of \(U\) if and only if \(\Theta(\mathcal{S})\) is a basis of \(\Theta(U).\)

Proof. Apply Proposition 4.10 to the vector space \(V=U,\) the vector space \(W=\Theta(U)\) and the isomorphism \(\Phi=\Theta|_U : V \to W.\)

We now describe a recipe to treat the general case of a vector subspace \(U=\operatorname{span}\{v_1,\ldots,v_m\}\) of a finite dimensional \(\mathbb{K}\)-vector space \(V\):

  1. Fix an isomorphism \(\Phi : V \to \mathbb{K}_n,\) where \(n=\dim(V),\) and write \(\vec{\nu}_i=\Phi(v_i)\) for \(1\leqslant i\leqslant m.\)

  2. Apply Gaussian elimination to the matrix \(\Omega(\vec{\nu}_1,\ldots,\vec{\nu}_m)\) to obtain a tuple of new row vectors \((\vec{\omega}_1,\ldots,\vec{\omega}_{r},0_{\mathbb{K}_n},\ldots,0_{\mathbb{K}_n})\) for some \(r \in \mathbb{N}.\)

  3. Apply the inverse isomorphism \(\Phi^{-1}\) to the obtained list of vectors. This gives the desired basis \(\{\Phi^{-1}(\vec{\omega}_1),\ldots,\Phi^{-1}(\vec{\omega}_{r})\}\) of \(U.\)

Example 4.12 • Basis of a subspace

Let \(V=\mathsf{P}_3(\mathbb{R})\) so that \(\dim(V)=4\) and \[U=\operatorname{span}\{x^3+2x^2-x,4x^3+8x^2-4x-3,x^2+3x+4,2x^3+5x^2+x+4\}.\] We want to compute a basis of \(U.\) We choose the isomorphism \(\Phi : V \to \mathbb{R}_4\) defined by \[\Phi(a_3x^3+a_2x^2+a_1x+a_0)=\begin{pmatrix} a_3 & a_2 & a_1 & a_0 \end{pmatrix}.\] We thus have \(\vec{\nu}_1=\begin{pmatrix} 1 & 2 & -1 & 0 \end{pmatrix},\) \(\vec{\nu}_2=\begin{pmatrix} 4 & 8 & -4 & -3 \end{pmatrix},\) \(\vec{\nu}_3=\begin{pmatrix} 0 & 1 & 3 & 4 \end{pmatrix}\) and \(\vec{\nu}_4=\begin{pmatrix} 2 & 5 & 1 & 4 \end{pmatrix}.\)

Applying Gaussian elimination to the matrix \[\Omega(\vec{\nu}_1,\vec{\nu}_2,\vec{\nu}_3,\vec{\nu}_4)=\begin{pmatrix} 1 & 2 & -1 & 0 \\ 4 & 8 & -4 & -3 \\ 0 & 1 & 3 & 4 \\ 2 & 5 & 1 & 4\end{pmatrix}\] yields \[\begin{pmatrix} 1 & 0 & -7 & 0 \\ 0 & 1 & 3 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}.\] Here we applied Gauss-Jordan elimination, but Gaussian elimination is good enough. This gives the vectors \(\vec{\omega}_1=\begin{pmatrix} 1 & 0 & -7 & 0 \end{pmatrix},\) \(\vec{\omega}_2=\begin{pmatrix} 0 & 1 & 3 & 0 \end{pmatrix},\) \(\vec{\omega}_3=\begin{pmatrix} 0 & 0 & 0 & 1 \end{pmatrix}.\)

Our basis of \(U\) is thus \[\{\Phi^{-1}(\vec{\omega}_1),\Phi^{-1}(\vec{\omega}_2),\Phi^{-1}(\vec{\omega}_3)\}=\left\{x^3-7x,x^2+3x,1\right\},\] where we use that \[\Phi^{-1}\left(\begin{pmatrix} a_3 & a_2 & a_1 & a_0\end{pmatrix}\right)=a_3x^3+a_2x^2+a_1x+a_0.\]
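Example 4.12 can be reproduced in code; the following sketch assumes SymPy is available and uses coefficient extraction via `Poly.all_coeffs` as a stand-in for the isomorphism \(\Phi\):

```python
from sympy import Matrix, Poly, symbols

x = symbols('x')
spanning = [x**3 + 2*x**2 - x,
            4*x**3 + 8*x**2 - 4*x - 3,
            x**2 + 3*x + 4,
            2*x**3 + 5*x**2 + x + 4]

# Phi: a polynomial a_3 x^3 + a_2 x^2 + a_1 x + a_0 becomes the row (a_3 a_2 a_1 a_0).
rows = [Poly(p, x).all_coeffs() for p in spanning]
rows = [[0] * (4 - len(r)) + r for r in rows]          # pad lower-degree polynomials

R, _ = Matrix(rows).rref()                             # Gauss-Jordan elimination

# Phi^{-1}: turn each nonzero row back into a polynomial.
basis = [sum(c * x**(3 - j) for j, c in enumerate(R.row(i)))
         for i in range(R.rows) if any(R.row(i))]
print(basis)                                           # [x**3 - 7*x, x**2 + 3*x, 1]
```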

4.2.3 Compute the image and rank of a linear map

Let \(V,W\) be finite dimensional \(\mathbb{K}\)-vector spaces and \(f : V \to W\) a linear map. By computing the image of a linear map \(f,\) we mean computing a basis of \(\operatorname{Im}(f).\)

In order to compute a basis for \(\operatorname{Im}(f)\) we use the following lemma:

Lemma 4.13

Let \(V,W\) be finite dimensional \(\mathbb{K}\)-vector spaces and \(f : V \to W\) a linear map. If \(\{v_1,\ldots,v_n\}\) is a basis of \(V,\) then \[\operatorname{Im}(f)=\operatorname{span}\{f(v_1),\ldots,f(v_n)\}.\]

Proof. Let \(w \in \operatorname{Im}(f)\) so that \(w=f(v)\) for some \(v \in V.\) We have scalars \(s_i\) for \(1\leqslant i\leqslant n\) so that \(v=\sum_{i=1}^ns_i v_i.\) We obtain \[w=f(v)=f\left(\sum_{i=1}^ns_i v_i\right)=\sum_{i=1}^ns_if(v_i)\] so that \(w\) is a linear combination of the vectors \(\{f(v_1),\ldots,f(v_n)\}.\) On the other hand, a linear combination of the vectors \(f(v_i) \in \operatorname{Im}(f)\) lies in the image of \(f\) as well, since \(\operatorname{Im}(f)\) is a vector subspace. Hence we have \(\operatorname{Im}(f)=\operatorname{span}\{f(v_1),\ldots,f(v_n)\},\) as claimed.

Knowing that \(\operatorname{Im}(f)=\operatorname{span}\{f(v_1),\ldots,f(v_n)\}\) we can apply the recipe from Section 4.2.2 to \(U=\operatorname{span}\{f(v_1),\ldots,f(v_n)\}.\) By definition, the number of basis vectors for \(\operatorname{Im}(f)\) is the rank of \(f.\)

Example 4.14 • Image and rank of a linear map

Let \[\mathbf{A}=\begin{pmatrix} 1 & -2 & 0 & 4 \\ 3 & 1 & 1 & 0 \\ -1 & -5 & -1 & 8 \\ 3 & 8 & 2 & -12 \end{pmatrix}.\] We want to compute a basis for the image of \(f_\mathbf{A}: \mathbb{R}^4 \to \mathbb{R}^4\) and the rank of \(f_\mathbf{A}.\) By Lemma 4.13 we have \[U=\operatorname{Im}(f_\mathbf{A})=\operatorname{span}\{\mathbf{A}\vec{e}_1,\mathbf{A}\vec{e}_2,\mathbf{A}\vec{e}_3,\mathbf{A}\vec{e}_4\}=\operatorname{span}\{\vec{a}_1,\vec{a}_2,\vec{a}_3,\vec{a}_4\},\] where \(\{\vec{e}_i\}_{1\leqslant i\leqslant 4}\) denotes the standard basis of \(\mathbb{R}^4\) and \(\{\vec{a}_i\}_{1\leqslant i\leqslant 4}\) the column vectors of \(\mathbf{A}.\) Comparing with the general setup described above, we are in the case where \(V=\mathbb{R}^4\) and \(v_i=\mathbf{A}\vec{e}_i\) for \(i=1,2,3,4.\)
  1. For the isomorphism \(\Phi : V=\mathbb{R}^4 \to \mathbb{R}_4\) we usually choose the transpose (but any other isomorphism would work too). We thus have \(\vec{\nu}_1=\begin{pmatrix} 1 & 3 & -1 & 3 \end{pmatrix},\) \(\vec{\nu}_2=\begin{pmatrix} -2 & 1 & -5 & 8 \end{pmatrix},\) \(\vec{\nu}_3=\begin{pmatrix} 0 & 1 & -1 & 2 \end{pmatrix}\) and \(\vec{\nu}_4=\begin{pmatrix} 4 & 0 & 8 & -12 \end{pmatrix}.\)

  2. Applying Gaussian elimination to the matrix \[\Omega(\vec{\nu}_1,\vec{\nu}_2,\vec{\nu}_3,\vec{\nu}_4)=\mathbf{A}^T=\begin{pmatrix} 1 & 3 & -1 & 3 \\ -2 & 1 & -5 & 8 \\ 0 & 1 & -1 & 2 \\ 4 & 0 & 8 & -12\end{pmatrix}\] yields \[\begin{pmatrix} 1 & 0 & 2 & -3 \\ 0 & 1 & -1 & 2 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.\] Here again, we applied Gauss-Jordan elimination, but Gaussian elimination is good enough. This gives the vectors \(\vec{\omega}_1=\begin{pmatrix} 1 & 0 & 2 & -3 \end{pmatrix},\) \(\vec{\omega}_2=\begin{pmatrix} 0 & 1 & -1 & 2 \end{pmatrix}.\)

  3. Our basis of \(\operatorname{Im}(f_\mathbf{A})\) is thus \[\{\Phi^{-1}(\vec{\omega}_1),\Phi^{-1}(\vec{\omega}_2)\}=\left\{\begin{pmatrix} 1 \\ 0 \\ 2 \\ -3 \end{pmatrix},\begin{pmatrix} 0 \\ 1 \\ -1 \\ 2 \end{pmatrix}\right\},\] where we use that the transpose is its own inverse. We also conclude that \(\operatorname{rank}(f_\mathbf{A})=2.\)

Remark 4.15

In the special case where we want to compute a basis for the image of \(f_\mathbf{A}\) for some matrix \(\mathbf{A},\) the recipe thus reduces to the following steps: take the transpose of \(\mathbf{A},\) perform Gaussian elimination, take the transpose again, and write down the nonzero column vectors. These columns form the desired basis.
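A sketch of this recipe applied to the matrix of Example 4.14, assuming SymPy is available:

```python
from sympy import Matrix

A = Matrix([[ 1, -2,  0,   4],
            [ 3,  1,  1,   0],
            [-1, -5, -1,   8],
            [ 3,  8,  2, -12]])      # the matrix of Example 4.14

R, _ = A.T.rref()                    # row-reduce the transpose ...
cols = R.T                           # ... and transpose back
basis = [cols.col(j) for j in range(cols.cols) if any(cols.col(j))]

print(basis)                         # the columns (1,0,2,-3)^T and (0,1,-1,2)^T
print(len(basis))                    # rank(f_A) = 2
```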

4.2.4 Compute the kernel and nullity of a linear map

In order to find a recipe for computing the kernel and nullity of a linear map, we first start with a related problem. Let \(\mathbf{A}\in M_{n,m}(\mathbb{K})\) be an \(n\times m\)-matrix and \[U=\left\{\vec{\xi} \in \mathbb{K}_n \,|\,\vec{\xi} \mathbf{A}=0_{\mathbb{K}_m}\right\},\] where \(\vec{\xi} \mathbf{A}\) is defined via matrix multiplication of the row vector \(\vec{\xi} \in \mathbb{K}_n=M_{1,n}(\mathbb{K})\) and the matrix \(\mathbf{A}\in M_{n,m}(\mathbb{K}).\) Notice that \(0_{\mathbb{K}_n} \in U\) and if \(\vec{\xi}_1,\vec{\xi}_2 \in U,\) then \(s_1\vec{\xi}_1+s_2\vec{\xi}_2 \in U\) for all \(s_1,s_2 \in \mathbb{K}.\) By Definition 3.21, it follows that \(U\) is a vector subspace of \(\mathbb{K}_{n}.\) We want to compute a basis for \(U.\) Applying Gaussian elimination to the matrix \(\mathbf{A},\) we obtain \(N,r \in \mathbb{N}\) and elementary matrices \(\mathbf{B}_1,\ldots,\mathbf{B}_N\) so that \[\mathbf{B}_N\cdots \mathbf{B}_1\mathbf{A}=\Omega\left(\vec{\omega}_1,\ldots,\vec{\omega}_r,0_{\mathbb{K}_m},\ldots,0_{\mathbb{K}_m}\right)\] for some linearly independent row vectors \(\vec{\omega}_1,\ldots,\vec{\omega}_r \in \mathbb{K}_{m}.\) Since the matrix \(\mathbf{B}_N\cdots \mathbf{B}_1\) is invertible, we also obtain a basis \(\{\vec{\xi}_1,\ldots,\vec{\xi}_n\}\) of \(\mathbb{K}_{n}\) so that \[\mathbf{B}_N\cdots \mathbf{B}_1=\Omega(\vec{\xi}_1,\ldots,\vec{\xi}_{n}).\] We now claim that \(\mathcal{S}=\{\vec{\xi}_{r+1},\ldots,\vec{\xi}_n\}\) is a basis of \(U.\) The set \(\mathcal{S}\) is linearly independent, being a subset of a basis, hence we only need to show that \(\operatorname{span}(\mathcal{S})=U.\) Since we have \[\Omega(\vec{\xi}_1,\ldots,\vec{\xi}_n)\mathbf{A}=\Omega\left(\vec{\omega}_1,\ldots,\vec{\omega}_r,0_{\mathbb{K}_m},\ldots,0_{\mathbb{K}_m}\right),\] the definition of matrix multiplication implies that \(\vec{\xi}_i\mathbf{A}=\vec{\omega}_i\) for \(1\leqslant i\leqslant r\) and \(\vec{\xi}_i\mathbf{A}=0_{\mathbb{K}_m}\) for \(r+1\leqslant i\leqslant n.\) Any vector \(\vec{\nu} \in U\) can be written as \(\vec{\nu}=\sum_{i=1}^ns_i\vec{\xi}_i,\) since \(\{\vec{\xi}_1,\ldots,\vec{\xi}_n\}\) is a basis of \(\mathbb{K}_n.\) The condition \(\vec{\nu}\mathbf{A}=0_{\mathbb{K}_m}\) then reads \(\sum_{i=1}^rs_i\vec{\omega}_i=0_{\mathbb{K}_m}\) and hence implies \(s_1=\cdots=s_r=0,\) because the vectors \(\vec{\omega}_1,\ldots,\vec{\omega}_r\) are linearly independent. Therefore \(\vec{\nu} \in \operatorname{span}(\mathcal{S}),\) hence \(\mathcal{S}\) is generating.

We can use this observation to compute the kernel and nullity of a linear map \(\mathbb{K}^n \to \mathbb{K}^m\) because of the following lemma whose proof is left as an exercise.

Lemma 4.16

Let \(\mathbf{C}\in M_{m,n}(\mathbb{K})\) and \(f_\mathbf{C}: \mathbb{K}^n \to \mathbb{K}^m\) be the associated linear map. Then \(\vec{x} \in \operatorname{Ker}(f_\mathbf{C})\) if and only if \(\vec{x}^T\mathbf{C}^T=0_{\mathbb{K}_m}.\)

We simply apply the above procedure to the matrix \(\mathbf{A}=\mathbf{C}^T\) and compute the vectors \(\{\vec{\xi}_{r+1},\ldots,\vec{\xi}_n\}.\) The basis of \(\operatorname{Ker}(f_\mathbf{C})\) is then given by \(\{\vec{\xi}_{r+1}^T,\ldots,\vec{\xi}_n^T\}.\)

The nullity of \(f_\mathbf{C}\) is given by the number of basis vectors of \(\operatorname{Ker}(f_\mathbf{C}),\) that is, \(\operatorname{nullity}(f_\mathbf{C})=n-r.\)

Example 4.17 • Kernel of a linear map

Let \[\mathbf{C}=\begin{pmatrix} 1 & 0 & 1 & 7 \\ -2 & -3 & 1 & 2 \\ 7 & 9 & -2 & 1\end{pmatrix}.\] In order to compute \(\operatorname{Ker}(f_\mathbf{C})\) we apply Gaussian elimination to \(\mathbf{C}^T\) whilst keeping track of the relevant elementary matrices as in the algorithm for computing the inverse of a matrix. That is, we consider \[\left(\begin{array}{ccc}1 & -2 & 7 \\ 0 & -3 & 9 \\ 1 & 1 & -2 \\ 7 & 2 & 1\end{array}\right|\left.\begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1\end{array}\right).\] Gauss–Jordan elimination (again, Gaussian elimination is enough) gives \[\left(\begin{array}{ccc}1 & 0 & 1 \\ 0 & 1 & -3 \\ 0 & 0 & 0 \\ 0 & 0 & 0\end{array}\right|\left.\begin{array}{cccc} 0 & 0 & -\frac{2}{5} & \frac{1}{5} \\ 0 & 0 & \frac{7}{5} & -\frac{1}{5} \\ 1 & 0 & \frac{16}{5} & -\frac{3}{5} \\ 0 & 1 & \frac{21}{5} & -\frac{3}{5}\end{array}\right).\] The vectors \(\vec{\xi}_3=\begin{pmatrix} 1 & 0 & \frac{16}{5} & -\frac{3}{5}\end{pmatrix}\) and \(\vec{\xi}_4=\begin{pmatrix} 0 & 1 & \frac{21}{5} & -\frac{3}{5} \end{pmatrix}\) thus span the subspace of vectors \(\vec{\xi}\) satisfying \(\vec{\xi} \mathbf{C}^T=0_{\mathbb{K}_3}.\) A basis \(\mathcal{S}\) for the kernel of \(f_\mathbf{C}\) is thus given by \[\mathcal{S}=\left\{\begin{pmatrix} 1 \\ 0 \\ \frac{16}{5} \\ -\frac{3}{5}\end{pmatrix},\begin{pmatrix} 0 \\ 1 \\ \frac{21}{5} \\ -\frac{3}{5}\end{pmatrix}\right\}\] and \(f_\mathbf{C}\) satisfies \(\operatorname{nullity}(f_\mathbf{C})=2.\)
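The computation of Example 4.17 can be reproduced as follows, assuming SymPy is available; here we simply row-reduce the whole augmented matrix \((\mathbf{C}^T\,|\,\mathbf{1}_4)\) at once, which changes nothing essential, since the rows of the right block facing zero rows of the left block still satisfy \(\vec{\xi}\mathbf{C}^T=0_{\mathbb{K}_3}\):

```python
from sympy import Matrix, eye

C = Matrix([[ 1,  0,  1,  7],
            [-2, -3,  1,  2],
            [ 7,  9, -2,  1]])       # the matrix of Example 4.17
m, n = C.shape                       # f_C maps K^n -> K^m

R, _ = C.T.row_join(eye(n)).rref()   # row-reduce (C^T | 1_n)
left, right = R[:, :m], R[:, m:]

# Rows of the right block facing a zero row of the left block give vectors xi
# with xi C^T = 0; their transposes form a basis of Ker(f_C).
kernel_basis = [right.row(i).T for i in range(n) if not any(left.row(i))]
print(kernel_basis)                  # two vectors, so nullity(f_C) = 2
```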

Remark 4.18

Sections 4.2.3 and 4.2.4 can be combined to compute \(\operatorname{Ker}(f_\mathbf{A})\) and \(\operatorname{Im}(f_\mathbf{A})\) for \(\mathbf{A}\in M_{m,n}(\mathbb{K})\) by a single application of Gaussian elimination.

Remark 4.19

In order to compute the kernel of a linear map \(g : V \to W\) between finite dimensional vector spaces, we can fix an ordered basis \(\mathbf{b}\) of \(V\) and an ordered basis \(\mathbf{c}\) of \(W,\) compute \(\mathbf{C}=\mathbf{M}(g,\mathbf{b},\mathbf{c}),\) apply the above procedure to the matrix \(\mathbf{C}\) in order to obtain a basis \(\mathcal{S}\) of \(\operatorname{Ker}(f_\mathbf{C}).\) The desired basis of \(\operatorname{Ker}(g)\) is then given by \(\boldsymbol{\beta}^{-1}(\mathcal{S}).\) While this algorithm can always be carried out, it is computationally quite involved. In many cases it is therefore advisable to compute \(\operatorname{Ker}(g)\) by some other technique.
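As an illustration of this remark, consider the differentiation map \(g=\frac{d}{dx} : \mathsf{P}_3(\mathbb{R}) \to \mathsf{P}_2(\mathbb{R})\) with the monomial bases; this choice of map and bases is ours, and the sketch below (assuming SymPy is available) uses SymPy's built-in `nullspace`, itself based on Gaussian elimination, instead of tracking the elementary matrices by hand:

```python
from sympy import Matrix, symbols

x = symbols('x')
b = [1, x, x**2, x**3]                 # ordered basis of V = P_3(R)

# Matrix C = M(g, b, c) of g = d/dx with respect to the monomial bases:
# the j-th column holds the coordinates of g(b_j) in c = (1, x, x^2).
C = Matrix([[0, 1, 0, 0],
            [0, 0, 2, 0],
            [0, 0, 0, 3]])

# Basis of Ker(f_C), then transport back to V with the coordinate isomorphism.
kernel_of_g = [sum(coeff * mono for coeff, mono in zip(vec, b))
               for vec in C.nullspace()]
print(kernel_of_g)                     # [1]: the constant polynomials span Ker(g)
```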
