3.7 Matrix representation of linear maps

Notice that Proposition 3.80 implies that every finite dimensional \(\mathbb{K}\)-vector space \(V\) is isomorphic to \(\mathbb{K}^n,\) where \(n=\dim(V).\) Choosing an isomorphism from \(V\) to \(\mathbb{K}^n\) allows us to describe each vector of \(V\) uniquely in terms of \(n\) scalars, its coordinates.
Definition 3.82 • Linear coordinate system

Let \(V\) be a \(\mathbb{K}\)-vector space of dimension \(n \in \mathbb{N}.\) A linear coordinate system is an injective linear map \(\boldsymbol{\varphi}: V \to \mathbb{K}^n.\) The entries of the vector \(\boldsymbol{\varphi}(v)\in \mathbb{K}^n\) are called the coordinates of the vector \(v\in V\) with respect to the coordinate system \(\boldsymbol{\varphi}.\)

We only require that \(\boldsymbol{\varphi}\) is injective; the mapping \(\boldsymbol{\varphi}\) is then automatically bijective by Corollary 3.77.
Example 3.83 • Standard coordinates

On the vector space \(\mathbb{K}^n\) we have a linear coordinate system defined by the identity mapping, that is, we define \(\boldsymbol{\varphi}(\vec{v})=\vec{v}\) for all \(\vec{v} \in \mathbb{K}^n.\) We call this coordinate system the standard coordinate system of \(\mathbb{K}^n.\)

Example 3.84 • Non-linear coordinates

In Linear Algebra we only consider linear coordinate systems, but in other areas of mathematics non-linear coordinate systems are also used. An example is given by the so-called polar coordinates \[\boldsymbol{\rho}: \mathbb{R}^2\setminus\{0_{\mathbb{R}^2}\} \to (0,\infty)\times (-\pi,\pi]\subset \mathbb{R}^2, \qquad \vec{x}\mapsto \begin{pmatrix} r \\ \phi\end{pmatrix}=\begin{pmatrix} \sqrt{(x_1)^2+(x_2)^2} \\ \arg(\vec{x}) \end{pmatrix},\] where \(\arg(\vec{x})=\arccos(x_1/r)\) for \(x_2\geqslant 0\) and \(\arg(\vec{x})=-\arccos(x_1/r)\) for \(x_2<0.\) Notice that the polar coordinates are only defined on \(\mathbb{R}^2\setminus\{0_{\mathbb{R}^2}\}.\) For further details we refer to the Analysis module.

A convenient way to visualise a linear coordinate system \(\boldsymbol{\varphi}: \mathbb{R}^2 \to \mathbb{R}^2\) is to consider the preimage \(\boldsymbol{\varphi}^{-1}(\mathcal{C})\) of the standard coordinate grid \[\tag{3.13} \mathcal{C}=\left\{s \vec{e}_1+k\vec{e}_2| s \in \mathbb{R}, k \in \mathbb{Z}\right\}\cup \left\{k \vec{e}_1+s\vec{e}_2| s \in \mathbb{R}, k \in \mathbb{Z}\right\}\] under \(\boldsymbol{\varphi}.\) The first set in the union (3.13) consists of the horizontal coordinate lines, the second of the vertical coordinate lines.

Example 3.85 • see Figure 3.1

The vector \(\vec{v}=\begin{pmatrix} 2 \\ 1 \end{pmatrix}\) has coordinates \(\begin{pmatrix} 2 \\ 1 \end{pmatrix}\) with respect to the standard coordinate system of \(\mathbb{R}^2.\) The same vector has coordinates \(\boldsymbol{\varphi}(\vec{v})=\begin{pmatrix} 4 \\ -1 \end{pmatrix}\) with respect to the coordinate system \(\boldsymbol{\varphi}\left(\begin{pmatrix}v_1 \\ v_2\end{pmatrix}\right)=\begin{pmatrix}v_1+2v_2 \\ -v_1+v_2\end{pmatrix}.\)

Figure 3.1: The coordinates of a vector with respect to different coordinate systems.
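
The preimage \(\boldsymbol{\varphi}^{-1}(\mathcal{C})\) can also be plotted numerically. The following is a minimal Python sketch, not part of the course material, assuming numpy and matplotlib are available; it uses the coordinate system \(\boldsymbol{\varphi}\) from Example 3.85, whose matrix we call `P` (our name, not notation from the notes). Since \(\boldsymbol{\varphi}^{-1}\) is linear, each coordinate line is the straight line through the preimages of two of its points.

```python
import numpy as np
import matplotlib.pyplot as plt

# Matrix of the coordinate system phi from Example 3.85:
# phi(v) = (v1 + 2*v2, -v1 + v2).
P = np.array([[1.0, 2.0],
              [-1.0, 1.0]])
P_inv = np.linalg.inv(P)  # phi^{-1} is the linear map with matrix P^{-1}

s = np.array([-6.0, 6.0])  # two points suffice: lines are mapped to lines
for k in range(-6, 7):
    # horizontal line s*e1 + k*e2 and vertical line k*e1 + s*e2 of the grid C
    for line in (np.array([s, k * np.ones(2)]),
                 np.array([k * np.ones(2), s])):
        x, y = P_inv @ line  # endpoints of the preimage of the line
        plt.plot(x, y, color="grey", linewidth=0.5)

plt.gca().set_aspect("equal")
plt.title(r"$\varphi^{-1}(\mathcal{C})$")
plt.show()
```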

While \(\mathbb{K}^n\) is equipped with the standard coordinate system, in an abstract vector space \(V\) there is no preferred linear coordinate system, and choosing a linear coordinate system amounts to choosing a so-called ordered basis of \(V.\)

Definition 3.86 • Ordered basis

Let \(V\) be a finite dimensional \(\mathbb{K}\)-vector space. An (ordered) \(n\)-tuple \(\mathbf{b}=(v_1,\ldots,v_n)\) of vectors from \(V\) is called an ordered basis of \(V\) if the set \(\{v_1,\ldots,v_n\}\) is a basis of \(V.\)

That there is a bijective correspondence between ordered bases of \(V\) and linear coordinate systems on \(V\) is a consequence of the following very important lemma, which states in particular that two linear maps \(f, g : V \to W\) are the same if and only if they agree on a basis of \(V.\)

Lemma 3.87

Let \(V,W\) be finite dimensional \(\mathbb{K}\)-vector spaces.

  1. Suppose \(f,g : V \to W\) are linear maps and \(\mathbf{b}=(v_1,\ldots,v_n)\) is an ordered basis of \(V.\) Then \(f=g\) if and only if \(f(v_i)=g(v_i)\) for all \(1\leqslant i\leqslant n.\)

  2. If \(\dim V=\dim W\) and \(\mathbf{b}=(v_1,\ldots,v_n)\) is an ordered basis of \(V\) and \(\mathbf{c}=(w_1,\ldots,w_n)\) an ordered basis of \(W,\) then there exists a unique isomorphism \(f : V \to W\) such that \(f(v_i)=w_i\) for all \(1\leqslant i\leqslant n.\)

Proof. (i) \(\Rightarrow\) If \(f=g\) then \(f(v_i)=g(v_i)\) for all \(1\leqslant i\leqslant n.\) \(\Leftarrow\) Let \(v \in V.\) Since \(\mathbf{b}\) is an ordered basis of \(V\) there exist unique scalars \(s_1,\ldots,s_n \in \mathbb{K}\) such that \(v=\sum_{i=1}^n s_i v_i.\) Using the linearity of \(f\) and \(g,\) we compute \[f(v)=f\left(\sum_{i=1}^n s_i v_i\right)=\sum_{i=1}^ns_if(v_i)=\sum_{i=1}^ns_ig(v_i)=g\left(\sum_{i=1}^n s_i v_i\right)=g(v)\] so that \(f=g.\)

(ii) Let \(v \in V.\) Since \(\{v_1,\ldots,v_n\}\) is a basis of \(V\) there exist unique scalars \(s_1,\ldots,s_n\) such that \(v=\sum_{i=1}^n s_i v_i.\) We define \(f(v)=\sum_{i=1}^ns_i w_i,\) so that in particular \(f(v_i)=w_i\) for \(1\leqslant i\leqslant n.\) The uniqueness of the scalars \(s_1,\ldots,s_n\) ensures that \(f\) is well defined, and one readily checks that \(f\) is linear. Since the vectors \(w_1,\ldots,w_n\) are linearly independent we have \(f(v)=0_W\) if and only if \(s_1=\cdots=s_n=0,\) that is \(v=0_V.\) Lemma 3.31 implies that \(f\) is injective and hence an isomorphism by Corollary 3.77. The uniqueness of \(f\) follows from (i).
Remark 3.88 Notice that Lemma 3.87 is wrong for maps that are not linear. Consider \[f : \mathbb{R}^2\to \mathbb{R}, \quad \begin{pmatrix} x_1 \\ x_2\end{pmatrix} \mapsto x_1x_2\] and \[g : \mathbb{R}^2 \to \mathbb{R}\quad \begin{pmatrix} x_1 \\ x_2\end{pmatrix} \mapsto (x_1-1)(x_2-1).\] Then \(f(\vec{e}_1)=g(\vec{e}_1)\) and \(f(\vec{e}_2)=g(\vec{e}_2),\) but \(f\neq g.\)

Given an ordered basis \(\mathbf{b}=(v_1,\ldots,v_n)\) of \(V,\) the previous lemma implies that there is a unique linear coordinate system \(\boldsymbol{\beta}: V \to \mathbb{K}^n\) such that \[\tag{3.14} \boldsymbol{\beta}(v_i)=\vec{e}_i\] for \(1\leqslant i \leqslant n,\) where \(\{\vec{e}_1,\ldots,\vec{e}_n\}\) denotes the standard basis of \(\mathbb{K}^n.\) Conversely, if \(\boldsymbol{\beta}: V \to \mathbb{K}^n\) is a linear coordinate system, we obtain an ordered basis of \(V\) \[\mathbf{b}=(\boldsymbol{\beta}^{-1}(\vec{e}_1),\ldots,\boldsymbol{\beta}^{-1}(\vec{e}_n))\] and these assignments are inverse to each other. Notice that for all \(v \in V\) we have \[\boldsymbol{\beta}(v)=\begin{pmatrix} s_1 \\ \vdots \\ s_n\end{pmatrix} \qquad \iff \qquad v=s_1v_1+\cdots+s_n v_n.\]

Remark 3.89 • Notation

We will denote an ordered basis by an upright bold Roman letter, such as \(\mathbf{b},\mathbf{c},\mathbf{d}\) or \(\mathbf{e}.\) We will denote the corresponding linear coordinate system by the corresponding bold Greek letter \(\boldsymbol{\beta},\) \(\boldsymbol{\gamma},\) \(\boldsymbol{\delta}\) or \(\boldsymbol{\varepsilon},\) respectively.

Example 3.90

Let \(V=\mathbb{K}^3\) and \(\mathbf{e}=(\vec{e}_1,\vec{e}_2,\vec{e}_3)\) denote the ordered standard basis. Then for all \(\vec{x}=(x_i)_{1\leqslant i\leqslant 3}\in \mathbb{K}^3\) we have \[\boldsymbol{\varepsilon}(\vec{x})=\vec{x},\] where \(\boldsymbol{\varepsilon}\) denotes the linear coordinate system corresponding to \(\mathbf{e}.\) Notice that \(\boldsymbol{\varepsilon}\) is the standard coordinate system on \(\mathbb{K}^3.\) Considering instead the ordered basis \(\mathbf{b}=(\vec{v}_1,\vec{v}_2,\vec{v}_3)=(\vec{e}_1+\vec{e}_3,\vec{e}_3,\vec{e}_2-\vec{e}_1),\) we obtain \[\boldsymbol{\beta}(\vec{x})=\begin{pmatrix} x_1+x_2 \\ x_3-x_1-x_2 \\ x_2\end{pmatrix}\] since \[\vec{x}=\begin{pmatrix} x_1 \\ x_2 \\ x_3\end{pmatrix}=(x_1+x_2)\underbrace{\begin{pmatrix}1 \\ 0 \\ 1 \end{pmatrix}}_{=\vec{v}_1}+(x_3-x_1-x_2)\underbrace{\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}}_{=\vec{v}_2}+x_2\underbrace{\begin{pmatrix}-1 \\ 1 \\ 0 \end{pmatrix}}_{=\vec{v}_3}.\]
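
Numerically, the coordinates of \(\vec{x}\) with respect to \(\mathbf{b}\) can be found by solving a linear system: collecting the basis vectors as the columns of a matrix \(B,\) the coordinate vector \(\boldsymbol{\beta}(\vec{x})\) is the unique solution \(\vec{s}\) of \(B\vec{s}=\vec{x}.\) A minimal Python sketch for Example 3.90, assuming numpy is available (the names `B` and `beta` are ours):

```python
import numpy as np

# Columns of B are the vectors of the ordered basis b = (e1+e3, e3, e2-e1).
B = np.array([[1.0, 0.0, -1.0],
              [0.0, 0.0,  1.0],
              [1.0, 1.0,  0.0]])

def beta(x):
    """Coordinates of x with respect to b: the unique s with B s = x."""
    return np.linalg.solve(B, x)

x = np.array([2.0, -1.0, 5.0])  # a test vector
print(beta(x))  # expected (x1+x2, x3-x1-x2, x2) = (1, 4, -1)
```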

Fixing linear coordinate systems – or equivalently ordered bases – on finite dimensional vector spaces \(V,W\) allows us to describe each linear map \(g :V \to W\) in terms of a matrix:

Definition 3.91 • Matrix representation of a linear map

Let \(V,W\) be finite dimensional \(\mathbb{K}\)-vector spaces with \(\dim(V)=n\) and \(\dim(W)=m,\) \(\mathbf{b}\) an ordered basis of \(V\) and \(\mathbf{c}\) an ordered basis of \(W.\) The matrix representation of a linear map \(g : V \to W\) with respect to the ordered bases \(\mathbf{b}\) and \(\mathbf{c}\) is the unique matrix \(\mathbf{M}(g,\mathbf{b},\mathbf{c}) \in M_{m,n}(\mathbb{K})\) such that \[f_{\mathbf{M}(g,\mathbf{b},\mathbf{c})}=\boldsymbol{\gamma}\circ g \circ \boldsymbol{\beta}^{-1},\] where \(\boldsymbol{\beta}\) and \(\boldsymbol{\gamma}\) denote the linear coordinate systems corresponding to \(\mathbf{b}\) and \(\mathbf{c},\) respectively.

The role of the different mappings can be summarised in terms of the following diagram: \[\begin{CD} V @>g>> W \\ @A{\boldsymbol{\beta}^{-1}}AA @VV{\boldsymbol{\gamma}}V\\ \mathbb{K}^n @>f_{\mathbf{M}(g,\mathbf{b},\mathbf{c})}>> \mathbb{K}^m \\ \end{CD}\] In practice, we can compute the matrix representation of a linear map as follows:

Proposition 3.92

Let \(V,W\) be finite dimensional \(\mathbb{K}\)-vector spaces, \(\mathbf{b}=(v_1,\ldots,v_n)\) an ordered basis of \(V,\) \(\mathbf{c}=(w_1,\ldots,w_m)\) an ordered basis of \(W\) and \(g : V \to W\) a linear map. Then there exist unique scalars \(A_{ij} \in \mathbb{K},\) where \(1\leqslant i\leqslant m, 1\leqslant j \leqslant n\) such that \[\tag{3.15} g(v_j)=\sum_{i=1}^m A_{ij}w_i, \qquad 1\leqslant j\leqslant n.\] Furthermore, the matrix \(\mathbf{A}=(A_{ij})_{1\leqslant i\leqslant m, 1\leqslant j \leqslant n}\) satisfies \[f_\mathbf{A}=\boldsymbol{\gamma}\circ g \circ \boldsymbol{\beta}^{-1}\] and hence is the matrix representation of \(g\) with respect to the ordered bases \(\mathbf{b}\) and \(\mathbf{c}.\)

Remark 3.93

Notice that we sum over the first index of \(A_{ij}\) in (3.15).

Proof of Proposition 3.92. For all \(1\leqslant j\leqslant n\) the vector \(g(v_j)\) is an element of \(W\) and hence a linear combination of the vectors \(w_1,\ldots,w_m,\) as \(\mathbf{c}=(w_1,\ldots,w_m)\) is an ordered basis of \(W.\) We thus have scalars \(A_{ij}\in \mathbb{K}\) with \(1\leqslant i\leqslant m, 1\leqslant j \leqslant n\) such that \(g(v_j)=\sum_{i=1}^m A_{ij}w_i.\) If \(\hat{A}_{ij} \in \mathbb{K}\) with \(1\leqslant i\leqslant m, 1\leqslant j \leqslant n\) also satisfy \(g(v_j)=\sum_{i=1}^m\hat{A}_{ij}w_i,\) then subtracting the two equations gives \[g(v_j)-g(v_j)=0_{W}=\sum_{i=1}^m(A_{ij}-\hat{A}_{ij})w_i\] so that \(0=A_{ij}-\hat{A}_{ij}\) for \(1\leqslant i\leqslant m, 1\leqslant j \leqslant n,\) since the vectors \(w_1,\ldots,w_m\) are linearly independent. It follows that the scalars \(A_{ij}\) are unique. We want to show that \(f_\mathbf{A}\circ \boldsymbol{\beta}=\boldsymbol{\gamma}\circ g.\) Using Lemma 3.87 it is sufficient to show that \((f_{\mathbf{A}}\circ \boldsymbol{\beta})(v_j)=(\boldsymbol{\gamma}\circ g)(v_j)\) for \(1\leqslant j\leqslant n.\) Let \(\{\vec{e}_1,\ldots,\vec{e}_n\}\) denote the standard basis of \(\mathbb{K}^n\) so that \(\boldsymbol{\beta}(v_j)=\vec{e}_j\) and \(\{\vec{d}_1,\ldots,\vec{d}_m\}\) the standard basis of \(\mathbb{K}^m\) so that \(\boldsymbol{\gamma}(w_i)=\vec{d}_i.\) We compute \[\begin{aligned} (f_\mathbf{A}\circ \boldsymbol{\beta})(v_j)&=f_\mathbf{A}(\vec{e}_j)=\mathbf{A}\vec{e}_j=\sum_{i=1}^mA_{ij}\vec{d}_i=\sum_{i=1}^mA_{ij}\boldsymbol{\gamma}(w_i)=\boldsymbol{\gamma}\left(\sum_{i=1}^m A_{ij}w_i\right)\\ &=\boldsymbol{\gamma}(g(v_j))=(\boldsymbol{\gamma}\circ g)(v_j) \end{aligned}\] where we have used the linearity of \(\boldsymbol{\gamma}\) and (3.15).

This all translates to a simple recipe for calculating the matrix representation of a linear map, which we now illustrate in some examples.

Example 3.94

Let \(V=\mathsf{P}_{2}(\mathbb{R}),\) \(W=\mathsf{P}_{1}(\mathbb{R})\) and \(g=\frac{\mathrm{d}}{\mathrm{d}x}.\) We consider the ordered bases \(\mathbf{b}=(v_1,v_2,v_3)=((1/2)(3x^2-1),x,1)\) of \(V\) and \(\mathbf{c}=(w_1,w_2)=(x,1)\) of \(W.\)

  1. Compute the image under \(g\) of the elements \(v_i\) of the ordered basis \(\mathbf{b}.\) \[\begin{aligned} g\left(\frac{1}{2}(3x^2-1)\right)&=\frac{\mathrm{d}}{\mathrm{d}x}\left(\frac{1}{2}(3x^2-1)\right)=3x\\ g\left(x\right)&=\frac{\mathrm{d}}{\mathrm{d}x}(x)=1\\ g\left(1\right)&=\frac{\mathrm{d}}{\mathrm{d}x}(1)=0. \end{aligned}\]

  2. Write the image vectors as linear combinations of the elements of the ordered basis \(\mathbf{c}.\) \[\tag{3.16} \begin{aligned} 3x &=3\cdot w_1+ 0\cdot w_2 \\ 1 & = 0\cdot w_1+ 1 \cdot w_2 \\ 0 & = 0\cdot w_1+0 \cdot w_2 \end{aligned}\]

  3. Taking the transpose of the matrix of coefficients appearing in (3.16) gives the matrix representation \[\mathbf{M}\left(\frac{\mathrm{d}}{\mathrm{d}x},\mathbf{b},\mathbf{c}\right)=\begin{pmatrix} 3 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}\] of the linear map \(g=\frac{\mathrm{d}}{\mathrm{d}x}\) with respect to the bases \(\mathbf{b},\mathbf{c}.\)
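
The recipe can also be carried out numerically by representing polynomials as coefficient vectors. Below is a minimal Python sketch for Example 3.94, assuming numpy is available; polynomials are encoded by their coefficients with respect to the monomials (an encoding we choose for illustration, not notation from the notes).

```python
import numpy as np

# Basis b of P_2: (1/2)(3x^2 - 1), x, 1 as columns, in monomial coordinates (1, x, x^2).
Vb = np.array([[-0.5, 0.0, 1.0],
               [ 0.0, 1.0, 0.0],
               [ 1.5, 0.0, 0.0]])
# Basis c of P_1: x, 1 as columns, in monomial coordinates (1, x).
Wc = np.array([[0.0, 1.0],
               [1.0, 0.0]])
# Differentiation in monomial coordinates: a0 + a1*x + a2*x^2 -> a1 + 2*a2*x.
D = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])

# Column j of M(g, b, c) holds the c-coordinates of g(v_j): solve Wc t = D v_j.
M = np.linalg.solve(Wc, D @ Vb)
print(M)  # expected [[3. 0. 0.], [0. 1. 0.]]
```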

Example 3.95

Let \(\mathbf{e}=(\vec{e}_1,\ldots,\vec{e}_n)\) and \(\mathbf{d}=(\vec{d}_1,\ldots,\vec{d}_m)\) denote the ordered standard basis of \(\mathbb{K}^n\) and \(\mathbb{K}^m,\) respectively. Then for \(\mathbf{A}\in M_{m,n}(\mathbb{K}),\) we have \[\mathbf{A}=\mathbf{M}(f_\mathbf{A},\mathbf{e},\mathbf{d}),\] that is, the matrix representation of the mapping \(f_\mathbf{A}: \mathbb{K}^n \to \mathbb{K}^m\) with respect to the standard bases is simply the matrix \(\mathbf{A}.\) Indeed, we have \[f_\mathbf{A}(\vec{e}_j)=\mathbf{A}\vec{e}_j=\begin{pmatrix} A_{1j} \\ \vdots \\ A_{mj} \end{pmatrix}=\sum_{i=1}^m A_{ij}\vec{d}_i.\]

Example 3.96

Let \(\mathbf{e}=(\vec{e}_1,\vec{e}_2)\) denote the ordered standard basis of \(\mathbb{R}^2.\) Consider the matrix \[\mathbf{A}=\begin{pmatrix} 5 & 1 \\ 1 & 5 \end{pmatrix}=\mathbf{M}(f_\mathbf{A},\mathbf{e},\mathbf{e}).\] We want to compute \(\mathbf{M}(f_\mathbf{A},\mathbf{b},\mathbf{b}),\) where \(\mathbf{b}=(\vec{v}_1,\vec{v}_2)=(\vec{e}_1+\vec{e}_2,\vec{e}_2-\vec{e}_1)\) is not the standard basis of \(\mathbb{R}^2.\) We obtain \[\begin{aligned} f_\mathbf{A}(\vec{v}_1)&=\mathbf{A}\vec{v}_1=\begin{pmatrix} 5 & 1 \\ 1 & 5 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix}=\begin{pmatrix} 6 \\ 6 \end{pmatrix}=6\cdot \vec{v}_1+ 0\cdot \vec{v}_2,\\ f_\mathbf{A}(\vec{v}_2)&=\mathbf{A}\vec{v}_2=\begin{pmatrix} 5 & 1 \\ 1 & 5 \end{pmatrix}\begin{pmatrix} -1 \\ 1 \end{pmatrix}=\begin{pmatrix} -4 \\ 4 \end{pmatrix}=0\cdot \vec{v}_1+ 4\cdot \vec{v}_2. \end{aligned}\] Therefore, we have \[\mathbf{M}(f_\mathbf{A},\mathbf{b},\mathbf{b})=\begin{pmatrix} 6 & 0 \\ 0 & 4\end{pmatrix}.\]
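
Equivalently, writing the basis vectors of \(\mathbf{b}\) as the columns of a matrix \(B,\) the \(j\)-th column of \(\mathbf{M}(f_\mathbf{A},\mathbf{b},\mathbf{b})\) solves \(B\vec{t}=\mathbf{A}\vec{v}_j,\) so that \(\mathbf{M}(f_\mathbf{A},\mathbf{b},\mathbf{b})=B^{-1}\mathbf{A}B.\) A minimal Python check, assuming numpy is available:

```python
import numpy as np

A = np.array([[5.0, 1.0],
              [1.0, 5.0]])
# Columns of B are the basis vectors v1 = e1+e2 and v2 = e2-e1.
B = np.array([[1.0, -1.0],
              [1.0,  1.0]])

# M(f_A, b, b) = B^{-1} A B, computed column-wise via a linear solve.
M = np.linalg.solve(B, A @ B)
print(M)  # expected [[6. 0.], [0. 4.]]
```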

Proposition 3.97

Let \(V,W\) be finite dimensional \(\mathbb{K}\)-vector spaces, \(\mathbf{b}\) an ordered basis of \(V\) with corresponding linear coordinate system \(\boldsymbol{\beta},\) \(\mathbf{c}\) an ordered basis of \(W\) with corresponding linear coordinate system \(\boldsymbol{\gamma}\) and \(g : V \to W\) a linear map. Then for all \(v \in V\) we have \[\boldsymbol{\gamma}(g(v))=\mathbf{M}(g,\mathbf{b},\mathbf{c})\boldsymbol{\beta}(v).\]

Proof. By definition we have for all \(\vec{x} \in \mathbb{K}^n\) and \(\mathbf{A}\in M_{m,n}(\mathbb{K})\) \[\mathbf{A}\vec{x}=f_\mathbf{A}(\vec{x}).\] Combining this with Definition 3.91, we obtain for all \(v \in V\) \[\mathbf{M}(g,\mathbf{b},\mathbf{c})\boldsymbol{\beta}(v)=f_{\mathbf{M}(g,\mathbf{b},\mathbf{c})}(\boldsymbol{\beta}(v))=(\boldsymbol{\gamma}\circ g \circ \boldsymbol{\beta}^{-1})(\boldsymbol{\beta}(v))=\boldsymbol{\gamma}(g(v)),\] as claimed.
Remark 3.98 Explicitly, Proposition 3.97 states the following. Let \(\mathbf{A}=\mathbf{M}(g,\mathbf{b},\mathbf{c})\) and let \(v \in V.\) Since \(\mathbf{b}\) is an ordered basis of \(V,\) there exist unique scalars \(s_i \in \mathbb{K},\) \(1\leqslant i\leqslant n\) such that \[v=s_1v_1+\cdots+s_n v_n.\] Then we have \[g(v)=t_1w_1+\cdots+t_m w_m,\] where \[\begin{pmatrix} t_1 \\ \vdots \\ t_m\end{pmatrix}=\mathbf{A}\begin{pmatrix} s_1 \\ \vdots \\ s_n\end{pmatrix}.\]
Example 3.99 (Example 3.94 continued). With respect to the ordered basis \(\mathbf{b}=\left(\frac{1}{2}(3x^2-1),x,1\right),\) the polynomial \(a_2x^2+a_1x+a_0 \in V=\mathsf{P}_2(\mathbb{R})\) is represented by the vector \[\boldsymbol{\beta}(a_2x^2+a_1x+a_0)=\begin{pmatrix} \frac{2}{3}a_2 \\ a_1 \\ \frac{a_2}{3}+a_0\end{pmatrix}.\] Indeed, \[a_2x^2+a_1x+a_0=\frac{2}{3}a_2\left(\frac{1}{2}(3x^2-1)\right)+a_1x+\left(\frac{a_2}{3}+a_0\right)1.\] Computing \(\mathbf{M}(\frac{\mathrm{d}}{\mathrm{d}x},\mathbf{b},\mathbf{c})\boldsymbol{\beta}(a_2x^2+a_1x+a_0)\) gives \[\begin{pmatrix} 3 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} \frac{2}{3}a_2 \\ a_1 \\ \frac{a_2}{3}+a_0\end{pmatrix}=\begin{pmatrix} 2a_2 \\ a_1 \end{pmatrix}\] and this vector represents the polynomial \(2a_2\cdot x+a_1\cdot 1=\frac{\mathrm{d}}{\mathrm{d}x}(a_2x^2+a_1x+a_0)\) with respect to the basis \(\mathbf{c}=(x,1)\) of \(\mathsf{P}_1(\mathbb{R}).\)
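
Proposition 3.97 can be checked numerically in this example. A short Python sketch, assuming numpy is available; the helper `beta` (our name) encodes the coordinate formula derived above:

```python
import numpy as np

M = np.array([[3.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])  # M(d/dx, b, c) from Example 3.94

def beta(a0, a1, a2):
    """Coordinates of a2*x^2 + a1*x + a0 w.r.t. b = ((1/2)(3x^2-1), x, 1)."""
    return np.array([2.0 * a2 / 3.0, a1, a2 / 3.0 + a0])

a0, a1, a2 = 7.0, -2.0, 3.0
print(M @ beta(a0, a1, a2))  # expected (2*a2, a1) = (6, -2),
                             # the c-coordinates of the derivative
```
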
As a corollary to Proposition 3.92 we obtain:
Corollary 3.100

Let \(V_1,V_2,V_3\) be finite dimensional \(\mathbb{K}\)-vector spaces and \(\mathbf{b}_i\) an ordered basis of \(V_i\) for \(i=1,2,3.\) Let \(g_1 : V_1 \to V_2\) and \(g_2 : V_2 \to V_3\) be linear maps. Then \[\mathbf{M}(g_2\circ g_1,\mathbf{b}_1,\mathbf{b}_3)=\mathbf{M}(g_2,\mathbf{b}_2,\mathbf{b}_3)\mathbf{M}(g_1,\mathbf{b}_1,\mathbf{b}_2).\]

Proof. Let us write \(\mathbf{C}=\mathbf{M}(g_2\circ g_1,\mathbf{b}_1,\mathbf{b}_3)\) and \(\mathbf{A}_1=\mathbf{M}(g_1,\mathbf{b}_1,\mathbf{b}_2)\) as well as \(\mathbf{A}_2=\mathbf{M}(g_2,\mathbf{b}_2,\mathbf{b}_3).\) Using Proposition 2.20 and Theorem 2.21 it suffices to show that \(f_\mathbf{C}=f_{\mathbf{A}_2\mathbf{A}_1}=f_{\mathbf{A}_2}\circ f_{\mathbf{A}_1}.\) Now Proposition 3.92 gives \[f_{\mathbf{A}_2}\circ f_{\mathbf{A}_1}=\boldsymbol{\beta}_3\circ g_2 \circ \boldsymbol{\beta}_2^{-1}\circ \boldsymbol{\beta}_2\circ g_1 \circ \boldsymbol{\beta}_1^{-1}=\boldsymbol{\beta}_3\circ g_2\circ g_1\circ \boldsymbol{\beta}_{1}^{-1}=f_\mathbf{C}.\]
Proposition 3.101

Let \(V,W\) be finite dimensional \(\mathbb{K}\)-vector spaces, \(\mathbf{b}\) an ordered basis of \(V\) and \(\mathbf{c}\) an ordered basis of \(W.\) A linear map \(g : V \to W\) is bijective if and only if \(\mathbf{M}(g,\mathbf{b},\mathbf{c})\) is invertible. Moreover, in the case where \(g\) is bijective we have \[\mathbf{M}(g^{-1},\mathbf{c},\mathbf{b})=(\mathbf{M}(g,\mathbf{b},\mathbf{c}))^{-1}.\]

Proof. Let \(n=\dim(V)\) and \(m=\dim(W).\)

\(\Rightarrow\) Let \(g : V \to W\) be bijective so that \(g\) is an isomorphism and hence \(n=\dim(V)=\dim(W)=m\) by Proposition 3.80. Then Corollary 3.100 gives \[\mathbf{M}(g^{-1},\mathbf{c},\mathbf{b})\mathbf{M}(g,\mathbf{b},\mathbf{c})=\mathbf{M}(g^{-1}\circ g,\mathbf{b},\mathbf{b})=\mathbf{M}(\mathrm{Id}_{V},\mathbf{b},\mathbf{b})=\mathbf{1}_{n}\] and \[\mathbf{M}(g,\mathbf{b},\mathbf{c})\mathbf{M}(g^{-1},\mathbf{c},\mathbf{b})=\mathbf{M}(g\circ g^{-1},\mathbf{c},\mathbf{c})=\mathbf{M}(\mathrm{Id}_{W},\mathbf{c},\mathbf{c})=\mathbf{1}_{n}\] so that \(\mathbf{M}(g,\mathbf{b},\mathbf{c})\) is invertible with inverse \(\mathbf{M}(g^{-1},\mathbf{c},\mathbf{b}).\) \(\Leftarrow\) Conversely suppose \(\mathbf{A}=\mathbf{M}(g,\mathbf{b},\mathbf{c})\) is invertible with inverse \(\mathbf{A}^{-1}.\) It follows that \(n=m\) by Corollary 3.81. We consider \(h=\boldsymbol{\beta}^{-1}\circ f_{\mathbf{A}^{-1}}\circ \boldsymbol{\gamma}: W \to V\) and since \(f_\mathbf{A}=\boldsymbol{\gamma}\circ g \circ \boldsymbol{\beta}^{-1}\) by Proposition 3.92, we have \[g\circ h=\boldsymbol{\gamma}^{-1}\circ f_\mathbf{A}\circ \boldsymbol{\beta}\circ \boldsymbol{\beta}^{-1}\circ f_{\mathbf{A}^{-1}}\circ \boldsymbol{\gamma}=\boldsymbol{\gamma}^{-1}\circ f_{\mathbf{A}\mathbf{A}^{-1}}\circ \boldsymbol{\gamma}=\mathrm{Id}_{W}.\] Likewise, we have \[h\circ g=\boldsymbol{\beta}^{-1}\circ f_{\mathbf{A}^{-1}}\circ \boldsymbol{\gamma}\circ \boldsymbol{\gamma}^{-1}\circ f_\mathbf{A}\circ \boldsymbol{\beta}=\boldsymbol{\beta}^{-1}\circ f_{\mathbf{A}^{-1}\mathbf{A}}\circ\boldsymbol{\beta}=\mathrm{Id}_{V},\] showing that \(g\) admits an inverse mapping \(h : W \to V\) and hence \(g\) is bijective.

Recall that a mapping \(f : \mathcal{X} \to \mathcal{Y}\) between sets \(\mathcal{X},\mathcal{Y}\) is said to admit a left inverse if there exists a mapping \(g : \mathcal{Y} \to \mathcal{X}\) such that \(g\circ f=\mathrm{Id}_\mathcal{X}.\) Likewise, a right inverse is a mapping \(h : \mathcal{Y} \to \mathcal{X}\) such that \(f\circ h=\mathrm{Id}_\mathcal{Y}.\)

We now have:

Proposition 3.102

Let \(n \in \mathbb{N}\) and \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) a square matrix. Then the following statements are equivalent:

  1. The matrix \(\mathbf{A}\) admits a left inverse, that is, a matrix \(\mathbf{B}\in M_{n,n}(\mathbb{K})\) such that \(\mathbf{B}\mathbf{A}=\mathbf{1}_{n}\);

  2. The matrix \(\mathbf{A}\) admits a right inverse, that is, a matrix \(\mathbf{B}\in M_{n,n}(\mathbb{K})\) such that \(\mathbf{A}\mathbf{B}=\mathbf{1}_{n}\);

  3. The matrix \(\mathbf{A}\) is invertible.

Proof. By the definition of the invertibility of a matrix, (iii) implies both (i) and (ii).

(i) \(\Rightarrow\) (iii) Since \(\mathbf{B}\mathbf{A}=\mathbf{1}_{n}\) we have \(f_\mathbf{B}\circ f_\mathbf{A}=f_{\mathbf{1}_{n}}=\mathrm{Id}_{\mathbb{K}^n}\) by Theorem 2.21 and hence \(f_\mathbf{B}\) is a left inverse for \(f_\mathbf{A}.\) Therefore, by Exercise 3.110, \(f_\mathbf{A}\) is injective. Corollary 3.77 implies that \(f_\mathbf{A}\) is also bijective. Denoting the ordered standard basis of \(\mathbb{K}^n\) by \(\mathbf{e},\) we have \(\mathbf{M}(f_\mathbf{A},\mathbf{e},\mathbf{e})=\mathbf{A}\) and hence Proposition 3.101 implies that \(\mathbf{A}\) is invertible.

(ii) \(\Rightarrow\) (iii) is completely analogous to (i) \(\Rightarrow\) (iii).

3.7.1 Change of basis

It is natural to ask how the choice of bases affects the matrix representation of a linear map.

Definition 3.103 • Change of basis matrix

Let \(V\) be a \(\mathbb{K}\)-vector space of dimension \(n \in \mathbb{N}\) and \(\mathbf{b}, \mathbf{b}^{\prime}\) be ordered bases of \(V\) with corresponding linear coordinate systems \(\boldsymbol{\beta},\boldsymbol{\beta}^{\prime}.\) The change of basis matrix from \(\mathbf{b}\) to \(\mathbf{b}^{\prime}\) is the matrix \(\mathbf{C}\in M_{n,n}(\mathbb{K})\) satisfying \[f_\mathbf{C}=\boldsymbol{\beta}^{\prime}\circ \boldsymbol{\beta}^{-1}.\] We will write \(\mathbf{C}(\mathbf{b},\mathbf{b}^{\prime})\) for the change of basis matrix from \(\mathbf{b}\) to \(\mathbf{b}^{\prime}.\)

Remark 3.104 Notice that by definition \[\mathbf{C}(\mathbf{b},\mathbf{b}^{\prime})=\mathbf{M}(\mathrm{Id}_V,\mathbf{b},\mathbf{b}^{\prime}).\] Since the identity map \(\mathrm{Id}_V : V \to V\) is bijective with inverse \((\mathrm{Id}_V)^{-1}=\mathrm{Id}_V,\) Proposition 3.101 implies that the change of basis matrix \(\mathbf{C}(\mathbf{b},\mathbf{b}^{\prime})\) is invertible with inverse \[\mathbf{C}(\mathbf{b},\mathbf{b}^{\prime})^{-1}=\mathbf{C}(\mathbf{b}^{\prime},\mathbf{b}).\]
Example 3.105 Let \(V=\mathbb{R}^2\) and \(\mathbf{e}=(\vec{e}_1,\vec{e}_2)\) be the ordered standard basis and \(\mathbf{b}=(\vec{v}_1,\vec{v}_2)=(\vec{e}_1+\vec{e}_2,\vec{e}_2-\vec{e}_1)\) another ordered basis. According to the recipe mentioned in Example 3.94, if we want to compute \(\mathbf{C}(\mathbf{e},\mathbf{b})\) we simply need to write each vector of \(\mathbf{e}\) as a linear combination of the elements of \(\mathbf{b}.\) The transpose of the resulting coefficient matrix is then \(\mathbf{C}(\mathbf{e},\mathbf{b}).\) We obtain \[\begin{aligned} \vec{e}_1&=\frac{1}{2}\vec{v}_1-\frac{1}{2}\vec{v}_2,\\ \vec{e}_2&=\frac{1}{2}\vec{v}_1+\frac{1}{2}\vec{v}_2, \end{aligned}\] so that \[\mathbf{C}(\mathbf{e},\mathbf{b})=\begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2}\end{pmatrix}.\] Reversing the roles of \(\mathbf{e}\) and \(\mathbf{b}\) gives \(\mathbf{C}(\mathbf{b},\mathbf{e}):\) \[\begin{aligned} \vec{v}_1&=1\vec{e}_1+1\vec{e}_2,\\ \vec{v}_2&=-1 \vec{e}_1+1\vec{e}_2, \end{aligned}\] so that \[\mathbf{C}(\mathbf{b},\mathbf{e})=\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}.\] Notice that indeed we have \[\mathbf{C}(\mathbf{e},\mathbf{b})\mathbf{C}(\mathbf{b},\mathbf{e})=\begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2}\end{pmatrix}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}=\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\] so that \(\mathbf{C}(\mathbf{e},\mathbf{b})^{-1}=\mathbf{C}(\mathbf{b},\mathbf{e}).\)
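
Numerically, \(\mathbf{C}(\mathbf{b},\mathbf{e})\) simply has the vectors of \(\mathbf{b}\) as its columns, and by Remark 3.104 the matrix \(\mathbf{C}(\mathbf{e},\mathbf{b})\) is its inverse. A minimal Python check, assuming numpy is available:

```python
import numpy as np

# C(b, e): columns express v1 = e1+e2 and v2 = e2-e1 in the standard basis.
C_be = np.array([[1.0, -1.0],
                 [1.0,  1.0]])

C_eb = np.linalg.inv(C_be)  # Remark 3.104: C(e, b) = C(b, e)^{-1}
print(C_eb)          # expected [[ 0.5  0.5], [-0.5  0.5]]
print(C_eb @ C_be)   # the identity matrix
```
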
Theorem 3.106

Let \(V,W\) be finite dimensional \(\mathbb{K}\)-vector spaces and \(\mathbf{b},\mathbf{b}^{\prime}\) ordered bases of \(V\) and \(\mathbf{c},\mathbf{c}^{\prime}\) ordered bases of \(W.\) Let \(g : V \to W\) be a linear map. Then we have \[\mathbf{M}(g,\mathbf{b}^{\prime},\mathbf{c}^{\prime})=\mathbf{C}(\mathbf{c},\mathbf{c}^{\prime})\mathbf{M}(g,\mathbf{b},\mathbf{c})\mathbf{C}(\mathbf{b}^{\prime},\mathbf{b}).\] In particular, for a linear map \(g : V \to V\) we have \[\mathbf{M}(g,\mathbf{b}^{\prime},\mathbf{b}^{\prime})=\mathbf{C}\,\mathbf{M}(g,\mathbf{b},\mathbf{b})\,\mathbf{C}^{-1},\] where we write \(\mathbf{C}=\mathbf{C}(\mathbf{b},\mathbf{b}^{\prime}).\)

Proof. We write \(\mathbf{A}=\mathbf{M}(g,\mathbf{b},\mathbf{c})\) and \(\mathbf{B}=\mathbf{M}(g,\mathbf{b}^{\prime},\mathbf{c}^{\prime})\) and \(\mathbf{C}=\mathbf{C}(\mathbf{b},\mathbf{b}^{\prime})\) and \(\mathbf{D}=\mathbf{C}(\mathbf{c},\mathbf{c}^{\prime}).\) By Remark 3.104 we have \(\mathbf{C}^{-1}=\mathbf{C}(\mathbf{b}^{\prime},\mathbf{b}),\) hence applying Proposition 2.20 and Theorem 2.21 and Corollary 2.22, we need to show that \[f_\mathbf{B}=f_\mathbf{D}\circ f_\mathbf{A}\circ f_{\mathbf{C}^{-1}}.\] By Definition 3.91 we have \[\begin{aligned} f_\mathbf{A}&=\boldsymbol{\gamma}\circ g \circ \boldsymbol{\beta}^{-1},\\ f_\mathbf{B}&=\boldsymbol{\gamma}^{\prime}\circ g \circ (\boldsymbol{\beta}^{\prime})^{-1} \end{aligned}\] and by Definition 3.103 we have \[\begin{aligned} f_{\mathbf{C}^{-1}}&=\boldsymbol{\beta}\circ (\boldsymbol{\beta}^{\prime})^{-1},\\ f_\mathbf{D}&=\boldsymbol{\gamma}^{\prime}\circ \boldsymbol{\gamma}^{-1}. \end{aligned}\] Hence we obtain \[f_\mathbf{D}\circ f_\mathbf{A}\circ f_{\mathbf{C}^{-1}}=\boldsymbol{\gamma}^{\prime}\circ \boldsymbol{\gamma}^{-1}\circ \boldsymbol{\gamma}\circ g \circ \boldsymbol{\beta}^{-1}\circ \boldsymbol{\beta}\circ (\boldsymbol{\beta}^{\prime})^{-1}=\boldsymbol{\gamma}^{\prime}\circ g \circ (\boldsymbol{\beta}^{\prime})^{-1}=f_\mathbf{B},\] as claimed. The second statement follows again by applying Remark 3.104.
Example 3.107(Example 3.96 and Example 3.105 continued). Let \(\mathbf{e}=(\vec{e}_1,\vec{e}_2)\) denote the ordered standard basis of \(\mathbb{R}^2\) and \[\mathbf{A}=\begin{pmatrix} 5 & 1 \\ 1 & 5 \end{pmatrix}=\mathbf{M}(f_\mathbf{A},\mathbf{e},\mathbf{e}).\] Let \(\mathbf{b}=(\vec{e}_1+\vec{e}_2,\vec{e}_2-\vec{e}_1).\) We computed that \[\mathbf{M}(f_\mathbf{A},\mathbf{b},\mathbf{b})=\begin{pmatrix} 6 & 0 \\ 0 & 4\end{pmatrix}\] as well as \[\mathbf{C}(\mathbf{e},\mathbf{b})=\begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2}\end{pmatrix} \quad \text{and} \quad \mathbf{C}(\mathbf{b},\mathbf{e})=\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}.\] According to Theorem 3.106 we must have \[\mathbf{M}(f_\mathbf{A},\mathbf{b},\mathbf{b})=\mathbf{C}(\mathbf{e},\mathbf{b})\mathbf{M}(f_\mathbf{A},\mathbf{e},\mathbf{e})\mathbf{C}(\mathbf{b},\mathbf{e})\] and indeed \[\begin{pmatrix} 6 & 0 \\ 0 & 4\end{pmatrix} =\begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2}\end{pmatrix}\begin{pmatrix} 5 & 1 \\ 1 & 5 \end{pmatrix}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}.\]

Finally, we observe that every invertible matrix can be realised as a change of basis matrix:

Lemma 3.108

Let \(V\) be a finite dimensional \(\mathbb{K}\)-vector space, \(\mathbf{b}=(v_1,\ldots,v_n)\) an ordered basis of \(V\) and \(\mathbf{C}\in M_{n,n}(\mathbb{K})\) an invertible \(n\times n\)-matrix. Define \(v^{\prime}_j=\sum_{i=1}^n C_{ij}v_i\) for \(1\leqslant j\leqslant n.\) Then \(\mathbf{b}^{\prime}=(v^{\prime}_1,\ldots,v^{\prime}_n)\) is an ordered basis of \(V\) and \(\mathbf{C}(\mathbf{b}^{\prime},\mathbf{b})=\mathbf{C}.\)

Proof. It is sufficient to prove that the vectors \(\{v^{\prime}_1,\ldots,v^{\prime}_n\}\) are linearly independent. Indeed, if they are linearly independent, then they span a subspace \(U\) of dimension \(n\) and Proposition 3.74 implies that \(U=V,\) so that \(\mathbf{b}^{\prime}\) is an ordered basis of \(V.\) Suppose we have scalars \(s_1,\ldots,s_n\) such that \[0_V=\sum_{j=1}^n s_j v^{\prime}_j=\sum_{j=1}^n\sum_{i=1}^n s_j C_{ij}v_i=\sum_{i=1}^n \Big(\sum_{j=1}^n C_{ij}s_j\Big)v_i.\] Since \(\{v_1,\ldots,v_n\}\) is a basis of \(V\) we must have \(\sum_{j=1}^n C_{ij}s_j=0\) for all \(i=1,\ldots,n.\) In matrix notation this is equivalent to the condition \(\mathbf{C}\vec{s}=0_{\mathbb{K}^n},\) where \(\vec{s}=(s_i)_{1\leqslant i\leqslant n}.\) Since \(\mathbf{C}\) is invertible, we can multiply this last equation from the left with \(\mathbf{C}^{-1}\) to obtain \(\mathbf{C}^{-1}\mathbf{C}\vec{s}=\mathbf{C}^{-1}0_{\mathbb{K}^n},\) which is equivalent to \(\vec{s}=0_{\mathbb{K}^n}.\) It follows that \(\mathbf{b}^{\prime}\) is an ordered basis of \(V.\) Finally, the \(j\)-th column of \(\mathbf{C}(\mathbf{b}^{\prime},\mathbf{b})=\mathbf{M}(\mathrm{Id}_V,\mathbf{b}^{\prime},\mathbf{b})\) consists of the coordinates of \(v^{\prime}_j\) with respect to \(\mathbf{b},\) which by construction are \(C_{1j},\ldots,C_{nj},\) so that \(\mathbf{C}(\mathbf{b}^{\prime},\mathbf{b})=\mathbf{C}.\)
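
Lemma 3.108 is easy to test numerically: starting from a basis \(\mathbf{b}\) (the columns of a matrix \(B\)) and an invertible \(\mathbf{C},\) the new basis vectors are the columns of \(B\mathbf{C},\) and expressing them in terms of \(\mathbf{b}\) recovers \(\mathbf{C}.\) A minimal Python sketch, assuming numpy is available; the concrete \(B\) and \(\mathbf{C}\) below are our own choices.

```python
import numpy as np

# Columns of B: the ordered basis b = (e1+e3, e3, e2-e1) of R^3 from Example 3.90.
B = np.array([[1.0, 0.0, -1.0],
              [0.0, 0.0,  1.0],
              [1.0, 1.0,  0.0]])
# An invertible matrix C (det = 3, so Lemma 3.108 applies).
C = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])

B_new = B @ C  # column j is v'_j = sum_i C_ij v_i, the new basis b'

# b-coordinates of each v'_j: solve B s = v'_j; this recovers C(b', b) = C.
print(np.linalg.solve(B, B_new))
```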

Exercises

Exercise 3.109

Let \(\mathrm{Id}_V : V \to V\) denote the identity mapping of the finite dimensional \(\mathbb{K}\)-vector space \(V\) and let \(\mathbf{b}=(v_1,\ldots,v_n)\) be any ordered basis of \(V.\) Show that \(\mathbf{M}(\mathrm{Id}_V,\mathbf{b},\mathbf{b})=\mathbf{1}_{n}.\)

Solution

By Definition 3.91 we have \[f_{\mathbf{M}(\mathrm{Id}_V,\mathbf{b},\mathbf{b})} = \boldsymbol{\beta}\circ \mathrm{Id}_V \circ \boldsymbol{\beta}^{-1}=\mathrm{Id}_{\mathbb{K}^n}=f_{\mathbf{1}_{n}}\] and hence \(\mathbf{M}(\mathrm{Id}_V,\mathbf{b},\mathbf{b})=\mathbf{1}_{n}.\)
Exercise 3.110

Show that \(f :\mathcal{X} \to \mathcal{Y}\) admits a left inverse if and only if \(f\) is injective and that \(f : \mathcal{X} \to \mathcal{Y}\) admits a right inverse if and only if \(f\) is surjective.

Solution

We first show the first equivalence: Suppose \(f:\mathcal X\to\mathcal Y\) admits a left inverse, i.e. there exists a map \(g:\mathcal Y\to\mathcal X\) such that \(g\circ f = \mathrm {Id}_{\mathcal X}.\) Let \(x_1,x_2\in\mathcal X\) be such that \(f(x_1)=f(x_2).\) Applying \(g\) to both sides of this equation gives \((g\circ f)(x_1) = (g\circ f)(x_2),\) that is, \(x_1=x_2,\) and hence \(f\) is injective.

Conversely, if \(f\) is injective, then each \(y\in\mathrm{Im}f\) has a unique preimage \(x\in\mathcal X\) such that \(f(x) = y.\) Define \(g:\mathcal Y\to\mathcal X\) by assigning to each \(y\in\mathrm{Im}f\) its unique preimage and by setting \(g(y)=x_0\) for \(y\notin\mathrm{Im}f,\) where \(x_0\in \mathcal X\) is arbitrary. By construction \(g\circ f=\mathrm{Id}_{\mathcal X}.\)

For the second equivalence, suppose \(f:\mathcal X\to\mathcal Y\) admits a right inverse, i.e. there exists a map \(g:\mathcal Y\to\mathcal X\) such that \(f\circ g = \mathrm {Id}_{\mathcal Y}.\) Given any \(y\in\mathcal Y,\) we have \(f(g(y))=(f\circ g)(y) = y.\) Therefore \(g(y)\in \mathcal X\) is an element of \(f^{-1}(y)\) and hence \(f\) is surjective.

Conversely, if \(f\) is surjective, given any \(y\in\mathcal Y,\) the set \(f^{-1}(y)\) is non-empty. We construct \(g:\mathcal Y\to\mathcal X\) by assigning to every \(y\in\mathcal Y\) an element of \(f^{-1}(y)\) which is possible by the axiom of choice. By construction, \(f\circ g=\mathrm{Id}_{\mathcal Y}.\)

Exercise 3.111

Let \(V\) be a finite dimensional \(\mathbb{K}\)-vector space and \(\mathbf{b},\mathbf{b}^{\prime}\) be ordered bases of \(V.\) Show that for all \(v \in V\) we have \[\boldsymbol{\beta}^{\prime}(v)=\mathbf{C}(\mathbf{b},\mathbf{b}^{\prime})\boldsymbol{\beta}(v).\]

Solution

Let \(\mathbf{b}\) and \(\mathbf{b}'\) be two ordered bases of \(V\) with corresponding coordinate systems \(\boldsymbol{\beta}\) and \(\boldsymbol{\beta}'.\) Then, according to Definition 3.103, \[\boldsymbol{\beta}'(v) = (\boldsymbol{\beta}'\circ \boldsymbol{\beta}^{-1}\circ \boldsymbol{\beta})(v) = (\boldsymbol{\beta}'\circ \boldsymbol{\beta}^{-1})(\boldsymbol{\beta}(v)) = f_{\mathbf{C}(\mathbf{b},\mathbf{b}')}( \boldsymbol{\beta}(v))=\mathbf{C}(\mathbf{b},\mathbf{b}')\boldsymbol{\beta}(v).\]
