7 Quotient vector spaces
7.1 Affine mappings and affine spaces
Previously we saw that we can take the sum of subspaces of a vector space. In this final chapter of the Linear Algebra I module we introduce the concept of a quotient of a vector space by a subspace.
Translations are among the simplest non-linear mappings.
Let \(V\) be a \(\mathbb{K}\)-vector space and \(v_0 \in V.\) The mapping \[T_{v_0} : V \to V,\qquad v \mapsto v+v_0\] is called the translation by the vector \(v_0.\)
Notice that for \(v_0\neq 0_V,\) a translation is not linear, since \(T_{v_0}(0_V)=0_V+v_0=v_0\neq 0_V.\)
Taking \(s_1=1\) and \(s_2=-1\) in (3.6), we see that a linear map \(f : V \to W\) between \(\mathbb{K}\)-vector spaces \(V,W\) satisfies \(f(v_1-v_2)=f(v_1)-f(v_2)\) for all \(v_1,v_2 \in V.\) In particular, linear maps are affine maps in the following sense:
A mapping \(f : V \to W\) is called affine if there exists a linear map \(g : V \to W\) so that \(f(v_1)-f(v_2)=g(v_1-v_2)\) for all \(v_1,v_2 \in V.\) We call \(g\) the linear map associated to \(f\).
Affine mappings are compositions of linear mappings and translations:
A mapping \(f : V \to W\) is affine if and only if there exists a linear map \(g : V \to W\) and a translation \(T_{w_0} : W \to W\) so that \(f=T_{w_0}\circ g.\)
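Proof. \(\Leftarrow\) Let \(g : V \to W\) be linear and \(T_{w_0} : W \to W\) be a translation for some vector \(w_0 \in W\) so that \(T_{w_0}(w)=w+w_0\) for all \(w \in W.\) Let \(f=T_{w_0}\circ g\) so that \(f(v)=g(v)+w_0\) for all \(v \in V.\) Then \[f(v_1)-f(v_2)=g(v_1)+w_0-g(v_2)-w_0=g(v_1)-g(v_2)=g(v_1-v_2),\] hence \(f\) is affine.
\(\Rightarrow\) Conversely, let \(f\) be affine with associated linear map \(g\) and set \(w_0=f(0_V).\) Then for all \(v \in V\) we have \[f(v)=f(v)-f(0_V)+w_0=g(v-0_V)+w_0=g(v)+w_0=T_{w_0}(g(v)),\] hence \(f=T_{w_0}\circ g.\)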
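As a small numerical illustration of the decomposition \(f=T_{w_0}\circ g\) (a sketch added for illustration only), the following Python snippet builds an affine map on \(\mathbb{Q}^2\) and checks the defining identity on sample vectors. The matrix `B`, the vector `w0` and the function names `linear_g` and `affine_f` are hypothetical choices, not taken from the notes.

```python
from fractions import Fraction

# Hypothetical data: a 2x2 matrix B defining the linear map g, and a shift w0.
B = [[Fraction(2), Fraction(1)],
     [Fraction(0), Fraction(3)]]
w0 = (Fraction(5), Fraction(-1))

def sub(a, b):
    """Componentwise difference a - b."""
    return tuple(x - y for x, y in zip(a, b))

def linear_g(v):
    """Linear part g(v) = B v."""
    return tuple(sum(row[j] * v[j] for j in range(len(v))) for row in B)

def affine_f(v):
    """Affine map f = T_{w0} o g, that is f(v) = g(v) + w0."""
    return tuple(x + y for x, y in zip(linear_g(v), w0))

v1 = (Fraction(1), Fraction(2))
v2 = (Fraction(-3), Fraction(4))
zero = (Fraction(0), Fraction(0))

# Defining identity of an affine map: f(v1) - f(v2) = g(v1 - v2).
identity_holds = sub(affine_f(v1), affine_f(v2)) == linear_g(sub(v1, v2))

# The translation vector is recovered as w0 = f(0_V).
recovered_w0 = affine_f(zero)

print(identity_holds, recovered_w0 == w0)
```

Exact rational arithmetic via `Fraction` is used so that the equality checks are not affected by floating-point rounding.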
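Recall that a linear map \(f : V \to W\) satisfies \(f(0_V)=0_W.\) Consequently, an affine map \(f=T_{w_0}\circ g\) is linear if and only if \(w_0=f(0_V)=0_W.\)
Recall also that a mapping \(g : \mathbb{K}^m \to \mathbb{K}^n\) is linear if and only if there exists a matrix \(\mathbf{B}\in M_{n,m}(\mathbb{K})\) so that \(g=f_\mathbf{B}.\) Together with the characterisation above, the affine mappings \(\mathbb{K}^m \to \mathbb{K}^n\) are therefore precisely the mappings of the form \(T_{\vec{w}_0}\circ f_\mathbf{B}\) with \(\mathbf{B}\in M_{n,m}(\mathbb{K})\) and \(\vec{w}_0 \in \mathbb{K}^n.\)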
An affine subspace of a \(\mathbb{K}\)-vector space \(V\) is a translation of a subspace by some fixed vector \(v_0.\)
Let \(V\) be a \(\mathbb{K}\)-vector space. An affine subspace of \(V\) is a subset of the form \[U+v_0=\{u+v_0| u \in U\},\] where \(U\subset V\) is a subspace and \(v_0 \in V.\) We call \(U\) the associated vector space to the affine subspace \(U+v_0\) and we say that \(U+v_0\) is parallel to \(U.\)
Let \(V=\mathbb{R}^2\) and \(U=\operatorname{span}\{\vec{e}_1+\vec{e}_2 \}=\left\{s(\vec{e}_1+\vec{e}_2)| s\in \mathbb{R}\right\}\) where here, as usual, \(\{\vec{e}_1,\vec{e}_2\}\) denotes the standard basis of \(\mathbb{R}^2.\) So \(U\) is the line through the origin \(0_{\mathbb{R}^2}\) defined by the equation \(y=x.\) By definition, for all \(\vec{v} \in \mathbb{R}^2\) we have \[U+\vec{v}=\left\{\vec{v}+s\vec{w}| s\in \mathbb{R}\right\},\] where we write \(\vec{w}=\vec{e}_1+\vec{e}_2.\) So for each \(\vec{v} \in \mathbb{R}^2,\) the affine subspace \(U+\vec{v}\) is a line in \(\mathbb{R}^2,\) the translation by the vector \(\vec{v}\) of the line defined by \(y=x.\)
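The example can be made concrete with a short membership test (a minimal sketch with integer coordinates; the function names `in_U` and `in_affine_line` and the sample vector `v` are assumptions made for illustration). A point \(\vec{p}\) lies in \(U+\vec{v}\) exactly when \(\vec{p}-\vec{v}\) lies in \(U,\) that is, when the two coordinates of \(\vec{p}-\vec{v}\) agree.

```python
def in_U(p):
    """p lies in U = span{e1 + e2} exactly when its two coordinates agree."""
    return p[0] == p[1]

def in_affine_line(p, v):
    """p lies in U + v exactly when p - v lies in U."""
    return in_U((p[0] - v[0], p[1] - v[1]))

v = (2, 0)  # hypothetical choice; U + v is then the line y = x - 2

# Every point v + s(e1 + e2) lies on the translated line.
param_points = [(v[0] + s, v[1] + s) for s in range(-3, 4)]
all_on_line = all(in_affine_line(p, v) for p in param_points)

on_line = in_affine_line((5, 3), v)    # (5, 3) = v + 3(e1 + e2)
off_line = in_affine_line((5, 4), v)   # (5, 4) - v = (3, 4) is not in U

print(all_on_line, on_line, off_line)
```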
7.2 Quotient vector spaces
Let \(U\) be a subspace of a \(\mathbb{K}\)-vector space \(V.\) We want to make sense of the notion of dividing \(V\) by \(U.\) It turns out that there is a natural way to do this and moreover, the quotient \(V/U\) again carries the structure of a \(\mathbb{K}\)-vector space. The idea is to define \(V/U\) to be the set of all translations of the subspace \(U,\) that is, we consider the set of subsets \[V/U=\{U+v | v \in V\}.\] We have to define what it means to add affine subspaces \(U+v_1\) and \(U+v_2\) and what it means to scale \(U+v\) by a scalar \(s\in \mathbb{K}.\) Formally, it is tempting to define \(0_{V/U}=U+0_V\) and \[\tag{7.1} (U+v_1)+_{V/U}(U+v_2)=U+(v_1+v_2)\] for all \(v_1,v_2 \in V\) as well as \[\tag{7.2} s\cdot_{V/U}(U+v)=U+(sv)\] for all \(v \in V\) and \(s\in \mathbb{K}.\) However, we have to make sure that these operations are well defined. We do this with the help of the following lemma.
Let \(U\subset V\) be a subspace. Then any vector \(v \in V\) belongs to a unique affine subspace parallel to \(U,\) namely \(U+v.\) In particular, two affine subspaces \(U+v_1\) and \(U+v_2\) are either equal or have empty intersection.
Proof. Since \(0_V \in U,\) we have \(v \in (U+v),\) hence we only need to show that if \(v \in (U+\hat{v})\) for some vector \(\hat{v},\) then \(U+v=U+\hat{v}.\) Assume \(v \in (U+\hat{v})\) so that \(v=u+\hat{v}\) for some vector \(u \in U.\) Suppose \(w \in (U+\hat{v}).\) We need to show that then also \(w \in (U+v).\) Since \(w \in (U+\hat{v})\) we have \(w=\hat{u}+\hat{v}\) for some vector \(\hat{u} \in U.\) Using that \(\hat{v}=v-u,\) we obtain \[w=\hat{u}+v-u=\hat{u}-u+v.\] Since \(U\) is a subspace we have \(\hat{u}-u \in U\) and hence \(w \in (U+v).\)
Conversely, if \(w \in (U+v),\) then \(w=\tilde{u}+v=(\tilde{u}+u)+\hat{v}\) for some vector \(\tilde{u} \in U,\) and since \(\tilde{u}+u \in U\) it follows that \(w \in (U+\hat{v})\) as well. Finally, if two affine subspaces \(U+v_1\) and \(U+v_2\) have a vector \(v\) in common, then both are equal to \(U+v\) and hence to each other.
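One way to use the lemma to check that (7.1) and (7.2) do not depend on the chosen representatives is sketched here; the primed vectors \(v_1',v_2',v'\) are notation introduced only for this check. Suppose \(U+v_1=U+v_1'\) and \(U+v_2=U+v_2'.\) Then \(v_1' \in (U+v_1)\) and \(v_2' \in (U+v_2),\) so that \(v_1'=u_1+v_1\) and \(v_2'=u_2+v_2\) for some vectors \(u_1,u_2 \in U.\) Hence \[v_1'+v_2'=(u_1+u_2)+(v_1+v_2) \in U+(v_1+v_2),\] and the lemma gives \(U+(v_1'+v_2')=U+(v_1+v_2).\) Similarly, if \(U+v=U+v'\) with \(v'=u+v\) for some \(u \in U,\) then for all \(s\in \mathbb{K}\) \[sv'=su+sv \in U+(sv),\] and the lemma gives \(U+(sv')=U+(sv).\) So the right hand sides of (7.1) and (7.2) are the same for every choice of representatives.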
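By the lemma, the operations (7.1) and (7.2) are well defined. Together with the zero vector \(0_{V/U}=U+0_V,\) they turn \(V/U\) into a \(\mathbb{K}\)-vector space, since each axiom for \(V/U\) follows from the corresponding axiom for \(V\) applied to representatives. Recall that a \(\mathbb{K}\)-vector space, or vector space over \(\mathbb{K},\) is a set \(V\) with a distinguished element \(0_V\) (called the zero vector) and two operations \[\begin{aligned} +_V : V \times V \to V& &(v_1,v_2) \mapsto v_1+_Vv_2& &(\text{vector addition}) \end{aligned}\] and \[\begin{aligned} \cdot_V : \mathbb{K}\times V \to V& &(s,v) \mapsto s\cdot_V v& &(\text{scalar multiplication}), \end{aligned}\] so that the following properties hold: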
Commutativity of vector addition \[v_1+_Vv_2=v_2+_Vv_1\quad (\text{for all}\; v_1,v_2 \in V);\]
Associativity of vector addition \[v_1+_V(v_2+_Vv_3)=(v_1+_Vv_2)+_Vv_3 \quad (\text{for all}\; v_1,v_2,v_3 \in V);\]
Identity element of vector addition \[\tag{3.4} 0_V+_Vv=v+_V0_V=v\quad (\text{for all}\; v \in V);\]
Identity element of scalar multiplication \[1\cdot_V v=v\quad (\text{for all}\; v \in V);\]
Scalar multiplication by zero \[\tag{3.5} 0\cdot_{V}v=0_V \quad (\text{for all}\; v \in V);\]
Compatibility of scalar multiplication with field multiplication \[(s_1s_2)\cdot_V v=s_1\cdot_V(s_2\cdot_V v) \quad (\text{for all}\; s_1,s_2 \in \mathbb{K}, v \in V);\]
Distributivity of scalar multiplication with respect to vector addition \[s\cdot_V(v_1+_Vv_2)=s\cdot_Vv_1+_Vs\cdot_V v_2\quad (\text{for all}\; s\in \mathbb{K}, v_1,v_2 \in V);\]
Distributivity of scalar multiplication with respect to field addition \[(s_1+s_2)\cdot_Vv=s_1\cdot_Vv+_Vs_2\cdot_Vv \quad (\text{for all}\; s_1,s_2 \in \mathbb{K}, v \in V).\] The elements of \(V\) are called vectors.
Notice that we have a surjective mapping \[p : V \to V/U, \quad v \mapsto U+v,\] which satisfies \[p(v_1+v_2)=U+(v_1+v_2)=(U+v_1)+_{V/U}(U+v_2)=p(v_1)+_{V/U} p(v_2)\] for all \(v_1,v_2 \in V\) and \[p(sv)=U+(sv)=s\cdot_{V/U}(U+v)=s\cdot_{V/U}p(v)\] for all \(v \in V\) and \(s\in \mathbb{K}.\) Therefore, the mapping \(p\) is linear.
The vector space \(V/U\) is called the quotient (vector) space of \(V\) by \(U\). The linear map \(p : V \to V/U\) is called the canonical surjection from \(V\) to \(V/U.\)
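The operations (7.1) and (7.2) and the linearity of \(p\) can be tried out on a small example. The sketch below assumes \(V=\mathbb{Q}^2\) and \(U=\operatorname{span}\{(1,1)\}\) and stores each coset \(U+v\) through one fixed representative. The class `Coset`, its method `scale` and the sample vectors are hypothetical names and data chosen for illustration, not a general implementation.

```python
from fractions import Fraction

class Coset:
    """The coset U + v in V/U for V = Q^2 and U = span{(1, 1)}.

    Stored via the representative (x - y, 0), which is the same for every
    vector of the coset because subtracting y * (1, 1) stays inside U + v.
    """

    def __init__(self, v):
        x, y = Fraction(v[0]), Fraction(v[1])
        self.rep = (x - y, Fraction(0))

    def __eq__(self, other):
        return self.rep == other.rep

    def __add__(self, other):
        # (U + v1) + (U + v2) = U + (v1 + v2), as in (7.1).
        return Coset((self.rep[0] + other.rep[0], self.rep[1] + other.rep[1]))

    def scale(self, s):
        # s . (U + v) = U + (s v), as in (7.2).
        return Coset((s * self.rep[0], s * self.rep[1]))

def p(v):
    """Canonical surjection p : V -> V/U, v |-> U + v."""
    return Coset(v)

v1, v2, s = (3, 1), (-2, 5), Fraction(7, 2)
v1_alt = (v1[0] + 4, v1[1] + 4)   # another representative of U + v1

# p respects addition and scalar multiplication.
additive = p((v1[0] + v2[0], v1[1] + v2[1])) == p(v1) + p(v2)
homogeneous = p((s * v1[0], s * v1[1])) == p(v1).scale(s)

# The sum of two cosets does not depend on the representative used.
rep_independent = p(v1_alt) + p(v2) == p(v1) + p(v2)

print(additive, homogeneous, rep_independent)
```

Choosing a fixed representative per coset makes equality of cosets a plain comparison; this works here because \(U\) has an obvious complement, and is a design choice of the sketch rather than part of the definition of \(V/U.\)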
The mapping \(p : V \to V/U\) satisfies \[p(v)=0_{V/U}=U+0_V \quad \iff \quad v \in U\] and hence \(\operatorname{Ker}(p) = U.\) This gives:
Suppose the \(\mathbb{K}\)-vector space \(V\) is finite dimensional. Then \(V/U\) is finite dimensional as well and \[\dim(V/U)=\dim(V)-\dim(U).\]
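Indeed, since \(p\) is surjective, \(V/U\) is spanned by the images under \(p\) of a basis of \(V\) and is therefore finite dimensional. Recall the rank-nullity theorem: if \(V,W\) are finite dimensional \(\mathbb{K}\)-vector spaces and \(f : V \to W\) is a linear map, then \[\dim(V)=\dim \operatorname{Ker}(f)+\dim \operatorname{Im}(f)=\operatorname{nullity}(f)+\operatorname{rank}(f).\] Applying this to the canonical surjection \(p,\) for which \(\operatorname{Ker}(p)=U\) and \(\operatorname{Im}(p)=V/U,\) gives \(\dim(V)=\dim(U)+\dim(V/U).\)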
In the case where \(U=V\) we obtain \(V/U=\{0_{V/U}\}.\)
In the case where \(U=\{0_V\}\) we obtain that \(V/U\) is isomorphic to \(V.\)
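For a case between these two extremes, one can return to the earlier example \(V=\mathbb{R}^2\) and \(U=\operatorname{span}\{\vec{e}_1+\vec{e}_2\}\) (a worked sketch added for illustration). For \(\vec{v}=x\vec{e}_1+y\vec{e}_2\) we have \(\vec{v}-(x-y)\vec{e}_1=y(\vec{e}_1+\vec{e}_2) \in U,\) so that \(\vec{v} \in U+(x-y)\vec{e}_1\) and therefore \[U+\vec{v}=U+(x-y)\vec{e}_1=(x-y)\cdot_{V/U}(U+\vec{e}_1).\] Hence \(V/U\) is spanned by the single element \(U+\vec{e}_1,\) which is not the zero vector of \(V/U\) since \(\vec{e}_1 \notin U.\) It follows that \(\dim(V/U)=1=2-1=\dim(V)-\dim(U),\) in agreement with the dimension formula.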
Exercises
Let \(V,W\) be \(\mathbb{K}\)-vector spaces, \(U\subset V\) and \(Z\subset W\) be vector subspaces and \(f : V \to W\) a linear map. Then the image \(f(U)\) is a vector subspace of \(W\) and the preimage \(f^{-1}(Z)\) is a vector subspace of \(V.\)