10.3 Gram–Schmidt orthonormalisation

Using the orthogonal projection onto a subspace, we can now describe an explicit computational algorithm which constructs an orthonormal basis from a given ordered basis \(\mathbf{b}=(v_1,\ldots,v_n)\) of a Euclidean space \((V,\langle\cdot{,}\cdot\rangle).\) This algorithm is known as Gram–Schmidt orthonormalisation.

We first consider the case of a \(3\)-dimensional Euclidean space \((V,\langle\cdot{,}\cdot\rangle)\) equipped with an ordered basis \(\mathbf{b}=(v_1,v_2,v_3).\) We take \[u_1=\frac{v_1}{\Vert v_1\Vert}\] as the first vector of our new orthonormal basis. We then construct from \(v_2\) a vector that is orthogonal to the subspace \(U_1=\operatorname{span}\{u_1\}\): \[w_2=v_2-\Pi_{U_1}^{\perp}(v_2)=v_2-\frac{\langle v_2,u_1\rangle}{\langle u_1,u_1\rangle}u_1=v_2-\langle v_2,u_1\rangle u_1,\] where we use that \(\langle u_1,u_1\rangle=1.\) As our second basis vector we can thus take \[u_2=\frac{w_2}{\Vert w_2\Vert}.\] We then define \(U_2=\operatorname{span}\{u_1,u_2\}\) and set \[w_3=v_3-\Pi_{U_2}^{\perp}(v_3)=v_3-\frac{\langle v_3,u_1\rangle}{\langle u_1,u_1\rangle}u_1-\frac{\langle v_3,u_2\rangle}{\langle u_2,u_2\rangle}u_2=v_3-\langle v_3,u_1\rangle u_1-\langle v_3,u_2\rangle u_2.\] As our third basis vector we can thus take \[u_3=\frac{w_3}{\Vert w_3\Vert}.\] Setting \(\mathbf{b}^{\prime}=(u_1,u_2,u_3),\) we have obtained an orthonormal basis \(\mathbf{b}^{\prime}\) of \((V,\langle\cdot{,}\cdot\rangle).\)
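The general procedure is short enough to state as an actual program. The following Python sketch (using NumPy and assuming \(V=\mathbb{R}^n\) with the standard scalar product; the function name gram_schmidt and the tolerance used to detect linear dependence are our own choices) performs one projection-and-normalisation step per input vector:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalise an ordered basis (v_1, ..., v_n) of R^n with respect
    to the standard scalar product: a sketch of Gram-Schmidt
    orthonormalisation."""
    basis = []
    for v in vectors:
        # w_i = v_i minus the orthogonal projection onto span{u_1, ..., u_{i-1}}
        w = v - sum(np.dot(v, u) * u for u in basis)
        norm = np.linalg.norm(w)
        if norm < 1e-12:  # tolerance chosen ad hoc for this sketch
            raise ValueError("input vectors appear to be linearly dependent")
        basis.append(w / norm)
    return basis
```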

Example 10.21

We consider \(V=\mathbb{R}^3\) with the standard scalar product \(\langle\cdot{,}\cdot\rangle\) and the ordered basis \(\mathbf{b}=(\vec{v}_1,\vec{v}_2,\vec{v}_3),\) where \[\vec{v}_1=\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \qquad \vec{v}_2=\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix},\qquad \vec{v}_3=\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}.\] We apply Gram–Schmidt orthonormalisation to \(\mathbf{b}.\) We obtain \[\vec{u}_1=\frac{\vec{v}_1}{\Vert\vec{v}_1 \Vert}=\frac{1}{\sqrt{3}}\begin{pmatrix} 1 \\ 1 \\ 1\end{pmatrix}\] and \[\vec{w}_2=\vec{v}_2-\langle \vec{v}_2,\vec{u}_1\rangle \vec{u}_1=\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}-\frac{2}{\sqrt{3}}\frac{1}{\sqrt{3}}\begin{pmatrix} 1 \\ 1 \\ 1\end{pmatrix}=\frac{1}{3}\begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}\] so that \[\vec{u}_2=\frac{\vec{w}_2}{\Vert \vec{w}_2\Vert}=\begin{pmatrix} \frac{1}{\sqrt{6}} \\ -\frac{\sqrt{2}}{\sqrt{3}} \\ \frac{1}{\sqrt{6}}\end{pmatrix}.\] Likewise, \[\begin{aligned} \vec{w}_3&=\vec{v}_3-\langle \vec{v}_3,\vec{u}_1\rangle \vec{u}_1-\langle \vec{v}_3,\vec{u}_2\rangle \vec{u}_2\\ &=\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}-\frac{2}{\sqrt{3}}\frac{1}{\sqrt{3}}\begin{pmatrix} 1 \\ 1 \\ 1\end{pmatrix}-\left(-\frac{1}{\sqrt{6}}\right)\begin{pmatrix} \frac{1}{\sqrt{6}} \\ -\frac{\sqrt{2}}{\sqrt{3}} \\ \frac{1}{\sqrt{6}}\end{pmatrix}=\frac{1}{2}\begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} \end{aligned}\] so that \[\vec{u}_3=\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}\] and we have indeed \(\langle \vec{u}_i,\vec{u}_j\rangle=\delta_{ij}\) for \(1\leqslant i,j\leqslant 3.\) Hence the ordered basis \(\mathbf{b}^{\prime}=(\vec{u}_1,\vec{u}_2,\vec{u}_3)\) is orthonormal with respect to \(\langle\cdot{,}\cdot\rangle.\)
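We can reproduce this computation numerically with the gram_schmidt sketch from above:

```python
import numpy as np

v1 = np.array([1., 1., 1.])
v2 = np.array([1., 0., 1.])
v3 = np.array([1., 1., 0.])

u1, u2, u3 = gram_schmidt([v1, v2, v3])
U = np.column_stack([u1, u2, u3])

# <u_i, u_j> = delta_ij is equivalent to U^T U being the identity matrix
print(np.allclose(U.T @ U, np.eye(3)))  # True
print(u3)                               # approximately (1, 0, -1)/sqrt(2)
```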

The careful reader might object that we have not argued above that \(\mathbf{b}^{\prime}\) is indeed well defined and an ordered basis. This is however the case:

Theorem 10.22 • Gram–Schmidt orthonormalisation

Let \((V,\langle\cdot{,}\cdot\rangle)\) be an \(n\)-dimensional Euclidean space and \(\mathbf{b}=(v_1,\ldots,v_n)\) an ordered basis of \(V.\) Set \(u_1=v_1/\Vert v_1\Vert\) and for \(2\leqslant i\leqslant n\) define recursively \[w_i=v_i-\Pi^{\perp}_{U_{i-1}}(v_i)\qquad \text{and}\qquad u_i=\frac{w_i}{\Vert w_i\Vert},\] where \(U_{i-1}=\operatorname{span}\{u_1,\ldots ,u_{i-1}\}.\) Then \(\mathbf{b}^{\prime}=(u_1,\ldots,u_n)\) is well defined and an orthonormal ordered basis of \(V.\) Moreover, \(\mathbf{b}^{\prime}\) is the unique orthonormal ordered basis of \(V\) such that the change of basis matrix \(\mathbf{C}(\mathbf{b}^{\prime},\mathbf{b})\) is an upper triangular matrix with positive diagonal entries.

Proof. We will use induction on the dimension \(n\) of the Euclidean space \((V,\langle\cdot{,}\cdot\rangle).\) In the case where \(\dim V=1\) we have a single basis vector \(v_1\neq 0_V.\) We set \(u_1=v_1/\Vert v_1\Vert.\) Then \(\mathbf{b}^{\prime}=(u_1)\) is an ordered basis of \(V\) which is orthonormal. The change of basis matrix is \(\mathbf{C}(\mathbf{b}^{\prime},\mathbf{b})=(1/\Vert v_1\Vert)\) and hence is an upper triangular matrix with positive diagonal entries. The only other orthonormal ordered basis of \(V\) is \((-u_1),\) but its change of basis matrix has a negative diagonal entry. This anchors the induction.

Inductive step: Suppose \(n\geqslant 2\) and that the statement is true for every \((n-1)\)-dimensional Euclidean space. Let \((V,\langle\cdot{,}\cdot\rangle)\) be an \(n\)-dimensional Euclidean space and \(\mathbf{b}=(v_1,\ldots,v_n)\) an ordered basis of \(V.\) Consider the subspace \(U_{n-1}=\operatorname{span}\{v_1,\ldots,v_{n-1}\}\) of dimension \(n-1,\) for which \(\mathbf{c}=(v_1,\ldots,v_{n-1})\) is an ordered basis. By the induction hypothesis, there exists a unique ordered basis \(\mathbf{c}^{\prime}=(u_1,\ldots,u_{n-1})\) of \(U_{n-1}\) which is orthonormal and such that the change of basis matrix \(\mathbf{C}(\mathbf{c}^{\prime},\mathbf{c})\) is an upper triangular matrix with positive diagonal entries. Set \(w_n=v_{n}-\Pi_{U_{n-1}}^{\perp}(v_{n})\) so that \(w_n \in U_{n-1}^{\perp}.\) Since \(\mathbf{b}\) is a basis we have \(v_n \notin U_{n-1},\) so Remark 10.19 implies that \(w_n\neq 0_V,\) and we conclude that \(\{u_1,\ldots,u_{n-1},w_n\}\) is orthogonal and hence, by Remark 10.29 below, linearly independent. Let \(u_{n}=w_n/\Vert w_n\Vert;\) then \(\mathbf{b}^{\prime}=(u_1,\ldots,u_n)\) is an ordered basis of \(V\) which is orthonormal. By definition, we have \[u_{n}=\frac{v_{n}-\Pi_{U_{n-1}}^{\perp}(v_{n})}{\Vert v_{n}-\Pi_{U_{n-1}}^{\perp}(v_{n})\Vert}=\frac{v_{n}}{\Vert v_{n}-\Pi_{U_{n-1}}^{\perp}(v_{n})\Vert}+\sum_{i=1}^{n-1}s_i v_i\] for suitable scalars \(s_1,\ldots,s_{n-1},\) since \(\Pi_{U_{n-1}}^{\perp}(v_n)\in U_{n-1}=\operatorname{span}\{v_1,\ldots,v_{n-1}\}.\) Writing \(\vec{s}=(s_i)_{1\leqslant i\leqslant n-1},\) the change of basis matrix thus takes the form \[\mathbf{C}(\mathbf{b}^{\prime},\mathbf{b})=\begin{pmatrix} \mathbf{C}(\mathbf{c}^{\prime},\mathbf{c}) & \vec{s} \\ 0_{\mathbb{R}_{n-1}} & \frac{1}{\Vert v_{n}-\Pi_{U_{n-1}}^{\perp}(v_{n})\Vert}\end{pmatrix}.\] Since \(\mathbf{C}(\mathbf{c}^{\prime},\mathbf{c})\) is an upper triangular matrix with positive diagonal entries, it follows that \(\mathbf{C}(\mathbf{b}^{\prime},\mathbf{b})\) is an upper triangular matrix with positive diagonal entries as well. Finally, we argue that \(\mathbf{b}^{\prime}\) is the unique ordered basis of \(V\) satisfying the conditions of the theorem. Notice that \(u_{n}\) must be an element of \(U^{\perp}_{n-1}.\) Now \(\dim V=n\) and \(\dim U_{n-1}=n-1,\) and since \(V=U_{n-1}\oplus U^{\perp}_{n-1}\) by Corollary 9.24, we must have \(\dim U_{n-1}^{\perp}=1\) by Remark 10.16. This implies that \(u_{n}\) is uniquely determined up to multiplication by \(\pm 1,\) but the above choice is the only one resulting in a change of basis matrix with positive diagonal entries. Since by the induction hypothesis the basis \(\mathbf{c}^{\prime}\) is unique, it follows that \(\mathbf{b}^{\prime}\) is the unique ordered basis of \(V\) satisfying the conditions of the theorem.
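For \(V=\mathbb{R}^n\) with the standard scalar product, the theorem is easy to observe numerically: if \(\mathbf{V}\) and \(\mathbf{U}\) are the matrices with columns \(v_1,\ldots,v_n\) and \(u_1,\ldots,u_n\) respectively, then \(\mathbf{U}=\mathbf{V}\,\mathbf{C}(\mathbf{b}^{\prime},\mathbf{b}),\) so \(\mathbf{C}(\mathbf{b}^{\prime},\mathbf{b})\) is obtained by solving a linear system. A short check, reusing the gram_schmidt sketch from above on a random basis of \(\mathbb{R}^4\):

```python
import numpy as np

rng = np.random.default_rng(0)
V = rng.standard_normal((4, 4))               # columns: a random ordered basis of R^4
U = np.column_stack(gram_schmidt(list(V.T)))  # columns: the orthonormalised basis

C = np.linalg.solve(V, U)                     # C(b', b): solves V @ C = U
print(np.allclose(C, np.triu(C)))             # True: upper triangular
print(np.all(np.diag(C) > 0))                 # True: positive diagonal entries
```

Readers who have seen numerical linear algebra may recognise the inverse relation \(\mathbf{V}=\mathbf{U}\,\mathbf{C}(\mathbf{b},\mathbf{b}^{\prime})\) as a QR decomposition of \(\mathbf{V}\); Gram–Schmidt orthonormalisation is one way of computing it.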
It follows from Theorem 3.64 that an ordered basis of a subspace \(U\) of a finite dimensional vector space \(V\) can always be extended to a basis of \(V.\) A corresponding statement is also true for orthonormal bases:
Corollary 10.23

Let \((V,\langle\cdot{,}\cdot\rangle)\) be a finite dimensional Euclidean space and \(U\subset V\) a subspace. If \(\mathbf{b}\) is an ordered orthonormal basis of \(U,\) then there exists an ordered orthonormal basis of \(V\) which contains \(\mathbf{b}.\)

Proof. Let \(k=\dim U,\) \(n=\dim V\) and \(\mathbf{b}=(v_1,\ldots,v_k).\) Choose any ordered basis \(\mathbf{c}\) of \(U^{\perp}\) and apply Gram–Schmidt orthonormalisation to \(\mathbf{c}\) to obtain an orthonormal ordered basis \(\mathbf{b}^{\prime}=(v_{k+1},\ldots,v_{n})\) of \(U^{\perp}.\) Since \(V=U\oplus U^{\perp}\) by Corollary 9.24, the concatenation \((v_1,\ldots,v_n)\) is an ordered basis of \(V,\) and since all vectors of \(U\) are orthogonal to all vectors of \(U^{\perp},\) it is an orthonormal ordered basis of \((V,\langle\cdot{,}\cdot\rangle).\)
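The proof is constructive, and a small variant is convenient in code: instead of first choosing a basis of \(U^{\perp},\) one may orthonormalise the standard basis vectors against the columns collected so far and discard those that are (numerically) already in the span. A sketch under that modification, with the function name and tolerance chosen ad hoc; u1 and u2 are the vectors from the Example 10.21 snippet above:

```python
import numpy as np

def extend_orthonormal(B, tol=1e-12):
    """Extend the orthonormal columns of B to an orthonormal basis of R^n.

    Variant of the proof of Corollary 10.23: rather than picking a basis
    of the orthogonal complement first, we run the Gram-Schmidt step on
    the standard basis vectors and skip any vector whose projection onto
    the complement of the span collected so far vanishes."""
    basis = [B[:, j] for j in range(B.shape[1])]
    for e in np.eye(B.shape[0]):
        w = e - sum(np.dot(e, u) * u for u in basis)
        if np.linalg.norm(w) > tol:
            basis.append(w / np.linalg.norm(w))
    return np.column_stack(basis)

Q = extend_orthonormal(np.column_stack([u1, u2]))
print(np.allclose(Q.T @ Q, np.eye(3)))  # True
```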

Notice that if we carry out the Gram–Schmidt procedure without normalising the vectors \(w_i\) at each step – sometimes referred to as Gram–Schmidt orthogonalisation – then we still obtain an ordered orthogonal basis \((w_1,\ldots,w_n).\)

Example 10.24 • Legendre polynomials

We consider again the vector space \(V=\mathsf{C}([-1,1],\mathbb{R})\) of continuous real-valued functions defined on the interval \([-1,1],\) equipped with the bilinear form defined by the rule \[\langle f,g\rangle=\int_{-1}^1 f(x)g(x)\mathrm{d}x\] for all \(f,g \in V.\) For \(n \in \mathbb{N}\cup\{0\}\) let \(U_n\) denote the subspace of \(V\) consisting of polynomials of degree at most \(n.\) An ordered basis of \(U_n\) is given by the polynomials \(\mathbf{b}=(1,x,x^2,x^3,\ldots,x^n).\) Applying Gram–Schmidt orthogonalisation we obtain an ordered orthogonal basis \((p_0,p_1,\ldots,p_n)\) of \(U_n.\) That is, for \(i\neq j,\) the polynomials satisfy \[\langle p_i,p_j\rangle=\int_{-1}^1 p_i(x)p_j(x)\mathrm{d}x=0.\] The polynomials \(p_i\) are known as the Legendre polynomials. There are different ways to normalise the Legendre polynomials. Besides the standard normalisation which makes the polynomials orthonormal, that is, \(\langle p_i,p_i\rangle=1,\) it is also common to require that \(\langle p_i,p_i\rangle=2/(2i+1).\) The reason for this normalisation is that it leads to a neat formula for \(p_i\) known as Rodrigues’ formula (which we will not prove) \[p_i(x)=\frac{1}{2^i i!} \frac{\mathrm{d}^i}{\mathrm{d}x^i}(x^2-1)^i,\] where \(\frac{\mathrm{d}^i}{\mathrm{d}x^i}\) stands for the \(i\)-th derivative with respect to the variable \(x.\) Using this formula we obtain for the first four Legendre polynomials \[\begin{aligned} p_0(x)&=\frac{1}{2^0 0!} \frac{\mathrm{d}^0}{\mathrm{d}x^0}(x^2-1)^0=(x^2-1)^0=1,\\ p_1(x)&=\frac{1}{2^1 1!} \frac{\mathrm{d}}{\mathrm{d}x}(x^2-1)^1=\frac{1}{2}(2x)=x,\\ p_2(x)&=\frac{1}{2^2 2!} \frac{\mathrm{d}^2}{\mathrm{d}x^2}(x^2-1)^2=\frac{1}{8}\frac{\mathrm{d}^2}{\mathrm{d}x^2}(x^4-2x^2+1)=\frac{1}{2}(3x^2-1),\\ p_3(x)&=\frac{1}{2^3 3!} \frac{\mathrm{d}^3}{\mathrm{d}x^3}(x^2-1)^3=\frac{1}{48}\frac{\mathrm{d}^3}{\mathrm{d}x^3}(x^6-3x^4+3x^2-1)=\frac{1}{2}(5x^3-3x). \end{aligned}\]
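These polynomials can also be produced by carrying out the Gram–Schmidt orthogonalisation described above, with a computer algebra system evaluating the integrals. A sketch using SymPy; since we do not impose the normalisation \(\langle p_i,p_i\rangle=2/(2i+1),\) the output agrees with the Rodrigues polynomials only up to positive scalar factors:

```python
import sympy as sp

x = sp.symbols('x')

def inner(f, g):
    """The inner product <f, g>: the integral of f*g over [-1, 1]."""
    return sp.integrate(f * g, (x, -1, 1))

basis = []
for v in [sp.Integer(1), x, x**2, x**3]:
    # subtract the projection onto the previously constructed polynomials
    w = v - sum(inner(v, p) / inner(p, p) * p for p in basis)
    basis.append(sp.expand(w))

print(basis)  # [1, x, x**2 - 1/3, x**3 - 3*x/5]
```

Indeed \(x^2-\tfrac{1}{3}=\tfrac{2}{3}\,p_2(x)\) and \(x^3-\tfrac{3}{5}x=\tfrac{2}{5}\,p_3(x).\)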

Gram–Schmidt orthonormalisation (Theorem 10.22) has a matrix version known as the Cholesky decomposition. In order to state it, we make the following definition.
Definition 10.25 • Positive definite matrix

Let \(n \in \mathbb{N}\) and \(\mathbf{A}\in M_{n,n}(\mathbb{R}).\) The matrix \(\mathbf{A}\) is called positive definite if the bilinear form \(\langle\cdot{,}\cdot\rangle_\mathbf{A}\) on \(\mathbb{R}^n\) is positive definite.
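Concretely, \(\langle \vec x,\vec y\rangle_{\mathbf{A}}=\vec x^T\mathbf{A}\vec y,\) and positive definiteness can be tested numerically. The following sketch relies on the standard fact (which we do not prove here) that a symmetric real matrix is positive definite if and only if all of its eigenvalues are positive; the test matrix is the one from Exercise 10.31 below, and the tolerance is an ad hoc choice:

```python
import numpy as np

def is_positive_definite(A, tol=1e-12):
    """Check that A is symmetric and that <x, x>_A = x^T A x > 0 for all
    x != 0, via the eigenvalue criterion for symmetric matrices."""
    return np.allclose(A, A.T) and bool(np.all(np.linalg.eigvalsh(A) > tol))

A = np.array([[3., 0., -1.],
              [0., 8., 4.],
              [-1., 4., 3.]])
print(is_positive_definite(A))  # True
```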

Theorem 10.26 • Cholesky decomposition

Let \(n \in \mathbb{N}\) and \(\mathbf{A}\in M_{n,n}(\mathbb{R})\) be a symmetric positive definite matrix. Then there exists a unique upper triangular matrix \(\mathbf{C}\in M_{n,n}(\mathbb{R})\) with positive diagonal entries such that \(\mathbf{A}=\mathbf{C}^T\mathbf{C}.\)

Proof. Since \(\mathbf{A}\) is positive definite and symmetric, \(\langle\cdot{,}\cdot\rangle_\mathbf{A}\) is an inner product on \(\mathbb{R}^n.\) Let \(\mathbf{e}=(\vec{e}_1,\ldots,\vec{e}_n)\) denote the standard ordered basis of \(\mathbb{R}^n.\) Recall that we have \(\mathbf{M}(\langle\cdot{,}\cdot\rangle_\mathbf{A},\mathbf{e})=\mathbf{A}.\) Theorem 10.22 implies the existence of a unique ordered basis \(\mathbf{b}^{\prime}\) of \(\mathbb{R}^n\) which is orthonormal with respect to \(\langle\cdot{,}\cdot\rangle_\mathbf{A}\) and such that \(\mathbf{C}(\mathbf{b}^{\prime},\mathbf{e})\) is an upper triangular matrix with positive diagonal entries. Therefore, using Proposition 9.6, we obtain \[\mathbf{A}=\mathbf{M}(\langle\cdot{,}\cdot\rangle_\mathbf{A},\mathbf{e})=\mathbf{C}^T\mathbf{M}(\langle\cdot{,}\cdot\rangle_\mathbf{A},\mathbf{b}^{\prime})\mathbf{C}=\mathbf{C}^T\mathbf{C},\] where \(\mathbf{C}=\mathbf{C}(\mathbf{e},\mathbf{b}^{\prime})\) and where we use that the matrix representation of an inner product with respect to an orthonormal basis is the identity matrix. By Remark 3.104 we have \(\mathbf{C}(\mathbf{e},\mathbf{b}^{\prime})=\mathbf{C}(\mathbf{b}^{\prime},\mathbf{e})^{-1}\) and hence Corollary 5.46 implies that \(\mathbf{C}\) is an upper triangular matrix as well. Now for \(2\leqslant i\leqslant n-1,\) we have (the cases \(i=1\) and \(i=n\) are similar) \[\begin{aligned} 1&=[\mathbf{C}\mathbf{C}^{-1}]_{ii}=\sum_{k=1}^n [\mathbf{C}]_{ik}[\mathbf{C}^{-1}]_{ki}=[\mathbf{C}]_{ii}[\mathbf{C}^{-1}]_{ii}+\sum_{k=1}^{i-1}[\mathbf{C}]_{ik}[\mathbf{C}^{-1}]_{ki}+\sum_{k=i+1}^n [\mathbf{C}]_{ik}[\mathbf{C}^{-1}]_{ki}\\ &=[\mathbf{C}]_{ii}[\mathbf{C}^{-1}]_{ii}, \end{aligned}\] where we use that \(\mathbf{C}\) and \(\mathbf{C}^{-1}\) are upper triangular matrices. It follows that \([\mathbf{C}]_{ii}\) has the same sign as \([\mathbf{C}^{-1}]_{ii}\) for \(1\leqslant i\leqslant n.\) Since \(\mathbf{C}^{-1}=\mathbf{C}(\mathbf{b}^{\prime},\mathbf{e})\) has positive diagonal entries, we conclude that \(\mathbf{C}\) has positive diagonal entries as well. Suppose that \(\hat{\mathbf{C}} \in M_{n,n}(\mathbb{R})\) is another upper triangular matrix with positive diagonal entries such that \(\mathbf{A}=\hat{\mathbf{C}}^T\hat{\mathbf{C}}.\) Using Lemma 3.108 we conclude that there exists an ordered basis \(\mathbf{c}^{\prime}\) of \(\mathbb{R}^n\) such that \(\hat{\mathbf{C}}=\mathbf{C}(\mathbf{e},\mathbf{c}^{\prime}).\) Since \(\mathbf{A}=\hat{\mathbf{C}}^T\hat{\mathbf{C}},\) the basis \(\mathbf{c}^{\prime}\) is orthonormal with respect to \(\langle\cdot{,}\cdot\rangle_\mathbf{A},\) and by the sign argument above \(\mathbf{C}(\mathbf{c}^{\prime},\mathbf{e})=\hat{\mathbf{C}}^{-1}\) is upper triangular with positive diagonal entries. The uniqueness statement of Theorem 10.22 implies that \(\mathbf{c}^{\prime}=\mathbf{b}^{\prime}\) and hence \(\mathbf{C}=\hat{\mathbf{C}}.\)
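The proof is constructive: orthonormalise the standard basis with respect to \(\langle\cdot{,}\cdot\rangle_{\mathbf{A}}\) and invert the resulting change of basis matrix. A sketch follows; by the uniqueness statement, the output must agree with NumPy's library routine numpy.linalg.cholesky, which returns the lower triangular factor \(\mathbf{L}\) with \(\mathbf{A}=\mathbf{L}\mathbf{L}^T,\) so that \(\mathbf{C}=\mathbf{L}^T:\)

```python
import numpy as np

def cholesky_via_gram_schmidt(A):
    """Compute the upper triangular C with positive diagonal entries and
    A = C^T C, following the proof of Theorem 10.26."""
    inner = lambda x, y: x @ A @ y
    basis = []
    for e in np.eye(A.shape[0]):
        # Gram-Schmidt step with respect to <x, y>_A = x^T A y
        w = e - sum(inner(e, u) * u for u in basis)
        basis.append(w / np.sqrt(inner(w, w)))
    B = np.column_stack(basis)   # B = C(b', e): upper triangular, positive diagonal
    return np.linalg.inv(B)      # C = C(e, b') = C(b', e)^{-1}

A = np.array([[3., 0., -1.], [0., 8., 4.], [-1., 4., 3.]])
C = cholesky_via_gram_schmidt(A)
print(np.allclose(C.T @ C, A))                  # True
print(np.allclose(C, np.linalg.cholesky(A).T))  # True: agrees with the library routine
```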
Remark 10.27

Notice that every invertible matrix \(\mathbf{C}\in M_{n,n}(\mathbb{R})\) gives rise to a symmetric positive definite matrix \(\mathbf{A}=\mathbf{C}^T\mathbf{C}.\) Indeed, by Remark 2.18 we have \(\mathbf{A}^T=(\mathbf{C}^T\mathbf{C})^T=\mathbf{C}^T(\mathbf{C}^T)^T=\mathbf{C}^T\mathbf{C}=\mathbf{A}\) so that \(\mathbf{A}\) is symmetric. Using Remark 2.18 again, we obtain for all \(\vec{x}\in \mathbb{R}^n\) \[\langle \vec{x},\vec{x}\rangle_\mathbf{A}=\vec{x}^T\mathbf{C}^T\mathbf{C}\vec{x}=(\mathbf{C}\vec{x})^T\mathbf{C}\vec{x}=\langle \mathbf{C}\vec{x},\mathbf{C}\vec{x}\rangle,\] where the bilinear form on the right hand side denotes the standard scalar product on \(\mathbb{R}^n.\) In particular this implies that \(\langle\cdot{,}\cdot\rangle_\mathbf{A}\) is positive. Since the standard scalar product on \(\mathbb{R}^n\) is positive definite, the last expression is \(0\) if and only if \(\mathbf{C}\vec{x}=0_{\mathbb{R}^n}.\) Since \(\mathbf{C}\) is invertible this condition is equivalent to \(\vec{x}=0_{\mathbb{R}^n}.\) It follows that \(\langle\cdot{,}\cdot\rangle_\mathbf{A}\) is positive definite as well.

Finally, we observe that the coordinate representation of a vector with respect to an orthonormal basis can be computed easily:

Remark 10.28 • Coordinate representation with respect to an orthonormal basis

Let \((V,\langle\cdot{,}\cdot\rangle)\) be a finite dimensional Euclidean space equipped with an ordered orthonormal basis \(\mathbf{b}=(v_1,\ldots,v_n)\) with corresponding linear coordinate system \(\boldsymbol{\beta}.\) Then for all \(v \in V\) we have \[\boldsymbol{\beta}(v)=\begin{pmatrix} \langle v,v_1\rangle \\ \vdots \\ \langle v,v_n\rangle\end{pmatrix} \qquad \iff \qquad v=\sum_{i=1}^n \langle v,v_i\rangle v_i.\] Indeed, since \(\mathbf{b}\) is a basis we can write \(v=\sum_{i=1}^n s_i v_i\) for unique real numbers \(s_i,\) where \(1\leqslant i\leqslant n.\) Using that \(\langle v_i,v_j\rangle=0\) for \(i\neq j\) and that \(\langle v_i,v_i\rangle=1,\) we obtain \[\langle v,v_j\rangle=\Big\langle\sum_{i=1}^n s_i v_i,v_j\Big\rangle=\sum_{i=1}^n s_i\langle v_i,v_j\rangle=s_j.\] Correspondingly, for all \(v \in V\) we obtain the following formula for the length of \(v\): \[\begin{aligned} \Vert v\Vert&=\Big\Vert\sum_{i=1}^n\langle v,v_i\rangle v_i\Big\Vert=\sqrt{\Big\langle \sum_{i=1}^n\langle v,v_i\rangle v_i,\sum_{j=1}^n\langle v,v_j\rangle v_j\Big\rangle}\\ &=\sqrt{\sum_{i=1}^n\sum_{j=1}^n\langle v,v_i\rangle\langle v,v_j\rangle\langle v_i,v_j\rangle}=\sqrt{\sum_{i=1}^n\langle v,v_i\rangle^2}. \end{aligned}\]
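For \(\mathbb{R}^n\) with the standard scalar product, the remark says that the coordinates with respect to an orthonormal basis are computed by a single matrix-vector product. Continuing the numerical session from Example 10.21 (the vector v below is an arbitrary choice):

```python
import numpy as np

U = np.column_stack([u1, u2, u3])   # orthonormal basis from Example 10.21
v = np.array([2., -1., 5.])

beta = U.T @ v                      # coordinates beta(v)_i = <v, u_i>
print(np.allclose(U @ beta, v))     # True: v = sum_i <v, u_i> u_i
print(np.allclose(np.linalg.norm(v), np.sqrt(np.sum(beta**2))))  # True
```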

Remark 10.29 • Linear independence of orthogonal vectors

Let \((V,\langle\cdot{,}\cdot\rangle)\) be a Euclidean space and \(\{u_1,\ldots,u_k\}\) be non-zero orthogonal vectors, so that \(\langle u_i,u_j\rangle=0\) for all \(1\leqslant i,j\leqslant k\) with \(i\neq j.\) Suppose we have scalars \(s_1,\ldots,s_k \in \mathbb{R}\) such that \(\sum_{j=1}^k s_j u_j=0_V.\) Then, taking the inner product with \(u_i\) gives \[0=\langle 0_V,u_i\rangle=\sum_{j=1}^k s_j\langle u_j,u_i\rangle=s_i\langle u_i,u_i\rangle.\] Since by assumption \(u_i\neq 0_V,\) we have \(\langle u_i,u_i\rangle \neq 0\) and hence \(s_i=0.\) It follows that \(\{u_1,\ldots,u_k\}\) is linearly independent.

Example 10.30 • Example 10.21 continued

If we want to compute \(\mathbf{C}(\mathbf{b},\mathbf{b}^{\prime}),\) we need to compute \(\boldsymbol{\beta}^{\prime}(v_i)\) for \(i=1,2,3\) and write the resulting vectors into the columns of \(\mathbf{C}(\mathbf{b},\mathbf{b}^{\prime}).\) Since \(\mathbf{b}^{\prime}\) is orthonormal, the preceding remark gives \[\boldsymbol{\beta}^{\prime}(v_1)=\begin{pmatrix} \langle v_1,u_1\rangle \\ \langle v_1,u_2\rangle \\ \langle v_1,u_3\rangle\end{pmatrix}=\begin{pmatrix} \sqrt{3} \\ 0 \\ 0 \end{pmatrix}.\] Likewise we have \[\boldsymbol{\beta}^{\prime}(v_2)=\begin{pmatrix} \langle v_2,u_1\rangle \\ \langle v_2,u_2\rangle \\ \langle v_2,u_3\rangle\end{pmatrix}=\begin{pmatrix} \frac{2}{\sqrt{3}} \\ \frac{\sqrt{2}}{\sqrt{3}} \\ 0 \end{pmatrix}\] and \[\boldsymbol{\beta}^{\prime}(v_3)=\begin{pmatrix} \langle v_3,u_1\rangle \\ \langle v_3,u_2\rangle \\ \langle v_3,u_3\rangle\end{pmatrix}=\begin{pmatrix} \frac{2}{\sqrt{3}} \\ -\frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{2}} \end{pmatrix}\] so that \[\mathbf{C}(\mathbf{b},\mathbf{b}^{\prime})=\begin{pmatrix} \sqrt{3} & \frac{2}{\sqrt{3}} & \frac{2}{\sqrt{3}} \\ 0 & \frac{\sqrt{2}}{\sqrt{3}} & -\frac{1}{\sqrt{6}} \\ 0 & 0 & \frac{1}{\sqrt{2}}\end{pmatrix}.\] The proof of Theorem 10.26 implies that \(\mathbf{C}(\mathbf{b}^{\prime},\mathbf{b})=\mathbf{C}(\mathbf{b},\mathbf{b}^{\prime})^{-1}\) is an upper triangular matrix with positive diagonal entries as well, as predicted by Theorem 10.22.
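Continuing the numerical session from Example 10.21 once more, both matrices can be confirmed in a few lines:

```python
import numpy as np

V = np.column_stack([v1, v2, v3])
U = np.column_stack([u1, u2, u3])

C_b_bp = U.T @ V                 # C(b, b'): column i is beta'(v_i)
C_bp_b = np.linalg.inv(C_b_bp)   # C(b', b)
print(np.allclose(C_bp_b, np.triu(C_bp_b)))  # True: upper triangular
print(np.all(np.diag(C_bp_b) > 0))           # True: positive diagonal entries
```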

Exercises

Exercise 10.31

Compute the Cholesky decomposition of the positive definite symmetric matrix \[\mathbf{A}=\begin{pmatrix} 3 & 0 & -1 \\ 0 & 8 & 4 \\ -1 & 4 & 3 \end{pmatrix}.\]

Solution

Since \(\mathbf{A}\) is positive definite and symmetric, we start by finding the unique ordered basis \(\mathbf{b}=(\vec v_1,\vec v_2,\vec v_3)\) of \(\mathbb{R}^3\) which is orthonormal with respect to \(\langle\cdot{,}\cdot\rangle_\mathbf{A}\) such that \(\mathbf{C}(\mathbf{b},\mathbf{e})\) is upper triangular with positive entries on the diagonal. Normalising \(\vec e_1\) yields \[\vec v_1 = \begin{pmatrix}\frac{\sqrt 3}{3}\\0\\0\end{pmatrix}.\] Since \(\vec e_2\) is already \(\mathbf{A}\)-orthogonal to \(\vec v_1,\) we only need to normalise to obtain \[\vec v_2=\begin{pmatrix}0\\\frac{\sqrt{2}}{4}\\0\end{pmatrix}.\] Since \[\vec e_3 - \langle \vec e_3,\vec v_2\rangle_\mathbf{A}\vec v_2 - \langle \vec e_3,\vec v_1\rangle_\mathbf{A}\vec v_1 = \begin{pmatrix}\frac13\\-\frac12\\1\end{pmatrix},\] we find after normalising \[\vec v_3 = \frac{\sqrt 6}{2}\begin{pmatrix} \frac13 \\ -\frac12 \\ 1\end{pmatrix}.\] This implies that \[\mathbf{C}(\mathbf{b},\mathbf{e}) = \begin{pmatrix}\frac{\sqrt 3}{3} & 0 & \frac{\sqrt 6}{6}\\ 0 & \frac{\sqrt 2}{4}& -\frac{\sqrt 6}{4} \\ 0 & 0 & \frac{\sqrt 6}{2}\end{pmatrix}\] and by construction, \(\mathbf{M}(\langle\cdot{,}\cdot\rangle_\mathbf{A},\mathbf{b}) = \mathbf{1}_{3}.\) Furthermore, if \(\mathbf{C}= \mathbf{C}(\mathbf{e},\mathbf{b}),\) then \[\mathbf{C}(\mathbf{e},\mathbf{b})^T\mathbf{M}(\langle\cdot{,}\cdot\rangle_\mathbf{A},\mathbf{b}) \mathbf{C}(\mathbf{e},\mathbf{b}) = \mathbf{C}^T\mathbf{C}= \mathbf{A},\] so that \(\mathbf{C}\) is obtained by inverting \(\mathbf{C}(\mathbf{b},\mathbf{e})\) and we find \[\mathbf{C}= \begin{pmatrix}\sqrt 3 & 0 & -\frac{\sqrt 3}{3}\\ 0 & 2\sqrt 2 & \sqrt 2\\ 0 & 0 & \frac{\sqrt{6}}{3}\end{pmatrix}.\]
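As a final check, the computed factor can be verified numerically; by the uniqueness statement of Theorem 10.26 it also agrees with the factor produced by NumPy's numpy.linalg.cholesky:

```python
import numpy as np

s2, s3, s6 = np.sqrt(2), np.sqrt(3), np.sqrt(6)
C = np.array([[s3,     0., -s3 / 3],
              [0., 2 * s2,      s2],
              [0.,     0.,  s6 / 3]])
A = np.array([[3., 0., -1.], [0., 8., 4.], [-1., 4., 3.]])
print(np.allclose(C.T @ C, A))                  # True
print(np.allclose(C, np.linalg.cholesky(A).T))  # True
```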
