11 Unitary spaces
Unitary spaces are the complex companions of Euclidean spaces. Much of the theory of Euclidean spaces carries over to the complex numbers once we suitably adapt the notion of an inner product. Moreover, almost all proofs carry over from the real case, so we will only provide proofs where the arguments from the real case do not work.
11.1 Hermitian inner products
Naively one might define a “standard scalar product” on \(\mathbb{C}^n\) as in the case of \(\mathbb{R}^n,\) that is, for \(\vec{z}=(z_i)_{1\leqslant i\leqslant n}\) and \(\vec{w}=(w_i)_{1\leqslant i\leqslant n} \in \mathbb{C}^n\) we put \(\vec{z}\cdot \vec{w}=\sum_{i=1}^n z_iw_i.\) However, doing so, it is not true any more that \(\vec{z}\cdot \vec{z}=0\) only for the zero vector in \(\mathbb{C}^n.\) For instance, the vector \[\vec{z}=\begin{pmatrix} 1 \\ \mathrm{i}\end{pmatrix}\] satisfies \(\vec{z}\cdot\vec{z}=0,\) but \(\vec{z}\neq 0_{\mathbb{C}^2}.\) Instead of the above definition we define the Hermitian standard scalar product on \(\mathbb{C}^n\) by the rule \[\langle \vec{z},\vec{w}\rangle=\sum_{i=1}^n \overline{z}_iw_i,\] where \(\overline{z}\) denotes the complex conjugate of the complex number \(z \in \mathbb{C}.\) Recall that \(z\overline{z}=\operatorname{Re}(z)^2+\operatorname{Im}(z)^2\geqslant 0\) so that \(z\overline{z}=0\) if and only if \(z=0.\) The Hermitian standard scalar product is an example of a sesquilinear form:
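The difference between the two products is easy to see numerically. The following sketch uses NumPy (which is not part of these notes); note that `np.vdot` conjugates its first argument and therefore implements exactly the convention chosen here.

```python
import numpy as np

z = np.array([1, 1j])          # the vector (1, i) from the example above

# Naive "dot product" without conjugation: vanishes on a nonzero vector.
naive = np.sum(z * z)          # 1 + i^2 = 0

# Hermitian standard scalar product: conjugate the first argument.
# np.vdot conjugates its first argument, matching our convention.
hermitian = np.vdot(z, z)      # conj(1)*1 + conj(i)*i = 1 + 1 = 2

print(naive, hermitian)
```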
Let \(V\) be a complex vector space. A sesquilinear form on \(V\) is a map \(\langle\cdot{,}\cdot\rangle: V \times V \to \mathbb{C}\) such that
\(\langle\cdot{,}\cdot\rangle\) is linear in the second variable, that is, \[\langle v,s_1w_1+s_2w_2\rangle=s_1\langle v,w_1\rangle+s_2\langle v,w_2\rangle\] for all \(s_1,s_2 \in \mathbb{C}\) and all \(v,w_1,w_2 \in V\);
\(\langle\cdot{,}\cdot\rangle\) is conjugate linear in the first variable, that is, \[\langle s_1w_1+s_2w_2,v\rangle=\overline{s_1}\langle w_1,v\rangle+\overline{s_2}\langle w_2,v\rangle\] for all \(s_1,s_2 \in \mathbb{C}\) and all \(v,w_1,w_2 \in V\);
Moreover, a sesquilinear form is called Hermitian if \[\langle v,w\rangle=\overline{\langle w,v\rangle}\] for all \(v,w \in V.\)
Sesquilinear forms correspond to bilinear forms in the real setting and Hermitian forms correspond to symmetric bilinear forms.
In our convention a sesquilinear form is conjugate linear in the first variable and linear in the second variable. The reader is warned that some authors use the opposite convention so that a sesquilinear form is linear in the first variable and conjugate linear in the second variable.
Let \(V\) be a finite dimensional \(\mathbb{C}\)-vector space and \(\mathbf{b}=(v_1,\ldots,v_n)\) an ordered basis of \(V.\) As in the case of bilinear forms over real vector spaces, we define the matrix representation of a sesquilinear form \(\langle\cdot{,}\cdot\rangle\) on \(V\) with respect to \(\mathbf{b}\) by \[\mathbf{M}(\langle\cdot{,}\cdot\rangle,\mathbf{b})=(\langle v_i,v_j\rangle)_{1\leqslant i,j\leqslant n}.\] Recall that in the real setting symmetric bilinear forms are represented by symmetric matrices. Similarly, sesquilinear Hermitian forms – usually just called Hermitian forms – are represented by so-called Hermitian matrices. For a precise definition, we need:
Let \(\mathbf{A}=(A_{ij})_{1\leqslant i\leqslant m,1\leqslant j\leqslant n}\in M_{m,n}(\mathbb{C}).\) The conjugate matrix of \(\mathbf{A}\) is the matrix \(\overline{\mathbf{A}}=(\overline{A_{ij}})_{1\leqslant i\leqslant m,1\leqslant j\leqslant n}\in M_{m,n}(\mathbb{C})\) whose entries are the complex conjugates of the entries of \(\mathbf{A}.\)
(i) For all \(\mathbf{A},\mathbf{B}\in M_{m,n}(\mathbb{C})\) and all \(s,t\in \mathbb{C},\) we have \[\overline{s\mathbf{A}+t\mathbf{B}}=\overline{s}\,\overline{\mathbf{A}}+\overline{t}\,\overline{\mathbf{B}}, \qquad \overline{\overline{\mathbf{A}}}=\mathbf{A}, \qquad \overline{\mathbf{A}^T}=\overline{\mathbf{A}}^T.\]
(ii) For all \(\mathbf{A}\in M_{m,n}(\mathbb{C})\) and \(\mathbf{B}\in M_{n,p}(\mathbb{C}),\) we have \[\overline{\mathbf{A}\mathbf{B}}=\overline{\mathbf{A}}\,\overline{\mathbf{B}}.\] In particular, \(\mathbf{A}\in M_{n,n}(\mathbb{C})\) is invertible if and only if \(\overline{\mathbf{A}}\) is invertible, in which case \(\left(\overline{\mathbf{A}}\right)^{-1}=\overline{\mathbf{A}^{-1}}.\)
(iii) For all \(\mathbf{A}\in M_{n,n}(\mathbb{C})\) we have \[\overline{\det \mathbf{A}}=\det(\overline{\mathbf{A}}).\]
Proof. (i) and (ii) follow from the definitions of the matrix operations and from \(\overline{\overline{z}}=z\) and \(\overline{zw}=\overline{z}\,\overline{w}\) for all complex numbers \(z,w.\) (iii) follows from the Leibniz formula (Proposition 5.39).
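The rules above can be spot-checked numerically; the following NumPy sketch (the random matrices are illustrative, not from the notes) verifies (ii) and (iii):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

# (ii): conj(AB) = conj(A) conj(B)
ok_product = np.allclose(np.conj(A @ B), np.conj(A) @ np.conj(B))

# (ii): (conj(A))^{-1} = conj(A^{-1})  (a random A is invertible almost surely)
ok_inverse = np.allclose(np.linalg.inv(np.conj(A)), np.conj(np.linalg.inv(A)))

# (iii): conj(det A) = det(conj(A))
ok_det = np.isclose(np.conj(np.linalg.det(A)), np.linalg.det(np.conj(A)))

print(ok_product, ok_inverse, ok_det)
```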
Hermitian matrices have the property that their transpose equals their conjugate matrix.
A matrix \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant n} \in M_{n,n}(\mathbb{C})\) is called Hermitian if \[\mathbf{A}^T=\overline{\mathbf{A}}\quad \iff \quad \mathbf{A}=\overline{\mathbf{A}}^T \quad \iff \quad A_{ji}=\overline{A_{ij}}, \quad 1\leqslant i,j\leqslant n.\]
Notice that the diagonal entries of a Hermitian matrix \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant n} \in M_{n,n}(\mathbb{C})\) satisfy \(A_{ii}=\overline{A_{ii}}\) for all \(1\leqslant i\leqslant n\) and hence must be real.
If we write \(\mathbf{A}\in M_{n,n}(\mathbb{C})\) as \(\mathbf{A}=\mathbf{B}+\mathrm{i}\mathbf{C}\) for \(\mathbf{B},\mathbf{C}\in M_{n,n}(\mathbb{R}),\) then \(\mathbf{A}\) is Hermitian if and only if \[\mathbf{A}^T=(\mathbf{B}+\mathrm{i}\mathbf{C})^T=\mathbf{B}^T+\mathrm{i}\mathbf{C}^T=\overline{\mathbf{A}}=\mathbf{B}-\mathrm{i}\mathbf{C}\] which is equivalent to \(\mathbf{B}\) being symmetric and \(\mathbf{C}\) being anti-symmetric.
\(2\times 2\) and \(3\times 3\) Hermitian matrices are of the form \[\begin{pmatrix} a & z \\ \overline{z} & b \end{pmatrix}, \qquad \begin{pmatrix} a & z & w \\ \overline{z} & b & u \\ \overline{w} & \overline{u} & c \end{pmatrix}\] for \(a,b,c \in \mathbb{R}\) and \(u,z,w \in \mathbb{C}.\)
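Whether a given matrix is Hermitian is easy to test numerically. A NumPy sketch with hypothetical values \(a=1,\) \(b=-2,\) \(z=3-4\mathrm{i}\) in the \(2\times 2\) shape above:

```python
import numpy as np

# A matrix of the displayed 2x2 shape with a = 1, b = -2, z = 3 - 4i.
A = np.array([[1, 3 - 4j],
              [3 + 4j, -2]])

def is_hermitian(A):
    """A matrix is Hermitian iff it equals its conjugate transpose."""
    return np.allclose(A, A.conj().T)

print(is_hermitian(A))        # the example above is Hermitian
print(is_hermitian(1j * A))   # i*A is not Hermitian (for A != 0)
```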
In analogy to Proposition 9.6 we obtain:
Let \(V\) be a finite dimensional \(\mathbb{C}\)-vector space and \(\mathbf{b}=(v_1,\ldots,v_n)\) an ordered basis of \(V\) with associated linear coordinate system \(\boldsymbol{\beta}: V \to \mathbb{C}^n.\) Suppose \(\langle\cdot{,}\cdot\rangle\) is a sesquilinear form on \(V.\) Then
for all \(v,w \in V\) we have \[\langle v,w\rangle=\overline{\boldsymbol{\beta}(v)}^T\mathbf{M}(\langle\cdot{,}\cdot\rangle,\mathbf{b})\boldsymbol{\beta}(w);\]
\(\langle\cdot{,}\cdot\rangle\) is Hermitian if and only if \(\mathbf{M}(\langle\cdot{,}\cdot\rangle,\mathbf{b})\) is a Hermitian matrix;
if \(\mathbf{b}^{\prime}\) is another ordered basis of \(V,\) then \[\mathbf{M}(\langle\cdot{,}\cdot\rangle,\mathbf{b}^{\prime})=\overline{\mathbf{C}}^T\mathbf{M}(\langle\cdot{,}\cdot\rangle,\mathbf{b})\mathbf{C},\] where \(\mathbf{C}=\mathbf{C}(\mathbf{b}^{\prime},\mathbf{b})\) denotes the change of basis matrix.
Proof. Exercise.
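Since the proof is left as an exercise, here is a numerical sanity check of the coordinate formula and the change of basis formula, sketched in NumPy for the Hermitian standard scalar product on \(\mathbb{C}^2\) and a hypothetical second basis (the concrete matrices and vectors are illustrative):

```python
import numpy as np

# Matrix of the Hermitian standard scalar product w.r.t. the standard basis e.
M_e = np.eye(2)

# A hypothetical second basis b', given via the change of basis matrix C = C(b', e).
C = np.array([[1, 1j],
              [0, 2]])

# Change of basis formula: M(<.,.>, b') = conj(C)^T M_e C
M_bprime = C.conj().T @ M_e @ C

# Coordinate formula: for coordinate vectors x, y w.r.t. b', the vectors are
# v = C x and w = C y, and conj(x)^T M_bprime y must agree with conj(v)^T M_e w.
x = np.array([1 + 1j, 2])
y = np.array([3, -1j])
v, w = C @ x, C @ y
lhs = x.conj() @ M_bprime @ y
rhs = v.conj() @ M_e @ w
print(np.allclose(lhs, rhs))
```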
Non-degeneracy of a sesquilinear form is defined exactly as in the real case and, correspondingly, a sesquilinear form on a finite dimensional complex vector space is non-degenerate if and only if its matrix representation with respect to some (and hence any) basis has non-vanishing determinant (cf. Proposition 9.10).
Again, in analogy to the real case we call a sesquilinear form \(\langle\cdot{,}\cdot\rangle\) on \(V\) positive if \(\langle v,v\rangle\geqslant 0\) for all \(v \in V\) and positive definite if \(\langle\cdot{,}\cdot\rangle\) is positive and \(\langle v,v\rangle=0\) if and only if \(v=0_V.\) Also, in analogy to Definition 10.2, we define:
Let \(V\) be a \(\mathbb{C}\)-vector space. A sesquilinear form on \(V\) that is positive definite and Hermitian is called a Hermitian inner product.
Suppose \(\mathbf{A}\in M_{n,n}(\mathbb{C})\) is a Hermitian matrix. Then the map \(\langle\cdot{,}\cdot\rangle_{\mathbf{A}}: \mathbb{C}^n \times \mathbb{C}^n \to \mathbb{C}\) defined by the rule \[\langle \vec{z},\vec{w}\rangle_{\mathbf{A}}=(\overline{\vec{z}})^T\mathbf{A}\vec{w}\] for all \(\vec{z},\vec{w} \in \mathbb{C}^n\) is a Hermitian form on \(\mathbb{C}^n.\)
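A quick numerical illustration in NumPy, with a hypothetical \(2\times 2\) Hermitian matrix, of the Hermitian symmetry \(\langle \vec z,\vec w\rangle_{\mathbf{A}}=\overline{\langle \vec w,\vec z\rangle_{\mathbf{A}}}\):

```python
import numpy as np

# A hypothetical Hermitian matrix (compare the 2x2 shape discussed above).
A = np.array([[2, 1 - 1j],
              [1 + 1j, 3]])

def form(z, w, A=A):
    """<z, w>_A = conj(z)^T A w."""
    return z.conj() @ A @ w

z = np.array([1j, 2])
w = np.array([1, 1 + 1j])

# Hermitian symmetry: <z, w>_A = conj(<w, z>_A)
print(np.isclose(form(z, w), np.conj(form(w, z))))
```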
Let \(a<b\) be real numbers and consider \(V=\mathsf{C}([a,b],\mathbb{C}),\) the \(\mathbb{C}\)-vector space of continuous complex-valued functions on the interval \([a,b].\) We define \(\langle\cdot{,}\cdot\rangle: V \times V \to \mathbb{C}\) by the rule \[\langle f,g\rangle=\int_{a}^b \overline{f(x)}g(x)\,\mathrm{d}x.\] Then the properties of integration from the Analysis module show that \(\langle\cdot{,}\cdot\rangle\) is a Hermitian inner product on \(V.\)
Let \(V=M_{n,n}(\mathbb{C})\) denote the \(\mathbb{C}\)-vector space of \(n\times n\)-matrices with complex entries. We define a map \(\langle\cdot{,}\cdot\rangle: V \times V \to \mathbb{C}\) by the rule \[\langle \mathbf{A},\mathbf{B}\rangle=\operatorname{Tr}\left(\overline{\mathbf{A}}^T\mathbf{B}\right)\] for all \(\mathbf{A},\mathbf{B}\in M_{n,n}(\mathbb{C}).\) Since the trace is a linear map \(\operatorname{Tr}: M_{n,n}(\mathbb{C}) \to \mathbb{C}\) satisfying \(\operatorname{Tr}(\overline{\mathbf{A}})=\overline{\operatorname{Tr}(\mathbf{A})}\) and \(\operatorname{Tr}(\mathbf{A}^T)=\operatorname{Tr}(\mathbf{A})\) for all \(\mathbf{A}\in M_{n,n}(\mathbb{C}),\) it follows that \(\langle\cdot{,}\cdot\rangle\) is a Hermitian form on \(M_{n,n}(\mathbb{C}).\) Writing \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant n},\) we obtain \[\langle \mathbf{A},\mathbf{A}\rangle=\sum_{i=1}^n \sum_{j=1}^n \overline{A_{ji}}A_{ji}=\sum_{i=1}^n\sum_{j=1}^n|A_{ji}|^2\] so that \(\langle \mathbf{A},\mathbf{A}\rangle\geqslant 0\) and \(\langle \mathbf{A},\mathbf{A}\rangle=0\) if and only if all entries of \(\mathbf{A}\) are zero, that is, \(\mathbf{A}=0.\) We conclude that \(\langle\cdot{,}\cdot\rangle\) defines a Hermitian inner product on \(M_{n,n}(\mathbb{C}).\)
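The identity \(\langle \mathbf{A},\mathbf{A}\rangle=\sum_{i,j}|A_{ij}|^2\) can be checked numerically; a NumPy sketch (the random matrix is illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

# <A, A> = Tr(conj(A)^T A), which should equal the sum of all |A_ij|^2.
inner = np.trace(A.conj().T @ A)
print(np.isclose(inner, np.sum(np.abs(A) ** 2)))
```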
The complex companions of Euclidean spaces (cf. Definition 10.7) are the so-called unitary spaces:
A pair \((V,\langle\cdot{,}\cdot\rangle)\) consisting of an \(\mathbb{C}\)-vector space \(V\) and a Hermitian inner product \(\langle\cdot{,}\cdot\rangle\) on \(V\) is called a unitary space.
As in the case of Euclidean spaces, a Hermitian inner product \(\langle\cdot{,}\cdot\rangle\) on a complex vector space \(V\) allows us to define a norm \(\Vert v \Vert=\sqrt{\langle v,v\rangle}\) on \(V.\) Since \(\langle\cdot{,}\cdot\rangle\) is a Hermitian form, we have \(\langle v,v\rangle=\overline{\langle v, v \rangle}\) for all \(v \in V.\) Therefore, \(\langle v,v\rangle\) is a non-negative real number for all \(v \in V\) and hence \(\Vert \cdot \Vert\) is well defined. Although we will not prove it here, the Cauchy–Schwarz inequality also holds in the setting of unitary spaces. That is, as in Proposition 10.8, for all \(v_1,v_2 \in V\) we have \[|\langle v_1,v_2\rangle|\leqslant \Vert v_1 \Vert \Vert v_2 \Vert\] with equality if and only if \(\{v_1,v_2\}\) is linearly dependent. Here \(|\cdot|\) on the left-hand side denotes the absolute value of a complex number.
The distance function is also defined analogously and again we have the triangle inequality. Again, we will not prove this.
The notions of orthogonality, orthonormality, the orthogonal complement, the orthogonal projection onto a subspace are again defined analogously to the Euclidean case.
Consider \(V=\mathsf{C}([0,2\pi],\mathbb{C}),\) the \(\mathbb{C}\)-vector space of continuous complex-valued functions defined on the interval \([0,2\pi].\) We equip \(V\) with the Hermitian inner product \(\langle\cdot{,}\cdot\rangle\) as defined in Example 11.10 above. For \(n \in \mathbb{Z}\) let \(f_n : [0,2\pi] \to \mathbb{C}\) be defined by the rule \[f_n(t)=\frac{1}{\sqrt{2 \pi}}\mathrm{e}^{\mathrm{i}n t}\] for all \(t \in [0,2\pi].\) Then for \(n\neq m,\) we obtain \[\langle f_n,f_m\rangle=\frac{1}{2\pi}\int_0^{2\pi} \overline{\mathrm{e}^{\mathrm{i}n t}}\mathrm{e}^{\mathrm{i}m t}\mathrm{d}t=\frac{1}{2\pi}\int_0^{2\pi} \mathrm{e}^{\mathrm{i}(m-n)t}\mathrm{d}t=\left.\frac{1}{2\pi \mathrm{i}(m-n)}\mathrm{e}^{\mathrm{i}(m-n)t}\right|_{0}^{2\pi}=0\] and for all \(n \in \mathbb{Z}\) we have that \(\langle f_n,f_n\rangle=1.\) It follows that \(\{f_n | n \in \mathbb{Z}\}\) is an orthonormal subset of \(V.\) This observation is at the heart of the theory of Fourier series.
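A numerical illustration (not a proof): on an equally spaced grid the Riemann sums reproduce these integrals of exponentials essentially exactly. A NumPy sketch:

```python
import numpy as np

# Riemann-sum approximation of <f, g> = int_0^{2 pi} conj(f(t)) g(t) dt
# on an equally spaced grid over [0, 2*pi].
N = 4096
t = np.arange(N) * 2 * np.pi / N

def f(n):
    return np.exp(1j * n * t) / np.sqrt(2 * np.pi)

def inner(g, h):
    return np.sum(np.conj(g) * h) * (2 * np.pi / N)

print(np.isclose(inner(f(2), f(2)), 1))   # <f_n, f_n> = 1
print(np.isclose(inner(f(2), f(5)), 0))   # <f_n, f_m> = 0 for n != m
```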
Again, Theorem 10.22 also has a complex version:
Let \((V,\langle\cdot{,}\cdot\rangle)\) be an \(n\)-dimensional unitary space and \(\mathbf{b}=(v_1,\ldots,v_n)\) an ordered basis of \(V.\) Set \(u_1=v_1/\Vert v_1\Vert\) and for \(2\leqslant i\leqslant n\) define recursively \[w_i=v_i-\Pi^{\perp}_{U_{i-1}}(v_i)\qquad \text{and}\qquad u_i=\frac{w_i}{\Vert w_i\Vert},\] where \(U_{i-1}=\operatorname{span}\{u_1,\ldots ,u_{i-1}\}.\) Then \(\mathbf{b}^{\prime}=(u_1,\ldots,u_n)\) is well defined and an orthonormal ordered basis of \(V.\) Moreover, \(\mathbf{b}^{\prime}\) is the unique orthonormal ordered basis of \(V\) so that the change of basis matrix \(\mathbf{C}(\mathbf{b}^{\prime},\mathbf{b})\) is an upper triangular matrix whose diagonal entries are real and positive.
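The Gram–Schmidt recursion of the theorem can be sketched in NumPy as follows; the function name and the concrete basis of \(\mathbb{C}^2\) are illustrative, not taken from the notes:

```python
import numpy as np

def gram_schmidt(vectors, inner):
    """Gram-Schmidt in a unitary space: returns an orthonormal list.

    `inner(v, w)` must be conjugate linear in v and linear in w,
    matching the convention of this chapter.
    """
    ortho = []
    for v in vectors:
        # subtract the orthogonal projection onto span of the previous u's
        w = v - sum(inner(u, v) * u for u in ortho)
        ortho.append(w / np.sqrt(inner(w, w).real))
    return ortho

# Usage: the Hermitian standard scalar product on C^2.
inner = lambda v, w: np.vdot(v, w)   # np.vdot conjugates its first argument
b = [np.array([1.0, 1j]), np.array([0.0, 2.0])]
u1, u2 = gram_schmidt(b, inner)
print(np.isclose(inner(u1, u1), 1), np.isclose(inner(u1, u2), 0))
```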
As in Definition 10.25, we have:
Let \(n \in \mathbb{N}\) and \(\mathbf{A}\in M_{n,n}(\mathbb{C}).\) The matrix \(\mathbf{A}\) is called positive definite if the sesquilinear form \(\langle\cdot{,}\cdot\rangle_\mathbf{A}\) on \(\mathbb{C}^n\) is positive definite.
As in Theorem 10.26, we obtain:
(Cholesky decomposition over \(\mathbb{C}\)). Let \(n \in \mathbb{N}\) and \(\mathbf{A}\in M_{n,n}(\mathbb{C})\) be a positive definite Hermitian matrix. Then there exists a unique upper triangular matrix \(\mathbf{C}\in M_{n,n}(\mathbb{C})\) with real and positive diagonal entries such that \(\mathbf{A}=\overline{\mathbf{C}}^T\mathbf{C}.\)
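In NumPy this factorization is available via `np.linalg.cholesky`, which returns the *lower* triangular matrix \(\mathbf{L}\) with \(\mathbf{A}=\mathbf{L}\overline{\mathbf{L}}^T\); the factor of the theorem is then \(\mathbf{C}=\overline{\mathbf{L}}^T.\) A sketch with a hypothetical positive definite Hermitian matrix:

```python
import numpy as np

# A hypothetical positive definite Hermitian matrix (leading minors 2 and 4).
A = np.array([[2, 1 - 1j],
              [1 + 1j, 3]])

L = np.linalg.cholesky(A)   # lower triangular, A = L conj(L)^T
C = L.conj().T              # upper triangular with real positive diagonal
print(np.allclose(A, C.conj().T @ C))
```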
As in the real case (cf. Remark 10.27), for an invertible complex matrix \(\mathbf{C}\in M_{n,n}(\mathbb{C}),\) the matrix \(\overline{\mathbf{C}}^T\mathbf{C}\) is Hermitian and positive definite.
As in Remark 10.28, in a finite dimensional unitary space \((V,\langle\cdot{,}\cdot\rangle)\) equipped with an ordered orthonormal basis \(\mathbf{b}=(v_1,\ldots,v_n),\) we have the following identities for all \(v \in V\) \[v=\sum_{i=1}^n \langle v_i,v\rangle v_i\qquad \text{and}\qquad \Vert v\Vert=\sqrt{\sum_{i=1}^n |\langle v_i,v\rangle|^2}.\] Note that, since in our convention \(\langle\cdot{,}\cdot\rangle\) is linear in the second variable, the coefficient of \(v_i\) is \(\langle v_i,v\rangle,\) and in the complex setting it is the moduli of the coefficients that enter the norm formula.
Exercises
Compute the Cholesky decomposition of the positive definite Hermitian matrix \[\mathbf{A}=\begin{pmatrix} 6 & -1+\mathrm{i}& -2 \\ -1-\mathrm{i}& 3 & -2+\mathrm{i}\\ -2 & -2-\mathrm{i}& 3 \end{pmatrix}.\]
Solution
We proceed in a similar fashion to Exercise 10.31 to find an ordered basis \(\mathbf{b}=(\vec v_1,\vec v_2,\vec v_3)\) of \(\mathbb{C}^3\) which is orthonormal with respect to \(\langle\cdot{,}\cdot\rangle_\mathbf{A}\) and such that \(\mathbf{C}(\mathbf{b},\mathbf{e})\) is upper triangular with real and positive entries on the diagonal.
We start with \(\vec e_1\) and observe that \(\langle \vec e_1,\vec e_1\rangle_\mathbf{A}= 6\) so that \(\vec v_1 = \frac{\sqrt 6}{6}\vec e_1.\) In order to determine \(\vec v_2\) we compute \[\vec e_2 - \langle \vec v_1,\vec e_2\rangle_\mathbf{A}\vec v_1 = \begin{pmatrix}\frac16(1-\mathrm i)\\ 1 \\ 0\end{pmatrix}.\] Note that in contrast to the real case treated in Exercise 10.31, \(\langle \vec v_1,\vec e_2\rangle_\mathbf{A}\ne \langle \vec e_2,\vec v_1\rangle_\mathbf{A}.\) In fact, it holds that \(\langle \vec v_1,\vec e_2\rangle_\mathbf{A}=\overline{\langle \vec e_2,\vec v_1\rangle_\mathbf{A}}.\) If the order is not chosen as in the computation above, the resulting vector will not be \(\mathbf{A}\)-orthogonal to \(\vec v_1.\) Normalizing the vector we have obtained yields \[\vec v_2 = \begin{pmatrix}\frac{\sqrt 6}{24}(1-\mathrm i) \\ \frac{\sqrt 6}{4}\\ 0 \end{pmatrix}.\] Now we compute \[\vec e_3 - \langle\vec v_1,\vec e_3\rangle_\mathbf{A}\vec v_1 - \langle \vec v_2,\vec e_3\rangle_\mathbf{A}\vec v_2 = \begin{pmatrix}\frac{1}{16}(7-3\mathrm i)\\ \frac{1}{8}(7-2\mathrm i) \\ 1\end{pmatrix}\] and hence \[\vec v_3 = \begin{pmatrix}\frac{\sqrt{2}}{8}(7-3\mathrm i) \\ \frac{\sqrt{2}}{4}(7-2\mathrm i) \\ 2\sqrt2\end{pmatrix}.\] The \(\mathbf{A}\)-orthonormal ordered basis is then given by \[\mathbf{b}= \left(\begin{pmatrix}\frac{\sqrt 6}{6} \\ 0 \\ 0 \end{pmatrix},\begin{pmatrix}\frac{\sqrt 6}{24}(1-\mathrm i)\\ \frac{\sqrt 6}{4} \\ 0 \end{pmatrix},\begin{pmatrix}\frac{\sqrt{2}}{8}(7-3\mathrm i) \\ \frac{\sqrt{2}}{4}(7-2\mathrm i) \\ 2\sqrt2\end{pmatrix}\right),\] so that the matrix \(\mathbf{C}\) we are looking for is given by the inverse of \[\begin{pmatrix}\frac{\sqrt 6}{6} & \frac{\sqrt 6}{24}(1-\mathrm i) & \frac{\sqrt{2}}{8}(7-3\mathrm i) \\ 0 & \frac{\sqrt 6}{4} & \frac{\sqrt{2}}{4}(7-2\mathrm i) \\ 0 & 0 & 2\sqrt2\end{pmatrix}\] and we obtain \[\mathbf{C}= \begin{pmatrix} \sqrt 6 & -\frac{\sqrt 6}{6}(1-\mathrm i) & -\frac{\sqrt 6}{3} \\ 0 & \frac23\sqrt 6 & -\frac{\sqrt 6}{6}\left(\frac72-\mathrm i\right) \\ 0 & 0 & \frac{\sqrt 2}{4}\end{pmatrix}.\]
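The result can be verified numerically (a NumPy sketch, not part of the intended pen-and-paper solution): the computed \(\mathbf{C}\) must satisfy \(\overline{\mathbf{C}}^T\mathbf{C}=\mathbf{A}\) and, by uniqueness, agree with the conjugate transpose of NumPy's lower triangular Cholesky factor.

```python
import numpy as np

s6, s2 = np.sqrt(6), np.sqrt(2)

A = np.array([[6, -1 + 1j, -2],
              [-1 - 1j, 3, -2 + 1j],
              [-2, -2 - 1j, 3]])

# The upper triangular factor computed in the exercise.
C = np.array([[s6, -(s6 / 6) * (1 - 1j), -s6 / 3],
              [0, (2 / 3) * s6, -(s6 / 6) * (3.5 - 1j)],
              [0, 0, s2 / 4]])

print(np.allclose(A, C.conj().T @ C))                   # reproduces A
print(np.allclose(C, np.linalg.cholesky(A).conj().T))   # matches NumPy's factor
```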