5 The determinant
5.1 Axiomatic characterisation
Surprisingly, whether or not a square matrix \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) admits an inverse is captured by a single scalar, called the determinant of \(\mathbf{A}\) and denoted by \(\det \mathbf{A}\) or \(\det(\mathbf{A}).\) That is, the matrix \(\mathbf{A}\) admits an inverse if and only if \(\det \mathbf{A}\) is nonzero. In practice, however, it is often quicker to use Gauss–Jordan elimination to decide whether the matrix admits an inverse. The determinant is nevertheless a useful tool in linear algebra.
The determinant is an object of multilinear algebra, which – for \(\ell \in \mathbb{N}\) – considers mappings from the \(\ell\)-fold Cartesian product of a \(\mathbb{K}\)-vector space into another \(\mathbb{K}\)-vector space. Such a mapping \(f\) is required to be linear in each variable. This simply means that if we freeze all variables of \(f,\) except for the \(k\)-th variable, \(1\leqslant k\leqslant \ell,\) then the resulting mapping \(g_{k}\) of one variable is required to be linear. More precisely:
Let \(V,W\) be \(\mathbb{K}\)-vector spaces and \(\ell \in \mathbb{N}.\) A mapping \(f : V^\ell \to W\) is called \(\ell\)-multilinear (or simply multilinear) if the mapping \(g_{k} : V \to W,\) \(v \mapsto f(v_1,\ldots,v_{k-1},v,v_{k+1},\ldots,v_{\ell})\) is linear for all \(1 \leqslant k \leqslant \ell\) and for all \(\ell\)-tuples \((v_1,\ldots,v_{\ell}) \in V^\ell.\)
We only need an \((\ell-1)\)-tuple of vectors to define the map \(g_{k},\) but the above definition is more convenient to write down.
Two types of multilinear maps are of particular interest:
Let \(V,W\) be \(\mathbb{K}\)-vector spaces and \(f : V^\ell \to W\) an \(\ell\)-multilinear map.
The map \(f\) is called symmetric if exchanging two arguments does not change the value of \(f.\) That is, for all \(1\leqslant i<j\leqslant \ell\) we have \[f(v_1,\ldots,v_{\ell})=f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_i,v_{j+1},\ldots,v_{\ell})\] for all \((v_1,\ldots,v_{\ell}) \in V^\ell.\)
The map \(f\) is called alternating if \(f(v_1,\ldots,v_{\ell})=0_W\) whenever at least two arguments agree, that is, there exist \(i\neq j\) with \(v_i=v_j.\) Alternating \(\ell\)-multilinear maps are also called \(W\)-valued \(\ell\)-forms or simply \(\ell\)-forms when \(W=\mathbb{K}.\)
\(1\)-multilinear maps are simply linear maps. \(2\)-multilinear maps are called bilinear and \(3\)-multilinear maps are called trilinear. Most likely, you are already familiar with two examples of bilinear maps:
The first one is the scalar product of two vectors in \(\mathbb{R}^3\) (or more generally \(\mathbb{R}^n\)). So \(V=\mathbb{R}^3\) and \(W=\mathbb{R}.\) Recall that the scalar product is the mapping \[V^2=\mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}, \quad (\vec{x},\vec{y})\mapsto \vec{x}\cdot \vec{y}=x_1y_1+x_2y_2+x_3y_3,\] where we write \(\vec{x}=(x_i)_{1\leqslant i\leqslant 3}\) and \(\vec{y}=(y_i)_{1\leqslant i\leqslant 3}.\) Notice that for all \(s_1,s_2 \in \mathbb{R}\) and all \(\vec{x}_1,\vec{x}_2,\vec{y}\in \mathbb{R}^3\) we have \[(s_1\vec{x}_1+s_2\vec{x}_2)\cdot \vec{y}=s_1(\vec{x}_1\cdot\vec{y})+s_2(\vec{x}_2\cdot\vec{y}),\] so that the scalar product is linear in the first variable. Furthermore, the scalar product is symmetric, \(\vec{x}\cdot \vec{y}=\vec{y}\cdot \vec{x}.\) It follows that the scalar product is also linear in the second variable, hence it is bilinear or \(2\)-multilinear.
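For readers who like to experiment, here is a minimal numerical sketch (in Python with NumPy; the helper name `dot` is ours) checking linearity in the first slot and symmetry on random vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

def dot(x, y):
    # the scalar product on R^3, componentwise as in the text
    return x[0]*y[0] + x[1]*y[1] + x[2]*y[2]

s1, s2 = rng.standard_normal(2)
x1, x2, y = rng.standard_normal((3, 3))

# linearity in the first variable
lhs = dot(s1*x1 + s2*x2, y)
rhs = s1*dot(x1, y) + s2*dot(x2, y)
assert np.isclose(lhs, rhs)

# symmetry
assert np.isclose(dot(x1, y), dot(y, x1))
```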
The second one is the cross product of two vectors in \(\mathbb{R}^3.\) Here \(V=\mathbb{R}^3\) and \(W=\mathbb{R}^3.\) Recall that the cross product is the mapping \[V^2=\mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3, \quad (\vec{x},\vec{y})\mapsto \vec{x}\times \vec{y}=\begin{pmatrix} x_2y_3-x_3y_2 \\ x_3y_1-x_1y_3 \\ x_1y_2-x_2y_1 \end{pmatrix}.\] Notice that for all \(s_1,s_2 \in \mathbb{R}\) and all \(\vec{x}_1,\vec{x}_2,\vec{y}\in \mathbb{R}^3\) we have \[(s_1\vec{x}_1+s_2\vec{x}_2)\times \vec{y}=s_1(\vec{x}_1\times\vec{y})+s_2(\vec{x}_2\times\vec{y}),\] so that the cross product is linear in the first variable. Likewise, we can check that the cross product is also linear in the second variable, hence it is bilinear or \(2\)-multilinear. Observe that the cross product is alternating.
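A similar sketch (again assuming NumPy; `cross` is our transcription of the componentwise formula above) checks bilinearity in the first slot and the alternating property:

```python
import numpy as np

rng = np.random.default_rng(1)

def cross(x, y):
    # the cross product on R^3, componentwise as in the text
    return np.array([x[1]*y[2] - x[2]*y[1],
                     x[2]*y[0] - x[0]*y[2],
                     x[0]*y[1] - x[1]*y[0]])

s1, s2 = rng.standard_normal(2)
x1, x2, y = rng.standard_normal((3, 3))

# linearity in the first variable
assert np.allclose(cross(s1*x1 + s2*x2, y),
                   s1*cross(x1, y) + s2*cross(x2, y))

# alternating: equal arguments give the zero vector
assert np.allclose(cross(y, y), np.zeros(3))
```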
Let \(V=\mathbb{K}\) and consider \(f : V^\ell \to \mathbb{K},\) \((x_1,\ldots,x_\ell)\mapsto x_1x_2\cdots x_\ell.\) Then \(f\) is \(\ell\)-multilinear and symmetric.
Let \(\mathbf{A}\in M_{n,n}(\mathbb{R})\) be a symmetric matrix, \(\mathbf{A}^T=\mathbf{A}.\) Notice that we obtain a symmetric bilinear map \[f : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}, \quad (\vec{x},\vec{y}) \mapsto \vec{x}^T\mathbf{A}\vec{y},\] where on the right hand side all products are defined by matrix multiplication. Indeed, since \(\vec{x}^T\mathbf{A}\vec{y}\) is a scalar, we have \(\vec{x}^T\mathbf{A}\vec{y}=(\vec{x}^T\mathbf{A}\vec{y})^T=\vec{y}^T\mathbf{A}^T\vec{x}=\vec{y}^T\mathbf{A}\vec{x}.\)
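As an illustrative sketch (NumPy assumed; names ours), one can build such a form from a randomly generated symmetric matrix and check its symmetry:

```python
import numpy as np

rng = np.random.default_rng(2)

n = 4
B = rng.standard_normal((n, n))
A = B + B.T                      # a symmetric matrix: A^T = A

def f(x, y):
    # the bilinear form (x, y) -> x^T A y
    return x @ A @ y

x, y = rng.standard_normal((2, n))
assert np.isclose(f(x, y), f(y, x))   # symmetry
```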
Let \(\{\vec{\varepsilon}_1,\ldots,\vec{\varepsilon}_n\}\) denote the standard basis of \(\mathbb{K}_n\) so that \(\Omega(\vec{\varepsilon}_1,\ldots,\vec{\varepsilon}_n)=\mathbf{1}_{n}.\)
Let \(n \in \mathbb{N}.\) Then there exists a unique alternating \(n\)-multilinear map \(f_n : (\mathbb{K}_n)^n \to \mathbb{K}\) satisfying \(f_n(\vec{\varepsilon}_1,\ldots,\vec{\varepsilon}_n)=1.\)
Recall that we have a bijective mapping \(\Omega : (\mathbb{K}_n)^n \to M_{n,n}(\mathbb{K})\) which forms an \(n\times n\)-matrix from \(n\) row vectors of length \(n.\) For the choice \(V=\mathbb{K}_n,\) the notion of \(n\)-multilinearity thus also makes sense for a mapping \(f : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) which takes an \(n\times n\) matrix as an input. Here multilinearity means that the mapping is linear in each row of the matrix. Since \(\Omega(\vec{\varepsilon}_1,\ldots,\vec{\varepsilon}_n)=\mathbf{1}_{n},\) we may phrase the above theorem equivalently as:
Let \(n \in \mathbb{N}.\) Then there exists a unique alternating \(n\)-multilinear map \(f_n : M_{n,n}(\mathbb{K})\to \mathbb{K}\) satisfying \(f_n(\mathbf{1}_{n})=1.\)
The unique mapping \(f_n\) provided by this theorem is called the determinant and is denoted by \(\det.\)
It would be more precise to write \(\det_n\) since the determinant is a family of mappings, one mapping \(\det_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) for each \(n \in \mathbb{N}.\) It is however common to simply write \(\det.\)
5.2 Uniqueness of the determinant
In this section we prove the uniqueness part of the theorem. We begin by collecting some useful properties of alternating multilinear maps.
Let \(V,W\) be \(\mathbb{K}\)-vector spaces and \(\ell \in \mathbb{N}.\) An alternating \(\ell\)-multilinear map \(f : V^\ell \to W\) satisfies:
(i) interchanging two arguments of \(f\) leads to a minus sign. That is, for \(1\leqslant i,j\leqslant \ell\) and \(i\neq j\) we obtain \[f(v_1,\ldots,v_{\ell})=-f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_i,v_{j+1},\ldots,v_{\ell})\] for all \((v_1,\ldots,v_{\ell}) \in V^\ell\);
(ii) if the vectors \((v_1,\ldots,v_{\ell}) \in V^\ell\) are linearly dependent, then \(f(v_1,\ldots,v_{\ell})=0_W\);
(iii) for all \(1\leqslant i\leqslant \ell,\) for all \(\ell\)-tuples of vectors \((v_1,\ldots,v_{\ell}) \in V^\ell\) and scalars \(s_1,\ldots,s_\ell \in \mathbb{K},\) we have \[f(v_1,\ldots,v_{i-1},v_i+w,v_{i+1},\ldots,v_{\ell})=f(v_1,\ldots,v_{\ell})\] where \(w=\sum_{j=1,j\neq i}^\ell s_jv_j.\) That is, adding a linear combination of vectors to some argument of \(f\) does not change the output, provided the linear combination consists of the remaining arguments.
Proof. (i) Since \(f\) is alternating, we have for all \((v_1,\ldots,v_{\ell}) \in V^\ell\) \[f(v_1,\ldots,v_{i-1},v_i+v_j,v_{i+1},\ldots,v_{j-1},v_i+v_j,v_{j+1},\ldots,v_{\ell})=0_W.\] Using the linearity in the \(i\)-th argument, this gives \[\begin{aligned} 0_W&=f(v_1,\ldots,v_{i-1},v_i,v_{i+1},\ldots,v_{j-1},v_i+v_j,v_{j+1},\ldots,v_{\ell})\\ &\phantom{=}+f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_i+v_j,v_{j+1},\ldots,v_{\ell}). \end{aligned}\] Using the linearity in the \(j\)-th argument, we obtain \[\begin{aligned} 0_W&=f(v_1,\ldots,v_{i-1},v_i,v_{i+1},\ldots,v_{j-1},v_i,v_{j+1},\ldots,v_{\ell})\\ &\phantom{=}+f(v_1,\ldots,v_{i-1},v_i,v_{i+1},\ldots,v_{j-1},v_j,v_{j+1},\ldots,v_{\ell})\\ &\phantom{=}+f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_i,v_{j+1},\ldots,v_{\ell})\\ &\phantom{=}+f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_j,v_{j+1},\ldots,v_{\ell}). \end{aligned}\] The first summand has a double occurrence of \(v_i\) and hence vanishes by the alternating property. Likewise, the fourth summand has a double occurrence of \(v_j\) and hence vanishes as well. Since the second summand equals \(f(v_1,\ldots,v_{\ell}),\) we thus obtain \[f(v_1,\ldots,v_{\ell})=-f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_i,v_{j+1},\ldots,v_{\ell})\] as claimed.
(ii) Suppose the vectors \(v_1,\ldots,v_{\ell}\) are linearly dependent so that we have scalars \(s_j \in \mathbb{K}\) not all zero, \(1\leqslant j\leqslant \ell,\) so that \(s_1v_1+\cdots+s_\ell v_{\ell}=0_V.\) Suppose \(s_i\neq 0\) for some index \(1\leqslant i\leqslant \ell.\) Then \[v_i=-\sum_{j=1,j\neq i}^\ell\left(\frac{s_j}{s_i}\right)v_j\] and hence by the linearity in the \(i\)-th argument, we obtain \[\begin{gathered} f(v_1,\ldots,v_{\ell})=f\left(v_1,\ldots,v_{i-1},-\sum_{j=1,j\neq i}^\ell\left(\frac{s_j}{s_i}\right)v_j,v_{i+1},\ldots,v_{\ell}\right)\\=-\sum_{j=1,j\neq i}^\ell\left(\frac{s_j}{s_i}\right)f\left(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{\ell}\right)=0_W, \end{gathered}\] where we use that for each \(1\leqslant j\leqslant \ell\) with \(j\neq i,\) the expression \[f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{\ell})\] has a double occurrence of \(v_j\) and thus vanishes by the alternating property.
(iii) Let \((v_1,\ldots,v_{\ell})\in V^\ell\) and \((s_1,\ldots,s_\ell) \in \mathbb{K}^\ell.\) Then, using the linearity in the \(i\)-th argument, we compute \[\begin{gathered} f(v_1,\ldots,v_{i-1},v_i+\sum_{j=1,j\neq i}^\ell s_jv_j,v_{i+1},\ldots,v_{\ell})\\ =f(v_1,\ldots,v_{\ell})+\sum_{j=1,j\neq i}^\ell s_jf(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{\ell})=f(v_1,\ldots,v_{\ell}), \end{gathered}\] where the last equality follows exactly as in the proof of (ii).
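To make the three properties concrete, here is a minimal numerical sketch using the alternating bilinear form \(f(v,w)=v_1w_2-v_2w_1\) on \(\mathbb{R}^2\) (NumPy assumed; names ours):

```python
import numpy as np

rng = np.random.default_rng(3)

def f(v, w):
    # an alternating bilinear form on R^2
    return v[0]*w[1] - v[1]*w[0]

v, w = rng.standard_normal((2, 2))
s = rng.standard_normal()

# (i) swapping the arguments flips the sign
assert np.isclose(f(v, w), -f(w, v))

# (ii) linearly dependent arguments give 0
assert np.isclose(f(v, s*v), 0.0)

# (iii) adding a multiple of the other argument changes nothing
assert np.isclose(f(v + s*w, w), f(v, w))
```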
The alternating property of an \(n\)-multilinear map \(f_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) together with the condition \(f_n(\mathbf{1}_{n})=1\) uniquely determines the value of \(f_n\) on the elementary matrices:
Let \(n \in \mathbb{N}\) and \(f_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) an alternating \(n\)-multilinear map satisfying \(f_n(\mathbf{1}_{n})=1.\) Then for all \(1\leqslant k,l\leqslant n\) with \(k\neq l\) and all \(s\in \mathbb{K},\) we have \[\tag{5.3} f_n(\mathbf{D}_k(s))=s,\qquad f_n(\mathbf{L}_{k,l}(s))=1, \qquad f_n(\mathbf{P}_{k,l})=-1.\] Moreover, for \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) and an elementary matrix \(\mathbf{B}\) of size \(n,\) we have \[\tag{5.4} f_n(\mathbf{B}\mathbf{A})=f_n(\mathbf{B})f_n(\mathbf{A}).\]
Proof. Recall that \(\mathbf{D}_k(s)\) applied to a square matrix \(\mathbf{A}\) multiplies the \(k\)-th row of \(\mathbf{A}\) by \(s\) and leaves \(\mathbf{A}\) unchanged otherwise. We write \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) as \(\mathbf{A}=\Omega(\vec{\alpha}_1,\ldots,\vec{\alpha}_n)\) for \(\vec{\alpha}_{i} \in \mathbb{K}_n,\) \(1\leqslant i\leqslant n.\) Hence we obtain \[\mathbf{D}_k(s)\mathbf{A}=\Omega(\vec{\alpha}_1,\ldots,\vec{\alpha}_{k-1},s\vec{\alpha}_k,\vec{\alpha}_{k+1},\ldots,\vec{\alpha}_n).\] The linearity of \(f_n\) in the \(k\)-th row thus gives \(f_n(\mathbf{D}_k(s)\mathbf{A})=sf_n(\mathbf{A}).\) In particular, the choice \(\mathbf{A}=\mathbf{1}_{n}\) together with \(f_n(\mathbf{1}_{n})=1\) implies that \(f_n(\mathbf{D}_k(s))=f_n(\mathbf{D}_k(s)\mathbf{1}_{n})=sf_n(\mathbf{1}_{n})=s.\) Therefore, we have \[f_n(\mathbf{D}_k(s)\mathbf{A})=f_n(\mathbf{D}_k(s))f_n(\mathbf{A}).\]
Next, recall that \(\mathbf{L}_{k,l}(s)\) applied to \(\mathbf{A}\) adds \(s\) times the \(l\)-th row of \(\mathbf{A}\) to the \(k\)-th row and leaves the other rows unchanged, so that \[\mathbf{L}_{k,l}(s)\mathbf{A}=\Omega(\vec{\alpha}_1,\ldots,\vec{\alpha}_{k-1},\vec{\alpha}_k+s\vec{\alpha}_l,\vec{\alpha}_{k+1},\ldots,\vec{\alpha}_n).\] By part (iii) of the lemma above, adding a multiple of another row does not change the value of \(f_n,\) hence \(f_n(\mathbf{L}_{k,l}(s)\mathbf{A})=f_n(\mathbf{A}).\) In particular, the choice \(\mathbf{A}=\mathbf{1}_{n}\) gives \(f_n(\mathbf{L}_{k,l}(s))=f_n(\mathbf{1}_{n})=1.\)
Therefore, we have \[f_n(\mathbf{L}_{k,l}(s)\mathbf{A})=f_n(\mathbf{L}_{k,l}(s))f_n(\mathbf{A}).\]
Finally, recall that \(\mathbf{P}_{k,l}\) applied to \(\mathbf{A}\) interchanges the \(k\)-th and the \(l\)-th row of \(\mathbf{A},\) so that \[\mathbf{P}_{k,l}\mathbf{A}=\Omega(\vec{\alpha}_1,\ldots,\vec{\alpha}_{k-1},\vec{\alpha}_l,\vec{\alpha}_{k+1},\ldots,\vec{\alpha}_{l-1},\vec{\alpha}_k,\vec{\alpha}_{l+1},\ldots,\vec{\alpha}_n).\] By part (i) of the lemma above, interchanging two arguments changes the sign, hence \(f_n(\mathbf{P}_{k,l}\mathbf{A})=-f_n(\mathbf{A}).\) In particular, the choice \(\mathbf{A}=\mathbf{1}_{n}\) gives \(f_n(\mathbf{P}_{k,l})=-f_n(\mathbf{1}_{n})=-1.\)
Therefore, we have \(f_n(\mathbf{P}_{k,l}\mathbf{A})=f_n(\mathbf{P}_{k,l})f_n(\mathbf{A}),\) as claimed.
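Anticipating the existence result of the next section and taking on faith that NumPy's `np.linalg.det` computes this determinant, the following sketch checks (5.3) and (5.4) numerically; the constructors `D`, `L`, `P` for the elementary matrices are our own helpers and use 0-based row indices:

```python
import numpy as np

n, k, l, s = 4, 1, 3, 2.5   # 0-based row indices k, l for the helpers

def D(k, s):                 # multiply row k by s
    M = np.eye(n); M[k, k] = s; return M

def L(k, l, s):              # add s times row l to row k
    M = np.eye(n); M[k, l] = s; return M

def P(k, l):                 # swap rows k and l
    M = np.eye(n); M[[k, l]] = M[[l, k]]; return M

det = np.linalg.det
assert np.isclose(det(D(k, s)), s)       # f_n(D_k(s)) = s
assert np.isclose(det(L(k, l, s)), 1.0)  # f_n(L_{k,l}(s)) = 1
assert np.isclose(det(P(k, l)), -1.0)    # f_n(P_{k,l}) = -1

A = np.random.default_rng(4).standard_normal((n, n))
for B in (D(k, s), L(k, l, s), P(k, l)):
    assert np.isclose(det(B @ A), det(B) * det(A))   # (5.4)
```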
The uniqueness part of the theorem is now an easy consequence:
Let \(n\in \mathbb{N}\) and \(f_n,\hat{f}_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) be alternating \(n\)-multilinear maps satisfying \(f_n(\mathbf{1}_{n})=\hat{f}_n(\mathbf{1}_{n})=1.\) Then \(f_n=\hat{f}_n.\)
Proof. Recall that for a square matrix \(\mathbf{A}\in M_{n,n}(\mathbb{K}),\) the following statements are equivalent:
\(\mathbf{A}\) is invertible;
the row vectors of \(\mathbf{A}\) are linearly independent;
the column vectors of \(\mathbf{A}\) are linearly independent.
Suppose first that \(\mathbf{A}\) is not invertible. Then the row vectors of \(\mathbf{A}\) are linearly dependent, and part (ii) of the lemma above yields \(f_n(\mathbf{A})=0=\hat{f}_n(\mathbf{A}).\)
Now suppose that \(\mathbf{A}\) is invertible. Using Gauss–Jordan elimination, we obtain \(N \in \mathbb{N}\) and a sequence of elementary matrices \(\mathbf{B}_1,\ldots,\mathbf{B}_N\) so that \(\mathbf{B}_N\cdots \mathbf{B}_1=\mathbf{A}.\) We compute \[\begin{aligned} f_n(\mathbf{A})&=f_n(\mathbf{B}_N\cdots \mathbf{B}_1)=f_n(\mathbf{B}_N)f_n(\mathbf{B}_{N-1}\cdots \mathbf{B}_1)=\hat{f}_n(\mathbf{B}_N)f_n(\mathbf{B}_{N-1}\cdots \mathbf{B}_1), \end{aligned}\] where the second equality uses (5.4) and the third equality uses that (5.3) implies that \(\hat{f}_n(\mathbf{B})=f_n(\mathbf{B})\) for all elementary matrices \(\mathbf{B}.\) Proceeding in this fashion we get \[\begin{aligned} f_n(\mathbf{A})&=\hat{f}_n(\mathbf{B}_N)\hat{f}_n(\mathbf{B}_{N-1})\cdots\hat{f}_n(\mathbf{B}_1)=\hat{f}_n(\mathbf{B}_N)\hat{f}_n(\mathbf{B}_{N-1})\cdots\hat{f}_n(\mathbf{B}_2\mathbf{B}_1)=\cdots \\ &=\hat{f}_n(\mathbf{B}_N\mathbf{B}_{N-1}\cdots \mathbf{B}_1)=\hat{f}_n(\mathbf{A}). \end{aligned}\] Since \(\mathbf{A}\) was arbitrary, we conclude \(f_n=\hat{f}_n.\)
5.3 Existence of the determinant
It turns out that we can define the determinant recursively in terms of the determinants of certain submatrices. Determinants of submatrices are called minors. To this end we first define:
Let \(n \in \mathbb{N}.\) For a square matrix \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) and \(1\leqslant k,l\leqslant n\) we denote by \(\mathbf{A}^{(k,l)}\) the \((n-1)\times(n-1)\) submatrix obtained by removing the \(k\)-th row and \(l\)-th column from \(\mathbf{A}.\)
\[\mathbf{A}=\begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad \mathbf{A}^{(1,1)}=(d), \qquad \mathbf{A}^{(2,1)}=(b).\] \[\mathbf{A}=\begin{pmatrix} 1 & -2 & 0 & 4 \\ 3 & 1 & 1 & 0 \\ -1 & -5 & -1 & 8 \\ 3 & 8 & 2 & -12 \end{pmatrix}, \qquad \mathbf{A}^{(3,2)}=\begin{pmatrix} 1 & 0 & 4 \\ 3 & 1 & 0 \\ 3 & 2 & -12\end{pmatrix}.\]
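In code, forming \(\mathbf{A}^{(k,l)}\) amounts to deleting one row and one column; a small sketch (NumPy assumed, the helper name `submatrix` is ours) reproducing the second example above:

```python
import numpy as np

def submatrix(A, k, l):
    # A^(k,l): delete row k and column l (1-based indices as in the text)
    return np.delete(np.delete(A, k - 1, axis=0), l - 1, axis=1)

A = np.array([[ 1, -2,  0,   4],
              [ 3,  1,  1,   0],
              [-1, -5, -1,   8],
              [ 3,  8,  2, -12]])

print(submatrix(A, 3, 2))
# [[  1   0   4]
#  [  3   1   0]
#  [  3   2 -12]]
```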
We use induction to prove the existence of the determinant:
Let \(n \in \mathbb{N}\) with \(n\geqslant 2\) and \(f_{n-1} : M_{n-1,n-1}(\mathbb{K}) \to \mathbb{K}\) an alternating \((n-1)\)-multilinear mapping satisfying \(f_{n-1}(\mathbf{1}_{n-1})=1.\) Then, for any fixed integer \(l\) with \(1\leqslant l\leqslant n,\) the mapping \[f_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}, \quad \mathbf{A}\mapsto \sum_{k=1}^n(-1)^{l+k}[\mathbf{A}]_{kl}f_{n-1}\left(\mathbf{A}^{(k,l)}\right)\] is alternating, \(n\)-multilinear and satisfies \(f_n(\mathbf{1}_{n})=1.\)
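Before turning to the proof, it may help to see the recursion transcribed directly into code; the following sketch (names ours, not an efficient algorithm) also checks numerically that every choice of column \(l\) yields the same value, in line with the uniqueness result:

```python
import numpy as np

def submatrix(A, k, l):
    # delete row k and column l (1-based indices as in the text)
    return np.delete(np.delete(A, k - 1, axis=0), l - 1, axis=1)

def f(A, l=1):
    # expansion along the l-th column, as in the proposition (1-based l)
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    return sum((-1) ** (l + k) * A[k - 1, l - 1] * f(submatrix(A, k, l))
               for k in range(1, n + 1))

A = np.random.default_rng(5).standard_normal((4, 4))
# every column l gives the same value, and it agrees with np.linalg.det
assert all(np.isclose(f(A, l), np.linalg.det(A)) for l in range(1, 5))
```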
Note that for \(n=1,\) the mapping \(f_1 : M_{1,1}(\mathbb{K}) \to \mathbb{K},\) \((a)\mapsto a,\) is linear, satisfies \(f_1(\mathbf{1}_{1})=1,\) and the alternating condition is vacuous. Starting from \(f_1\) and applying the proposition repeatedly, we therefore obtain for every \(n \in \mathbb{N}\) an alternating \(n\)-multilinear map \(f_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) with \(f_n(\mathbf{1}_{n})=1.\) Together with the uniqueness proposition of the previous section, this proves the theorem.
Proof.
Step 1. We first show that \(f_n(\mathbf{1}_{n})=1.\) Since \([\mathbf{1}_{n}]_{kl}=\delta_{kl},\) we obtain \[f_n(\mathbf{1}_{n})=\sum_{k=1}^n(-1)^{l+k}[\mathbf{1}_{n}]_{kl}f_{n-1}\left(\mathbf{1}_{n}^{(k,l)}\right)=(-1)^{2l}f_{n-1}\left(\mathbf{1}_{n}^{(l,l)}\right)=f_{n-1}\left(\mathbf{1}_{n-1}\right)=1,\] where we use that \(\mathbf{1}_{n}^{(l,l)}=\mathbf{1}_{n-1}\) and \(f_{n-1}(\mathbf{1}_{n-1})=1.\)
Step 2. We show that \(f_n\) is multilinear. Let \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) and write \(\mathbf{A}=(A_{kj})_{1\leqslant k,j\leqslant n}.\) We first show that \(f_n\) is \(1\)-homogeneous in each row. Say we multiply the \(i\)-th row of \(\mathbf{A}\) with \(s\) so that we obtain a new matrix \(\hat{\mathbf{A}}=(\hat{A}_{kj})_{1\leqslant k,j\leqslant n}\) with \[\hat{A}_{kj}=\left\{\begin{array}{cc} A_{kj}, & k\neq i,\\ sA_{kj}, & k=i.\end{array}\right.\] We need to show that \(f_n(\hat{\mathbf{A}})=sf_n(\mathbf{A}).\) We compute \[\begin{aligned} f_n(\hat{\mathbf{A}})&=\sum_{k=1}^n(-1)^{l+k}\hat{A}_{kl}f_{n-1}(\hat{\mathbf{A}}^{(k,l)})\\ &=(-1)^{l+i}sA_{il}f_{n-1}(\hat{\mathbf{A}}^{(i,l)})+\sum_{k=1,k\neq i}^n(-1)^{l+k}A_{kl}f_{n-1}(\hat{\mathbf{A}}^{(k,l)}). \end{aligned}\] Now notice that \(\hat{\mathbf{A}}^{(i,l)}=\mathbf{A}^{(i,l)},\) since \(\mathbf{A}\) and \(\hat{\mathbf{A}}\) only differ in the \(i\)-th row, but this is the row that is removed. Since \(f_{n-1}\) is \(1\)-homogeneous in each row, we obtain that \(f_{n-1}(\hat{\mathbf{A}}^{(k,l)})=sf_{n-1}(\mathbf{A}^{(k,l)})\) whenever \(k \neq i.\) Thus we have \[\begin{aligned} f_n(\hat{\mathbf{A}})&=s(-1)^{l+i}A_{il}f_{n-1}(\mathbf{A}^{(i,l)})+s\sum_{k=1,k\neq i}^n(-1)^{l+k}A_{kl}f_{n-1}(\mathbf{A}^{(k,l)})\\ &=s\sum_{k=1}^n(-1)^{l+k}A_{kl}f_{n-1}\left(\mathbf{A}^{(k,l)}\right)=sf_n(\mathbf{A}). \end{aligned}\] We now show that \(f_n\) is additive in each row. Say the matrix \(\mathbf{B}=(B_{kj})_{1\leqslant k,j\leqslant n}\) is identical to the matrix \(\mathbf{A},\) except for the \(i\)-th row, so that \[B_{kj}=\left\{\begin{array}{cc} A_{kj}, & k\neq i,\\ B_{j}, & k=i,\end{array}\right.\] for some scalars \(B_j\) with \(1\leqslant j\leqslant n.\) We need to show that \(f_n(\mathbf{C})=f_n(\mathbf{A})+f_n(\mathbf{B}),\) where \(\mathbf{C}=(C_{kj})_{1\leqslant k,j\leqslant n}\) with \[C_{kj}=\left\{\begin{array}{cc} A_{kj}, & k\neq i,\\ A_{ij}+B_{j}, & k=i.\end{array}\right.\] We compute \[f_n(\mathbf{C})=(-1)^{l+i}(A_{il}+B_l)f_{n-1}(\mathbf{C}^{(i,l)})+\sum_{k=1,k\neq i}^n(-1)^{l+k}A_{kl}f_{n-1}(\mathbf{C}^{(k,l)}).\] As before, since \(\mathbf{A},\mathbf{B},\mathbf{C}\) only differ in the \(i\)-th row, we have \(\mathbf{A}^{(i,l)}=\mathbf{B}^{(i,l)}=\mathbf{C}^{(i,l)}.\) Using that \(f_{n-1}\) is linear in each row, we thus obtain \[\begin{gathered} f_n(\mathbf{C})=(-1)^{l+i}B_lf_{n-1}(\mathbf{B}^{(i,l)})+\sum_{k=1,k\neq i}^n(-1)^{l+k}A_{kl}f_{n-1}(\mathbf{B}^{(k,l)})\\ +(-1)^{l+i}A_{il}f_{n-1}(\mathbf{A}^{(i,l)})+\sum_{k=1,k\neq i}^n(-1)^{l+k}A_{kl}f_{n-1}(\mathbf{A}^{(k,l)})=f_n(\mathbf{A})+f_n(\mathbf{B}). \end{gathered}\]
Step 3. Finally, we show that \(f_n\) is alternating. Suppose the \(p\)-th and \(q\)-th rows of \(\mathbf{A}\) agree, where \(1\leqslant p<q\leqslant n.\) For every \(k\notin\{p,q\},\) the submatrix \(\mathbf{A}^{(k,l)}\) still contains both of these equal rows, so \(f_{n-1}(\mathbf{A}^{(k,l)})=0\) since \(f_{n-1}\) is alternating. Hence \[f_n(\mathbf{A})=(-1)^{l+p}A_{pl}f_{n-1}(\mathbf{A}^{(p,l)})+(-1)^{l+q}A_{ql}f_{n-1}(\mathbf{A}^{(q,l)}).\] The submatrices \(\mathbf{A}^{(p,l)}\) and \(\mathbf{A}^{(q,l)}\) contain the same rows: \(\mathbf{A}^{(q,l)}\) is obtained from \(\mathbf{A}^{(p,l)}\) by moving one row past \(q-p-1\) neighbouring rows. Each such interchange contributes a factor \(-1\) by part (i) of the lemma from the previous section, so \(f_{n-1}(\mathbf{A}^{(q,l)})=(-1)^{q-p-1}f_{n-1}(\mathbf{A}^{(p,l)}).\) Together with \(A_{pl}=A_{ql}\) we obtain \[f_n(\mathbf{A})=(-1)^{l}A_{pl}f_{n-1}(\mathbf{A}^{(p,l)})\left((-1)^{p}+(-1)^{q}(-1)^{q-p-1}\right)=0,\] since \((-1)^{q}(-1)^{q-p-1}=(-1)^{2q-p-1}=-(-1)^{p}.\) This completes the proof.
For \(n=2\) and choosing \(l=1,\) we obtain \[\det\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right)=a\det\left(\mathbf{A}^{(1,1)}\right)-c\det\left(\mathbf{A}^{(2,1)}\right)=ad-cb,\] in agreement with (5.1). For \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant 3} \in M_{3,3}(\mathbb{K})\) and choosing \(l=3\) we obtain \[\begin{gathered} \det\left(\begin{pmatrix}A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix}\right)=A_{13} \det\left(\begin{pmatrix} A_{21} & A_{22} \\ A_{31} & A_{32} \end{pmatrix}\right)\\-A_{23}\det\left(\begin{pmatrix} A_{11} & A_{12} \\ A_{31} & A_{32}\end{pmatrix}\right)+A_{33} \det\left(\begin{pmatrix}A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}\right)\\ \end{gathered}\] so that \[\begin{aligned} \det \mathbf{A}&=A_{13}(A_{21}A_{32}-A_{31}A_{22})-A_{23}(A_{11}A_{32}-A_{31}A_{12})\\ &\phantom{=}+A_{33}(A_{11}A_{22}-A_{21}A_{12})\\ &=A_{11}A_{22}A_{33}-A_{11}A_{23}A_{32}-A_{12}A_{21}A_{33}\\ &\phantom{=}+A_{12}A_{23}A_{31}+A_{13}A_{21}A_{32}-A_{13}A_{22}A_{31}. \end{aligned}\]
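As a quick numerical check (NumPy assumed), the six-term formula above agrees with `np.linalg.det` on a random matrix:

```python
import numpy as np

A = np.random.default_rng(6).standard_normal((3, 3))

# the six-term expansion derived above (indices shifted to 0-based)
expansion = (A[0,0]*A[1,1]*A[2,2] - A[0,0]*A[1,2]*A[2,1]
             - A[0,1]*A[1,0]*A[2,2] + A[0,1]*A[1,2]*A[2,0]
             + A[0,2]*A[1,0]*A[2,1] - A[0,2]*A[1,1]*A[2,0])

assert np.isclose(expansion, np.linalg.det(A))
```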
Exercises
Let \(V=\mathbb{R}^3\) and \(W=\mathbb{R}.\) Show that the map \[f : V^3 \to W, \quad (\vec{x},\vec{y},\vec{z}) \mapsto (\vec{x}\times\vec{y})\cdot \vec{z}\] is alternating and trilinear.
Solution
We first show that \(f\) is trilinear. By definition of \(\times,\) we have \(\vec y\times \vec x = -(\vec x\times \vec y)\) and hence \(f(\vec y,\vec x,\vec z) = -f(\vec x,\vec y,\vec z)\) for all \(\vec x,\vec y, \vec z\in \mathbb{R}^3;\) linearity in the second slot therefore follows from linearity in the first, so we verify linearity in the first and third slots only. Let \(s,t\in\mathbb{R}\) and \(\vec x,\vec x_1,\vec x_2,\vec y,\vec z,\vec z_1,\vec z_2\in\mathbb{R}^3.\) \[\begin{aligned} f(s\vec x_1+t\vec x_2,\vec y,\vec z) & = \left((s\vec x_1 + t\vec x_2)\times \vec y\right)\cdot \vec z \\ & = \left(s(\vec x_1\times \vec y)+t(\vec x_2\times \vec y)\right)\cdot \vec z \\ & = s(\vec x_1\times \vec y)\cdot \vec z + t(\vec x_2\times \vec y)\cdot \vec z\\ & = sf(\vec x_1,\vec y,\vec z) + tf(\vec x_2,\vec y,\vec z),\end{aligned}\] where we use distributivity of \(\times\) and \(\cdot\) over \(+.\)
\[\begin{aligned} f(\vec x,\vec y,s\vec z_1+t\vec z_2) & = (\vec x\times \vec y)\cdot(s\vec z_1+t\vec z_2) \\ & = s(\vec x\times \vec y)\cdot \vec z_1 + t(\vec x\times \vec y)\cdot \vec z_2, \end{aligned}\] where we use the linearity of \(\cdot\) in the second slot.
We are left to show that \(f\) is alternating: Note that \(f(\vec x,\vec x, \vec z)=0\) by definition of the cross product. Since \(f(\vec y,\vec x,\vec z) = -f(\vec x,\vec y,\vec z),\) it is enough to show that \(f(\vec x,\vec y,\vec y)= 0\) (the remaining case follows via \(f(\vec x,\vec y,\vec x)=-f(\vec y,\vec x,\vec x)\)): \[\begin{aligned} \left(\begin{pmatrix}x_1\\ x_2 \\ x_3\end{pmatrix}\times \begin{pmatrix}y_1\\ y_2 \\ y_3\end{pmatrix}\right) \cdot \begin{pmatrix}y_1\\ y_2 \\ y_3\end{pmatrix} & = \begin{pmatrix}x_2y_3-x_3y_2\\ x_3y_1-x_1y_3\\ x_1y_2-x_2y_1\end{pmatrix}\cdot \begin{pmatrix}y_1\\ y_2 \\ y_3\end{pmatrix}. \end{aligned}\] Evaluating the dot product yields \[\color{blue}y_1x_2y_3\color{black}-\color{red}y_1y_2x_3\color{black}+\color{red}y_1y_2x_3\color{black}-\color{green}x_1y_2y_3\color{black}+\color{green}x_1y_2y_3\color{black}-\color{blue}y_1x_2y_3\] and this expression equals zero since terms with the same color cancel each other.
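Finally, a small numerical sketch of the exercise itself (NumPy assumed; `np.cross` computes the cross product):

```python
import numpy as np

rng = np.random.default_rng(7)

def f(x, y, z):
    # the scalar triple product (x × y) · z
    return np.cross(x, y) @ z

x, y, z = rng.standard_normal((3, 3))
s, t = rng.standard_normal(2)

# trilinear in the first slot (the other slots are analogous)
assert np.isclose(f(s*x + t*y, y, z), s*f(x, y, z) + t*f(y, y, z))

# alternating: repeated arguments give 0
for args in ((x, x, z), (x, y, y), (x, y, x)):
    assert np.isclose(f(*args), 0.0)
```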