1 Fundamental notions
We start by introducing some notions that are fundamental for the study of curved spaces. Doing so will lead to a deeper understanding of some concepts from Linear Algebra and Analysis.
1.1 Points, vectors and the tangent space
Recall that we define \(\mathbb{R}^n\) as the set of ordered \(n\)-tuples \(p=(x_1,\ldots,x_n)\) of scalars \(x_i \in \mathbb{R},\) \(1\leqslant i\leqslant n.\) We also consider column vectors of length \(n\) with real entries \[\vec{v}=\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.\] We write \(M_{m,n}(\mathbb{R})\) for the set of \((m\times n)\)-matrices with real entries. A column vector of length \(n\) may be thought of as an \((n\times 1)\)-matrix, hence we write \(M_{n,1}(\mathbb{R})\) for the set of such column vectors. Clearly we have a bijective map \[\Psi_n : \mathbb{R}^n \to M_{n,1}(\mathbb{R}), \quad (x_1,\ldots,x_n) \mapsto \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}\] which writes the entries of an \(n\)-tuple into a column vector. Because of this map, we may avoid a distinction between \(\mathbb{R}^n\) and \(M_{n,1}(\mathbb{R})\) and pretend they are the same thing. This is what was done in Linear Algebra. In geometry, however, it turns out to be useful to think of \(\mathbb{R}^n\) and \(M_{n,1}(\mathbb{R})\) as different sets. The elements of \(\mathbb{R}^n\) are interpreted as points and will be denoted by \(p,q,r,\ldots\) The elements of \(M_{n,1}(\mathbb{R})\) are interpreted as vectors in \(\mathbb{R}^n\) that are attached to the origin \(0_{\mathbb{R}^n}=(0,0,\ldots,0)\in \mathbb{R}^n.\) They will be denoted by \(\vec{u},\vec{v},\vec{w},\ldots\)
Already in elementary geometry we encounter vectors in \(\mathbb{R}^n\) that are attached not to the origin \(0_{\mathbb{R}^n}\) but to some other point \(p \in \mathbb{R}^n.\) Think for instance of the normal vector of a plane in \(\mathbb{R}^3\) not containing the origin \(0_{\mathbb{R}^3}.\)
In order to deal with vectors that are attached not to the origin but to a point \(p \in \mathbb{R}^n,\) we introduce the so-called tangent space of \(\mathbb{R}^n\) at \(p,\) \[T_{p}\mathbb{R}^n=\left\{\vec{v}_{p}\,|\, \vec{v} \in M_{n,1}(\mathbb{R})\right\}.\] The element \(\vec{v}_{p} \in T_p\mathbb{R}^n\) is to be interpreted as attaching the vector \(\vec{v} \in M_{n,1}(\mathbb{R})\) at the basepoint \(p \in \mathbb{R}^n.\) The elements of \(T_p\mathbb{R}^n\) are called tangent vectors with basepoint \(p\).

Observe that for all \(p \in \mathbb{R}^n\) the tangent space \(T_p\mathbb{R}^n\) is a vector space over \(\mathbb{R}\) when equipped with vector addition \(+_{T_p\mathbb{R}^n} : {T_p\mathbb{R}^n} \times {T_p\mathbb{R}^n} \to {T_p\mathbb{R}^n}\) defined by the rule \[\vec{v}_{p}+_{T_p\mathbb{R}^n}\vec{w}_{p}=(\vec{v}+_{M_{n,1}(\mathbb{R})}\vec{w})_{p}\] for all \(\vec{v}_{p}, \vec{w}_{p} \in T_p\mathbb{R}^n\) and scalar multiplication \(\cdot_{T_p\mathbb{R}^n} : \mathbb{R}\times {T_p\mathbb{R}^n} \to {T_p\mathbb{R}^n}\) defined by the rule \[s\cdot_{T_p\mathbb{R}^n}\vec{v}_{p}=(s\cdot_{M_{n,1}(\mathbb{R})}\vec{v})_{p}\] for all \(s \in \mathbb{R}\) and all \(\vec{v}_{p} \in T_p\mathbb{R}^n.\) Here \(+_{M_{n,1}(\mathbb{R})}\) denotes the usual component-wise addition of column vectors and \(\cdot_{M_{n,1}(\mathbb{R})}\) the usual component-wise scalar multiplication of a column vector by a scalar.

Clearly, for all \(p \in \mathbb{R}^n\) we have a vector space isomorphism \[T_p\mathbb{R}^n \to M_{n,1}(\mathbb{R}), \quad \vec{v}_{p} \mapsto \vec{v}\] which simply “forgets” the basepoint \(p \in \mathbb{R}^n.\) We can thus think of \(T_p\mathbb{R}^n\) as a copy of \(M_{n,1}(\mathbb{R})\) attached to \(p \in \mathbb{R}^n.\) The union of all these copies is known as the tangent bundle of \(\mathbb{R}^n\) \[T\mathbb{R}^n=\bigcup_{p \in \mathbb{R}^n}T_p\mathbb{R}^n=\bigcup_{p \in \mathbb{R}^n}\left\{\vec{v}_{p}\,|\, \vec{v} \in M_{n,1}(\mathbb{R})\right\}.\] At this point the name tangent space is a bit confusing, since it is unclear what \(T_p\mathbb{R}^n\) is tangent to. This will be clarified later on. If \(U\subset \mathbb{R}^n\) is an open subset, we define likewise \[TU=\bigcup_{p \in U}T_p\mathbb{R}^n.\]

Observe that for each \(p \in \mathbb{R}^n\) the tangent space \(T_p\mathbb{R}^n\) is equipped with an ordered basis \[\mathbf{e}_p^{(n)}=\big([\vec{e}_1]_{p},\ldots,[\vec{e}_n]_{p}\big),\] where \(\{\vec{e}_1,\ldots,\vec{e}_n\}\) denotes the standard basis of \(M_{n,1}(\mathbb{R}).\) For all \(p \in \mathbb{R}^n\) we call \(\mathbf{e}^{(n)}_p\) the ordered standard basis of \(T_p\mathbb{R}^n.\)
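For instance, with the point \(p=(1,2)\in\mathbb{R}^2\) chosen purely for illustration, the vector space operations on \(T_p\mathbb{R}^2\) give \[\begin{pmatrix} 1 \\ 0 \end{pmatrix}_{p}+_{T_p\mathbb{R}^2}\begin{pmatrix} 3 \\ -1 \end{pmatrix}_{p}=\begin{pmatrix} 4 \\ -1 \end{pmatrix}_{p} \qquad\text{and}\qquad 2\cdot_{T_p\mathbb{R}^2}\begin{pmatrix} 3 \\ -1 \end{pmatrix}_{p}=\begin{pmatrix} 6 \\ -2 \end{pmatrix}_{p},\] whereas an expression such as \(\vec{v}_p+\vec{w}_q\) for \(p\neq q\) is left undefined, since the two tangent vectors lie in different tangent spaces. Note also that every \(\vec{v}_p \in T_p\mathbb{R}^n\) with entries \(v_1,\ldots,v_n\) decomposes in the ordered standard basis as \(\vec{v}_p=v_1\cdot[\vec{e}_1]_{p}+\cdots+v_n\cdot[\vec{e}_n]_{p}.\)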
Whenever \(n\) is clear from the context we simply write \(\mathbf{e}_p\) instead of \(\mathbf{e}^{(n)}_p.\)
Since \(M_{1,1}(\mathbb{R})\) is one-dimensional, so is \(T_t\mathbb{R}\) for all \(t \in \mathbb{R},\) and the ordered standard basis of \(T_t\mathbb{R}\) consists of the single tangent vector obtained by attaching the column vector \((1)\) at \(t,\) which we denote by \(1_{t}.\)
An inner product on a vector space \(V\) over \(\mathbb{R}\) is a positive definite symmetric bilinear form \(\langle\cdot{,}\cdot\rangle: V\times V \to \mathbb{R}.\)
For all \(p \in \mathbb{R}^n,\) the standard inner product on \(T_p\mathbb{R}^n\) is the unique inner product \(\langle\cdot{,}\cdot\rangle_p\) for which \(\mathbf{e}_p\) is an orthonormal basis, that is, we have \[\langle [\vec{e}_i]_{p},[\vec{e}_j]_{p}\rangle_p=\delta_{ij}=\left\{\begin{array}{ll} 1, & i=j, \\ 0, & i\neq j.\end{array}\right.\]
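Concretely, bilinearity together with the orthonormality of \(\mathbf{e}_p\) yields the familiar coordinate formula: for tangent vectors \(\vec{v}_p,\vec{w}_p \in T_p\mathbb{R}^n\) with entries \(v_1,\ldots,v_n\) and \(w_1,\ldots,w_n\) we have \[\langle \vec{v}_p,\vec{w}_p\rangle_p=\Big\langle \sum_{i=1}^n v_i\cdot[\vec{e}_i]_{p},\,\sum_{j=1}^n w_j\cdot[\vec{e}_j]_{p}\Big\rangle_p=\sum_{i,j=1}^n v_iw_j\,\delta_{ij}=\sum_{i=1}^n v_iw_i.\]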
We will henceforth always assume that \(T_p\mathbb{R}^n\) is equipped with \(\langle\cdot{,}\cdot\rangle_p.\) Whenever no confusion can arise about the point \(p\) at which \(\langle\cdot{,}\cdot\rangle_p\) is computed, we will usually simply write \(\langle\cdot{,}\cdot\rangle.\)
1.2 Smooth maps, diffeomorphisms and the differential
We recall some facts from Analysis II, but now with a slightly more geometric perspective.
This is a theorem, not a definition!
Let \(U\subset \mathbb{R}^n\) be open. A map \(f : U \to \mathbb{R}^m\) is called continuously differentiable if for all \(1\leqslant i\leqslant n\) the following two conditions hold:
the partial derivative \(\partial_if(p)\) exists for all \(p \in U\);
the map \(\partial_i f : U \to \mathbb{R}^m,\) \(p \mapsto \partial_if(p)\) is continuous.
Recursively, we can define higher derivatives. For \(k \in \mathbb{N},\) \(k \geqslant 2,\) we call \(f : U \to \mathbb{R}^m\) \(k\)-times continuously differentiable if \(\partial_i f : U \to \mathbb{R}^m\) is \((k-1)\)-times continuously differentiable for all \(1\leqslant i\leqslant n.\) We write \[C^k(U,\mathbb{R}^m)=\left\{f : U \to \mathbb{R}^m\, |\, f \text{ is }k\text{-times continuously differentiable}\right\}\] and set \[C^{\infty}(U,\mathbb{R}^m)=\bigcap_{k \in \mathbb{N}}C^k(U,\mathbb{R}^m).\] The elements of \(C^{\infty}(U,\mathbb{R}^m)\) are called smooth maps from \(U\) to \(\mathbb{R}^m\).
Throughout this module we will almost exclusively consider smooth maps.
It is useful to have a notion of smoothness for maps that are defined on some arbitrary subset \(\mathcal{X}\subset \mathbb{R}^n.\) A map \(f : \mathcal{X} \to \mathbb{R}^m\) is called smooth if there exists an open subset \(U\subset \mathbb{R}^n\) containing \(\mathcal{X}\) and a smooth function \(\hat{f} : U \to \mathbb{R}^m\) so that \(\hat{f}(p)=f(p)\) for all \(p \in \mathcal{X}.\)
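As a simple illustration (with the subset chosen only for this purpose), the map \(f : [0,1] \to \mathbb{R},\) \(f(t)=t^2,\) is smooth in this sense: we may take \(U=\mathbb{R}\supset[0,1]\) and \(\hat{f}(t)=t^2.\) More generally, the restriction of a smooth map defined on an open set to any subset \(\mathcal{X}\) of that open set is smooth in this sense.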
Given an open subset \(U\subset \mathbb{R}^n,\) let \(f : U \to \mathbb{R}^m\) be smooth and write \(f=(f_1,\ldots,f_m)\) with real-valued component functions \(f_i : U \to \mathbb{R}.\)
The differential of \(f\) at \(p \in U\) is the unique linear map \[f_*|_{p} : T_p\mathbb{R}^n \to T_{f(p)}\mathbb{R}^m\] so that for all \[\vec{v}_p=\begin{pmatrix} v_1 \\ \vdots \\ v_n\end{pmatrix}_p \in T_p\mathbb{R}^n,\] we have \[f_*|_p(\vec{v}_p)=\vec{w}_{f(p)}=\begin{pmatrix} w_1 \\ \vdots \\ w_m\end{pmatrix}_{f(p)}\] with \[\tag{1.1} \begin{pmatrix} w_1 \\ \vdots \\ w_m\end{pmatrix}=\begin{pmatrix} \partial_1f_1(p) & \cdots & \partial_n f_1(p) \\ \vdots & \ddots & \vdots \\ \partial_1f_m(p) & \cdots & \partial_n f_m(p)\end{pmatrix}\begin{pmatrix} v_1 \\ \vdots \\ v_n\end{pmatrix}.\]
Recall that the \((m\times n)\)-matrix on the right in (1.1) is called the Jacobian matrix of \(f\) at \(p\). We denote it by \(\mathbf{J}f(p).\)
The map \(f_*|_{p}\) is thus the unique linear map \(T_p\mathbb{R}^n \to T_{f(p)}\mathbb{R}^m\) whose matrix representation with respect to the ordered basis \(\mathbf{e}_p^{(n)}\) of \(T_p\mathbb{R}^n\) and the ordered basis \(\mathbf{e}_{f(p)}^{(m)}\) of \(T_{f(p)}\mathbb{R}^m\) is given by \(\mathbf{J}f(p),\) that is, \[\boxed{\mathbf{M}\left(f_*|_{p},\mathbf{e}_p^{(n)},\mathbf{e}_{f(p)}^{(m)}\right)=\mathbf{J}f(p).}\] In particular, we have \[\boxed{\operatorname{rank}(f_*|_{p})=\operatorname{rank}(\mathbf{J}f(p))}\] for all \(p \in U.\)
For each \(p \in U\) we obtain a linear map \(f_*|_p : T_p \mathbb{R}^n \to T_{f(p)}\mathbb{R}^m.\) It is useful to think of the family \(\{f_*|_p\}_{p \in U}\) of all such linear maps as a single map \[f_* : TU \to T\mathbb{R}^m\] defined by the rule \[f_*(\vec{v}_p)=\vec{w}_{f(p)}, \qquad \text{where} \qquad \vec{w}=\mathbf{J}f(p)\vec{v}.\] That is, for all \(p \in U,\) the restriction of \(f_*\) to \(T_p\mathbb{R}^n \subset TU\) is given by \(f_*|_p.\) The map \(f_* : TU \to T\mathbb{R}^m\) is called the differential of \(f\).
Consider the smooth map \[f :\mathbb{R}^2 \to \mathbb{R}^2,\qquad p=(x,y) \mapsto f(p)=(x^2-y^2,xy).\] For the Jacobian we obtain \[\mathbf{J}f(p)=\begin{pmatrix} 2x & -2y \\ y & x \end{pmatrix}\] and hence for \[\vec{v}_p=\begin{pmatrix} u \\ w \end{pmatrix}_{(x,y)}\] we have \[f_*(\vec{v}_p)=\begin{pmatrix} 2xu-2yw \\ yu+xw\end{pmatrix}_{(x^2-y^2,xy)}.\]
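As a quick check against the rank statement above: here \(\det\mathbf{J}f(p)=2x^2+2y^2,\) which vanishes only at \(p=(0,0).\) Hence \(f_*|_p\) has rank \(2\) and is invertible for all \(p\neq(0,0),\) while \(f_*|_{(0,0)}\) is the zero map and has rank \(0.\)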
Recall that \(\Psi_n : \mathbb{R}^n \to M_{n,1}(\mathbb{R})\) is the map that turns a point into a column vector. We use \(\Psi_n\) to let an \((m\times n)\)-matrix \(\mathbf{A}\) act on points of \(\mathbb{R}^n\) by the rule \[\mathbf{A}p:=\Psi_m^{-1}(\mathbf{A}\Psi_n(p))\] for all \(p \in \mathbb{R}^n,\) where on the right hand side \(\mathbf{A}\) acts on the column vector \(\Psi_n(p)\) by matrix multiplication.
Let \(\mathbf{A}\in M_{m,n}(\mathbb{R}),\) \(b \in \mathbb{R}^m\) and consider the map \[f_{\mathbf{A},b} : \mathbb{R}^n \to \mathbb{R}^m, \qquad p \mapsto \mathbf{A}p+b.\] Then we have \[(f_{\mathbf{A},b})_*(\vec{v}_p)=(\mathbf{A}\vec{v})_{\mathbf{A}p+b}\] for all \(p \in \mathbb{R}^n\) and \(\vec{v}_p \in T_p\mathbb{R}^n.\)
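To verify this, write \(\mathbf{A}=(a_{ij})\) and \(b=(b_1,\ldots,b_m).\) The component functions of \(f_{\mathbf{A},b}\) are \[f_i(p)=\sum_{j=1}^n a_{ij}x_j+b_i, \qquad p=(x_1,\ldots,x_n),\quad 1\leqslant i\leqslant m,\] so that \(\partial_jf_i(p)=a_{ij}\) for all \(p \in \mathbb{R}^n.\) Hence \(\mathbf{J}f_{\mathbf{A},b}(p)=\mathbf{A}\) for every \(p \in \mathbb{R}^n,\) and the claimed formula for \((f_{\mathbf{A},b})_*\) follows directly from the definition of the differential.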
Let \(U \subset \mathbb{R}\) be open and \(f : U \to \mathbb{R}\) a smooth function. We have the usual derivative from Analysis I \[f^{\prime} : U \to \mathbb{R}, \qquad t \mapsto f^{\prime}(t)=\frac{\mathrm{d}f}{\mathrm{d}t}(t)=\lim_{h \to 0}\frac{1}{h}(f(t+h)-f(t)).\] We also have the differential in the sense of Definition 1.5 which is a map \(f_* : TU \to T\mathbb{R}.\) Now notice that for all \(t \in U\) we have \[\tag{1.2} f_*\left(1_{t}\right)=f^{\prime}(t)1_{f(t)}.\] Recommendation: Pause here and think about (1.2) until you understand it.
Diffeomorphisms are smooth maps that are bijective and admit a smooth inverse:
Let \(U\subset \mathbb{R}^n\) and \(V\subset \mathbb{R}^m\) be open sets and \(f : U \to V\) a smooth map. If \(f\) is bijective and \(f^{-1} : V \to U\) is smooth as well, then \(f : U \to V\) is called a diffeomorphism.
Recall from Analysis that if \(f : U \to V\) is a diffeomorphism, then \(n=m\) and moreover, for all \(p \in U\) the linear map \(f_*|_{p} : T_p\mathbb{R}^n \to T_{f(p)}\mathbb{R}^n\) is invertible.
If \(f : U \to \mathbb{R}^m\) is a smooth and injective map, we say \(f\) is a diffeomorphism onto its image, provided the inverse map \(f^{-1} : \operatorname{Im}(f) \to U\) is smooth as well. Here as usual we define \[\operatorname{Im}(f)=f(U)=\{q \in \mathbb{R}^m | q=f(p),\, p \in U\}.\]
From the chain rule in Analysis II we conclude:
Let \(U\subset \mathbb{R}^n\) and \(V\subset \mathbb{R}^m\) be open sets and \(f : U \to \mathbb{R}^m\) and \(g : V \to \mathbb{R}^k\) be smooth maps with \(f(U)\subset V.\) Then \(g\circ f : U \to \mathbb{R}^k\) is smooth and for all \(p \in U\) we have \[\tag{1.3} (g\circ f)_*|_{p}=g_*|_{{}f(p)}\circ f_*|_{p}.\] That is, the differential of the composition \(g\circ f\) at \(p\) is given by the composition of the linear map \(f_*|_{p} : T_p\mathbb{R}^n \to T_{f(p)}\mathbb{R}^m\) and the linear map \(g_*|_{{}f(p)} : T_{f(p)}\mathbb{R}^m \to T_{g(f(p))}\mathbb{R}^k.\)
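In terms of Jacobian matrices, (1.3) is precisely the matrix identity \[\mathbf{J}(g\circ f)(p)=\mathbf{J}g(f(p))\,\mathbf{J}f(p),\] that is, the chain rule from Analysis II expressed via the matrix representations above. Moreover, since the Jacobian of the identity map is the identity matrix, \((\operatorname{id}_U)_*|_p\) is the identity on \(T_p\mathbb{R}^n;\) applying (1.3) to \(f^{-1}\circ f=\operatorname{id}_U\) for a diffeomorphism \(f\) thus gives \((f^{-1})_*|_{f(p)}\circ f_*|_{p}=\operatorname{id}_{T_p\mathbb{R}^n},\) which recovers the invertibility of \(f_*|_{p}\) mentioned earlier.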
The chain rule tells us that compositions of smooth maps are smooth. Sums and products of smooth maps are smooth as well. More precisely:
If \(f,g : U \to \mathbb{R}^m\) are smooth, then so is \(f+_{C^{\infty}(U,\mathbb{R}^m)}g : U \to \mathbb{R}^m,\) where \[(f+_{C^{\infty}(U,\mathbb{R}^m)}g)(p)=f(p)+_{\mathbb{R}^m}g(p)\] for all \(p \in U.\)
If \(f,g : U \to \mathbb{R}\) are smooth, then so is \(f\cdot_{C^{\infty}(U,\mathbb{R})}g : U \to \mathbb{R},\) where \[(f\cdot_{C^{\infty}(U,\mathbb{R})}g)(p)=f(p)\cdot_{\mathbb{R}}g(p)\] for all \(p \in U.\)
1.3 Vector fields and the gradient
A vector field attaches a tangent vector \(\vec{v}_p\) to every point \(p\) of its domain of definition. More precisely:
A vector field on an open subset \(U\subset \mathbb{R}^n\) is a map \(X : U \to T\mathbb{R}^n\) so that \(X(p) \in T_p\mathbb{R}^n\) for all \(p \in U.\) For a vector field \(X : U \to T\mathbb{R}^n\) there exist unique functions \(X_i : U \to \mathbb{R},\) \(1\leqslant i\leqslant n,\) so that \[X(p)=\begin{pmatrix} X_1(p) \\ \vdots \\ X_n(p)\end{pmatrix}_p\] for all \(p \in U.\) The vector field is called smooth if the functions \(X_i\) are smooth for all \(1\leqslant i\leqslant n.\)
Vector fields appear naturally in physics. For instance, the electric and magnetic fields of electromagnetism are examples of vector fields. Likewise, in the classical Newtonian theory of gravity, the gravitational field is an example of a vector field.
Write \(p=(x_1,x_2)\) for an element of \(\mathbb{R}^2.\) Then \[X : \mathbb{R}^2 \to T\mathbb{R}^2, \qquad p=(x_1,x_2) \mapsto \begin{pmatrix} -x_2 \\ x_1 \end{pmatrix}_{p}\] is a smooth vector field on \(\mathbb{R}^2.\)
Every smooth function gives rise to a vector field:
Let \(U\subset \mathbb{R}^n\) be open and \(f : U \to \mathbb{R}\) a smooth function. Then the so-called gradient of \(f,\) defined by \[\operatorname{grad}f : U \to T\mathbb{R}^n, \qquad p \mapsto \begin{pmatrix} \partial_1 f(p) \\ \vdots \\ \partial_n f(p) \end{pmatrix}_p,\] is a smooth vector field on \(U.\)
Consider the smooth function \(f : \mathbb{R}^2 \to \mathbb{R}\) defined by the rule \[f(p)=(x_1)^2+(x_2)^2,\] where we write \(p=(x_1,x_2).\) Then we have \[\operatorname{grad} f(p)=\begin{pmatrix} 2x_1 \\ 2x_2 \end{pmatrix}_p.\]
For an open set \(U \subset \mathbb{R}^n\) and a smooth real-valued function \(f : U \to \mathbb{R},\) show that \[f_*(\vec{v}_p)=\langle \operatorname{grad}f(p),\vec{v}_p\rangle 1_{f(p)}\] for all \(\vec{v}_p \in TU.\)