10.6 Adams-Moulton methods

In the previous section we saw that the three-step Adams-Bashforth method is the unique convergent three-step, third-order explicit linear multistep method with characteristic polynomial \(\rho\) given by \(\rho(w)=w^{3}-w^{2}.\) This characterization is general: for any \(s\ge1,\) the unique convergent \(s\)-step explicit linear multistep method of order \(s\) with characteristic polynomial \(\rho\) given by \(\rho(w)=w^{s}-w^{s-1}\) is the \(s\)-step Adams-Bashforth method.

Recall that in light of Theorem 10.11, \(\rho(w)=w^{s}-w^{s-1}\) is in some sense the best case for \(\rho.\) What if we use this \(\rho\) but allow our method to be implicit? Can we achieve a higher order than \(s\)?
Example 10.15

(One step \(s=1\)). For \(s=1\) steps the Adams-Bashforth method has \(\rho(w)=w-1\) and \(\sigma(w)=1\) (it coincides with Euler's method). It is of order one, i.e. \[\xi-\sigma(1+\xi)\log(1+\xi)=O(\xi^{2}).\tag{10.45}\] If we allow for an implicit method we can add a term to \(\sigma(1+\xi)\) which is linear in \(\xi.\) We make the ansatz \(\sigma(1+\xi)=1+a\xi\) for \(a\) to be determined. Then \[\begin{array}{ccl} \sigma(1+\xi)\ln(1+\xi) & = & (1+a\xi)\ln(1+\xi)\\ & = & \ln(1+\xi)+a\xi\ln(1+\xi)\\ & = & (\xi-\frac{\xi^{2}}{2}+O(\xi^{3}))+a(\xi^{2}+O(\xi^{3}))\\ & = & \xi+\xi^{2}(a-\frac{1}{2})+O(\xi^{3}). \end{array}\] Thus if \(a=\frac{1}{2}\) then the l.h.s. of (10.45) is \(O(\xi^{3}),\) and the corresponding method is of second order.

To write down the method concretely we substitute \(\xi=w-1\) in \(\sigma\) to obtain \[\sigma(w)=1+\frac{w-1}{2}=\frac{1+w}{2}.\] The characteristic polynomials \(\rho(w)=w-1\) and \(\sigma(w)=\frac{1+w}{2}\) correspond to the method \[y_{n+1}-y_{n}=h\frac{F(t_{n},y_{n})+F(t_{n+1},y_{n+1})}{2}.\tag{10.46}\] This coincides with the trapezoidal rule method from Chapter 9. It is an implicit method of order \(2,\) one order more than the corresponding explicit method.
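The following is a minimal sketch of one step of (10.46) in Python, solving the implicit equation by fixed-point iteration; the function name, the Euler predictor used as initial guess and the fixed iteration count are our own choices, not part of the notes.

```python
import numpy as np

def am1_step(F, t_n, y_n, h, num_iters=20):
    """One step of the one-step Adams-Moulton (trapezoidal) method (10.46).

    The implicit equation y_{n+1} = y_n + (h/2)(F(t_n,y_n) + F(t_{n+1},y_{n+1}))
    is solved by fixed-point iteration, which converges for h small enough
    when F is Lipschitz in y.
    """
    t_next = t_n + h
    f_n = F(t_n, y_n)
    y_next = y_n + h * f_n  # explicit Euler predictor as initial guess
    for _ in range(num_iters):
        y_next = y_n + 0.5 * h * (f_n + F(t_next, y_next))
    return y_next

# Example: y' = -y, y(0) = 1; one step approximates e^{-h} with error O(h^3).
print(am1_step(lambda t, y: -y, 0.0, 1.0, 0.1), np.exp(-0.1))
```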

Example 10.16

(Two step \(s=2\)). For \(s=2\) we have \(\rho(w)=w^{2}-w,\) corresponding to \(\rho(1+\xi)=(1+\xi)^{2}-(1+\xi)=\xi+\xi^{2}.\) We seek \(\sigma\) of degree \(2\) such that \[\xi+\xi^{2}-\sigma(1+\xi)\log(1+\xi)=O(\xi^{4}).\] With the ansatz \(\sigma(1+\xi)=a+b\xi+c\xi^{2}\) the l.h.s. equals \[\begin{array}{l} \xi+\xi^{2}-\sigma(1+\xi)\log(1+\xi)\\ =\xi+\xi^{2}-\sigma(1+\xi)\xi+\sigma(1+\xi)\frac{\xi^{2}}{2}-\sigma(1+\xi)\frac{\xi^{3}}{3}+O(\xi^{4})\\ =\xi+\xi^{2}-(a+b\xi+c\xi^{2})\xi+(a+b\xi+O(\xi^{2}))\frac{\xi^{2}}{2}-(a+O(\xi))\frac{\xi^{3}}{3}+O(\xi^{4})\\ =\xi(1-a)+\xi^{2}(1-b+\frac{a}{2})+\xi^{3}\left(-c+\frac{b}{2}-\frac{a}{3}\right)+O(\xi^{4}). \end{array}\] The first three terms vanish if \(a=1,\) \(b=\frac{3}{2}\) and \(c=\frac{3}{4}-\frac{1}{3}=\frac{5}{12},\) corresponding to \[\sigma(1+\xi)=1+\frac{3}{2}\xi+\frac{5}{12}\xi^{2}.\] To write down the corresponding method, substitute \(\xi=w-1\) in \(\sigma\) to obtain \[\sigma(w)=1+\frac{3}{2}(w-1)+\frac{5}{12}(w-1)^{2}=-\frac{1}{12}+\frac{2}{3}w+\frac{5}{12}w^{2}.\] This is the method \[y_{n+2}-y_{n+1}=h\left(\frac{5}{12}F(t_{n+2},y_{n+2})+\frac{2}{3}F(t_{n+1},y_{n+1})-\frac{1}{12}F(t_{n},y_{n})\right).\tag{10.47}\] It is an implicit method of order \(3,\) one order more than the \(2\)-step Adams-Bashforth method.

In general, for every \(s\) there is a unique \(\sigma\) of degree \(s\) such that \[(1+\xi)^{s-1}\xi-\sigma(1+\xi)\log(1+\xi)=O(\xi^{s+2}).\] These methods are called the Adams-Moulton methods. They are in some sense the implicit version of the Adams-Bashforth methods. The one-step Adams-Moulton method is (10.46), of order \(2,\) and the two-step Adams-Moulton method is (10.47), of order \(3.\)
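For any given \(s\) the coefficients of \(\sigma\) can be generated by truncating the series of \((1+\xi)^{s-1}\xi/\log(1+\xi)\) to degree \(s\); here is a small sympy sketch of this computation (the helper name is ours, and we assume sympy is available):

```python
import sympy as sp

def adams_moulton_sigma(s):
    """sigma(w) of the s-step Adams-Moulton method: sigma(1+xi) is the
    truncation to degree s of (1+xi)^(s-1) * xi / log(1+xi), which makes
    (1+xi)^(s-1)*xi - sigma(1+xi)*log(1+xi) = O(xi^(s+2))."""
    xi, w = sp.symbols('xi w')
    expr = (1 + xi)**(s - 1) * xi / sp.log(1 + xi)
    truncated = sp.series(expr, xi, 0, s + 1).removeO()  # terms up to xi^s
    return sp.Poly(sp.expand(truncated.subs(xi, w - 1)), w)

print(adams_moulton_sigma(1))  # (w + 1)/2, cf. (10.46)
print(adams_moulton_sigma(2))  # 5/12*w**2 + 2/3*w - 1/12, cf. (10.47)
```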

10.7 Backward differentiation formulas

Another class of multistep methods is obtained by deciding that one wants \(\sigma(w)=w^{s},\) which corresponds to a method of the form \[\sum_{m=0}^{s}a_{m}y_{n+m}=hF(t_{n+s},y_{n+s}).\] These are in some sense “maximally implicit”, since only the term that makes the method implicit appears on the r.h.s.

The assumption \(\sigma(w)=w^{s}\) corresponds to \(\sigma(1+\xi)=(1+\xi)^{s}.\) The quantity \(\rho(1+\xi)-\sigma(1+\xi)\log(1+\xi)\) from the order conditions then equals \[\rho(1+\xi)-(1+\xi)^{s}\log(1+\xi).\]

Example 10.17

(One step \(s=1\)). For \(s=1\) we have \[(1+\xi)\ln(1+\xi)=\xi+\frac{\xi^{2}}{2}-\frac{\xi^{3}}{6}+\ldots,\] so the maximal order is obtained by setting \(\rho(1+\xi)=\xi.\) This corresponds to \(\rho(w)=w-1,\) i.e. to the method \[y_{n+1}-y_{n}=hF(t_{n+1},y_{n+1}).\tag{10.48}\] This is the backward Euler method, of order \(1.\)

Example 10.18

(Two step \(s=2\)). For \(s=2\) we have \[(1+\xi)^{2}\ln(1+\xi)=\xi+\frac{3}{2}\xi^{2}+\frac{\xi^{3}}{3}+\ldots,\] so the maximal order is obtained when \(\rho(1+\xi)=\xi+\frac{3}{2}\xi^{2}.\) This corresponds to \[\rho(w)=w-1+\frac{3}{2}(w-1)^{2}=\frac{3}{2}w^{2}-2w+\frac{1}{2},\] i.e. \[\frac{3}{2}y_{n+2}-2y_{n+1}+\frac{1}{2}y_{n}=hF(t_{n+2},y_{n+2}).\] Normalizing so that the coefficient of \(y_{n+2}\) is \(1\) this is equivalently written as \[y_{n+2}-\frac{4}{3}y_{n+1}+\frac{1}{3}y_{n}=\frac{2}{3}hF(t_{n+2},y_{n+2}).\tag{10.49}\] This method is of order \(2.\) To determine if the method is convergent we must verify the root condition. Since \[w^{2}-\frac{4}{3}w+\frac{1}{3}=\left(w-1\right)\left(w-\frac{1}{3}\right)\] the roots of \(\rho\) are \(w=1\) and \(w=\frac{1}{3},\) so the root condition is satisfied!

The methods that arise like this from the choice \(\sigma(w)=w^{s}\) are called backward differentiation formulas (BDF). The one-step BDF is (10.48) and the two-step BDF is (10.49). The \(s\)-step method is of order \(s.\) It turns out that for \(s=1,2,3,4,5,6\) the \(\rho\) of the \(s\)-step BDF satisfies the root condition, while for \(s\ge7\) the root condition is unfortunately not satisfied. Thus the method is convergent only for \(s\in\{1,\ldots,6\}\) steps.
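Both the BDF coefficients and the failure of the root condition for \(s\ge7\) can be reproduced numerically. A sketch (the helpers are ours, assuming sympy and numpy), where \(\rho(1+\xi)\) is the truncation to degree \(s\) of \((1+\xi)^{s}\log(1+\xi)\):

```python
import numpy as np
import sympy as sp

def bdf_rho(s):
    """rho(w) of the s-step BDF: truncate (1+xi)^s * log(1+xi) to degree s
    in xi = w - 1 (unnormalized; rescaling does not affect the roots)."""
    xi, w = sp.symbols('xi w')
    truncated = sp.series((1 + xi)**s * sp.log(1 + xi), xi, 0, s + 1).removeO()
    return sp.Poly(sp.expand(truncated.subs(xi, w - 1)), w)

def satisfies_root_condition(poly, tol=1e-9):
    """Rough numerical root-condition test: all roots in the closed unit
    disk, and no repeated roots on the boundary."""
    roots = np.roots([float(coef) for coef in poly.all_coeffs()])
    if np.any(np.abs(roots) > 1 + tol):
        return False
    boundary = roots[np.abs(np.abs(roots) - 1) <= tol]
    for i in range(len(boundary)):
        for j in range(i + 1, len(boundary)):
            if abs(boundary[i] - boundary[j]) <= tol:
                return False
    return True

for s in range(1, 8):
    print(s, satisfies_root_condition(bdf_rho(s)))  # True for s <= 6
```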

10.8 Dahlquist barriers

The \(3\)-step Adams-Bashforth method is of order \(3,\) while its implicit counterpart, the \(3\)-step Adams-Moulton method, is of order \(4.\) The \(3\)-step BDF is of order \(3.\) There exist \(3\)-step methods of higher order. For instance \[\begin{array}{l} y_{n+3}+\frac{27}{11}y_{n+2}-\frac{27}{11}y_{n+1}-y_{n}\\ =h\left(\frac{3}{11}F(t_{n+3},y_{n+3})+\frac{27}{11}F(t_{n+2},y_{n+2})+\frac{27}{11}F(t_{n+1},y_{n+1})+\frac{3}{11}F(t_{n},y_{n})\right) \end{array}\] is a \(3\)-step method of order \(6\)! Unfortunately, it does not satisfy the root condition and is not convergent. Even more unfortunately, this is the case in general, as proved by the next “barrier” theorem, which bounds the highest possible order of a convergent linear \(s\)-step multistep method.

Theorem 10.19 • The First Dahlquist Barrier

The order \(p\) of a convergent \(s\)-step linear multistep method satisfies

  • \(p\le s+2\) if \(s\) is even,

  • \(p\le s+1\) if \(s\) is odd,

  • \(p\le s\) if the method is explicit.

The proof of this remarkable theorem is outside the scope of the course. The Adams-Bashforth \(s\)-step method “achieves the barrier” for explicit methods, since it has order \(s.\) By the theorem no explicit method achieves a higher order, so we don't have to waste effort trying to find clever \(\sigma,\rho\) that give an explicit method of order \(s+1\) or above!

For odd \(s\) the Adams-Moulton method achieves the barrier \(p=s+1\) for implicit methods. For even \(s\) the Adams-Moulton methods have order \(p=s+1,\) and Theorem 10.19 only rules out orders \(p\ge s+3.\) One can in fact achieve order \(p=s+2\) with the Milne-Simpson methods, which are obtained by setting \(\rho(w)=w^{s}-w^{s-2}\) (which satisfies the root condition) and solving for the coefficients of \(\sigma.\)

10.9 Summary

The following table summarizes the multistep methods we have considered in this chapter.

Method | Order | Explicit or implicit | Convergent for
Adams-Bashforth | \(s\) | Explicit | \(1\le s<\infty\)
Adams-Moulton | \(s+1\) | Implicit | \(1\le s<\infty\)
Backward differentiation formula | \(s\) | Implicit | \(1\le s\le6\)

11 Numerics III: Runge-Kutta methods

In this chapter we study a very important class of numerical schemes called Runge-Kutta (RK) methods. Instead of using multiple previous approximations like the multistep methods of the previous chapter, they use multiple intermediate approximations. These intermediate approximations approximate the solution \(y(t)\) for several \(t\in[t_{n},t_{n+1}].\) Like for several other methods we have seen, the starting point of the derivation of the Runge-Kutta methods is an approximation scheme for a definite integral \(\int_{t_{n}}^{t_{n+1}}f(t)dt.\) Applying the scheme to approximate \(\int_{t_{n}}^{t_{n+1}}F(t,y(t))dt\) yields a time-stepping method (where of course \(y\) is the solution to an ODE \(y'=F(t,y)\)). Runge-Kutta methods are somewhat similar in spirit to quadrature rules for approximating definite integrals. As a warm-up, the first section introduces quadrature and derives the especially accurate Gaussian quadrature rules.

11.1 Gaussian quadrature

Quadrature rules are the building blocks of methods for numerically approximating definite integrals. A quadrature rule for an interval \([a,b]\) approximates a definite integral \(\int_{a}^{b}f(t)dt\) by a linear combination, i.e. \[\int_{a}^{b}f(t)dt\approx\sum_{j=1}^{k}b_{j}f(c_{j})\tag{11.1}\] with a fixed and small number \(k\) of summands and fixed parameters \(a\le c_{1}<\ldots<c_{k}\le b,b_{1},\ldots,b_{k}.\) The parameters \(c_{j}\) are called nodes and the parameters \(b_{j}\) weights.

\(\text{ }\)Small intervals. Clearly for say \([a,b]=[0,1]\) and \(k=2\) one can't expect the error in this approximation to be particularly small in general. Arbitrarily small error is achieved by applying the quadrature rule on small intervals of length \(h,\) with \(h\downarrow0.\) Applied to a small interval \([t_{n},t_{n+1}]=[t_{n},t_{n}+h]\) a quadrature rule for \([0,1]\) turns into \[\int_{t_{n}}^{t_{n+1}}f(t)dt\approx h\sum_{j=1}^{k}b_{j}f(t_{n}+c_{j}h),\] and setting \(h=\frac{1}{N}\) and \(t_{n}=nh,n\ge0\) for a large integer \(N\) and summing over \(n\) yields the approximation \[\int_{0}^{1}f(t)dt\approx\sum_{n=0}^{N-1}h\sum_{j=1}^{k}b_{j}f(t_{n}+c_{j}h).\tag{11.2}\] Note that the left Riemann sum approximation arises from the quadrature rule for \([0,1]\) given by \(k=1,c_{1}=0,b_{1}=1,\) and the trapezoidal rule approximation from Chapter 9 arises from the quadrature rule \(k=2,c_{1}=0,c_{2}=1,b_{1}=b_{2}=\frac{1}{2}.\)
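A small numpy sketch of the composite approximation (11.2); the function name and the test integrand are our own:

```python
import numpy as np

def composite_quadrature(f, nodes, weights, N):
    """Approximate int_0^1 f(t) dt by applying a quadrature rule for [0,1],
    given by its nodes c_j and weights b_j, on N subintervals of length
    h = 1/N, as in (11.2)."""
    h = 1.0 / N
    c, b = np.asarray(nodes), np.asarray(weights)
    return sum(h * np.dot(b, f(n * h + c * h)) for n in range(N))

# Trapezoidal rule (k=2, c=(0,1), b=(1/2,1/2)) applied to f(t) = e^t:
print(composite_quadrature(np.exp, [0.0, 1.0], [0.5, 0.5], 100))
print(np.e - 1.0)  # exact value; the error decays like h^2
```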

\(\text{ }\)From local to global error bounds. Error bounds for \[\left|\int_{0}^{1}f(t)dt-\sum_{j=1}^{k}b_{j}f(c_{j})\right|\tag{11.3}\] translate to error bounds for the approximation (11.2). For the left Riemann sum quadrature rule the error (11.3) satisfies \[\left|\int_{0}^{1}f(t)dt-f(0)\right|\le\sup_{t\in[0,1]}|f'(t)|,\tag{11.4}\] and for the trapezoidal quadrature rule it satisfies \[\left|\int_{0}^{1}f(t)dt-\frac{f(0)+f(1)}{2}\right|\le c\sup_{t\in[0,1]}|f''(t)|,\tag{11.5}\] both for say analytic \(f.\) The next lemma deduces an error bound for the approximation (11.2) from a bound for (11.3) of the type of (11.4)-(11.5).

Lemma 11.1

Consider a quadrature rule \(k\ge1,\) \(0\le c_{1}<\ldots<c_{k}\le1\) and \(b_{1},\ldots,b_{k}\in\mathbb{R}\) for the interval \([0,1].\) Assume that there is a constant \(\kappa\in[0,\infty)\) and a \(p\ge1\) such that for any analytic \(f:[0,1]\to\mathbb{R}\) \[\left|\int_{0}^{1}f(t)dt-\sum_{j=1}^{k}b_{j}f(c_{j})\right|\le\kappa\sup_{t\in[0,1]}|f^{(p)}(t)|.\tag{11.6}\] Then for any analytic \(g:[0,1]\to\mathbb{R}\) and \(h=\frac{1}{N}\) for an integer \(N\ge1\) \[\left|\int_{0}^{1}g(t)dt-\sum_{n=0}^{N-1}h\sum_{j=1}^{k}b_{j}g(t_{n}+hc_{j})\right|\le h^{p}\kappa\sup_{t\in[0,1]}|g^{(p)}(t)|.\tag{11.7}\]

Proof. The l.h.s. of (11.7) is at most \[\sum_{n=0}^{N-1}e_{n}\tag{11.8}\] where \[e_{n}=\left|\int_{t_{n}}^{t_{n+1}}g(t)dt-h\sum_{j=1}^{k}b_{j}g(t_{n}+hc_{j})\right|.\] With the change of variables \(t=t_{n}+sh\) and \(r_{n}(s)=g(t_{n}+sh)\) it follows that \[\int_{t_{n}}^{t_{n+1}}g(t)dt=h\int_{0}^{1}g(t_{n}+sh)ds=h\int_{0}^{1}r_{n}(s)ds.\] Thus \[e_{n}=h\left|\int_{0}^{1}r_{n}(s)ds-\sum_{j=1}^{k}b_{j}r_{n}(c_{j})\right|.\] Thus by (11.6) \[e_{n}\le\kappa h\sup_{s\in[0,1]}\left|r_{n}^{(p)}(s)\right|.\tag{11.9}\] Furthermore \(r_{n}^{(p)}(s)=h^{p}g^{(p)}(t_{n}+hs)\) so that \[\max_{n=0}^{N-1}\sup_{s\in[0,1]}\left|r_{n}^{(p)}(s)\right|\le h^{p}\sup_{t\in[0,1]}|g^{(p)}(t)|.\] This implies that \(e_{n}\le h^{p+1}\kappa\sup_{t\in[0,1]}|g^{(p)}(t)|,\) and plugging this into (11.8) yields (11.7).

Remark 11.2

The fully general definition of quadrature allows for a weight function \(w:[a,b]\to[0,\infty)\) and approximates the weighted integral \(\int_{a}^{b}f(t)w(t)dt\) by a sum \(\sum_{j=1}^{k}b_{j}f(c_{j}).\) We limit ourselves to the case of a constant weight function \(w(t)=1.\)

The goal of this section is to construct optimal quadrature rules for all \(k\ge1,\) in the sense that they satisfy (11.6) for the largest possible \(p.\)

First we introduce the concept of order of a quadrature rule. For each \(p\) let \(\mathbb{P}_{p}\) denote the set of all polynomials of degree at most \(p,\) which is a real finite-dimensional vector space.

Definition 11.3 • Order of a quadrature rule

Let \([a,b]\) be a non-empty interval, and let \(k\ge1,c_{1},\ldots,c_{k},b_{1},\ldots,b_{k}\) be the parameters of a quadrature rule for \([a,b].\) If \(p\ge1\) and \[\int_{a}^{b}f(t)dt=\sum_{j=1}^{k}b_{j}f(c_{j})\tag{11.11}\] (i.e. the quadrature rule approximates \(\int_{a}^{b}f(t)dt\) with zero error) for every \(f\in\mathbb{P}_{p-1},\) then the quadrature is said to be of order (at least) \(p\).
If the quadrature rule is of order \(p\) but not of order \(p+1\) then it is of order exactly \(p.\)

The next lemma shows that a bound like (11.6) (with \([a,b]\) in place of \([0,1]\)) holds if the order of the quadrature rule is (at least) \(p.\) What's more, such a bound holding for \(p=m\) but not for \(p=m+1\) is equivalent to the rule being of order exactly \(m.\)

Lemma 11.4

Fix a quadrature rule \(k,c_{1},\ldots,c_{k},b_{1},\ldots,b_{k}.\) Let \(p\ge1.\)

a) If the quadrature rule is of order \(p\) then \[\left|\int_{a}^{b}f(t)dt-\sum_{j=1}^{k}b_{j}f(c_{j})\right|\le c\sup_{t\in[a,b]}|f^{(p)}(t)|\:\:\text{for all analytic }f:[a,b]\to\mathbb{R},\tag{11.12}\] where \(c=\left(b-a+\sum_{j=1}^{k}|b_{j}|\right)\frac{(b-a)^{p}}{p!}.\)

b) Conversely, assume that (11.12) holds for some constant \(c\in[0,\infty).\) Then the quadrature rule is of order (at least) \(p.\)

c) Let \(m\ge1.\) The quadrature rule is of order exactly \(m\) iff (11.12) holds when \(p=m\) for some constant \(c\in[0,\infty),\) but does not hold for any constant \(c\in[0,\infty)\) when \(p=m+1.\)

Proof. The last claim c) follows from a) and b).
b) Polynomials are analytic, and if they are of degree at most \(p-1\) then \(f^{(p)}(t)=0\) for all \(t.\) Thus if (11.12) holds, then the identity (11.11) holds for all polynomials of degree at most \(p-1,\) which means that Definition 11.3 is satisfied.
a) By Taylor's theorem with remainder it holds for any analytic \(f\) and any \(t\in[a,b]\) that \[f(t)=q(t)+\frac{(t-a)^{p}}{p!}f^{(p)}(s_{t})\] for some \(s_{t}\in[a,b]\) depending on \(t,\) where \[q(t)=\sum_{j=0}^{p-1}\frac{(t-a)^{j}}{j!}f^{(j)}(a)\] is a polynomial of degree at most \(p-1.\) In particular for \(t=c_{j}\) \[f(c_{j})=q(c_{j})+\frac{(c_{j}-a)^{p}}{p!}f^{(p)}(s_{j})\] for some \(s_{j}\in[a,b].\) The assumption that the quadrature rule is of order at least \(p\) implies that \[\int_{a}^{b}q(t)dt=\sum_{j=1}^{k}b_{j}q(c_{j}),\] so \[\left|\int_{a}^{b}f(t)dt-\sum_{j=1}^{k}b_{j}f(c_{j})\right|\le\int_{a}^{b}\frac{(t-a)^{p}}{p!}|f^{(p)}(s_{t})|dt+\sum_{j=1}^{k}|b_{j}|\frac{(c_{j}-a)^{p}}{p!}|f^{(p)}(s_{j})|,\] which implies (11.12) with the claimed constant \(c,\) since \(\int_{a}^{b}\frac{(t-a)^{p}}{p!}dt\le(b-a)\frac{(b-a)^{p}}{p!}\) and \((c_{j}-a)^{p}\le(b-a)^{p}.\)

The next lemma shows that for any \(k\ge1\) and distinct nodes \(c_{1},\ldots,c_{k}\) there exist \(b_{1},\ldots,b_{k}\) such that the quadrature rule corresponding to these \(b_{i},c_{i}\) has order at least \(k.\) Recall that the Vandermonde matrix of a sequence \(x_{1},\ldots,x_{k}\) is the \(k\times k\) matrix \[V(x_{1},\ldots,x_{k}):=\left(\begin{matrix}1 & x_{1} & x_{1}^{2} & \dots & x_{1}^{k-1}\\ 1 & x_{2} & x_{2}^{2} & \dots & x_{2}^{k-1}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 1 & x_{k} & x_{k}^{2} & \dots & x_{k}^{k-1} \end{matrix}\right).\] Recall furthermore that \(\det(V(x_{1},\ldots,x_{k}))=\prod_{1\le i<j\le k}(x_{j}-x_{i}),\) so a Vandermonde matrix is invertible whenever the \(x_{1},\ldots,x_{k}\) are distinct.

Lemma 11.5

For any non-empty \([a,b],k\ge1\) and distinct nodes \(a\le c_{1}<\ldots<c_{k}\le b\) the system of equations \[\sum_{j=1}^{k}b_{j}c_{j}^{m}=\int_{a}^{b}t^{m}dt,m=0,\ldots,k-1\tag{11.14}\] uniquely determines weights \(b_{1},\ldots,b_{k}.\) The quadrature combining these \(b_{i}\) and \(c_{i}\) is of order at least \(k.\)

Proof. The system of equations (11.14) is equivalent to \(V(c_{1},\ldots,c_{k})^{T}b=v,\) for the column vectors \(b=(b_{1},\ldots,b_{k})^{T}\) and \(v\in\mathbb{R}^{k}\) with \(v_{m+1}=\int_{a}^{b}t^{m}dt,m=0,\ldots,k-1.\) Since the \(c_{i}\) are distinct the Vandermonde matrix is non-degenerate. Thus a unique solution \(b\in\mathbb{R}^{k}\) exists. Next let \(f(t)=t^{m}\) and note that by (11.14) \[\sum_{j=1}^{k}b_{j}f(c_{j})-\int_{a}^{b}t^{m}dt=\sum_{j=1}^{k}b_{j}c_{j}^{m}-\int_{a}^{b}t^{m}dt=0\quad\quad m=0,\ldots,k-1.\] If \(f\) is a polynomial of degree at most \(k-1\) then it is a linear combination of \(t^{m},m=0,\ldots,k-1,\) so it follows that for all such \(f\) \[\sum_{j=1}^{k}b_{j}f(c_{j})-\int_{a}^{b}f(t)dt=0.\] This implies that the method is of order at least \(k.\)
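In code, the weights of Lemma 11.5 amount to one linear solve; a small numpy sketch (the helper name and the Simpson example are ours):

```python
import numpy as np

def quadrature_weights(nodes, a=0.0, b=1.0):
    """Solve the linear system (11.14), i.e. V(c)^T b = v, for the weights
    b_1,...,b_k given distinct nodes c_1,...,c_k in [a, b] (Lemma 11.5)."""
    c = np.asarray(nodes, dtype=float)
    k = len(c)
    V = np.vander(c, k, increasing=True)  # V[j, m] = c_j^m
    v = np.array([(b**(m + 1) - a**(m + 1)) / (m + 1) for m in range(k)])
    return np.linalg.solve(V.T, v)

# Nodes 0, 1/2, 1 on [0,1] recover Simpson's rule weights 1/6, 4/6, 1/6:
print(quadrature_weights([0.0, 0.5, 1.0]))
```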

This means that we can construct a quadrature rule with a better error bound than the trapezoidal rule (for which \(k=2\)) by simply picking any \(k=3\) nodes \(c_{1},c_{2},c_{3}\) and solving for \(b_{1},b_{2},b_{3}\) - this will yield a quadrature rule of order \(p=3.\) It turns out however, that one can do significantly better! Indeed for any \(k\ge1\) one can obtain a quadrature rule of order \(2k\) by choosing \(c_{1},\ldots,c_{k}\) in a particular way! Furthermore, this is an optimal quadrature rule: no other rule has higher order.

These special \(c_{1},\ldots,c_{k}\) arise as the roots of certain polynomials. More specifically, they are the roots of orthogonal polynomials with respect to the interval \([a,b].\) To define them we introduce the inner-product \[\left\langle f,g\right\rangle _{a,b}=\int_{a}^{b}f(t)g(t)dt\] on the finite-dimensional vector space \(\mathbb{P}_{l},\) for any \(l.\) The orthogonal polynomials w.r.t. \([a,b]\) are defined as follows: \[\begin{array}{c} \text{the }k\text{-th orthogonal polynomial }P_{k}=P_{k,[a,b]}\text{ is the unique}\\ \text{monic }P_{k}\in\mathbb{P}_{k}\text{ such that }\left\langle P_{k},f\right\rangle _{a,b}=0\text{ \,for all }f\in\mathbb{P}_{k-1}. \end{array}\tag{11.15}\] To see that (11.15) indeed defines a sequence \(P_{0},P_{1},P_{2},\ldots\) of polynomials uniquely, recall that on any finite-dimensional inner-product space one can define the linear projection onto any subspace. Thus we can define the linear projection \(\Pi^{\mathbb{P}_{k-1}^{\bot}}\) from the finite-dimensional \(\mathbb{P}_{k}\) onto the subspace \(\mathbb{P}_{k-1}^{\bot}\) orthogonal to \(\mathbb{P}_{k-1}.\) Then if \(q_{k}\) is any polynomial of degree exactly \(k\) - e.g. \(q_{k}(t)=t^{k}\) - the projection \(P_{k}=\Pi^{\mathbb{P}_{k-1}^{\bot}}q_{k}\) is a non-zero polynomial in \(\mathbb{P}_{k}\backslash\mathbb{P}_{k-1}\) that satisfies \(\left\langle P_{k},p\right\rangle =0\) for all \(p\in\mathbb{P}_{k-1}.\) Normalizing it to be monic shows that for every \(k\) a \(P_{k}\) satisfying (11.15) exists. To show uniqueness, note that if \(P_{k}\) and \(Q_{k}\) are two monic polynomials of degree \(k\) satisfying (11.15), then \(r=P_{k}-Q_{k}\) is a polynomial of degree at most \(k-1,\) so \(\left\langle r,r\right\rangle =\left\langle P_{k},r\right\rangle -\left\langle Q_{k},r\right\rangle =0,\) implying that \(r=0\) and thus \(P_{k}=Q_{k}.\)

The orthogonal polynomials for \([-1,1]\) are called the Legendre polynomials. Let us \[\text{denote the }k\text{-th Legendre polynomial by }Q_{k}.\tag{11.16}\] The polynomials \(Q_{k}\) can be derived by the Gram-Schmidt process applied to the basis \(1,t,t^{2},\ldots,t^{k}\) of \(\mathbb{P}_{k}.\) This yields \(Q_{0}(t)=1.\) Next \[Q_{1}(t)=t-\frac{\left\langle t,Q_{0}\right\rangle }{\left\langle Q_{0},Q_{0}\right\rangle }Q_{0}=t,\] (where we drop the subscript on \(\left\langle \cdot,\cdot\right\rangle _{-1,1}\)) and then \[Q_{2}(t)=t^{2}-\frac{\left\langle t^{2},Q_{1}\right\rangle }{\left\langle Q_{1},Q_{1}\right\rangle }Q_{1}-\frac{\left\langle t^{2},Q_{0}\right\rangle }{\left\langle Q_{0},Q_{0}\right\rangle }Q_{0}=t^{2}-\frac{1}{2}\int_{-1}^{1}t^{2}dt=t^{2}-\frac{1}{3},\] and so on. The first four Legendre polynomials are \[\begin{array}{ccl} Q_{0}(t) & = & 1,\\ Q_{1}(t) & = & t,\\ Q_{2}(t) & = & t^{2}-\frac{1}{3},\\ Q_{3}(t) & = & t^{3}-\frac{3}{5}t. \end{array}\tag{11.17}\] For other \([a,b]\) the orthogonal polynomials can either be derived by Gram-Schmidt, or written in terms of the Legendre polynomials \(Q_{k}\) via \[P_{k}(t)=P_{k,[a,b]}(t)=\left(\frac{b-a}{2}\right)^{k}Q_{k}\left(\frac{2t-(a+b)}{b-a}\right),\tag{11.18}\] (one can check that these \(P_{k}\) are monic, and that \(\left\langle P_{k},f\right\rangle _{a,b}=0\) for any \(f\in\mathbb{P}_{k-1}\) follows from the fact that \(\left\langle Q_{k},g\right\rangle _{-1,1}=0\) for all \(g\in\mathbb{P}_{k-1}\)).
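A short sympy sketch of this Gram-Schmidt computation (the helper is ours), reproducing (11.17):

```python
import sympy as sp

t = sp.symbols('t')

def monic_orthogonal_polys(k, a=-1, b=1):
    """Monic orthogonal polynomials on [a, b] in the sense of (11.15),
    computed by Gram-Schmidt applied to the basis 1, t, t^2, ..."""
    def inner(f, g):
        return sp.integrate(f * g, (t, a, b))
    Q = []
    for m in range(k + 1):
        p = t**m
        for q in Q:
            p -= inner(t**m, q) / inner(q, q) * q
        Q.append(sp.expand(p))
    return Q

print(monic_orthogonal_polys(3))  # [1, t, t**2 - 1/3, t**3 - 3*t/5]
```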

Lemma 11.6

For every \(k\ge1\) and non-empty interval \([a,b]\) the corresponding orthogonal polynomial \(P_{k}\) has exactly \(k\) distinct roots, which all lie in \([a,b].\)

Proof. Let \(x_{1},\ldots,x_{m}\) be the distinct roots of \(P_{k}\) in \([a,b]\) at which \(P_{k}\) changes sign. Define the polynomial \[q(t)=\prod_{i=1}^{m}(t-x_{i}).\] The function \(P_{k}(t)\) changes sign in \([a,b]\) exactly at the points \(x_{1},\ldots,x_{m},\) and the same applies to \(q(t).\) Therefore \(P_{k}(t)q(t)\ge0\) for all \(t\in[a,b]\) or \(P_{k}(t)q(t)\le0\) for all \(t\in[a,b],\) and since \(P_{k}q\) is not identically zero, \(\left\langle P_{k},q\right\rangle =\int_{a}^{b}P_{k}(t)q(t)dt\ne0.\) The polynomial \(q\) is of degree \(m.\) If \(m\le k-1\) then by the definition of \(P_{k}\) it follows that \(\left\langle P_{k},q\right\rangle =0.\) This is a contradiction, so it must hold that \(m\ge k.\)

Lastly, \(P_{k}\) is of degree \(k\) so by the fundamental theorem of algebra \(P_{k}\) can have at most \(k\) roots - thus \(m=k\) and \(x_{1},\ldots,x_{k}\) are all the roots of \(P_{k}.\)

We now show that a quadrature rule whose nodes are the roots of an orthogonal polynomial achieves the highest possible order.

Theorem 11.7

Fix a non-empty interval \([a,b]\) and let \(k\ge1.\) If \(c_{1},\ldots,c_{k}\) are the roots of \(P_{k}\) and \(b_{1},\ldots,b_{k}\) is the unique solution of (11.14), then:

  1. The corresponding quadrature is of order \(2k.\)

  2. No other quadrature has higher order.

Proof. We must show that for any polynomial \(f\) of degree at most \(2k-1\) it holds that \[\int_{a}^{b}f(t)dt=\sum_{i=1}^{k}b_{i}f(c_{i}).\tag{11.20}\] By Lemma 11.5 we already know that our quadrature rule has order at least \(k,\) so (11.20) holds if \(f\) has degree at most \(k-1.\) Let \(f\) be any polynomial of degree at most \(2k-1.\) Recall that by polynomial division, for any two non-vanishing polynomials \(a(t),b(t)\) there exist a quotient polynomial \(q(t)\) and a remainder polynomial \(r(t)\) such that the degree of \(r\) is strictly less than that of \(b,\) and \[a(t)=b(t)q(t)+r(t).\] Using this with \(a(t)=f(t)\) and \(b(t)=P_{k}(t)\) yields \[f(t)=q(t)P_{k}(t)+r(t),\] where \(q,r\in\mathbb{P}_{k-1}.\) Since \(r\) has degree at most \(k-1\) we have \[\sum_{i=1}^{k}b_{i}r(c_{i})=\int_{a}^{b}r(t)dt.\tag{11.21}\] Since \(q\) has degree at most \(k-1\) we have \[\int_{a}^{b}f(t)dt=\left\langle q,P_{k}\right\rangle _{a,b}+\int_{a}^{b}r(t)dt=\int_{a}^{b}r(t)dt.\tag{11.22}\] Lastly \[\sum_{i=1}^{k}b_{i}f(c_{i})=\sum_{i=1}^{k}b_{i}q(c_{i})P_{k}(c_{i})+\sum_{i=1}^{k}b_{i}r(c_{i})=\sum_{i=1}^{k}b_{i}r(c_{i}),\tag{11.23}\] since the \(c_{i}\) are roots of \(P_{k}.\) Combining (11.21)-(11.23) implies that \[\int_{a}^{b}f(t)dt=\sum_{i=1}^{k}b_{i}f(c_{i}).\] Since this holds for any \(f\) of degree at most \(2k-1,\) the quadrature rule is of order \(2k.\)

It remains to show that no rule has higher order. Fix an arbitrary quadrature rule with parameters \(b_{i},c_{i}.\) Let \[f(t)=\prod_{i=1}^{k}(t-c_{i})^{2},\] which is a polynomial of degree \(2k.\) We have \[\sum_{i=1}^{k}b_{i}f(c_{i})=0,\] since \(c_{1},\ldots,c_{k}\) are roots of \(f(t),\) while \[\int_{a}^{b}f(t)dt>0,\] since \(f(t)\) is non-negative and not identically zero. Thus the quadrature rule is not exact for this \(f(t),\) so it is not of order \(2k+1.\)
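A quick numerical check of Theorem 11.7 using NumPy's built-in Gauss-Legendre nodes and weights (which are for \([-1,1],\) here mapped to \([0,1]\)): the rule is exact on monomials of degree up to \(2k-1\) and fails at degree \(2k.\)

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

k = 3
x, w = leggauss(k)            # Gauss-Legendre nodes/weights on [-1, 1]
c, b = (x + 1) / 2, w / 2     # transformed to the interval [0, 1]

for m in range(2 * k + 1):
    quad, exact = np.dot(b, c**m), 1.0 / (m + 1)   # int_0^1 t^m dt
    print(m, np.isclose(quad, exact))  # True for m < 2k, False for m = 2k
```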

11.2 Explicit Runge-Kutta (ERK) schemes

As we have done a few times before, let us take the task of approximating the integral on the r.h.s. of the identity \[y(t_{n+1})-y(t_{n})=\int_{t_{n}}^{t_{n+1}}F(t,y(t))dt\tag{11.25}\] as a starting point for devising a numerical time-stepping method (where of course \(y(t)\) is the solution of a generic ODE \(y'=F(t,y)\)). In particular, inspired by the quadrature rules of the previous section we fix \(k\ge1,\) distinct nodes \(c_{1},\ldots,c_{k}\) and weights \(b_{1},\ldots,b_{k}\) and try to approximate the r.h.s. by \[h\sum_{j=1}^{k}b_{j}F(t_{n}+c_{j}h,y(t_{n}+c_{j}h)).\tag{11.26}\] This does not immediately yield a numerical time-stepping method of the type that produces a next approximation \(y_{n+1}\) from the previous approximation \(y_{n},\) since in such a setting we do not have approximations for the \(y(t_{n}+c_{j}h)\) (except possibly for \(j=1,\) in the special case \(c_{1}=0\)). The idea of Runge-Kutta (RK) methods is to first compute intermediate approximations \[\xi_{j}\approx y(t_{n}+c_{j}h)\tag{11.27}\] and then use them in the final approximation \[y(t_{n+1})\approx y_{n+1}=y_{n}+h\sum_{j=1}^{k}b_{j}F(t_{n}+c_{j}h,\xi_{j}).\tag{11.28}\]

\(\text{ }\)Form of intermediate stages. Note that \(\xi_{k}\) is supposed to approximate \(y(t_{n}+c_{k}h),\) and similarly to (11.25) \[y(t_{n}+c_{k}h)-y(t_{n})=\int_{t_{n}}^{t_{n}+c_{k}h}F(t,y(t))dt.\tag{11.29}\] Applying the philosophy of (11.25)-(11.28) to the r.h.s. would entail approximating it as a weighted sum of approximations of the solution at points corresponding to some nodes \(\tilde{c}_{1},\tilde{c}_{2},\ldots\) in the interval \([0,c_{k}].\) Since we are anyway aiming to generate the approximations (11.27) corresponding to the nodes \(c_{i},\) it seems natural to reuse these and to use \(c_{1},\ldots,c_{k-1}\) as nodes when approximating the r.h.s. of (11.29). That is, approximate it as \[y(t_{n}+c_{k}h)-y(t_{n})=\int_{t_{n}}^{t_{n}+c_{k}h}F(t,y(t))dt\approx h\sum_{j=1}^{k-1}a_{k,j}F(t_{n}+c_{j}h,\xi_{j}).\] This suggests to compute \(\xi_{k}\) based on \(y_{n},\xi_{1},\ldots,\xi_{k-1}\) as \[\xi_{k}=y_{n}+h\sum_{j=1}^{k-1}a_{k,j}F(t_{n}+c_{j}h,\xi_{j}).\] Note that while sticking to the same nodes, we do introduce new weights \(a_{k,j}\) that are specific to computing \(\xi_{k}\) and can be optimized for overall accuracy. We can compute the approximation \(\xi_{k-1}\) from the previous \(y_{n},\xi_{1},\ldots,\xi_{k-2}\) in the same way using \[\xi_{k-1}=y_{n}+h\sum_{j=1}^{k-2}a_{k-1,j}F(t_{n}+c_{j}h,\xi_{j}),\] or for general \(1\le l\le k\) \[\xi_{l}=y_{n}+h\sum_{j=1}^{l-1}a_{l,j}F(t_{n}+c_{j}h,\xi_{j}),l=1,\ldots,k.\tag{11.30}\] If the first intermediate approximation is of \(y(t_{n}+ch)\) for \(c>0,\) then since there are no previous intermediate values to exploit, the approximation must be based only on \(y_{n}.\) It is customary to set \(c_{1}=0\) so that \(\xi_{1}=y_{n},\) and \(\xi_{2}=y_{n}+ha_{2,1}F(t_{n},\xi_{1})\) is this first intermediate approximation. This is consistent with the case \(l=1\) of (11.30).

An explicit Runge-Kutta method is thus parameterized by an integer \(k\ge1\) and numbers \(c_{2},\ldots,c_{k},b_{1},\ldots,b_{k},a_{l,j},1\le j<l\le k.\) Given the parameters it takes the form \[\begin{array}{ccl} \xi_{1} & = & y_{n},\\ \xi_{l} & = & y_{n}+h\sum_{j=1}^{l-1}a_{l,j}F(t_{n}+c_{j}h,\xi_{j}),2\le l\le k,\\ y_{n+1} & = & y_{n}+h\sum_{l=1}^{k}b_{l}F(t_{n}+c_{l}h,\xi_{l}), \end{array}\tag{11.31}\] where \(c_{1}=0.\) The methods corresponding to a given \(k\) are called \(k\)-stage methods. One picks “good” values for the parameters \(c_{i},b_{i},a_{l,j}\) by Taylor expanding \(y_{n+1}\) in \(h,\) and matching to the Taylor expansion of the true solution.

\(\text{ }\)One stage \(k=1.\) If \(k=1\) the only intermediate stage is \(\xi_{1}=y_{n}\) and the ERK reads \(y_{n+1}=y_{n}+hb_{1}F(t_{n},y_{n}).\) This is only consistent (i.e. of some order \(p\ge1\)) if \(b_{1}=1,\) which gives Euler's method.

\(\text{ }\)Two stage \(k=2.\) For \(k=2\) stages the general method takes the form \[\begin{array}{ccl} \xi_{1} & = & y_{n},\\ \xi_{2} & = & y_{n}+ha_{2,1}F(t_{n},\xi_{1}),\\ y_{n+1} & = & y_{n}+hb_{1}F(t_{n},\xi_{1})+hb_{2}F(t_{n}+c_{2}h,\xi_{2}). \end{array}\] To compute the local error of this method, consider its result when \(y_{n}\) is replaced by the true value \(y(t_{n})\): \[\begin{array}{ccl} \tilde{\xi}_{1} & = & y(t_{n}),\\ \tilde{\xi}_{2} & = & y(t_{n})+ha_{2,1}F(t_{n},\tilde{\xi}_{1}),\\ \tilde{y}_{n+1} & = & y(t_{n})+hb_{1}F(t_{n},\tilde{\xi}_{1})+hb_{2}F(t_{n}+c_{2}h,\tilde{\xi}_{2}). \end{array}\] Let us Taylor expand all the quantities in \(h.\) The second line is explicit: \[\tilde{\xi}_{2}=y(t_{n})+ha_{2,1}F(t_{n},y(t_{n})).\tag{11.32}\] To lighten the notation let us omit the arguments of \(F\) and its derivatives if those arguments are \(t=t_{n}\) and \(y=y(t_{n}).\) Let us also write \(\Delta_{2}=\tilde{\xi}_{2}-y(t_{n}).\) Writing (11.32) in this notation yields \[\Delta_{2}=ha_{2,1}F.\tag{11.33}\] From (11.33) we see that \(\Delta_{2}=O(h).\) Using the same conventions, the last row reads \[\tilde{y}_{n+1}-y(t_{n})=hb_{1}F+hb_{2}F(t_{n}+c_{2}h,y(t_{n})+\Delta_{2}).\tag{11.34}\] Only the final term needs to be expanded. Its expansion is \[F(t_{n}+c_{2}h,y(t_{n})+\Delta_{2})=F+c_{2}h\partial_{t}F+\Delta_{2}\partial_{y}F+O(h^{2}).\] Substituting in the r.h.s. of (11.33) for \(\Delta_{2}\) this turns into \[F(t_{n}+c_{2}h,y(t_{n})+\Delta_{2})=F+c_{2}h\partial_{t}F+ha_{2,1}F\partial_{y}F+O(h^{2}).\tag{11.35}\] Finally plugging into (11.34) yields \[\tilde{y}_{n+1}-y(t_{n})=hb_{1}F+hb_{2}\left(F+c_{2}h\partial_{t}F+ha_{2,1}F\partial_{y}F\right)+O(h^{3}),\tag{11.36}\] which can be rearranged to \[\tilde{y}_{n+1}-y(t_{n})=h(b_{1}+b_{2})F+h^{2}b_{2}\left(c_{2}\partial_{t}F+a_{2,1}F\partial_{y}F\right)+O(h^{3}).\tag{11.37}\] We can now compare this to the Taylor expansion of the solution \(y(t)\) around \(t=t_{n}.\) It reads \[y(t_{n+1})-y(t_{n})=hy'(t_{n})+\frac{h^{2}}{2}y''(t_{n})+O(h^{3}).\] We have \[y'(t_{n})=F(t_{n},y(t_{n}))=F,\] and \[y''(t_{n})=\frac{d}{dt}F(t,y(t))|_{t=t_{n}}=\partial_{t}F+y'(t_{n})\partial_{y}F=\partial_{t}F+F\partial_{y}F.\] Thus the expansion is \[y(t_{n+1})-y(t_{n})=hF+\frac{h^{2}}{2}(\partial_{t}F+F\partial_{y}F)+O(h^{3}).\tag{11.38}\] Comparing (11.37) and (11.38), these match up to order \(h^{2}\) provided \[b_{1}+b_{2}=1\quad\quad b_{2}c_{2}=\frac{1}{2}\quad\quad b_{2}a_{2,1}=\frac{1}{2}.\tag{11.39}\] If these conditions are satisfied, then \[\tilde{y}_{n+1}-y(t_{n+1})=O(h^{3}).\]

\(\text{ }\)Order of ERK. As for multistep methods, we say that an ERK method that produces the approximation \(\tilde{y}_{n+1}\) from \(y_{n}=y(t_{n})\) (the true solution at time \(t=t_{n}\)) is of order (at least) \(p\) if for any IVP with analytic \(F\) it holds that \[\tilde{y}_{n+1}-y(t_{n+1})=O(h^{p+1}).\] Methods with parameters satisfying ([eq:ch11_ERK_k=00003D2_order_conditions]) are thus of order (at least) \(p=2.\)

\(\text{ }\)RK tableaux. The conditions (11.39) are satisfied for instance if \(b_{1}=0,b_{2}=1,a_{2,1}=c_{2}=\frac{1}{2}.\) But also if \(c_{2}=a_{2,1}=\frac{2}{3},b_{1}=\frac{1}{4},b_{2}=\frac{3}{4},\) or if \(b_{1}=b_{2}=\frac{1}{2},a_{2,1}=c_{2}=1.\) The parameters of a RK method are often recorded in a table called a tableau, which for \(k=2\)-stage explicit methods takes the form \[\begin{array}{c|cc} 0\\ c_{2} & a_{2,1}\\ \hline & b_{1} & b_{2} \end{array},\] so that the three methods mentioned above correspond to the tableaux \[\begin{array}{c|cc} 0\\ \frac{1}{2} & \frac{1}{2}\\ \hline & 0 & 1 \end{array},\quad\quad\text{and}\quad\quad\begin{array}{c|cc} 0\\ \frac{2}{3} & \frac{2}{3}\\ \hline & \frac{1}{4} & \frac{3}{4} \end{array},\quad\quad\text{and}\quad\quad\begin{array}{c|cc} 0\\ 1 & 1\\ \hline & \frac{1}{2} & \frac{1}{2} \end{array}\tag{11.40}\] respectively. The general \(k=3\)-stage explicit RK method takes the form \[\begin{array}{c|ccc} 0\\ c_{2} & a_{2,1}\\ c_{3} & a_{3,1} & a_{3,2}\\ \hline & b_{1} & b_{2} & b_{3} \end{array},\tag{11.41}\] and for any number of stages \(k\ge1\) the tableau takes the form \[\begin{array}{c|c} c & A\\ \hline & b \end{array},\] where we view \(c=(c_{1},\ldots,c_{k})^{T}\) as a column vector, \(b=(b_{1},\ldots,b_{k})\) as a row vector and the Runge-Kutta (RK) matrix \(A\) is the \(k\times k\) matrix \(A_{i,j}=a_{i,j}.\) An explicit RK method has \(c_{1}=0\) and \(A_{l,i}=0\) unless \(i<l,\) i.e. its RK matrix \(A\) is strictly lower triangular.
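A generic implementation of one step of (11.31), parameterized by a tableau, might look as follows (a minimal numpy sketch for scalar ODEs; the names are ours):

```python
import numpy as np

def erk_step(F, t_n, y_n, h, A, b, c):
    """One step of the explicit Runge-Kutta method (11.31) with tableau
    (A, b, c), where A is strictly lower triangular."""
    k = len(b)
    f_stages = np.zeros(k)                     # F evaluated at the stages
    for l in range(k):
        xi_l = y_n + h * np.dot(A[l, :l], f_stages[:l])
        f_stages[l] = F(t_n + c[l] * h, xi_l)
    return y_n + h * np.dot(b, f_stages)

# The third tableau in (11.40) (Heun's method, order 2), applied to y' = y:
A = np.array([[0.0, 0.0], [1.0, 0.0]])
b = np.array([0.5, 0.5])
c = np.array([0.0, 1.0])
print(erk_step(lambda t, y: y, 0.0, 1.0, 0.1), np.exp(0.1))
```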

\(\text{ }\)Order conditions. The conditions (11.39), obtained by matching coefficients in the Taylor expansion to achieve a certain order, are fittingly called order conditions. Since the conditions (11.39) ensure order \(p=2,\) they are called the second-order conditions. Deriving the order conditions by brute force quickly becomes very tedious. Already the computation of the third-order conditions corresponding to \(p=3\) is too involved to present here. Instead, let us consider a simplified setting where the ODE is autonomous, i.e. \(F(t,y)\) is a function only of \(y.\) For such an \(F\) all the partial derivatives with respect to \(t\) vanish, greatly reducing the number of terms in the expansion. As it happens, the order conditions one obtains are the same as for a general \(F.\) We won't prove this fact, which moreover does not extend to \(p=4,5,\ldots.\)

\(\text{ }\)Third-order conditions. Consider the general explicit method with \(k=3\) stages specified by the tableau (11.41). Applied with the exact value \(y(t_{n})\) in place of \(y_{n}\) when \(F\) is autonomous it yields \[\begin{array}{lcl} \tilde{\xi}_{1} & = & y(t_{n}),\\ \tilde{\xi}_{2} & = & y(t_{n})+ha_{2,1}F(\tilde{\xi}_{1}),\\ \tilde{\xi}_{3} & = & y(t_{n})+ha_{3,1}F(\tilde{\xi}_{1})+ha_{3,2}F(\tilde{\xi}_{2}),\\ \tilde{y}_{n+1} & = & y(t_{n})+hb_{1}F(\tilde{\xi}_{1})+hb_{2}F(\tilde{\xi}_{2})+hb_{3}F(\tilde{\xi}_{3}). \end{array}\tag{11.42}\] We extend the expansion of the solution (11.38) to third order by noting that \[y'''(t_{n})=\frac{d^{2}}{dt^{2}}F(y(t))|_{t=t_{n}}=\frac{d}{dt}\left[F'(y(t))y'(t)\right]|_{t=t_{n}}=\frac{d}{dt}\left[F'(y(t))F(y(t))\right]|_{t=t_{n}}=F''F^{2}+(F')^{2}F,\] so \[y(t_{n+1})=y(t_{n})+hF+\frac{h^{2}}{2}FF'+\frac{h^{3}}{3!}\left(F''F^{2}+(F')^{2}F\right)+\ldots.\tag{11.43}\] Writing \(\Delta_{i}=\tilde{\xi}_{i}-y(t_{n})\) and using the convention that \(F(y(t_{n}))\) is written simply as \(F,\) the system (11.42) turns into \[\begin{array}{lcl} \Delta_{2} & = & ha_{2,1}F,\\ \Delta_{3} & = & ha_{3,1}F+ha_{3,2}F(y(t_{n})+\Delta_{2}),\\ \tilde{y}_{n+1} & = & y(t_{n})+hb_{1}F+hb_{2}F(y(t_{n})+\Delta_{2})+hb_{3}F(y(t_{n})+\Delta_{3}). \end{array}\] The rest of the computation is an exercise on sheet 11. The result is the third-order conditions for \(k=3\) stage methods \[b_{1}+b_{2}+b_{3}=1,\quad b_{2}c_{2}+b_{3}c_{3}=\frac{1}{2},\quad b_{2}c_{2}^{2}+b_{3}c_{3}^{2}=\frac{1}{3},\quad b_{3}a_{3,2}c_{2}=\frac{1}{6}.\tag{11.44}\] Common \(k=3\)-stage methods (that satisfy (11.44)) include \[\begin{array}{c|ccc} 0\\ \frac{1}{2} & \frac{1}{2}\\ 1 & -1 & 2\\ \hline & \frac{1}{6} & \frac{2}{3} & \frac{1}{6} \end{array}\quad\quad\text{and}\quad\quad\begin{array}{c|ccc} 0\\ \frac{2}{3} & \frac{2}{3}\\ \frac{2}{3} & 0 & \frac{2}{3}\\ \hline & \frac{1}{4} & \frac{3}{8} & \frac{3}{8} \end{array}.\tag{11.45}\] \(\text{ }\)Four-stage method. A common \(k=4\)-stage method is \[\begin{array}{c|cccc} 0\\ \frac{1}{2} & \frac{1}{2}\\ \frac{1}{2} & 0 & \frac{1}{2}\\ 1 & 0 & 0 & 1\\ \hline & \frac{1}{6} & \frac{1}{3} & \frac{1}{3} & \frac{1}{6} \end{array},\tag{11.46}\] which is of order \(p=4.\)
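The conditions (11.44) are easy to check mechanically; a small numpy sketch (the function is ours), applied to the first tableau in (11.45):

```python
import numpy as np

def third_order_conditions(A, b, c):
    """Check the third-order conditions (11.44) for a 3-stage ERK tableau
    with c_1 = 0 (so b_1*c_1 terms drop out of the dot products)."""
    lhs = [b.sum(), np.dot(b, c), np.dot(b, c**2), b[2] * A[2, 1] * c[1]]
    return np.allclose(lhs, [1.0, 1/2, 1/3, 1/6])

A = np.array([[0, 0, 0], [1/2, 0, 0], [-1, 2, 0]], dtype=float)
b = np.array([1/6, 2/3, 1/6])
c = np.array([0, 1/2, 1], dtype=float)
print(third_order_conditions(A, b, c))  # True
```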

\(\text{ }\)Order barriers. The \(k=2\)-stage methods (11.40) are of order \(p=2,\) the \(k=3\)-stage methods (11.45) are of order \(p=3,\) and the \(k=4\)-stage method (11.46) is of order \(p=4.\) But the pattern does not continue: there is no explicit Runge-Kutta method with \(k=5\) stages which is of order \(p=5\)! Instead at least \(k=6\) stages are required to obtain a method of order \(p=5.\) The following table shows the highest achievable order for ERK methods with \(k=1,2,\ldots,9\) stages.

Order barriers for explicit Runge-Kutta methods.
# stages \(k\) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Highest order | 1 | 2 | 3 | 4 | 4 | 5 | 6 | 6 | 7
\(\text{ }\)Rapid growth of the number of order conditions. The number of order conditions that need to be verified to show that an RK method is of order \(p\) grows quickly. The following table gives the number for orders \(1\le p\le10\) (it uses a slightly different way of counting than we do, e.g. we have three conditions for \(p=2\) in (11.39), but the qualitative picture still stands).
Number of order conditions for Runge-Kutta methods.
Order \(p\) to verify | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
# order conditions | 1 | 2 | 4 | 8 | 17 | 37 | 85 | 200 | 486 | 1205

Studying the order of general RK methods is challenging. There exists a combinatorial theory based on a correspondence between RK methods and trees (in the graph theoretic sense), which was used to compute the information in the tables. It is beyond the scope of this course. The last section of this chapter studies a special subclass of implicit Runge-Kutta methods whose order can be analyzed using the quadrature theory of the previous section.

11.3 Implicit Runge-Kutta (IRK) methods

An implicit Runge-Kutta (IRK) method with \(k\) stages takes the form \[\begin{array}{ccl} \xi_{j} & = & y_{n}+h\sum_{i=1}^{k}a_{j,i}F(t_{n}+c_{i}h,\xi_{i}),\quad j=1,\ldots,k,\\ y_{n+1} & = & y_{n}+h\sum_{i=1}^{k}b_{i}F(t_{n}+c_{i}h,\xi_{i}), \end{array}\tag{11.47}\] which generalizes (11.31) by allowing the approximation \(\xi_{j}\) of each stage to depend on all other stages, including later ones. Correspondingly the RK matrix \(A\) is an arbitrary \(k\times k\) matrix. We no longer require \(c_{1}=0.\)

An implementation of an IRK method must approximate a solution to the system of \(k\) non-linear equations in the top line of (11.47), which are in addition \(\mathbb{R}^{d}\)-valued for an \(\mathbb{R}^{d}\)-valued ODE. This can be rather costly in terms of computation. But IRK methods can also be more stable, and can achieve higher order for the same number of stages than an ERK, so sometimes the trade-off is worth it.

Example 11.8

The Backward Euler method is an IRK with \(k=1\) stages and tableau \[\begin{array}{c|c} 1 & 1\\ \hline & 1 \end{array}.\]
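A sketch of one backward Euler step for a scalar ODE, solving the implicit equation with Newton's method (the helper and its extra argument dFdy, the derivative of F in y, are our own additions; fixed-point iteration would also work for small h):

```python
def backward_euler_step(F, dFdy, t_n, y_n, h, num_iters=10):
    """One step of backward Euler, the k=1 IRK above: solve
    y_{n+1} = y_n + h F(t_{n+1}, y_{n+1}) by Newton's method."""
    t1 = t_n + h
    y = y_n + h * F(t_n, y_n)             # explicit Euler predictor
    for _ in range(num_iters):
        g = y - y_n - h * F(t1, y)        # we seek the root of g
        y -= g / (1.0 - h * dFdy(t1, y))  # Newton update, g'(y) = 1 - h dF/dy
    return y

# y' = -y: the implicit equation is linear; the exact solve gives y_n/(1+h).
print(backward_euler_step(lambda t, y: -y, lambda t, y: -1.0, 0.0, 1.0, 0.1))
```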

\(\text{ }\)Order of IRK methods. Let \(\tilde{y}_{n+1}\) denote the value of the last line of (11.47) after replacing \(y_{n}\) with the true value \(y(t_{n})\) in the equations (11.47), and assume that for each \(h>0\) small enough the intermediate values \(\xi_{1},\ldots,\xi_{k}\) are an exact solution of the (modified) top line of (11.47). If the bound \[\tilde{y}_{n+1}-y(t_{n+1})=O(h^{p+1})\] holds for all analytic \(F\) we say that the IRK method is of order (at least) \(p\).

Example 11.9

Consider the two stage IRK method whose tableau is \[\begin{array}{c|cc} 0 & \frac{1}{4} & -\frac{1}{4}\\ \frac{2}{3} & \frac{1}{4} & \frac{5}{12}\\ \hline & \frac{1}{4} & \frac{3}{4} \end{array}.\tag{11.48}\] Written out as equations it corresponds to \[\begin{array}{lcl} \xi_{1} & = & y_{n}+h\frac{1}{4}F(t_{n},\xi_{1})-h\frac{1}{4}F(t_{n}+\frac{2}{3}h,\xi_{2}),\\ \xi_{2} & = & y_{n}+h\frac{1}{4}F(t_{n},\xi_{1})+h\frac{5}{12}F(t_{n}+\frac{2}{3}h,\xi_{2}),\\ y_{n+1} & = & y_{n}+h\frac{1}{4}F(t_{n},\xi_{1})+h\frac{3}{4}F(t_{n}+\frac{2}{3}h,\xi_{2}). \end{array}\] Let us verify that it has order at least \(p=3,\) under the simplifying assumption that \(F\) is autonomous. We replace \(y_{n}\) with the exact value \(y(t_{n}),\) yielding \[\begin{array}{lll} \tilde{\xi}_{1} & = & y(t_{n})+h\frac{1}{4}F(\tilde{\xi}_{1})-h\frac{1}{4}F(\tilde{\xi}_{2}),\\ \tilde{\xi}_{2} & = & y(t_{n})+h\frac{1}{4}F(\tilde{\xi}_{1})+h\frac{5}{12}F(\tilde{\xi}_{2}),\\ \tilde{y}_{n+1} & = & y(t_{n})+h\frac{1}{4}F(\tilde{\xi}_{1})+h\frac{3}{4}F(\tilde{\xi}_{2}). \end{array}\] We now expand \(\tilde{y}_{n+1}\) in \(h\) up to order \(h^{3},\) and compare to the expansion of \(y(t_{n+1})\) up to order \(h^{3}\) for autonomous \(F,\) which was already carried out in (11.43). Writing \(y=y(t_{n})\) and \(\Delta_{i}=\tilde{\xi}_{i}-y\) the equations turn into \[\begin{array}{lll} \Delta_{1} & = & \frac{h}{4}F(y+\Delta_{1})-h\frac{1}{4}F(y+\Delta_{2}),\\ \Delta_{2} & = & \frac{h}{4}F(y+\Delta_{1})+h\frac{5}{12}F(y+\Delta_{2}),\\ \tilde{y}_{n+1}-y & = & \frac{h}{4}F(y+\Delta_{1})+h\frac{3}{4}F(y+\Delta_{2}). \end{array}\] The top lines imply that \(\Delta_{1},\Delta_{2}=O(h).\) Using the expansions \[F(y+\Delta_{i})=F+\Delta_{i}F'+\frac{\Delta_{i}^{2}}{2}F''+O(h^{3})\] the bottom line turns into \[\tilde{y}_{n+1}-y=hF+h\left(\frac{1}{4}\Delta_{1}+\frac{3}{4}\Delta_{2}\right)F'+h\left(\frac{1}{8}\Delta_{1}^{2}+\frac{3}{8}\Delta_{2}^{2}\right)F''+O(h^{4}).\tag{11.49}\] We see that we need the expansion of \(\Delta_{1}\) and \(\Delta_{2}\) up to order \(h^{2}\) to be able to determine all terms up to order \(h^{3}\) in the bottom row. Using the expansions \[F(y+\Delta_{i})=F+\Delta_{i}F'+O(h^{2})\] in the top two lines we obtain \[\begin{array}{lll} \Delta_{1} & = & \frac{h}{4}(\Delta_{1}-\Delta_{2})F'+O(h^{3}),\\ \Delta_{2} & = & h\frac{2}{3}F+h\left(\frac{1}{4}\Delta_{1}+\frac{5}{12}\Delta_{2}\right)F'+O(h^{3}). \end{array}\] The top line implies that \(\Delta_{1}=O(h^{2}),\) which allows us to eliminate it from the right-hand sides of the top two lines, yielding \[\begin{array}{lll} \Delta_{1} & = & -\frac{h}{4}\Delta_{2}F'+O(h^{3}),\\ \Delta_{2} & = & h\frac{2}{3}F+h\frac{5}{12}\Delta_{2}F'+O(h^{3}). \end{array}\] Substituting \(\Delta_{2}=h\frac{2}{3}F+O(h^{2})\) to eliminate all \(\Delta_{2}\) on the right-hand sides yields \[\begin{array}{lll} \Delta_{1} & = & -\frac{h^{2}}{6}FF'+O(h^{3}),\\ \Delta_{2} & = & h\frac{2}{3}F+h^{2}\frac{5}{18}FF'+O(h^{3}). \end{array}\] Plugging these into the bottom line (11.49) gives \[\tilde{y}_{n+1}-y=hF+h\left(\frac{1}{4}\left(-\frac{h^{2}}{6}FF'\right)+\frac{3}{4}\left(h\frac{2}{3}F+h^{2}\frac{5}{18}FF'\right)\right)F'+h\frac{3}{8}\left(h\frac{2}{3}F\right)^{2}F''+O(h^{4}).\] Collecting terms we get \[\tilde{y}_{n+1}-y=hF+\frac{h^{2}}{2}FF'+h^{3}K+O(h^{4}),\] for \[K=\frac{1}{4}\left(-\frac{1}{6}\right)F(F')^{2}+\frac{3}{4}\cdot\frac{5}{18}F(F')^{2}+\frac{3}{8}\left(\frac{2}{3}\right)^{2}F^{2}F''.\] The last expression simplifies to \(K=\frac{F(F')^{2}+F^{2}F''}{6},\) so \[\tilde{y}_{n+1}-y=hF+\frac{h^{2}}{2}FF'+h^{3}\frac{F(F')^{2}+F^{2}F''}{6}+O(h^{4}).\] This matches (11.43) up to order \(h^{3},\) so we have shown that \[\tilde{y}_{n+1}-y(t_{n+1})=O(h^{4})\] for every analytic autonomous \(F.\) The method is thus of order \(p=3\) “for autonomous \(F\)” - in reality it is of order \(p=3\) in the proper sense, though we refrain from proving it.

11.4 Collocation methods

Collocation methods are a subclass of IRKs that have a more concrete quantitative connection to quadrature, and whose order can be analyzed using the quadrature results of Section 11.1.

Each \(k\ge1\) and choice of distinct \(c_{1},\ldots,c_{k}\) gives rise to a unique collocation method, i.e. uniquely specifies the RK matrix \(A\) and the RK weights \(b_{1},\ldots,b_{k}\) of a \(k\)-stage IRK. The method can be defined in two equivalent ways: either in terms of polynomial approximation, or as the IRK with certain parameters.

\(\text{ }\)Polynomial interpolation definition. The polynomial interpolation description starts with \(k\ge1\) and distinct \(c_{1},\ldots,c_{k}.\) It then defines the next approximation \(y_{n+1}\) of \(y(t_{n+1})\) from an already computed approximation \(y_{n}\) of \(y(t_{n})\) in terms of a polynomial \(u\) of degree at most \(k\) such that \[\begin{array}{l} u(t_{n})=y_{n},\\ u'(t_{n}+c_{l}h)=F(t_{n}+c_{l}h,u(t_{n}+c_{l}h)),\quad1\le l\le k. \end{array}\tag{11.51}\] Note that the second line of (11.51) requires that the polynomial \(u\) satisfies the ODE condition \(u'(t)=F(t,u(t))\) not at all \(t\in[t_{n},t_{n+1}],\) but only at the \(k\) collocation points \(t=t_{n}+c_{l}h.\) The collocation method then sets \[y_{n+1}=u(t_{n+1}).\tag{11.52}\] From its degree \(k\) we see that the number of degrees of freedom in picking \(u\) is \(k+1.\) On the other hand (11.51) represents \(k+1\) constraints on \(u.\) Since the number of degrees of freedom and the number of constraints match, one may expect that there is a unique \(u\) that satisfies (11.51) (this heuristic argument is far from a proof however, since the constraints in the bottom row of (11.51) are non-linear equations when \(F\) is non-linear).

\(\text{ }\)Definition as IRK. For any \(k\ge1\) and distinct \(c_{1},\ldots,c_{k}\in\mathbb{R}\) consider the polynomials \[p_{i}(t)=\prod_{j\ne i}\frac{t-c_{j}}{c_{i}-c_{j}},\quad1\le i\le k,\tag{11.53}\] (which are the basis polynomials for Lagrange interpolating polynomials with nodes given by the \(c_{i}\)). The collocation method corresponding to \(c_{1},\ldots,c_{k}\) turns out to be equivalent to the IRK \[\begin{array}{l} \xi_{l}=y_{n}+h\sum_{i=1}^{k}a_{l,i}F(t_{n}+c_{i}h,\xi_{i}),\quad l=1,\ldots,k,\\ y_{n+1}=y_{n}+h\sum_{i=1}^{k}b_{i}F(t_{n}+c_{i}h,\xi_{i}), \end{array}\tag{11.54}\] for \[a_{l,i}=\int_{0}^{c_{l}}p_{i}(t)dt\quad\quad\text{and}\quad\quad b_{i}=\int_{0}^{1}p_{i}(t)dt\quad\quad\text{for}\quad\quad1\le i,l\le k,\tag{11.55}\] as proved by the following lemma.
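The coefficients (11.55) are integrals of polynomials and can be computed exactly; a numpy sketch (the helper name is ours):

```python
import numpy as np
from numpy.polynomial import polynomial as P

def collocation_tableau(c):
    """Compute the IRK coefficients (11.55) for distinct nodes c_1,...,c_k:
    a[l,i] = int_0^{c_l} p_i(t) dt and b[i] = int_0^1 p_i(t) dt, where p_i
    is the Lagrange basis polynomial (11.53)."""
    c = np.asarray(c, dtype=float)
    k = len(c)
    A, b = np.zeros((k, k)), np.zeros(k)
    for i in range(k):
        p = np.array([1.0])               # coefficients of p_i, lowest first
        for j in range(k):
            if j != i:
                p = P.polymul(p, [-c[j], 1.0]) / (c[i] - c[j])
        q = P.polyint(p)                  # antiderivative with q(0) = 0
        b[i] = P.polyval(1.0, q)
        A[:, i] = P.polyval(c, q)
    return A, b

A, b = collocation_tableau([0.5])
print(A, b)  # [[0.5]], [1.0]: the implicit midpoint rule, cf. Example 11.14
```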

Lemma 11.10

Let \(k\ge1\) and fix distinct \(c_{1},\ldots,c_{k}\in\mathbb{R}.\) Let \(a_{l,i},b_{i}\) be as in (11.55). Let \(n\ge0\) and \(y_{n}\in\mathbb{R}^{d}.\)
a) If \(\xi_{1},\ldots,\xi_{k}\in\mathbb{R}^{d}\) satisfy the first line of (11.54) (i.e. if the \(\xi_{l}\) are the result of the intermediate stages of the IRK corresponding to (11.55) going from \(t_{n}\) to \(t_{n+1}\)), then there exists a polynomial \(u(t)\) of degree at most \(k\) that satisfies (11.51) and \[u(t_{n+1})=y_{n}+h\sum_{i=1}^{k}b_{i}F(t_{n}+c_{i}h,\xi_{i}),\tag{11.56}\] (i.e. the collocation method (11.51)-(11.52) going from \(t_{n}\) to \(t_{n+1}\) produces the same \(y_{n+1}\) as the IRK (11.54)).
b) Conversely, if \(u(t)\) is a polynomial of degree at most \(k\) that satisfies (11.51) then there exist \(\xi_{1},\ldots,\xi_{k}\in\mathbb{R}^{d}\) that satisfy the first line of (11.54) and (11.56) (i.e. if the collocation method (11.51)-(11.52) yields the approximation \(u(t_{n+1})\) of \(y(t_{n+1})\) when going from \(t_{n}\) to \(t_{n+1},\) then the IRK (11.54) also approximates \(y(t_{n+1})\) by \(u(t_{n+1})\)).

Proof. Note that the \(a_{l,i},b_{i}\) of (11.55) can equivalently be defined via \[ha_{l,i}=\int_{t_{n}}^{t_{n}+c_{l}h}p_{i}\left(\frac{t-t_{n}}{h}\right)dt\quad\quad\text{and}\quad\quad hb_{i}=\int_{t_{n}}^{t_{n+1}}p_{i}\left(\frac{t-t_{n}}{h}\right)dt,\tag{11.57}\] (the equivalence follows from the change of variables \(s=\frac{t-t_{n}}{h}\)).

a) Using the characterization (11.57) of \(a_{l,i}\) it follows from (11.54) that \[\xi_{l}=y_{n}+\int_{t_{n}}^{t_{n}+c_{l}h}\sum_{i=1}^{k}p_{i}\left(\frac{t-t_{n}}{h}\right)F(t_{n}+c_{i}h,\xi_{i})dt,\quad l=1,\ldots,k.\] In other words \[r(t)=\sum_{i=1}^{k}p_{i}\left(\frac{t-t_{n}}{h}\right)F(t_{n}+c_{i}h,\xi_{i}),\tag{11.58}\] satisfies \[\xi_{l}=y_{n}+\int_{t_{n}}^{t_{n}+c_{l}h}r(t)dt,\quad l=1,\ldots,k.\tag{11.59}\] The function \(r(t)\) is a polynomial of degree at most \(k-1,\) and is the Lagrange interpolating polynomial satisfying \[r(t_{n}+c_{l}h)=F(t_{n}+c_{l}h,\xi_{l}),\quad l=1,\ldots,k.\tag{11.60}\] The integral \[u(t):=y_{n}+\int_{t_{n}}^{t}r(s)ds\tag{11.61}\] is a polynomial of degree at most \(k,\) trivially satisfies the first line of (11.51), and also satisfies \[u(t_{n}+c_{l}h)=y_{n}+\int_{t_{n}}^{t_{n}+c_{l}h}r(s)ds=\xi_{l},\tag{11.62}\] (by (11.59) and (11.61)) and therefore \[u'(t_{n}+c_{l}h)=r(t_{n}+c_{l}h)=F(t_{n}+c_{l}h,\xi_{l})=F(t_{n}+c_{l}h,u(t_{n}+c_{l}h)),\tag{11.63}\] (from (11.60)-(11.62)). Thus \(u\) is a polynomial of degree at most \(k\) that satisfies (11.51). Finally we have \[\begin{array}{ccccl} u(t_{n+1}) & = & y_{n}+\int_{t_{n}}^{t_{n+1}}r(t)dt & = & y_{n}+\sum_{i=1}^{k}\left(\int_{t_{n}}^{t_{n+1}}p_{i}(\frac{t-t_{n}}{h})dt\right)F(t_{n}+c_{i}h,\xi_{i})\\ & & & = & y_{n}+h\sum_{i=1}^{k}b_{i}F(t_{n}+c_{i}h,\xi_{i}) \end{array}\] (from (11.61), (11.58) and (11.57)). Thus \(u\) also satisfies (11.56).

b) Given a polynomial \(u\) of degree at most \(k\) that satisfies (11.51), define \[\xi_{l}:=u(t_{n}+c_{l}h),\quad l=1,\ldots,k.\tag{11.64}\] The derivative \(u'(t)\) is a polynomial of degree at most \(k-1\) such that \[u'(t_{n}+c_{i}h)=F(t_{n}+c_{i}h,\xi_{i}),\quad i=1,\ldots,k,\] (from the second line of (11.51) and (11.64)). But there is only one such polynomial (namely the Lagrange interpolating polynomial), so in fact \[u'(t)=\sum_{i=1}^{k}p_{i}\left(\frac{t-t_{n}}{h}\right)F(t_{n}+c_{i}h,\xi_{i}).\] This implies that \[u(t)-u(t_{n})=\sum_{i=1}^{k}\left(\int_{t_{n}}^{t}p_{i}\left(\frac{s-t_{n}}{h}\right)ds\right)F(t_{n}+c_{i}h,\xi_{i})\tag{11.65}\] for all \(t.\) The case \(t=t_{n}+c_{l}h\) of (11.65) yields \[\xi_{l}-y_{n}=h\sum_{i=1}^{k}a_{l,i}F(t_{n}+c_{i}h,\xi_{i}),\quad l=1,\ldots,k,\] (using (11.64), the first line of (11.51) and (11.57)), which is a restatement of the first line of (11.54). Similarly the case \(t=t_{n+1}\) of (11.65) yields (11.56).

We emphasize that only some IRK methods are also collocation methods - most IRK methods have no collocation interpretation. The only advantage of IRK methods that are also collocation methods is that it is easier to rigorously prove results about their order, which is the goal of this section.

\(\text{ }\)Order of collocation methods. To analyze the order of collocation methods we will use the following remarkable formula concerning functions \(v(t)\) that are “almost” solutions of an IVP. One might say that the IVP \(y'(t)=F(t,y(t)),t\ge t_{0},y(t_{0})=y_{0},\) is “almost” solved by any \(v(t)\) such that \(v'(t)\approx F(t,v(t)),t\ge t_{0},v(t_{0})=y_{0},\) and one might hope that such a \(v(t)\) is a good approximation of \(y(t).\) The identity (11.67) expresses the error \(y(t)-v(t)\) in the latter approximation in terms of the error \(v'(t)-F(t,v(t))\) in the former.

Theorem 11.11 • The Alekseev-Gröbner Lemma

Let \(d\ge1\) and let \(y'(t)=F(t,y(t)),t\ge t_{0},y(t_{0})=y_{0}\) be an \(\mathbb{R}^{d}\)-valued IVP for a smooth \(F\) with smooth solution \(y(t).\) Then for any smooth \(v(t)\) with \(v(t_{0})=y_{0},\) there exists an \(\mathbb{R}^{d\times d}\)-valued smooth function \(\Phi(t,y)\) such that \[v(t)-y(t)=\int_{t_{0}}^{t}\Phi(t-s,v(t-s))\left(v'(s)-F(s,v(s))\right)ds,\quad\quad t\ge t_{0}.\tag{11.67}\]

Remark 11.12 Recall that the first order linear \(\mathbb{R}\)-valued ODE \[z'(t)=a(t)z(t)+b(t),t\ge t_{0},z(t_{0})=0,\tag{11.68}\] has the explicit solution given in Corollary 3.25. Let us write it as \[z(t)=\int_{t_{0}}^{t}\Phi(t-s)b(s)ds\quad\quad\text{for}\quad\quad\Phi(t-s)=e^{\int_{s}^{t}a(r)dr},\] cf. (11.67). As we did in the part “Heuristically deriving ansatz for \(b(t)\ne0\)” of Section 3.5, one can interpret this formula in terms of the “infinitesimal additive increments” that the term \(b(t)\) in (11.68) causes during “infinitesimal time intervals” \([s,s+\varepsilon],\) for \(\varepsilon\downarrow0.\) The increment added to \(z\) during \([s,s+\varepsilon]\) can be thought of as \(\approx b(s)\varepsilon,\) but once this increment has been “added” to \(z(s+\varepsilon)\) it is then subject to the evolution of the ODE in the interval \([s+\varepsilon,t]\) “corresponding” to the other term \(a(t)z(t)\) in (11.68). The latter is captured by the factor \(\Phi(t-s),\) which “projects forward” the increment \(b(s)\varepsilon\) “added to” \(z\) during \([s,s+\varepsilon]\) to a resulting contribution to \(z(t)\) at the later time \(t>s\) of essentially \(\Phi(t-s)b(s)\varepsilon.\)

The formula (11.67) can be interpreted in a similar way: the error in the approximation \(v'(s)\approx F(s,v(s))\) at time \(s\) is \(v'(s)-F(s,v(s)),\) and this error produces an “infinitesimal additive increment” to the error \(v-y\) of about \((v'(s)-F(s,v(s)))\varepsilon\) in the interval \([s,s+\varepsilon],\varepsilon\downarrow0,\) which is “projected forward” by the factor \(\Phi(t-s,v(t-s))\) so that the resulting contribution at time \(t>s\) is essentially \(\Phi(t-s,v(t-s))(v'(s)-F(s,v(s)))\varepsilon.\)

We omit the full proof of (11.67), and instead only make it plausible by verifying it in the special case of \(F(t,y)\) autonomous and linear in \(y.\) For such \(F\) the identity, with an explicit \(\Phi,\) can be derived by direct computation.

Proof for linear autonomous \(F(t,y)\). If \(F(t,y)\) is linear and autonomous then there is a matrix \(A\) such that \(F(t,y)=Ay.\) Then the IVP reads \[y'=Ay,t\ge t_{0},y(t_{0})=y_{0}.\] By Theorem 7.10 the unique solution is \[y(t)=e^{(t-t_{0})A}y_{0}.\tag{11.69}\] Furthermore letting \(\Delta(t)=v'(t)-F(t,v(t))\) we have \(\Delta(t)=v'(t)-Av(t)\) by linearity, so \(v(t)\) satisfies the IVP \[v'(t)=Av(t)+\Delta(t),t\ge t_{0},v(t_{0})=y_{0},\tag{11.70}\] in terms of \(\Delta.\) In the scalar case \(A,y_{0},v(t),y(t)\in\mathbb{R}\) Corollary 3.25 provides the unique solution \(v(t)=e^{A(t-t_{0})}y_{0}+\int_{t_{0}}^{t}e^{A(t-s)}\Delta(s)ds.\) When \(d\ge2\) this inspires the ansatz \[v(t)=e^{(t-t_{0})A}y_{0}+\int_{t_{0}}^{t}e^{(t-s)A}\Delta(s)ds,\tag{11.71}\] which by Lemma 7.9 and the fundamental theorem of calculus satisfies \[v'(t)=Av(t)+\Delta(t),t\ge t_{0},\] so that (11.71) indeed solves the IVP (11.70). E.g. by Theorem 7.12 it is the unique solution.

It follows directly from (11.69) and (11.71) that \[v(t)-y(t)=\int_{t_{0}}^{t}e^{(t-s)A}\Delta(s)ds.\] Defining \(\Phi(s,v)=e^{sA},\) which is smooth, the r.h.s. equals \[\int_{t_{0}}^{t}\Phi(t-s,v(t-s))\Delta(s)ds,\] which yields the identity (11.67) in this special case.

We can now use the identity (11.67) to prove that if \(k\ge1\) and \(c_{1},\ldots,c_{k}\) are the nodes of Gaussian quadrature on \([0,1],\) then the collocation method (11.51) for these \(c_{i}\) has order exactly \(2k,\) in the sense of the order of an IRK method of Section 11.3. Compare this to the order of the quadrature rule corresponding to \(c_{1},\ldots,c_{k}\) in the sense of Section 11.1, which is also exactly \(2k\) by Theorem 11.7.
Theorem 11.13

Let \(k\ge1\) and let \(c_{1},\ldots,c_{k}\) be the distinct roots of the \(k\)-th orthogonal polynomial \(P_{k}\) from (11.15) for the interval \([0,1].\) Then the collocation method (11.51) is of order exactly \(2k.\)

Proof. For an arbitrary IVP \(y'(t)=F(t,y(t)),t\ge t_{0},y(t_{0})=y_{0}\) with analytic \(F,\) and any \(n\ge0\) and \(y_{n},\) let \(\tilde{y}\) denote the analytic solution of \(y'(t)=F(t,y(t)),t\ge t_{n},y(t_{n})=y_{n},\) and let \(y_{n+1}\) denote the result of the collocation method (11.51)-(11.52) going from \(t_{n}\) to \(t_{n+1}.\) We must show that \[y_{n+1}-\tilde{y}(t_{n+1})=O(h^{2k+1}).\] Let \(h>0\) and let \(u(t)\) be a polynomial of degree at most \(k\) such that \[u(t_{n})=y_{n},\quad\quad u'(t_{n}+c_{l}h)=F(t_{n}+c_{l}h,u(t_{n}+c_{l}h)),\quad l=1,\ldots,k,\tag{11.73}\] so that \(y_{n+1}=u(t_{n+1})\) and \[y_{n+1}-\tilde{y}(t_{n+1})=u(t_{n+1})-\tilde{y}(t_{n+1}).\tag{11.74}\] We use the identity (11.67) with \(t_{n}\) in place of \(t_{0},\) \(t_{n+1}\) in place of \(t,\) \(u\) in place of \(v\) and \(\tilde{y}\) in place of \(y,\) which yields \[u(t_{n+1})-\tilde{y}(t_{n+1})=\int_{t_{n}}^{t_{n+1}}f(t)dt,\] where \[f(t)=\phi(t)\Delta(t),\quad\quad\phi(t)=\Phi(t_{n+1}-t,u(t_{n+1}-t)),\quad\quad\Delta(t)=u'(t)-F(t,u(t)),\] and the functions \(\Phi,\) \(\phi,\Delta,f\) are smooth. With the change of variables \(t=t_{n}+sh\) we obtain that \[u(t_{n+1})-\tilde{y}(t_{n+1})=h\int_{0}^{1}g(s)ds\quad\quad\text{for}\quad\quad g(s)=f(t_{n}+sh).\] We approximate \(\int_{0}^{1}g(s)ds\) using Gaussian quadrature, and bound the error that arises. Firstly, note that since \(u\) satisfies (11.73) we have \(\Delta(t_{n}+c_{l}h)=0\) for all \(l=1,\ldots,k,\) so also \(g(c_{l})=f(t_{n}+c_{l}h)=0\) for all \(l.\) But then the quadrature approximation of \(\int_{0}^{1}g(s)ds\) is simply \(\sum_{l=1}^{k}b_{l}g(c_{l})=0.\) Since \(g\) is smooth and the Gaussian quadrature rule is of order \(2k,\) Lemma 11.4 implies the error bound \[\left|\int_{0}^{1}g(s)ds\right|\le c\sup_{s\in[0,1]}\left|g^{(2k)}(s)\right|=ch^{2k}\sup_{s\in[0,1]}\left|f^{(2k)}(t_{n}+sh)\right|.\] This implies that for \(0<h\le1\) the error (11.74) is \(O(h^{2k+1}).\) Thus the collocation method (11.51) for these \(c_{i}\) is of order at least \(2k.\)

The proof that the method is not of order \(2k+1\) or higher is an exercise on sheet 11.

The collocation methods with \(c_{i}\) given by the roots of the Legendre polynomial \(P_{k}\) on \([0,1]\) are called Gauss-Legendre (Runge-Kutta) methods. Recalling (11.18), the roots \(c_{1},\ldots,c_{k}\) are related to the roots \(\tilde{c}_{1},\ldots,\tilde{c}_{k}\) of the Legendre polynomial \(Q_{k}\) for the interval \([-1,1]\) by \[c_{i}=\frac{\tilde{c}_{i}+1}{2}.\]

Example 11.14

(\(k=1\)). For \(k=1\) the Legendre polynomial for \([-1,1]\) is \[Q_{1}(t)=t\] (recall (11.17)) with root \(\tilde{c}_{1}=0,\) so \(P_{1}\) has root \(c_{1}=\frac{1}{2}.\) We have \(p_{1}(t)=1\) so \[a_{1,1}=\int_{0}^{c_{1}}dt=c_{1}=\frac{1}{2}\] and \[b_{1}=\int_{0}^{1}dt=1.\] Thus the Gauss-Legendre method for \(k=1\) has the tableau \[\begin{array}{c|c} \frac{1}{2} & \frac{1}{2}\\ \hline & 1 \end{array}.\] It is a method of order \(2.\)

Example 11.15

(\(k=2\)). For \(k=2\) the computation yields the tableau \[\begin{array}{c|cc} \frac{1}{2}-\frac{1}{2\sqrt{3}} & \frac{1}{4} & \frac{1}{4}-\frac{1}{2\sqrt{3}}\\ \frac{1}{2}+\frac{1}{2\sqrt{3}} & \frac{1}{4}+\frac{1}{2\sqrt{3}} & \frac{1}{4}\\ \hline & \frac{1}{2} & \frac{1}{2} \end{array}.\tag{11.75}\] It is a method of order \(4.\)
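This tableau can be reproduced with the collocation_tableau helper sketched in Section 11.4, feeding it the roots of \(P_{2}\) on \([0,1]\) (obtained here from NumPy's Gauss-Legendre nodes on \([-1,1]\)):

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

x, _ = leggauss(2)
c = (x + 1) / 2                # roots of P_2 on [0,1]: 1/2 -+ 1/(2*sqrt(3))
A, b = collocation_tableau(c)  # helper sketched in Section 11.4
print(A)  # [[1/4, 1/4 - 1/(2*sqrt(3))], [1/4 + 1/(2*sqrt(3)), 1/4]]
print(b)  # [1/2, 1/2], matching (11.75)
```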

Example 11.16

(\(k=3\)). For \(k=3\) the computation yields the tableau \[\begin{array}{c|ccc} \frac{1}{2}-\frac{\sqrt{15}}{10} & \frac{5}{36} & \frac{2}{9}-\frac{\sqrt{15}}{15} & \frac{5}{36}-\frac{\sqrt{15}}{30}\\ \frac{1}{2} & \frac{5}{36}+\frac{\sqrt{15}}{24} & \frac{2}{9} & \frac{5}{36}-\frac{\sqrt{15}}{24}\\ \frac{1}{2}+\frac{\sqrt{15}}{10} & \frac{5}{36}+\frac{\sqrt{15}}{30} & \frac{2}{9}+\frac{\sqrt{15}}{15} & \frac{5}{36}\\ \hline & \frac{5}{18} & \frac{4}{9} & \frac{5}{18} \end{array}.\] The method is of order \(6.\)

11.5 Global convergence

In contrast to multistep methods, every Runge-Kutta method of order \(p\) is (globally) convergent of order \(p,\) meaning that it is convergent and the global error is of order \(h^{p}\) for IVPs with analytic \(F.\) We record this in the next theorem, whose proof we omit.

Theorem 11.17

Consider a general \(k\)-stage Runge-Kutta method ([eq:ch11_IRK_general_form]). If the method is of order \(p,\) then for any analytic \(F\) and IVP \(y'(t)=F(t,y(t)),t\in[t_{0},t_{\max}],y(t_{0})=y_{0}\) with a unique analytic solution \(\tilde{y}(t)\) the approximations \(y_{n},n\ge0,\) satisfy the global error estimate \[\max_{n:t_{n}\le t_{\max}}\left|y_{n}-\tilde{y}(t_{n})\right|=O(h^{p}).\]
