10 Numerics II: Multistep methods
As for the previous chapter, and probably for the coming chapter, the exposition here borrows heavily from A First Course in the Numerical Analysis of Differential Equations by Arieh Iserles.
Euler’s method from Definition 9.1 is of order \(p=1.\)
The trapezoidal rule method of Definition 9.9 is of order \(p=2.\)
Euler’s method from Definition 9.1 is convergent.
The trapezoidal rule method from Definition 9.9 is convergent.
Furthermore for any analytic \(F\) and solution \(y\) to the IVP ([eq:ch9_numerical_time_stepping_def_IVP]) there exist constants \(c_{1},c_{2}>0\) such that if \(0<h\le c_{1}\) then \[\max_{0\le n\le\frac{t_{\max}-t_{0}}{h}}\left|e_{n}\right|\le c_{2}h^{2}.\tag{9.53}\]
Luckily, there is a simple condition which does guarantee convergence of a consistent multistep method. It will be introduced in Section 10.4.
\(\text{ }\)Notation for steps. A one step method uses the previous approximation \(y_{m}\) to produce the next approximation \(y_{m+1},\) and then \(y_{m+1}\) to produce \(y_{m+2},\) and so on, i.e. it has the form \[y_{n+1}=Y(F,t_{n},y_{n}),\quad n\ge0.\] A multistep method with two steps uses the two previous approximations \(y_{n},y_{n+1}\) to produce the next approximation \(y_{n+2},\) and then \(y_{n+1},y_{n+2}\) to produce \(y_{n+3},\) and so on, i.e. it has the form \[y_{n+2}=Y(F,t_{n},y_{n},y_{n+1}),\quad n\ge0.\] A multistep method with \(s\ge1\) steps uses the \(s\) previous approximations \(y_{n},y_{n+1},\ldots,y_{n+s-1}\) to produce the next approximation \(y_{n+s},\) and then the \(s\) approximations \(y_{n+1},\ldots,y_{n+s}\) to produce \(y_{n+s+1},\) and so on, i.e. it has the form \[y_{n+s}=Y(F,t_{n},y_{n},y_{n+1},\ldots,y_{n+s-1}),\quad n\ge0.\]
\(\text{ }\)Starting multistep methods from only \(y_{0}.\) The rule of an \(s\)-step method can only be used if one already has the \(s\) approximations \(y_{0},\ldots,y_{s-1}.\) When numerically solving a standard IVP \[\dot{y}=F(t,y),\quad\quad y(t_{0})=y_{0},\] one only has the one initial \(y_{0}\) to begin with. The solution is simply to use one or more methods with fewer steps to obtain \(y_{1},\ldots,y_{s-1},\) and after that point switch to the chosen \(s\)-step method.
10.1 Adams-Bashforth methods
We first study the Adams-Bashforth multistep methods, which can be thought of as exploiting polynomial interpolation to extract more accuracy from multiple steps.
\(\text{ }\)Starting point. As for Euler and trapezoidal rule methods, the starting point is the identity \[y(t_{n+s})-y(t_{n+s-1})=\int_{t_{n+s-1}}^{t_{n+s}}F(t,y(t))dt,\tag{10.2}\] which any solution of the ODE \(\dot{y}=F(t,y)\) must satisfy. This is combined with an approximation of the definite integral on the r.h.s. of ([eq:ch9_a_d_step_integral]). The Euler method arises from applying the approximation \(\int_{t_{n}}^{t_{n+1}}f(t)dt\approx hf(t_{n})\) to the r.h.s. of ([eq:ch9_a_d_step_integral]) (in the case \(s=1\)), while the trapezoidal rule method arises from applying \(\int_{t_{n}}^{t_{n+1}}f(t)dt\approx\frac{h}{2}(f(t_{n})+f(t_{n+1})).\) Similarly a two-step method can be obtained from an approximation of \(\int_{t_{n+1}}^{t_{n+2}}f(t)dt\) in terms of \(f(t),t\in\{t_{n},t_{n+1}\},\) a three-step method from an approximation of \(\int_{t_{n+2}}^{t_{n+3}}f(t)dt\) in terms of \(f(t),t\in\{t_{n},t_{n+1},t_{n+2}\},\) and so on for any number \(s\ge1\) of steps. To keep notation light, we present the three-step case before giving the general \(s\)-step formulas.
To approximate \(\int_{t_{n+2}}^{t_{n+3}}f(t)dt\) one replaces \(f(t)=F(t,y(t))\) by the polynomial \(p(t)\) of degree at most two that interpolates the values \(f(t_{n}),f(t_{n+1}),f(t_{n+2})\) at the nodes \(t_{n},t_{n+1},t_{n+2}\) - that is, the unique lowest degree polynomial that interpolates them. Writing \(p_{0},p_{1},p_{2}\) for the Lagrange basis polynomials of these nodes (so that \(p_{m}(t_{n+m'})\) equals \(1\) if \(m'=m\) and \(0\) otherwise) the interpolant is \(p(t)=p_{0}(t)f(t_{n})+p_{1}(t)f(t_{n+1})+p_{2}(t)f(t_{n+2}),\) and the resulting approximation is \[\int_{t_{n+2}}^{t_{n+3}}f(t)dt\approx\int_{t_{n+2}}^{t_{n+3}}p(t)dt.\tag{10.3}\]
\(\text{ }\)Deriving the Adams-Bashforth rule. For this \(p(t)\) the integral on the r.h.s. of ([eq:ch9_multistep_ab_three_steps_int_approx]) equals \[\int_{t_{n+2}}^{t_{n+3}}p(t)dt=\tilde{b}_{0}f(t_{n})+\tilde{b}_{1}f(t_{n+1})+\tilde{b}_{2}f(t_{n+2}),\] for \[\tilde{b}_{m}=\int_{t_{n+2}}^{t_{n+3}}p_{m}(t)dt,\quad\quad m=0,1,2.\] The change of variables \(t=t_{n}+r\) maps \(\{t_{n},t_{n+1},t_{n+2}\}\) to \(\{0,h,2h\}\) and yields the identities \[\tilde{b}_{0}=\int_{2h}^{3h}\frac{(r-h)(r-2h)}{(0-h)(0-2h)}dr,\quad\tilde{b}_{1}=\int_{2h}^{3h}\frac{(r-0)(r-2h)}{(h-0)(h-2h)}dr,\quad\tilde{b}_{2}=\int_{2h}^{3h}\frac{(r-0)(r-h)}{(2h-0)(2h-h)}dr.\] These show that the \(\tilde{b}_{m}\) don’t depend on \(n.\) The dependence on \(h\) can also be “factored out” by applying another change of variables \(r=hu,\) yielding \(\tilde{b}_{m}=hb_{m}\) for \[b_{0}=\int_{2}^{3}\frac{(u-1)(u-2)}{(0-1)(0-2)}du,\quad b_{1}=\int_{2}^{3}\frac{(u-0)(u-2)}{(1-0)(1-2)}du,\quad b_{2}=\int_{2}^{3}\frac{(u-0)(u-1)}{(2-0)(2-1)}du,\] which do not depend on \(h.\) Explicitly \[b_{0}=\frac{1}{(-1)(-2)}\int_{2}^{3}(u-1)(u-2)du=\frac{1}{2}\int_{2}^{3}\left(u^{2}-3u+2\right)du=\frac{1}{2}\left(\frac{u^{3}}{3}-\frac{3u^{2}}{2}+2u\right)\Big|_{2}^{3}=\frac{5}{12}.\] Similarly, one can verify that \(b_{1}=-\frac{4}{3}\) and \(b_{2}=\frac{23}{12}.\) Thus we have arrived at the approximation \[\int_{t_{n+2}}^{t_{n+3}}f(t)dt\approx h\left(b_{2}f(t_{n+2})+b_{1}f(t_{n+1})+b_{0}f(t_{n})\right)=h\left(\frac{23}{12}f(t_{n+2})-\frac{4}{3}f(t_{n+1})+\frac{5}{12}f(t_{n})\right).\] Applying it to the r.h.s. of ([eq:ch9_a_d_step_integral]) we arrive at the rule \[y_{n+3}=y_{n+2}+h\left(\frac{23}{12}F(t_{n+2},y_{n+2})-\frac{4}{3}F(t_{n+1},y_{n+1})+\frac{5}{12}F(t_{n},y_{n})\right).\] This is the three-step \((s=3)\) Adams-Bashforth method.
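The coefficient integrals above are easy to evaluate with a computer algebra system. The following small Python/SymPy check (an illustration, not part of the derivation) reproduces \(b_{0}=\frac{5}{12},\) \(b_{1}=-\frac{4}{3}\) and \(b_{2}=\frac{23}{12}\):

```python
import sympy as sp

u = sp.symbols('u')
nodes = [0, 1, 2]  # rescaled nodes corresponding to t_n, t_{n+1}, t_{n+2}

def ab3_coefficient(m):
    # Lagrange basis polynomial for node m, integrated over [2, 3]
    basis = sp.Integer(1)
    for mp in nodes:
        if mp != m:
            basis *= (u - mp) / sp.Integer(m - mp)
    return sp.integrate(basis, (u, 2, 3))

print([ab3_coefficient(m) for m in nodes])  # [5/12, -4/3, 23/12]
```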
For a general number \(s\ge1\) of steps one uses the Lagrange polynomials for \(s\) nodes \[p_{m}(t)=\frac{q_{m}(t)}{q_{m}(t_{n+m})},\quad\text{for}\quad q_{m}(t)=\prod_{m'\in\{0,\ldots,s-1\}\setminus\{m\}}(t-t_{n+m'}),\quad m=0,\ldots,s-1,\] yielding coefficients \[b_{m}=\frac{\int_{s-1}^{s}\prod_{m'\in\{0,\ldots,s-1\}\setminus\{m\}}\left(u-m'\right)du}{\prod_{m'\in\{0,\ldots,s-1\}\setminus\{m\}}\left(m-m'\right)},\quad m=0,\ldots,s-1,\tag{10.4}\] and the \(s\)-step Adams-Bashforth rule \[y_{n+s}=y_{n+s-1}+h\sum_{m=0}^{s-1}b_{m}F(t_{n+m},y_{n+m}).\] For \(s=1\) the only coefficient is \(b_{0}=1,\) so one recovers the Euler method. For \(s=2\) the coefficients are \(b_{0}=-\frac{1}{2}\) and \(b_{1}=\frac{3}{2},\) so the rule is \[y_{n+2}=y_{n+1}+h\left(\frac{3}{2}F(t_{n+1},y_{n+1})-\frac{1}{2}F(t_{n},y_{n})\right).\]
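To illustrate how such a rule is used in practice, including the bootstrapping of missing starting values discussed at the beginning of the chapter, here is a minimal sketch of the two-step Adams-Bashforth method for a generic IVP. The function name and the choice of a single Euler step to produce \(y_{1}\) are illustrative assumptions, not prescribed by the text:

```python
import math

def adams_bashforth_2(F, t0, y0, h, n_steps):
    """y_{n+2} = y_{n+1} + h*(3/2*F(t_{n+1},y_{n+1}) - 1/2*F(t_n,y_n)),
    with the missing starting value y_1 generated by one Euler step."""
    t = [t0 + n * h for n in range(n_steps + 1)]
    y = [y0, y0 + h * F(t0, y0)]  # bootstrap y_1 with Euler's method
    for n in range(n_steps - 1):
        y.append(y[n + 1] + h * (1.5 * F(t[n + 1], y[n + 1]) - 0.5 * F(t[n], y[n])))
    return t, y

# Example: y' = y, y(0) = 1, integrated up to t = 1; exact value y(1) = e.
ts, ys = adams_bashforth_2(lambda t, y: y, 0.0, 1.0, 0.01, 100)
print(ys[-1], math.exp(1.0))  # the two numbers agree to roughly four decimal places
```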
\(\text{ }\)Local error/order of Adams-Bashforth rule. The local error of the method is the error in the approximation of \(y(t_{n+s})\) if one uses the exact values for the previous steps \(y(t_{n}),\ldots,y(t_{n+s-1}).\) That approximation is \[y(t_{n+s-1})+\int_{t_{n+s-1}}^{t_{n+s}}p(t)dt\] for \[p(t)=p_{0}(t)f(t_{n})+\ldots+p_{s-1}(t)f(t_{n+s-1}),\quad\text{where}\quad f(t)=F(t,y(t)).\] Thus the error is \[\int_{t_{n+s-1}}^{t_{n+s}}p(t)dt-\int_{t_{n+s-1}}^{t_{n+s}}f(t)dt.\tag{10.5}\] The Lagrange remainder formula bounds the error in the approximation \(f(t)\approx p(t).\)
Let \(I\) be a closed interval, and \(f:I\to\mathbb{R}\) an \(s\)-times continuously differentiable function. Let \(r_{1},\ldots,r_{s}\in I\) be distinct, and let \(p(t)\) be the unique polynomial of degree at most \(s-1\) such that \(p(r_{i})=f(r_{i})\) for all \(i.\) Then for every \(t\in I,\) there is a \(\xi\in I\) such that \[f(t)-p(t)=\frac{f^{(s)}(\xi)}{s!}\prod_{i=1}^{s}(t-r_{i}).\tag{10.6}\]
With \(t\in[t_{n+s-1},t_{n+s}]\) and \(r_{1},\ldots,r_{s}=t_{n},\ldots,t_{n+s-1}\) the r.h.s. of ([eq:ch10_lagrange_remainder_formula]) is \(O(h^{s}),\) showing that \(f(t)=p(t)+O(h^{s})\) for \(t\in[t_{n+s-1},t_{n+s}].\) Using the latter in ([eq:ch10_a_d_loc_err]) yields \[\left|\int_{t_{n+s-1}}^{t_{n+s}}p(t)dt-\int_{t_{n+s-1}}^{t_{n+s}}f(t)dt\right|\le\int_{t_{n+s-1}}^{t_{n+s}}\left|p(t)-f(t)\right|dt=O\left(h^{s}\int_{t_{n+s-1}}^{t_{n+s}}1dt\right)=O(h^{s+1}).\] Thus ([eq:ch10_a_d_loc_err]) is \(O(h^{s+1}),\) which means that the \(s\)-step Adams-Bashforth method is of order \(s\)! We have proved the following lemma.
The Adams-Bashforth method with \(s=1,2,\ldots\) steps has order \(p=s.\)
Note that the Adams-Bashforth methods are all explicit. Thus the two-step Adams-Bashforth matches the order of the trapezoidal rule method, without being implicit, and the three-step Adams-Bashforth is of higher order than both.
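The \(O(h^{s+1})\) behavior of the local error is also easy to observe numerically. In a small experiment (the test problem \(y'=y,\) \(y(0)=1\) is chosen here purely for illustration) one can take a single two-step Adams-Bashforth step from exact starting values and watch the error shrink by roughly a factor \(2^{3}=8\) each time \(h\) is halved:

```python
import math

def ab2_local_error(h):
    # one two-step Adams-Bashforth step started from the exact values y(0), y(h) of y' = y
    F = lambda t, y: y
    y0, y1 = math.exp(0.0), math.exp(h)
    y2 = y1 + h * (1.5 * F(h, y1) - 0.5 * F(0.0, y0))
    return abs(y2 - math.exp(2 * h))

errors = [ab2_local_error(0.1 / 2**k) for k in range(4)]
for coarse, fine in zip(errors, errors[1:]):
    print(coarse / fine)  # ratios close to 8, consistent with a local error of size O(h^3)
```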
10.2 General linear multistep methods
In this section we introduce a general class of multistep methods, of which the Adams-Bashforth methods are a special case. We also present a simple criterion to check if a specific method in the class is consistent (i.e. of order \(p\) for some \(p\ge1\)).
\(\text{ }\)General class of linear multistep methods. The formula \[\sum_{m=0}^{s}a_{m}y_{n+m}=h\sum_{m=0}^{s}b_{m}F(t_{n+m},y_{n+m})\tag{10.8}\] for arbitrary coefficients \(a_{m},b_{m}\) defines a general class of multistep methods. Adams-Bashforth corresponds to \(a_{s}=1,a_{s-1}=-1\) and \(a_{i}=0\) for all other \(i,\) and \(b_{s}=0\) with \(b_{0},\ldots,b_{s-1}\) given by ([eq:ch9_a_d_s_step_b_m_def_as_int]). If \(b_{s}\ne0\) then a multiple of \(hF(t_{n+s},y_{n+s})\) appears on the r.h.s., which makes the method implicit, like the trapezoidal rule method. The rule ([eq:ch9_general_multistep]) is then understood as an implicit equation for \(y_{n+s}\) for given \(y_{n},\ldots,y_{n+s-1}.\) As already mentioned the Adams-Bashforth methods are explicit, and indeed they have \(b_{s}=0.\)
One can “encode” the coefficients \(a_{m},b_{m}\) in the generating polynomials \[\rho(w)=\sum_{m=0}^{s}a_{m}w^{m}\quad\quad\text{and}\quad\quad\sigma(w)=\sum_{m=0}^{s}b_{m}w^{m},\tag{10.9}\] so that every method of the type ([eq:ch9_general_multistep]) can be specified either by the formula ([eq:ch9_general_multistep]) or by writing down \(\rho(w),\sigma(w).\) The \(s\)-step Adams-Bashforth method corresponds to \[\rho(w)=w^{s}-w^{s-1}\quad\quad\text{and}\quad\quad\sigma(w)=\begin{cases} 1 & \text{ for }s=1,\\ \frac{3}{2}w-\frac{1}{2} & \text{ for }s=2,\\ \frac{23}{12}w^{2}-\frac{4}{3}w+\frac{5}{12} & \text{ for }s=3,\\ \ldots & \ldots \end{cases}\tag{10.10}\] (Some authors call them characteristic polynomials, but we avoid this term to not cause confusion with the other kinds of characteristic polynomials already introduced in this course). The generating polynomials ([eq:ch10_general_multistep_rho_sigma]) are also extremely useful tools to study the properties of a linear multistep method. Our first such result is the following theorem, which gives a criterion for determining the order of a general multistep method ([eq:ch9_general_multistep]) from the generating polynomials.
The multistep method ([eq:ch9_general_multistep]) is of order exactly \(p\ge1\) iff \[\begin{array}{c} \text{there exist constants }c\ne0,r>0\text{ such that}\\ \rho(w)-\sigma(w)\ln w=c(w-1)^{p+1}+O(|w-1|^{p+2})\quad\text{for }\quad|w-1|\le r. \end{array}\tag{10.11}\]
By exactly of order \(p\) we mean that \(p\) is the highest \(p'\) such that the method is of order \(p'\) - or equivalently that it is of order \(p\) but not of order \(p+1.\)
For the proof we use some intermediate lemmas. Let \[A_{k}=\sum_{m=0}^{s}a_{m}m^{k},\quad\quad B_{k}=k\sum_{m=0}^{s}b_{m}m^{k-1}.\tag{10.12}\]
It holds for all \(z\in\mathbb{R}\) that \[\rho(e^{z})-\sigma(e^{z})z=\sum_{k=0}^{\infty}\frac{z^{k}}{k!}(A_{k}-B_{k}).\tag{10.13}\]
Proof. Since the power series of \(z\to e^{zm}\) has infinite radius of convergence for each \(m\) we obtain \[\rho(e^{z})=\sum_{m=0}^{s}a_{m}\sum_{k=0}^{\infty}\frac{m^{k}z^{k}}{k!}=\sum_{k=0}^{\infty}\frac{z^{k}}{k!}A_{k},\] and \[\sigma(e^{z})z=z\sum_{m=0}^{s}b_{m}\sum_{k=0}^{\infty}\frac{m^{k}z^{k}}{k!}=\sum_{k=0}^{\infty}\frac{z^{k}}{k!}B_{k}.\] This proves the claim.
The condition ([eq:ch10_multistep_order_cond]) holds iff \[A_{k}=B_{k}\text{ for }k=0,\ldots,p\text{ and }A_{p+1}\ne B_{p+1}.\tag{10.14}\]
Proof. The change of variables \(w=e^{z}\) and the fact that \(e^{z}-1=z+O(z^{2})\) shows that ([eq:ch10_multistep_order_cond]) is equivalent to \[\begin{array}{c} \text{the existence of constants }c'\ne0,r'>0\text{ such that}\\ \rho(e^{z})-\sigma(e^{z})z=c'z^{p+1}+O(z^{p+2})\quad\quad\text{for}\quad\quad\left|z\right|\le r'. \end{array}\tag{10.15}\] The equivalence of ([eq:ch10_multistep_order_cond_Ak_Bk]) and ([eq:ch10_multistep_order_cond_ez]) then follows from the identity ([eq:ch10_multistep_order_method_rho_sigma_ident]).
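To make the quantities \(A_{k},B_{k}\) concrete, here is a quick check (an illustration only) for the two-step Adams-Bashforth method, which has \(a_{0}=0,a_{1}=-1,a_{2}=1\) and \(b_{0}=-\frac{1}{2},b_{1}=\frac{3}{2},b_{2}=0.\) Since that method has order exactly \(2,\) the criterion predicts \(A_{k}=B_{k}\) for \(k=0,1,2\) and \(A_{3}\ne B_{3}\):

```python
from fractions import Fraction as Fr

a = [Fr(0), Fr(-1), Fr(1)]        # a_0, a_1, a_2 for the two-step Adams-Bashforth method
b = [Fr(-1, 2), Fr(3, 2), Fr(0)]  # b_0, b_1, b_2

def A(k):
    return sum(a[m] * m**k for m in range(len(a)))

def B(k):
    return Fr(0) if k == 0 else k * sum(b[m] * m**(k - 1) for m in range(len(b)))

for k in range(4):
    print(k, A(k), B(k), A(k) == B(k))
# A_k = B_k for k = 0, 1, 2, while A_3 = 7 and B_3 = 9/2 differ, so the order is exactly 2.
```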
For any analytic \(F(t,y),\) analytic solution \(y(t)\) of \(\dot{y}=F(t,y),\) and \(t_{0}\) there is a constant \(r>0\) such that if \(0<h\le r\) then \[\sum_{m=0}^{s}a_{m}y(t_{m})-h\sum_{m=0}^{s}b_{m}F(t_{m},y(t_{m}))=\sum_{k=0}^{\infty}\frac{h^{k}}{k!}y^{(k)}(t_{0})\left(A_{k}-B_{k}\right).\tag{10.16}\]
Proof. Since \(y\) is analytic its power series has a positive radius of convergence \(r'''>0\) at \(t_{0},\) and if \(h\) is such that \(t_{s}-t_{0}<r''',\) then it holds that \(y(t)\) can be expanded as \[y(t_{m})=\sum_{k=0}^{\infty}\frac{(t_{m}-t_{0})^{k}}{k!}y^{(k)}(t_{0})=\sum_{k=0}^{\infty}\frac{m^{k}h^{k}}{k!}y^{(k)}(t_{0}),\tag{10.17}\] and \(y'(t_{m})\) can be expanded as \[y'(t_{m})=\sum_{k=0}^{\infty}\frac{(t_{m}-t_{0})^{k}}{k!}y^{(k+1)}(t_{0})=\sum_{k=1}^{\infty}\frac{m^{k-1}h^{k-1}}{(k-1)!}y^{(k)}(t_{0}).\tag{10.18}\] Using ([eq:ch10_multistep_order_cond_y_taylor_exp]) gives \[\sum_{m=0}^{s}a_{m}y(t_{m})=\sum_{m=0}^{s}a_{m}\sum_{k=0}^{\infty}\frac{m^{k}h^{k}}{k!}y^{(k)}(t_{0})=\sum_{k=0}^{\infty}\frac{h^{k}}{k!}y^{(k)}(t_{0})A_{k},\] and using ([eq:ch10_multistep_order_cond_y_prime_taylor_exp]) gives \[h\sum_{m=0}^{s}b_{m}y'(t_{m})=h\sum_{m=0}^{s}b_{m}\sum_{k=0}^{\infty}\frac{m^{k}h^{k}}{k!}y^{(k+1)}(t_{0})=\sum_{k=0}^{\infty}\frac{h^{k}}{k!}y^{(k)}(t_{0})B_{k}.\] Thus it holds that \[\sum_{m=0}^{s}a_{m}y(t_{m})-h\sum_{m=0}^{s}b_{m}F(t_{m},y(t_{m}))=\sum_{k=0}^{\infty}\frac{h^{k}}{k!}y^{(k)}(t_{0})\left(A_{k}-B_{k}\right).\]
The multistep method ([eq:ch9_general_multistep]) is of order exactly \(p\ge1\) iff \[\begin{array}{c} \text{there exist constants }c\ne0,r>0\text{ such that}\\ \rho(w)-\sigma(w)\ln w=c(w-1)^{p+1}+O(|w-1|^{p+2})\quad\text{for }\quad|w-1|\le r. \end{array}\tag{10.11}\]
\(\implies\) Assume that a multistep method ([eq:ch9_general_multistep]) is of order \(p\) but not of order \(p+1.\) Since it is of order \(p\) it must hold for all analytic \(F\) with analytic solution \(y\) to \(y'(t)=F(t,y)\) and \(t_{0}\) that \[\sum_{m=0}^{s}a_{m}y(t_{m})-h\sum_{m=0}^{s}b_{m}F(t_{m},y(t_{m}))=O(h^{p+1}).\] By ([eq:ch10_A_k_B_k_identity_y]) it therefore holds for all such \(y\) that \[\sum_{k=0}^{\infty}\frac{h^{k}}{k!}y^{(k)}(t_{0})\left(A_{k}-B_{k}\right)=O(h^{p+1}).\tag{10.19}\] Assume that \(A_{k}\ne B_{k}\) for some \(k\in\{0,1,\ldots,p\},\) and let \(k\) be the smallest such index. Consider the particular case \(y'(t)=y(t),t\ge t_{0},y(t_{0})=1\) (i.e. \(F(t,y)=y\)) which has the unique exact solution \(y(t)=e^{t-t_{0}}.\) Then \(y^{(k)}(t_{0})=1\) for all \(k,\) so dividing the l.h.s. of ([eq:ch10_multistep_order_proof_AkBk]) by \(h^{k}\) and letting \(h\downarrow0\) yields the non-zero limit \(\frac{A_{k}-B_{k}}{k!}.\) But then the l.h.s. can not be \(O(h^{p+1}),\) since \(k\le p.\) Thus if the method ([eq:ch9_general_multistep]) is of order \(p,\) then \(A_{k}=B_{k}\) for all \(k\in\{0,\ldots,p\}.\)
Assume now that the method is not of order \(p+1.\) By the identity ([eq:ch10_A_k_B_k_identity_y]) this means that there exist \(F,y,t_{0},y_{0}\) (with \(F,y\) analytic etc.) such that \[\sum_{k=0}^{\infty}\frac{h^{k}}{k!}y^{(k)}(t_{0})\left(A_{k}-B_{k}\right)\quad\text{is not}\quad O(h^{p+2})\quad\text{as}\quad h\downarrow0.\tag{10.20}\] Assume for contradiction that \(A_{k}=B_{k}\) for \(k\le p+1.\) Note that then \[\begin{array}{l} \left|\sum_{k=0}^{\infty}\frac{h^{k}}{k!}y^{(k)}(t_{0})\left(A_{k}-B_{k}\right)\right|\\ \le(s+1)\max_{0\le m\le s}\left|a_{m}\right|\sum_{k=p+2}^{\infty}\frac{h^{k}s^{k}}{k!}\left|y^{(k)}(t_{0})\right|+(s+1)\max_{0\le m\le s}\left|b_{m}\right|h\sum_{k=p+2}^{\infty}\frac{h^{k-1}s^{k-1}}{(k-1)!}\left|y^{(k)}(t_{0})\right|. \end{array}\] Since \(y\) is analytic there are constants \(C,R>0\) such that \(\left|y^{(k)}(t_{0})\right|\le C\frac{k!}{R^{k}}\) for all \(k,\) so both sums on the r.h.s. are \(O(h^{p+2})\) for all small enough \(h.\) But this contradicts ([eq:order_thm_for_contradiction]). Thus if a method is of order \(p\) but not of order \(p+1,\) then \(A_{p+1}\ne B_{p+1}.\)
This implies that if a multistep method is of order \(p\) but not of order \(p+1,\) then ([eq:ch10_multistep_order_cond_Ak_Bk]) holds.
\(\impliedby\) Assume now that ([eq:ch10_multistep_order_cond_Ak_Bk]) holds. Similarly to above, using the identity ([eq:ch10_A_k_B_k_identity_y]) and \(A_{k}=B_{k}\) for \(k=0,\ldots,p,\) it holds for any \(F,y,t_{0},y_{0}\) (with \(F,y\) analytic etc.) that \[\begin{array}{l} \left|\sum_{m=0}^{s}a_{m}y(t_{m})-h\sum_{m=0}^{s}b_{m}F(t_{m},y(t_{m}))\right|\\ \le(s+1)\max_{0\le m\le s}\left|a_{m}\right|\sum_{k=p+1}^{\infty}\frac{h^{k}s^{k}}{k!}\left|y^{(k)}(t_{0})\right|+(s+1)\max_{0\le m\le s}\left|b_{m}\right|h\sum_{k=p+1}^{\infty}\frac{h^{k-1}s^{k-1}}{(k-1)!}\left|y^{(k)}(t_{0})\right|\\ =O(h^{p+1}), \end{array}\] showing that the method is of order (at least) \(p.\)
Also, since \(A_{k}=B_{k}\) for \(k=0,\ldots,p\) but \(A_{p+1}\ne B_{p+1},\) it holds for the particular IVP \(y'(t)=y(t),t\ge t_{0},y(t_{0})=1\) (i.e. \(F(t,y)=y\)) with exact solution \(y(t)=e^{t-t_{0}},\) so that \(y^{(k)}(t_{0})=1\) for all \(k,\) that \[\begin{array}{l} \left|\sum_{m=0}^{s}a_{m}y(t_{m})-h\sum_{m=0}^{s}b_{m}F(t_{m},y(t_{m}))\right|\\ \ge\left|A_{p+1}-B_{p+1}\right|\frac{h^{p+1}}{(p+1)!}\\ \quad-(s+1)\max_{0\le m\le s}\left|a_{m}\right|\sum_{k=p+2}^{\infty}\frac{h^{k}s^{k}}{k!}-(s+1)\max_{0\le m\le s}\left|b_{m}\right|h\sum_{k=p+2}^{\infty}\frac{h^{k-1}s^{k-1}}{(k-1)!}\\ \ge ch^{p+1} \end{array}\] for some constant \(c>0\) and all small enough \(h,\) since the two subtracted sums are \(O(h^{p+2}).\) This implies that the method is not of order \(p+1.\)
We have proved that ([eq:ch10_multistep_order_cond_Ak_Bk]) implies that the multistep method is of order \(p\) and not order \(p+1.\)
To check if the criterion ([eq:ch10_multistep_order_cond_ez]) holds in practice one expands \(\rho(w)-\sigma(w)\ln w\) as a power series around \(w=1\) and determines its first non-vanishing term. It is convenient to use the change of variables \(w=1+\xi\) and expand \(\rho(1+\xi)-\sigma(1+\xi)\ln(1+\xi)\) in \(\xi\) around \(\xi=0.\) Since \(\xi\to\ln(1+\xi)\) has a power series with radius of convergence \(1\) and \(\rho,\sigma\) are polynomials, the l.h.s. also has a power series with radius of convergence at least \(1,\) i.e. there are \(c_{0},c_{1},\ldots\) such that \[\rho(1+\xi)-\sigma(1+\xi)\ln(1+\xi)=\sum_{k\ge0}c_{k}\xi^{k}\quad\text{for}\quad\left|\xi\right|<1.\tag{10.21}\] Note that ([eq:ch10_multistep_order_cond_ez]) holds iff \(c_{0}=\ldots=c_{p}=0,c_{p+1}\ne0.\) In other words, if the lowest order non-vanishing term in ([eq:order_cond_taylor_in_xi]) is a multiple of \(\xi^{p+1},\) then the method has order exactly \(p.\) Note that one only needs to compute the coefficients \(c_{k}\) up to the first non-vanishing one; the complete power series is not needed. For the computation one uses the well-known power series of \(\ln(1+\xi)\) around \(\xi=0\): \[\ln(1+\xi)=\xi-\frac{1}{2}\xi^{2}+\frac{1}{3}\xi^{3}-\frac{1}{4}\xi^{4}+\ldots.\]
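This expansion can be carried out by hand, as in the example with the theta methods below, or with a computer algebra system. A minimal SymPy sketch (shown here for the three-step Adams-Bashforth method, purely as an illustration) is:

```python
import sympy as sp

xi = sp.symbols('xi')
w = 1 + xi

# Three-step Adams-Bashforth: rho(w) = w^3 - w^2, sigma(w) = 23/12 w^2 - 4/3 w + 5/12
rho = w**3 - w**2
sigma = sp.Rational(23, 12) * w**2 - sp.Rational(4, 3) * w + sp.Rational(5, 12)

print(sp.series(rho - sigma * sp.log(1 + xi), xi, 0, 6))
# The lowest order surviving term is 3*xi**4/8, i.e. a multiple of xi^(3+1),
# so the three-step Adams-Bashforth method has order exactly p = 3.
```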
In the last chapter we briefly mentioned the family of theta methods. Recall from ([eq:ch9_theta_method_def]) that the method corresponding to a fixed \(\theta\in[0,1]\) is \[y_{n+1}=y_{n}+h(\theta F(t_{n},y_{n})+(1-\theta)F(t_{n+1},y_{n+1})).\] For this method \(\rho(w)=w-1\) and \(\sigma(w)=(1-\theta)w+\theta,\) i.e. \(\rho(1+\xi)=\xi\) and \(\sigma(1+\xi)=1+(1-\theta)\xi.\) Therefore \[\begin{array}{lcl} \rho(1+\xi)-\sigma(1+\xi)\ln(1+\xi) & = & \xi-(1+(1-\theta)\xi)\ln(1+\xi)\\ & = & \xi-(1+(1-\theta)\xi)(\xi-\frac{1}{2}\xi^{2}+\frac{1}{3}\xi^{3}-\ldots)\\ & = & \xi-\xi+\frac{1}{2}\xi^{2}-\frac{1}{3}\xi^{3}-(1-\theta)\xi^{2}+\frac{1-\theta}{2}\xi^{3}+\ldots\\ & = & \left(\theta-\frac{1}{2}\right)\xi^{2}+\left(\frac{1-\theta}{2}-\frac{1}{3}\right)\xi^{3}+\ldots. \end{array}\] For \(\theta=\frac{1}{2}\) the \(\xi^{2}\) term vanishes while the \(\xi^{3}\) coefficient equals \(\frac{1}{4}-\frac{1}{3}=-\frac{1}{12}\ne0.\) Thus the theta method has order \(p=2\) for \(\theta=\frac{1}{2},\) and otherwise order \(p=1.\)
We have already determined these orders for \(\theta=1\) a.k.a. Euler’s method, and \(\theta=\frac{1}{2}\) a.k.a. the trapezoidal rule method. For \(\theta\ne\frac{1}{2},1\) these are new insights. In particular this proves that the backward Euler method for which \(\theta=0\) \[y_{n+1}=y_{n}+hF(t_{n+1},y_{n+1})\] is of order \(p=1.\)
10.3 Interlude - difference equations
When we study global convergence of multistep methods in the next section we will investigate the behavior of difference equations. These are discrete analogues of differential equations. For instance \[y_{n+1}=y_{n}+ay_{n}+b\] is the difference equation which has some similarity with the ODE \(\dot{y}=ay+b.\) Since difference equations are interesting and useful in their own right, we dedicate this short section to them.
\(\text{ }\)General form. The general, possibly implicit, \(k\)-th order difference equation is \[G(n+k,y_{n},y_{n+1},\ldots,y_{n+k-1},y_{n+k})=0,n\ge0,\tag{10.24}\] for a general function \(G(t,x_{0},\ldots,x_{k})\) (cf. the general possibly implicit \(k\)-th order ODE ([eq:ch3_implicit_eq])), and the general \(k\)-th order explicit difference equation is \[y_{n+k}=F(n+k,y_{n},y_{n+1},\ldots,y_{n+k-1}),n\ge0,\tag{10.25}\] for a function \(F(t,x_{0},x_{1},\ldots,x_{k-1})\) (cf. the general \(k\)-th order explicit ODE ([eq:ch3_explicit_eq])). Note that the update rule of a multistep method with \(s\) steps from the last section is a difference equation of order \(s,\) which is why difference equations pop up in the next section about global convergence.
\(\text{ }\)Solutions. Any sequence \(y_{n},n\ge0,\) satisfying ([eq:ch10_difference_equations_implicit]) or ([eq:ch10_difference_equations_explicit]) is called a solution of the difference equation. Any initial values \(y_{0},\ldots,y_{k-1}\) in \(\mathbb{R}\) or \(\mathbb{C}\) together with ([eq:ch10_difference_equations_explicit]) uniquely define the entire solution \(y_{n},n\ge0\) (for the explicit equation existence and uniqueness of the solution thus hold trivially, unlike for the corresponding ODEs). The equations ([eq:ch10_difference_equations_implicit]) resp. ([eq:ch10_difference_equations_explicit]) are called autonomous if the functions \(G,F\) do not depend on their first argument. We don’t have much to say about difference equations at this level of generality. But for linear autonomous difference equations we can construct a complete theory that essentially mirrors the theory for linear autonomous ODEs.
Let \(a_{0},\ldots,a_{k-1},b\in\mathbb{C}\) and \(I\) be a non-empty interval.
a) Assume \(a_{0}\ne0.\) Then a function \(y\) is a solution to ([eq:ch5_ho_lin_aut_ODE_def]) iff \(x=y-\frac{b}{a_{0}}\) is a solution to ([eq:ch5_ho_lin_aut_ODE_homo_def]).
b) \(y(t)=0\) is a solution to ([eq:ch5_ho_lin_aut_ODE_homo_def]).
c) If \(y\) is a solution to ([eq:ch5_ho_lin_aut_ODE_homo_def]) then \(\alpha y\) is also a solution for any \(\alpha\in\mathbb{R}.\)
d) If \(y_{1}\) and \(y_{2}\) are solutions to ([eq:ch5_ho_lin_aut_ODE_homo_def]) then \(y_{1}+y_{2}\) is also a solution to ([eq:ch5_ho_lin_aut_ODE_homo_def]).
\(\text{ }\)Characteristic polynomial. The characteristic polynomial of the linear autonomous homogeneous difference equation ([eq:ch10_difference_equations_linear_aut_homo]), i.e. of \(\sum_{l=0}^{k}a_{l}y_{n+l}=0,n\ge0,\) is \[p(r)=\sum_{l=0}^{k}a_{l}r^{l},\tag{10.28}\] (cf. the characteristic polynomial for \(k\)-th order linear autonomous ODEs from ([eq:ch5_ho_lin_aut_char_poly_with_ak])). For any \(q\in\mathbb{C}\) the ansatz \(y_{n}=q^{n}\) plugged into ([eq:ch10_difference_equations_linear_aut_homo]) yields \[\sum_{l=0}^{k}a_{l}y_{n+l}=\sum_{l=0}^{k}a_{l}q^{n+l}=q^{n}p(q).\] Thus for any root \(r\) of \(p(r)\) the sequence \(y_{n}=r^{n}\) is a solution (similarly to how \(y(t)=e^{rt}\) solves a linear autonomous homogeneous ODE if \(r\) is a root of its characteristic polynomial). If \(p(r)\) has \(v\) distinct roots \(r_{1},\ldots,r_{v}\) (possibly complex), then for any coefficients \(\alpha_{1},\ldots,\alpha_{v}\in\mathbb{C}\) \[y_{n}=\alpha_{1}r_{1}^{n}+\ldots+\alpha_{v}r_{v}^{n},\quad n\ge0,\] is a \(v\)-parameter family of solutions. If the polynomial \(p(r)\) has the full set of \(k\) distinct roots, then this \(k\)-parameter family covers all solutions.
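As a concrete illustration (the particular equation and initial values are chosen here just for the example), consider \(y_{n+2}-y_{n+1}-y_{n}=0\) with \(y_{0}=0,y_{1}=1,\) i.e. the Fibonacci recursion. Its characteristic polynomial \(p(r)=r^{2}-r-1\) has the two distinct roots \(r_{1,2}=\frac{1\pm\sqrt{5}}{2},\) so every solution has the form \(\alpha_{1}r_{1}^{n}+\alpha_{2}r_{2}^{n}.\) The sketch below solves for \(\alpha_{1},\alpha_{2}\) from the initial values and compares the closed form with direct iteration:

```python
import numpy as np

a = [-1.0, -1.0, 1.0]          # a_0, a_1, a_2 of y_{n+2} - y_{n+1} - y_n = 0
roots = np.roots(a[::-1])      # np.roots expects the highest degree coefficient first

# Solve alpha_1*r_1^n + alpha_2*r_2^n = y_n for n = 0, 1 with y_0 = 0, y_1 = 1.
M = np.array([roots**0, roots**1])
alpha = np.linalg.solve(M, [0.0, 1.0])

def closed_form(n):
    return (alpha @ roots**n).real

y = [0.0, 1.0]
for n in range(18):
    y.append(y[n + 1] + y[n])  # direct iteration of the difference equation
print(y[10], closed_form(10))  # both are 55 up to rounding
```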
Let \(h(r)\) be a non-zero complex-valued polynomial of degree \(k\) in \(r,\) and \(s\) a root of multiplicity \(m\in\{1,2,\ldots,k\}.\) Then \(h^{(i)}(s)=0\) for all \(i<m.\)
10.4 Global convergence
We now examine the global error of linear multistep methods. That is, we study \[\max_{n\ge0:t_{n}\le t_{\max}}|e_{n}^{h}|\tag{10.31}\] where \(e_{n}=e_{n}^{h}=y_{n}^{h}-y(t_{n}^{h})\) and the \(y_{n}^{h}\) are computed by the general multistep method ([eq:ch9_general_multistep]). Recall that an auxiliary method needs to provide \(y_{1}^{h},\ldots,y_{s-1}^{h}\) before the \(s\)-step method can get started, and these are typically subject to an “initial” error \[\max_{n=0}^{s-1}|e_{n}^{h}|\ne0.\tag{10.32}\] Suppose therefore that we have a sequence \(h=h_{N},N\ge1,\) that converges to zero as \(N\to\infty,\) and for each \(h=h_{N}\) starting values \(y_{0}^{h},\ldots,y_{s-1}^{h},\) and subsequent approximations \(y_{s}^{h},y_{s+1}^{h},\ldots\) derived from the multistep method. We call the multistep method convergent if for any analytic \(F\) and analytic solution \(y\) of \(\dot{y}=F(t,y)\) the global error ([eq:ch10_glob_conv_glob_err]) converges to zero as \(h\downarrow0,\) whenever the error in the starting values ([eq:ch10_glob_conv_init_err]) converges to zero. That is, if \[\lim_{h\downarrow0}\max_{n=0}^{s-1}|e_{n}^{h}|=0\quad\quad\implies\quad\quad\lim_{h\downarrow0}\max_{n\ge0:t_{n}\le t_{\max}}|e_{n}^{h}|=0.\]
If one starts the two-step method ([eq:ch10_global_non_convergent_method_example]) with \(y_{0}=y_{1}=0\) instead (i.e. with the correct values from the true solution \(y=0\)) then the solution of the difference equation is \(y_{n}=0,\) which is a perfect approximation. But any small error, e.g. from rounding, that causes \(y_{n}\ne y_{n+1}\) at some point will cause the solution to start diverging.
\(\text{ }\)Condition for convergence. Remarkably, the failure mode seen above (roots of \(\rho(w)\) of modulus larger than one) is the only way a consistent method can fail to be convergent. Indeed there is a simple criterion on the roots of \(\rho\) that together with consistency implies convergence. It is called the root condition, and a multistep method with generating polynomial \(\rho(w)\) satisfies it if \[\begin{array}{c} \text{all roots of }\rho(w)\text{ lie in the closed complex unit disk, and}\\ \rho(w)\text{ has no repeated roots on the boundary of the unit disk}. \end{array}\tag{10.35}\] Some authors define the terminology stable linear multistep method to mean any linear multistep method that satisfies the condition ([eq:ch10_multistep_gllobal_err_root_cond]). The famous Dahlquist Equivalence Theorem, which we now present, says that if a multistep method is consistent (i.e. of order \(p\) for some \(p\ge1\)) and satisfies the root condition, then it is convergent! (In the other terminology: a stable consistent linear multistep method is convergent).
The multistep method ([eq:ch9_general_multistep]) with generating polynomials \(\rho(w),\sigma(w)\) is convergent iff it is consistent and satisfies the root condition ([eq:ch10_multistep_gllobal_err_root_cond]).
The full proof of this theorem is outside the scope of this course. We content ourselves with proving the easier of the two directions of the “iff”, namely that convergence implies that the method is consistent and the root condition is satisfied. This direction thus proves that the root condition and consistency are necessary conditions for convergence.
Recall from the order criterion ([eq:ch10_multistep_order_cond]) (or directly from ([eq:ch10_multistep_order_cond_Ak_Bk]) with \(k=0,1,\) i.e. \(A_{0}=B_{0}\) and \(A_{1}=B_{1}\)) that the method is consistent, i.e. of order \(p\ge1\) for some \(p,\) iff \[\rho(1)=0\quad\quad\text{and}\quad\quad\rho'(1)=\sigma(1).\]
Our task is thus to show that a method being convergent implies that its generating polynomials satisfy ([eq:ch10_conv_implies_consistent_proof_WTS]).
Again we consider the very simple ODE \(y'(t)=0,t\ge0,\) this time with initial condition \(y(0)=1,\) whose exact solution is \(y(t)=1,t\ge0.\) As before, applying a convergent multistep method to this ODE amounts to using the rule \[a_{s}y_{n+s}^{h}+\ldots+a_{0}y_{n}^{h}=0.\tag{10.39}\] Starting the method with initial values \(y_{0}^{h}=\ldots=y_{s-1}^{h}=1,\) which are the exact values, the method being convergent certainly implies that the error \(e_{s}^{h}\) converges to zero as \(h\downarrow0,\) from which we deduce that \[\lim_{h\downarrow0}y_{s}^{h}=1.\tag{10.40}\] With these initial values the case \(n=0\) of ([eq:ch10_conv_implies_consistent_proof_p(1)=00003D1_proof_diff_eq]) reads \[a_{s}y_{s}^{h}+a_{s-1}+\ldots+a_{0}=0.\] By taking the limit \(h\downarrow0\) and using ([eq:ch10_conv_implies_consistent_proof_p(1)=00003D1_proof_first_approx_conv]) this implies \[a_{s}+a_{s-1}+\ldots+a_{0}=0.\] The l.h.s. is \(\rho(1),\) so we have proved the first part of ([eq:ch10_conv_implies_consistent_proof_WTS]).
To prove the other part of ([eq:ch10_conv_implies_consistent_proof_WTS]) we consider the IVP \(y'(t)=1,y(0)=0\) with exact solution \(y(t)=t.\) Since this corresponds to \(F\equiv1\) identically, the r.h.s. of the rule ([eq:ch9_general_multistep]) for this method equals \(h(b_{s}+\ldots+b_{0})=h\sigma(1),\) so it can be written as \[\sum_{l=0}^{s}a_{l}y_{n+l}^{h}=h\sigma(1).\tag{10.41}\] If \(\rho'(1)=0\) in addition to the already proved \(\rho(1)=0,\) then \(w=1\) is a repeated root of \(\rho.\) But then \(\rho\) does not satisfy the root condition ([eq:ch10_multistep_gllobal_err_root_cond]), which is a contradiction since we already proved that every convergent method satisfies it. Thus \(\rho'(1)\ne0,\) and we can define the sequence \(z_{n}^{h}=\frac{\sigma(1)}{\rho'(1)}t_{n}^{h},n\ge0.\) This sequence is a solution to the difference equation ([eq:ch10_conv_implies_consistent_proof_p'(1)=00003Dsigma(1)_proof_diff_eq]), since substituting \(z_{n}^{h}\) for \(y_{n}^{h}\) in the l.h.s. yields \[\sum_{l=0}^{s}a_{l}z_{n+l}^{h}=\sum_{l=0}^{s}a_{l}\frac{\sigma(1)}{\rho'(1)}t_{n+l}^{h}=\frac{\sigma(1)}{\rho'(1)}\sum_{l=0}^{s}a_{l}(n+l)h=\frac{\sigma(1)}{\rho'(1)}\left(n\rho(1)+\rho'(1)\right)h=h\sigma(1),\] where the last step uses \(\rho(1)=0.\) Thus when started with the initial values \(y_{n}^{h}=z_{n}^{h}\) for \(n=0,\ldots,s-1,\) applying the present multistep method produces the approximations \[y_{n}^{h}=z_{n}^{h}=\frac{\sigma(1)}{\rho'(1)}t_{n}^{h},n\ge s.\] With this choice of the initial values \(y_{0}^{h},\ldots,y_{s-1}^{h},\) the “initial error” is \[\max_{0\le n\le s-1}|e_{n}^{h}|=\max_{0\le n\le s-1}\left|\frac{\sigma(1)}{\rho'(1)}t_{n}^{h}-t_{n}^{h}\right|=\left|\frac{\sigma(1)}{\rho'(1)}-1\right|\left|t_{s-1}^{h}\right|=\left|\frac{\sigma(1)}{\rho'(1)}-1\right|(s-1)h\downarrow0,\] so the assumption that the method is convergent implies that the global error also tends to zero. In particular, along the sequence \(h_{N}=\frac{1}{N}\) it must hold that \(\left|e_{N}^{h_{N}}\right|\to0\) as \(N\to\infty.\) But \[e_{N}^{h_{N}}=\frac{\sigma(1)}{\rho'(1)}t_{N}^{h_{N}}-y(t_{N}^{h_{N}})=\frac{\sigma(1)}{\rho'(1)}-y(1)=\frac{\sigma(1)}{\rho'(1)}-1.\] Thus we deduce that the method being convergent implies that \(\sigma(1)=\rho'(1),\) proving the second part of ([eq:ch10_conv_implies_consistent_proof_WTS]).
This completes the proof of the implication.
The theta methods from ([eq:ch9_theta_method_def]), which include Euler’s method, the trapezoidal rule method, and the backward Euler method as special cases, all have \(\rho(w)=w-1.\) This \(\rho\) has just one root, \(w=1,\) which lies on the boundary of the unit disk but is not repeated, so the root condition is satisfied. The method is also consistent for any \(\theta\) (indeed it is of order \(p=2\) if \(\theta=\frac{1}{2}\) and order \(p=1\) otherwise). Thus the Dahlquist Equivalence Theorem implies that all of these methods are convergent.
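The root condition is also easy to check numerically for a given \(\rho(w).\) The helper below is an illustrative sketch (floating point root finding near the unit circle requires a tolerance, so it is not a substitute for an exact argument):

```python
import numpy as np

def satisfies_root_condition(a, tol=1e-7):
    """a = [a_0, ..., a_s]: coefficients of rho(w) = sum_m a_m w^m."""
    roots = np.roots(a[::-1])                  # np.roots expects the leading coefficient first
    if np.any(np.abs(roots) > 1 + tol):
        return False                           # a root outside the closed unit disk
    boundary = roots[np.abs(np.abs(roots) - 1) <= tol]
    for i, r in enumerate(boundary):           # look for repeated roots on the boundary
        if np.any(np.abs(np.delete(boundary, i) - r) <= tol):
            return False
    return True

print(satisfies_root_condition([-1, 1]))        # theta methods, rho(w) = w - 1: True
print(satisfies_root_condition([0, 0, -1, 1]))  # AB3, rho(w) = w^3 - w^2: True
print(satisfies_root_condition([1, -2, 1]))     # rho(w) = (w - 1)^2, repeated root at w = 1: False
```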
One should never use non-convergent methods. Even if such a method happens to behave acceptably for a specific IVP and step size \(h,\) the error tends, paradoxically, to become larger as one shrinks \(h,\) quickly reaching catastrophic levels. Thus multistep methods that do not satisfy the root condition ([eq:ch10_multistep_gllobal_err_root_cond]) are absolutely useless.
If the linear multistep method ([eq:ch9_general_multistep]) with corresponding polynomials \(\rho(w),\sigma(w)\) from ([eq:ch10_general_multistep_rho_sigma]) has order \(p\ge1\) and satisfies the root condition ([eq:ch10_multistep_gllobal_err_root_cond]) then it is convergent of order \(p.\)
10.5 Deriving methods algebraically
In Section 10.1 we derived the Adams-Bashforth methods by starting with a “clever idea” for reducing the error and working out its consequences - in that specific case the clever idea was polynomial interpolation. This is one common approach for deriving or presenting “good” numerical methods. However the criteria ([eq:ch10_multistep_order_cond]) and ([eq:ch10_multistep_gllobal_err_root_cond]) enable an alternative purely algebraic approach.
The \(s\)-step Adams-Bashforth method has \(\rho(w)=w^{s}-w^{s-1}=w^{s-1}(w-1),\) with the simple (i.e. not repeated) root \(w=1\) and, for \(s\ge2,\) the root \(w=0\) of multiplicity \(s-1.\) Heuristically one may think of the global convergence behavior of a method as being better the further away the roots are from the boundary of the unit disk. It follows from Theorem 10.3 that a consistent method must have \(\rho(1)=0,\) so the root \(w=1\) on the boundary is always present. From this point of view the Adams-Bashforth methods are in some sense optimal, since the remaining roots are as far away from the boundary of the unit disk as possible.
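To illustrate the algebraic approach, here is a small SymPy sketch (under the illustrative assumptions that \(\rho(w)=w^{2}-w\) is fixed in advance and that the method should be explicit, i.e. \(b_{2}=0\)): imposing \(A_{k}=B_{k}\) for as many \(k\) as the remaining free coefficients allow recovers the two-step Adams-Bashforth method:

```python
import sympy as sp

b0, b1 = sp.symbols('b0 b1')
a = [0, -1, 1]   # rho(w) = w^2 - w, fixed in advance
b = [b0, b1, 0]  # explicit two-step method: b_2 = 0, two free coefficients

def A(k):
    return sum(a[m] * m**k for m in range(3))

def B(k):
    return 0 if k == 0 else k * sum(b[m] * m**(k - 1) for m in range(3))

# A_0 = B_0 holds automatically since rho(1) = 0; impose A_1 = B_1 and A_2 = B_2.
sol = sp.solve([sp.Eq(A(1), B(1)), sp.Eq(A(2), B(2))], [b0, b1])
print(sol)                   # {b0: -1/2, b1: 3/2}, i.e. the two-step Adams-Bashforth method
print(A(3), B(3).subs(sol))  # 7 and 9/2 differ, so the order is exactly p = 2
```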