Differential Equation Basics

Initial Value Problems

Differential equations describe the relation between an unknown function and its derivatives. To solve a differential equation is to find a function that satisfies the relation, typically while satisfying some additional conditions as well

In a canonical initial value problem, the behavior of the system is described by an ordinary differential equation (ODE) of the form

x˙=f(x,t)\dot{\boldsymbol{x}} = f(\boldsymbol{x}, t)

where:

  • ff is a known function

  • x\boldsymbol{x} is the state of the system

  • x˙\dot{\boldsymbol{x}} is x\boldsymbol{x} ’s time derivative

In an initial value problem we are given x(t0)=x0\boldsymbol{x}(t_{0}) = \boldsymbol{x}_{0} at some starting time t0t_{0}, and wish to follow x\boldsymbol{x} over time thereafter.

2D visualization:

The trajectory swept out by p\boldsymbol{p} through ff forms an integral curve of the vector field

ff may or may not depend directly on time tt. If it does, then not only the point p\boldsymbol{p} but the the vector field itself moves. So that point's velocity depends not only on where it is, but on when it arrives there.

Numerical Solutions

Numerical solutions take discrete time steps starting with the initial value x(t0)\boldsymbol{x}(t_{0}).

To take a step:

  1. use the derivative function ff to calculate an approximate change in x\boldsymbol{x}: Δx\Delta{\boldsymbol{x}}, over a time interval Δt\Delta{t}

  2. increment x\boldsymbol{x} by Δx\Delta{\boldsymbol{x}} to obtain the new value

In calculating a numerical solution, the derivative function ff is regarded as a black box: we provide numerical values for x\boldsymbol{x} and tt, receiving in return a numerical value for x˙\dot{\boldsymbol{x}}

Numerical methods operate by performing one or more of these derivative evaluations at each time step.

Euler’s Method

The simplest numerical method is called Euler’s method.

let hh is the stepsize parameter

Euler’s method simply computes the next state x(t+h)\boldsymbol{x}(t +h)by taking a step in the derivative direction:

x(t+h)=x(t)+hx˙(t)\boldsymbol{x}(t +h) = \boldsymbol{x}(t) + h \dot{\boldsymbol{x}}(t)

Instead of the true integral curve, the approximate solution follows a polygonal path, obtained by evaluating the derivative at the beginning of each leg:

Though simple, Euler’s method is not accurate:

Consider the case of a 2D function ff whose integral curves are concentric circles. A point p\boldsymbol{p} governed by ff is supposed to orbit forever on whichever circle it started on. Instead, with each Euler step, p\boldsymbol{p} will move on a straight line to a circle of larger radius, so that its path will follow an outward spiral. Shrinking the stepsize will slow the rate of this outward drift, but never eliminate it.

Moreover, Euler’s method can be unstable:

Consider a 1D function f=kxf = -kx (is satisfied by x=ektx = e^{-kt}), which should make the point p\boldsymbol{p} decay exponentially to zero. For sufficiently small step sizes we get reasonable behavior, but when:

  • h>1/kh > 1/k, we have Δx>x\lvert \Delta{x} \rvert > \lvert x \rvert, so the solution oscillates about zero

  • Beyond h=2/kh = 2/k, the oscillation diverges, and the system blows up

这里应该是假定了 k>0k > 0 ?

Finally, Euler’s method isn’t even efficient:

Most numerical solution methods spend nearly all their time performing derivative evaluations, so the computational cost per step is determined by the number of evaluations per step.

Though Euler’s method only requires one evaluation per step, the real efficiency of a method depends on the size of the steps it lets you take—while preserving accuracy and stability—as well as on the cost per step.

More sophisticated methods, even some requiring as many as four or five evaluations per step, can greatly outperform Euler’s method because their higher cost per step is more than offset by the larger stepsizes they allow.

数值方法需要平衡结果(精确性、稳定性)与效率(计算开销)

复杂的数值方法主要计算开销在每一步的 derivative evaluation 上,但是由于它们的复杂性,在同样的精确性、稳定性要求下允许更大的步长,因此需要更少的步数

计算开销(总价)综合了每一步的开销(单位价格)与步数(数量)

To understand how we go about improving on Euler’s method, we need to look more closely at the error that the method produces. The key to understanding what’s going on is the Taylor series: Assuming x(t)\boldsymbol{x}(t) is smooth, we can express its value at the end of the step as an infinite sum involving the the value and derivatives at the beginning:

x(t+h)=x(t)+hx˙(t)+h22!x¨(t)++hnn!nx(t)tn\boldsymbol{x}(t + h) = \boldsymbol{x}(t) + h \dot{\boldsymbol{x}}(t) + \frac{h^2}{2!}\ddot{\boldsymbol{x}}(t) + \cdots + \frac{h^n}{n!}\frac{\partial^{n} \boldsymbol{x}(t)}{\partial t^{n}}

We get the Euler update formula by truncating the series, which means that Euler’s method would be correct only if all derivatives beyond the first were zero, i.e. if x(t)\boldsymbol{x}(t) were linear.

The error term, the difference between the Euler step and the full, untruncated Taylor series, is dominated by the leading term h22!x¨(t)\frac{h^2}{2!}\ddot{\boldsymbol{x}}(t). Consequently, we can describe the error as O(h2)O(h^2) (read “Order h squared”.)

Suppose that we chop our stepsize in half. Although this produces only about one fourth the error we got with a stepsize of hh, we have to take twice as many steps over any given interval. That means that the error we accumulate over an interval [t0,t1][t_0, t_1] depends linearly upon hh. Theoretically, using Euler’s method we can numerically compute with as little error as we want, by choosing a suitably small stepsize. In practice, a great many timesteps might be required, depending on the error and the function ff .

The Midpoint Method

If we were able to evaluate x¨\ddot{\boldsymbol{x}} as well as x˙\dot{\boldsymbol{x}}, we could acheive O(h3)O(h^3) accuracy instead of O(h2)O(h^2):

x(t+Δt)=x(t)+hx˙(t)+h22!x¨(t)+O(h3)\boldsymbol{x}(t + \Delta t) = \boldsymbol{x}(t) + h \dot{\boldsymbol{x}}(t) + \frac{h^2}{2!}\ddot{\boldsymbol{x}}(t) + O(h^{3})

Recall that the time derivative x˙\dot{\boldsymbol{x}} is given by a function f(x(t),t)f(\boldsymbol{x}(t), t). For simplicity in what follows, we will assume that the derivative function ff does depends on time only indirectly through x\boldsymbol{x}, so that x˙=f(x(t))\dot{\boldsymbol{x}} = f(\boldsymbol{x}(t)). The chain rule then gives:

x¨=fxx˙=ff\ddot{\boldsymbol{x}} = \frac{\partial{f}}{\partial{\boldsymbol{x}}}\dot{\boldsymbol{x}} = f^{\prime}f

To avoid having to evaluate ff^{\prime},which would often be complicated and expensive, we can approximate the second-order term just in terms of ff, and substitute the approximation into the truncated Taylor series of x\boldsymbol{x}, leaving us with O(h3)O(h^3) error. To do this, we perform another Taylor expansion of ff:

f(x+Δx)=f(x)+Δxf(x)+O(Δx2)f(\boldsymbol{x} + \Delta \boldsymbol{x}) = f(\boldsymbol{x}) + \Delta \boldsymbol{x} f^{\prime}(\boldsymbol{x}) + O(\Delta \boldsymbol{x}^{2})

We first introduce x¨\ddot{\boldsymbol{x}} into this expression by choosing non-fixed step Δx\Delta{\boldsymbol{x}} at time tt, that is, varying with x(t)\boldsymbol{x}(t):

Δx=h2f(x)\Delta \boldsymbol{x} = \frac{h}{2}f(\boldsymbol{x})

so that:

f(x+h2f(x))=f(x)+h2f(x)f(x)+O(h2)=f(x)+h2x¨(t)+O(h2)f(\boldsymbol{x} + \frac{h}{2}f(\boldsymbol{x})) = f(\boldsymbol{x}) + \frac{h}{2}f(\boldsymbol{x}) f^{\prime}(\boldsymbol{x}) + O(h^{2}) = f(\boldsymbol{x}) + \frac{h}{2}\ddot{\boldsymbol{x}}(t) + O(h^{2})

We can now multiply both sides by hh (turning the O(h2)O(h^2) term into O(h3)O(h^3)) and rearrange, yielding:

h22x¨(t)+O(h3)=h{f[x(t)+h2f[x(t)]f[x(t)]}\frac{h^{2}}{2} \ddot{\boldsymbol{x}}(t) + O(h^{3}) = h\{f[\boldsymbol{x}(t) + \frac{h}{2}f[\boldsymbol{x}(t)] - f[\boldsymbol{x}(t)]\}

Substituting the right hand side into the truncated Taylor series of x\boldsymbol{x} gives the update formula (notice that f[x(t)]=x˙(t)f[\boldsymbol{x}(t)] = \dot{\boldsymbol{x}}(t)):

x(t+Δt)=x(t)+h{f[x(t)+h2f[x(t)]}\boldsymbol{x}(t + \Delta t) = \boldsymbol{x}(t) + h\{f[\boldsymbol{x}(t) + \frac{h}{2}f[\boldsymbol{x}(t)]\}

可以直接把 O(h3)O(h^{3})替代掉?或者这里只是给出近似公式

This formula:

  1. evaluates an Euler step

  2. performs a second derivative evaluation at the midpoint of the step

  3. using the midpoint evaluation to update x\boldsymbol{x}

Hence the name midpoint method. The midpoint method is correct to within O(h3)O(h^3), but requires two evaluations of ff.

图里应该是 Δtf(x)\Delta{t}f(\boldsymbol{x})f(x+Δx/2)f(\boldsymbol{x} + \Delta{\boldsymbol{x}}/2)?

We don’t have to stop with an error of O(h3)O(h^3). By evaluating ff a few more times, we can eliminate higher and higher orders of derivatives. The most popular procedure for doing this is a method called Runge-Kutta of order 4 and has an error per step of O(h5)O(h^5). (The Midpoint method could be called Runge-Kutta of order 2.) We won’t derive the fourth order Runge-Kutta method, but the formula for computing x(t+h)\boldsymbol{x}(t + h) is listed below:

k1=hf(x,t)k2=hf(x+k1/2,t+h/2)k3=hf(x+k2/2,t+h/2)k4=hf(x+k3,t+h)x(t+h)=x+16k1+13k2+13k3+16k4\begin{align*} k_{1} &= hf(\boldsymbol{x}, t) \\ k_{2} &= hf(\boldsymbol{x} + k_{1}/2, t + h/2) \\ k_{3} &= hf(\boldsymbol{x} + k_{2}/2, t + h/2) \\ k_{4} &= hf(\boldsymbol{x} + k_{3}, t + h) \\ \boldsymbol{x}(t + h) &= \boldsymbol{x} + \frac{1}{6}k_{1} + \frac{1}{3}k_{2} + \frac{1}{3}k_{3} + \frac{1}{6}k_{4} \end{align*}

Adaptive Stepsizes

Whatever the underlying method, a major problem lies in determing a good stepsize.

Ideally, we want to choose hh as large as possiblebut not so large as to give us an unreasonable amount of error, or worse still, to induce instability.

理想的步长尽可能大(快),但不会产生不可接受的错误与不稳定性(精确性、稳定性)

If we choose a fixed stepsize, we can only proceed as fast as the “worst” sections of x(t)\boldsymbol{x}(t) will allow.

What we would like to do is to vary hh as we march forward in time. This is the idea of adaptive stepsizing: varying hh over the course of solving the ODE.

Here we’ll be present adaptive stepsizing for Euler’s method. The basic idea is as follows. Lets assume we have a given stepsize hh, and we want to know how much we can consider changing it.

Suppose we compute two estimates for x(t+h)\boldsymbol{x}(t + h). We compute an estimate xa\boldsymbol{x}_{a}, by taking an Euler step of size hh. We also compute an estimate xb\boldsymbol{x}_{b} by taking two Euler steps of size h/2h/2.

Both are differ from the true value of x(t+h)\boldsymbol{x}(t + h) by O(h2)O(h^{2}). That means that they differ from each other by O(h2)O(h^{2}). As a result, we can write that a measure of the current error ee is:

e=xaxbe = \lvert \boldsymbol{x}_{a} - \boldsymbol{x}_{b} \rvert

This gives us a convenient estimate to the error in taking an Euler step of size hh.

Suppose that we are willing to have an error of as much as 10410^{−4} per step, and that the current error is only 10810^{−8}. Since the error goes up as h2h^{2}, we can increase the stepsize to:

(104108)12h=100h(\frac{10^{-4}}{10^{-8}})^{\frac{1}{2}}h = 100h

Conversely, if we currently had an error of 10310^{−3}, and could only tolerate an error of 10410^{−4}, we would have to decrease the stepsize to:

(104103)12h=.316h(\frac{10^{-4}}{10^{-3}})^{\frac{1}{2}}h = .316h

但是如何知道我们什么时候需要什么级别的误差?

Implementation

The ODEs we will want to solve may represent many things—for instance, a collection of masses and springs, some rigid bodies, or a deformable object.

We want to implement ODE solvers and the models on which they operate in a way that isolates each from the internal details of the other. This will make it possible to change solvers easily, and also make the solver code reusable.

Fortunately, this kind of modularity is not difficult to acheive, since all solvers can be expressed in terms of a small, stereotyped set of operations. Presumably, the system of ODE-governed objects will be embodied in a structure of some kind. The approach is to write type-specific code that operates on this structure to perform the standard operations, then to implement solvers in terms of these generic operations.

From the solver’s viewpoint, the system on which it operates is a black-box function f(x,t)f(\boldsymbol{x}, t). The solver needs to be able to evaluate ff, as required, at any values of x\boldsymbol{x} and tt, and then to install the updated x\boldsymbol{x} and tt when a time step is taken. To support these operations, the object that represents the ODE being solved must be able to handle these requests from the solver:

  • Return dim(x)dim(\boldsymbol{x}): Since x\boldsymbol{x} and x˙\dot{\boldsymbol{x}} may be vectors, the solver must know their length, to allocate storage, perform vector arithmetic ops, etc.

  • Get set x\boldsymbol{x} and tt: The solver must be able to install new values at the end of a step. In addition, a multi-step method must set them to intermediate values in the course of performing derivative evaulations.

  • Evaluate ff at the current x\boldsymbol{x} and tt.

In an object-oriented language, these operations would naturally be implemented as generic functions that are handled in a type-specific way.

In a non-object-oriented language generic functions would be faked by installing pointers to type-specific functions in structure slots, or simply by passing the function pointers as arguments to the solver.

Later on we will consider in detail how these operations are to be implemented for specific models such as particle-and-spring systems

Last updated