Stiff systems
Lawrence F. Shampine and Skip Thompson (2007), Scholarpedia, 2(3):2855. doi:10.4249/scholarpedia.2855
What is Stiffness?
Stiff systems of ordinary differential equations are a very important special case of the systems taken up in Initial Value Problems. There is no universally accepted definition of stiffness. Some attempts to understand stiffness examine the behavior of fixed step size solutions of systems of linear ordinary differential equations with constant coefficients. The eigenvalues \(\lambda_i\) of the Jacobian matrix completely characterize the stability of the system in this case. They also determine the behavior of explicit numerical methods applied to the system.
Solutions of the test equation \[ y'(t) = \lambda_i y(t), \quad y(0) = 1, \] for \(Re(\lambda_i) < 0\) decay exponentially fast as \(t\) increases. Equations of this kind arise in a natural way when the components of more general systems are uncoupled, so how a method performs on the test equation indicates how it will perform on more general equations. Assuming that \(Re(\lambda_i) < 0\) for all eigenvalues, a commonly used stiffness index is \[ L = \max |Re(\lambda_i)|. \]
This measure is extended to general differential equations by considering eigenvalues of the local Jacobian matrix. It should be noted that \(L\) is not invariant under a simple rescaling of the problem. That raises the distinction between the mathematical problem and the computational problem. The computational problem includes the nature of the error control and the error tolerances, and whether a problem is stiff may depend on these. In particular, rescaling a problem must include a corresponding change of error control if an equivalent problem is to be solved. It might also be remarked that pure imaginary eigenvalues are not allowed in these measures of stiffness because problems with high frequency oscillations require a completely different approach. An alternative measure is the ratio of the local solution time scale (or resolution step size) to the smallest damping time constant, \(\min_i \left( -1/Re(\lambda_i) \right)\). This can be more useful when some \(Re(\lambda_i) > 0\) or when some other driving term sets the local solution time scale. Finally, it should also be pointed out that some authors prefer to use the stiffness ratio \[ S = \frac{\max |Re(\lambda_i)|} {\min |Re(\lambda_i)|} \] as a measure of stiffness.
Generally, we consider a system to be stiff if the stiffness index is "large." More specifically, a system is considered to be stiff on an interval \([t_0,t_f]\) if \(L(t_f - t_0) \gg 1\). Often the role of the interval is overlooked, though we will see below that what might seem a small value of \(L\) can contribute to a stiff problem because the interval is long. We have seen examples for which the reverse is true. For instance, a nuclear reactor water hammer modeled by a system of several thousand differential equations had an \(L\) that was extremely large. Nevertheless, the model was easily solved using explicit methods because the time interval of interest was a fraction of a millisecond; as a consequence, the problem was not stiff.
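As a concrete illustration of these definitions, the following Python sketch computes \(L\), \(S\), and the interval criterion \(L(t_f - t_0)\) for a small linear constant-coefficient system. The matrix and the interval are made-up values chosen only to illustrate the formulas.

```python
# A minimal sketch: the stiffness index L, the stiffness ratio S, and
# the interval criterion L*(tf - t0) for a linear system y' = J y.
# The matrix J and the interval [t0, tf] are illustrative values.
import numpy as np

J = np.array([[-1.0,     0.0],
              [ 0.0, -1000.0]])      # eigenvalues -1 and -1000

re = np.linalg.eigvals(J).real
assert np.all(re < 0), "these measures assume Re(lambda_i) < 0"

L = np.max(np.abs(re))                       # stiffness index
S = np.max(np.abs(re)) / np.min(np.abs(re))  # stiffness ratio

t0, tf = 0.0, 10.0
print(f"L = {L:g}, S = {S:g}, L*(tf - t0) = {L*(tf - t0):g}")
# L*(tf - t0) = 1e4 >> 1, so this system is stiff on [0, 10].
```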
Computational Stiffness
Stiffness ratios are helpful as far as they go, but they do not account adequately for several important practical issues. Some authors (Gear (1971)) prefer to approach the question of stiffness using a different test equation \[ y'(t) = A (y - F(t)) + F'(t). \] Here \(A\) is a constant Jacobian matrix that is assumed to have eigenvalues with negative real parts, so that all solutions converge to \(F(t)\). As Gear notes, for a problem to be stiff, the limit solution \(F(t)\) must be a smooth, slowly varying function. The fundamental issue is how a step size that will resolve the behavior of the solution of interest relates to the stability of the problem as indicated by the \(-Re(\lambda_i)\).
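This point of view is easy to try numerically. The sketch below integrates a scalar instance of Gear's test equation with a nonstiff and a stiff SciPy solver; the choices \(A = -1000\) and \(F(t) = \sin t\), as well as the use of SciPy's RK45 and BDF codes, are illustrative assumptions.

```python
# A sketch of Gear's test equation y'(t) = A*(y - F(t)) + F'(t) in the
# scalar case, with the illustrative choices A = -1000, F(t) = sin(t).
# Every solution is attracted rapidly to the smooth limit solution F(t).
import numpy as np
from scipy.integrate import solve_ivp

A = -1000.0

def f(t, y):
    return A * (y - np.sin(t)) + np.cos(t)

for method in ('RK45', 'BDF'):
    sol = solve_ivp(f, (0.0, 10.0), [2.0], method=method)
    print(f"{method:4s}: {len(sol.t) - 1} steps")
# RK45 is forced to take tiny steps by stability even though the solution
# it tracks is essentially sin(t); BDF chooses its steps based on accuracy.
```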
It must be recognized that stiff problems in practice can have eigenvalues with positive real parts, as long as they are not too large. The stiffness ratio is not defined in this case; in particular, it is not defined in the common case of linear conservation laws, which result in zero eigenvalues. Indeed, the classic Robertson chemical kinetics problem (http://www.dm.uniba.it/~testset/) typically used to illustrate stiffness has two zero eigenvalues. Any definition of stiffness that does not apply to this standard test problem is not very useful. The stiffness ratio is not applicable to a scalar problem, hence is not applicable to the combustion example in the next section.
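For reference, here is a sketch of the Robertson problem just mentioned, solved with SciPy's BDF-based stiff solver. The rate constants and initial values are the standard ones from the test set, but the solver, tolerances, and interval are illustrative choices of our own.

```python
# The Robertson chemical kinetics problem: a standard stiff test with
# the usual published rate constants.  The components sum to 1 for all
# time, a linear conservation law of the kind discussed above.
import numpy as np
from scipy.integrate import solve_ivp

def robertson(t, y):
    y1, y2, y3 = y
    return [-0.04*y1 + 1.0e4*y2*y3,
             0.04*y1 - 1.0e4*y2*y3 - 3.0e7*y2**2,
                                     3.0e7*y2**2]

def jac(t, y):                       # analytical Jacobian; rows sum to 0
    y1, y2, y3 = y
    return np.array([[-0.04,  1.0e4*y3,             1.0e4*y2],
                     [ 0.04, -1.0e4*y3 - 6.0e7*y2, -1.0e4*y2],
                     [ 0.0,   6.0e7*y2,             0.0]])

sol = solve_ivp(robertson, (0.0, 4.0e5), [1.0, 0.0, 0.0],
                method='BDF', jac=jac, rtol=1e-6, atol=1e-10)
print(sol.y[:, -1], "sum =", sol.y[:, -1].sum())
```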
The stiffness ratio does not account for the role of the interval. Exercise 9.1 in Shampine (1994) provides two kinds of examples showing that, for this reason, the stiffness ratio can declare a problem stiff when it is not stiff as far as any popular code is concerned, and vice versa. The stiffness ratio also does not account for eigenvalues that are relatively near the imaginary axis. Stiffness in such a case depends on precisely which method is being used, because methods typically have less satisfactory stability near the imaginary axis. In particular, moderate to high order BDFs can suffer from stiffness when solving such a problem just as much as explicit Runge-Kutta methods.
There is clearly a conflict between wanting a precise mathematical definition of stiffness and the practical issue of what kind of code will perform in an acceptable way. In this context it is instructive to refer to a comment of Germund Dahlquist quoted in Exercise 9.1 of Shampine (1994): "The stiffness ratios used by some authors ... may be of use when one estimates the amount of work needed, if a stiff problem is to be solved with an explicit method, but they are fairly irrelevant in connection with implicit methods...."
We hasten to point out that the above definitions ignore the crucial role of per-step costs. More specifically, the choice of methods should include the cost of taking a step with a suitable stiff method versus the corresponding cost with a nonstiff method. In fact, the gain in step size achieved with a stiff method is offset somewhat by its higher per-step cost; so for a mildly stiff problem a nonstiff method may be cheaper.
An Example
A simple combustion model (O'Malley (1991), Shampine et al. (2003)) shows what is special about solving stiff systems numerically. For very small values of \(\epsilon > 0\), the concentration \(y(t)\) of a reactant satisfies \[ y'(t) = y^2 (1 - y), \quad 0 \le t \le 2/ \epsilon, \quad y(0) = \epsilon . \] Although it is not possible to illustrate all aspects of solving stiff systems with this simple scalar equation, the problem illustrates the most important ones.

The stability of an initial value problem is crucial to its numerical solution. The linear stability of \(y' = F(y)\) is determined by the eigenvalues of the Jacobian \(J = \partial F/\partial y\), which here is \(2y - 3y^2\). On the interval \([0,1/\epsilon]\), the solution \(y(t)\) is positive, hence \(J > 0\) and the problem is unstable. However, \(y(t)\) is \(O(\epsilon)\) on this interval, so \(J\) is \(O(\epsilon)\) and the equation is only mildly unstable on an interval of length \(O(1/\epsilon)\). It is easy to compute the solution on this interval with an explicit Runge-Kutta method. At ignition the solution increases rapidly to a limit value of 1. Any method will need to use a relatively small step size to resolve this sharp change in \(y(t)\), and again methods like explicit Runge-Kutta work very well.

The situation is quite different on the interval \([1/\epsilon, 2/\epsilon]\), where \(y(t) \approx 1\). The solution looks very easy to approximate, but it is found that an explicit Runge-Kutta code has many failed steps and the average successful step size seems inordinately small.

Whether a system is stiff is a complicated matter, but there are two main ingredients seen in this example. One is a solution that is slowly varying and the other is a very stable problem. In the example \(J \approx -1\) on \([1/\epsilon, 2/\epsilon]\), so the differential equation is stable there. Though the magnitude of \(J\) is not large, the interval is long, and it is the combination that makes the problem very stable.

The essence of the difficulty is that when solving non-stiff problems, a step size small enough to provide the desired accuracy is small enough that the stability of the numerical method is qualitatively the same as that of the differential equations. For a stiff problem, step sizes that would provide an accurate solution with an explicit Runge-Kutta method must be reduced greatly to keep the computation stable. An equally important practical matter is that methods with superior stability properties cannot be evaluated by simple iteration as for non-stiff problems, because the step size would have to be very small for the iteration to converge.

In practice, whether an initial value problem is stiff depends on the stability of the problem, the length of the interval of integration, the stability of the numerical method, and the manner in which the method is implemented. Often the best way to proceed is to try one of the solvers intended for non-stiff systems. If it is unsatisfactory, the problem may be stiff. If the problem is stiff, there are effective solvers available. These solvers can be used for non-stiff systems, too, but unfortunately they can be more expensive, even far more expensive, than a solver intended for such problems. The figures show in real time the numerical solution of the combustion model with an explicit Runge-Kutta method and a variable order BDF solver for \(\epsilon = 10^{-3}\).
In the figures the centers of the circles are the computed solution and the gaps between them correspond to the local step sizes. Notice how differently the methods behave on the last half of the interval, where the problem is moderately stiff. The relative performance depicted in the figures for the stiff and nonstiff solvers is demonstrated even more (in fact, far more) dramatically for smaller values of \(\epsilon\). Another delightful discussion of this problem is available elsewhere (http://www.mathworks.com/company/newsletters/news_notes/clevescorner/may03_cleve.html).
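The comparison shown in the figures is easy to reproduce in outline. The sketch below uses SciPy's RK45 and BDF codes as stand-ins for the explicit Runge-Kutta and variable order BDF solvers of the figures; the tolerance is an arbitrary illustrative choice.

```python
# A sketch of the combustion model y' = y^2*(1 - y), y(0) = eps, on
# [0, 2/eps] for eps = 1e-3, solved with a nonstiff and a stiff solver.
from scipy.integrate import solve_ivp

eps = 1e-3

def flame(t, y):
    return y**2 * (1.0 - y)

for method in ('RK45', 'BDF'):
    sol = solve_ivp(flame, (0.0, 2.0/eps), [eps], method=method, rtol=1e-5)
    print(f"{method:4s}: {len(sol.t) - 1} steps")
# Nearly all of the RK45 steps fall on [1/eps, 2/eps], where y ~ 1 and
# stability, not accuracy, limits the step size.
```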
Stability Issues
The Euler methods are easy to analyze and they show what can happen with other methods. When the step size is a constant \(h\), the explicit (forward) Euler method (AB1) applied to the test equation results in \(y_n = \left( 1 + h \lambda_i \right)^n\). For the method to be stable, the \(y_n\) must decay to 0 as \(t_n\) increases. Clearly this happens only if \(\left| 1 + h \lambda_i \right| < 1\), which is to say that \(h\lambda_i\) must lie inside a unit circle centered at \((-1,0)\). Such a region is called a stability region. When \(|Re(\lambda_i)|(t_f - t_0) \gg 1\), the solution quickly becomes very easy to approximate in the sense of absolute error. The problem is stiff for this method because a step size that approximates the solution well must be severely restricted to keep the computation stable. It is illuminating to observe that this problem is not stiff when a relative error control is used, because then the step size must be restricted to achieve the desired accuracy for all time. The implicit backward Euler method (BDF1) \[ y_{n+1} = y_n + h \lambda_i \, y_{n+1} \] generates the approximations \(y_n = 1/\left( 1 - h \lambda_i \right)^n\), which decay to 0 for all \(h\): the stability region of BDF1 contains the whole left-half complex plane. The same is true of the trapezoidal rule (AM2). Like the forward Euler method, all explicit methods have finite stability regions. A great deal of effort has been devoted to finding implicit methods with regions encompassing as much of the left-half complex plane as possible and, in particular, extending to infinity. All the BDFs have stability regions extending to infinity, though the regions become smaller as the order increases, and the methods are unusable for orders greater than 6.
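These two recurrences are easy to try numerically. The sketch below applies both Euler methods to the test equation with the illustrative values \(\lambda = -100\) and \(h = 0.05\), so that \(h\lambda = -5\) lies well outside the forward Euler stability region.

```python
# Forward vs. backward Euler on y' = lambda*y, y(0) = 1, with the
# illustrative values lambda = -100 and h = 0.05 (so h*lambda = -5).
lam, h = -100.0, 0.05

y_fe = 1.0   # forward (explicit) Euler:  y_{n+1} = (1 + h*lam) * y_n
y_be = 1.0   # backward (implicit) Euler: y_{n+1} = y_n / (1 - h*lam)
for n in range(1, 11):
    y_fe *= (1.0 + h*lam)   # factor -4: |1 + h*lam| > 1, so it blows up
    y_be /= (1.0 - h*lam)   # factor 1/6: decays for every h > 0
    print(f"step {n:2d}:  forward {y_fe: .3e}   backward {y_be:.3e}")
```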
Implicit Methods
The backward Euler method has excellent stability, but it is an implicit method. Simple iteration is the standard way to evaluate implicit formulas when solving non-stiff problems, but if we use it to evaluate the formula (BDF1), we have \[ y_{n+1}^{[m+1]} = y_n + h \lambda_i \, y_{n+1}^{[m]} . \] It is easy to see that we must have \(|h \lambda_i| < 1\) for the iterates \(y_{n+1}^{[m]}\) to converge. This restricts the step size as much as stability restricts the forward Euler method. Clearly, to take advantage of the stability of implicit methods, we must use a more effective scheme to solve a system of nonlinear algebraic equations at each step. This is done with a carefully modified Newton method. Most codes use Gaussian elimination to solve the linear equations of Newton's method. The cost of approximating the Jacobians of Newton's method can be reduced dramatically when the Jacobian is banded or sparse or has some other special structure. Fortunately, structured Jacobians are common for very large systems of ordinary differential equations. Some solvers designed for extremely large, but highly structured, systems arising in the spatial discretization of partial differential equations use preconditioned Krylov techniques to solve the linear systems iteratively. In any case, evaluating the implicit formula is relatively expensive; indeed, it dominates the computational effort. An expensive step is worthwhile only when a step size that would provide the specified accuracy with a cheap explicit method must be reduced greatly to keep the computation stable. There are many problems for which this is true, problems that can be solved easily with an appropriate solver, but scarcely, or not at all, with a solver intended for non-stiff problems.
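The contrast between simple iteration and Newton's method is easy to demonstrate on a single backward Euler step. In the sketch below the stiff scalar equation \(y' = -1000\,(y - \cos y)\), the step size, and the starting value are all made-up illustrative choices.

```python
# One backward Euler step y_new = y_old + h*f(y_new) for the made-up
# stiff scalar equation y' = f(y) = -1000*(y - cos(y)).
import numpy as np

def f(y):
    return -1000.0 * (y - np.cos(y))

def fp(y):          # f'(y), the Jacobian of this scalar problem
    return -1000.0 * (1.0 + np.sin(y))

y_old, h = 1.0, 0.01

# Simple iteration y <- y_old + h*f(y) diverges here because the
# contraction factor |h*f'(y)| is roughly 17, far greater than 1.
y = y_old
for m in range(4):
    y = y_old + h * f(y)
    print(f"simple iteration {m+1}: y = {y: .3e}")

# Newton's method on g(y) = y - y_old - h*f(y) = 0 converges quickly.
y = y_old
for m in range(4):
    g = y - y_old - h * f(y)
    y -= g / (1.0 - h * fp(y))
    print(f"Newton iteration {m+1}: y = {y:.10f}")
```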
Further Reading and Stiff IVP Solvers
The text Shampine et al. (2003) is an introduction to methods and software for stiff systems in the problem solving environment MATLAB. They are discussed at a similar level in Ascher and Petzold (1998) and at a higher level in Gear (1971) and Hairer and Wanner (1991). All these texts provide references to software. The book Aiken (1985) is a survey of the art and practice of stiff computation that contains a large bibliography. GAMS, the Guide to Available Mathematical Software (http://gams.nist.gov/), is a convenient way to locate software for solving stiff systems numerically. In addition, SUNDIALS Equation Solvers describes the SUNDIALS collection of stiff solvers available for solving extremely large systems of ordinary differential equations arising from the spatial discretization of multi-dimensional systems of partial differential equations.
References
- R.C. Aiken, ed., Stiff Computation, Oxford University Press, Oxford, U.K., 1985.
- U.M. Ascher and L.R. Petzold, Computer Methods for Ordinary Differential Equations and Differential-Algebraic Equations, SIAM, Philadelphia, 1998.
- C.W. Gear, Numerical Initial Value Problems in Ordinary Differential Equations, Prentice-Hall, Englewood Cliffs, NJ, 1971.
- E. Hairer and G. Wanner, Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems, Springer-Verlag, Berlin, 1991.
- R.E. O'Malley, Singular Perturbation Methods for Ordinary Differential Equations, Springer-Verlag, New York, 1991.
- L.F. Shampine, Numerical Solution of Ordinary Differential Equations, Chapman & Hall, New York, NY, 1994.
- L.F. Shampine, I. Gladwell, and S. Thompson, Solving ODEs with MATLAB, Cambridge University Press, Cambridge, 2003.
Internal references
- Eugene M. Izhikevich (2007) Equilibrium. Scholarpedia, 2(10):2014.
- Lawrence F. Shampine and Skip Thompson (2007) Initial value problems. Scholarpedia, 2(3):2861.
- Rob Schreiber (2007) MATLAB. Scholarpedia, 2(7):2929.
- Kendall E. Atkinson (2007) Numerical analysis. Scholarpedia, 2(8):3163.
- Jeff Moehlis, Kresimir Josic, Eric T. Shea-Brown (2006) Periodic orbit. Scholarpedia, 1(7):1358.
- John Butcher (2007) Runge-Kutta methods. Scholarpedia, 2(9):3147.
- Philip Holmes and Eric T. Shea-Brown (2006) Stability. Scholarpedia, 1(10):1838.
- Alan C. Hindmarsh and Radu Serban (2007) Sundials equation solvers. Scholarpedia, 2(3):2860.
See Also
Initial Value Problem, Numerical Analysis, Relaxation Oscillator