A general purpose root finding algorithm

Overview

If you’ll recall from Backward Euler, we arrived at an update rule in which \(\vec{y}_{n+1}\) is defined in terms of itself. This implicit method can be difficult to solve exactly, but by moving all the terms to the left hand side, so that the right hand side is zero, we can recast the problem as a root finding problem. Newton’s Method is one way to find the roots of complicated functions, as we shall see.

Newton’s method for functions of one variable

The point-slope equation of a straight line

The point-slope equation of a straight line is not quite as popular as the slope-intercept form, so by way of refresher here it is. \[ f(x) - f(x_{1}) = m(x - x_{1}) \] It defines a straight line in terms of its constant slope \(m\) and a point on the line \((x_{1}, f(x_{1}))\). We know some calculus, so we can replace \(m\) with the derivative \(f'(x)\). Since the slope is constant we can evaluate the derivative at any point on the line, so a natural choice is \(x_{1}\): \[ f(x) - f(x_{1}) = f'(x_{1})(x - x_{1}) \]

Motivation

The motivation for Newton’s method is that if we have an “okay” guess \(x_{i}\) at the root, then we can use a straight line approximation to \(f\) at \(x_{i}\) to improve our guess. By repeating this process our guess can be iteratively improved to arbitrary accuracy.

[Animation: successive Newton iterations following tangent lines toward the root]

The update rule comes from setting \(f(x) = 0\) in the point-slope form, with \(x = x_{i+1}\) and \(x_{1} = x_{i}\), and solving for \(x_{i+1}\): \[ f'(x_{i})(x_{i+1}-x_{i}) = -f(x_{i}) \iff x_{i+1} = x_{i} - \frac{f(x_{i})}{f'(x_{i})} \]
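To make this concrete, here is a minimal sketch of the update rule in C++. The function `newtons_method`, its tolerance, and its iteration cap are illustrative choices of mine, not part of any standard:

```cpp
#include <cmath>
#include <cstdio>

// Minimal scalar Newton's method: repeatedly apply
// x_{i+1} = x_i - f(x_i) / f'(x_i) until f(x) is near zero.
double newtons_method(double (*f)(double), double (*df)(double),
                      double x0, double tol = 1e-12, int max_iters = 50) {
    double x = x0;
    for (int i = 0; i < max_iters; ++i) {
        double fx = f(x);
        if (std::abs(fx) < tol) break;  // close enough to a root
        x -= fx / df(x);                // the Newton update
    }
    return x;
}

int main() {
    // Example: the root of f(x) = cos(x) - x, starting from an "okay" guess.
    double root = newtons_method(
        [](double x) { return std::cos(x) - x; },
        [](double x) { return -std::sin(x) - 1.0; },
        1.0);
    std::printf("root ~ %.12f\n", root);  // ~0.739085133215
}
```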

Proof of convergence

This section will be kind of math heavy, but don’t be too intimidated.

Assume that the true root of \(f\) is \(a\). Let \(x_{i}\) be our current guess at \(a\). Then we can use Taylor’s theorem to expand \(f\) about \(x_{i}\): \[ f(a) = f(x_{i}) + f'(x_{i})(a-x_{i}) + R \] where \(R\) is the remainder term. There are multiple ways to characterize the remainder. We will be using the Lagrange form, which is based on the Mean Value Theorem: \[ R = \frac{1}{2}f''(c_{i})(a-x_{i})^{2} \] where \(c_{i} \in (x_{i}, a)\) (or \(c_{i} \in (a, x_{i})\), depending on which endpoint is less than the other).

Since \(a\) is the true root, \(f(a) = 0\), so the first equation is actually \[ 0 = f(x_{i}) + f'(x_{i})(a-x_{i}) + \frac{1}{2}f''(c_{i})(a-x_{i})^{2} \]

Dividing this whole thing by \(f'(x_{i})\) and rearranging, we have \[ \frac{f(x_{i})}{f'(x_{i})} - x_{i} + a = \frac{-f''(c_{i})}{2f'(x_{i})}(a-x_{i})^{2} \]

Recall that our update rule is \[ x_{i+1} = x_{i} - \frac{f(x_{i})}{f'(x_{i})} \]

Plugging this into our Taylor expansion gives \[ a - x_{i+1} = \frac{-f''(c_{i})}{2f'(x_{i})}(a-x_{i})^{2} \]

Denote the error in our estimate by \(\varepsilon_{i} = a - x_{i}\). Then \[ \varepsilon_{i+1} = \frac{-f''(c_{i})}{2f'(x_{i})}\varepsilon_{i}^{2} \]

so

\[ |\varepsilon_{i+1}| = \left\lvert\frac{f''(c_{i})}{2f'(x_{i})}\right\rvert\varepsilon_{i}^{2} \]

What we want here is for \(|\varepsilon_{i}| \to 0\) as \(i \to \infty\). There are precise conditions for when this is guaranteed to occur that I won’t go into here. Suffice it to say that if your initial guess is somewhat decent, and your function isn’t too funky, then the error will converge to zero quadratically. After all, \(|\varepsilon_{i+1}|\) is proportional to \(\varepsilon_{i}^{2}\).
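You can watch this quadratic behavior numerically. A quick sketch, using my own toy example of computing \(\sqrt{2}\) as the positive root of \(f(x) = x^{2} - 2\):

```cpp
#include <cmath>
#include <cstdio>

int main() {
    const double a = std::sqrt(2.0);  // the "true" root, for measuring error
    double x = 1.0;                   // a somewhat decent initial guess
    for (int i = 1; i <= 5; ++i) {
        x -= (x * x - 2.0) / (2.0 * x);  // Newton update for f(x) = x^2 - 2
        std::printf("iteration %d: |error| = %.3e\n", i, std::abs(a - x));
    }
}
```

The number of correct digits roughly doubles every iteration, which is exactly what \(|\varepsilon_{i+1}| \propto \varepsilon_{i}^{2}\) predicts.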

Newton’s method for vector valued functions

Unfortunately for us we don’t have a simple function of one variable. Recall that our update rule for Backward Euler is \[ \vec{y}_{n+1} - \vec{y}_{n} - hf(t_{n+1},\vec{y}_{n+1}) = 0 \] where \(\vec{y}_{n} = \begin{bmatrix} \vec{x}_{n} \\ \vec{v}_{n} \end{bmatrix}\). Introducing some notation: \[ G(\vec{y}_{n+1}) = \begin{bmatrix} \vec{x}_{n+1} - \vec{x}_{n} - h\vec{v}_{n+1} \\ \vec{v}_{n+1} - \vec{v}_{n} - h\frac{\vec{F}(\vec{x}_{n+1},\vec{v}_{n+1})}{m} \end{bmatrix} = \begin{bmatrix} \vec{X}(\vec{x}_{n+1},\vec{v}_{n+1}) \\ \vec{V}(\vec{x}_{n+1},\vec{v}_{n+1}) \end{bmatrix} \]
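In code, evaluating \(G\) is just stacking the \(\vec{X}\) and \(\vec{V}\) blocks into one vector. A sketch using Eigen; the names `evaluate_G` and `force` are mine, and `force` stands in for whatever \(\vec{F}(\vec{x},\vec{v})\) your simulation uses:

```cpp
#include <Eigen/Dense>

// Evaluate G at a candidate state (x_new, v_new), given the previous
// state (x_prev, v_prev), the time step h, and the mass m.
Eigen::Matrix<double, 6, 1> evaluate_G(
    const Eigen::Vector3d& x_new, const Eigen::Vector3d& v_new,
    const Eigen::Vector3d& x_prev, const Eigen::Vector3d& v_prev,
    double h, double m,
    Eigen::Vector3d (*force)(const Eigen::Vector3d&, const Eigen::Vector3d&)) {
    Eigen::Matrix<double, 6, 1> g;
    g.head<3>() = x_new - x_prev - h * v_new;                     // X block
    g.tail<3>() = v_new - v_prev - (h / m) * force(x_new, v_new); // V block
    return g;
}
```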

So we want to find the root \(\vec{y}_{n+1}\) of the vector valued function \(G\). To do this we will make use of the generalization of Newton’s Method.

At this point the subscript notation gets a little hairy. Let \(\vec{z}_{i}\) be the \(i\)-th approximation to \(\vec{y}_{n+1}\). Then the generalized Newton’s method is \[ \nabla G(\vec{z}_{i}) (\vec{z}_{i+1} - \vec{z}_{i}) = -G(\vec{z}_{i}) \]

\(\nabla G(\vec{z}_{i})\) is the Jacobian matrix of \(G\) evaluated at \(\vec{z}_{i}\).

\[ \nabla G = \begin{bmatrix} \frac{\partial \vec{X}}{\partial \vec{x}_{n+1}} & \frac{\partial \vec{X}}{\partial \vec{v}_{n+1}} \\ \frac{\partial \vec{V}}{\partial \vec{x}_{n+1}} & \frac{\partial \vec{V}}{\partial \vec{v}_{n+1}} \end{bmatrix} = \begin{bmatrix} I & -hI \\ -\frac{h}{m}\frac{\partial \vec{F}}{\partial \vec{x}_{n+1}} & I - \frac{h}{m}\frac{\partial \vec{F}}{\partial \vec{v}_{n+1}} \end{bmatrix} \]

where \(I\) is the \(3 \times 3\) identity matrix and the derivatives of \(\vec{F}\) are themselves Jacobians. \[ \frac{\partial \vec{F}}{\partial \vec{x}_{n+1}} = \begin{bmatrix} \frac{\partial F_{x}}{\partial x_{n+1,x}} & \frac{\partial F_{x}}{\partial x_{n+1,y}} & \frac{\partial F_{x}}{\partial x_{n+1,z}} \\ \frac{\partial F_{y}}{\partial x_{n+1,x}} & \frac{\partial F_{y}}{\partial x_{n+1,y}} & \frac{\partial F_{y}}{\partial x_{n+1,z}} \\ \frac{\partial F_{z}}{\partial x_{n+1,x}} & \frac{\partial F_{z}}{\partial x_{n+1,y}} & \frac{\partial F_{z}}{\partial x_{n+1,z}} \end{bmatrix} \]

\[ \frac{\partial \vec{F}}{\partial \vec{v}_{n+1}} = \begin{bmatrix} \frac{\partial F_{x}}{\partial v_{n+1,x}} & \frac{\partial F_{x}}{\partial v_{n+1,y}} & \frac{\partial F_{x}}{\partial v_{n+1,z}} \\ \frac{\partial F_{y}}{\partial v_{n+1,x}} & \frac{\partial F_{y}}{\partial v_{n+1,y}} & \frac{\partial F_{y}}{\partial v_{n+1,z}} \\
\frac{\partial F_{z}}{\partial v_{n+1,x}} & \frac{\partial F_{z}}{\partial v_{n+1,y}} & \frac{\partial F_{z}}{\partial v_{n+1,z}} \end{bmatrix} \]
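Assembling \(\nabla G\) in code is then a matter of filling in the four \(3 \times 3\) blocks. A sketch with Eigen; the function name is mine, and how you compute `dF_dx` and `dF_dv` depends entirely on your force model:

```cpp
#include <Eigen/Dense>

// Assemble the 6x6 Jacobian of G from its 3x3 blocks. dF_dx and dF_dv
// are the Jacobians of the force with respect to position and velocity.
Eigen::Matrix<double, 6, 6> gradient_G(const Eigen::Matrix3d& dF_dx,
                                       const Eigen::Matrix3d& dF_dv,
                                       double h, double m) {
    const Eigen::Matrix3d I = Eigen::Matrix3d::Identity();
    Eigen::Matrix<double, 6, 6> grad;
    grad.block<3, 3>(0, 0) = I;                      // dX/dx
    grad.block<3, 3>(0, 3) = -h * I;                 // dX/dv
    grad.block<3, 3>(3, 0) = -(h / m) * dF_dx;       // dV/dx
    grad.block<3, 3>(3, 3) = I - (h / m) * dF_dv;    // dV/dv
    return grad;
}
```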

Recall that our goal is \(G(\vec{y}_{n+1}) = 0\). In the vector valued case Newton’s method is a system of linear equations of the form \(A\vec{u} = \vec{b}\), with \(A = \nabla G(\vec{z}_{i})\), \(\vec{u} = \vec{z}_{i+1} - \vec{z}_{i}\), and \(\vec{b} = -G(\vec{z}_{i})\). If we can solve this system for \(\vec{u}\) then we can find \(\vec{z}_{i+1}\) by writing \(\vec{z}_{i+1} = \vec{z}_{i} + \vec{u}\). Solving a system of linear equations is a well studied problem that I won’t go into detail on. Suffice it to say that the Eigen library has many built in methods for solving this kind of problem, so we will be leveraging those.
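For example, a single generalized Newton step might look like the sketch below, building on the hypothetical helpers above. `partialPivLu()` is one of Eigen's built-in dense solvers; others, such as `colPivHouseholderQr()`, would work just as well:

```cpp
#include <Eigen/Dense>

// One generalized Newton step: solve grad_G * (z_next - z) = -G(z)
// for the update u, then return z + u.
Eigen::Matrix<double, 6, 1> newton_step(
    const Eigen::Matrix<double, 6, 6>& grad_G,
    const Eigen::Matrix<double, 6, 1>& G_of_z,
    const Eigen::Matrix<double, 6, 1>& z) {
    Eigen::Matrix<double, 6, 1> u = grad_G.partialPivLu().solve(-G_of_z);
    return z + u;
}
```

Iterating this step until \(G(\vec{z}_{i})\) is sufficiently close to zero gives us \(\vec{y}_{n+1}\).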

In the next article we’ll finally implement Backward Euler.