7

Solving Nonlinear Equations

Finding a solution of an equation of the form f(x) = 0 is a frequently encountered problem, and it is not as easy as it might at first appear. Exact solutions are possible only if f(x) is a polynomial of degree n ≤ 4. By “exact solution” we mean that there is some procedure which calculates a root from the parameters of the equation (as for the equation ax2 + bx + c = 0). In what follows, the term “root of the equation f(x) = 0” will be used interchangeably with “zero of f(x)”.

The root finding procedure is based on an iterative process or, in other words, on a process of successive approximations. An iterative process has four stages:

1. Root localization – that is, finding an interval that contains exactly one root.

2. Choosing an initial value, x0, or, in other words, letting x0 be an approximate value of x*.

3. Employing some procedure k times to obtain a set of approximations xk, starting from x0.

4. Finding out how close every approximation, xk, is to the exact solution (in terms of prescribed accuracy εp). If some approximation is in the εp-neighborhood of x*, then the iterative process is complete.

The last point will be satisfied only if:

$\lim_{k \to \infty} x_k = x^*$

that is, approximations xk converge to x*. That is why a great deal of attention will be given to conditions wherein an iterative process converges.

7.1 Calculation of Roots with the use of Iterative Functions

In what follows, we will consider another approach to calculate roots. An equation f(x) = 0 may be represented in the following equivalent form:

$x = g(x)$    (7.1)

where g(x) is an iterative function. Thus, f(x*) = 0 is consistent with x* = g(x*). It turns out that form (7.1) is more appropriate for the construction of iterative algorithms. An iterative process is constructed in the following manner: let us set the value of x0 and the succeeding approximations will be calculated by the formula

$x_{k+1} = g(x_k), \qquad k = 0, 1, 2, \ldots$    (7.2)

Figure 7.1 shows the geometrical interpretation of the convergent iterative process (7.2).

image

Figure 7.1 Successive approximations generated by iterative scheme (7.2): a) g′(x) > 0 and b) g′(x) < 0.

The following theorem explains conditions which provide convergence of the iterative process (7.2).

Theorem 7.1

Let I = [a, b] be a closed, bounded interval, and let g(x) be a function from I to I. In addition, g(x) satisfies the Lipschitz condition

$|g(y) - g(z)| \le L\,|y - z|, \qquad 0 \le L < 1,$

for arbitrary points y and z in I. Let x0 ∈ I and xk + 1 = g(xk). Then the sequence {xk} converges to the unique solution x* ∈ I of the equation x = g(x).

One can derive an estimate of the error of the kth approximant, which depends only on the two successive approximants and the Lipschitz constant. Because |xk + 1 − xk| = |g(xk) − g(xk − 1)|, we can conclude from the Lipschitz condition that for all k, |xk + 1 − xk| ≤ L|xk − xk − 1|. Then

$|x^* - x_k| \le \dfrac{L}{1 - L}\,|x_k - x_{k-1}|$    (7.3)

This gives us a bound on the error of the kth approximant. It should be noted that if L is close to one, convergence is very slow. So, a good part of our effort will be concentrated on how to construct an iterative process which converges rapidly.
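As an illustration of estimate (7.3), the following snippet is a minimal sketch, assuming the example g(x) = cos(x) that is used later in this chapter and taking the reference root value from Listing 7.3; it compares the actual error of each approximation with the bound computed from two successive approximations.

% Sketch: check the error bound (7.3) for g(x) = cos(x) on I = [0, 1].
g  = @(x) cos(x);
L  = sin(1);                    % |g'(x)| = sin(x) <= sin(1) < 1 on [0, 1]
xs = 0.7390851332;              % reference root (see Listing 7.3)
x  = 0.5;                       % initial value x0
for k = 1:10
    x_new = g(x);
    bound = L/(1 - L) * abs(x_new - x);
    fprintf('k = %2d  error = %.3e  bound = %.3e\n', k, abs(x_new - xs), bound);
    x = x_new;
end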

There are many methods for solving an equation f(x) = 0. We will consider only the basic approaches. The iterative processes to be considered may be categorized as follows:

• One-point iteration:

$x_{k+1} = g(x_k)$    (7.4)

• Multi-point iteration:

$x_{k+1} = g\big(x_k,\; \beta_1(x_k),\; \ldots,\; \beta_n(x_k)\big)$    (7.5)

where β1(x), …, βn(x) are some functions.

• One-point iteration with memory:

$x_{k+1} = g(x_k,\; x_{k-1},\; \ldots,\; x_{k-n})$    (7.6)

Now, the term “rapid convergence” should be explained in more detail. To do this, we take the iterative process (7.4) as an example. To begin with, let us introduce the error of the kth approximation as ek = xk − x*, so that xk = x* + ek and xk + 1 = x* + ek + 1. After substituting these expressions into the iterative process (7.4), we can expand g(x* + ek) in powers of ek in the neighborhood of x* (assuming that all derivatives of g(x) up to order p are continuous):

$x^* + e_{k+1} = g(x^* + e_k) = g(x^*) + g'(x^*)\,e_k + \frac{g''(x^*)}{2!}\,e_k^2 + \cdots + \frac{g^{(p)}(\xi_k)}{p!}\,e_k^p,$

where $\xi_k$ lies between $x^*$ and $x^* + e_k$.

Taking into account that x* = g(x*) we arrive at

$e_{k+1} = g'(x^*)\,e_k + \frac{g''(x^*)}{2!}\,e_k^2 + \cdots + \frac{g^{(p)}(\xi_k)}{p!}\,e_k^p$    (7.7)

If g(n)(x*) = 0 for n = 1, …, p − 1 and g(p)(x*) ≠ 0, then the iterative function g is said to have order p, and it follows immediately from equation (7.7) that

$|e_{k+1}| \approx C\,|e_k|^p$    (7.8)

where p is called the order of the sequence {xk} and C is called the asymptotic error constant. When p = 1, an iterative sequence has linear convergence to x*, while with p > 1, convergence is superlinear. It is clear that if ek is sufficiently small, then with p > 1 an iterative sequence may converge very rapidly.
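The order p can also be estimated numerically from three consecutive errors, since (7.8) gives p ≈ ln(|ek + 2|/|ek + 1|)/ln(|ek + 1|/|ek|). The following snippet is a minimal sketch of this check, assuming the example x = cos(x) used later in this chapter and taking its root value as the reference.

% Sketch: estimate the order p of an iterative sequence from consecutive errors,
% using |e_{k+1}| ~ C|e_k|^p  =>  p ~ log(e_{k+2}/e_{k+1}) / log(e_{k+1}/e_k).
xs = 0.7390851332;              % reference root of cos(x) - x = 0
x  = 0.5;
e  = zeros(1, 8);
for k = 1:8                     % simple iteration x_{k+1} = cos(x_k)
    x    = cos(x);
    e(k) = abs(x - xs);
end
p = log(e(3:end)./e(2:end-1)) ./ log(e(2:end-1)./e(1:end-2));
disp(p)                         % values close to 1: linear convergence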

Next, we need to discuss how to estimate how closely the kth approximation approaches the exact solution x*. To do this, we can use inequality (7.3). If xk ∈ [x* − εp, x* + εp], then the following conditions are fulfilled for absolute and relative error estimation, respectively:

$\frac{L}{1-L}\,|x_k - x_{k-1}| \le \varepsilon_p, \qquad \frac{L}{1-L}\,\frac{|x_k - x_{k-1}|}{|x_k|} \le \varepsilon_p$    (7.9)

In addition to conditions (7.9), usually another condition must be satisfied

$|f(x_k)| \le \varepsilon_f$

This condition follows from the formulation of the problem. Therefore, an iterative process is finished when one of those conditions is fulfilled.
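In code, these termination tests are usually combined into one check. The following helper is a minimal sketch; the function name and the tolerance names are illustrative, and the factor L/(1 − L) appearing in (7.9) is omitted for simplicity.

function stop = converged(f, x_old, x_new, eps_abs, eps_rel, eps_f)
% Sketch of a combined stopping test: stop when the step is small in the
% absolute or relative sense, or when the residual |f(x_new)| is small.
dx   = abs(x_new - x_old);
stop = (dx <= eps_abs) || (dx <= eps_rel * abs(x_new)) || (abs(feval(f, x_new)) <= eps_f);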

7.1.1 One-point iterative processes

To begin with, we will consider the case when g′(x*) ≠ 0. Then, as follows from equation (7.7), ek + 1 ≈ g′(x*)ek, and with the proviso that

$|g'(x^*)| < 1$    (7.10)

convergence is linear. In practice, one cannot use condition (7.10), but we can require that the following condition be satisfied:

$|g'(x)| \le L < 1 \quad \text{for all } x \in I$    (7.11)

In this case, theoretical condition (7.10) is also satisfied. The simplest way to represent a source equation in form (7.1) is as follows:

$x = x + \alpha f(x) \equiv g(x)$    (7.12)

and the associated iterative scheme is

$x_{k+1} = x_k + \alpha f(x_k)$    (7.13)

This is the method of fixed-point iteration. In order that condition (7.11) is satisfied at every point in I, we should determine parameter α as follows:

$|1 + \alpha f'(x)| \le L < 1 \quad \text{for all } x \in I$

To be more specific, we can choose

image

The initial value x0 should be chosen from interval I.
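For the example f(x) = cos(x) − x used later in this chapter, the following snippet is a minimal sketch of how α might be chosen from a bound on f′(x) over I = [0, 1]; the choice α = 1/max|f′| is merely one convenient option, not the only one.

% Sketch: choose alpha so that |g'(x)| = |1 + alpha*f'(x)| < 1 on I = [0, 1]
% for f(x) = cos(x) - x, i.e. f'(x) = -sin(x) - 1.
df    = @(x) -sin(x) - 1;
xx    = linspace(0, 1, 1001);
M     = max(abs(df(xx)));              % max |f'(x)| on I, about 1.84
alpha = 1/M;                           % any 0 < alpha < 2/M gives |1 + alpha*f'| < 1 here
L     = max(abs(1 + alpha*df(xx)));    % resulting contraction constant, about 0.46
fprintf('alpha = %.3f, L = %.3f\n', alpha, L);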

Convergence of the method of fixed-point iteration is not very rapid, but there is a procedure to decrease the number of iterations using the same information. Expression (7.7) for method (7.13) can be written as

$e_{k+1} = g'(x^*)\,e_k + O(e_k^2)$

or

$x_{k+1} = x^* + R\,(x_k - x^*), \qquad R = g'(x^*) + O(e_k).$

Let us consider two consecutive iterations:

$x_{k+1} = x^* + R\,(x_k - x^*),$

$x_{k+2} = x^* + R\,(x_{k+1} - x^*).$

After eliminating R, we can express x* from those two expressions

$x^* \approx z_{k+1} = x_{k+2} - \dfrac{(x_{k+2} - x_{k+1})^2}{x_{k+2} - 2x_{k+1} + x_k}$    (7.14)

If we take zk + 1 as an approximation to x*, then the accuracy of this approximation is O((ek)2) – that is, the sequence {zk} has convergence of the second order, as opposed to the sequence {xk} which has linear convergence. The transformation {xk} → {zk} is called Aitken’s δ2-process.
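A minimal sketch of the δ2-process is given below, applied to the plain iterates of x = cos(x); the accelerated value is built from three consecutive iterates, which is the same elimination of R as above, and the reference root value is taken from Listing 7.3.

% Sketch: Aitken's delta^2-process applied to the fixed-point iterates of x = cos(x).
n = 10;
x = zeros(1, n);
x(1) = 0.5;
for k = 2:n
    x(k) = cos(x(k-1));             % plain fixed-point iterates
end
% Accelerated values z built from three consecutive iterates:
z  = x(3:end) - (x(3:end) - x(2:end-1)).^2 ./ (x(3:end) - 2*x(2:end-1) + x(1:end-2));
xs = 0.7390851332;                  % reference root (see Listing 7.3)
fprintf('plain error %.2e, accelerated error %.2e\n', abs(x(end) - xs), abs(z(end) - xs));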

There are many procedures to achieve superlinear convergence. Let the derivatives of f(x) up to order s be continuous in I. Then, in the neighborhood of xk ∈ I, f(x) can be represented as a polynomial using Taylor's formula (Figure 7.2)

image

Figure 7.2 Polynomial approximation of the function f(x) near the point xk.

$f(x) = P_s(x) + \frac{f^{(s)}(y_k)}{s!}\,(x - x_k)^s,$    (7.15)

$P_s(x) = f(x_k) + f'(x_k)(x - x_k) + \cdots + \frac{f^{(s-1)}(x_k)}{(s-1)!}\,(x - x_k)^{s-1},$

where yk lies in the interval determined by xk and x. The next approximation to x* can be found from the equation

$P_s(x_{k+1}) = 0$    (7.16)

By this means, expressing xk + 1 from equation (7.16), we obtain the transformation xk → xk + 1, and this defines some iterative scheme. It is easy to show that the iterative sequence generated by (7.15) and (7.16) has the order p = s.

Next, we need to discuss how to choose an initial value x0 which initiates a convergent iterative process. Let x* ∈ I, let the derivative f(s)(x) be continuous in I, and let f′(x)f(s)(x) ≠ 0 for all x ∈ I. When the conditions

image (7.17)

image (7.18)

are satisfied, and min(x*, xk) < xk + 1 < max(x*, xk), then the iterative sequence {xk} generated by (7.15) and (7.16) converges monotonically to x*.

Now, let us consider the particular methods which follow from general approach (7.15) and (7.16). When s = 2, equation (7.16) has the following form:

$f(x_k) + f'(x_k)\,(x_{k+1} - x_k) = 0.$

Solving this equation with respect to xk + 1, we obtain the following iterative scheme:

$x_{k+1} = x_k - \dfrac{f(x_k)}{f'(x_k)}$    (7.19)

with g(x) = x − f(x)/f′(x). This is the well-known Newton's method, and it has a simple geometrical interpretation. The function y(x) = f(xk) + f′(xk)(x − xk) is the tangent to the curve of f(x) at the point xk. The approximation xk + 1 is then the intersection point of this tangent with the x-axis. To choose an initial value x0 it is necessary to use condition (7.17) with s = 2.

When s = 3, equation (7.16) has the form

$f(x_k) + f'(x_k)\,(x_{k+1} - x_k) + \frac{f''(x_k)}{2}\,(x_{k+1} - x_k)^2 = 0$    (7.20)

Solving this equation with respect to xk + 1, we obtain

$x_{k+1} = x_k - \frac{f'(x_k)}{f''(x_k)} \pm \frac{\sqrt{[f'(x_k)]^2 - 2 f(x_k)\, f''(x_k)}}{f''(x_k)}$    (7.21)

To satisfy Theorem 7.1, the plus sign must be taken in equation (7.21) when f′(x) > 0 and the minus sign when f′(x) < 0. It should also be taken into account that the sum of the last two terms in expression (7.21) approaches zero near the root. This may result in a loss of accuracy. It is known that for the quadratic equation ax2 + bx + c = 0 the following relation is valid:

$x_{1,2} = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{2c}{-b \mp \sqrt{b^2 - 4ac}}.$

After applying this conversion to (7.21), we finally arrive at the following iterative scheme:

$x_{k+1} = x_k - \frac{2 f(x_k)}{f'(x_k) \pm \sqrt{[f'(x_k)]^2 - 2 f(x_k)\, f''(x_k)}}$    (7.22)

An initial value x0 should be chosen from condition (7.18) with s = 3.

The polynomial Ps(xk + 1) may be reduced to a first-degree polynomial without changing the order of the iterative sequence generated. Two possibilities to modify equation (7.20) follow:

1) replace one of the (xk + 1 − xk) factors in (xk + 1 − xk)2 by −f(xk)/f′(xk). After solving the modified equation with respect to (xk + 1 − xk), we obtain the following iterative scheme:

$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k) - \dfrac{f(x_k)\, f''(x_k)}{2 f'(x_k)}} = x_k - \frac{2 f(x_k)\, f'(x_k)}{2[f'(x_k)]^2 - f(x_k)\, f''(x_k)}$    (7.23)

This is the Halley method.

2) replace (xk + 1 − xk)2 by (f(xk)/f′(xk))2. Then, after solving the modified equation with respect to (xk + 1 − xk), we get

$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)} - \frac{f''(x_k)\,[f(x_k)]^2}{2\,[f'(x_k)]^3}$    (7.24)

These methods are more attractive in comparison to iterative scheme (7.22) because they have the same cubic convergence but computation of the square root is avoided.
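As an illustration, the following is a minimal sketch of the Halley iteration applied to the example f(x) = cos(x) − x used later in this chapter, with f′(x) = −sin(x) − 1 and f″(x) = −cos(x) supplied analytically; the update is the standard combined form of Halley's step shown in (7.23).

% Sketch: Halley's method (7.23) for f(x) = cos(x) - x.
f   = @(x) cos(x) - x;
df  = @(x) -sin(x) - 1;
d2f = @(x) -cos(x);
x   = 0;  tol = 1e-6;  dx = 1;
while abs(dx) > tol
    dx = -2*f(x)*df(x) / (2*df(x)^2 - f(x)*d2f(x));   % Halley update step
    x  = x + dx;
end
disp(x)        % approximately 0.7390851332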

The following function implements the fixed-point method to find the root of an equation f(x) = 0 that can be rewritten as x = g(x). It takes three input arguments: the first is a function handle to g(x), the iterative function associated with the f(x) whose root we want to find; the second is the initial guess for the root; and the last is the error tolerance of the search, as discussed previously. The code listing is as follows.

Listing 7.1

function y = fixedpoint(g, x0, tol)
% fixedpoint function finds the root of a given function
% f(x) = 0 with the initial guess x0 and accuracy of tol
% such that the equation f(x) = 0 can be written as
% x = g(x)
% input:  g   - input function such that x = g(x)
%         x0  - initial guess value of root
%         tol - error tolerance
% output: y   - root value
y_old = x0;
y_new = feval(g, x0);
i = 0;
while (abs(y_new - y_old) > tol)
    i = i + 1;
    y_old = y_new;
    y_new = feval(g, y_old);
    s = sprintf('Iteration = %d Approximate root = %5.10f', i, y_new);
    disp(s);
end;
y = y_new;

To find the root of the equation f(x) = cos(x) − x = 0, we first rewrite it as x = g(x) = cos(x) and then implement the following function.

Listing 7.2

function y = g(x)
% x = cos(x)
y = cos(x);

Then we will call the fixedpoint function as follows.

Listing 7.3

>> y = fixedpoint(@g, 0, 1e-6)

Iteration = 1 Approximate root = 0.5403023059

….

Iteration = 34 Approximate root = 0.7390855264

y =

0.73908552636192

We do not show all of the results in order to save space. It took 34 iterations to reach the approximate solution within the desired error tolerance. We observed that even with initial values chosen closer to the root, the performance of this method did not improve appreciably.

The following function implements the controlled fixed-point iterative method, a variation of the fixed-point method. The signature of the function remains almost the same except for the addition of the control parameter alpha. We do not need to transform the equation from f(x) = 0 to x = g(x), so the function handle in the first argument should point to the m-script implementation of f(x) rather than g(x). The code is listed in the following.

Listing 7.4

function y = cfixedpoint(f, x0, alpha, tol)
% cfixedpoint function finds the root of a given function
% f(x) = 0 with the initial guess x0 and accuracy of tol
% such that the iteration is controlled using parameter alpha
% input:  f     - input function such that f(x) = 0
%         x0    - initial guess value of root
%         alpha - control parameter
%         tol   - error tolerance
% output: y     - root value
y_old = x0;
y_new = y_old + alpha * feval(f, x0);
i = 0;
while (abs(y_new - y_old) > tol)
    i = i + 1;
    y_old = y_new;
    y_new = y_old + alpha * feval(f, y_old);
    s = sprintf('Iteration = %d Approximate root = %5.10f', i, y_new);
    disp(s);
end;
y = y_new;

We apply this function on the same problem, i.e.,

$f(x) = \cos(x) - x = 0.$

First, we should implement this function using m-script as given in the following listing.

Listing 7.5

function y = f(x)
% f(x) = cos(x) - x = 0
y = cos(x) - x;

Then, to apply the controlled fixed-point method on this function, we will make the following function call from the interpreter.

Listing 7.6

>> y = cfixedpoint(@f, 0, 0.5, 1e-6)

Iteration = 1 Approximate root = 0.7379893799

Iteration = 2 Approximate root = 0.7389060909

Iteration = 3 Approximate root = 0.7390559087

Iteration = 4 Approximate root = 0.7390803638

Iteration = 5 Approximate root = 0.7390843549

Iteration = 6 Approximate root = 0.7390850062

y =

0.73908500619363

The result obtained agrees with that of the previous section to within the prescribed tolerance, but it takes only 6 iterations to solve the same problem. The controlled fixed-point method is therefore a substantial improvement in performance over the plain fixed-point method.

7.1.2 Multi-point iterative processes

To construct an iterative process with cubic convergence using one-point iteration, we need to employ the higher-order derivatives of f(x). This may sometimes result in much computational effort. Another way to achieve cubic convergence is to use a multi-point iterative scheme (see (7.5)). One of the procedures to construct such a scheme is based on superposition. Let us calculate the next approximation xk + 1 in two stages (note that the first formula represents Newton's method):

$y_k = x_k - \frac{f(x_k)}{f'(x_k)}, \qquad x_{k+1} = y_k - \frac{f(y_k)}{f'(x_k)}.$

A combination of these two formulae results in the following iterative scheme:

$x_{k+1} = x_k - \frac{f(x_k) + f\!\left(x_k - f(x_k)/f'(x_k)\right)}{f'(x_k)}$    (7.25)

The order of the iterative sequence is three, that is, ek + 1 = C(ek)3. In some cases, a multi-point iterative scheme is more efficient in comparison to one-point iteration. Only the function itself and its first derivative are employed, and again we have rapid convergence. The construction procedure for iterative scheme (7.25) enables us to adopt the same condition for choosing the initial values as for Newton's method.

By analogy with scheme (7.25), we can construct another iterative scheme:

image

and after the combination of these expressions we arrive at

image

This iterative scheme also produces the sequence with the order of convergence p = 3.

The following function implements Newton's method of root approximation. The first argument points to the implementation of the function f(x), whereas the second argument points to the implementation of its first derivative, f′(x). The third argument, x0, is the initial guess of the root, and the last argument, tol, is the error tolerance.

Listing 7.7

function y = newton(f, df, x0, tol)
% newton method finds the root of a given function f(x) = 0
% input:  f   - input function such that f(x) = 0
%         df  - first derivative of the function f(x)
%         x0  - initial guess value of root
%         tol - error tolerance
% output: y   - root value
y_old = x0;
y_new = y_old - feval(f, y_old) / feval(df, y_old);
i = 0;
while (abs(y_new - y_old) > tol)
    i = i + 1;
    y_old = y_new;
    y_new = y_old - feval(f, y_old) / feval(df, y_old);
    s = sprintf('Iteration = %d Approximate root = %5.10f', i, y_new);
    disp(s);
end;
y = y_new;

We use the same example as in the previous sections.

$f(x) = \cos(x) - x = 0.$

The implementation of this function was shown previously. The derivative of this function is as follows.

$f'(x) = -\sin(x) - 1.$

And the implementation of the derivative is as follows.

Listing 7.8

function y = df(x)
% df(x) = -sin(x) - 1
y = -sin(x) - 1;

Then, to apply Newton’s method on this function, we will make the following function call from the interpreter.

Listing 7.9

>> y = newton(@f, @df, 0, 1e-6)

Iteration = 1 Approximate root = 0.7503638678

Iteration = 2 Approximate root = 0.7391128909

Iteration = 3 Approximate root = 0.7390851334

Iteration = 4 Approximate root = 0.7390851332

y =

0.73908513321516

Once again, the result obtained is the same as with the previous methods, and it takes only 4 iterations to reach the solution. This is the best method so far. The only disadvantage is that the user also has to write an implementation of the derivative of the function. Readers could modify the above implementation and approximate the derivative numerically, as sketched below; the function will take somewhat longer to run, but the extra derivative implementation is no longer needed.
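One possible modification along these lines is sketched here: the derivative is replaced by a central-difference approximation, so only f(x) has to be supplied. The function name newton_numeric and the step h are illustrative choices, not part of the original listings.

function y = newton_numeric(f, x0, tol)
% Sketch: Newton's method with a numerically approximated derivative.
% input:  f   - input function such that f(x) = 0
%         x0  - initial guess value of root
%         tol - error tolerance
% output: y   - root value
h     = 1e-6;                                           % central-difference step
dfnum = @(x) (feval(f, x + h) - feval(f, x - h)) / (2*h);
y_old = x0;
y_new = y_old - feval(f, y_old) / dfnum(y_old);
while (abs(y_new - y_old) > tol)
    y_old = y_new;
    y_new = y_old - feval(f, y_old) / dfnum(y_old);
end
y = y_new;

It could be called in the same way as before, e.g. y = newton_numeric(@f, 0, 1e-6).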

The multi-point method is implemented in the following. The function takes the same arguments as in Newton’s method.

Listing 7.10

function x = multipoint(f, df, x0, tol)
% multipoint method finds the root of a given function f(x) = 0
% input:  f   - input function such that f(x) = 0
%         df  - first derivative of the function f(x)
%         x0  - initial guess value of root
%         tol - error tolerance
% output: x   - root value
x_old = x0;
y = x_old - feval(f, x_old) / feval(df, x_old);
x_new = y - feval(f, y) / feval(df, x_old);
i = 0;
while (abs(x_new - x_old) > tol)
    i = i + 1;
    x_old = x_new;
    y = x_old - feval(f, x_old) / feval(df, x_old);
    x_new = y - feval(f, y) / feval(df, x_old);
    s = sprintf('Iteration = %d Approximate root = %5.10f', i, x_new);
    disp(s);
end;
x = x_new;

We use the same example as in the previous sections.

$f(x) = \cos(x) - x = 0.$

To apply the multipoint method, we will make the following function call from the interpreter.

Listing 7.11

>> y = multipoint(@f, @df, 0, 1e-6)

Iteration = 1 Approximate root = 0.7379315267

Iteration = 2 Approximate root = 0.7390851331

Iteration = 3 Approximate root = 0.7390851332

y =

 0.73908513321516

This method proves faster than Newton’s method. The result is the same as before and it takes even fewer iterations to reach the solution.

7.1.3 One-point iterative processes with memory

Despite the power of methods with superlinear convergence, their use of the derivatives f′(x) and f″(x) can be troublesome. It often requires much more effort to evaluate f′(x) and f″(x) than f(x). In addition, many applications involve functions that do not have closed-form expressions for their derivatives.

In this section we consider some modifications to one-point iterations that avoid the exact evaluation of derivatives. The first derivative of f(x) at the point xk may be approximately evaluated in the following manner:

$f'(x_k) \approx \frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}.$

Substituting this expression in (7.19) we obtain

$x_{k+1} = x_k - f(x_k)\,\frac{x_k - x_{k-1}}{f(x_k) - f(x_{k-1})}$    (7.26)

This is the secant method. Convergence of this iterative scheme is described by expression (7.8) with some constant C and p ≈ 1.62, which is slower than Newton's method (p = 2). However, the secant method requires just one evaluation of f(xk) per iteration (we can use the value f(xk − 1) from the previous iteration). To start an iterative process we need to choose two initial values x0 and x1. To define these initial values, we can use the same condition we used for Newton's method, except that the condition should be applied to x0 and x1:

$f(x_0)\, f''(x_0) > 0, \qquad f(x_1)\, f''(x_1) > 0$    (7.27)

The following implementation of the secant method requires two initial guesses for the root, unlike the previous methods.

Listing 7.12

function y = secant(f, df, x0, x1, tol)
% secant method finds the root of a given function f(x) = 0
% input:  f   - input function such that f(x) = 0
%         df  - first derivative of f(x) (not used by this method)
%         x0  - first initial guess value of root
%         x1  - second initial guess value of root
%         tol - error tolerance
% output: y   - root value
y_old_old = x0;
y_old = x1;
y_new = y_old - feval(f, y_old) * (y_old - y_old_old) / (feval(f, y_old) - feval(f, y_old_old));
i = 0;
while (abs(y_new - y_old) > tol)
    i = i + 1;
    y_old_old = y_old;
    y_old = y_new;
    y_new = y_old - feval(f, y_old) * (y_old - y_old_old) / (feval(f, y_old) - feval(f, y_old_old));
    s = sprintf('Iteration = %d Approximate root = %5.10f', i, y_new);
    disp(s);
end;
y = y_new;

We use the same example as in the previous sections.

$f(x) = \cos(x) - x = 0.$

The implementations of these functions were shown previously. To apply the secant method, we will make the following function call from the interpreter.

Listing 7.13

>> y = secant(@f, @df, 0, 1, 1e-6)

Iteration = 1 Approximate root = 0.7362989976

Iteration = 2 Approximate root = 0.7391193619

Iteration = 3 Approximate root = 0.7390851121

Iteration = 4 Approximate root = 0.7390851332

y =

0.73908513321500

The secant method took only 4 iterations to find the root. Its performance on this example is comparable to that of the multi-point method.

7.2 Exercises

Find the root of the equation f(x) = 0. If the equation has more than one root, find the root with the smallest magnitude. Calculate the root to a prescribed tolerance using one of the iterative methods. To solve the problem, find an interval that contains the root and an initial guess that initiates a convergent iterative process. Suggested iterative methods are

(1) simple iteration with Aitken’s δ2-process,

(2) Newton’s method (7.19),

(3) one-point iteration (7.22),

(4) one-point iteration (7.23),

(5) one-point iteration (7.24),

(6) multi-point iteration (7.25),

(7) secant method (7.26)

Set of equations to solve:

1. 

image

2. 

image

3. 

image

4. 

image

5. 

image

6. 

image

7. 

image

8. 

image

9. 

image

10. 

image

11. 

image

12. 

image

13. 

image

14. 

image

15. 

image

16. 

image

17. 

image

18. 

image

19. 

image

20. 

image

21. 

image

22. 

image

23. 

image

24. 

image

25. 

image

26. 

image

27. 

image

28. 

image

29. 

image

30. 

image
