In Section 3.4, we studied specific techniques that allow us to solve systems of linear equations in the form $Ax = b$, where $A$ is an $m \times n$ matrix and $b$ is an $m \times 1$ vector. Such systems often arise in applications to the real world. The coefficients in the system are frequently obtained from experimental data, and, in many cases, both $m$ and $n$ are so large that a computer must be used in the calculation of the solution. Thus two types of errors must be considered. First, experimental errors arise in the collection of data since no instruments can provide completely accurate measurements. Second, computers introduce roundoff errors. One might intuitively feel that small relative changes in the coefficients of the system cause small relative errors in the solution. A system that has this property is called well-conditioned; otherwise, the system is called ill-conditioned.
We now consider several examples of these types of errors, concentrating primarily on changes in b rather than on changes in the entries of A. In addition, we assume that A is a square, complex (or real), invertible matrix since this is the case most frequently encountered in applications.
Consider the system
$$x_1 + x_2 = 5$$
$$x_1 - x_2 = 1.$$
The solution to this system is
$$x = \begin{pmatrix} 3 \\ 2 \end{pmatrix}.$$
Now suppose that we change the system somewhat and consider the new system
$$x_1 + x_2 = 5$$
$$x_1 - x_2 = 1.0001.$$
This modified system has the solution
$$x' = \begin{pmatrix} 3.00005 \\ 1.99995 \end{pmatrix}.$$
We see that a change of $10^{-4}$ in one coefficient has caused a change of less than $10^{-4}$ in each coordinate of the new solution. More generally, the system
$$x_1 + x_2 = 5$$
$$x_1 - x_2 = 1 + \delta$$
has the solution
$$x' = \begin{pmatrix} 3 + \delta/2 \\ 2 - \delta/2 \end{pmatrix}.$$
Hence small changes in $b$ introduce small changes in the solution. Of course, we are really interested in relative changes, since a change in the solution of, say, 10 is considered large if the original solution has small magnitude, but small if the original solution is of a much larger order of magnitude.
We use the notation $\delta b$ to denote the vector $b' - b$, where $b$ is the vector in the original system and $b'$ is the vector in the modified system. Thus we have
$$\delta b = \begin{pmatrix} 0 \\ \delta \end{pmatrix}.$$
We now define the relative change in $b$ to be the scalar $\|\delta b\|/\|b\|$, where $\|\cdot\|$ denotes the standard norm on $\mathbb{C}^n$ (or $\mathbb{R}^n$); that is, $\|b\| = \sqrt{\langle b, b\rangle}$. Most of what follows, however, is true for any norm. Similar definitions hold for the relative change in $x$. In this example,
$$\frac{\|\delta b\|}{\|b\|} = \frac{|\delta|}{\sqrt{26}} \quad\text{and}\quad \frac{\|\delta x\|}{\|x\|} = \frac{|\delta|/\sqrt{2}}{\sqrt{13}} = \frac{|\delta|}{\sqrt{26}}.$$
Thus the relative change in x equals, coincidentally, the relative change in b; so the system is well-conditioned.
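This computation is easy to check numerically. The following sketch (using NumPy, which is of course not part of the text) solves the original and perturbed systems with $\delta = 10^{-4}$ and compares the two relative changes:

```python
import numpy as np

# The system x1 + x2 = 5, x1 - x2 = 1, with b perturbed by 0.0001
# in its second coordinate.
A = np.array([[1.0,  1.0],
              [1.0, -1.0]])
b  = np.array([5.0, 1.0])
db = np.array([0.0, 0.0001])

x  = np.linalg.solve(A, b)            # original solution (3, 2)
dx = np.linalg.solve(A, b + db) - x   # change in the solution

rel_b = np.linalg.norm(db) / np.linalg.norm(b)
rel_x = np.linalg.norm(dx) / np.linalg.norm(x)
print(rel_b, rel_x)                   # the two relative changes agree
```

Both printed values equal $10^{-4}/\sqrt{26}$, confirming that the relative changes in $b$ and in $x$ coincide for this system.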
Consider the system
$$x_1 + x_2 = 3$$
$$x_1 + 1.00001x_2 = 3.00001,$$
which has
$$x = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$$
as its solution. The solution to the related system
$$x_1 + x_2 = 3$$
$$x_1 + 1.00001x_2 = 3.00003$$
is
$$x' = \begin{pmatrix} 0 \\ 3 \end{pmatrix}.$$
Hence,
$$\frac{\|\delta x\|}{\|x\|} = \frac{2\sqrt{2}}{\sqrt{5}} \approx 1.26,$$
while
$$\frac{\|\delta b\|}{\|b\|} = \frac{0.00002}{\sqrt{9 + (3.00001)^2}} \approx 4.7 \times 10^{-6}.$$
Thus the relative change in $x$ is at least $2 \times 10^5$ times the relative change in $b$! This system is very ill-conditioned. Observe that the lines defined by the two equations are nearly coincident. So a small change in either line could greatly alter the point of intersection, that is, the solution to the system.
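The same numerical check can be run on an ill-conditioned system. The sketch below (NumPy again, an illustration rather than part of the text) solves the system $x_1 + x_2 = 3$, $x_1 + 1.00001x_2 = 3.00001$ and its perturbation, and computes the ratio of relative changes:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.00001]])
b  = np.array([3.0, 3.00001])
b2 = np.array([3.0, 3.00003])   # perturbed right-hand side

x  = np.linalg.solve(A, b)      # approximately (2, 1)
x2 = np.linalg.solve(A, b2)     # approximately (0, 3)

rel_b = np.linalg.norm(b2 - b) / np.linalg.norm(b)
rel_x = np.linalg.norm(x2 - x) / np.linalg.norm(x)
print(rel_x / rel_b)            # on the order of 10**5
```

A change of $2 \times 10^{-5}$ in one coordinate of $b$ moves the solution from one corner of the plane to another.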
To apply the full strength of the theory of self-adjoint matrices to the study of conditioning, we need the notion of the norm of a matrix. (See Exercises 26-30 of Section 6.1 for further results about norms.)
Let $A$ be an $n \times n$ complex (or real) matrix. Define the norm of $A$ by
$$\|A\| = \max_{x \ne 0} \frac{\|Ax\|}{\|x\|},$$
where $x \in \mathbb{C}^n$ or $x \in \mathbb{R}^n$, respectively.
Intuitively, $\|A\|$ represents the maximum magnification of a vector by the matrix $A$. The question of whether or not this maximum exists, as well as the problem of how to compute it, can be answered by the use of the so-called Rayleigh quotient.
Definition. Let $B$ be an $n \times n$ self-adjoint matrix. The Rayleigh quotient for $x \ne 0$ is defined to be the scalar
$$R(x) = \frac{\langle Bx, x\rangle}{\|x\|^2}.$$
The following result characterizes the extreme values of the Rayleigh quotient of a self-adjoint matrix.
Theorem 6.44. For a self-adjoint matrix $B \in M_{n \times n}(F)$, we have that $\max_{x \ne 0} R(x)$ is the largest eigenvalue of $B$ and $\min_{x \ne 0} R(x)$ is the smallest eigenvalue of $B$.
Proof. By Theorems 6.19 (p. 381) and 6.20 (p. 381), we may choose an orthonormal basis $\{v_1, v_2, \ldots, v_n\}$ of eigenvectors of $B$ such that $Bv_i = \lambda_i v_i$ for $1 \le i \le n$, where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$. (Recall that by the lemma to Theorem 6.17, p. 370, the eigenvalues of $B$ are real.) Now, for $x \ne 0$, there exist scalars $a_1, a_2, \ldots, a_n$ such that
$$x = \sum_{i=1}^{n} a_i v_i.$$
Hence
$$R(x) = \frac{\langle Bx, x\rangle}{\|x\|^2}
= \frac{\left\langle \sum_{i=1}^{n} \lambda_i a_i v_i, \; \sum_{j=1}^{n} a_j v_j \right\rangle}{\|x\|^2}
= \frac{\sum_{i=1}^{n} \lambda_i |a_i|^2}{\sum_{i=1}^{n} |a_i|^2}
\le \lambda_1 \cdot \frac{\sum_{i=1}^{n} |a_i|^2}{\sum_{i=1}^{n} |a_i|^2} = \lambda_1.$$
It is easy to see that $R(v_1) = \lambda_1$, so we have demonstrated the first half of the theorem. The second half is proved similarly.
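This theorem lends itself to a numerical illustration (a NumPy sketch, not part of the text): for a random real symmetric matrix, sampled Rayleigh quotients always lie between the smallest and largest eigenvalues, and the maximum is attained at a corresponding eigenvector.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
B = M + M.T                      # a real symmetric (hence self-adjoint) matrix

# Eigenvalues in increasing order; columns of V are orthonormal eigenvectors.
w, V = np.linalg.eigh(B)

def rayleigh(x):
    # Rayleigh quotient R(x) = <Bx, x> / ||x||^2
    return (x @ B @ x) / (x @ x)

samples = [rayleigh(rng.standard_normal(4)) for _ in range(1000)]

print(w[0] - 1e-9 <= min(samples) and max(samples) <= w[-1] + 1e-9)  # True
print(np.isclose(rayleigh(V[:, -1]), w[-1]))  # True: the max is attained at an eigenvector
```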
Corollary 1. For any square matrix $A$, $\|A\|$ is finite and, in fact, equals $\sqrt{\lambda}$, where $\lambda$ is the largest eigenvalue of $A^*A$.
Proof. Let $B$ be the self-adjoint matrix $A^*A$, and let $\lambda$ be the largest eigenvalue of $B$. Since, for $x \ne 0$,
$$0 \le \frac{\|Ax\|^2}{\|x\|^2} = \frac{\langle Ax, Ax\rangle}{\|x\|^2} = \frac{\langle A^*Ax, x\rangle}{\|x\|^2} = R(x),$$
it follows from Theorem 6.44 that $\|A\|^2 = \max_{x \ne 0} R(x) = \lambda$. Therefore $\|A\| = \sqrt{\lambda}$.
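Corollary 1 can likewise be verified numerically. The sketch below (illustrative NumPy code) compares $\sqrt{\lambda}$, where $\lambda$ is the largest eigenvalue of $A^*A$, with the spectral norm that NumPy computes directly; their agreement is exactly the content of the corollary.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))

# Largest eigenvalue of A*A (A is real, so A* = A^T).
lam = np.linalg.eigvalsh(A.T @ A)[-1]

# numpy's matrix 2-norm is the quantity ||A|| defined in the text.
print(np.isclose(np.sqrt(lam), np.linalg.norm(A, 2)))  # True
```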
Observe that the proof of Corollary 1 shows that all the eigenvalues of A*A are nonnegative. For our next result, we need the following lemma.
Lemma. For any square matrix $A$, $\lambda$ is an eigenvalue of $A^*A$ if and only if $\lambda$ is an eigenvalue of $AA^*$.
Proof. Let $\lambda$ be an eigenvalue of $A^*A$. If $\lambda = 0$, then $A^*A$ is not invertible. Hence $A$ and $A^*$ are not invertible, so that $AA^*$ is not invertible and $\lambda = 0$ is also an eigenvalue of $AA^*$. The proof of the converse is similar.
Suppose now that $\lambda \ne 0$. Then there exists $x \ne 0$ such that $A^*Ax = \lambda x$. Apply $A$ to both sides to obtain $AA^*(Ax) = \lambda(Ax)$. Since $Ax \ne 0$ (lest $A^*Ax = \lambda x = 0$), we have that $\lambda$ is an eigenvalue of $AA^*$. The proof of the converse is left as an exercise.
Corollary 2. Let $A$ be an invertible matrix. Then $\|A^{-1}\| = \dfrac{1}{\sqrt{\lambda}}$, where $\lambda$ is the smallest eigenvalue of $A^*A$.
Proof. Recall that $\lambda$ is an eigenvalue of an invertible matrix if and only if $\lambda^{-1}$ is an eigenvalue of its inverse.
Now let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ be the eigenvalues of $A^*A$, which by the lemma are the eigenvalues of $AA^*$. Then, by Corollary 1, $\|A^{-1}\|^2$ equals the largest eigenvalue of $(A^{-1})^*A^{-1} = (AA^*)^{-1}$, which equals $\dfrac{1}{\lambda_n}$. Hence $\|A^{-1}\| = \dfrac{1}{\sqrt{\lambda_n}}$.
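Corollary 2 admits the same kind of numerical check. For a fixed invertible matrix (the particular matrix below is my choice for illustration; any invertible matrix works), $\|A^{-1}\|$ agrees with $1/\sqrt{\lambda_n}$ computed from the smallest eigenvalue of $A^*A$:

```python
import numpy as np

# A fixed invertible matrix (det = 5), chosen only for illustration.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 2.0]])

lam_min = np.linalg.eigvalsh(A.T @ A)[0]        # smallest eigenvalue of A*A
inv_norm = np.linalg.norm(np.linalg.inv(A), 2)  # ||A^{-1}|| in the 2-norm
print(np.isclose(inv_norm, 1.0 / np.sqrt(lam_min)))  # True
```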
For many applications, it is only the largest and smallest eigenvalues that are of interest. For example, in the case of vibration problems, the smallest eigenvalue represents the lowest frequency at which vibrations can occur.
We will see the role of both of these eigenvalues in our study of conditioning.
Let
Then
The eigenvalues of $B$ are 3, 3, and 0. Therefore $\|A\| = \sqrt{3}$. For any
$$x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \ne 0,$$
we may compute $R(x)$ for the matrix $B$ as
$$R(x) = \frac{\langle Bx, x\rangle}{\|x\|^2}.$$
Now that we know $\|A\|$ exists for every square matrix $A$, we can make use of the inequality $\|Ax\| \le \|A\| \cdot \|x\|$, which holds for every $x$.
Assume in what follows that $A$ is invertible, that $b \ne 0$, and that $Ax = b$. For a given $\delta b$, let $\delta x$ be the vector that satisfies $A(x + \delta x) = b + \delta b$. Then $A(\delta x) = \delta b$, and so $\delta x = A^{-1}(\delta b)$. Hence
$$\|b\| = \|Ax\| \le \|A\| \cdot \|x\| \quad\text{and}\quad \|\delta x\| = \|A^{-1}(\delta b)\| \le \|A^{-1}\| \cdot \|\delta b\|.$$
Thus
$$\frac{\|\delta x\|}{\|x\|} \le \frac{\|A^{-1}\| \cdot \|\delta b\|}{\|b\|/\|A\|} = \|A\| \cdot \|A^{-1}\| \cdot \frac{\|\delta b\|}{\|b\|}.$$
Similarly (see Exercise 9),
$$\frac{1}{\|A\| \cdot \|A^{-1}\|} \cdot \frac{\|\delta b\|}{\|b\|} \le \frac{\|\delta x\|}{\|x\|}.$$
The number $\|A\| \cdot \|A^{-1}\|$ is called the condition number of $A$ and is denoted cond($A$). We summarize these results in the following theorem.
Theorem 6.45. For the system $Ax = b$, where $A$ is invertible and $b \ne 0$, the following statements are true.
(a) We have
$$\frac{1}{\operatorname{cond}(A)} \cdot \frac{\|\delta b\|}{\|b\|} \le \frac{\|\delta x\|}{\|x\|} \le \operatorname{cond}(A) \cdot \frac{\|\delta b\|}{\|b\|}.$$
(b) If $\lambda_1$ and $\lambda_n$ are the largest and smallest eigenvalues, respectively, of $A^*A$, then $\operatorname{cond}(A) = \sqrt{\lambda_1/\lambda_n}$.
Proof. Statement (a) follows from the previous inequalities, and (b) follows from Corollaries 1 and 2 to Theorem 6.44.
It should be noted that the definition of cond($A$) depends on how the norm of $A$ is defined. There are many reasonable ways of defining the norm of a matrix. In fact, the only property needed to establish Theorem 6.45(a) and the two displayed inequalities preceding it is that $\|Ax\| \le \|A\| \cdot \|x\|$ for all $x$.
It is clear from Theorem 6.45(a) that $\operatorname{cond}(A) \ge 1$. It is left as an exercise to prove that $\operatorname{cond}(A) = 1$ if and only if $A$ is a scalar multiple of a unitary or orthogonal matrix. Moreover, it can be shown with some work that equality can be obtained in (a) by an appropriate choice of $b$ and $\delta b$.
We can see immediately from (a) that if cond(A) is close to 1, then a small relative error in b forces a small relative error in x. If cond(A) is large, however, then the relative error in x may be small even though the relative error in b is large, or the relative error in x may be large even though the relative error in b is small! In short, cond(A) merely indicates the potential for large relative errors.
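The bounds of Theorem 6.45(a) can be observed numerically. The sketch below (illustrative NumPy code) uses an ill-conditioned $2 \times 2$ system; the computed relative error in $x$ falls inside the interval that cond($A$) predicts:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.00001]])
cond = np.linalg.cond(A, 2)          # cond(A) = ||A|| * ||A^{-1}|| in the 2-norm

b  = np.array([3.0, 3.00001])
db = np.array([0.0, 0.00002])        # a tiny perturbation of b

x  = np.linalg.solve(A, b)
dx = np.linalg.solve(A, b + db) - x

rel_b = np.linalg.norm(db) / np.linalg.norm(b)
rel_x = np.linalg.norm(dx) / np.linalg.norm(x)

# Theorem 6.45(a): rel_b / cond <= rel_x <= cond * rel_b
print(rel_b / cond <= rel_x <= cond * rel_b)   # True
```

Here cond($A$) is on the order of $10^5$, so the interval is enormous; the observed relative error sits near its upper end, illustrating that a large condition number signals only the potential for large error.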
We have so far considered only errors in the vector $b$. If there is an error $\delta A$ in the coefficient matrix of the system $Ax = b$, the situation is more complicated. For example, $A + \delta A$ may fail to be invertible. But under the appropriate assumptions, it can be shown that a bound for the relative error in $x$ can be given in terms of cond($A$). For example, Charles Cullen (Charles G. Cullen, An Introduction to Numerical Linear Algebra, PWS Publishing Co., Boston, 1994, p. 60) shows that if $A + \delta A$ is invertible and $x'$ is the solution of $(A + \delta A)x' = b$, then
$$\frac{\|x - x'\|}{\|x'\|} \le \operatorname{cond}(A) \cdot \frac{\|\delta A\|}{\|A\|}.$$
It should be mentioned that, in practice, one never computes cond($A$) from its definition, for it would be an unnecessary waste of time to compute $A^{-1}$ merely to determine its norm. In fact, if a computer is used to find $A^{-1}$, the computed inverse of $A$ in all likelihood only approximates $A^{-1}$, and the error in the computed inverse is affected by the size of cond($A$). So we are caught in a vicious circle! There are, however, some situations in which a usable approximation of cond($A$) can be found. Thus, in most cases, the estimate of the relative error in $x$ is based on an estimate of cond($A$).
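Consistent with this remark, numerical libraries obtain the condition number from the singular values of $A$ rather than from a computed inverse. For instance (an illustrative NumPy sketch):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.00001]])

# cond(A) in the 2-norm is the ratio of extreme singular values,
# obtainable without ever forming A^{-1}.
s = np.linalg.svd(A, compute_uv=False)   # singular values, largest first
print(np.isclose(s[0] / s[-1], np.linalg.cond(A, 2)))   # True
```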
Exercises

1. Label the following statements as true or false.
(a) If is well-conditioned, then cond(A) is small.
(b) If cond(A) is large, then is ill-conditioned.
(c) If cond(A) is small, then is well-conditioned.
(d) The norm of A equals the Rayleigh quotient.
(e) The norm of A always equals the largest eigenvalue of A.
2. Compute the norms of the following matrices.
(a)
(b)
(c)
3. Prove that if $B$ is symmetric, then $\|B\|$ is the largest eigenvalue of $B$.
4. Let $A$ and $b$ be as follows:
The eigenvalues of $A$ are approximately 84.74, 0.2007, and 0.0588.
(a) Approximate $\|A\|$, $\|A^{-1}\|$, and cond($A$). (Note Exercise 3.)
(b) Suppose that we have vectors $x$ and $\tilde{x}$ such that $Ax = b$ and $\|b - A\tilde{x}\| \le 0.001$. Use (a) to determine upper bounds for $\|\tilde{x} - x\|$ (the absolute error) and $\|\tilde{x} - x\|/\|x\|$ (the relative error).
5. Suppose that $x$ is the actual solution of $Ax = b$ and that a computer arrives at an approximate solution $\tilde{x}$. If $\operatorname{cond}(A) = 100$, $\|b\| = 1$, and $\|b - A\tilde{x}\| = 0.1$, obtain upper and lower bounds for $\|x - \tilde{x}\|/\|x\|$.
6. Let
Compute
7. Let $B$ be an invertible symmetric matrix. Prove that $\|B^{-1}\|^{-1}$ equals the smallest eigenvalue of $B$.
8. Prove that if $\lambda$ is an eigenvalue of $AA^*$, then $\lambda$ is an eigenvalue of $A^*A$. This completes the proof of the lemma to Corollary 2 to Theorem 6.44.
9. Prove that if $A$ is an invertible matrix and $Ax = b \ne 0$, then
$$\frac{1}{\|A\| \cdot \|A^{-1}\|} \cdot \frac{\|\delta b\|}{\|b\|} \le \frac{\|\delta x\|}{\|x\|}.$$
10. Prove the left inequality of (a) in Theorem 6.45.
11. Prove that $\operatorname{cond}(A) = 1$ if and only if $A$ is a scalar multiple of a unitary or orthogonal matrix.
12. (a) Let $A$ and $B$ be square matrices that are unitarily equivalent. Prove that $\|A\| = \|B\|$.
(b) Let $T$ be a linear operator on a finite-dimensional inner product space $V$. Define
$$\|T\| = \max_{x \ne 0} \frac{\|T(x)\|}{\|x\|}.$$
Prove that $\|T\| = \|[T]_\beta\|$, where $\beta$ is any orthonormal basis for $V$.
(c) Let $V$ be an infinite-dimensional inner product space with an orthonormal basis $\{v_1, v_2, \ldots\}$. Let $T$ be the linear operator on $V$ such that $T(v_k) = k v_k$. Prove that $\|T\|$ (defined in (b)) does not exist.
The next exercise assumes the definitions of singular value and pseudoinverse and the results of Section 6.7.
13. Let $A$ be an $n \times n$ matrix of rank $r$ with the nonzero singular values $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r$. Prove each of the following results.
(a) $\|A\| = \sigma_1$.
(b) $\|A^{\dagger}\| = \dfrac{1}{\sigma_r}$.
(c) If $A$ is invertible (and hence $r = n$), then $\operatorname{cond}(A) = \dfrac{\sigma_1}{\sigma_n}$.