We introduced the LU-decomposition as a method for solving a linear system of equations. However, until now we have only decomposed the matrix A into L and U. Obviously, we now want to take the second step and see how this matrix decomposition helps us in solving a linear system of equations.
For this we return to the system of Eq. 25.2 from which we also took the matrix. The right-hand side is
We will now use the function DoLUDecomposition in combination with Eq. 25.20 and Eq. 25.21 for solving the linear system of equations
described by Eq. 25.18 and Eq. 25.19.
Calculating y. We first solve Eq. 25.19 by applying Eq. 25.20 to find y by forward-substitution. For this, we find the following equations
Calculating x. Now that we have y, we proceed by finding x by applying Eq. 25.21. For this we find the following equations
which obviously is the correct result (see Eq. 25.5). As you can see, we are able to find the solution to our original system of equations pretty much effortlessly once we have performed the LU-decomposition. In this case, we did not even have to perform a row permutation and there seems to be no problem with stability. However, this is not true in general, as our solver would still fail if we found a zero on the downwards-diagonal due to the division in the back-substitution performed in Eq. 25.21. As we have already done previously, this problem can be circumvented by changing the order of the equations, i.e., by implementing pivoting.
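The two substitution steps can be sketched in a few lines of Python (the worksheets in this book are written in Maple; the function and variable names here are illustrative only): first the forward-substitution solving L y = b for y, then the back-substitution solving U x = y for x.

```python
# Illustrative Python sketch (not the Maple worksheet): solve A x = b
# given the decomposition A = L U, with L lower and U upper triangular.

def lu_substitute(L, U, b):
    n = len(b)
    # forward-substitution: solve L y = b from the top row downwards
    y = [0.0] * n
    for i in range(n):
        s = sum(L[i][j] * y[j] for j in range(i))
        y[i] = (b[i] - s) / L[i][i]
    # back-substitution: solve U x = y from the bottom row upwards
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x
```

Note the division by the diagonal entries U[i][i]: a zero on the downwards-diagonal makes exactly this step fail, which is why pivoting becomes necessary.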
Maple Worksheet. Obviously, at this point we want to implement this scheme as a function. The function DoLUSubstitution implements the substitution steps required for solving linear systems of equations using LU-decomposition. It requires the matrices L and U as well as the vector b as parameters. It consists of two loops, the first of which performs the forward-substitution (see line 6) while the second one performs the back-substitution (see line 14).
For the forward-substitution, the sum required for Eq. 25.20 is created in line 9 before applying Eq. 25.20 in line 11 to find the individual entries of y. For the back-substitution, we first calculate the sum required for Eq. 25.21 in line 17 before applying Eq. 25.21 in line 19 for finding x.
If we run our example (see line 24) we obtain the correct result
Stability Consideration. As we can see, the LU-decomposition gives us the correct result and seems to be sufficiently stable. However, this is only because the example we chose happens to have no zeros on the downwards-diagonal. As already discussed before, these would result in a division by zero. As sparsely populated matrices with a lot of zero entries are very common, it is a good idea to implement pivoting for LU-decomposition as well.
As we have already done for SOR in section 25.3.4.4, we will implement partial pivoting for LU-decomposition. This method is often referred to as permuting LU-decomposition (PLU). For this we change the implementation of the function DoLUDecomposition as shown in listing 25.7. As you can see, we use a pivoting vector piv that is first initialized to the sequence (1, 2, . . . , n) in line 8. Inside the row loop (line 12) we perform the pivoting operation as before (see line 14). We keep the original order for the matrices L and U, but we then use the pivoting vector to access the respective row in m, as can be seen in line 26 and line 34. We then need to return the pivoting vector in addition to the matrices L and U in line 37 because we require the pivoting vector to correctly associate the individual entries of the right-hand side and solution vectors when solving the linear system of equations.
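The idea can be sketched in Python as follows (illustrative names, not the Maple listing): the matrix A is never reordered physically; instead the pivoting vector piv records which row of A belongs to each elimination step.

```python
# Illustrative sketch of LU-decomposition with partial pivoting via a
# pivoting vector. L and U are built in elimination order; piv maps
# each elimination step to the corresponding row of the input A.

def lu_decompose_pivot(A):
    n = len(A)
    piv = list(range(n))                  # initialized to (1, 2, ..., n)
    L = [[0.0] * n for _ in range(n)]
    U = [[0.0] * n for _ in range(n)]

    def cand(i, r):
        # numerator of the would-be entry l_ri; used as pivot candidate
        return A[piv[r]][i] - sum(L[r][k] * U[k][i] for k in range(i))

    for i in range(n):
        # partial pivoting: pick the remaining row with the largest candidate
        p = max(range(i, n), key=lambda r: abs(cand(i, r)))
        if p != i:
            piv[i], piv[p] = piv[p], piv[i]
            L[i], L[p] = L[p], L[i]       # keep L consistent with piv
        L[i][i] = 1.0
        for j in range(i, n):             # row i of U
            U[i][j] = A[piv[i]][j] - sum(L[i][k] * U[k][j] for k in range(i))
        for r in range(i + 1, n):         # column i of L below the diagonal
            L[r][i] = cand(i, r) / U[i][i]
    return L, U, piv
```

With this convention, the product L U reproduces the rows of A in the order given by piv, which is why the pivoting vector must be returned along with L and U.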
Example. If we apply the function DoLUDecompositionPivot to our original matrix (see line 40) you can see from the output that the algorithm has performed a pivoting operation by switching the first and the second rows. We have found previously that pivoting is not necessary for this problem. However, switching the two rows places the 7 on the downwards-diagonal in the first row. This makes our algorithm more stable even though it is not required.
A Second Example. In the next step, we modify our original problem by setting (pretty much deliberately) two zeros on the downwards-diagonal. If we ran this second example (see line 43) with the original implementation of our LU-decomposition using the function DoLUDecomposition, we would receive an error due to the division by zero. If we run this example with the LU-decomposition using pivoting, we avoid this error and obtain the solution
The pivoting vector we also receive shows the sequence (2, 1, 3), indicating that the algorithm switched the rows such that the zeros are moved off the downwards-diagonal.
Adapting the Function DoLUSubstitution. Now that we have adapted the LU-decomposition to support pivoting, we also need to adapt the function DoLUSubstitution. The implementation of the method DoLUSubstitutionPivot is shown in listing 25.8 and is in analogy to the implementation of DoLUSubstitution (see listing 25.6). The only difference is that the function expects the pivoting vector piv as an additional parameter. We then use this vector for accessing the correct entries in b (see line 11). We can now run our original example (see line 24), which obviously returns the correct result again. In addition, we can now also solve the second example we applied when testing the LU-decomposition with pivoting in listing 25.7. This is the output of line 27. The solution to this matrix is
which is the correct result.
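The adapted substitution step can again be sketched in Python (names illustrative, not the Maple listing): it is identical to the plain LU substitution except that the entries of b are read through the pivoting vector.

```python
# Illustrative sketch: substitution for a pivoted LU-decomposition.
# L and U are in elimination order; piv maps each step to the row of
# the original right-hand side b it corresponds to.

def lu_substitute_pivot(L, U, b, piv):
    n = len(b)
    y = [0.0] * n
    for i in range(n):                    # forward-substitution
        s = sum(L[i][j] * y[j] for j in range(i))
        y[i] = (b[piv[i]] - s) / L[i][i]  # access b via the pivoting vector
    x = [0.0] * n
    for i in range(n - 1, -1, -1):        # back-substitution
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x
```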
In this section we have discussed LU-decomposition, a very useful method for solving systems of linear equations. Besides being an exact method that does not require iteration, LU-decomposition allows us to use the decomposed matrices L and U for various values of b, and thus to “reuse” a solution to linear systems with changing right-hand sides. We have seen that LU-decomposition by itself is already reasonably stable. By implementing partial pivoting, we can use this method to solve almost any kind of linear system of equations, which makes it a very potent and widely applicable method.
In this section we will briefly introduce a modification of the classical LU-decomposition which is a little more memory-efficient. In many cases, it is not necessary to have both matrices U and L separately. As they are both triangular matrices, they can be stored in a single matrix without losing information. In fact, there is a modification to the classical LU-decomposition that works on the matrix A directly, transferring it to a matrix of the form
with In being the identity matrix of rank n. This scheme is referred to as sequential dense LU-decomposition. It is a very memory-efficient scheme and therefore lends itself well to implementation for numerical solvers.
We will derive the scheme using the generic matrix A that we gradually transform to the dense LU-decomposed matrix A′.
First Row. We now change the entries of A row-by-row, figuring out how we need to modify them in order to end up with A′. We begin with the first row, for which we need to find the values of u11 to u1n. For this we find
which means that the values of a11 to a1n can remain unchanged as they already represent the correct values of u11 to u1n. In algorithmic notation we could note down that
for i = 1 as the row index and i ≤ j ≤ n as the column index.
Remainder of First Column. After having cleared the first row, we proceed with the rest of the first column, for which we find the following equations
As you can see, we find the remaining values of the first column by simply dividing the entries of A by the entry in the first row and first column a11. In algorithmic notation we can note
with the row index i, 1 < i ≤ n. As you can see, we simply have to divide the entries ai1 by the value a11 to find li1. On the other hand, we may just as well perform this division on the matrix A itself, which will be the first modification we make. We therefore replace the first-column entries of A with the values a′i1 according to
Second Row. At this point, the first row and the first column have been determined and the respective entries in L and U are known. We now proceed with the second row. Obviously, it would be convenient to simply repeat the process. However, we will see that this is not possible directly as the values of the entries we already determined will interfere. This is where the second modification will be made to A. By a small trick, we will change A in order to make all influences of already determined entries “disappear.” For the second row we find the following equations
As you can see, the operation is almost analogous to the treatment of the first row with the one exception that we cannot simply copy the values of a2j. We need to subtract the product l21 u1j first. Now, we could just as well simply subtract this product from a2j and end up with the same result. In algorithmic notation, we could note down that
where we have used Eq. 25.36 to set u1j = a1j and Eq. 25.37 to set l21 = a′21. Obviously, we expect a similar influence of the already determined values on all other rows as well. We will therefore perform a likewise subtraction on the other rows using the same formula according to
for i ≥ 2 and j ≥ 2. We will see in a moment that we actually require this modification.
Remainder of Second Column. Having processed the second row, we now turn to the second column starting from the second row. Proceeding likewise we find the following equations
In algorithmic notation we note
This expression can be simplified by noting that for j = 2, Eq. 25.39 tells us that a′22 = u22, in which case we can rewrite the expression to
As we presumed just a moment ago, we actually find that the modification we made by Eq. 25.40 was necessary and that this modification allows us to write Eq. 25.41 as
for j = 2. If we compare this with Eq. 25.37, we find that this is the same operation and we can change a′i2 again in place.
Iteration. At this point we have found an iteration pattern: starting with A and using the row index i, we repeat the process consisting of the following steps
1. In column i, replace all entries (j, i) with i < j ≤ n according to Eq. 25.37
2. Correct the submatrix entries (k, j) with i < k ≤ n and i < j ≤ n according to Eq. 25.40
3. Repeat for the next row i + 1
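The iteration pattern above can be sketched compactly in Python (illustrative names, not the Maple listing): the input matrix is overwritten in place, with the entries below the downwards-diagonal receiving the values of L and the remaining entries those of U.

```python
# Illustrative sketch of sequential dense LU-decomposition without
# pivoting: the matrix m is overwritten in place with the combined
# L/U matrix (unit diagonal of L implied).

def lu_sequential_dense(m):
    n = len(m)
    for i in range(n):
        # step 1: divide the entries below the diagonal in column i
        for k in range(i + 1, n):
            m[k][i] /= m[i][i]
        # step 2: correct the remaining submatrix entries (k, j)
        for k in range(i + 1, n):
            for j in range(i + 1, n):
                m[k][j] -= m[k][i] * m[i][j]
    return m
```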
We will quickly carry out this scheme using the matrix that has served as our example so far
We proceed by the outlined scheme, first dividing all entries (j, 1) with 1 < j ≤ n = 3 in the first column by a11 = 3, in which case we end up with
In the next step we correct the submatrix entries (k, j) for i < k ≤ n and i < j ≤ n finding
We then proceed to i = 2, again dividing all values (j, 2) with i < j ≤ n = 3 (which is only the value a′32) by the corrected diagonal entry a′22, yielding
We then proceed by correcting the remaining submatrix which only consists of the entry a33, which we correct to the final result
If you compare this result with Eq. 25.33, you can see that we have obtained the matrix A′ according to Eq. 25.35.
In the next step, we will transfer the algorithm to a Maple worksheet, as we will need this specific algorithm again. The listing is shown in listing 25.9. The sequential dense LU-decomposition is implemented by the function DoLUDecompositionSequentialDense. It expects the matrix to decompose as parameter m. It modifies this matrix in place; therefore, the matrix may need to be copied first if the original is to be retained. First the rank of the matrix is determined (see line 2) before the individual rows are processed, indexed with the variable i. For each row the columns are iterated using the variable j, dividing each entry according to Eq. 25.37. After this, the submatrix is corrected. For this a new row index k is introduced. The column index j can be reused here, but the value i is still required, which is why the new index k is needed. Line 9 implements Eq. 25.40. The last two lines (see line 16) output the example which we derived manually.
As already discussed for the regular LU-decomposition, it is generally necessary to use pivoting. We will therefore reimplement DoLUDecompositionSequentialDense so that it supports pivoting. The listing is shown in listing 25.10. We implement partial pivoting in analogy to listing 25.7 using a pivoting vector piv (see line 4). You will see that this implementation is almost identical to the implementation of listing 25.7. The only difference is that each access to a row i must now be made via the pivoting vector. As we only have the matrix m in which we construct the matrices L and U, we cannot keep the original order as we did for the matrices in listing 25.7.
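A Python sketch of this pivoted variant could look as follows (illustrative names, not the Maple listing): the matrix keeps its original row order and every row access is routed through the pivoting vector.

```python
# Illustrative sketch: sequential dense LU-decomposition with partial
# pivoting. Rows are never swapped physically; piv records the order
# in which the rows of m serve as pivot rows.

def lu_sequential_dense_pivot(m):
    n = len(m)
    piv = list(range(n))
    for i in range(n):
        # pick the remaining row with the largest magnitude in column i
        p = max(range(i, n), key=lambda r: abs(m[piv[r]][i]))
        piv[i], piv[p] = piv[p], piv[i]
        for k in range(i + 1, n):
            m[piv[k]][i] /= m[piv[i]][i]       # entry of L
            for j in range(i + 1, n):
                m[piv[k]][j] -= m[piv[k]][i] * m[piv[i]][j]
    return piv
```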
If we run our example (see line 27), we will see that we obtain the correct solution
which is the same result as obtained from DoLUDecompositionPivot, keeping in mind that L and U are combined according to Eq. 25.35. Also, by inspecting piv you can see that the algorithm changed the order of the rows ending up with the sequence (2, 1, 3). The function DoLUDecompositionPivot will rearrange L and U in order to “hide” this reordering from the user. The function DoLUDecompositionSequentialDensePivot cannot do this, as the function only has a single matrix to operate on. Therefore the substitution function for DoLUDecompositionSequentialDensePivot would need to account for this inability to reorder by accessing L and U (which are then stored in m) via the pivoting vector as index.
Listing 25.11 shows the implementation of the substitution step for sequential dense LU-decomposition. This listing is in analogy to listing 25.6, with the only difference being that it accepts the matrix m which has been modified by DoLUDecompositionSequentialDensePivot. As a consequence of the pivoting, we need to access all entries in the matrix via the pivoting vector, both during the forward-substitution (see line 9) and the back-substitution (see line 17). However, we provide the vectors b and x in the correct order, so the user will not see any signs of the pivoting.
If we run the example which we also used in line 27, we see that the solution we find is
which is identical to the solution found in Eq. 25.34.
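This substitution step can be sketched in Python as well (illustrative names): both L (stored below the downwards-diagonal, unit diagonal implied) and U live in the single matrix m, and every row is addressed through piv.

```python
# Illustrative sketch: substitution for the pivoted sequential dense
# LU-decomposition. m holds L below the diagonal (unit diagonal
# implied) and U on and above it; piv gives the pivot row order.

def lu_sequential_dense_substitute(m, b, piv):
    n = len(b)
    y = [0.0] * n
    for i in range(n):                    # forward: L has unit diagonal
        s = sum(m[piv[i]][j] * y[j] for j in range(i))
        y[i] = b[piv[i]] - s
    x = [0.0] * n
    for i in range(n - 1, -1, -1):        # backward: divide by u_ii
        s = sum(m[piv[i]][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / m[piv[i]][i]
    return x
```

The vectors b and x keep their original ordering; only the internal accesses go through piv, so the pivoting stays invisible to the caller.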
There is one algorithm which we want to introduce at this point that is commonly used to solve linear systems of equations where the matrix A is tridiagonal. This algorithm is generally referred to as the tridiagonal matrix algorithm or Thomas algorithm. Please recall that tridiagonal matrices are matrices which have nonzero entries only on the downwards-diagonal and on the two diagonals directly to its left and right. All other entries are zero. These matrices occur surprisingly often, and they usually need not be solved by a complete LU-decomposition if the matrix is set up such that the largest values are collected on the downwards-diagonal. Again, such matrices occur frequently. We will come across one example when studying one-dimensional FEM (see Eq. 32.48). We will use this example here while introducing the algorithm. Please refer to section 32.5 for the derivation of the matrix.
The general linear system of equations can be written as
which we may rewrite to
for the ith equation. We split the solving process into two steps: a forward step and a back-substitution.
Forward Step: First Row. The first step is dividing the first row by β1. Doing this, we obtain a 1 on the downwards-diagonal entry of the matrix. In this case we can rewrite Eq. 25.44 for i = 1 to
The first two equations are now given by
Forward Step: Subsequent Rows. We now multiply the first row by α2 and subtract it from the second row finding
We then divide the second row by (β2 − α2γ′1) thus obtaining again a 1 on the diagonal entry of the second row. We then obtain
which we simplify to
We then proceed with all subsequent rows accordingly. The general scheme is
After proceeding with these modifications Eq. 25.43 can be written as
Back-Substitution. The back-substitution uses the fact that the last row of Eq. 25.48 gives the solution for xn directly, which is simply b′n. We can then substitute into all the rows above the last row one by one, noting that
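The complete scheme, forward step plus back-substitution, can be sketched in Python (illustrative names, not the Maple listing): alpha, beta, and gamma denote the sub-diagonal, the downwards-diagonal, and the super-diagonal, respectively.

```python
# Illustrative sketch of the Thomas algorithm for a tridiagonal
# system: alpha is the sub-diagonal (alpha[0] unused), beta the
# downwards-diagonal, gamma the super-diagonal (gamma[n-1] unused).

def thomas(alpha, beta, gamma, b):
    n = len(b)
    gp = [0.0] * n                       # the values gamma'_i
    bp = [0.0] * n                       # the values b'_i
    # forward step, first row: divide by beta_1
    gp[0] = gamma[0] / beta[0]
    bp[0] = b[0] / beta[0]
    # forward step, subsequent rows
    for i in range(1, n):
        denom = beta[i] - alpha[i] * gp[i - 1]
        if i < n - 1:                    # the last row has no gamma entry
            gp[i] = gamma[i] / denom
        bp[i] = (b[i] - alpha[i] * bp[i - 1]) / denom
    # back-substitution: the last row yields x_n directly
    x = [0.0] * n
    x[n - 1] = bp[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = bp[i] - gp[i] * x[i + 1]
    return x
```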
We will quickly implement the Thomas algorithm in Maple. The function we will implement is called DoThomas and is shown in listing 25.12 together with the example from Eq. 32.48. We first restart the Maple server and include Core.txt (see line 1). The function DoThomas accepts the matrix A as parameter m and the right-hand side vector b. It first determines the rank of the matrix and prepares the solution vector x (see line 4 and line 5). It then calculates the values for the first row, applying Eq. 25.45 for γ′1 and b′1 (see line 6 and line 7). It then loops over all subsequent rows, applying Eq. 25.46 for finding γ′i and Eq. 25.47 for finding b′i (see line 9 and line 10). The loop only extends to n − 1 because for the last row only b′n can be calculated, as this row has no entry for γ (see line 12). Please note that in all cases the algorithm does not set the values for α′i and β′i, as these are not required to find the solution.
Having completed the forward step, the algorithm then proceeds with the back-substitution, first setting the value xn directly (see line 13). It then loops (see line 14) from the second-to-last value all the way to the first, setting the values in the solution vector according to Eq. 25.49.
Having implemented DoThomas we then proceed by solving the example taken from Eq. 32.48 using the indicated values for A and b (see line 30). Executing line 31 yields the solution
which is correct. In section 32.5 we will detail the use of this solution and its physical interpretation.
As with all numerical methods, it is useful to consider the criteria for numerical stability. The critical steps for numerical stability are the division operations in Eq. 25.46 and Eq. 25.47. Whenever the denominator becomes very small, the algorithm is prone to becoming numerically unstable. Without going into details, we will obtain a stable scheme whenever
In these cases, the denominator will be large and therefore the division operation will result in a continuous decrease in magnitude. In fact, many applications yield tridiagonal matrices which fulfill these conditions, including our example. Our implementation does not check for numerical stability. In cases where stability is a problem, LU-decomposition with pivoting should be applied which will yield stable solutions.
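A commonly quoted sufficient condition of this kind is diagonal dominance, i.e., |βi| ≥ |αi| + |γi| in every row (stated here as the standard textbook criterion, not as this book's own formula). A quick check along these lines could be sketched as follows (illustrative names):

```python
# Illustrative helper: check row-wise diagonal dominance
# |beta_i| >= |alpha_i| + |gamma_i|, a common sufficient criterion
# for stability of the Thomas algorithm.

def is_diagonally_dominant(alpha, beta, gamma):
    n = len(beta)
    for i in range(n):
        a = abs(alpha[i]) if i > 0 else 0.0      # no alpha in the first row
        g = abs(gamma[i]) if i < n - 1 else 0.0  # no gamma in the last row
        if abs(beta[i]) < a + g:
            return False
    return True
```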
The Thomas algorithm is a very fast and efficient algorithm for solving linear systems of equations involving tridiagonal matrices. It is often a significantly faster alternative to using a full LU-decomposition. The Thomas algorithm is numerically quite stable and in many applications it is a good first choice to solve a tridiagonal matrix problem.
The two functions DoLUDecompositionSequentialDensePivot and DoLUSubstitutionDensePivot are very useful methods for solving linear systems of equations, which will come in handy for problems we will study later. We will therefore export these functions into CoreFunctions.txt in order to access them later. In addition we will also add the function DoThomas that we will use later on for solving tridiagonal systems of equations.
In this section, we have discussed a modification of the LU-decomposition which is more memory efficient as it operates on the input matrix A directly and does not require the additional matrices L and U created during the regular LU-decomposition. This is the reason why this scheme is usually the preferred method implemented in numerical solvers. We also implemented pivoting and substitution for the sequential dense LU-decomposition, which gives us a very nice method to solve almost any type of linear system of equations. In addition we also studied the Thomas algorithm, which is a very useful numerical method for solving linear systems of equations involving tridiagonal matrices.
In this section, we have discussed analytical as well as numerical methods for solving linear systems of equations. As we will see, having methods available to quickly solve linear systems of equations is paramount for many numerical methods. Most of the methods discussed in this section will be required in the following sections when studying the most commonly used numerical methods in fluid mechanics.