We now turn to factoring. The basic method of dividing an integer by all primes is much too slow for most purposes. For many years, people have worked on developing more efficient algorithms. We present some of them here. In Chapter 21, we’ll also cover a method using elliptic curves, and in Chapter 25, we’ll show how a quantum computer, if built, could factor efficiently.
One method, which is also too slow, is usually called the Fermat factorization method. The idea is to express as a difference of two squares: . Then gives a factorization of . For example, suppose we want to factor . Compute , , , , until we find a square. In this case, . Therefore,
The Fermat method works well when is the product of two primes that are very close together. If , it takes steps to find the factorization. But if and are two randomly selected 300-digit primes, it is likely that will be very large, probably around 300 digits, too. So Fermat factorization is unlikely to work. Just to be safe, however, the primes for an RSA modulus are often chosen to be of slightly different sizes.
We now turn to more modern methods. If one of the prime factors of has a special property, it is sometimes easier to factor . For example, if divides and has only small prime factors, the following method is effective. It was invented by Pollard in 1974.
Choose an integer . Often is used. Choose a bound . Compute as follows. Let and . Then . Let . If , we have found a nontrivial factor of .
Suppose is a prime factor of such that has only small prime factors. Then it is likely that will divide , say . By Fermat’s theorem, , so will occur in the greatest common divisor of and . If is another prime factor of , it is unlikely that , unless also has only small prime factors. If , not all is lost. In this case, we have an exponent (namely ) and an such that . There is a good chance that the method (explained later in this section) will factor . Alternatively, we could choose a smaller value of and repeat the calculation.
For an example, see Example 34 in the Computer Appendices.
How do we choose the bound ? If we choose a small , then the algorithm will run quickly but will have a very small chance of success. If we choose a very large , then the algorithm will be very slow. The actual value used will depend on the situation at hand.
In the applications, we will use integers that are products of two primes, say , but that are hard to factor. Therefore, we should ensure that has at least one large prime factor. This is easy to accomplish. Suppose we want to have around 300 digits. Choose a large prime , perhaps around . Look at integers of the form , with running through some integers around . Test for primality by the Miller-Rabin test, as before. On the average, this should produce a desired value of in less than 400 steps. Now choose a large prime and follow the same procedure to obtain . Then will be hard to factor by the method.
The elliptic curve factorization method (see Section 21.3) gives a generalization of the method. However, it uses some random numbers near and only requires at least one of them to have only small prime factors. This allows the method to detect many more primes , not just those where has only small prime factors.
Since it is the basis of the best current factorization methods, we repeat the following result from Section 9.4.
Let be an integer and suppose there exist integers and with , but . Then is composite. Moreover, gives a nontrivial factor of .
For an example, see Example 33 in the Computer Appendices.
How do we find the numbers and ? Let’s suppose we want to factor . Observe the following:
If we multiply the relations, we obtain
Since , we now can factor 3837523 by calculating
The other factor is .
Here is a way of looking at the calculations we just did. First, we generate squares such that when they are reduced mod they can be written as products of small primes (in the present case, primes less than 20). This set of primes is called our factor base. We’ll discuss how to generate such squares shortly. Each of these squares gives a row in a matrix, where the entries are the exponents of the primes 2, 3, 5, 7, 11, 13, 17, 19. For example, the relation gives the row 6, 2, 0, 0, 1, 0, 0, 0.
In addition to the preceding relations, suppose that we have also found the following relations:
We obtain the matrix
Now look for linear dependencies mod 2 among the rows. Here are three of them:
When we have such a dependency, the product of the numbers yields a square. For example, these three yield
Therefore, we have for various values of and . If , then yields a nontrivial factor of . If , then usually or , so we don’t obtain a factorization. In our three examples, we have
, but
and
and
We now return to the basic question: How do we find the numbers 9398, 19095, etc.? The idea is to produce squares that are slightly larger than a multiple of , so they are small mod . This means that there is a good chance they are products of small primes. An easy way is to look at numbers of the form for small and for various values of . Here denotes the greatest integer less than or equal to . The square of such a number is approximately , which is approximately mod . As long as is not too large, this number is fairly small, hence there is a good chance it is a product of small primes.
In the preceding calculation, we have and , for example.
The method just used is the basis of many of the best current factorization methods. The main step is to produce congruence relations
An improved version of the above method is called the quadratic sieve. A recent method, the number field sieve, uses more sophisticated techniques to produce such relations and is somewhat faster in many situations. See [Pomerance] for a description of these two methods and for a discussion of the history of factorization methods. See also Exercise 52.
Once we have several congruence relations, they are put into a matrix, as before. If we have more rows than columns in the matrix, we are guaranteed to have a linear dependence relation mod 2 among the rows. This leads to a congruence . Of course, as in the case of 1st + 5th + 6th considered previously, we might end up with , in which case we don’t obtain a factorization. But this situation is expected to occur at most half the time. So if we have enough relations – for example, if there are several more rows than columns – then we should have a relation that yields with . In this case is a nontrivial factor of .
In the last half of the twentieth century, there was dramatic progress in factoring. This was partly due to the development of computers and partly due to improved algorithms. A major impetus was provided by the use of factoring in cryptology, especially the RSA algorithm. Table 9.2 gives the factorization records (in terms of the number of decimal digits) for various years.
On the surface, the Miller-Rabin test looks like it might factor quite often; but what usually happens is that is reached without ever having . The problem is that usually . Suppose, on the other hand, that we have some exponent , maybe not , such that for some with . Then it is often possible to factor .
Suppose we have an exponent and an integer such that . Write with odd. Let , and successively define for . If , then stop; the procedure has failed to factor . If, for some , we have , stop; the procedure has failed to factor . If, for some , we have but , then gives a nontrivial factor of .
Of course, if we take , then any works. But then , so the method fails. But if and are found by some reasonably sensible method, there is a good chance that this method will factor .
This looks very similar to the Miller-Rabin test. The difference is that the existence of guarantees that we have for some , which doesn’t happen as often in the Miller-Rabin situation. Trying a few values of has a very high probability of factoring .
Of course, we might ask how we can find an exponent . Generally, this seems to be very difficult, and this test cannot be used in practice. However, it is useful in showing that knowing the decryption exponent in the RSA algorithm allows us to factor the modulus. Moreover, if a quantum computer is built, it will perform factorizations by finding such an exponent via its unique quantum properties. See Chapter 25.
For an example for how this method is used in analyzing RSA, see Example 32 in the Computer Appendices.