Alice wants to send a message to Bob, but they have not had previous contact and they do not want to take the time to send a courier with a key. Therefore, all information that Alice sends to Bob will potentially be obtained by the evil observer Eve. However, it is still possible for a message to be sent in such a way that Bob can read it but Eve cannot.
With all the previously discussed methods, this would be impossible. Alice would have to send a key, which Eve would intercept. She could then decrypt all subsequent messages. The possibility of the present scheme, called a public key cryptosystem, was first publicly suggested by Diffie and Hellman in their classic 1976 paper [Diffie-Hellman]. However, they did not yet have a practical implementation (although they did present an alternative key exchange procedure that works over public channels; see Section 10.4). In the next few years, several methods were proposed. The most successful, based on the idea that factorization of integers into their prime factors is hard, was proposed by Rivest, Shamir, and Adleman in 1977 and is known as the RSA algorithm.
It had long been claimed that government cryptographic agencies had discovered the RSA algorithm several years earlier, but secrecy rules prevented them from releasing any evidence. Finally, in 1997, documents released by CESG, a British cryptographic agency, showed that in 1970, James Ellis had discovered public key cryptography, and in 1973, Clifford Cocks had written an internal document describing a version of the RSA algorithm in which the encryption exponent (see the discussion that follows) was the same as the modulus .
Here is how the RSA algorithm works. Bob chooses two distinct large primes and and multiplies them together to form
He also chooses an encryption exponent such that
He sends the pair to Alice but keeps the values of and secret. In particular, Alice, who could possibly be an enemy of Bob, never needs to know and to send her message to Bob securely. Alice writes her message as a number . If is larger than , she breaks the message into blocks, each of which is less than . However, for simplicity, let’s assume for the moment that . Alice computes
and sends to Bob. Since Bob knows and , he can compute and therefore can find the decryption exponent with
As we’ll see later,
so Bob can read the message.
We summarize the algorithm in the following table.
Bob chooses
Then
Let the encryption exponent be
The values of and are sent to Alice.
Alice’s message is cat. We will depart from our earlier practice of numbering the letters starting with ; instead, we start the numbering at and continue through . We do this because, in the previous method, if the letter appeared at the beginning of a message, it would yield a message number starting with , so the would disappear.
The message is therefore
Alice computes
She sends to Bob.
Since Bob knows and , he knows . He uses the extended Euclidean algorithm (see Section 3.2) to compute such that
The answer is
Bob computes
so he obtains the original message.
For more examples, see Examples 24–30 in the Computer Appendices.
There are several aspects that need to be explained, but perhaps the most important is why . Recall Euler’s theorem (Section 3.6): If , then . In our case, . Suppose . This is very likely the case; since and are large, probably has neither as a factor. Since , we can write for some integer . Therefore,
We have shown that Bob can recover the message. If , Bob still recovers the message. See Exercise 37.
What does Eve do? She intercepts . She does not know . We assume that Eve has no way of factoring . The obvious way of computing requires knowing . We show later that this is equivalent to knowing and . Is there another way? We will show that if Eve can find , then she can probably factor . Therefore, it is unlikely that Eve finds the decryption exponent .
Since Eve knows , why doesn’t she simply take the th root of ? This works well if we are not working mod but is very difficult in our case. For example, if you know that , you cannot calculate the cube root of 3, namely on your calculator and then reduce mod 85. Of course, a case-by-case search would eventually yield , but this method is not feasible for large .
How does Bob choose and ? They should be chosen at random, independently of each other. How large depends on the level of security needed, but it seems that they should have at least 300 digits. For reasons that we discuss later, it is perhaps best if they are of slightly different lengths. When we discuss primality testing, we’ll see that finding such primes can be done fairly quickly (see also Section 3.6). A few other tests should be done on and to make sure they are not bad. For example, if has only small prime factors, then is easy to factor by the method (see Section 9.3), so should be rejected and replaced with another prime.
Why does Bob require ? Recall (see Section 3.3) that has a solution if and only if . Therefore, this condition is needed in order for to exist. The extended Euclidean algorithm can be used to compute quickly. Since is even, cannot be used; one might be tempted to use . However, there are dangers in using small values of (see Section 9.2, Computer Problem 14, and Section 23.3), so something larger is usually recommended. Also, if is a moderately large prime, then there is no difficulty ensuring that . It is now generally recommended that . Among the data collected for [Lenstra2012 et al.] is the distribution of RSA encryption exponents that is given in Table 9.1.
In the encryption process, Alice calculates . Recall that this can be done fairly quickly and without large memory, for example, by successive squaring. See Section 3.5. This is definitely an advantage of modular arithmetic: If Alice tried to calculate first, then reduce mod , it is possible that recording would overflow her computer’s memory. Similarly, the decryption process of calculating can be done efficiently. Therefore, all the operations needed for encryption and decryption can be done quickly (i.e., in time a power of ). The security is provided by the assumption that cannot be factored.
We made two claims. We justify them here. Recall that the point of these two claims was that finding or finding the decryption exponent is essentially as hard as factoring . Therefore, if factoring is hard, then there should be no fast, clever way of finding .
Claim 1: Suppose is the product of two distinct primes. If we know and , then we can quickly find and .
Note that
Therefore, we know and . The roots of the polynomial
are and , but they can also be calculated by the quadratic formula:
This yields and .
For example, suppose and we know that . Consider the quadratic equation
The roots are
For another example, see Example 31 in the Computer Appendices.
Claim 2: If we know and , then we can probably factor .
In the discussion of factorization methods in Section 9.4, we show that if we have an exponent such that for several with , then we can probably factor . Since is a multiple of , say , we have
whenever . The factorization method can now be applied.
For an example, see Example 32 in the Computer Appendices.
Claim 2’: If is small or medium-sized (for example, if has several fewer digits than ) and we know , then we can factor .
We use the following procedure:
Compute (that is, round up to the nearest integer).
Compute
Solve the quadratic equation
The solutions are and .
Let and . We discover that . Compute
Therefore, . Then
The roots of the equation
are and , and we can check that .
Why does this work? We know that , so we write . Since , we know that
so . We have
Usually, both and are approximately . In practice, and therefore (which is less than ) are much smaller than . Therefore, is much smaller than , which means that and it rounds off to .
Once we have , we use to solve for . As we have already seen, once we know and , we can find and by solving the quadratic equation.
One way the RSA algorithm can be used is when there are several banks, for example, that want to be able to send financial data to each other. If there are several thousand banks, then it is impractical for each pair of banks to have a key for secret communication. A better way is the following. Each bank chooses integers and as before. These are then published in a public book. Suppose bank A wants to send data to bank B. Then A looks up B’s and and uses them to send the message. In practice, the RSA algorithm is not quite fast enough for sending massive amounts of data. Therefore, the RSA algorithm is often used to send a key for a faster encryption method such as AES.
PGP (= Pretty Good Privacy) used to be a standard method for encrypting email. When Alice sends an email message to Bob, she first signs the message using a digital signature algorithm such as those discussed in Chapter 13. She then encrypts the message using a block cipher such as triple DES or AES (other choices are IDEA or CAST-128) with a randomly chosen 128-bit key (a new random key is chosen for each transmission). She then encrypts this key using Bob’s public RSA key (other public key methods can also be used). When Bob receives the email, he uses his private RSA exponent to decrypt the random key. Then he uses this random key to decrypt the message, and he checks the signature to verify that the message is from Alice. For more discussion of PGP, see Section 15.6.