IAAWA Public Key Cryptography

Section 12.2 Public Key Cryptography

If traditional cryptosystems are used, anyone who knows enough to encode a message will also know enough to decode an intercepted message. In 1976, W. Diffie and M. Hellman proposed public key cryptography, which is based on the observation that the encryption and decryption procedures need not have the same key. This removes the requirement that the encoding key be kept secret. The encoding function \(f\) must be relatively easy to compute, but \(f^{-1}\) must be extremely difficult to compute without some additional information, so that someone who knows only the encrypting key cannot find the decrypting key without prohibitive computation. It is interesting to note that to date, no system has been proposed that has been proven to be “one-way;” that is, for any existing public key cryptosystem, it has never been shown to be computationally prohibitive to decode messages with only knowledge of the encoding key.

Subsection The RSA Cryptosystem

The RSA cryptosystem, introduced in 1978 by Ron Rivest, Adi Shamir, and Leonard Adleman, is based on the difficulty of factoring large numbers. Though it is not a difficult task to find two large random primes and multiply them together, factoring a 150-digit number that is the product of two large primes would take 100 million computers operating at 10 million instructions per second about 50 million years under the fastest algorithms available in the early 1990s. Although the algorithms have improved, factoring a number that is a product of two large primes is still computationally prohibitive.

The RSA cryptosystem works as follows. Suppose that we choose two random 150-digit prime numbers \(p\) and \(q\text{.}\) Next, we compute the product \(n= pq\) and also compute \(\phi(n) = m = (p - 1)(q-1)\text{,}\) where \(\phi\) is the Euler \(\phi\)-function. Now we start choosing random integers \(E\) until we find one that is relatively prime to \(m\text{;}\) that is, we choose \(E\) such that \(\gcd(E, m) = 1\text{.}\) Using the Euclidean algorithm, we can find a number \(D\) such that \(DE \equiv 1 \pmod{m}\text{.}\) The numbers \(n\) and \(E\) are now made public.

Suppose now that person B (Bob) wishes to send person A (Alice) a message over a public line. Since \(E\) and \(n\) are known to everyone, anyone can encode messages. Bob first digitizes the message according to some scheme, say \(\text{A} = 00, \text{B} = 02, \ldots, \text{Z}= 25\text{.}\) If necessary, he will break the message into pieces such that each piece is a positive integer less than \(n\text{.}\) Suppose \(x\) is one of the pieces. Bob forms the number \(y = x^E \mod n\) and sends \(y\) to Alice. For Alice to recover \(x\text{,}\) she need only compute \(x = y^D \bmod n\text{.}\) Only Alice knows \(D\text{.}\)

Example 12.5.

Before exploring the theory behind the RSA cryptosystem or attempting to use large integers, we will use some small integers just to see that the system does indeed work. Suppose that we wish to send some message, which when digitized is \(25\text{.}\) Let \(p = 23\) and \(q = 29\text{.}\) Then

\begin{equation*} n = pq = 667 \end{equation*}

and

\begin{equation*} \phi(n) = m = (p - 1)(q - 1) = 616. \end{equation*}

We can let \(E = 487\text{,}\) since \(\gcd(616, 487) = 1\text{.}\) The encoded message is computed to be

\begin{equation*} 25^{487} \bmod 667 = 169. \end{equation*}

This computation can be reasonably done by using the method of repeated squares as described in Chapter 9. Using the Euclidean algorithm, we determine that \(191 E = 1 + 151 m\text{;}\) therefore, the decrypting key is \((n, D) = ( 667, 191)\text{.}\) We can recover the original message by calculating

\begin{equation*} 169^{191} \bmod 667 = 25. \end{equation*}

Now let us examine why the RSA cryptosystem works. We know that \(DE \equiv 1 \pmod{ m}\text{;}\) hence, there exists a \(k\) such that

\begin{equation*} DE = km + 1 = k \phi(n) + 1. \end{equation*}

There are two cases to consider. In the first case assume that \(\gcd(x, n) = 1\text{.}\) Then by Theorem 11.18,

\begin{equation*} y^D = (x^E)^D = x^{DE} = x^{km + 1} = (x^{\phi(n)})^k x = (1)^k x = x \bmod n. \end{equation*}

So we see that Alice recovers the original message \(x\) when she computes \(y^D \bmod n\text{.}\)

For the other case, assume that \(\gcd(x, n) \neq 1\text{.}\) Since \(n = pq\) and \(x \lt n\text{,}\) we know \(x\) is a multiple of \(p\) or a multiple of \(q\text{,}\) but not both. We will describe the first possibility only, since the second is entirely similar. There is then an integer \(r\text{,}\) with \(r \lt q\) and \(x = rp\text{.}\) Note that we have \(\gcd(x, q) = 1\) and that \(m=\phi(n)=(p - 1)(q - 1)=\phi(p)\phi(q)\text{.}\) Then, using Theorem 11.18, but now mod \(q\text{,}\)

\begin{equation*} x^{km} = x^{k\phi(p)\phi(q)} = (x^{\phi(q)})^{k\phi(p)} = (1)^{k\phi(p)} = 1 \bmod q. \end{equation*}

So there is an integer \(t\) such that \(x^{km}=1 + tq\text{.}\) Thus, Alice also recovers the message in this case,

\begin{equation*} y^D = x^{km + 1} = x^{km} x = (1 + tq) x = x + tq(rp) = x + trn = x \bmod n. \end{equation*}

We can now ask how one would go about breaking the RSA cryptosystem. To find \(D\) given \(n\) and \(E\text{,}\) we simply need to factor \(n\) and solve for \(D\) by using the Euclidean algorithm. If we had known that \(667 = 23 \cdot 29\) in Example 12.5, we could have recovered \(D\text{.}\)

Subsection Message Verification

There is a problem of message verification in public key cryptosystems. Since the encoding key is public knowledge, anyone has the ability to send an encoded message. If Alice receives a message from Bob, she would like to be able to verify that it was Bob who actually sent the message. Suppose that Bob's encrypting key is \((n', E')\) and his decrypting key is \((n', D')\text{.}\) Also, suppose that Alice's encrypting key is \((n, E)\) and her decrypting key is \((n, D)\text{.}\) Since encryption keys are public information, they can exchange coded messages at their convenience. Bob wishes to assure Alice that the message he is sending is authentic. Before Bob sends the message \(x\) to Alice, he decrypts \(x\) with his own key:

\begin{equation*} x' = x ^{D'} \bmod n'. \end{equation*}

Anyone can change \(x'\) back to \(x\) just by encryption, but only Bob has the ability to form \(x'\text{.}\) Now Bob encrypts \(x'\) with Alice's encryption key to form

\begin{equation*} y' = {x'}^E \bmod n, \end{equation*}

a message that only Alice can decode. Alice decodes the message and then encodes the result with Bob's key to read the original message, a message that could have only been sent by Bob.

Subsection Diffie-Hellman key exchange

Most commonly, RSA encryption is used together with a private-key encryption system. Even on a computer, the modular operations can take a long time to encrypt a long message. Instead, it is more practical to use RSA to send the key for a private-key encryption method and then use private-key encryption to send the message itself.

A key exchange algorithm is a way for two people to share a secret piece of data without an eavesdropper being able to obtain reconstruct that secret. Unlike RSA or other public-key encryption system, the shared secret is essentially random and not under the control of either participant.

Key exchange has another advantage, known as forward secrecy. In the RSA encryption method, if an attacker records encrypted communication, and then later obtains the secret key, he or she can decrypt all the recorded messages. In a key exchange system, the participants can forget the ephemeral key after it has been used.

Diffie-Hellman key exchange works as follows. We again use the names Alice and Bob for the two participants. They agree ahead of time on a prime \(p\) as well as a nonzero element \(g \in \mathbb Z_p\text{.}\) First, Alice picks a random number \(a\) and sends the modular power \(g^a\) to Bob. Second, Bob also picks a random number \(b\) and sends \(g^b\) to Alice. Now, Alice knows \(g^b\) from Bob's message, and knows \(a\) from having chosen it, so she can compute \((g^b)^a=g^{ab}\text{.}\) Similarly, Bob knows \(g^a\) and \(b\text{,}\) so he can compute \((g^a)^b=g^{ab}\text{.}\) The equivalence class \(g^{ab}\) is the shared secret between Alice and Bob. An eavesdropper would know both \(g^a\) and \(g^b\text{,}\) and would know \(g\) and \(p\) from the previous agreement. However, if the prime \(p\) is big enough, it will not be feasible for the eavesdropper to recover either exponent \(a\) or \(b\text{,}\) nor the secret \(g^{ab}\text{.}\)

Example 12.6.

For this example we use the prime \(p=347\) and the fixed element \(g=2\text{.}\) Alice randomly chooses \(a=306\text{,}\) computes \(2^{306} \equiv 36 \pmod{347}\text{,}\) and sends 36 to Bob. Bob randomly chooses \(b=86\text{,}\) computes \(2^{86} \equiv 120 \pmod{347}\text{,}\) and sends 120 to Alice. Now, Alice can compute \(120^{306} \equiv 289 \pmod{359}\text{.}\) Separately, Bob can compute \(36^{86} \equiv 289 \pmod{347}\text{.}\) Therefore, 289 is the shared secret between Alice and Bob.

An eavesdropper would see the transmitted integers 36 and 120, but could not easily determine the shared secret 289. They could try to figure out Alice's secret \(a\) by looking for integers \(a\) such that \(2^a \equiv 36 \pmod{347}\text{.}\) However, this computation this would be significantly harder than the single exponentiations that Alice and Bob do. In practice, Alice and Bob would use a prime with over 600 decimal digits, in which case recovering \(a\) from \(g\) and \(g^a\) is impractical, even using the most sophisticated algorithms known today.

Diffie-Hellman key exchange works modulo a prime, but all we have really used is that multiplication in \(\mathbb Z_p^\times\) is a group. A very common variation of Diffie-Hellman is to use an elliptic curve over a finite field as the group. An elliptic curve has a more complicated definition of its multiplication operation, but the algorithm works similarly. Alice and Bob agree on the curve and an element \(g\text{.}\) Alice randomly chooses an integer \(a\) and computes \(g^a\text{,}\) which is a product in the elliptic curve, and sends \(g^a\) to Bob. Bob randomly chooses an integer \(b\text{,}\) computes \(g^b\) and sends \(g^b\) to Alice. Now, both Alice and Bob can each compute \(g^{ab}\text{,}\) but it is infeasible for an eavesdropper to do so.

Subsection Historical Note

Encrypting secret messages goes as far back as ancient Greece and Rome. As we know, Julius Caesar used a simple shift code to send and receive messages. However, the formal study of encoding and decoding messages probably began with the Arabs in the 1400s. In the fifteenth and sixteenth centuries mathematicians such as Alberti and Viete discovered that monoalphabetic cryptosystems offered no real security. In the 1800s, F. W. Kasiski established methods for breaking ciphers in which a ciphertext letter can represent more than one plaintext letter, if the same key was used several times. This discovery led to the use of cryptosystems with keys that were used only a single time. Cryptography was placed on firm mathematical foundations by such people as W. Friedman and L. Hill in the early part of the twentieth century.

The period after World War I saw the development of special-purpose machines for encrypting and decrypting messages, and mathematicians were very active in cryptography during World War II. Efforts to penetrate the cryptosystems of the Axis nations were organized in England and in the United States by such notable mathematicians as Alan Turing and A. A. Albert. The Allies gained a tremendous advantage in World War II by breaking the ciphers produced by the German Enigma machine and the Japanese Purple ciphers.

By the 1970s, interest in commercial cryptography had begun to take hold. There was a growing need to protect banking transactions, computer data, and electronic mail. In the early 1970s, IBM developed and implemented LUZIFER, the forerunner of the National Bureau of Standards' Data Encryption Standard (DES).

The concept of a public key cryptosystem, due to Diffie and Hellman, is very recent (1976). It was further developed by Rivest, Shamir, and Adleman with the RSA cryptosystem (1978). It is not known with mathematical certainty how secure any of these systems are. The trapdoor knapsack cryptosystem, developed by Merkle and Hellman, has been broken. It is still an open question whether or not the RSA system can be broken. In 1991, RSA Laboratories published a list of semiprimes (numbers with exactly two prime factors) with a cash prize for whoever was able to provide a factorization. Although the challenge ended in 2007, many of these numbers have not yet been factored.

There been a great deal of controversy about research in cryptography and cryptography itself. In 1929, when Henry Stimson, Secretary of State under Herbert Hoover, dismissed the Black Chamber (the State Department's cryptography division) on the ethical grounds that “gentlemen do not read each other's mail.” During the last two decades of the twentieth century, the National Security Agency wanted to keep information about cryptography secret, whereas the academic community fought for the right to publish basic research. Currently, research in mathematical cryptography and computational number theory is very active, and mathematicians are free to publish their results in these areas.