5 Computing in \(U_n\) and RSA cryptography
As well as being interesting from a purely theoretical standpoint, the group of units \(U_n\) is highly important in a major real-world application of number theory: cryptography – devising codes for securely transmitting secret information.
5.1 Powers mod \(n\)
Suppose we want to compute \(3^{123456789} \bmod 7\) (more precisely: to compute the unique representative in \(\{0, \dots, 6\}\) of its congruence class). How might we do this? One obvious idea would be to compute \(3^{1234\dots}\) as an integer, and then reduce it modulo 7.
This would work, eventually, but it would be a horrendous mess, because \(3^{123456789}\) is huge, with nearly 60 million digits. So we need a better approach.
Using the \(\varphi\) function: Since 7 is prime, we know that \(\varphi(7) = 6\); and \(123456789 = 3 \bmod 6,\) so it is \(6q + 3\) for some \(q.\) Since 3 is coprime to 7, we conclude that \[3^{123456789} = 3^{6q + 3} = (3^6)^q \cdot 3^3 = 1^q \cdot 27 = 6 \bmod 7.\]
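In Python, this exponent-reduction trick is easy to check (a minimal sketch; the built-in three-argument pow does modular exponentiation, and the variable names are just illustrative):

```python
# Check 3^123456789 mod 7 by reducing the exponent mod phi(7) = 6.
# Valid because gcd(3, 7) = 1, so Euler's theorem applies.
base, exponent, modulus = 3, 123456789, 7
phi = 6                                    # phi(7) = 6, since 7 is prime

small_exponent = exponent % phi            # 123456789 mod 6 = 3
print((base ** small_exponent) % modulus)  # 6
print(pow(base, exponent, modulus))        # 6 -- built-in modular power agrees
```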
This algorithm works very well if the modulus \(n\) is small (but the exponent is large), as in the previous example. But if \(n\) is a bit bigger, there are two problems.
Compute \(3^{123456789} \bmod 21311.\)
Here we hit two snags. Firstly, to compute \(\varphi(21311),\) we have to factor 21311 into primes (which is doable on a computer, but takes a while, and would rapidly become impractical for larger moduli). Secondly, even once we’ve computed \(\varphi(21311) = 21000\) and \(123456789 \bmod 21000 = 18789,\) we still have to compute \(3^{18789},\) which has about 9000 digits! So this is clearly not a sensible method.
Instead, we’ll use a method called repeated squaring. The idea is to write the exponent as a sum of powers of 2, which we can always do; this is just the binary expansion of the exponent. (This is easy to compute from the base-10 expansion, and if you’re working on a computer, the computer probably converted your input to binary as soon as you entered it.) Now, we can easily make a table of values of \(3^{(2^i)} \bmod 21311\) for small \(i\) by repeated squaring: \[\begin{aligned} 3^{2} &= 3^2 = 9\\ 3^{4} &= 9^2 = 81\\ 3^{8} &= 81^2 = 6561\\ 3^{16} &= 6561^2 = 19812\\ 3^{32} &= 19812^2 = 9346\\ 3^{64} &= 9346^2 = 15238\\ &\vdots \end{aligned}\] Because we reduce modulo \(n = 21311\) after every squaring step, we never have to deal with integers bigger than \(n^2,\) so the computations are manageable. Once we have computed a table of \(3^{(2^i)}\) for all \(i\) up to 26, we can use the formula \[123456789 = 2^{26} + 2^{25} + 2^{24} + 2^{22} + \dots + 2^2 + 1,\] to compute \(3^{123456789} = 20878 \bmod 21311.\)
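Here is one way the repeated-squaring method might be implemented (a minimal sketch; Python's built-in pow(a, b, n) already uses this idea, so the hand-rolled power_mod below is purely illustrative):

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent % modulus by repeated squaring.

    We scan the bits of the exponent from least to most significant;
    at step i, `square` holds base^(2^i) mod modulus.
    """
    result = 1
    square = base % modulus
    while exponent > 0:
        if exponent & 1:                          # this bit of the exponent is set
            result = (result * square) % modulus
        square = (square * square) % modulus      # base^(2^i) -> base^(2^(i+1))
        exponent >>= 1                            # move on to the next bit
    return result

print(power_mod(3, 123456789, 21311))  # 20878
print(pow(3, 123456789, 21311))        # 20878 -- the built-in agrees
```

Note that the intermediate values never exceed \(n^2,\) exactly as in the hand computation above.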
There’s nothing very special about \(U_n\) here: if \(G\) is a finite group, and you have a practical way of representing elements of \(G\) on a computer and calculating the group operation, then you can use repeated squaring to efficiently compute \(g^n\) for any \(g \in G\) and \(n \in \mathbb{N}.\)
5.2 Polynomial vs. exponential time
To formalise the ideas of “hard to compute” versus “easy to compute”, we use the notion of polynomial-time and exponential-time algorithms. These measure the number of steps needed by some computational method, as a function of the length of the input (the amount of space required to write it down) – e.g. the number of decimal (or binary) digits needed to write down an integer. We say an algorithm is polynomial-time if the number of steps required, for input of length \(N,\) is bounded above by a constant multiple of \(N^k\) for some constant \(k.\) Similarly, if it’s bounded above by a constant multiple of \(C^N\) for some constant \(C > 1,\) we say it’s exponential-time. Since exponentials grow much faster than polynomials, any polynomial-time algorithm will eventually beat any exponential-time one.
Note that since we ignore constant factors, it doesn’t matter exactly how we measure the input length, as long as we stay within a constant factor of the original measure. E.g. if the input is a number, we could count its decimal digits, or its binary digits (bits); since these differ by a factor \(\log_{2} 10,\) this does not change whether an algorithm is polynomial or exponential time.
For example, computing the product \(a \cdot b\) of two integers via the standard school-book “long multiplication” method requires approximately \(N_a N_b\) steps, where \(N_r\) is the number of binary digits of \(r.\) Since \(N_a N_b \leqslant\tfrac{1}{4}(N_a + N_b)^2,\) and \(N_a + N_b\) is the total length of the input, this is clearly a polynomial-time algorithm. The “repeated squaring” algorithm above, for computing \(a^b \bmod N,\) is also polynomial-time: it uses at most \(2N_b\) multiplications of integers smaller than \(N,\) each followed by a reduction modulo \(N.\)
On the other hand, testing whether a number \(r\) is prime by trying all potential factors up to \(\sqrt{r}\) (“trial division”) involves at least \(\sqrt{r}\) steps in the worst case; since \(r\) is roughly \(2^{N_r},\) this is clearly exponential in \(N_r.\)
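A direct implementation makes the cost visible (a sketch; the function name is ours). The loop below runs about \(\sqrt{r}\) times in the worst case – hopeless for \(r\) with hundreds of digits:

```python
def is_prime_trial_division(r):
    """Test primality by trying every potential factor up to sqrt(r)."""
    if r < 2:
        return False
    d = 2
    while d * d <= r:        # about sqrt(r) iterations when r is prime
        if r % d == 0:
            return False     # found a nontrivial factor
        d += 1
    return True

print(is_prime_trial_division(21311))  # False (21311 = 101 * 211)
print(is_prime_trial_division(211))    # True
```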
There’s a big difference here between primality testing – answering the yes/no question “is \(N\) prime?” – and factorisation – computing the prime factors of \(N.\) These might seem like the same problem, but they aren’t: there are situations where you know \(N\) cannot be prime without being able to produce a specific factor of \(N.\)
For instance, suppose you compute \(2^{N-1} \bmod N,\) and it’s not 1. Then \(N\) cannot be prime, since otherwise this would contradict Fermat’s little theorem. So you know that \(N\) has a nontrivial factor; but there’s no obvious way to work out what that factor is using the information you have about \(2^{N-1} \bmod N.\)
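This observation is easy to turn into code – a sketch of the resulting “Fermat test”, which can prove compositeness without ever exhibiting a factor (beware that the converse fails: rare composite “pseudoprimes” such as 341 also pass the test, so a pass is not a proof of primality):

```python
def passes_fermat_test(N, a=2):
    """Return False if N is certainly composite; True means N *might* be prime.

    If a^(N-1) mod N != 1, then N cannot be prime, by Fermat's little
    theorem -- but this tells us nothing about what N's factors are.
    """
    return pow(a, N - 1, N) == 1

print(passes_fermat_test(21311))  # False: certainly composite (it is 101 * 211)
print(passes_fermat_test(211))    # True: consistent with 211 being prime
```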
Primality testing can be done in polynomial time. This was proved by Miller in 1976 assuming an open conjecture in analytic number theory, the generalised Riemann hypothesis. In 2004, Agrawal, Kayal and Saxena gave a different algorithm, for which they could prove unconditionally (without assuming any conjectures) that it gave the correct answer in polynomial time.
It’s widely believed that factorisation cannot be done in polynomial time on a conventional computer. (“Conventional” as opposed to “quantum”: quantum computers could, theoretically, factorise large numbers much faster than any conventional computer could, but building quantum computers on a realistic scale has proved to be somewhat difficult. This would be an interesting project topic.) There are algorithms (such as the Number Field Sieve) which are much better than trial division, but they are still much slower than any polynomial-time algorithm.
It is this “gap” – that the complexity of factoring integers grows much faster than the complexity of testing whether integers are prime – that is vitally important in many applications of number theory.
5.3 Public key cryptography
We’ll now learn about applications to secure communication – the science of cryptography. This could be used by a spy sending intelligence reports back to his home base; or it could be something much more mundane, like you logging into your bank account from a smartphone. Either way, there are two steps: encryption – the process the sender uses to transform a message into a coded form – and decryption, the opposite process that recovers the readable text from the coded message.
Traditional cryptographic techniques (prior to the 1970s) relied on the existence of a shared secret: both sender and recipient needed to know some piece of information which, if revealed to an outsider, would allow that outsider to read the secret message themselves. This can be difficult to achieve: it requires coordination in advance between the sender and recipient.
Sometimes the entire system is the shared secret; but then any security lapse means redesigning the whole system from scratch. So most practical cryptographic systems rely on choosing a “secret key” – an arbitrary number, string of letters, etc. – and then scrambling up the input message in a way that depends on this secret key. It doesn’t matter if an attacker knows how the system works, as long as they don’t know the secret key that was used for a particular message. That way, if one of your agents is captured, you just need to choose a new key, not a whole new algorithm!
Public key cryptography refers to a class of systems where the information needed to encrypt a message is different from the information needed to decrypt it. In such systems, each participant has a public key and a private key. If Alice wants to send a message to Bob, she can encrypt it knowing only Bob’s public key, but only someone knowing his private key can decrypt it again. So Bob doesn’t need to tell Alice – or anybody else – what his private key is; and as long as he keeps his private key secret, he can announce his public key openly to the world, without compromising the security of the system.
Of course, such a system can only work if it is impossible to determine the private key from the public one without an impractically lengthy computation. This is where number theory comes in: primes and prime factorisations are a rich source of difficult calculations!
There are some obvious security holes, of course. If Bob asks Alice a question that has only a few possible answers (e.g. just “yes” or “no”) then an attacker can try encrypting both “yes” and “no” with the public key. This will give two different gibberish messages, but if one of those exactly matches the gibberish message Alice has just radioed to Bob, then the attacker knows the message.
(This is typically solved by padding messages with randomly chosen nonsense phrases. However, this is not without its dangers too, as is shown by a famous cryptographic mix-up at the Battle of Leyte Gulf in World War II, when the nonsense padding phrase “the world wonders” was accidentally interpreted as part of the message, changing its meaning completely.)
5.4 The RSA cryptosystem
The RSA cryptosystem is named after Ron Rivest, Adi Shamir and Leonard Adleman, who published it in 1977. They were not in fact the first to discover it; 20 years later it was revealed that British security services had already discovered the algorithm in 1973, but kept the discovery secret, and Rivest et al. rediscovered it independently.
RSA relies on the following observation: factorising large numbers into primes is difficult. If I give you two 20-digit numbers \(p,\) \(q,\) then you can compute \(N = pq\) in a few minutes. But if I give you a 40-digit number, and tell you that it’s the product of two 20-digit primes, then it would take a very long time indeed to compute those prime factors.
In RSA, each participant chooses the following data:
two large prime numbers \(p, q\);
an encryption exponent \(e,\) with \(1 < e < \varphi(pq) = (p-1)(q-1)\) and \(e\) coprime to \(\varphi(pq).\)
They announce to the world the product \(N = pq\) and the encryption exponent \(e,\) but keep the factors \(p\) and \(q\) secret. Using this secret information, they can compute the decryption exponent \[d = e^{-1} \bmod \varphi(N).\]
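In code, key generation might look like this (a toy sketch using the tiny primes from the worked example below; real keys use primes of hundreds of digits, and pow(e, -1, phi) – Python 3.8+ – computes the modular inverse):

```python
from math import gcd

def make_rsa_key(p, q, e):
    """Build a toy RSA key pair from primes p, q and encryption exponent e."""
    N = p * q
    phi = (p - 1) * (q - 1)        # phi(pq) = (p-1)(q-1)
    assert gcd(e, phi) == 1, "e must be coprime to phi(N)"
    d = pow(e, -1, phi)            # decryption exponent d = e^(-1) mod phi(N)
    return (N, e), (N, d)          # (public key, private key)

public, private = make_rsa_key(101, 211, 11)
print(public)    # (21311, 11)
print(private)   # (21311, 19091)
```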
Suppose one participant (Bob) wants to send information to another (Alice). Bob finds out Alice’s modulus \(N\) and encryption exponent \(e.\) He converts his message into a series of chunks, each of which is represented by an integer \(m\) in the range \(1 < m < N,\) and for each chunk he computes \[c = m ^ e \bmod N.\] These \(c\)’s are the encrypted message he transmits to Alice.
Alice then takes each chunk \(c\) and computes \[c^d \bmod N = (m^e)^d = m^{de} \bmod N.\] Since \(de = 1 \bmod \varphi(N),\) this is just \(m \bmod N\) whenever \(m\) is coprime to \(N\) (and in fact for all \(m,\) as one can check modulo \(p\) and \(q\) separately), recovering the original message.
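Both directions are then a single modular exponentiation – a minimal sketch continuing the toy key above:

```python
def encrypt(m, public_key):
    N, e = public_key
    return pow(m, e, N)    # c = m^e mod N

def decrypt(c, private_key):
    N, d = private_key
    return pow(c, d, N)    # c^d = m^(de) = m mod N, since de = 1 mod phi(N)

c = encrypt(13065, (21311, 11))
print(c)                            # 2460
print(decrypt(c, (21311, 19091)))   # 13065 -- the original chunk
```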
The security of this system relies on the fact that computing \(\varphi(N)\) from \(N\) is essentially equivalent to factorising \(N,\) and factorising large integers is believed to be hard – much harder than any of the other steps in the algorithm.
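Indeed, knowing \(\varphi(N)\) immediately yields the factors: since \(\varphi(N) = (p-1)(q-1) = N - (p+q) + 1,\) we can read off \(p + q = N - \varphi(N) + 1,\) and then \(p\) and \(q\) are the two roots of the quadratic \[x^2 - (N - \varphi(N) + 1)\,x + N = 0.\]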
Suppose Alice’s public key is \[\begin{aligned} N &= 21311, & e &= 11 \end{aligned}\]
Bob wants to send the message “TINKER”.
Bob converts this into 3-letter blocks ‘TIN | KER’ and converts each one into a number in base 26, \[TIN = 13065, \qquad KER = 6881.\] For the first block, he computes \(13065 ^ {11} = 2460 \bmod 21311,\) and the second \(6881 ^ {11} = 14867 \bmod 21311.\) So he sends the message 02460 14867.
Alice knows that \(21311 = 101 \times 211,\) so \(\varphi(N) = 21000,\) and hence the decryption exponent is \(19091,\) since \(11 \times 19091 = 210001 = 1 \bmod 21000.\) So she just computes \(2460 ^ {19091} = 13065 \bmod N,\) etc, and recovers the original message.
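The whole worked example fits in a few lines of code (a sketch; the base-26 encoding with A = 0, …, Z = 25 follows the convention used above, and the helper names are ours):

```python
def block_to_int(block):
    """Encode a block of capital letters in base 26, with A = 0, ..., Z = 25."""
    n = 0
    for letter in block:
        n = 26 * n + (ord(letter) - ord('A'))
    return n

def int_to_block(n, length=3):
    """Decode a base-26 integer back into letters."""
    letters = []
    for _ in range(length):
        n, r = divmod(n, 26)
        letters.append(chr(ord('A') + r))
    return ''.join(reversed(letters))

N, e, d = 21311, 11, 19091
cipher = [pow(block_to_int(b), e, N) for b in ["TIN", "KER"]]
print(cipher)                                        # [2460, 14867]
print([int_to_block(pow(c, d, N)) for c in cipher])  # ['TIN', 'KER']
```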
That said, RSA is becoming less popular nowadays because other algorithms – typically based on elliptic curves – can offer similar levels of security while using smaller keys and quicker encryption/decryption times. The widely used elliptic-curve algorithm ECDSA, used with a 256-bit key, is estimated to be roughly as secure as RSA with a 3000-bit key.