Chapter 22 - Finding and Generating Prime Numbers
Chapter 22 - Finding and Generating Prime Numbers
22
FINDING AND GENERATING PRIME NUMBERS
“Mathematicians have tried in vain to this day to discover some order in the sequence of prime numbers, and we have reason to
believe that it is a mystery into which the human mind will never penetrate.”
—Leonhard Euler, 18th-century mathematician
All the ciphers described in this book so far have been in existence for hundreds of years. These ciphers worked
well when hackers had to rely on pencil and paper, but they’re more vulnerable now that computers can
manipulate data trillions of times faster than a person can. Another problem with these classical ciphers is that
they use the same key for encryption and decryption. Using one key causes problems when you’re trying to send
an encrypted message: for example, how can you securely send the key to decrypt it?
In Chapter 23, you’ll learn how the public key cipher improves on the old ciphers by using very large prime
numbers to create two keys: a public key for encryption and a private key for decryption. To generate prime
numbers for the public key cipher’s keys, you’ll need to learn about some properties of prime numbers (and the
difficulty of factoring large numbers) that make the cipher possible. In this chapter, you’ll exploit these features
of prime numbers to create the primeNum.py module, which can generate keys by quickly determining whether
or not a number is prime.
https://inventwithpython.com/cracking/chapter22.html 1/11
3/5/2023 Chapter 22 - Finding and Generating Prime Numbers
There is an infinite number of prime numbers, which means there is no such thing as the largest prime
number. They continue to get bigger and bigger, just like regular numbers. The public key cipher uses large
prime numbers to make the key too big to brute-force.
Prime numbers can be difficult to find, and large prime numbers, such as those used for public keys, are even
harder to find. To generate large prime numbers as public keys, we’ll find a random large number and then check
whether the number is prime by using a primality test. If the number is prime according to the primality test,
we’ll use it; otherwise, we’ll continue creating and testing large numbers until we find one that is prime.
Let’s look at some very large numbers to illustrate how big the prime numbers used in the public key cipher
can be.
A googol is 10 raised to the power of 100 and is written as a 1 followed by 100 zeros:
10,000,000,000,000,000,000,000,000,000,000,000,000,000,000,
000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,
000,000,000,000
10,000,000,000,000,000,000,000,000,000,000,000,000,000,000,
000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,
000,000,000,000,000,000,000,000,000,000,000,000,000
But these are tiny numbers compared to the prime numbers the public key cipher uses. For example, a typical
prime number used in the public key program has hundreds of digits and might look something like this:
112,829,754,900,439,506,175,719,191,782,841,802,172,556,768,
253,593,054,977,186,2355,84,979,780,304,652,423,405,148,425,
447,063,090,165,759,070,742,102,132,335,103,295,947,000,718,
386,333,756,395,799,633,478,227,612,244,071,875,721,006,813,
307,628,061,280,861,610,153,485,352,017,238,548,269,452,852,
733,818,231,045,171,038,838,387,845,888,589,411,762,622,041,
204,120,706,150,518,465,720,862,068,595,814,264,819
This number is so big that I bet you didn’t even notice the typo in it.
A few other interesting features of prime numbers are also useful to know. Because all even numbers are
multiples of two, 2 is the only possible even prime number. Also, multiplying two prime numbers should result
in a number whose only factors are 1, itself, and the two prime numbers that were multiplied. (For example,
multiplying prime numbers 3 and 7 results in 21, whose only factors are 1, 21, 3, and 7.)
Integers that are not prime are called composite numbers because they’re composed of at least two factors
besides 1 and the number. Every composite number has a prime factorization, which is a factorization composed
of only prime numbers. For example, the composite number 1386 is composed of the prime numbers 2, 3, 7, and
11, because 2 × 3 × 3 × 7 × 11 = 1386. Each composite number’s prime factorization is unique to that composite
number.
We’ll use this information about what makes a number prime to write a module that can determine whether a
small number is prime and generate prime numbers. The module, primeNum.py, will define the following
functions:
isPrimeTrialDiv() uses the trial division algorithm to return True if the number passed to it is prime
or False if the number passed to it is not prime.
primeSieve() uses the sieve of Eratosthenes algorithm to generate prime numbers.
rabinMiller() uses the Rabin-Miller algorithm to check whether the number passed to it is prime. This
algorithm, unlike the trial division algorithm, can work quickly on very large numbers. This function is called
not directly but rather by isPrime().
isPrime() is called when the user must determine whether a large integer is prime or not.
https://inventwithpython.com/cracking/chapter22.html 2/11
3/5/2023 Chapter 22 - Finding and Generating Prime Numbers
generateLargePrime() returns a large prime number that is hundreds of digits long. This function will
be used in the makePublicPrivateKeys.py program in Chapter 23.
primeNum.py
https://inventwithpython.com/cracking/chapter22.html 4/11
3/5/2023 Chapter 22 - Finding and Generating Prime Numbers
>>> primeNum.isPrime(13)
True
Importing the primeNum.py module lets us generate a very large prime number using
the generateLargePrime() function. It also lets us pass any number, large or small, to the isPrime() function
to determine whether it’s a prime number.
49 ÷ 2 = 24 remainder 1
49 ÷ 3 = 16 remainder 1
49 ÷ 4 = 12 remainder 1
49 ÷ 5 = 9 remainder 4
49 ÷ 6 = 8 remainder 1
49 ÷ 7 = 7 remainder 0
Because 7 evenly divides 49 with a remainder of 0, we know that 7 is a factor of 49. This means that 49 can’t
be a prime number, because it has at least one factor other than 1 and itself.
We can expedite this process by dividing by prime numbers only, not composite numbers. As mentioned
earlier, composite numbers are nothing more than composites of prime numbers. This means that if 2 can’t
divide 49 evenly, then a composite number such as 6, whose factors include 2, won’t be able to divide 49 evenly
either. In other words, any number that 6 divides evenly can also be divided by 2 evenly, because 2 is a factor of
6. Figure 22-1 illustrates this concept.
Figure 22-1: Any number that divides evenly by 6 also divides evenly by 2.
13 ÷ 2 = 6 remainder 1
13 ÷ 3 = 4 remainder 1
We only have to test integers up to (and including) the square root of the number we’re testing for primality.
The square root of a number refers to the number that when multiplied by itself results in that original number.
For example, the square root of 25 is 5 because 5 × 5 = 25. Because a number can’t have two factors greater than
https://inventwithpython.com/cracking/chapter22.html 5/11
3/5/2023 Chapter 22 - Finding and Generating Prime Numbers
its square root, we can limit the trial division algorithm test to integers less than the number’s square root. The
square root of 13 is about 3.6, so we only need to divide by 2 and 3 to determine that 13 is prime.
As another example, the number 16 has a square root of 4. Multiplying two numbers greater than 4 will
always result in a number greater than 16, and any factors of 16 greater than 4 will always be paired with factors
smaller than 4, such as 8 × 2. Therefore, you’ll find all the factors greater than the square root by finding any
factors less than the square root.
To find a number’s square root in Python, you can use the math.sqrt() function. Enter the following into
the interactive shell to see some examples of how this function works:
7. def isPrimeTrialDiv(num):
8. # Returns True if num is a prime number, otherwise False.
9.
10. # Uses the trial division algorithm for testing primality.
11.
12. # All numbers less than 2 are not prime:
13. if num < 2:
14. return False
Line 13 checks whether num is less than 2, and if it is, the function returns False, because a number less than
2 cannot be prime.
Line 17 begins the for loop that implements the trial division algorithm. It also takes the square root
of numusing math.sqrt() and uses the returned floating-point value to set the upper limit of the range of
integers we’ll test.
16. # See if num is divisible by any number up to the square root of num:
17. for i in range(2, int(math.sqrt(num)) + 1):
18. if num % i == 0:
19. return False
20. return True
Line 18 checks whether the remainder is 0 using the mod operator (%). If the remainder is 0, num is divisible
by i and is therefore not a prime number, and the loop returns False. If the for loop on line 17 never
returns False, the function returns True on line 20 to indicate that num is likely a prime number.
The trial division algorithm in the isPrimeTrialDiv() function is useful, but it’s not the only way to test for
primality. You can also find prime numbers using the sieve of Eratosthenes.
Then we repeat the process using multiples of three: we exclude 3 and mark 6, 9, 12, 15, 18, 21, and so on as
“not prime.” We repeat this process for the multiples of four excluding 4, the multiples of five except for 5, and
so on until we get through multiples of eight. We stop at 8 because it is larger than 7.071, which is the square
root of 50. All the multiples of 9, 10, 11, and so on will already have been marked, because any number that is a
factor larger than the square root will be paired with a factor smaller than the square root, which we’ve already
marked.
The completed sieve should look like Figure 22-4, with prime numbers shown in white boxes.
https://inventwithpython.com/cracking/chapter22.html 7/11
3/5/2023 Chapter 22 - Finding and Generating Prime Numbers
Using the sieve of Eratosthenes, we found that the prime numbers less than 50 are 2, 3, 5, 7, 11, 13, 17, 19,
23, 29, 31, 37, 41, 43, and 47. This sieve algorithm is best used when you want to quickly find all the prime
numbers in a certain range of numbers. It’s much faster than using the previous trial division algorithm to check
each number individually.
Line 27 creates a list of Boolean True values that represent the length of sieveSize. The 0 and 1 indexes are
marked as False because 0 and 1 are not prime numbers.
The for loop on line 32 goes through each integer from 2 up to the square root of sieveSize:
The variable pointer starts at the first multiple of i after i, which is i * 2 on line 33. Then the whileloop
sets the pointer index in the sieve list to False, and line 36 changes pointer to point to the next multiple
of i.
After the for loop on line 32 finishes, the sieve list should now contain True for each index that is a prime
number. We create a new list, which starts as an empty list in primes, loop over the entire sieve list, and append
numbers when sieve[i] is True or when i is prime:
https://inventwithpython.com/cracking/chapter22.html 8/11
3/5/2023 Chapter 22 - Finding and Generating Prime Numbers
The primeSieve() function can find all prime numbers within a small range, and
the isPrimeTrialDiv()function can quickly determine whether a small number is prime. But what about a
large integer that is hundreds of digits long?
If we pass a large integer to isPrimeTrialDiv(), it would take several seconds to determine whether or not
it’s prime. And if the number is hundreds of digits long, like the prime numbers we’ll use for the public key
cipher program in Chapter 23, it would take more than a trillion years just to figure out whether that number is
prime.
In the next section, you’ll learn how to determine whether a very large number is prime using the Rabin-
Miller primality test.
https://inventwithpython.com/cracking/chapter22.html 9/11
3/5/2023 Chapter 22 - Finding and Generating Prime Numbers
Don’t worry about how this code works. The important concept to keep in mind is that if
the rabinMiller() function returns True, the num argument is very likely to be prime.
If rabinMiller() returns False, num is definitely composite.
72. # Most of the time we can quickly determine if num is not prime
73. # by dividing by the first few dozen prime numbers. This is quicker
74. # than rabinMiller() but does not detect all composites.
75. LOW_PRIMES = primeSieve(100)
We’ll use this list just as we did in isPrimeTrialDiv() and discount any numbers less than 2 (lines 81 and
82):
When num isn’t less than 2, we can use the LOW_PRIMES list as a shortcut to test num, too. Checking
whether num is divisible by all the primes less than 100 won’t definitively tell us whether the number is prime,
but it might help us find composite numbers. About 90 percent of large integers passed to isPrime() can be
detected as composite by dividing by the prime numbers less than 100. The reason is that if the number can be
evenly divided by a prime number, such as 3, you don’t have to check whether the number can be evenly divided
by composite numbers 6, 9, 12, 15, or any other multiple of 3. Dividing the number by smaller primes is much
faster than executing the slower Rabin-Miller algorithm on the number, so this shortcut helps the program
execute more quickly about 90 percent of the time isPrime() is called.
Line 85 loops through each of the prime numbers in the LOW_PRIMES list:
83. # See if any of the low prime numbers can divide num:
84. for prime in LOW_PRIMES:
85. if (num == prime):
86. return True
87. if (num % prime == 0):
88. return False
If the integer in num is the same as prime, then obviously num must be a prime number and line 86
returns True. The integer in num is modded by each prime number using the mod operator on line 87, and if the
result evaluates to 0, we know that prime divides num so num is not prime. In that case, line 88 returns False.
Those are the three quick tests we’ll perform to determine whether a number is prime. If the execution
continues past line 87, the rabinMiller() function checks num’s primality.
Line 90 calls the rabinMiller() function to determine whether the number is prime; then
the rabinMiller() function takes its return value and returns it from the isPrime() function:
Now that you know how to determine whether a number is prime, we’ll use these primality tests to generate
prime numbers. These will be used by the public key program in Chapter 23.
https://inventwithpython.com/cracking/chapter22.html 10/11
3/5/2023 Chapter 22 - Finding and Generating Prime Numbers
If num is prime, line 98 returns num. Otherwise, the infinite loop goes back to line 96 to try a new random
number. This loop continues until it finds a number that the isPrime() function determines is prime.
The generateLargePrime() function’s keysize parameter has a default value of 1024. The larger
the keysize, the more possible keys there are and the harder the cipher is to brute-force. Public key sizes are
usually calculated in terms of numbers called bits, which you’ll learn more about in Chapters 23 and 24. For
now, just know that a 1024-bit number is very large: it’s about 300 digits!
Summary
Prime numbers have fascinating properties in mathematics. As you’ll learn in Chapter 23, they’re also the
backbone of ciphers used in professional encryption software. The definition of a prime number is simple
enough: it’s a number that has only 1 and itself as factors. But determining which numbers are prime takes some
clever code.
In this chapter, we wrote the isPrimeTrialDiv() function to determine whether or not a number is prime by
modding a number by all the numbers between 2 and the square root of the number. This is the trial division
algorithm. A prime number should never have a remainder of 0 when modded by any number other than its
factors, 1 and itself. So we know that a number that has a 0 remainder is not prime.
You learned that the sieve of Eratosthenes can quickly find all prime numbers in a range of numbers, though
it uses too much memory for finding large primes.
Because the sieve of Eratosthenes and the trial division algorithm in primeNum.py aren’t fast enough to find
large prime numbers, we needed another algorithm for the public key cipher, which uses extremely large prime
numbers that are hundreds of digits long. As a workaround, you learned to use the Rabin-Miller algorithm,
which uses complex mathematical reasoning to determine whether a very large number is prime.
In Chapter 23, you’ll use the primeNum.py module to write the public key cipher program. At last, you’ll
create a cipher that’s easier to use than the one-time pad cipher but cannot be hacked by the simple hacker
techniques introduced in this book!
PRACTICE QUESTIONS
Answers to the practice questions can be found on the book’s website
at https://www.nostarch.com/crackingcodes/.
https://inventwithpython.com/cracking/chapter22.html 11/11