Note: This section is more advanced than the rest of the chapter. It is included because finite fields are often used in cryptography. In particular, finite fields appear in four places in this book. The finite field GF(28)GF(28) is used in AES (Chapter 8). Finite fields give an explanation of some phenomena that are mentioned in Section 5.2. Finally, finite fields are used in Section 21.4, Chapter 22 and in error correcting codes (Chapter 24).
Many times throughout this book, we work with the integers mod p,p, where pp is a prime. We can add, subtract, and multiply, but what distinguishes working mod pp from working mod an arbitrary integer nn is that we can divide by any number that is nonzero mod p.p. For example, if we need to solve 3x≡1(mod5),3x≡1(mod5), then we divide by 3 to obtain x≡2(mod5).x≡2(mod5). In contrast, if we want to solve 3x≡1(mod6),3x≡1(mod6), there is no solution since we cannot divide by 3(mod6).3(mod6). Loosely speaking, a set that has the operations of addition, multiplication, subtraction, and division by nonzero elements is called a field. We also require that the associative, commutative, and distributive laws hold.
Example
The basic examples of fields are the real numbers, the complex numbers, the rational numbers, and the integers mod a prime. The set of all integers is not a field since we sometimes cannot divide and obtain an answer in the set (for example, 4/3 is not an integer).
Example
Here is a field with four elements. Consider the set
GF(4)={0,1,ω,ω2},
GF(4)={0,1,ω,ω2},
with the following laws:
0+x=x0+x=x for all x.x.
x+x=0x+x=0 for all x.x.
1⋅x=x1⋅x=x for all x.x.
ω+1=ω2.ω+1=ω2.
Addition and multiplication are commutative and associative, and the distributive law x(y+z)=xy+xzx(y+z)=xy+xz holds for all x,y,z.x,y,z.
Since
ω3=ω⋅ω2=ω⋅(1+ω)=ω+ω2=ω+(1+ω)=1,
ω3=ω⋅ω2=ω⋅(1+ω)=ω+ω2=ω+(1+ω)=1,
we see that ω2ω2 is the multiplicative inverse of ω.ω. Therefore, every nonzero element of GF(4)GF(4) has a multiplicative inverse, and GF(4)GF(4) is a field with four elements.
In general, a field is a set containing elements 0 and 1 (with 1≠01≠0) and satisfying the following:
It has a multiplication and addition satisfying (1), (3), (5) in the preceding list.
Every element has an additive inverse (for each x,x, this means there exists an element −x−x such that x+(−x)=0x+(−x)=0).
Every nonzero element has a multiplicative inverse.
A field is closed under subtraction. To compute x−y,x−y, simply compute x+(−y).x+(−y).
The set of 2×22×2 matrices with real entries is not a field for two reasons. First, the multiplication is not commutative. Second, there are nonzero matrices that do not have inverses (and therefore we cannot divide by them). The set of nonnegative real numbers is not a field. We can add, multiply, and divide, but sometimes when we subtract the answer is not in the set.
For every power pnpn of a prime, there is exactly one finite field with pnpn elements, and these are the only finite fields. We’ll soon show how to construct them, but first let’s point out that if n>1,n>1, then the integers mod pnpn do not form a field. The congruence px≡1(modpn)px≡1(modpn) does not have a solution, so we cannot divide by p,p, even though p≡0(modpn).p≡0(modpn). Therefore, we need more complicated constructions to produce fields with pnpn elements.
The field with pnpn elements is called GF(pn).GF(pn). The “GF” is for “Galois field,” named for the French mathematician Evariste Galois (1811–1832), who did some early work related to fields.
Example, continued
Here is another way to produce the field GF(4).GF(4). Let Z2[X]Z2[X] be the set of polynomials whose coefficients are integers mod 2. For example, 1+X3+X61+X3+X6 and XX are in this set. Also, the constant polynomials 00 and 11 are in Z2[X].Z2[X]. We can add, subtract, and multiply in this set, as long as we work with the coefficients mod 2. For example,
(X3+X+1)(X+1)=X4+X3+X2+1
(X3+X+1)(X+1)=X4+X3+X2+1
since the term 2X2X disappears mod 2. The important property for our purposes is that we can perform division with remainder, just as with the integers. For example, suppose we divide X2+X+1X2+X+1 into X4+X3+1.X4+X3+1. We can do this by long division, just as with numbers:
In words, what we did was to divide by X2+X+1X2+X+1 and obtain the X2X2 as the first term of the quotient. Then we multiplied this X2X2 times X2+X+1X2+X+1 to get X4+X3+X2,X4+X3+X2, which we subtracted from X4+X3+1,X4+X3+1, leaving X2+1.X2+1. We divided this X2+1X2+1 by X2+X+1X2+X+1 and obtained the second term of the quotient, namely 1. Multiplying 11 times X2+X+1X2+X+1 and subtracting from X2+1X2+1 left the remainder X.X. Since the degree of the polynomial XX is less than the degree of X2+X+1,X2+X+1, we stopped. The quotient was X2+1X2+1 and the remainder was X:X:
X4+X3+1=(X2+1)(X2+X+1)+X.
X4+X3+1=(X2+1)(X2+X+1)+X.
We can write this as
X4+X3+1≡X(modX2+X+1).
X4+X3+1≡X(modX2+X+1).
Whenever we divide by X2+X+1X2+X+1 we can obtain a remainder that is either 0 or a polynomial of degree at most 1 (if the remainder had degree 2 or more, we could continue dividing). Therefore, we define Z2[X](modX2+X+1)Z2[X](modX2+X+1) to be the set
{0,1,X,X+1}
{0,1,X,X+1}
of polynomials of degree at most 1, since these are the remainders that we obtain when we divide by X2+X+1.X2+X+1. Addition, subtraction, and multiplication are done mod X2+X+1.X2+X+1. This is completely analogous to what happens when we work with integers mod n.n. In the present situation, we say that two polynomials f(X)f(X) and g(X)g(X) are congruent mod X2+X+1,X2+X+1, written f(X)≡g(X)(modX2+X+1),f(X)≡g(X)(modX2+X+1), if f(X)f(X) and g(X)g(X) have the same remainder when divided by X2+X+1.X2+X+1. Another way of saying this is that f(X)−g(X)f(X)−g(X) is a multiple of X2+X+1.X2+X+1. This means that there is a polynomial h(X)h(X) such that f(X)−g(X)=(X2+X+1)h(X).f(X)−g(X)=(X2+X+1)h(X).
Now let’s multiply in Z2[X](modX2+X+1).Z2[X](modX2+X+1). For example,
X⋅X=X2≡X+1(modX2+X+1).
X⋅X=X2≡X+1(modX2+X+1).
(It might seem that the right side should be −X−1,−X−1, but recall that we are working with coefficients mod 2, so +1+1 and −1−1 are the same.) As another example, we have
X3≡X⋅X2≡X⋅(X+1)≡X2+X≡1(modX2+X+1).
X3≡X⋅X2≡X⋅(X+1)≡X2+X≡1(modX2+X+1).
It is easy to see that we are working with the set GF(4)GF(4) from before, with XX in place of ω.ω.
Working with Z2[X]Z2[X] mod a polynomial can be used to produce finite fields. But we cannot work mod an arbitrary polynomial. The polynomial must be irreducible, which means that it doesn’t factor into polynomials of lower degree mod 2. For example, X2+1,X2+1, which is irreducible when we are working with real numbers, is not irreducible when the coefficients are taken mod 2 since X2+1=(X+1)(X+1)X2+1=(X+1)(X+1) when we are working mod 2. However, X2+X+1X2+X+1 is irreducible: Suppose it factors mod 2 into polynomials of lower degree. The only possible factors mod 2 are XX and X+1,X+1, and X2+X+1X2+X+1 is not a multiple of either of these, even mod 2.
Here is the general procedure for constructing a finite field with pnpn elements, where pp is prime and n≥1.n≥1. We let ZpZp denote the integers mod p.p.
Zp[X]Zp[X] is the set of polynomials with coefficients mod p.p.
Choose P(X)P(X) to be an irreducible polynomial mod pp of degree n.n.
Let GF(pn)GF(pn) be Zp[X]Zp[X] mod P(X).P(X). Then GF(pn)GF(pn) is a field with pnpn elements.
The fact that GF(pn)GF(pn) has pnpn elements is easy to see. The possible remainders after dividing by P(X)P(X) are the polynomials of the form a0+a1X+⋯+an−1Xn−1,a0+a1X+⋯+an−1Xn−1, where the coefficients are integers mod p.p. There are pp choices for each coefficient, hence pnpn possible remainders.
For each n,n, there are irreducible polynomials mod pp of degree n,n, so this construction produces fields with pnpn elements for each n≥1.n≥1. What happens if we do the same construction for two different polynomials P1(X)P1(X) and P2(X),P2(X), both of degree nn? We obtain two fields, call them GF(pn)′GF(pn)′ and GF(pn)′′.GF(pn)′′. It is possible to show that these are essentially the same field (the technical term is that the two fields are isomorphic), though this is not obvious since multiplication mod P1(X)P1(X) is not the same as multiplication mod P2(X).P2(X).
3.11.1 Division
We can easily add, subtract, and multiply polynomials in Zp[X],Zp[X], but division is a little more subtle. Let’s look at an example. The polynomial X8+X4+X3+X+1X8+X4+X3+X+1 is irreducible in Z2[X]Z2[X] (although there are faster methods, one way to show it is irreducible is to divide it by all polynomials of smaller degree in Z2[X]Z2[X]). Consider the field
GF(28)=Z2[X](modX8+X4+X3+X+1).
GF(28)=Z2[X](modX8+X4+X3+X+1).
Since X7+X6+X3+X+1X7+X6+X3+X+1 is not 0, it should have an inverse. The inverse is found using the analog of the extended Euclidean algorithm. First, perform the gcd calculation for gcd(X7+X6+X3+X+1,X8+X4+X3+X+1).gcd(X7+X6+X3+X+1,X8+X4+X3+X+1). The procedure (remainder →→ divisor →→ dividend →→ ignore) is the same as for integers:
The last remainder is 1, which tells us that the “greatest common divisor” of X7+X6+X3+X+1X7+X6+X3+X+1 and X8+X4+X3+X+1X8+X4+X3+X+1 is 1. Of course, this must be the case, since X8+X4+X3+X+1X8+X4+X3+X+1 is irreducible, so its only factors are 1 and itself.
Now work the Extended Euclidean algorithm to express 1 as a linear combination of X7+X6+X3+X+1X7+X6+X3+X+1 and X8+X4+X3+X+1:X8+X4+X3+X+1:
which means that X2X2 is the multiplicative inverse of X7+X6+X3+X+1.X7+X6+X3+X+1. Whenever we need to divide by X7+X6+X3+X+1,X7+X6+X3+X+1, we can instead multiply by X2.X2. This is the analog of what we did when working with the usual integers mod p.p.
3.11.2 GF(28)
In Chapter 8, we discuss AES, which uses GF(28),GF(28), so let’s look at this field a little more closely. We’ll work mod the irreducible polynomial X8+X4+X3+X+1,X8+X4+X3+X+1, since that is the one used by AES. However, there are other irreducible polynomials of degree 8, and any one of them would lead to similar calculations. Every element can be represented uniquely as a polynomial
b7X7+b6X6+b5X5+b4X4+b3X3+b2X2+b1X+b0,
b7X7+b6X6+b5X5+b4X4+b3X3+b2X2+b1X+b0,
where each bibi is 0 or 1. The 8 bits b7b6b5b4b3b2b1b0b7b6b5b4b3b2b1b0 represent a byte, so we can represent the elements of GF(28)GF(28) as 8-bit bytes. For example, the polynomial X7+X6+X3+X+1X7+X6+X3+X+1 becomes 11001011.11001011. Addition is the XOR of the bits:
Multiplication is more subtle and does not have as easy an interpretation. That is because we are working mod the polynomial X8+X4+X3+X+1,X8+X4+X3+X+1, which we can represent by the 9 bits 100011011. First, let’s multiply X7+X6+X3+X+1X7+X6+X3+X+1 by X:X: With polynomials, we calculate
11001011→110010110(shift left and append a 0)→110010110⊕100011011(subtract X8+X4+X3+X+1)=010001101,
11001011→→=110010110110010110⊕100011011010001101,(shift left and append a 0)(subtract X8+X4+X3+X+1)
which corresponds to the preceding answer. In general, we can multiply by XX by the following algorithm:
Shift left and append a 0 as the last bit.
If the first bit is 0, stop.
If the first bit is 1, XOR with 100011011.100011011.
The reason we stop in step 2 is that if the first bit is 0 then the polynomial still has degree less than 8 after we multiply by X,X, so it does not need to be reduced. To multiply by higher powers of X,X, multiply by XX several times. For example, multiplication by X3X3 can be done with three shifts and at most three XORs. Multiplication by an arbitrary polynomial can be accomplished by multiplying by the various powers of XX appearing in that polynomial, then adding (i.e., XORing) the results.
In summary, we see that the field operations of addition and multiplication in GF(28)GF(28) can be carried out very efficiently. Similar considerations apply to any finite field.
The analogy between the integers mod a prime and polynomials mod an irreducible polynomial is quite remarkable. We summarize in the following.
integers⟷Zp[X]prime number q⟷irreducible P(X)ofdegreenZq⟷Zp[X](modP(X))field with q elements⟷field with pn elements
integersprime number qZqfield with q elements⟷⟷⟷⟷Zp[X]irreducible P(X)ofdegreenZp[X](modP(X))field with pn elements
Let GF(pn)∗GF(pn)∗ denote the nonzero elements of GF(pn).GF(pn). This set, which has pn−1pn−1 elements, is closed under multiplication, just as the integers not congruent to 0 mod pp are closed under multiplication. It can be shown that there is a generating polynomial g(X)g(X) such that every element in GF(pn)∗GF(pn)∗ can be expressed as a power of g(X).g(X). This also means that the smallest exponent kk such that g(X)k≡1g(X)k≡1 is pn−1.pn−1. This is the analog of a primitive root for primes. There are ϕ(pn−1)ϕ(pn−1) such generating polynomials, where ϕϕ is Euler’s function. An interesting situation occurs when p=2p=2 and 2n−12n−1 is prime. In this case, every nonzero polynomial f(X)≠1f(X)≠1 in GF(2n)GF(2n) is a generating polynomial. (Remark, for those who know some group theory: The set GF(2n)∗GF(2n)∗ is a group of prime order in this case, so every element except the identity is a generator.)
The discrete log problem mod a prime, which we’ll discuss in Chapter 10, has an analog for finite fields; namely, given h(x),h(x), find an integer kk such that h(X)=g(X)kh(X)=g(X)k in GF(pn).GF(pn). Finding such a kk is believed to be very hard in most situations.
3.11.3 LFSR Sequences
We can now explain a phenomenon that is mentioned in Section 5.2 on LFSR sequences.
Suppose that we have a recurrence relation
xn+m≡c0xn+c1xn+1+⋯+cm−1xn+m−1(mod2).
xn+m≡c0xn+c1xn+1+⋯+cm−1xn+m−1(mod2).
For simplicity, we assume that the associated polynomial
P(X)=Xm+cm−1Xm−1+cm−2Xm−2+⋯+c0
P(X)=Xm+cm−1Xm−1+cm−2Xm−2+⋯+c0
is irreducible mod 2. Then Z2[X](modP(X))Z2[X](modP(X)) is the field GF(2m).GF(2m). We regard GF(2m)GF(2m) as a vector space over Z2Z2 with basis {1,X,X2,X3,…,Xm−1}.{1,X,X2,X3,…,Xm−1}. Multiplication by XX gives a linear transformation of this vector space. Since
Therefore, multiplication by MXMX shifts the indices by 1. It follows easily that multiplication on the right by the matrix MjXMjX sends (x1,x2,…,xm)(x1,x2,…,xm) to (x1+j,x2+j,…,xm+j).(x1+j,x2+j,…,xm+j). If MjX≡I,MjX≡I, the identity matrix, this must be the original vector (x1,x2,…,xm).(x1,x2,…,xm). Since there are 2m−12m−1 nonzero elements in GF(2m),GF(2m), it follows from Lagrange’s theorem in group theory that X2m−1≡1,X2m−1≡1, which implies that M2m−1X=I.M2m−1X=I. Therefore, we know that x1≡x2m,x2≡x2m+1,….x1≡x2m,x2≡x2m+1,….
For any set of initial values (we’ll assume that at least one initial value is nonzero), the sequence will repeat after kk terms, where kk is the smallest positive integer such that Xk≡1(modP(X)).Xk≡1(modP(X)). It can be shown that kk divides 2m−1.2m−1.
In fact, the period of such a sequence is exactly k.k. This can be proved as follows, using a few results from linear algebra: Let v=(x1,…,xm)≠0v=(x1,…,xm)≠0 be the row vector of initial values. The sequence repeats when vMjX=v.vMjX=v. This means that the nonzero row vector vv is in the left null space of the matrix MjX−I,MjX−I, so det(MjX−I)=0.det(MjX−I)=0. But this means that there is a nonzero column vector w=(a0,…,am−1)Tw=(a0,…,am−1)T in the right null space of MjX−I.MjX−I. That is, MjXw=w.MjXw=w. Since the matrix MjXMjX represents the linear transformation given by multiplication by XjXj with respect to the basis {1,X,…,Xm−1},{1,X,…,Xm−1}, this can be changed back into a relation among polynomials:
Xj(a0+a1X+⋯+am−1Xm−1)≡a0+a1X+⋯+am−1Xm−1(modP(X)).
Xj(a0+a1X+⋯+am−1Xm−1)≡a0+a1X+⋯+am−1Xm−1(modP(X)).
But a0+a1X+⋯+am−1Xm−1(modP(X))a0+a1X+⋯+am−1Xm−1(modP(X)) is a nonzero element of the field GF(2m),GF(2m), so we can divide by this element to get Xj≡1(modP(X)).Xj≡1(modP(X)). Since j=kj=k is the first time this happens, the sequence first repeats after kk terms, so it has period k.k.
As mentioned previously, when 2m−12m−1 is prime, all polynomials (except 0 and 1) are generating polynomials for GF(2m).GF(2m). In particular, XX is a generating polynomial and therefore k=2m−1k=2m−1 is the period of the recurrence.