THE ROTA METHOD FOR SOLVING POLYNOMIAL EQUATIONS: A MODERN APPLICATION OF INVARIANT THEORY Kristofer Jorgenson Sul Ross State University, Box C-18 Alpine, TX 79832 United States of America
[email protected] Abstract: Algebraic invariants are defined for the purpose of gaining insights into solving polynomial equations. Polynomial invariants are disclosed here as an alternative to and to clarify the umbral method of Gian-Carlo Rota. A process for solving cubic polynomial equations is examined and extended to quintic (or 5th degree) polynomial equations. It is proved that a general cubic polynomial is “apolar” to a quadratic polynomial. It is proved that a quadratic polynomial and cubic polynomial which are apolar either both have repeated roots or both have distinct roots. In the case of repeated roots, these roots are shared by the cubic and quadratic polynomials that are apolar. In the case, in which the derived quadratic which is apolar to a given cubic has distinct roots, it is shown the cubic polynomial px may be transformed to px = c 1 x − r 1 3 + c 2 x − r 2 3 , where r 1 and r 2 are the distinct roots of the quadratic polynomial. This will allow the roots of the cubic to be found using algebraic operations. It will be shown that these methods can be extended to show that a given quintic polynomial is in general apolar to a cubic polynomial. Some remaining questions are posed at the end of the article. AMS Subject Classification: 13A50 Key Words: Invariant Theory, Invariants, Polynomial Equations. 1. Introduction Gian-Carlo Rota (1932-1999) made contributions to combinatorics with important papers published in the 1960s, and he also made contributions to invariant theory. He expressed the need to bring the theory of invariants back into the forefront of mathematics after being eclipsed by efforts to develop the language of modern algebra during the 20th century. Invariant Theory is important for gaining knowledge of the intrinsic properties of geometry that transcend coordinate systems. Geometric facts are classically described by means of equations which require a choice of coordinates. Since Descartes, geometric facts have been described using equations in the coordinates x 1 , x 2 , . . . , x n . If equations are to express geometric properties, then they must hold no matter what coordinate system is chosen. In other words, equations that describe geometric facts must be invariant under changes of coordinates.
Consider the change of variables given by the translation x ↦ x + c , where c is a constant, for a quadratic polynomial qx = x 2 + 2b 1 x + b 2 , which is written in a special way. If qx and qx + c each have a double root, then we say this property is invariant under translation. Let’s show this property holds in each case. Since qx = x 2 + 2b 1 x + b 2 = x 2 + 2b 1 x + b 21 + b 2 − b 21 = x + b 1 2 + b 2 − b 21 , q has a double root if and only if b 2 − b 21 = 0 . Likewise qx + c = x + c 2 + 2b 1 x + c + b 2 = x + c + b 1 2 + b 2 − b 21 has a double root if and only if b 2 − b 21 = 0 . So the property that a quadratic polynomial has a repeated (or double) root is invariant under translation, and the discriminant Db 1 , b 2 = b 2 − b 21 is the desired invariant. Example If qx = x 2 + 25x + 17, then qx = x + 5 2 − 8, and with b 1 = 5 and b 2 = 17, Db 1 , b 2 = 17 − 5 2 = −8 . Therefore, qx has two distinct roots. If c = 4, qx + c = x + c 2 + 2 ⋅ 5x + c + 17 = x + 4 2 + 2 ⋅ 5x + 4 + 17 = x + 4 + 5 2 + 17 − 5 2 = x + 9 2 − 8 = x 2 + 29x + 73 And we see that D73, 9 = 73 − 9 2 = −8. In either the case, the invariant D evaluated at qx or qx + 4 is −8, and so neither of these quadratic functions has a double root. Invariant theory is concerned with the problem of finding all invariants of a given set of polynomials, as well as their significance. The significance of an invariant is that every property of polynomials which is invariant under translations is expressed by the vanishing of one or more invariants. In other words, any set of polynomials which is invariant under translations is the same set as a set of polynomials obtained by setting to zero a set of invariants of such polynomials. To clarify these statements, we introduce the binomial notation.
2
2. Binomial Notation Previously we saw a quadratic polynomial of the form qx = x 2 + 2b 1 x + b 2 . A cubic polyomial written in “binomial” notation has the form px = a 0 x 3 + 3a 1 x 2 + 3a 2 x + a 3 so that the coefficients with respect to x include the binomial coefficients n n! = j!n−j! , where n = deg x px, and each term has the form j n
a n−j x j , whereby 0 ≤ j ≤ n .
j In this notation, a quintic polynomial will take the form px = a 0 x 5 + 5a 1 x 4 + 10a 2 x 3 + 10a 3 x 2 + 5a 4 x + a 5 . We assume in this paper that all polynomials are in Cx , where C is the field of complex numbers. That is, all polynomials have complex coefficients. 3. Translation Polynomials Let px and qx be polynomials in Cx . Generally we write px and qx using the binomial notation: n n n px = a 0 x n + a 1 x n−1 + a 2 x n−2 + ⋅ ⋅ ⋅ + a n−1 x + a n 1 2 n−1 qx = b 0 x k +
k
k
b 1 x k−1 +
b 2 x k−2 + ⋅ ⋅ ⋅ +
k
1 2 k−1 where k = deg x qx, n = deg x px, and we assume k ≤ n .
b k−1 x + b k ,
Definition If T c is the translation operator on a polynomial px defined by T px = px + c, we can write n n n px + c = p 0 cx n + p 1 cx n−1 + p 2 cx n−2 + ⋅ ⋅ ⋅ + p n−1 cx + 1 2 n−1 where p j c, which we refer to as a translation polynomial, is a degree j polynomial in c given by c
p j c = c j a 0 +
j 1
c j−1 a 1 +
j 2
c j−2 a 2 + ⋅ ⋅ ⋅ +a j
Example Consider the cubic polynomial px = a 0 x 3 + 3a 1 x 2 + 3a 2 x + a 3 .
3
Then T c px = px + c = a 0 x + c 3 + 3a 1 x + c 2 + 3a 2 x + c + a 3 = a 0 x 3 + 3a 0 c + a 1 x 2 + 3a 0 c 2 + 2a 1 c + a 2 x + a 0 c 3 + 3a 1 c 2 + 3a 2 c + a 3 , so the translation polynomials are p 0 c = a 0 , p 1 c = a 0 c + a 1 , p 2 c = a 0 c 2 + 2a 1 c + a 2 and p 3 c = a 0 c 3 + 3a 1 c 2 + 3a 2 c + a 3 . Our main focus will be on invariants that depend on the coefficients of two polynomials. 4. The Apolar Invariant A 2, 3 Definitions A polynomial Ia 0 , a 1 , a 2 , . . . , a n , b 0 , b 1 , b 2 , . . . , b k , x in the variables a 0 , a 1 , a 2 , . . . , a n , b 0 , b 1 , b 2 , . . . , b k , x is said to be an invariant of the two polynomials px and qx when Ia 0 , a 1 , a 2 , . . . , a n , b 0 , b 1 , b 2 , . . . , b k , x = Ip 0 c, p 1 c, p 2 c, . . . , p n c, q 0 c, q 1 c, q 2 c, . . . , q k c, x + c , for all complex numbers c . Sometimes these invariants are called covariants. For convenience, we use the notation Ipx, qx . Consequently, I is an invariant if and only Ipx, qx = IT c px, T c qx for all c . Remark Here we require that Ipx, qx equals IT c px, T c qx as a polynomial in n + k + 3 indeterminates. This means that c disappears via cancellation. Definitions The apolar invariant A 2,3 qx, px is defined for a quadratic polynomial qx = b 0 x 2 + 2b 1 x + b 2 and cubic polynomial px = a 0 x 3 + 3a 1 x 2 + 3a 2 x + a 3 by A 2,3 qx, px = a 0 b 2 − 2a 1 b 1 + b 0 a 2 x + 2a 2 b 1 − b 0 a 3 − a 1 b 2 We say these two polynomials px and qx are A 2,3 − apolar when A 2,3 qx, px is identically 0 , that is, 0 for all x. To verify that A 2,3 is an invariant, meaning that A 2,3 qx, px = A 2,3 T c qx, T c px, we can work out the right side of this last equation by simply substituting the translation polynomials p j c in for each a j in A 2,3 qx, px = a 0 b 2 − 2a 1 b 1 + b 0 a 2 x + 2a 2 b 1 − b 0 a 3 − a 1 b 2 along with x + c for x and simplify. If A 2,3 qx, px is an invariant, then the c’s will disappear: A 2,3 T c qx, T c px = A 2,3 qx + c, px + c = a 0 b 2 + 2b 1 c + b 0 c 2 − 2a 1 + a 0 cb 1 + b 0 c + +a 2 + 2a 1 ⋅ c + a 0 c 2 b 0 x + c − a 1 + a 0 cb 2 + 2b 1 c + b 0 c 2 + 2a 2 + 2a 1 c + a 0 c 2 b 1 + b 0 c − b 0 a 3 + 3a 2 c + 3a 1 c 2 + a 0 c 3 = a 0 b 2 − 2a 1 b 1 + b 0 a 2 x + 2a 2 b 1 − b 0 a 3 − a 1 b 2 = A 2,3 qx, px Therefore, A 2,3 qx, px is an invariant.
4
Given px = a 0 x 3 + 3a 1 x 2 + 3a 2 x + a 3 , a qx = b 0 x 2 + 2b 1 x + b 2 is A 2,3 − apolar to px if a 2 b 0 − 2a 1 b 1 + a 0 b 2 = 0 and a 3 b 0 − 2a 2 b 1 + a 1 b 2 = 0 . So finding qx is equivalent to solving a linear system in b 0 , b 1 , b 2 with augmented matrix a 2 −2a 1 a 0 0 a 3 −2a 2 a 1 0
.
This will result generally (with some exceptions) in a 1 − dimensional vector space of polynomials qx = b 0 x 2 + 2b 1 x + b 2 that are apolar to the given px as they form the null space of a homogeneous linear system. Example Define px = 9x 3 − 24x 2 + 11x + 2. In binomial form, px = 9x 3 + 3−8x 2 + 3 113 x + 2, so a 0 = 9, a 1 = −8, a 2 = a3 = 2 . Solve the system represented by 11 −2−8 9 0 a 2 −2a 1 a 0 0 3 = a 3 −2a 2 a 1 0 2 −2 113 −8 0 =
which row reduces to
1 0 − 279 265 0 1
213 265
0 0
. So b 0 =
279 265
11 3
16
2
− 223
9
0
−8 0
11 3
,
,
b 2 and b 1 = − 213 b . 265 2
This gives a 1 − dimensional space of quadratic polynomials apolar to px described by qx = 279 b 2 + 2 − 213 b 2 x + b 2 265 265 279 426 2 = b2 x − x+1 , 265 265 or alternately by qx = t279x 2 − 426x + 265 , where t is any complex constant. Linearity Properties In a more general setting, the set of quadratic polynomials S px that are found to be A 2,3 − apolar to a given px emerges as the null space of a homogeneous linear system. Therefore S px is a vector subspace of the vector space P 2 of polynomials of degee at most 2 (which itself is isomorphic as a vector space to C 3 ). Then S px satisfies the linearity properties of a vector space. That is, 1. If qx is A 2,3 − apolar to a given px, then so is cqx for all constants c . 2. If qx and sx are each degree at most 2 and apolar to a given cubic px, then so is the sum qx + sx. In particular the 0 polynomial is apolar to a given cubic.
5
If on the other hand we are given a quadratic qx = b 0 x 2 + 2b 1 x + b 2 , the task of finding an apolar px = a 0 x 3 + 3a 1 x 2 + 3a 2 x + a 3 which satisfies the equation A 2, 3 qx, px = a 0 b 2 − 2a 1 b 1 + b 0 a 2 x − a 1 b 2 − 2a 2 b 1 + b 0 a 3 = 0 , is equivalent to solving a linear system with augmented matrix b 2 −2b 1 0
b2
b0
0 0
−2b 1 b 0 0
.
This in general will produce a 2 − dimensional subspace of polynomials S qx of degree at most 3 that are A 2,3 − apolar to the given quadratic qx . This leads to an important property: Theorem 1 If qx is 2nd degree and r is a root of qx, then qx is apolar to the cubic fx = x − r 3 . Proof: This can be shown by expanding fx : fx = x 3 − 3x 2 r + 3xr 2 − r 3 In this case, a 0 = 1, a 1 = −r, a 2 = r 2 , and a 3 = −r 3 . Then A 2, 3 qx, fx = a 0 b 2 − 2a 1 b 1 + b 0 a 2 x − a 1 b 2 − 2a 2 b 1 + b 0 a 3 = 1b 2 − 2−rb 1 + b 0 r 2 x − −rb 2 − 2r 2 b 1 + b 0 −r 3 = b 2 + 2rb 1 + b 0 r 2 x − b 2 + 2rb 1 + b 0 r 2 −r = qrx − qr−r = 0 since qr = 0 . Therefore qx is apolar to x − r 3 . ■ If qx has roots r 1 and r 2 , the Linearity Properties (1) and (2) imply that qx is A 2,3 − apolar to c 1 x − r 1 3 + c 2 x − r 2 3 for all choices of constants c 1 and c 2 . In the case in which the roots r 1 and r 2 of qx are distinct, then x − r 1 3 and x − r 2 3 are linearly independent and generate a 2 − dimensional vector space. Assuming now that qx is a quadratic derived initially as apolar to a given cubic px, then the the 2 − dimensional vector space of cubics S qx that are apolar to qx and which contains px equals S qx = c 1 x − r 1 3 + c 2 x − r 2 3 | c 1 , c 2 ∈ C , so px = c 1 x − r 1 3 + c 2 x − r 2 3 for some particular c 1 and c 2 . Example In the previous example, we found that qx = 279x 2 + 2−213x + 265 is apolar to px = 9x 3 − 24x 2 + 11x + 2 . Solving 279x 2 + 2−213x + 265 = 0 ⇒ 279 x 2 + 2− 71 x + 265 =0 93 279 2 71 1058 ⇒ x − 93 + 2883 = 0 71 23 71 23 and we find that x = 93 + 93 i 6 and x = 93 − 93 i 6 .
6
So 3 3 px = 9x 3 − 24x 2 + 11x + 2 = c 1 x − 71 + 23 i 6 + c 2 x − 71 − 23 i 6 93 93 93 93 for some particular c 1 and c 2 . To outline the solution of such a cubic px = c 1 x − r 1 3 + c 2 x − r 2 3 note that px = c 1 x − r 1 3 + c 2 x − r 2 3 = c 1 + c 2 x 3 + −3c 1 r 1 − 3c 2 r 2 x 2 + 3c 1 r 21 + 3c 2 r 22 x − c 1 r 31 − c 2 r 32 = a 0 x 3 + 3a 1 x 2 + 3a 2 x + a 3 so c 1 + c 2 = a 0 and −3c 1 r 1 − 3c 2 r 2 = 3a 1 , which implies that c 2 = a 0 − c 1 and c 1 r 1 + a 0 − c 1 r 2 = −a 1 , 2 +a 1 1 +a 1 ⇒ c 1 = a r0 r2 −r and c 2 = a 0 − c 1 = a r0 r1 −r . 1 2 a r 1 +a 1 2 +a 1 Then px = a r0 r2 −r x − r 1 3 + r0 1 −r x − r 2 3 , so 1 2 px = 0 ⇒ ar0 r2 2−+ra1 1 x − r 1 3 = − ar0 r1 1−+ra2 1 x − r 2 3 a 0 r 1 +a 1 x − r 1 3 = − r 1 −r 2 = a 0 r 1 + a 1 a r 2 +a 1 x − r2 a0r2 + a1 r0 2 −r 1 a 0 r 1 +a 1 1 3 Letting y = x−r x−r 2 and d = a 0 r 2 +a 1 , and using a polar form for d, we solve y = d for the 3 complex roots of d : y 0 , y 1 , y 2 via De Moivre’s Theorem.
Use y k =
x k −r 1 x k −r 2
, to solve for x k =
y k r 2 −r 1 y k −1
, k = 0, 1, 2 .
In the example, with px = 9x 3 − 24x 2 + 11x + 2 the above process yields x0 =
y 0 r 2 −r 1 y 0 −1
=
71 93
−
212949813507701300000000000000 6 578603896932131660122794313233
583686282464563 6 578603896932131660122794313233
−
i
≈ −0. 138071187458. x1 =
y 1 r 2 −r 1 y 1 −1
=
15606738876411400000000000000 6 925698051533942894964544936329
+
71 93
+
248557155553381 6 925698051533942894964544936329
i
≈ 0. 804737854124. and x2 =
y 2 r 2 −r 1 y 2 −1
=
12115755716993740000000000000 6 24000000000008754428964926071
+
71 93
+
18972058280657 6 72000000000026263286894778213
i
≈2
Remark Of course, the above example used to demonstrate this more general process can be solved using the Rational Root Test, division, and quadratic methods. The previous discussion proves the following.
7
Corollary 1 Let px = a 0 x 3 + 3a 1 x 2 + 3a 2 + a 3 be a typical cubic and assume the quadratic qx is apolar to px. If qx has two distinct roots r 1 and r 2 , then a0r2 + a1 r2 − r1
px =
a0r1 + a1 r1 − r2
x − r 1 3 +
x − r 2 3 .
Remark: Although S qx from the previous example is a vector space that contains cubic polynomials and the zero polynomial, it also contains 2nd degree polynomials. 3 3 Consider x − 71 + 23 i 6 − x − 71 − 23 i 6 93 93 93 93 = 3 − 46 i 6 x2 + 3 ⋅ 93
6532 8649
i 6x−
183 218 268 119
i 6 .
218 With gx = 3 − 46 i 6 x 2 + 3 ⋅ 6532 i 6 x − 183 i 6 in the place of px with 93 8649 268 119 2 a 0 = 0, and qx = 279x + 2−213x + 265, it checks that A 2, 3 qx, gx = 0 .
Note: The previous solution of a cubic px depends on the quadratic qx that is found to be apolar to the cubic having two distinct roots r 1 , r 2 . The following proves that if a quadratic has repeated roots, then for an A 2, 3 − apolar px, px = 0 can always be solved by division or similar methods. Theorem 2 Let quadratic qx = b 0 x 2 + 2b 1 x + b 2 and cubic px be apolar. Then qx has a repeated root x = r if and only if px has a repeated root at x = r. Proof: Assuming first that qx = b 0 x 2 + 2b 1 x + b 2 has a repeated root x = r, then qx = b 0 x − r 2 . Since b 0 x − r 2 = b 0 x 2 − 2b 0 rx + b 0 r 2 , then b 1 = −b 0 r and b 2 = b 0 r 2 . Solving the equation A 2, 3 qx, px = a 0 b 2 − 2a 1 b 1 + b 0 a 2 x − a 1 b 2 − 2a 2 b 1 + b 0 a 3 = 0 for a cubic b 2 −2b 1
px = a 0 x 3 + 3a 1 x 2 + 3a 2 x + a 3 with the matrix
0
b2
0 0
b0
−2b 1 b 0 0
yields
for qx = b 0 x 2 − 2b 0 rx + b 0 r 2 b 0 r 2 2b 0 r b 0 0 0 the augmented matrix . 0 b 0 r 2 2b 0 r b 0 0 Since b 0 ≠ 0, it follows that this system is equivalent to one with the matrix r 2 2r 1 0 0 which implies that r ≠ 0 . 0 r 2 2r 1 0 Then
r 2 2r 1 0 0 0 r 2 2r 1 0
So a 0 = a 2
3 r2
+ a3
2 r3
=
1 0 − r32
, row reduces to
3ra 2 +2a 3 r3
and a 1 = −a 2
2 r
0 1 2 r
− a3
1 r2
=
− r23
0
1 0 r2 −2ra 2 −a 3 . r2
.
8
Therefore, a cubic px that is apolar to qx = b 0 x − r 2 must have the form 3ra 2 +2a 3 r3
px = Since
3a 2 r+2a 3 x+ra 3 r3
then px =
−2ra 2 −a 3 r2
x3 + 3
x 2 + 3a 2 x + a 3 . 3ra 2 +2a 3 r3
x − r 2 =
3a 2 r+2a 3 x+ra 3 r3
x3 + 3
−2ra 2 −a 3 r2
x 2 + 3a 2 x + a 3
x − r 2 , so px has the repeated root r .
On the other hand, if px has a repeated root x = r, then px = a 0 x − r 2 x − a = a 0 x 3 + −2a 0 r − a 0 ax 2 + a 0 r 2 + 2a 0 rax − a 0 r 2 a a 0 r 2 +2ra a So then a 3 = −a 0 r 2 a, a 2 = , and a 1 = − 0 2r+a . 3 3 The system with with augmented matrix a 0 −2a 1 a 2 0
equivalent to a0 −
a 0 2r+a 3
−2
a 0 r 2 +2ra 3
0
−a 0 r a
0
a 0 r 2 +2ra 3
a 3 −2a 2 a 1 0
, which is
becomes, with the above substitutions,
a 1 −2a 2 a 3 0 2 a 0 2r+a 3
a 2 −2a 1 a 0 0
2
, which row reduces to
1 0 −r 2 0
. This means that b 2 = r 2 b 0 and b 1 = −rb 0 , so that 0 1 r 0 qx = b 0 x 2 − 2rb 0 x + r 2 b 0 = b 0 x − r 2 is the typical quadratic polynomial apolar to a given cubic polynomial with repeated roots. ■ Note: Before leaving our discussion of the A 2, 3 invariant, it is worth noting that not every cubic polynomial is A 2, 3 − apolar to a nonzero quadratic polynomial. Consider px = x 3 + 3x 2 + 3x + 2 . In this case, a 0 = 1, a 1 = 1, a 2 = 1, and a 3 = 2 . In solving the linear system represented by a 2 −2a 1 a 0 0 a 3 −2a 2 a 1 0 which row reduces to we have that qx = 2 for every choice of b 2 .
1 0
0
0 1 − 1 2
1 2
0 0
=
1 −2 1 0 2 −2 1 0
,
, we get that b 0 = 0, and b 1 =
1 2
b 2 , and
b 2 x + b 2 = b 2 x + 1 is apolar to px = x 3 + 3x 2 + 3x + 2
9
5. The General Apolar Invariant A k, n Definitions For integers k and n with 1 ≤ k ≤ n, define the polynomials n−k k T i = −1 i x n−k−i a i b k − a i+1 b k−1 + ⋯ i 1 k
⋯ + −1 k−1
k−1
a k+i−1 b 1 + −1 k a k+i b 0
for 0 ≤ i ≤ n − k . The general apolar expression A k, n is defined by n−k
A k, n qx, px =∑ T i . i=0
If A k, n qx + c, px + c = A k, n qx, px for all c, then A k, n is an apolar invariant. If A k, n qx, px = 0 for all x, we will say that qx and px are A k, n − apolar. Example We know from before that for qx = b 0 x 2 + 2b 1 x + b 0 and px = a 0 x 3 + 3a 1 x 2 + 3a 2 x + a 3 , A 2, 3 qx, px = xa 0 b 2 − 2a 1 b 1 + a 2 b 0 − a 1 b 2 − 2a 2 b 1 + a 3 b 0 is an invariant, and this agrees with the above general formula when k = 2 and n = 3 . The following generalizes Th. 1. Theorem 3 Let qx be a polynomial of degree k, and let n be an integer with n ≥ k. If A k, n is an invariant, and r is a root of qx, then qx is A k, n − apolar to x − r n . Proof: Let px = x − r n . px = x n + n−rx n−1 + ⋯+
n
n 2
−r 2 x n−2 + ⋯
−r n−2 x 2 +
n−2
n
−r n−1 x + −r n
n−1
so that for px = a 0 x n +
n 1
a 1 x n−1 +
n 2
a 2 x n−2 + ⋅ ⋅ ⋅ +
n n−1
a n−1 x + a n ,
we have
10
♯
a 0 = 1, a 1 = −r, a 2 = −r 2 , and a j = −r j for all j ∈ 0, n ∩ Z. We need to show that A k, n qx, px = 0. For this, it’s sufficient to show for each i with 0 ≤ i ≤ n − k that k
aibk −
a i+1 b k−1 + ⋅ ⋅ ⋅ +−1 k−1
1
k
a k+i−1 b 1 + −1 k a k+i b 0
k−1
=0
This will prove that each term T i of A k, n qx, px is 0. Considering ♯, we have aibk −
k
a i+1 b k−1 + ⋅ ⋅ ⋅ +−1 k−1
1
−r i b k −
k
= −r i b k −
k
= −r i b k +
k
=
1
1
1
k k−1
−r i+1 b k−1 + ⋅ ⋅ ⋅ +−1 k−1 −r 1 b k−1 + ⋅ ⋅ ⋅ +−1 k−1 rb k−1 + ⋅ ⋅ ⋅ +
k k−1
a k+i−1 b 1 + −1 k a k+i b 0 k
−r k+i−1 b 1 + −1 k −r k+i b 0
k−1 k k−1
−r k−1 b 1 + −1 k −r k b 0
r k−1 b 1 + r k b 0
= −r i qr = 0 since qr = 0 . Therefore, A k, n qx, px = 0, and qx is A k, n − apolar to x − r n . ■ Theorem (Rota) For positive integers k and n with k ≤ n : (a) The dimension of the space of all polynomials of degree k which are apolar to a polynomial of degree n (with n < 2k) equals 2k − n, in general. (b) The dimension of the space of all polynomials of degree n which are apolar to a polynomial of degree k equals k. [2] Example: The Apolar Invariant A 3, 5 Definitions The apolar invariant A 3,5 qx, px is defined for a cubic polynomial qx = b 0 x 3 + 3b 1 x 2 + 3b 2 x + b 3 and quintic polynomial px = a 0 x 5 + 5a 1 x 4 + 10a 2 x 3 + 10a 3 x 2 + 5a 4 x + a 5 as A 3,5 qx, px = x 2 a 0 b 3 − 3a 1 b 2 + 3a 2 b 1 − a 3 b 0 − 2xa 1 b 3 − 3a 2 b 2 + 3a 3 b 1 − a 4 b 0 + a 2 b 3 − 3a 3 b 2 + 3a 4 b 1 − a 5 b 0 We say these polynomials px and qx are A 3,5 − apolar when A 3,5 qx, px is 0 for all x. To prove that A 3,5 qx, px is an invariant, that is,
11
A 3,5 qx, px = A 3,5 T c qx, T c px for all c, note that for A 3,5 qx, px = x 2 a 0 b 3 − 3a 1 b 2 + 3a 2 b 1 − a 3 b 0 − 2xa 1 b 3 − 3a 2 b 2 + 3a 3 b 1 − a 4 b 0 + a 2 b 3 − 3a 3 b 2 + 3a 4 b 1 − a 5 b 0 A 3,5 T c qx, T c px = x + c 2 a 0 b 3 + 3b 2 c + 3b 1 c 2 + b 0 c 3 − 3a 1 + a 0 cb 2 + 2b 1 c + ⋯ + 3a 2 + 2a 1 c + a 0 c 2 b 1 + b 0 c − a 3 + 3a 2 c + 3a 1 c 2 + a 0 c 3 b 0 −2x + ca 1 + a 0 cb 3 + 3b 2 c + 3b 1 c 2 + b 0 c 3 − 3a 2 + 2a 1 c + a 0 c 2 b 2 + 2b 1 c + b 0 ⋯ + 3a 3 + 3a 2 c + 3a 1 c 2 + a 0 c 3 b 1 + b 0 c − a 4 + 4a 3 c + 6a 2 c 2 + 4a 1 c 3 + a 0 c 4 +a 2 + 2a 1 c + a 0 c 2 b 3 + 3b 2 c + 3b 1 c 2 + b 0 c 3 − ⋯ ⋯ − 3a 3 + 3a 2 c + 3a 1 c 2 + a 0 c 3 b 2 + 2b 1 c + b 0 c 2 + ⋯ ⋯ + 3a 4 + 4a 3 c + 6a 2 c 2 + 4a 1 c 3 + a 0 c 4 b 1 + b 0 c − ⋯ ⋯ − a 5 + 5a 4 ⋅ c + 10a 3 c 2 + 10a 2 c 3 + 5a 1 c 4 + a 0 c 5 b 0 2 = x a 0 b 3 − 3a 1 b 2 + 3a 2 b 1 − a 3 b 0 − 2xa 1 b 3 − 3a 2 b 2 + 3a 3 b 1 − a 4 b 0 + a 2 b 3 − 3a 3 b 2 + 3a 4 b 1 − a 5 b 0 = A 3,5 qx, px Therefore, A 3,5 is an invariant to which Th. 3 applies. By this new definition of apolarity, given a degree 5 polynomial px = a 0 x 5 + 5a 1 x 4 + 10a 2 x 3 + 10a 3 x 2 + 5a 4 x + a 5 , the coefficients of an apolar cubic polynomial qx = b 0 x 3 + 3b 1 x 2 + 3b 2 x + b 3 would need to satisfy the equations in the 4 variables b 0 , b 1 , b 2 , b 3 : a 3 b 0 − 3a 2 b 1 + 3a 1 b 2 − a 0 b 3 = 0 a 4 b 0 − 3a 3 b 1 + 3a 2 b 2 − a 1 b 3 = 0 a 5 b 0 − 3a 4 b 1 + 3a 3 b 2 − a 2 b 3 = 0 This will generally produce a 1 − dimensional vector space of cubic polynomials each apolar to the given quintic, which agrees with Rota’s Theorem. Alternately, given a cubic polynomial qx = b 0 x 3 + 3b 1 x 2 + 3b 2 x + b 3 , the A 3,5 − invariant definition likewise gives a homogenous linear system in 6 variables a quintic px would need to satisfy: a 0 b 3 − 3a 1 b 2 + 3a 2 b 1 − a 3 b 0 =0 a 1 b 3 − 3a 2 b 2 + 3a 3 b 1 − a 4 b 0
=0
a 2 b 3 − 3a 3 b 2 + 3a 4 b 1 − a 5 b 0 = 0 which in general will result in a 3 − dimensional vector space of quintic polynomials each apolar to the given cubic. Given a quintic px = a 0 x 5 + 5a 1 x 4 + 10a 2 x 3 + 10a 3 x 2 + 5a 4 x + a 5 Th. 3 implies that if r is the root of an A 3,5 − apolar cubic polynomial qx, then qx is A 3,5 − apolar to x − r 5 . Then if r 1 , r 2 , r 3 are the distinct roots of qx, linearity properties imply that px = c 1 x − r 1 5 + c 2 x − r 2 5 + c 3 x − r 3 5 for particular
12
constants c 1 , c 2 , c 3 . For the quintic, px = 2x 5 + 7x 4 − 36x 3 − 107x 2 + 14x + 120, we need only consider using the monic associate of px, 1 ⋅ px = x 5 + 7 x 4 − 18x 3 − 107 x 2 + 7x + 60 2 2 2 in the calculations. So, let’s have a couple of definitions before delving into this example. Definitions: An associate of a polynomial px with coefficients in a field, such as the complex numbers, is any nonzero multiple of px. That is, cpx is an associate of px if c ≠ 0. A monic polynomial is a polynomial with leading coefficient of 1, so a monic associate of a polynomial px equals a10 ⋅ px, if a 0 is the leading coefficient of px. Note: Due to the linearity properties discussed previously, we can find the space of cubic polynomials that are apolar to px = 2x 5 + 7x 4 − 36x 3 − 107x 2 + 14x + 120 by finding the space of cubic polynomials apolar to the monic associate of px: 1 ⋅ px = x 5 + 72 x 4 − 18x 3 − 107 x 2 + 7x + 60 . 2 2 Further it’s sufficient to seek only the monic cubic polynomials that are apolar to fx = x 5 + 72 x 4 − 18x 3 − 107 x 2 + 7x + 60 . This will simplify the system of equations 2 3a 2 b 1 − 3a 1 b 2 + a 0 b 3 = a 3 b 0 3a 3 b 1 − 3a 2 b 2 + a 1 b 3 = a 4 b 0 3a 4 b 1 − 3a 3 b 2 + a 2 b 3 = a 5 b 0 that needs to be solved to the nonhomogeneous system 3a 2 b 1 − 3a 1 b 2 + b 3 = a 3 3a 3 b 1 − 3a 2 b 2 + a 1 b 3 = a 4 3a 4 b 1 − 3a 3 b 2 + a 2 b 3 = a 5 that has the augmented matrix 3a 2 −3a 1
1 a3
3a 3 −3a 2 a 1 a 4 3a 4 −3a 3 a 2 a 5 For fx = x 5 + 72 x 4 − 18x 3 − 107 x 2 + 7x + 60 2 7 9 = x 5 + 5 10 x 4 + 10 − 5 x 3 + 10 − 107 x2 + 5 20 107 7 a 3 = − 20 , a 4 = 5 , and a 5 = 60 . The matrix 3 − 95 −3 107 1 − 107 − 275 20 3 − 107 20 3
7 5
−3 − 95 −3 − 107 20
7 10 − 95
7 5
60
=
− 321 20 21 5
7 5
x + 60, a 1 =
− 21 10
1
− 107 20
27 5 321 20
7 10 − 95
7 5
7 10
, a 2 = − 95 ,
60
13
1 0 0 row reduces to
0 1 0 0 0 1
4657 1854 4853 927 23 741 1236
. So b 1 =
4657 1854
, b2 =
4853 927
, b3 =
23 741 1236
,
741 and qx = x 3 + 3 4657 x 2 + 3 4853 x + 231236 is apolar to 1854 927 px = 2x 5 + 7x 4 − 36x 3 − 107x 2 + 14x + 120 . To find a quadratic gx = d 0 x 2 + 2d 1 x + d 2 apolar to this qx, we solve the system of equations with augmented matrix
b 2 −2b 1 b 0 0 b 3 −2b 2 b 1 0 which row reduces to
1 0 0 1
=
7385 450 143 272 639 20 822 879 − 143 272 639
−2
4853 927 23 741 1236
0
1
0
4657 1854
0
,
.
0
7385 450 This gives d 0 = − 143 d , and d 1 = 272 639 2
−2
4657 1854 4853 927
20 822 879 143 272 639
d 2 , and
gx = −7385 450x 2 + 220 822 879x + 143 272 639 is a convenient associate. Solving gx = 0 yields the equation x 2 − 2 20 822 879 x − 143 272 639 = 0 ⇒ 7385 450 7385 450
x − 20 822 879 7385 450
2
− 1491 725 201 551 191 = 0 54 544 871 702 500
⇒ x = 20 822 879 ± 1491 725 201 551 191 7385 450 54 544 871 702 500 With gx having 2 distinct roots, Th. 2 implies that the cubic qx has 3 distinct roots r 1 , r 2 , and r 3 as well, which may be found using Cor. 1. Therefore, Th. 3 and linearity properties imply that px = k 1 x − r 1 5 + k 2 x − r 2 5 + k 3 x − r 3 5 for particular constants k 1 , k 2 , k 3 . Example: The Apolar Invariant A 2, 5 For a quadratic qx = b 0 x 2 + 2b 1 x + b 2 and quintic px = a 0 x 5 + 5a 1 x 4 + 10a 2 x 3 + 10a 3 x 2 + 5a 4 x + a 5 , the apolar expression A 2,5 qx, px = b 2 a 3 − 2b 1 a 4 + b 0 a 5 − 3xb 2 a 2 − 2b 1 a 3 + b 0 a 4 + 3x 2 b 2 a 1 − 2b 1 a 2 + b 0 a 3 − x 3 b 2 a 0 − 2b 1 a 1 + b 0 a 2 can be proved to be an invariant similar to the invariants A 2,3 and A 3,5 . However the task of finding a quadratic qx apolar to a given quintic px involves solving the system
14
b 2 a 3 − 2b 1 a 4 + b 0 a 5 = 0 b 2 a 2 − 2b 1 a 3 + b 0 a 4 = 0 b 2 a 1 − 2b 1 a 2 + b 0 a 3 = 0 b 2 a 0 − 2b 1 a 1 + b 0 a 2 = 0 which will generally yield a nonzero qx only in special cases. This agrees with Rota’s Theorem, which requires that n < 2k, which does not follow when n = 5 and k = 2. A nonzero solution to this overdetermined system, which can be solved by use of the augmented matrix a 3 −2a 4 a 5 0 a 2 −2a 3 a 4 0 a 1 −2a 2 a 3 0 a 0 −2a 1 a 2 0 which reduces to a3 0
−2a 4 −2a 23
+ 2a 2 a 4
0
0
0
0
a5
0
a4a3 − a2a5
0
−2a 33 + 4a 3 a 2 a 4 + 2a 1 a 5 a 3 − 2a 22 a 5 − 2a 1 a 24 0 0
0
only occurs if − 2a 33 + 4a 3 a 2 a 4 + 2a 1 a 5 a 3 − 2a 22 a 5 − 2a 1 a 24 = 0 , which is true only in special cases. Example For px = 2x 5 − 5x 4 + 130x 3 − 190x 2 + 485x − 211 , px = 2x 5 + 5−1x 4 + 1013x 3 + 10−19x 2 + 597x − 211, so a 0 = 2, a 1 = −1, a 2 = 13, a 3 = −19, a 4 = 97, a 5 = −211. The polynomial − 2a 33 + 4a 3 a 2 a 4 + 2a 1 a 5 a 3 − 2a 22 a 5 − 2a 1 a 24 takes on the value −2−19 3 + 4−191397 + 2−1−211−19 = −213 2 −211 − 2−197 2 = 0 So, a 3 −2a 4 a 5 0 a 2 −2a 3 a 4 0 a 1 −2a 2 a 3 0 a 0 −2a 1 a 2 0
=
−19
−297
−211 0
13
−2−19
97
0
−1
−213
−19
0
2
−2−1
13
0
, which reduces to
15
1 0 6 0 0 1
1 2
0
0 0 0 0
, gives that b 2 = −6b 0 and b 1 = − 12 b 0 .
0 0 0 0 Therefore qx = b 0 x 2 + 2b 1 x + b 2 = b 0 x 2 − x − 6 is A 2, 5 − apolar to px for all b 0 . Previous methods springing from truths governing vector spaces apply to produce that px = x + 2 5 + x − 3 5 , which can be solved by radicals, in this special case. 6. Questions 1. Theorem 2 implies that a quadratic polynomial has repeated roots if and only if an apolar cubic polynomial also has repeated roots. Can a similar connection be found between an apolar cubic and quintic? 2. In light of Question 1, given that the discriminant of a polynomial provides information regarding whether the roots of the polynomial are distinct or not, what connection(s) can be found between an apolar invariant and the discriminant of a polynomial? 3. What connections might be found between apolar invariant theory and what Galois Theory tells us regarding the solvability of a 5th-degree polynomial equation? 4. Since the form generally available for an arbitrary quintic px is px = k 1 x − r 1 5 + k 2 x − r 2 5 + k 3 x − r 3 5 , what methods might there be for solving a quintic polynomial equation of this form k 1 x − r 1 5 + k 2 x − r 2 5 + k 3 x − r 3 5 = 0 ? 5. Given that special cases of the general apolar expression A k, n have been proven to be invariants, can a general rigorous proof be found to show that A k, n is an invariant for all positive integers k and n ? References [1] Joseph P.S. Kung, Gian-Carlo Rota, The Invariant Theory of Binary Forms, Bulletin (New Series) of the American Mathematical Society, 10 (1984), 27-85. ISSN: 0273-0979 DOI: http://dx.doi.org/10.1090/S0273-0979-1984-15188-7 [2] Gian-Carlo Rota, Invariant Theory, Old and New, Colloquium Lecture delivered at the Annual Meeting of the American Mathematical Society and Mathematical Association of America, Baltimore MD, January 8, 1998. Unpublished manuscript.
16