Vesa Halava | Tero Harju

On Markov’s Undecidability Theorem for Integer Matrices

TUCS Technical Report No 758, March 2006

Vesa Halava
Department of Mathematics and TUCS - Turku Centre for Computer Science, University of Turku, FIN-20014 Turku, Finland. [email protected]

Tero Harju
Department of Mathematics and TUCS - Turku Centre for Computer Science, University of Turku, FIN-20014 Turku, Finland. [email protected]

Abstract

We study a problem considered originally by A. Markov in 1947: given two matrix semigroups, determine whether or not they contain a common element. This problem was proved undecidable by Markov for 4 × 4 matrices, even in a very restricted form, and for 3 × 3 matrices by Krom in 1981. Here we give a new proof in the 3 × 3 case which gives undecidability in a form almost as restricted as that of Markov's result.

Keywords: Integer matrix; undecidability; semigroup; common element

TUCS Laboratory Discrete Mathematics for Information Technology Laboratory

1 Introduction

There exists a fundamental connection between word semigroups and the multiplicative semigroup of $2 \times 2$ integer matrices. Consider a binary alphabet $\Gamma = \{a, b\}$ and the matrices

\[
A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}.
\]

It was proved by J. Nielsen in 1924 [8] that the matrix semigroup $\langle A, B \rangle$ generated by $A$ and $B$ is free. Since the semigroup $\Gamma^+$ of all nonempty words over $\Gamma$ is free, defining the mapping $\beta \colon \Gamma^+ \to \langle A, B \rangle$ by $\beta(a) = A$ and $\beta(b) = B$, we establish an isomorphism between these two free semigroups. Also, let $\beta(\varepsilon) = I_2$ for the empty word $\varepsilon$, where $I_2$ is the $2 \times 2$ identity matrix. We denote by $\Gamma^* = \Gamma^+ \cup \{\varepsilon\}$ the free monoid generated by $\Gamma$.

In 1947 A. Markov [6] used this correspondence to prove an interesting and simple undecidability result for matrices. He reduced the Post Correspondence Problem (PCP, for short) to $4 \times 4$ matrices. We recall Post's original definition of the PCP.

Problem 1 (Post Correspondence Problem (PCP)). Let $\Gamma = \{a, b\}$ be a binary alphabet. Given a set of $n$ pairs of words, $\{(u_i, v_i) \mid u_i, v_i \in \Gamma^*,\ i = 1, \dots, n\}$, does there exist a nonempty sequence $i_1, \dots, i_k$ of indices from $\{1, \dots, n\}$ such that

\[
u_{i_1} u_{i_2} \cdots u_{i_k} = v_{i_1} v_{i_2} \cdots v_{i_k}\,? \tag{1.1}
\]

The PCP can also be expressed using morphisms of words. For an instance $\{(u_i, v_i) \mid 1 \le i \le n\} \subseteq \Gamma^* \times \Gamma^*$ of the PCP, let $\Sigma = \{a_1, a_2, \dots, a_n\}$ be an alphabet and define two morphisms $h, g \colon \Sigma^* \to \Gamma^*$ by $h(a_i) = u_i$ and $g(a_i) = v_i$ for all $i = 1, 2, \dots, n$. Now the original form of the PCP is equivalent to the following problem.

Problem 2 (PCP). Given two morphisms $h, g \colon \Sigma^* \to \Gamma^*$, does there exist a nonempty word $w \in \Sigma^+$ such that

\[
h(w) = g(w)\,? \tag{1.2}
\]
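Problem 2 can be explored on small instances by exhaustive search. The following is a minimal sketch, not a decision procedure: it only tries words up to a chosen length bound, and the instance, the function names `apply_morphism` and `find_solution`, and the bound `max_len` are all assumptions made for illustration.

```python
from itertools import product

def apply_morphism(images, word):
    """Apply the morphism given by its letter images to a word (tuple of letters)."""
    return "".join(images[letter] for letter in word)

def find_solution(h, g, max_len):
    """Return the first nonempty w with h(w) == g(w) and |w| <= max_len, else None."""
    letters = sorted(h)
    for k in range(1, max_len + 1):
        for w in product(letters, repeat=k):
            if apply_morphism(h, w) == apply_morphism(g, w):
                return w
    return None

# Hypothetical instance of size 3 over Gamma = {a, b}; letters of Sigma are 1, 2, 3.
h = {1: "a", 2: "ab", 3: "bba"}
g = {1: "baa", 2: "aa", 3: "bb"}

solution = find_solution(h, g, max_len=4)
print(solution)
```

A result of `None` only says that no solution exists within the bound; by Post's theorem no bound works for all instances.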

A given pair $(h, g)$ of morphisms is an instance of the PCP. A word $w$ with $h(w) = g(w)$ is called a solution of the instance $(h, g)$. The size of an instance $(h, g)$ is the cardinality of the domain alphabet, i.e., the size is equal to $|\Sigma|$ when $h, g \colon \Sigma^* \to \Gamma^*$. The following theorem was proved by E. Post in 1946 [10].

Theorem 1. The PCP is undecidable.

Next we briefly describe Markov's construction. Let $(h, g)$, for $h, g \colon \Sigma^* \to \Gamma^*$, be an instance of the PCP. For all letters $a_i \in \Sigma$, define the matrices

\[
Z_i = \beta(h(a_i)) \quad\text{and}\quad Z_i' = \beta(g(a_i)),
\]

where $\Sigma = \{a_1, a_2, \dots, a_n\}$, $\Gamma = \{a, b\}$ and the morphism $\beta$ is defined above. Now the PCP can be written in the following form: given two sets $\{Z_1, Z_2, \dots, Z_n\}$ and $\{Z_1', Z_2', \dots, Z_n'\}$ of nonnegative integer matrices, determine whether or not there exists a sequence $i_1, i_2, \dots, i_k$ with $k \ge 1$ such that

\[
Z_{i_1} Z_{i_2} \cdots Z_{i_k} = Z_{i_1}' Z_{i_2}' \cdots Z_{i_k}'.
\]

The following undecidability result, stated by Markov [6], now follows.

Theorem 2. Given two sets $\{X_1, X_2, \dots, X_n\}$ and $\{Y_1, Y_2, \dots, Y_n\}$ of $2 \times 2$ nonnegative integer matrices with determinant equal to 1, it is undecidable whether or not there exists a sequence $i_1, i_2, \dots, i_k$ with $k \ge 1$ such that

\[
X_{i_1} X_{i_2} \cdots X_{i_k} = Y_{i_1} Y_{i_2} \cdots Y_{i_k}.
\]

Next Markov combined the matrices $Z_i$ and $Z_i'$ for all $i$. Define the $4 \times 4$ nonnegative integer matrices $\mathbf{Z}_i$, for $i = 1, 2, \dots, n$, by the block form

\[
\mathbf{Z}_i = \begin{pmatrix} Z_i & 0 \\ 0 & Z_i' \end{pmatrix}.
\]

Note that $\det(\mathbf{Z}_i) = 1$ for all $i$. Next define two special matrices

\[
\mathbf{A} = \begin{pmatrix} A & 0 \\ 0 & A \end{pmatrix}
\quad\text{and}\quad
\mathbf{B} = \begin{pmatrix} B & 0 \\ 0 & B \end{pmatrix}.
\]

Now the PCP is equivalent to determining whether or not there exists a sequence $i_1, \dots, i_k$ with $k \ge 1$ and $i_j \in \{1, \dots, n\}$ for all $j$, such that

\[
\mathbf{Z}_{i_1} \mathbf{Z}_{i_2} \cdots \mathbf{Z}_{i_k} \in \langle \mathbf{A}, \mathbf{B} \rangle.
\]

Moreover, we obtain the following.

Theorem 3 ([6]). Given two sets $\{X_1, X_2, \dots, X_n\}$ and $\{Y_1, Y_2\}$ of $4 \times 4$ nonnegative integer matrices with determinant equal to 1, it is undecidable whether or not there exists a matrix $X \in \langle X_i \mid 1 \le i \le n \rangle$ such that $X \in \langle Y_1, Y_2 \rangle$. Moreover, it can be assumed that $X_2, \dots, X_n$ and $Y_1, Y_2$ are all fixed.
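Markov's block construction can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the PCP instance is hypothetical, and the helper names (`beta`, `block_diag`, `product_along`, `has_equal_blocks`) are not from the paper. Because both diagonal blocks of a product are images under β, and β is injective, the product lies in the semigroup generated by the two block matrices diag(A, A) and diag(B, B) exactly when its two blocks coincide; the code checks that condition.

```python
def matmul(X, Y):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]

def beta(word):
    """Image of a word over {a, b} under the morphism a -> A, b -> B."""
    M = [[1, 0], [0, 1]]
    for letter in word:
        M = matmul(M, A if letter == "a" else B)
    return M

def block_diag(P, Q):
    """4 x 4 block-diagonal matrix with 2 x 2 blocks P and Q."""
    return [P[0] + [0, 0], P[1] + [0, 0], [0, 0] + Q[0], [0, 0] + Q[1]]

def product_along(h, g, indices):
    """Product of the 4 x 4 block matrices for the given index sequence."""
    M = [[int(i == j) for j in range(4)] for i in range(4)]
    for i in indices:
        M = matmul(M, block_diag(beta(h[i]), beta(g[i])))
    return M

def has_equal_blocks(M):
    """True when the upper-left and lower-right 2 x 2 blocks coincide."""
    top = [row[:2] for row in M[:2]]
    bottom = [row[2:] for row in M[2:]]
    return top == bottom

# Hypothetical PCP instance; the sequence 3, 2, 3, 1 is a solution of it.
h = {1: "a", 2: "ab", 3: "bba"}
g = {1: "baa", 2: "aa", 3: "bb"}
print(has_equal_blocks(product_along(h, g, [3, 2, 3, 1])))
print(has_equal_blocks(product_along(h, g, [3, 2])))
```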

The final statement about fixing all but one matrix follows from Post's construction in [10], where he proved the undecidability of the PCP. In the next section, we give the reasoning for this and state, moreover, that n = 7 is enough.

Our main result is the proof of Theorem 3 in the case of 3 × 3 integer matrices. In our proof, the determinants are not equal to 1. Still, our matrices are non-singular (i.e., invertible), since their determinants are powers of 2, and in fact the matrices are upper-triangular. We shall use a variant of the coding technique introduced by Paterson in [9], where he proved that it is undecidable whether or not the zero matrix belongs to the matrix semigroup generated by a given set of 3 × 3 integer matrices. This problem is called the mortality problem; for a recent proof, see [2]. Note that the mortality problem is a special form of the problem in Theorem 3, where Y1 = 0 = Y2. However, the zero matrix is singular.

In 1981 Krom proved a related problem to be undecidable for 3 × 3 integer matrices in [4]; see [3] for a somewhat simpler proof. In Krom's result both semigroups are generated by n 3 × 3 integer matrices, where n is the size of the instance of the PCP reduced to the problem. Krom used a variant of Paterson's coding. We shall use another variant of it, which is actually also a variant of a coding of R.W. Floyd as stated in [5], where it was shown that it is undecidable whether or not a finitely generated matrix semigroup of 3 × 3 integer matrices has an element with zero as its (1, 2) entry. Note also that Floyd's construction can be transformed into another one implying that it is undecidable whether or not a finitely generated semigroup of 3 × 3 integer matrices has an element with zero in the upper right corner, see [3].

2 New proof of Markov’s theorem

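Let $\Gamma = \{a_1, a_2\}$ be a fixed binary alphabet. Define the mapping $\sigma \colon \Gamma^* \to \mathbb{N}$ by setting $\sigma(\varepsilon) = 0$ and

\[
\sigma(a_{i_1} a_{i_2} \cdots a_{i_k}) = \sum_{j=1}^{k} i_j 2^{k-j}.
\]

It is easy to see that for any two words $u, v \in \Gamma^*$

\[
\sigma(uv) = 2^{|v|} \sigma(u) + \sigma(v). \tag{2.1}
\]

Note also that $\sigma$ is injective. Next define the mapping $\gamma \colon \Gamma^* \times \Gamma^* \to \mathbb{Z}^{3 \times 3}$ by

\[
\gamma(u, v) =
\begin{pmatrix}
1 & \sigma(v) & \sigma(u) - \sigma(v) \\
0 & 2^{|v|} & 2^{|u|} - 2^{|v|} \\
0 & 0 & 2^{|u|}
\end{pmatrix}.
\]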

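This mapping is clearly injective, because $\sigma$ is injective, and it is also a morphism, since for all $u_1, u_2, v_1, v_2$ in $\Gamma^*$,

\[
\begin{aligned}
\gamma(u_1, v_1)\gamma(u_2, v_2)
&=
\begin{pmatrix}
1 & \sigma(v_1) & \sigma(u_1) - \sigma(v_1) \\
0 & 2^{|v_1|} & 2^{|u_1|} - 2^{|v_1|} \\
0 & 0 & 2^{|u_1|}
\end{pmatrix}
\begin{pmatrix}
1 & \sigma(v_2) & \sigma(u_2) - \sigma(v_2) \\
0 & 2^{|v_2|} & 2^{|u_2|} - 2^{|v_2|} \\
0 & 0 & 2^{|u_2|}
\end{pmatrix} \\
&=
\begin{pmatrix}
1 & \sigma(v_2) + \sigma(v_1) 2^{|v_2|} & \sigma(u_2) - \sigma(v_2) + \sigma(v_1)(2^{|u_2|} - 2^{|v_2|}) + 2^{|u_2|}(\sigma(u_1) - \sigma(v_1)) \\
0 & 2^{|v_1|} 2^{|v_2|} & 2^{|v_1|}(2^{|u_2|} - 2^{|v_2|}) + 2^{|u_2|}(2^{|u_1|} - 2^{|v_1|}) \\
0 & 0 & 2^{|u_1|} 2^{|u_2|}
\end{pmatrix} \\
&\overset{(2.1)}{=}
\begin{pmatrix}
1 & \sigma(v_1 v_2) & \sigma(u_1 u_2) - \sigma(v_1 v_2) \\
0 & 2^{|v_1 v_2|} & 2^{|u_1 u_2|} - 2^{|v_1 v_2|} \\
0 & 0 & 2^{|u_1 u_2|}
\end{pmatrix}
= \gamma(u_1 u_2, v_1 v_2).
\end{aligned}
\]

Finally we define two special matrices (for the letters of $\Gamma$),

\[
A_1 = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix}
\quad\text{and}\quad
A_2 = \begin{pmatrix} 1 & 2 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix}.
\]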
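The identity (2.1), the injectivity of σ, and the morphism property of γ can be checked mechanically on short words. The sketch below assumes that a word over Γ is represented as a tuple of its letter indices 1 and 2; the function names and the length bound of the test words are illustrative choices, and the check covers only these finitely many cases.

```python
from itertools import product

def sigma(word):
    """Value of a word over the digits {1, 2} read in base 2; the empty word gives 0."""
    value = 0
    for digit in word:
        value = 2 * value + digit
    return value

def gamma(u, v):
    """The upper-triangular 3 x 3 matrix gamma(u, v) as a list of rows."""
    pu, pv = 2 ** len(u), 2 ** len(v)
    return [[1, sigma(v), sigma(u) - sigma(v)],
            [0, pv, pu - pv],
            [0, 0, pu]]

def matmul(X, Y):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# All words of length at most 2 over the letter indices {1, 2}.
words = [w for k in range(3) for w in product((1, 2), repeat=k)]

identity_holds = all(sigma(u + v) == 2 ** len(v) * sigma(u) + sigma(v)
                     for u in words for v in words)
sigma_injective = len({sigma(w) for w in words}) == len(words)
gamma_is_morphism = all(
    matmul(gamma(u1, v1), gamma(u2, v2)) == gamma(u1 + u2, v1 + v2)
    for u1 in words for v1 in words for u2 in words for v2 in words)

A1 = gamma((1,), (1,))
A2 = gamma((2,), (2,))
print(identity_holds, sigma_injective, gamma_is_morphism)
```

In this representation the two special matrices arise as `gamma((1,), (1,))` and `gamma((2,), (2,))`, that is, as the images of the pairs (a1, a1) and (a2, a2).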

Note that, in the (1, 2)-entries of $A_1$ and $A_2$, $1 = \sigma(a_1)$ and $2 = \sigma(a_2)$. We can now prove the following theorem.

Theorem 4. Given non-singular $3 \times 3$ integer matrices $X_i$, $i = 1, \dots, n$, it is undecidable whether or not there exists a matrix $X \in \langle X_i \mid i = 1, \dots, n \rangle$ such that $X \in \langle A_1, A_2 \rangle$.

Proof. Let $(h, g)$ be an instance of the PCP, where $h, g \colon \Sigma^* \to \Gamma^*$. Define the matrices $M_a = \gamma(h(a), g(a))$ for all $a$ in $\Sigma$. Let $w = a_1 \cdots a_m$, where $a_i \in \Sigma$ for each $i$. Since $\gamma$ is a morphism, we have for the matrix $M = M_{a_1} M_{a_2} \cdots M_{a_m}$,

\[
M =
\begin{pmatrix}
1 & \sigma(g(w)) & \sigma(h(w)) - \sigma(g(w)) \\
0 & 2^{|g(w)|} & 2^{|h(w)|} - 2^{|g(w)|} \\
0 & 0 & 2^{|h(w)|}
\end{pmatrix}.
\]

The matrices in $\langle A_1, A_2 \rangle$ are of the form

\[
\begin{pmatrix}
1 & \sigma(v) & 0 \\
0 & 2^{|v|} & 0 \\
0 & 0 & 2^{|v|}
\end{pmatrix},
\]

where $v \in \Gamma^+$. Therefore, $M \in \langle A_1, A_2 \rangle$ if and only if

\[
\sigma(h(w)) = \sigma(g(w)) \quad\text{and}\quad |h(w)| = |g(w)|,
\]

and, moreover, since $\sigma$ is injective, if and only if $h(w) = g(w)\ (= v)$, where $w = a_1 a_2 \cdots a_m$. Now the claim follows from the undecidability of the PCP.
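The reduction in the proof can be traced on a concrete instance. The following sketch rests on several assumptions: the PCP instance is hypothetical, words over Γ are tuples of the letter indices 1 and 2, and the membership test `in_A_semigroup` uses the observation that σ maps the words of length k bijectively onto the integers from 2^k − 1 to 2^(k+1) − 2, so the shape of the matrix can be checked directly.

```python
def sigma(word):
    """Value of a word over the digits {1, 2} read in base 2."""
    value = 0
    for digit in word:
        value = 2 * value + digit
    return value

def gamma(u, v):
    """The matrix gamma(u, v) as a list of rows."""
    pu, pv = 2 ** len(u), 2 ** len(v)
    return [[1, sigma(v), sigma(u) - sigma(v)], [0, pv, pu - pv], [0, 0, pu]]

def matmul(X, Y):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def product_for(h, g, w):
    """The product M of the matrices gamma(h(a), g(a)) over the letters a of w."""
    M = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    for letter in w:
        M = matmul(M, gamma(h[letter], g[letter]))
    return M

def in_A_semigroup(M):
    """Test whether M has the form of gamma(v, v) for some nonempty word v."""
    p = M[1][1]
    is_power_of_two = p >= 2 and (p & (p - 1)) == 0
    return (is_power_of_two
            and M[0][0] == 1 and M[2][2] == p
            and M[0][2] == 0 and M[1][2] == 0
            and M[1][0] == 0 and M[2][0] == 0 and M[2][1] == 0
            and p - 1 <= M[0][1] <= 2 * p - 2)

# Hypothetical instance over Gamma = {a1, a2}, letters written as 1 and 2.
h = {1: (1,), 2: (1, 2), 3: (2, 2, 1)}
g = {1: (2, 1, 1), 2: (1, 1), 3: (2, 2)}

print(in_A_semigroup(product_for(h, g, (3, 2, 3, 1))))
print(in_A_semigroup(product_for(h, g, (3, 2))))
```

Membership in the semigroup generated by A1 and A2 is thus easy to test for a single matrix; the undecidability lies in the search over all products of the matrices M_a.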

Next we strengthen Theorem 4 by restricting the number of matrices $X_i$ involved in it. We also restrict the matrices $X$ of the semigroup. It is known that the PCP is undecidable when $|\Sigma| = 7$. This was proved by Matiyasevich and Sénizergues in [7]. They actually proved that there exists a 3-rule semi-Thue system with an undecidable individual word problem.

A semi-Thue system $T = (\Sigma, R)$ consists of an alphabet $\Sigma = \{a_1, a_2, \dots, a_n\}$ and a relation $R \subseteq \Sigma^* \times \Sigma^*$, the elements of which are called the rules of $T$. For two words $u, v \in \Sigma^*$, we write $u \to_T v$ if there are words $u_1$ and $u_2$ such that

\[
u = u_1 x u_2 \quad\text{and}\quad v = u_1 y u_2, \quad\text{where } (x, y) \in R.
\]
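A single rewriting step of a semi-Thue system is easy to compute, as the following sketch shows. The rule set and the function names are hypothetical, and `derives_within` only explores derivations of bounded length by repeating the step, so it is an illustration of the relation and not a decision procedure for the word problem.

```python
def successors(word, rules):
    """All words obtained from word by replacing one occurrence of x by y, (x, y) a rule."""
    result = set()
    for x, y in rules:
        start = word.find(x)
        while start != -1:
            result.add(word[:start] + y + word[start + len(x):])
            start = word.find(x, start + 1)
    return result

def derives_within(word, target, rules, max_steps):
    """True if target is reachable from word in at most max_steps rewriting steps."""
    frontier, seen = {word}, {word}
    for _ in range(max_steps):
        if target in seen:
            return True
        frontier = {v for u in frontier for v in successors(u, rules)} - seen
        seen |= frontier
    return target in seen

# Hypothetical one-rule system over {a, b}: the rule (ab, ba).
rules = [("ab", "ba")]
print(successors("aab", rules))
print(derives_within("aab", "baa", rules, max_steps=2))
```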

Let $\to_T^*$ be the reflexive and transitive closure of the relation $\to_T$. Therefore, we have $u \to_T^* v$ if and only if either $u = v$ or there exists a finite sequence of words $u = v_1, v_2, \dots, v_n = v$ such that $v_i \to_T v_{i+1}$ for each $i = 1, 2, \dots, n - 1$.

In the individual word problem we are given a semi-Thue system $T$ and a fixed word $w_0$, and we ask, for input words $w$, whether or not $w \to_T^* w_0$ holds. It is known that there exist a 3-rule semi-Thue system and a fixed word $w_0$ such that the individual word problem is undecidable, see [7]. The reduction from semi-Thue systems to the PCP is due to Claus [1].

Theorem 5. If there is a semi-Thue system with $n$ rules having an undecidable word problem, then the PCP is undecidable for instances of size $n + 4$.

We briefly recall Claus's construction. First of all, we may suppose that $\Sigma = \{a, b\}$. This is due to the fact that an arbitrary alphabet can be encoded into a binary alphabet; for instance, $\varphi \colon \Sigma^* \to \{a, b\}^*$, defined by $\varphi(a_i) = a b^i a$, is one such encoding. So let $T = (\{a, b\}, R)$ be a semi-Thue system, where the set of rules is $R = \{t_1, t_2, \dots, t_n\}$ with $t_i = (u_i, v_i)$. We may suppose without restriction that the rules $t_i \in R$ are encoded by $\varphi$, i.e., $u_i, v_i \in (abb^*a)^*$. In the following we shall consider $R$ also as an alphabet. Let $f = aa$ be a special word used as a marker. Note that $aa$ is not an image of $\varphi$. Let $w, w_0 \in \{a, b\}^*$ be two given words, $w$ being the input and $w_0$ fixed.

For a new letter $d$, let $\ell_d$ and $r_d$ be the desynchronizing morphisms: for a word $u = a_1 a_2 \cdots a_n$ with $a_i \in \{a, b\}$, define

\[
\ell_d(u) = d a_1 d a_2 d \cdots d a_n
\quad\text{and}\quad
r_d(u) = a_1 d a_2 d \cdots d a_n d.
\]

In other words, $\ell_d$ is a morphism that adds $d$ in front of every letter and $r_d$ is a morphism that adds $d$ after every letter of a word. We define morphisms $h, g \colon (\{a, b, d, e\} \cup R)^* \to \{a, b, d, e\}^*$ by

\[
\begin{aligned}
h(x) &= \ell_d(x), & g(x) &= r_d(x), && \text{for } x \in \{a, b\}, \\
h(t_i) &= \ell_d(v_i), & g(t_i) &= r_d(u_i), && \text{for } t_i \in R, \\
h(d) &= \ell_d(wf), & g(d) &= d, \\
h(e) &= de, & g(e) &= r_d(f w_0)e.
\end{aligned}
\]
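The two desynchronizing morphisms are simple to state in code. The sketch below uses the hypothetical names `left_desync` and `right_desync`, with words as Python strings, and illustrates the shift between the two images: prefixing the right-hand version with one d gives the left-hand version followed by one d.

```python
def left_desync(word, d="d"):
    """Insert the letter d in front of every letter of word."""
    return "".join(d + letter for letter in word)

def right_desync(word, d="d"):
    """Insert the letter d after every letter of word."""
    return "".join(letter + d for letter in word)

u = "abba"
print(left_desync(u))
print(right_desync(u))
# The two images agree up to a shift by one occurrence of d.
print("d" + right_desync(u) == left_desync(u) + "d")
```

Since the image of a letter x of {a, b} is dx under one morphism and xd under the other, the two morphisms can agree only on words whose first and last letters correct this shift.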

It can be proved that the minimal solutions (those that are not catenations of shorter solutions) of the instance $(h, g)$ are of the form $dwe$, where $w \in (\{a, b\} \cup R)^*$. Note that, as above, we can always find an equivalent instance of the PCP such that the morphisms are from $(\{a, b, d, e\} \cup R)^*$ into $\Gamma^* = \{a_1, a_2\}^*$, using a variant of the encoding $\varphi$. Now, since only the image $h(d)$ changes for different inputs $w$, all minimal solutions are of the form $dwe$, and $|R| = 3$, we obtain the following.

Corollary 1. Let $X_1, \dots, X_5$ and $Y$ be fixed $3 \times 3$ non-singular upper-triangular integer matrices and let $X$ be an input $3 \times 3$ non-singular upper-triangular integer matrix. It is undecidable whether or not there exists a matrix $Z \in \langle X_1, \dots, X_5 \rangle$ such that $XZY \in \langle A_1, A_2 \rangle$.

Proof. The claim follows from Theorem 4 when considering an instance of Claus's construction. Here $X = \gamma(h(d), g(d))$ and $Y = \gamma(h(e), g(e))$, and the matrices $X_1, \dots, X_5$ correspond to the five letters of $\{a, b\} \cup R$.

Note that a similar corollary can be obtained in the case of $4 \times 4$ matrices in Theorem 3. Note also that the undecidability of the PCP for instances of size 7 implies that in Krom's proof both of the semigroups are generated by 7 matrices.

We have another corollary for $3 \times 3$ matrices over rational numbers. Since all the matrices $Z \in \{X, Y, A_1, A_2, X_i \mid 1 \le i \le 5\}$ above are non-singular, we may multiply them by $1/\sqrt[3]{\det Z}$ to get matrices with determinant 1. Now the matrices are not necessarily rational, but this can be fixed using the following idea of Dr. Matti Soittola. We replace the number 2 by 8 in the definition of the mapping $\sigma$, in the definition of the matrices $\gamma(u, v)$, and on the diagonal of the matrices $A_1$ and $A_2$. In other words, we change the base of the number representation of the words over the alphabet $\{1, 2\}$ from 2 to 8. It is now straightforward to see that the determinants of the matrices are powers of 8, implying that the cube roots of the determinants are powers of 2, i.e., the matrices having determinant 1 in the above construction become rational.
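The effect of changing the base from 2 to 8 can be checked with exact rational arithmetic. The following sketch assumes the base-8 variants of σ and γ described above, under the hypothetical names `sigma8`, `gamma8` and `normalize`, with words given as tuples of the letter indices 1 and 2.

```python
from fractions import Fraction

def sigma8(word):
    """Value of a word over the digits {1, 2} read in base 8."""
    value = 0
    for digit in word:
        value = 8 * value + digit
    return value

def gamma8(u, v):
    """Base-8 variant of gamma(u, v), as a list of rows."""
    pu, pv = 8 ** len(u), 8 ** len(v)
    return [[1, sigma8(v), sigma8(u) - sigma8(v)], [0, pv, pu - pv], [0, 0, pu]]

def det_upper_triangular(M):
    """Determinant of an upper-triangular 3 x 3 matrix: product of the diagonal."""
    return M[0][0] * M[1][1] * M[2][2]

def normalize(u, v):
    """Divide gamma8(u, v) by the cube root of its determinant, which is 2^(|u| + |v|)."""
    scale = Fraction(1, 2 ** (len(u) + len(v)))
    return [[scale * entry for entry in row] for row in gamma8(u, v)]

u, v = (1, 2), (2,)
print(det_upper_triangular(gamma8(u, v)))
print(det_upper_triangular(normalize(u, v)))
```

The determinant of the unscaled matrix is 8^(|u| + |v|), so the scaling factor is a power of 2 and every entry stays rational.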
Corollary 2. Let $X_1, \dots, X_5$ and $Y$ be fixed $3 \times 3$ non-singular upper-triangular rational matrices with $\det(X_i) = 1 = \det(Y)$ and let $X$ be an input $3 \times 3$ non-singular upper-triangular rational matrix such that $\det X = 1$. It is undecidable whether or not there exists a matrix $Z \in \langle X_1, X_2, \dots, X_5 \rangle$ such that $XZY \in \langle A_1, A_2 \rangle$.

Acknowledgements. We are grateful to Dr. Matti Soittola for good comments that helped us to fix Corollary 2.

References

[1] V. Claus, Some remarks on PCP(k) and related problems, Bull. EATCS 12 (1980), 54–61.
[2] V. Halava and T. Harju, Mortality in matrix semigroups, Amer. Math. Monthly 108 (2001), no. 7, 649–653.
[3] T. Harju and J. Karhumäki, Morphisms, in G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages, volume 1, pp. 439–510, Springer-Verlag, 1997.
[4] M. Krom, An unsolvable problem with product of matrices, Math. Systems Theory 14 (1981), 335–337.
[5] Z. Manna, Mathematical theory of computations, McGraw-Hill, 1974.
[6] A. Markov, On certain insoluble problems concerning matrices, Doklady Akad. Nauk SSSR (N. S.) 57 (1947), 539–542.
[7] Y. Matiyasevich and G. Sénizergues, Decision problems for semi-Thue systems with a few rules, Theoret. Comput. Sci. 330 (2005), no. 1, 145–169.
[8] J. Nielsen, Die Gruppe der dreidimensionalen Gittertransformationen (Danish), Danske Vid. Selsk. Math.-Fys. Medd. 5, no. 12 (1924), 3–29.
[9] M. S. Paterson, Unsolvability in 3 × 3 matrices, Studies in Applied Mathematics 49 (1970), 105–107.
[10] E. Post, A variant of a recursively unsolvable problem, Bull. of Amer. Math. Soc. 52 (1946), 264–268.

Lemminkäisenkatu 14 A, 20520 Turku, Finland | www.tucs.fi

University of Turku • Department of Information Technology • Department of Mathematics

Åbo Akademi University • Department of Computer Science • Institute for Advanced Management Systems Research

Turku School of Economics and Business Administration • Institute of Information Systems Sciences

ISBN 952-12-1702-2 ISSN 1239-1891