Quantified Equality Constraints - Semantic Scholar

Quantified Equality Constraints∗ Manuel Bodirsky Laboratoire d’informatique (LIX, CNRS UMR 6171), ´ Ecole Polytechnique, France [email protected] Hubie Chen Departament de Tecnologies de la Informació i les Comunicacions Universitat Pompeu Fabra, Barcelona, Spain [email protected] May 30, 2008

Abstract An equality template is a relational structure with infinite universe whose relations can be defined by boolean combinations of equalities. We prove a complete complexity classification for quantified constraint satisfaction problems (QCSPs) over equality templates: these problems are in L (decidable in logarithmic space), NP-complete, or PSPACE-complete. To establish our classification theorem we combine methods from universal algebra with concepts from model theory.

1

Introduction

The constraint satisfaction problem (CSP) is the problem of deciding the truth of a primitive positive (pp) sentence ∃v1 . . . ∃vn (R(vi1 , . . . , vik ) ∧ . . .) over a finite relational signature, relative to a relational structure over the same signature. For a relational structure Γ, define CSP(Γ) to be the CSP where the relational structure is fixed to be Γ. Note that the instances of CSP(Γ) are syntactic objects, namely, the pp-sentences over the signature of Γ. When discussing problems of the form CSP(Γ), we call Γ the template. In a now oft-cited result, Schaefer [24] studied the family of problems CSP(Γ) over templates with a two-element universe. He proved a dichotomy theorem: every such problem CSP(Γ) is either in P, or is NP-complete. In recent years, there has been considerable research effort directed towards classifying the complexity of CSP(Γ) over all finite templates Γ [6–8, 11, 12, 14, 16]. One of Schaefer’s motivations was to provide a tool for developing NP-hardness proofs; in particular, one can show that a problem is hard by showing that it can simulate conjunctions of atomic formulas over any template Γ such that CSP(Γ) is known to be NP-hard. Another feature of his and subsequent work is that it places tractable cases of CSP(Γ) into a unified framework; the tractable cases of CSP(Γ) encompassed by Schaefer’s theorem include 2-SAT and Horn SAT, two classic tractable fragments of propositional logic. In recent work, Bodirsky and Kara [3] gave a systematic CSP(Γ) classification result for templates Γ over an infinite universe. They studied the natural and basic class of equality templates, defined to be templates where every relation is definable from equalities (x = y) and the usual ∗ An

extended abstract of this paper appeared in the Proceedings of LICS 2007

boolean connectives. They proved a complete complexity classification on equality templates, showing that each problem CSP(Γ) is either in P or NP-complete. In this paper, we consider the quantified constraint satisfaction problem (QCSP), the natural generalization of the CSP where universal quantification is permitted in addition to existential quantification. We establish a complete complexity classification for the problems QCSP(Γ) over equality templates, showing that each such problem is either in L (decidable in logarithmic space), NP-complete, or PSPACE-complete. We believe that this classification result can, in analogy to Schaefer’s theorem, serve as a tool for performing complexity analysis on logical formulas involving both quantifiers as well as other PSPACE problems. The formulas that we study can be viewed as syntactically restricted fragments of the first-order theory of equality. In fact, for the templates Γ that we show to be NP-complete, containment in NP follows from a result of Kozen [19] showing that positive first-order sentences have an NP-complete validity problem. To the best of our knowledge, our algorithmic result showing certain formulas to be decidable in logarithmic space, is new. The classification. We now describe the classes of formulas corresponding to the three complexity regimes of our classification. First, we say that a relation is negative if it can be defined by a quantifier-free equality formula that is the conjunction of equalities and disjunctions of disequalities x1 6= y1 ∨ · · · ∨ xk 6= yk . We extend this to templates by saying that a template is negative if all of its relations are negative. We show that, for all negative templates Γ, the problem QCSP(Γ) is in L. Let D be (here and throughout) a set of arbitrary infinite cardinality. Example 1.1 Let Γ1 be the relational structure (D; T, =), where T is the ternary relation T = {(x, y, z) ∈ D3 | (x 6= y ∨ y 6= z}. The relation T is negative, as it may be viewed as the conjunction of a single disjunction of disequalities. An example of an instance of QCSP(Γ1 ) is ∀y1 ∃x2 ∀y3 ∃x4 ∀y5 ∃x6 (T (x2 , y5 , x4 ) ∧ x4 = y1 ∧ T (y1 , x6 , y3 )) . It can be verified that the formula above is true in Γ1 . Since Γ1 is negative, it follows from our classification theorem that QCSP(Γ1 ) is in L. The second type of relations we identify are positive relations. We say that a relation is positive if it can be defined by a quantifier-free equality formula that uses only the positive connectives AND (∧) and OR (∨). As before, we say that a template is positive if all of its relations are positive. We show that, for all templates Γ that are positive but not negative, the problem QCSP(Γ) is NP-complete. Example 1.2 Let Γ2 be the relational structure (D; P ), where P is the ternary relation P = {(x, y, z) ∈ D3 | (x = y) ∨ (y = z)}. The relation P , and hence the constraint language Γ2 , is positive. It can be verified that Γ2 is not negative, and so our classification theorem implies that QCSP(Γ2 ) is NP-complete. Finally, we show that the remaining templates Γ–those not falling into either of the two previously identified classes–give rise to a problem QCSP(Γ) that is PSPACE-complete. Example 1.3 Let Γ3 be the relational structure (D; S), where S is the ternary relation S = {(x, y, z) ∈ D3 | (x = y ∧ y = z) ∨ (x 6= y ∧ y 6= z ∧ x 6= z)}. It can be verified that S is not negative nor positive, and hence, by our classification theorem, the problem QCSP(Γ3 ) is PSPACE-complete. 2

Techniques used. As we mentioned, the complexity of the CSP over equality templates has been classified recently [3]. This result was obtained by a universal-algebraic approach which was originally developed for studying the CSP over finite domains [6–8, 11, 14, 16]. The approach rests on studying certain closure properties, so-called polymorphisms, that a template has. The set of all polymorphisms of a template forms an algebraic object called a clone. For templates over infinite domains, this clone is locally closed. The result in [3] exhibits two polymorphisms for equality templates that imply polynomial-time decidability of the corresponding constraint satisfaction problems, and proves that the constraint satisfaction problem is NP-complete in all other cases. It is worth noting that the polymorphisms that guarantee polynomial-time tractability for the CSP do not guarantee tractability in the QCSP. To derive our complexity classification for the QCSP, we have to obtain deeper knowledge of the associated clones. In doing so, we develop a number of new techniques; several of them could be of independent interest in universal algebra. Indeed, our results reveal new information on the lattice of clones containing all permutations, which includes all clones corresponding to equality templates. Interestingly, some of our results utilize a mix of “relational” arguments, arguments directly on the relations of a template, and “operational” arguments, arguments on the polymorphisms of a template; see Section 8 as an example. In this paper, we also present a surjective preservation theorem which concerns the expressibility of a template from formulas that may use conjunction and arbitrary (both universal and existential) quantification. In particular, this theorem shows that a relation can be defined by a first-order formula of the described form over an ω-categorical template Γ if and only if it is preserved by all surjective polymorphisms of Γ. This theorem was previously shown for finite structures [5]. The proof that we give here relies on known preservation theorems in model theory [18].

2

Preliminaries

In this section we recall fundamental notions from logic and universal algebra; the terminology and notation we use is standard and can be found for instance in [15, 21] and [26]. Relational structures. A relational signature τ is a (in this paper always finite) set of relation symbols Ri , each of which has an associated finite arity ki . A relational structure Γ over the signature τ (also called τ -structure) is a set DΓ (the domain) together with a relation Ri ⊆ DΓki for each relation symbol Ri from τ . For simplicity, we use the same symbol for a relation symbol and the corresponding relation. If necessary, we write RΓ to indicate that we are talking about the relation R belonging to the structure Γ. First-order logic. Let τ be a signature. A τ -formula is a first-order formula whose atomic formulas are either of the form x = y or R(x1 , . . . , xk ) for R ∈ τ . A τ -sentence is a τ -formula without free variables. A first-order theory (over the signature τ ) is a set of first-order τ -sentences. If Γ is a τ -structure, then Th(Γ) denotes the first-order theory of Γ, that is, the set of all first-order τ -sentences that are true in Γ. Categoricity. A first-order theory theory T is called κ-categorical if all models of T (i.e., all structures that satisfy all sentences in T ) of cardinality κ are isomorphic, and T is called totally categorical if T is κ-categorical for all infinite cardinals κ. If Γ is a structure of cardinality κ, and if the first-order theory Th(Γ) is κ-categorical, we also say that Γ is κ-categorical. Automorphisms and Homomorphisms. Let Γ and Γ0 be τ -structures. Isomorphisms from Γ to Γ are called automorphisms. The set of all automorphisms of a structure Γ forms a group with respect to composition of automorphisms. An orbit of a k-tuple t in Γ is the set of all tuples of the form (π(t1 ), . . . , π(tk )), where π is an automorphism of Γ. 3

A homomorphism from Γ to Γ0 is a function f from DΓ to DΓ0 such that for each n-ary relation 0 symbol R in τ and each n-tuple (a1 , . . . , an ), if (a1 , . . . , an ) ∈ RΓ , then (f (a1 ), . . . , f (an )) ∈ RΓ . In this case we say that the map f preserves the relation R. Polymorphisms. Let D be a set, and let O be the set of finitary operations on D, i.e., functions from Dk to D for finite k. We say that a k-ary operation f ∈ O preserves an m-ary relation R ⊆ Dm if whenever R(xi1 , . . . , xim ) holds for all 1 ≤ i ≤ k in Γ, then R f (x11 , . . . , xk1 ), . . . , f (x1m , . . . , xkm ) holds in Γ. If f preserves all relations of a relational τ -structure Γ, we say that f is a polymorphism of Γ. Equivalently, f is a polymorphism of Γ if it is a homomorphism from Γk = Γ × · · · × Γ to Γ, where Γ1 × Γ2 is the (direct) product of the two relational τ -structures Γ1 and Γ2 [15]; this product is also called the categorical product or the cross product [14]. Hence, the unary bijective polymorphisms are the automorphisms of Γ. We use Pol(Γ) to denote the polymorphisms of Γ, and sPol(Γ) to denote the surjective polymorphisms of Γ. Clones. An operation π is a projection if for all n-tuples, π(x1 , . . . , xn ) = xi for some fixed i ∈ {1, . . . , n}. The composition of a k-ary operation f and k operations g1 , . . . , gk of arity n is an n-ary operation defined by f (g1 , . . . , gk )(x1 , . . . , xn ) = f g1 (x1 , . . . , xn ), . . . , gk (x1 , . . . , xn ) . A clone F is a set of operations defined on a set D (the domain of F ) that is closed under compositions and that contains all projections. If f is in the smallest clone that contains a set of operations F , we say that f is generated by F . It is easy to verify that the set Pol(Γ) of all polymorphisms of Γ is a clone with domain DΓ . Moreover, Pol(Γ) is also locally closed, a property that is defined as follows. We say that an operation f is interpolated by a set F if for every finite subset B of D there is some operation g ∈ F such that f (t) = g(t) for every t ∈ B k . The set of operations that are interpolated by F is called the local closure of F ; if F equals its local closure, we say that F is locally closed. In this paper we focus on clones on an infinite domain D that contain all permutations of D. When understandable from context, we slightly abuse terminology and say that an operation g on a countable infinite domain D is generated by a set of operations F (by an operation f ) on D if g is contained in the smallest locally closed clone containing F (containing f ) and containing all permutations of D. An operation f is essentially unary if there is a unary operation f0 such that f (x1 , . . . , xk ) = f0 (xi ) for some fixed i ∈ {1, . . . , k}. We say that a k-ary operation f depends on argument i if there is no k−1-ary operation f 0 such that f (x1 , . . . , xk ) = f 0 (x1 , . . . , xi−1 , xi+1 , . . . , xk ). We can equivalently characterize k-ary operations that depend on the i-th argument by requiring that there are elements x1 , . . . , xk and x0i such that f (x1 , . . . , xk ) 6= f (x1 , . . . , xi−1 , x0i , xi+1 , . . . , xk ). Hence, an essentially unary operation is an operation that depends on one argument only. More generally, an operation that depends on at least k arguments is called essentially k-ary. We will call an essentially k-ary operation having k ≥ 2 an essential operation, as is common in universal algebra. To conclude this section, we identify a simple but useful lemma. Lemma 2.1 A binary operation is essential if and only if there are values a, a0 , b, b0 such that f (a, b) 6= f (a0 , b) and f (a, b) 6= f (a, b0 ). Proof . Because f depends on the first argument, there are elements u, v, and u0 such that f (u, v) 6= f (u0 , v). Similarly, f depends on the second argument, and hence there are values x, y, and y 0 such that f (x, y) 6= f (x, y 0 ). Observe that f (x, v) can not be equal to both f (x, y) and f (x, y 0 ), and similarly, not be equal to both f (u, v) and f (u0 , v). Therefore, taking a = x and b = v, there exist values a0 , b0 such that f (a, b) 6= f (a0 , b) and f (a, b) 6= f (a, b0 ). Conversely, if there are such points a, a0 , b, b0 , then f clearly depends on both arguments.

4

3

Quantified Constraint Satisfaction

A template or constraint language is simply a relational structure; we typically refer to a relational structure Γ as a template or constraint language when we are interested in the computational problem QCSP(Γ), which is defined below. (We use the terms “template” and “constraint language” synonymously.) Recall that signatures τ in this paper are assumed to contain finitely many symbols. A first-order τ -formula is a quantified constraint formula if it has the form Q1 v1 . . . Qn vn (ψ1 ∧ . . . ∧ ψm ), where each Qi is a quantifier from {∀, ∃}, and each ψi is an atomic τ -formula (a formula x = y or R(x1 , . . . , xk ) for R ∈ τ ) that can contain both free variables and quantified variables from {v1 , . . . , vn }. The quantified constraint satisfaction problem for Γ, denoted by QCSP(Γ), is the problem of deciding, given a quantified constraint τ -formula without free variables, whether or not the formula is true in Γ. Note that both the universal and existential quantification is understood to take place over the entire universe of Γ. Let R be a k-ary relation and let Γ be a τ -structure. We say that R has a ∀∃∧-definition (or is ∀∃∧-definable) in Γ if there exists a quantified constraint formula φ with free variables x1 , . . . , xk such that R(x1 , . . . , xk ) = φ(x1 , . . . , xk ). We say that R has a pp-definition (or is pp-definable) in Γ if there exists a quantified constraint formula φ that uses only existential quantification with free variables x1 , . . . , xk such that R(x1 , . . . , xk ) = φ(x1 , . . . , xk ). Here, “pp” stands for “primitive positive”. We use [Γ] to denote the set of all relations that are ∀∃∧-definable from Γ, and we use hΓi to denote the set of all relations that are pp-definable from Γ. In the following, we frequently make use of the following result, which was shown in [5] for constraint languages over a finite domain, but which is purely syntactic and holds for infinite domain constraint languages as well. Lemma 3.1 Let Γ1 , Γ2 be constraint languages. If every relation in Γ1 has a ∀∃∧-definition in Γ2 (that is, Γ1 ⊆ [Γ2 ]), then QCSP(Γ1 ) is logarithmic space reducible to QCSP(Γ2 ). In this paper we focus on the QCSP of a fundamental class of highly symmetric ω-categorical constraint languages, called equality constraint languages. An equality-definable relation is a relation (on an infinite domain) that has a first-order definition by an equality-formula, i.e., a first-order formula where all atomic subformulas are of the form x = y. An equality constraint language is a relational structure on a countably infinite universe such that all of its relations are equalitydefinable relations. In general, we use the symbol D to denote an infinite domain. For every constraint language Γ over an infinite domain there exists a constraint language Γ0 over a countably infinite domain such that Γ and Γ0 have the same QCSP; this follows from downward Löwenheim-Skolem (see for example [15]). For equality constraint languages we even have that the complexity classification is the same for all infinite domains D. This can be seen as follows. Let D and D0 be two infinite sets (not necessarily of the same cardinality). Suppose that φ1 , . . . , φs are equality formulas, and let R1 , . . . , Rs be the relations defined by φ1 , . . . , φs over D, and let R10 , . . . , Rs0 be the relations defined by φ1 , . . . , φs over D0 . Then QCSP((D0 ; R1 , . . . , Rn )) and QCSP((D; R10 , . . . , Rn0 )) are the same computational problem. So, for the main result it would have been sufficient to prove the classification only for countable equality constraint languages, and to deduce the general case from the countable case; however, all lemmata and all proofs work independently from the domain size and so we make all statements for arbitrary infinite domains. All equality constraint languages admit quantifier elimination [15]: this is, all equality-formulas are logically equivalent to a quantifier-free equality-formula. Therefore, equality constraint languages can be introduced equivalently as those relational structures where all relations can be defined by a boolean combination of atoms of the form x = y. It is also straightforward to verify that the first-order theory of equality constraint languages is totally categorical. 5

When Γ is an equality constraint language with domain D, any permutation of D is an automorphism of Γ, that is, the automorphism group of Γ is the full symmetric group on D. It is easy to see that all injective unary operations on D are polymorphisms of an equality constraint language Γ on D. In fact, a relational structure is an equality constraint language if and only if it is preserved by all injective unary operations on D. The quantified constraint satisfaction problem for equality constraint languages over an infinite domain D is in PSPACE. This is closely related to the observation by Stockmeyer and Meyer that the first-order logic of equality has a validity problem that is contained in PSPACE [25]. For completeness, we sketch a proof that the problems under study are in PSPACE in Appendix B. Proposition 3.2 For every equality constraint language Γ, the problem QCSP(Γ) is in PSPACE.

4

The Surjective Preservation Theorem

It was proved in [5] that a relation R has a ∀∃∧-definition in a finite structure Γ if and only if R is preserved by all surjective polymorphisms of Γ. It turns out that the same characterization holds for ω-categorical structures Γ (Theorem 4.2). Our proof relies on the following model-theoretic preservation theorem, which follows from [18] (the result is not covered by standard text books as [10] or [15, 21]). We thank J. Keisler for helpful explanations. To deduce our result from general results in model theory, we need to some (standard) terminology; for completeness, it will be introduced in Appendix A. The proofs for the statements in this Section can be found in Appendix B. Theorem 4.1 (essentially from [18]) A formula φ is equivalent to a ∀∃∧-formula modulo a firstorder theory T if φ is preserved by surjective homomorphisms from finite direct products of models of T to models of T . A standard model-theoretic argument, which can be found in the appendix, allows one to deduce the desired preservation theorem for surjective polymoprhisms. Theorem 4.2 Let Γ be an ω-categorical structure. Then a relation R has a ∀∃∧-definition in Γ if and only if R is preserved by all surjective polymorphisms of Γ. The proof of Theorem 4.2 that is presented in the appendix also works for κ-categorical structures Γ for arbitrary cardinals κ (and without appeal to the theorem of Engeler, Ryll-Nardzewski, and Svenonius, which is used there), with the only difference that the statement of Theorem 4.2 holds only for relations R with a first-order definition in Γ. In fact, since we want to state all arguments directly for domains D of arbitrary cardinality, we will use the following theorem in the following (which, together with the theorem of Ryll-Nardzewski, directly implies Theorem 4.2). Theorem 4.3 Let Γ be an κ-categorical structure. Then a first-order definable relation R has a ∀∃∧-definition in Γ if and only if R is preserved by all surjective polymorphisms of Γ. Corollary 4.4 Let Γ1 , Γ2 be equality constraint languages over the same infinite domain D. If sPol(Γ2 ) ⊆ sPol(Γ1 ), then QCSP(Γ1 ) is logarithmic space reducible to QCSP(Γ2 ). Proof . Directly from Lemma 3.1 and Theorem 4.3.

5

Classification Theorem

This section presents the statement of our classification theorem for equality constraint languages. We first identify two properties of equality constraint languages that will be needed to state the theorem. The first property is that of being negative. 6

Definition 5.1 An equality-definable relation R is called negative if it is definable by a quantifierfree conjunction of equalities and disjunctions of disequalities. An equality constraint language Γ is negative if every one of its relations is negative. Example 5.2 Let S ⊆ D7 be the relation defined by S(x1 , x2 , x3 , x4 , x5 , x6 , x7 ) ≡ (x1 6= x2 ∨ x2 6= x3 ) ∧ (x5 6= x4 ∨ x4 6= x3 ) ∧ (x1 = x6 ) ∧ (x3 = x7 ). From the form of the definition, we can see that S is negative. The second property is that of being positive. Definition 5.3 An equality-definable relation R is positive if it is definable by a formula consisting of equalities and the connectives ∧ and ∨. An equality constraint language Γ is positive if every one of its relations is positive. Example 5.4 Let S ⊆ D4 be the relation defined by S(x1 , x2 , x3 , x4 ) ≡ (x1 = x2 )∨(x2 = x3 ∧x3 = x4 ). The relation S is positive. The following is the statement of our classification theorem. Theorem 5.5 (Classification theorem) Let Γ be an equality constraint language. • If Γ is negative, then the problem QCSP(Γ) is decidable in logarithmic space. • Else, if Γ is positive, then the problem QCSP(Γ) is NP-complete. • Else, the problem QCSP(Γ) is PSPACE-complete. In the second and third cases, the problem QCSP(Γ) is complete for the respective complexity class under logarithmic space reducibility. We now give a proof of the classification theorem which contains forward references to the needed results presented in the subsequent sections. This proof may well serve as a road map for the next four sections of the paper, which together establish the classification theorem. For ease of readability, in each of these sections, the main result of the section is stated as the first theorem. Our proof of the classification theorem only refers to these “first theorems”. Proof . If the constraint language Γ is negative, then Theorem 6.1 of Section 6 shows that QCSP(Γ) is decidable in logarithmic space. If the constraint language Γ is not negative but is positive, then Theorem 7.1 shows that QCSP(Γ) is NP-complete. Otherwise (that is, if the constraint language is not negative nor positive), Theorem 8.1 shows that [Γ] contains the relation I ⊆ D3 (defined in Section 8). By Theorem 9.1, we have that QCSP((D; I)) is PSPACE-hard; QCSP((D; I)) reduces to QCSP(Γ) by Lemma 3.1.

6

Negative Languages

In this section, we present an algorithmic result for negative constraint languages. Theorem 6.1 If Γ is a negative constraint language, then QCSP(Γ) is decidable in logarithmic space. Let Γ be negative, and let Φ be an instance of QCSP(Γ) with variables X = {x1 , . . . , xn }. If R(v1 , . . . , vk ) is a constraint in Φ, we use φR(v1 ,...,vk ) to denote the formula with variables v1 , . . . , vk that defines R as a conjunction of equalities and disjunctions of disequalities. We say that a disjunction of disequalities ψ appears in Φ if it is a member of one of the conjunctions φR(v) with R(v) a constraint of Φ. We will make use of the following graphs. • The undirected graph G= Φ is defined to have vertex set X, and for each pair {x, y} ⊆ X, the edge {x, y} is present if and only if there exists a constraint R(v) in Φ such that φR(v) contains x = y. 7

• Let ψ be a disjunction of disequalities appearing in Φ. The undirected graph Gψ Φ is defined to have vertex set X, and contains all edges of G= as edges; in addition, the edge {x, y} is Φ present if the disequality x 6= y is part of the disjunction ψ. Our logarithmic space decision procedure is based on the following characterization of true instances. This characterization is given by “forbidden paths”: an instance is true if the graphs ψ G= Φ and GΦ omit paths of certain types; otherwise, it is false. For the correctness proof of the following algorithm, but also for some arguments in Section 9, it will be convenient to view an instance Q1 v1 . . . Qn vn Ψ of the QCSP as a two-player game between a universal and an existential player. The game proceeds in rounds; in round i, the players assign a value from D to the variable vi . If Qi is existential, the existential player chooses the value for vi , and otherwise the universal player chooses the value. The existential player wins the game if after round n the constructed assignment satisfies all the constraints in Ψ, and otherwise the universal player wins. It is straightforward to verify that an instance of the QCSP is true if and only if the existential player has a winning strategy in the corresponding game. A variable vi is called earlier (later) than variable vj in an instance Q1 v1 . . . Qn vn Ψ of the QCSP if i < j (i > j). Theorem 6.2 Let Φ be an instance of QCSP(Γ) where Γ is negative. The formula Φ is false if and only if one of the following two conditions hold: 1. There exists a universally quantified variable that is connected in G= Φ to an earlier variable. 2. There exists a disjunction of disequalities ψ(z1 , . . . , zk ) appearing in Φ such that for all existentially quantified variables x ∈ {z1 , . . . , zk }, either (a) x is connected in G= Φ to an earlier variable from {z1 , . . . , zk }, or (b) x is the earliest variable among the variables in {z1 , . . . , zk } that are connected to x in Gψ Φ. Proof . We first show that if one of these two conditions hold, the the formula is false, by proving that the universal player has a winning strategy. If Condition 1 applies, she can play a fresh value at the universal variable that is connected in G= Φ to a previous variable, and clearly wins. Now, suppose that Condition 2 applies to a disjunction of disequalities ψ appearing in Φ. For z = from z1 , . . . , zk , let C ψ (z) be the connected component of z in Gψ Φ , and let C (z) be the connected = component of z in GΦ . Then we have: for all existential variables x ∈ {z1 , . . . , zk }, either the earliest variable in C = (x) equals the earliest variable in C ψ (x), or the earliest variable in C = (x) is a universally quantified variable. If both these conditions are violated, then the earliest variable in C = (x) is existential and not the earliest variable in C ψ (x). Thus, this earliest variable violates both (a) and (b) in Condition 2, a contradiction. Now, if the universal player has to play a value for a universally quantified variable y from z1 , . . . , zk , she plays the value of the earliest variable in C ψ (y) if the earliest variable has already been set; otherwise, y is the earliest variable in C ψ (y) and can be set arbitrarily. For all other variables she can play arbitrarily. Suppose that there is a strategy for the existential player that wins against the above strategy for the universal player. Consider a connected component C in Gψ Φ . The universal variables in C will all be set to the value a of the earliest variable in C. Let x be an existential variable in C. If existential player wins, all variables in C = (x) must have the same value. Consider the earliest variable z in C = (x). By the observation of the last paragraph, z equals the earliest variable in C or is universal. In the first case, we know that a is assigned to z. If z is universal, then the strategy for the universal player also assigned a to z. Hence, all values in C are mapped to a. Since this holds for all connected components C in Gψ Φ , the disjunction of disequalities is falsified, contradicting the assumption that existential player wins. If Condition 1 and 2 do not apply, then the existential player has the following winning strategy. If he has to play a value for x, and x is connected in G= Φ to an earlier variable, he plays the value of this earlier variable. Otherwise, he picks a fresh value. We have to show that this strategy wins against any strategy for the universal player. We first show that the constraints of the form 8

x = y in Φ are satisfied in that way. Since condition 1 does not apply, for each component of G= Φ, all variables–except possibly the earliest–are existentially quantified. Hence, the given existential strategy satisfies all equality constraints x = y. Next, consider a disjunction of disequalities ψ appearing in Φ. Since condition (2) does not hold, there exists an existentially quantified variable x of ψ such that neither (2a) nor (2b) holds. Since (2a) fails, x is (by the described strategy) set to a value different from all variables earlier than x; but, since (2b) fails, x is connected to an earlier variable z in Gψ Φ . Since x and z are set to different values, there exists an edge on the path from x to z that must have vertices set to different values. We claim that this edge does not correspond to an equality edge in Gψ Φ . Due to condition (1), the later variable (of the edge) can not be universal; but if the later variable is existential and the edge is in Gψ Φ , the existential player would have set its value to the value of the earlier variable. Therefore, the edge corresponds to an disequality in ψ, and this disequality is satisfied. Proof . (Theorem 6.1) We use the recent result that L=SL [23]. From this result, a problem is in L if it can be Turing-reduced to undirected graph reachability, USTCON. The undirected connectivity queries that we are going to use will be exclusively with respect to graphs of the form ψ G= Φ or GΦ where ψ is a disjunction of disequalities appearing in Φ. Condition 1 and 2 can be verified in logarithmic space, once we have an oracle that decides ψ whether two specified vertices are connected in G= Φ or in GΦ , for a disjunction of disequalities ψ. These graphs can be constructed with logarithmic work-space. The first condition needs to be tested for all pairs of variables, and the second condition for all disjunctions of disequalities, and all pairs of variables. Hence, the only non-constant additional space is needed for storing the next variables and constraints that will be tested, which is possible in logarithmic space. We want to remark that every constraint language that contains a relation symbol for = has a quantified constraint satisfaction problem that is hard for L, since the problem USTCON can be simulated by such a QCSP (using two universal quantifiers).

7

Positive Languages

This section is devoted to the proof of the following. Theorem 7.1 Let Γ be a positive constraint language that is not negative. Then QCSP(Γ) is NP-complete. We begin by presenting a proposition that gives a number of useful characterizations of positivity of constraint languages. Definition 7.2 We say that a formula φ in conjunctive normal form is reduced, if removing a literal or a clause from φ results in a formula that is not equivalent to φ. Clearly, every formula is equivalent to a formula in reduced form. Proposition 7.3 Let Γ be an equality constraint language over domain D. The following are equivalent: 1. Γ is positive 2. Pol(Γ) contains all unary operations on D 3. Pol(Γ) contains a non-injective unary operation having infinite image 4. sPol(Γ) contains a non-injective unary operation 5. If φ is a reduced formula that defines a relation from Γ, then φ does not contain a disequality. 6. [Γ] does not contain the relation 6= 9

7. sPol(Γ) contains an operation that generates a constant operation. The proof of this proposition anticipates concepts and arguments from a forthcoming paper on the lattice of locally closed clones that contain all permutations [2]. The proof for the implication from (7) to (4) is due to Michael Pinsker. The following known result gives the complexity upper bound on positive constraint languages. Theorem 7.4 (follows from [19]) For any positive constraint language Γ, the problem QCSP(Γ) is in NP. We mention that recently, the present authors gave an alternative, polymorphism-based proof of Theorem 7.4 [1]. We now show for one particular positive constraint language Γ that CSP(Γ) is NP-complete. Let T be the ternary relation defined by T (x, y, z) ≡ (x = y) ∨ (y = z). Clearly, T is positive. Proposition 7.5 The problem QCSP((D; T )) is NP-hard. Proof . Let L be the relation T applied to the two-element domain {0, 1}, that is, define L = {(a, b, c) ∈ {0, 1}3 : (a = b) ∨ (b = c)}. We show that the CSP over constraint language Γ = (D, L, {0}, {1}) reduces to QCSP((D, T )). Take an instance P of this CSP with variables {x1 , . . . , xn }. We create an instance P 0 of the QCSP with quantifier prefix ∀y0 ∀y1 ∃x01 . . . ∃x0n with the following constraints: • for each constraint L(xi , xj , xk ) in P , we create a constraint T (x0i , x0j , x0k ), • for each constraint {0}(xi ) in P , we create a constraint y0 = x0i , • for each constraint {1}(xi ) in P , we create a constraint y1 = x0i , • for each variable xi ∈ {x1 , . . . , xn } in P , we create a constraint T (y0 , x0i , y1 ). Notice that the instance P 0 may have constraints of the form ya = x0i , and so might not be an instance of QCSP((D, T )). However, we can easily compute an equivalent instance P 00 of QCSP((D, T )) from P 0 , where constraints ya = x0i are removed simply by 1) replacing all instances of x0i in constraints with ya , and 2) removing x0i from the quantifier prefix. We now argue that P 0 is true if and only if P is satisfiable. Suppose that P 0 is true. Let a0 , a1 be two values of the domain that are not equal, and set y0 = a0 and y1 = a1 . Since P 0 is true, there exists an assignment f 0 to the x0i that make all constraints of P 0 true. By the last group of constraints in P 0 , every single variable x0i is equal to either a0 or a1 . It is straightforward to verify that the mapping f : {x1 , . . . , xn } → {0, 1} where f (xi ) = 0 if f 0 (x0i ) = a0 , and f (xi ) = 1 if f 0 (x0i ) = a1 , is a satisfying assignment for P . Conversely, suppose that P is satisfiable by the assignment f : {x1 , . . . , xn } → {0, 1}. Then, for any assignment to the universal variables y0 , y1 , the assignment to the x0i that sets x0i to yf (xi ) satisfies the constraints of P 0 . Finally, we wish to show that the CSP over Γ is NP-hard. By [17], it suffices to show that Γ does not have as polymorphism one of the operations {0, 1, ∧, ∨, m, a} where 0 and 1 are the constant operations, ∧ and ∨ are the boolean AND and OR operations, respectively, m : {0, 1}3 → {0, 1} is the majority operation m(x, y, z) = (x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z), and a : {0, 1}3 → {0, 1} is the affine operation a(x, y, z) = x ⊕ y ⊕ z. Clearly, Γ does not have either of the constants 0, 1 as polymorphism: 1 is not a polymorphism of {0}, and 0 is not a polymorphism of {1}. The operation ∧ is not a polymorphism of T because (1, 1, 0) ∧ (0, 1, 1) = (0, 1, 0). The operation ∨ is not a polymorphism of T because (0, 0, 1) ∨ (1, 0, 0) = (1, 0, 1). The operation m is not a polymorphism of T because m((1, 1, 0), (0, 1, 1), (0, 0, 0)) = (0, 1, 0). And, the operation a is not a polymorphism of T because a((1, 1, 0), (0, 1, 1), (0, 0, 0)) = (1, 0, 1). We now present a lemma that deals with positive languages preserved by an essential and surjective operation. The proof is based on known combinatorial results from universal algebra, and due to Michael Pinsker (and considerably simplifies the original proof of the authors). It can be found in the appendix. 10

Lemma 7.6 Let Γ be a positive constraint language that is preserved by an essential operation on D with infinite image. Then Γ is preserved by all operations. In particular, such a constraint language is preserved by all binary injective operations. It was shown in [3] that in this case all relations of Γ can be defined by (quantifier-free) Horn formulas over (D, =), i.e., by quantifier-free formulas in conjunctive normal form where each clause contains at most one positive literal. It will later be convenient to have the following slight strengthening of this result. Proposition 7.7 If φ is a reduced definition of a relation preserved by an injective binary operation, then φ is a Horn formula. Proof . Let φ be a reduced formula that defines R (Definition 7.2). We show that no clause in φ contains two equalities. Assume otherwise that there exists a clause ψ of φ which contains two equalities xs = xt and xi = xj . Construct φ0 from φ by removing the equation xs = xt , and φ00 by removing xi = xj . There exist a, b ∈ Dn such that φ(a) but not φ0 (a), and φ(b) but not φ00 (b). Clearly, as = at , ai 6= aj , bs 6= bt , and bi = bj . Because R is preserved by g, it contains a tuple c = g(a, b) where cs 6= ct and ci 6= cj . But then φ(c) does not hold, a contradiction. We can now prove Theorem 7.1. Proof . (Theorem 7.1) The containment of QCSP(Γ) in NP is given by Theorem 7.4. We prove NP-hardness of QCSP(Γ) as follows. If all operations in sPol(Γ) are essentially unary, then they all preserve T . Hence, by Corollary 4.4, the problem QCSP((D, T )), which is NP-hard by Proposition 7.5, reduces to QCSP(Γ). Otherwise, sPol(Γ) contains an essential operation, and by Lemma 7.6 Pol(Γ) contains all operations on D. It then follows from Proposition 7.7 that Γ is negative.

8

Analysis of Non-Positive Languages

We have already proved the classification theorem for positive languages. We now show that if a language is not positive, then it is either negative or it can express the relation I defined by I(x, y, z) ≡ ((x = y) → (y = z)) . In Section 9, we finally show the QCSP for the relation I is PSPACE-hard. Theorem 8.1 Let Γ be a constraint language that is not positive. Then either Γ is negative or [Γ] contains I. It will be helpful to consider the relation S = {(x, y, z) | (x = y ∧ y = z) ∨ (x 6= y ∧ y 6= z ∧ x 6= z} containing all tuples whose values are either all equal, or all different, and the relation T = {(a, b, c, d) | a = b 6= c = d ∨ the values a, b, c, d are pairwise distinct} . Note that both relations consist of exactly two orbits of tuples. This will be relevant due to the following general lemma (which holds for arbitrary structures). Lemma 8.2 Let R be a k-ary relation that consists of m orbits of k-tuples. Then every operation f that violates R generates an m-ary operation that violates R. Proof . Let f 0 be an operation of smallest arity l that is generated by f and that violates R. Then there are k-tuples t1 , . . . , tl ∈ R such that f 0 (t1 , . . . , tl ) ∈ / R. For l > m there are two tuples ti and tj that lie in the same orbit of k-tuples, and therefore there is a permutation h such that h(tj ) = ti . By permuting the arguments of f 0 , we can assume that i = 1 and j = 2. 11

Then the (l − 1)-ary operation g defined as g(x2 , . . . , xl ) := f 0 (h(x2 ), x2 , . . . , xl ) also violates R, a contradiction. Hence, l ≤ m. In case that l = m, we are done. In case that l < m, the result also follows, because we can then always obtain an m-ary operation that violates R by composing f 0 with projections. Lemma 8.3 The relation S can pp-define the relation I. Proof . I(u, v, w) ≡ ∃x∃x0 (S(u, v, x) ∧ S(u, v, x0 ) ∧ S(x, x0 , w)) is a pp-definition of I. We verify this as follows. Consider the pp-definition. If u and v are equal, then by the predicates S(u, v, x) and S(u, v, x0 ), we must have that u = v = x = x0 , and then by the predicate S(x, x0 , w), it must hold that u = v = w. Moreover, if u, v, w are all equal, then setting x and x0 to the same value as the values of u, v, w, we have that all three predicates are satisfied. If u and v are not equal, then for any assignment to w, if x and x0 are set to values that are pairwise distinct from each other and from each of u, v, w, then all three of the predicates are satisfied. Lemma 8.4 The relation T can pp-define the relation I. Proof . It suffices to show that T expresses the relation S of the previous lemma (Lemma 8.3), by that lemma. We claim that S has the following pp-definition. S(u, v, x) ≡ ∃y∃z(T (u, v, y, z) ∧ T (v, x, y, z)) . We verify this as follows. If u = v = x all have the same value then setting y = z to be a distinct value results in both of the predicates being satisfied. If u, v, and x have pairwise distinct values then setting y and z to have values pairwise distinct from each other and from the values of u, v, x, also results in both of the predicates being satisfied. We now prove that all other tuples do not satisfy the given pp-formula. Suppose that u = v. Then it must hold, by the first predicate T (u, v, y, z), that y = z. This in turn implies, by the second predicate T (v, x, y, z), that v = x. Thus u = v = x. Next, suppose that u 6= v. By the first predicate, it must hold that u, v, y, z are all pairwise distinct, and then the second predicate implies that v, x, y, z are all pairwise distinct, so we have that x 6= u and x 6= v, that is, u, v, x are pairwise distinct. We will use a result of [3], which was stated there for operations over a countable domain only, but whose proof applies to operations over any infinite domain as well. Theorem 8.5 (of [3]) Let g be an essential operation that preserves 6=. Then g generates all injective operations. Lemma 8.6 If R is not negative and preserved by a binary injective operation, then the relation I has a pp-definition in (D, R, 6=). Proof . Let φ be a reduced definition of R. Proposition 7.7 shows that φ is a Horn formula. Among all reduced definitions, we choose one that minimizes the vector telling us how many clauses there are of each length, in a lexicographical sense: we prefer any number of length 3 clauses to a length 4 clause, for instance. Since R is Horn but not negative, φ contains a clause C of the form xi1 = xj1 ∨ xi2 6= xj2 ∨ · · · ∨ xik 6= xjk for k ≥ 2. Because φ is reduced, we know we have some tuple t ∈ R where xi1 = xj1 is true and all other literals in C are false, and we also have some tuple s ∈ R where xi2 6= xj2 is true and all other literals in C are false. Let R0 (x1 , . . . , xn ) be the relation defined by the formula φ(x1 , . . . , xn ) ∧

k ^ l=3

12

xil = xjl ,

and let R00 be the projection of R0 to the i1 th, j1 th, i2 th, and j2 th argument. Note that |{i1 , j1 , i2 , j2 }| ≥ 3 because φ is reduced, and therefore the arity of R00 is either 3 or 4. Because t ∈ R satisfies xi1 = xj1 and xi2 = xj2 , and also satisfies all the additional equations xil = xjl for l ≥ 3, the relation R00 contains a tuple t0 that satisfies xi1 = xj1 and xi2 = xj2 . The analogous argument for s shows that R00 contains a tuple s0 where xi1 6= xj1 and xj1 6= xj2 . First, consider the case that |{i1 , j1 , i2 , j2 }| = 3. By renaming variables, we can assume that j1 = i2 . Suppose that R00 contains a triple with three distinct entries. We know that R00 (xi1 , xj1 , xj2 ) does not contain a tuple of the form (x, x, y) for x 6= y, because such a tuple Vk violates C ∧ l=3 xil = xjl . But then the relation defined by R00 (x, y, z) ∧ R00 (y, z, x) ∧ R00 (z, x, y) only contains the tuple with three distinct entries and the tuple with three equal entries, and hence we have found a pp-definition of the relation S in (D, R, 6=). By Lemma 8.3 we are done. Suppose now that R00 (xi1 , xi2 , xj2 ) does not contain a tuple with three distinct entries. Then it is impossible that there is a tuple r ∈ R00 where xi1 6= xj2 , because the tuple obtained from applying the binary injective operation to r and s0 has three distinct entries. Thus the clause xi3 6= xj3 ∨ · · · ∨ xik 6= xjk ∨ xi1 = xj2 is entailed by φ. Hence, we could replace the clause C in φ by this shorter clause and obtain a formula that is equivalent to φ, because xi1 = xj2 also implies xi1 = xj1 ∨ xj1 6= xj2 . This contradicts the choice of φ. Now take the case where |{i1 , j1 , i2 , j2 }| = 4. Observe that R00 does not contain tuples with Vk xi1 6= xj1 and xi2 = xj2 , because such tuples violate C ∧ l=3 xil = xjl . If t0 ∈ R00 satisfies xi1 = xj1 6= xi2 = xj2 , then the tuple obtained from applying the binary injective operation to t0 and s0 has four distinct entries. It is easy to verify that the formula R00 (a, b, c, d) ∧ R00 (c, d, a, b) ∧ a 6= c ∧ a 6= c ∧ b 6= c ∧ b 6= d

(1)

is a pp-definition of the relation T (a, b, c, d), and we are done by Lemma 8.4. Now suppose that t0 ∈ R00 satisfies xj1 = xi2 and hence t0 is the tuple with four equal entries. If the tuple in R00 that has the most values has exactly two values, by the observation above each of these values must occur twice. So we have two forced equalities y = z and x = w, and we can observe as before that xi3 6= xj3 ∨ · · · ∨ xik 6= xjk ∨ y = z and xi3 6= xj3 ∨ · · · ∨ xik 6= xjk ∨ x = w are both entailed by the original formula, and we could replace the original clause with these two clauses. So suppose that the tuple in R00 that has the most values only has one value that occurs twice. We can suppose this tuple has the form (a, b, b, c), up to rearranging coordinates. We claim that the expression R00 (a, b, b, c) ∧ R00 (c, a, a, b) ∧ R00 (b, c, c, a) defines the relation S(a, b, c). Clearly, the tuple that contains three distinct values, and the tuple that contains three equal values satisfy the expression. For any other tuple there is a conjunct in the expression that has the form R00 (a, b, b, b) for a 6= b, and by the observation above the tuple is excluded. Finally, suppose that the tuple in R00 that has the most values contains four distinct values. We know that some tuple where one value occurs once and another value occurs three times is excluded (we have just shown this). If some tuple where two values occur twice is excluded from R00 , then the relation ^ ∃a R00 (π(a), π(b), π(c), π(d)) π

is a pp-definition of the relation S(b, c, d), where the conjunction is over all permutations π of {a, b, c, d}. Otherwise R00 contains all tuples where two values occur excactly twice. Similarly as before it is easy to verify that Expression (1) is a pp-definition of T (a, b, c, d), and we are again done by Lemma 8.4. Proof . (Theorem 8.1) Let Γ be neither positive nor negative, and assume for contradiction that I∈ / [Γ]. Because Γ is not positive, Proposition 7.3 implies that 6=∈ [Γ]. Because I ∈ / [Γ] Proposition 8.3 implies that S ∈ / [Γ]. Because the relation S consists of two orbits of triples, Lemma 8.2 implies that the polymorphism that violates S generates a binary 13

operation f that violates S. If f violates S, there must be two triples u, v in S such that in the triple w = f (u, v) exactly two entries are the same and one is distinct. If f was essentially unary, suppose without loss of generality that f depends on the first argument. Then u can not have three equal entries, and therefore all entries in u have distinct values. But then, to violate S the operation f must violate 6=. Therefore, every Γ has an essential polymorphism that preserves 6=. Because Γ is not negative, we can then apply Lemma 8.6 and obtain that I ∈ [Γ], a contradiction.

9

PSPACE-hardness

In this section, we prove that the QCSP over the relation I is PSPACE-hard. Theorem 9.1 QCSP((D; I)) is PSPACE-hard. Before giving the proof, we will need to introduce some notions and intermediate results. Definition 9.2 Let us say that a relation R ⊆ {0, 1}n is force-definable if there exists a prenex formula ΦR,v0 ,v1 (b0 , b1 , x1 , . . . , xn ) = Qφ over I such that: • Q is the quantifier prefix and φ is the quantifier-free part, • the free variables of ΦR are {b0 , b1 , x1 , . . . , xn }, • the quantifier prefix Q contains v0 and v1 as its innermost two variables, and they are both existentially quantified, • for any instantiation of the free variables {b0 , b1 , x1 , . . . , xn }, the formula Q(φ ∧ (v0 = b0 ) ∧ (v1 = b1 )) is true, • when t ∈ {0, 1}n , if b0 and b1 are set arbitrarily and xi is set to bti (for all i ∈ [n]), then: the formula Q(φ ∧ (v0 6= b0 ∨ v1 6= b1 )) is false if and only if t ∈ R • (monotonicity) for any setting to the free variables {b0 , b1 , x1 , . . . , xn }, if the formula Q(φ ∧ (v0 6= b0 ∨ v1 6= b1 )) is true, then changing the value of any variable xi to a value not equal to b0 nor b1 preserves the truth of the given formula. We call the formula ΦR,v0 ,v1 (b0 , b1 , x1 , . . . , xn ) the force-definition of R. The intuition behind this definition is the following. Thinking of the prenex formula as a two-player game, when t ∈ {0, 1}n is a tuple and the variables xi are set according to t (that is, by xi = bti ), if t ∈ R, then there is a way for the universal player to play so that v0 is forced to be equal to b0 , and v1 is forced to be equal to b1 ; and, if t ∈ / R, then it is not possible for the universal player to force both of these equalities simultaneously. Lemma 9.3 There exists a polynomial-time algorithm that, given a boolean circuit C as input, produces a force-definition of the relation RC containing, as tuples, the satisfying assignments of the circuit. Proof . It is known (see e.g. [11]) that from a boolean circuit C, it is possible to compute in polynomial time a primitive positive definition of RC over the relations N and P defined as N = {0, 1}3 \ {(1, 1, 1)} P = {0, 1}3 \ {(0, 0, 0)}.

14

Note that N contains all length 3 tuples with at least one coordinate having value 0 (the “negative” value), and P contains all length 3 tuples with at least one coordinate having value 1 (the “positive” value). Now, let ∃s1 . . . ∃sk C be a primitive positive definition of RC (x1 , . . . , xn ), where C is the conjunction of atomic formulas N (u1 , u2 , u3 ) and P(u1 , u2 , u3 ) with u1 , u2 , u3 ∈ {x1 , . . . , xn , s1 , . . . , sk }. Let us denote the atomic formulas N (u1 , u2 , u3 ) in C by N1 , . . . , Nl and the atomic formulas P(u1 , u2 , u3 ) in C by P1 , . . . , Pl . We now give a force-definition ΦRC ,v0 ,v1 (b0 , b1 , x01 , . . . , x0n ) of RC . Our force-definition has quantifier prefix: ∀s01 . . . ∀s0k ∃n1 . . . ∃nl ∃tn1 . . . ∃tnl ∃p1 . . . ∃pm ∃tp1 . . . ∃tpm ∃v0 ∃v1 The quantifier-free part of our force-definition has the following constraints: • For each constraint Ni = N (u1 , u2 , u3 ) in C, there is a constraint uj = b0 → ni = b0 for j = 1, 2, 3. Similarly, for each constraint Pi = P(u1 , u2 , u3 ) in C, there is a constraint uj = b1 → pi = b1 for j = 1, 2, 3. These constraints state that, for each constraint Ni (respectively, Pi ), if it is true, then the value ni must be set equal to b0 (respectively, pi must be set equal to b1 ). • There is a constraint n1 = b0 → tn1 = b0 and, for each i = 2, . . . , l, a constraint ni = tni−1 → tni = tni−1 . Similarly, there is a constraint p1 = b1 → tp1 = b1 and, for each i = 2, . . . , m, a constraint pi = tpi−1 → tpi = tpi−1 . The first sequence of constraints ensure that if all of the values ni are set equal to b0 , then the variable tnl must also be set equal to b0 ; also, if not all of the values ni are set equal to b0 , then the variable tnl may be set to any value. Similarly, the second sequence of constraints ensure that if all of the values pi are set equal to b1 , then the variable tpm must also be set equal to b1 ; and, if not all of the values pi are set equal to b1 , then the variable tpm may be set to any value. • There are constraints tnl = b0 → v0 = b0 and tpm = b1 → v1 = b1 . We verify that this is indeed a force-definition, as follows. Consider an assignment to the variables b0 , b1 , x01 , . . . , x0n where the x0i represent a tuple (x1 , . . . , xn ) ∈ {0, 1}n , that is, x0i = bxi for all i ∈ [n]. First suppose that (x1 , . . . , xn ) ∈ RC . Then, in the primitive positive definition of RC (x1 , . . . , xn ), there exists a boolean assignment to the variables s1 , . . . , sk under which all constraints in C are true. Consider the assignment to the variables s0i where s0i = bsi for all i ∈ [k]. We claim that any assignment to the remaining variables of the force-definition (all of which are existentially quantified) that satisfies the quantifier-free part must set v0 = b0 and v1 = b1 . Since every constraint Ni and Pi is satisfied by the assignment to {x1 , . . . , xn , s1 , . . . , sk }, by the first group of constraints in our force-definition, we have that all values ni must be set equal to b0 , and all values pi must be set equal to b1 . Then, by the second group of constraints, all values tni must be set equal to b0 , and all values tpi must be set equal to b1 . By the last two constraints, we must have v0 = b0 and v1 = b1 . Next suppose that (x1 , . . . , xn ) ∈ / RC . Then, for any setting to the variables s0i , there exists some variable ni that is not forced to b0 , or some variable pi that is not forced to b1 . Let us suppose that nj is not forced to b0 (the other case is similar). In this case, it is possible to satisfy the second group of constraints by setting tn1 = · · · = tnj−1 = b0 , and the variables tnj , . . . , tnl to a fixed value outside of {b0 , b1 }. Then, v0 can be set to any value. Note that all variables {p1 , . . . , pm , tp1 , . . . , tpm , v1 } can be set equal to b1 .

15

We now turn to the PSPACE-hardness proof. This proof takes inspiration from a PSPACEhardness proof for the quantified constraint satisfaction problem for finite templates having a semilattice operation without unit [4]. Proof . (Theorem 9.1) We reduce from succinct graph unreachability. In this problem, the input is a boolean circuit with 2c inputs, which represents a graph G whose vertices are the tuples in {0, 1}c ; as well as two distinguished vertices s, t ∈ {0, 1}c . There is an directed edge (x, y) in the graph if and only if the circuit returns true given input (x, y). The question is to decide whether or not there is a directed path in the graph from s to t. This problem is known to be PSPACE-complete [22]. Define Ri ⊆ {0, 1}2c to be the relation containing all tuples (x, y) such that there exists a directed path in G from x to y of length less than or equal to 2i . We claim that a forcerepresentation of Rc can be computed from the input circuit in polynomial time. We give a forcerepresentation for each Ri from i = 0, . . . , c, using a recursive construction; from the construction, it will be clear that Rc can be constructed in polynomial time. Representations of the Ri . To obtain a force-representation for R0 , we simply take the input circuit, augment it so that it always returns true on inputs (x, x), and appeal to Lemma 9.3. Now, we assume that we have a force-representation for Ri , and give a force-representation for Ri+1 . Let ΦRi ,z0 ,z1 (b0 , b1 , (u, v)) = Qi φi be a force-representation for Ri with the indicated variables. Let E denote the relation {(x, y) ∈ {0, 1}4c : ∀i, xi = yi }. By Lemma 9.3, the boolean relation E has force-representations ΦE,v00 ,v10 (z0 , z1 , ((u, v), (x, q))) = Q0 φ0 and ΦE,v000 ,v100 (z0 , z1 , ((u, v), (q, y))) = Q00 φ00 with the indicated variables. The following is a force-representation for Ri+1 ; note that x, y, q, u, v all denote vectors of length c. ΦRi+1 ,w0 ,w1 (b0 , b1 , (x, y)) = ∀qQ0 Q00 ∀u∀vQi ∃w0 ∃w1 φ0 ∧ φ00 ∧ φi ∧ (v00 = v000 → w0 = v000 ) ∧ (v10 = v100 → w1 = v100 )) First, suppose that there is a path from x to y of length less than or equal to 2i , that is, (x, y) ∈ Ri . (Here, we associate the boolean values 0, 1 with the values b0 , b1 .) Then, there exists a tuple q such that (x, q), (q, y) ∈ Ri−1 . We want to show that the universal player can play so that the existential player is forced to set w0 = b0 and w1 = b1 . Let the universal player play inside Q0 as if (u, v) = (x, q), so as to force v00 = z0 and v10 = z1 . Observe that if the existential player does not set v00 = z0 and v10 = z1 , then the universal player can falsify a constraint by subsequently setting (u, v) = (x, q). Similarly, the universal player can play inside Q00 in a way that (s)he forces v000 = z0 and v100 = z1 . We claim that, in fact, at this point in the game (after the variables in Q0 and Q00 have been set), it must be the case that v00 and v000 must both be set to b0 , and v10 and v100 must both be set to b1 . If we do not have (for example) v00 = b0 and v10 = b1 , then the universal player can set (u, v) = (x, q), and then, to satisfy the constraints of φi , the existential player must set z0 = b0 and z1 = b1 ; since we already established that the existential player must obey v00 = z0 and v10 = z1 , it follows that some constraint is falsified. Finally, observe that to satisfy the constraints (v00 = v000 → w0 = v000 ) ∧ (v10 = v100 → w1 = v100 ), the existential player must set w0 = b0 and w1 = b1 . The reduction.

Having computed a force-representation ΦRc ,w0 ,w1 (b0 , b1 , (x, y)) = Qc φc

16

of Rc , we now show how to use it to give an instance of QCSP((D; I)) that is true if and only if there is no path from s to t in the succinctly represented graph. The instance created is ∃b0 ∃b1 ∃x∃yQc (φc ∧ (b0 6= b1 ) ∧ (x = s) ∧ (y = t) ∧ (v0 6= b0 ∨ v1 6= b1 )) . Note that a disjunction a 6= b ∨ c 6= d, which is used in this expression, has the primitive positive definition ∃b0 ∃d0 ∃t (I(a, b, b0 )∧I(b, b0 , t)∧I(c, d, d0 )∧I(d, d0 , t)∧(b0 6= d0 )), and x 6= y has the ∀∃∧definition ∀z. I(x, y, z); therefore, these expressions can be used in our reduction (Lemma 3.1).

10

Concluding Remarks

We would like to remark that the classification of the computational complexity of QCSP(Γ) for equality constraint languages Γ presented in this paper is effective in the sense that there is an algorithm that decides in a finite amount of time whether a given constraint language (whose relations are represented by equality-formulas) has a QCSP in L or has a QCSP that is PSPACEcomplete or NP-complete. This is obvious from the syntactic descriptions of the corresponding equality constraint languages presented in the classification theorem. Acknowledgements We thank Jerome Keisler for helpful explanations concerning [18], and Michael Pinsker for friendly permission to state in the appendix some of the proofs from forthcoming joint work [2].

References [1] M. Bodirsky and H. Chen. Collapsibility in infinite-domain quantified constraint satisfaction. In Computer Science Logic (CSL), Szeged, Hungary, 2006. [2] M. Bodirsky, H. Chen, and M. Pinsker. The locally closed clones above the permutations. In preparation. [3] M. Bodirsky and J. K´ ara. The complexity of equality constraint languages. Theory of Computing Systems, to appear; an extended abstract appeared in the Proceedings of the International Computer Science Symposium in Russia (CSR’06), 2007. [4] F. Boerner, A. Bulatov, H. Chen, P. Jeavons, and A. Krokhin. The complexity of constraint satisfaction games and qcsp. Preprint NI06023, Isaac Newton Institute, Cambridge. Submitted for journal publication, 2006. [5] F. Boerner, A. Bulatov, A. Krokhin, and P. Jeavons. Quantified constraints: Algorithms and complexity. In Proceedings of CSL’03, LNCS 2803, pages 58–70, 2003. [6] A. Bulatov. Tractable conservative constraint satisfaction problems. In Proceedings of LICS’03, pages 321–330, 2003. [7] A. Bulatov and V. Dalmau. A simple algorithm for Mal’tsev constraints. SIAM J. Comp., 36(1):16– 27, 2006. [8] A. Bulatov, A. Krokhin, and P. G. Jeavons. Classifying the complexity of constraints using finite algebras. SIAM Journal on Computing, 34:720–742, 2005. [9] P. J. Cameron. Oligomorphic Permutation Groups. Cambridge University Press, 1990. [10] Chang and Keisler. Model theory. Princeton University Press, 1977. [11] N. Creignou, S. Khanna, and M. Sudan. Complexity Classification of Boolean Constraint Satisfaction Problems. SIAM Monographs on Discrete Mathematics and Applications. Society for Industrial and Applied Mathematics, 2001. [12] M. V. Emil W. Kiss. On tractability and congruence distributivity. In Proceedings of LICS’06, pages 221–230, 2006.

17

[13] L. Haddad and I. G. Rosenberg. 46(5):951–970, 1994.

Finite clones containing all permutations.

Canad. J. Math.,

[14] P. Hell and J. Neˇsetˇril. Graphs and Homomorphisms. Oxford University Press, 2004. [15] W. Hodges. A shorter model theory. Cambridge University Press, 1997. [16] P. Jeavons, D. Cohen, and M. Gyssens. Closure properties of constraints. Journal of the ACM, 44(4):527–548, 1997. [17] P. G. Jeavons. On the algebraic structure of combinatorial problems. Theoretical Computer Science, 200:185–204, 1998. [18] J. Keisler. Reduced products and Horn classes. Trans. Amer. Math. Soc., 117:307–328, 1965. [19] D. Kozen. Positive first-order logic is NP-complete. IBM Journal of Research and Development, 25(4):327–332, 1981. [20] A. I. Mal’tsev. A strengthening of the theorems of Slupecki and Iablonskii (Russian, english summary). Algebra i Logika (3), 6:61–75, 1967. [21] D. Marker. Model Theory: An Introduction. Springer, 2002. [22] C. H. Papadimitriou. Computational Complexity. Addison-Wesley, 1994. [23] O. Reingold. Undirected st-connectivity in log-space. Electronic Colloquium on Computational Complexity (ECCC), 094, 2004. [24] T. J. Schaeffer. The complexity of satisfiability problems. In Proceedings of STOC’78, pages 216–226, 1978. [25] L. J. Stockmeyer and A. R. Meyer. Word problems requiring exponential time: Preliminary report. In STOC, pages 1–9, 1973. [26] A. Szendrei. Clones in universal Algebra. Seminaire de mathematiques superieures. Les Presses de L’Universite de Montreal, 1986.

18

A

Model-theoretic Terminology

Let T be a first-order theory. Then two formulas φ and ψ are equivalent modulo T iff φ(a) ↔ ψ(a) holds for all a ∈ Γ in all models Γ of T . Let Γ be a τ -structure with domain D. If A ⊆ D, then τA denotes the language obtained by adding to τ relations symbols Ra for each a ∈ A. We can naturally view Γ as a τA -structure by interpreting Ra by {a}. A set p of τA -formulas with free variables v1 , . . . , vn is called an n-type over A if p ∪ Th(Γ) is satisfiable. We say that p is a complete n-type if φ ∈ p or ¬φ ∈ p for all τA -formulas φ with free variables from v1 , . . . , vn . If a is from Dn , let p be the complete n-type that consists of all τA -formulas that hold for a. In this case we say that p is realized, and that a realizes p. For ω-categorical structures, we can identify the complete k-types in Γ with orbits of k-tuples in Γ. Moreover, can identify the types of Γ with relations that are first-order definable in Γ. These are essentially consequences of the Theorem of Ryll-Nardzewski [9,15]. In this sense, every k-type in Γ is a disjoint union of a set of complete k-types in Γ. A structure Γ with domain D is called saturated if for all subsets A ⊆ D of strictly smaller cardinality than D every n-type over A is realized in Γ.

B B.1

Omitted Proofs Proposition 3.2

Proof . (idea) To solve the problem QCSP(Γ) in polynomial space, we can use a standard backtracking algorithm which always branches on the outermost quantifier. The key point is that, when the algorithm is about to branch on a variable v, the previous variables form an equivalence relation (where two variables are equal simply if they were assigned the same value), and as far as satisfying equality constraints is concerned, all that matters is whether or not v is set equal to the value of one of the equivalence classes, or set equal to a new value. Clearly, at each branching point there are at most linearly many choices, and so the backtrack search may be performed in polynomial space.

B.2

Theorem 4.1

Proof . Let φ be a formula with free variables x1 , . . . , xk that is preserved by surjective homomorphisms from finite direct products of models of a first-order theory T to models of T . Let τ be the signature of T . Let a1 , . . . , ak be constant symbols that do not appear in τ , and let τ 0 be the expanded signature τ ∪ {a1 , . . . , ak }. It is clear that the sentence φ(a1 , . . . , ak ) is preserved by surjective homomorphisms from finite direct products of τ 0 -models of T to τ 0 -models of T . Assume the continuum hypothesis, so that for every τ 0 -structure Γ there is a saturated τ 0 structure ∆ such that Th(Γ) = Th(∆) where the domain of ∆ is finite or of cardinality continuum [10]. Let U be the set of all ∀∃∧-sentences over the signature τ 0 that are implied by T ∪ S. By Corollary 3.2 of [18] (with K the class of all models of T ∪ S), for every model M of U there is a homomorphic image N of a direct product of τ 0 -models of T ∪ S such that Th(N ) = Th(M ). Therefore every model of T ∪ U is a model of S. By compactness, some finite subset U 0 of U is equivalent to S modulo T . Substituting the occurences of a1 , . . . , ak in the sentences in U 0 by the variables x1 , . . . , xk and taking conjunctions, this shows that the formula φ is equivalent to a ∀∃∧-formula. The continuum hypothesis can be eliminated by the method in Section 6.3 of [10]. A simpler alternative is to use the method of Theorem 5.1 in [18] to show that if one can prove in set theory that S is preserved by homomorphic images of finite direct products of models of T , then one can prove that S is equivalent to a ∀∃∧-sentence modulo T .

19

B.3

Theorem 4.2

Proof . First, observe that every relation R with a ∀∃∧-definition in Γ is preserved by all surjective polymorphisms. This can be shown in a straightforward way by induction on the structure of the ∀∃∧-definition of R. To prove the converse, let R be a k-ary relation that is preserved by all surjective polymorphisms of Γ. In particular, R is preserved by all automorphisms of Γ, and the Theorem of Engeler, RyllNardzewski, and Svenonius (see e.g. [9, 15] implies that R has a first-order definition φ in Γ. We claim that if φ is not preserved by a surjective homomorphism f from a finite direct product of models Γ1 , . . . , Γl of T := Th(Γ) to a model Γ0 of T , then it is also not preserved by some l-ary surjective polymorphism of Γ. This follows by a standard application of the theorem of LöwenheimSkolem, which is given below for completeness. Let ∆ be the disjoint union of Γ0 , Γ1 , . . . , Γl . This is, ∆ is a structure whose domain is the disjoint union of the domains of Γ0 , Γ1 , . . . , Γl defined as follows. The signature of ∆ contains the signature of Γ, where the interpretation of the relation symbols is simply the union of the interpretations of the relation symbols in Γ0 , Γ1 , . . . , Γl . Additionally, ∆ has a relation symbol R that denotes the union of the relations defined by φ in Γ0 , Γ1 , . . . , Γl , and has unary relation symbols P0 , P1 , . . . , Pl , where Pi denotes the elements from Γi , for 1 ≤ i ≤ l. Finally, ∆ has a l-ary function symbol g, which is interpreted as follows. If v1 , . . . , vl are elements from ∆, where vi is from Γi for 1 ≤ i ≤ l, then g(v1 , . . . , vl ) denotes the value of the surjective homomorphism f from Γ1 × · · · × Γl to Γ0 . Note that there are l-tuples t1 , . . . , tm from R such that (f (t1 ), . . . , f (tm )) is not in R. For all other l-tuples from Γ, the interpretation of g is set arbitrarily to some constant from ∆. By downward Löwenheim-Skolem, T h(∆) has a countable model ∆0 . Note that in ∆0 , each Pi denotes a countable, but still necessarily an infinite set. It is also easy to see that the structures Γ01 , . . . , Γ0k induced by these sets P1 , . . . , Pk in ∆0 have the same first-order theory as Γ, and by ω-categoricity of Γ are isomorphic to Γ. The function symbol g denotes in ∆0 still an operation that maps tuples v1 , . . . , vk where vi ∈ Γ0i for 1 ≤ i ≤ k surjectively to Γ00 , and still the restriction of g to those tuples violates the relation R. Hence, the restriction of g to these tuples gives rise to a surjective polymorphism of Γ that violates R. We have just seen that if R is preserved by all surjective polymorphisms of Γ, then R is preserved by surjective homomorphisms from a finite direct product of models Γ1 , . . . , Γl of T to a model Γ0 of T . Theorem 4.1 implies that φ is equivalent to a ∀∃∧-formula modulo T . Hence, R has in Γ a ∀∃∧-definition.

B.4

Theorem 7.3

Proof . (1) ⇒ (2): Suppose that Γ is positive. Let f : D → D be any unary operation. We want to show that any relation R of Γ is preserved by f . Let t ∈ R be any tuple. Observe that, for any two coordinates i, j, if ti = tj then f (ti ) = f (tj ). Thus, any equalities that hold in t also hold in f (t), and since R has a definition that is positive in equalities, we have f (t) ∈ R. (2) ⇒ (3): Trivial. (3) ⇒ (4): Let f ∈ Pol(Γ) be a non-injective unary operation having infinite image. Let g be a bijection from the image of f to D. The restriction of g to any finite set can be extended to a permutation, so gf is contained in the smallest locally closed clone containing f and all permutations. The operation gf is clearly non-injective and surjective. (4) ⇒ (5): Let f be a non-injective surjective unary polymorphism of Γ, and let R be a relation of Γ. Let φ be any reduced formula that defines R; we claim that φ is positive. Assume for contradiction that φ contains a clause ψ with a negative literal xi 6= xj . Since φ is reduced, there exists a k-tuple t such that φ(t) holds and such that xi 6= xj is the only literal that is satisfied by t in the clause ψ. Let f be the unary operation that maps ti and tj to the same value and all other elements to distinct values. Then f (t) does not satisfy the literal xi 6= xj , and clearly also does not satisfy any other literal in ψ. Hence, φ is not preserved under f , a contradiction. (5) ⇒ (1): Trivial. (4) ⇒ (6): Every unary non-injective operation violates 6=. By the simple direction of Theorem 4.3, the relation 6= does not have a ∀∃∧-definition in Γ.

20

(6) ⇒ (7): Suppose that 6= does not have a ∀∃∧-definition in Γ. The surjective preservation theorem (Theorem 4.3) asserts that Γ has a surjective operation that violates 6=. It was shown in [3] that every operation that violates 6= generates a constant operation. (7) ⇒ (4): Let f be an operation with infinite image, and assume it generates a constant operation. We will show that f generates a unary non-injective operation that has infinite range. We assume that f is essential, otherwise there is nothing to show. We also assume that f depends on all of its variables. Case 1: There exist injective unary operations g1 , . . . , gn such that f (g1 , . . . , gn ) is injective. Choose S ⊆ D infinite such that D \ S and D \ (g1 [S] ∪ . . . ∪ gn [S]) are infinite. Since f does not preserve 6=, there exist c, d ∈ Dn such that ci 6= di for all 1 ≤ i ≤ n and such that f (c) = f (d). We may assume that those values are not in S ∪ g1 [S] ∪ . . . ∪ gn [S]. For all 1 ≤ i ≤ n, set gi0 = gi on S and gi0 (c1 ) = ci and gi0 (d1 ) = di . Write T = S ∪ {c1 , d1 }. On T , F = f (g10 , . . . , gn0 ) is a operation with infinite range which is not injective. Furthermore, the gi0 can be extended to permutations on D since they are injective and have co-infinite domains and ranges. This completes the first case. Case 2: f (g1 , . . . , gn ) is not injective for all injective unary operations g1 , . . . , gn . Since f is not constant, there exist unary injections u1 , . . . , un such that f (u1 , . . . , un ) is not constant. By assumption for this case we have that f (u1 , . . . , un ) is not injective. Thus, f (u1 , . . . , un ) generates a unary operation r which is constant except for one value, at which it takes another value. To see this, just observe that every non-constant unary operation generates this operation. Consider arbitrary unary g1 , . . . , gn such that f (g1 , . . . , gn ) is injective. Such operations exist since f has infinite range. Pick an infinite S ⊆ D on which each gi is either injective or constant. Observe that it is impossible that all those restrictions of gi are constant. Say the restriction of gi to S is constant with value ci for 1 ≤ i ≤ k, where 1 ≤ k < n. Since f depends on its first variable, there exist ck+1 , . . . , cn ∈ D and a1 , . . . , ak ∈ D such that f (a1 , . . . , ak , ck+1 , . . . , cn ) 6= f (c). For otherwise the value of f is determined by the values of the arguments xk+1 , . . . , xn , contradicting that f depends on x1 . Choose dk+1 , . . . , dn ∈ D such that f (c1 , . . . , ck , dk+1 , . . . , dn ) is distinct from both f (a1 , . . . , ak , ck+1 , . . . , cn ) and f (c); this is possible as F (xk+1 , . . . , xn ) = f (c1 , . . . , ck , xk+1 , . . . , xn ) takes infinitely many values. Set ui (0) = ai and ui (x) = ci if x 6= 0, for 1 ≤ i ≤ k. Let moreover vi (1) = di , and vi (x) = ci if x 6= 1, for all k + 1 ≤ i ≤ n. All ui and vi are generated by the two-valued operation r and hence by f . Set F (x) = f (u1 , . . . , uk , vk+1 , . . . , vn )(x). We have that F (0) = f (a1 , . . . , ak , ck+1 , . . . , cn ) and F (1) = f (c1 , . . . , ck , dk+1 , . . . , dn ) and F (2) = f (c) are pairwise distinct; also, F has finite range. Hence, f generates a unary operation F that takes exactly three values. Pick a, b ∈ Dn such that ai 6= bi for all 1 ≤ i ≤ n and such that f (a) = f (b). For all 1 ≤ i ≤ k, let hi be a unary operation satisfying hi (0) = ai , hi (1) = bi , and hi (x) = ci for all x ∈ / {0, 1}. Clearly, hi is generated by the unary operation F with three values, and hence also by f . For all k + 1 ≤ i ≤ n, let αi be any permutation that maps 0 to ai and 1 to bi . Now we set G(x) = f (h1 (x), . . . , hk (x), αk+1 (x), . . . , αn (x)) and have that G(0) = f (a) = f (b) = G(1) and G takes infinitely many values.

B.5

Lemma 7.6

We use a theorem of Haddad and Rosenberg [13] about clones on finite domains containing all permutations; they credit the proof ideas to Malt’sev [20]. Theorem B.1 (of [13]) Let S be a finite set, |S| > 2, and let f be an essential operation on S with an image of size k, for k ≥ 3. Then f together with the permutations of 1, . . . , N generates all operations on S having an image of size k. From this theorem concerning clones on finite domains we obtain our result about clones on infinite domains by local closure. Proof . By Proposition 7.3, the positive constraint language Γ is preserved by all unary operations on D. It is clear that for each k we can find a unary operation a such that a(f ) is still essential but takes exactly k values. 21

Now, let g be an l-ary operation on D. By local closure, it suffices to show that for every finite subset A the operation f generates an l-ary operation h such that g(x) = h(x) for all x ∈ Al . Let S be a finite subset of D such that A ⊆ S, |S| > 2, and the restriction f 0 of f to S is essential and takes exactly k values (it is clear that such a subset always exists). By composing f 0 with permutations we can assume that f 0 is an operation defined on S. By Theorem B.1, f 0 generates all operations on S, and in particular it generates an operation h0 such that g(x) = h0 (x) for all x ∈ A. The operation h0 can be written as a composition of permutations of S and the function f 0 . In this term of compositions, we replace all occurrences of f 0 by f and all permutations α of S by the extension of α to all of D by mapping x ∈ D \ S to x. Clearly, the restriction of the operation h that is defined by the resulting new term equals h0 and therefore g on elements from A. Hence, by local closure, g is generated by f , and therefore preserves Γ as well.

22