Parsing beyond context-free grammar: Range Concatenation ...

Kallmeyer/Maier

ESSLLI 2008

Parsing beyond context-free grammar: Range Concatenation Grammar Parsing

ESSLLI Course 2008

Laura Kallmeyer, Wolfgang Maier
University of Tübingen

Overview

1. Range Concatenation Grammars (RCG)
2. Parsing RCG
   (a) Directional top-down parsing
   (b) Earley-style parsing
3. Uses of RCG

Range Concatenation Grammar

The idea behind range concatenation grammar (RCG) is comparable to the idea behind MCFG.

• While in MCFG, a string is generated, in RCG, a string is reduced to $\varepsilon$.
• One predicate can be true or false for a certain string.
• Some string $w$ is in the language of an RCG if the start predicate is true for $w$.
• Predicate-rewriting clauses describe ranges which are not necessarily adjacent.

Expressivity of RCG

• RCG exactly covers the class of PTIME recognizable languages (Bertsch & Nederhof, 2001).
• Simple RCG (basically non-deleting non-copying RCG) is equivalent to MCFG.
• RCG can represent languages beyond mild context-sensitivity.


Definition of RCGs: Grammar Definition

An RCG is a tuple $G = \langle N, T, V, P, S \rangle$ such that

• $N$ is a finite set of predicates, each with a fixed arity,
• $T$ and $V$ are disjoint finite sets of terminals and variables,
• $S \in N$ is the start predicate of arity 1, and
• $P$ is a finite set of clauses of the form

$A_0(x_{01}, \ldots, x_{0a_0}) \to \varepsilon$

or

$A_0(x_{01}, \ldots, x_{0a_0}) \to A_1(x_{11}, \ldots, x_{1a_1}) \ldots A_n(x_{n1}, \ldots, x_{na_n})$

with $n \geq 1$, $A_i \in N$, $x_{ij} \in (T \cup V)^*$ and $a_i$ being the arity of $A_i$.

A predicate $A_n(x_{n1}, \ldots, x_{na_n})$ can be written as $A_n(\vec{x}_n)$.

Definition of RCGs: Instantiation

A given clause $C$ is instantiated with respect to a string $w$ if variables and arguments are consistently replaced by ranges of $w$.

Example:

• $A(\langle i, j \rangle) \to B(\langle i+1, j \rangle)$

is an instantiation of the clause

• $A(a X_1) \to B(X_1)$

if $w_{i+1} = a$.

Definition of RCGs: Derivation Relation, Language

• The derivation relation is defined as follows: For a predicate $A$ of arity $k$, a clause $A(\ldots) \to \ldots$, and ranges $\langle i_1, j_1 \rangle, \ldots, \langle i_k, j_k \rangle$ with respect to a given $w$: if there is an instantiation of this clause with LHS $A(\langle i_1, j_1 \rangle, \ldots, \langle i_k, j_k \rangle)$, then $A(\langle i_1, j_1 \rangle, \ldots, \langle i_k, j_k \rangle)$ can be replaced with the RHS of this instantiation.
• The language of an RCG $G$ is the set of strings that can be reduced to the empty word:

$L(G) = \{ w \mid S(\langle 0, |w| \rangle) \overset{*}{\Rightarrow} \varepsilon \text{ with respect to } w \}$.

A sample RCG (1)

Sample RCG $G$ for the string language $\{ a^n b^k a^n \mid k, n \in \mathbb{N} \}$: an RCG with $N = \{S, A, B\}$, $T = \{a, b\}$, $V = \{X, Y, Z\}$, start predicate $S$ and clauses

• $S(X\,Y\,Z) \to A(X, Z)\,B(Y)$,
• $A(a\,X, a\,Y) \to A(X, Y)$,
• $B(b\,X) \to B(X)$,
• $A(\varepsilon, \varepsilon) \to \varepsilon$,
• $B(\varepsilon) \to \varepsilon$
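The definitions of instantiation and derivation can be tried out on the sample grammar for $\{a^n b^k a^n\}$ with a naive recognizer. What follows is only a minimal sketch, not an efficient parser. The tuple encoding of clauses, the convention that upper-case symbols are variables and lower-case symbols are terminals, and the names `bindings`, `instantiate` and `recognize` are assumptions made for this illustration. The sketch also assumes that right-hand-side arguments are single variables (as in the sample grammar) and that the grammar contains no cyclic clauses.

```python
from functools import lru_cache

# Sample grammar for {a^n b^k a^n}. Each clause is
# (predicate, LHS arguments, RHS predicates); () encodes an epsilon argument.
CLAUSES = [
    ("S", (("X", "Y", "Z"),), [("A", ("X", "Z")), ("B", ("Y",))]),
    ("A", (("a", "X"), ("a", "Y")), [("A", ("X", "Y"))]),
    ("B", (("b", "X"),), [("B", ("X",))]),
    ("A", ((), ()), []),
    ("B", ((),), []),
]


def bindings(arg, i, j, w, env):
    """Yield variable bindings under which the symbols of arg span w[i:j]."""
    if not arg:
        if i == j:
            yield env
        return
    head, rest = arg[0], arg[1:]
    if head.islower():  # terminal: must match the next input symbol
        if i < j and w[i] == head:
            yield from bindings(rest, i + 1, j, w, env)
    elif head in env:  # bound variable: must denote the same range again
        if env[head][0] == i and env[head][1] <= j:
            yield from bindings(rest, env[head][1], j, w, env)
    else:  # unbound variable: try every possible right boundary
        for m in range(i, j + 1):
            yield from bindings(rest, m, j, w, {**env, head: (i, m)})


def instantiate(lhs_args, ranges, w):
    """Return all consistent bindings of the LHS arguments to the given ranges."""
    envs = [{}]
    for arg, (i, j) in zip(lhs_args, ranges):
        envs = [e2 for e in envs for e2 in bindings(arg, i, j, w, e)]
    return envs


def recognize(w, clauses=CLAUSES, start="S"):
    """True if the start predicate can be reduced to epsilon for w."""

    @lru_cache(maxsize=None)
    def holds(pred, ranges):
        for name, lhs_args, rhs in clauses:
            if name != pred or len(lhs_args) != len(ranges):
                continue
            for env in instantiate(lhs_args, ranges, w):
                # every RHS predicate must itself reduce to epsilon
                if all(holds(p, tuple(env[v] for v in args)) for p, args in rhs):
                    return True
        return False

    return holds(start, ((0, len(w)),))
```

With this encoding, `recognize("aabaa")` succeeds, while `recognize("aaba")` fails because the two arguments of `A` cannot be reduced in parallel.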

A sample RCG (2)

As an example consider the reduction of $w = aabaa$. The clause $S(X\,Y\,Z) \to A(X, Z)\,B(Y)$ is instantiated with

• $X = w_{0,2} = aa$,
• $Y = w_{2,3} = b$,
• $Z = w_{3,5} = aa$.

With this instantiation, $S(w_{0,5}) \Rightarrow A(w_{0,2}, w_{3,5})\,B(w_{2,3})$. Then

A sample RCG (3)

the clause $B(b\,X) \to B(X)$, instantiated with $b = w_{2,3}$ and $X = w_{3,3} = \varepsilon$, and the clause $B(\varepsilon) \to \varepsilon$ lead to

$A(w_{0,2}, w_{3,5})\,B(w_{2,3}) \Rightarrow A(w_{0,2}, w_{3,5})\,B(w_{3,3}) \Rightarrow A(w_{0,2}, w_{3,5})$.

A sample RCG (4)

The clause $A(a\,X, a\,Y) \to A(X, Y)$, instantiated with

• first argument: $a = w_{0,1}$, $X = w_{1,2} = a$,
• second argument: $a = w_{3,4}$, $Y = w_{4,5} = a$,

leads to $A(w_{0,2}, w_{3,5}) \Rightarrow A(w_{1,2}, w_{4,5})$. Then

A sample RCG (5)

the same clause $A(a\,X, a\,Y) \to A(X, Y)$, instantiated with

• first argument: $a = w_{1,2}$, $X = w_{2,2} = \varepsilon$,
• second argument: $a = w_{4,5}$, $Y = w_{5,5} = \varepsilon$,

and the clause $A(\varepsilon, \varepsilon) \to \varepsilon$ lead to

$A(w_{1,2}, w_{4,5}) \Rightarrow A(w_{2,2}, w_{5,5}) \Rightarrow \varepsilon$
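For reference, the individual steps of the reduction of $w = aabaa$ can be collected into one derivation chain. The chain below only restates the instantiations given in the sample; the `align*` layout is a presentational choice.

```latex
\begin{align*}
S(w_{0,5}) &\Rightarrow A(w_{0,2}, w_{3,5})\, B(w_{2,3}) \\
           &\Rightarrow A(w_{0,2}, w_{3,5})\, B(w_{3,3}) \\
           &\Rightarrow A(w_{0,2}, w_{3,5}) \\
           &\Rightarrow A(w_{1,2}, w_{4,5}) \\
           &\Rightarrow A(w_{2,2}, w_{5,5}) \\
           &\Rightarrow \varepsilon
\end{align*}
```

Since $S(w_{0,5})$ reduces to $\varepsilon$, the string $aabaa$ is in the language of the sample grammar.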

Definition of RCGs: Other properties (1)

• An RCG with maximal predicate arity $k$ is called an RCG of arity $k$ (also called a $k$-RCG).
• An RCG is called non-combinatorial if each of the arguments in the right-hand sides of the clauses is a single variable.
• An RCG is called linear if no variable appears more than once in the left-hand side of a clause and no variable appears more than once in the right-hand side of a clause.

Definition of RCGs: Other properties (2)

• An RCG is called non-erasing if for each clause, each variable occurring in the left-hand side also occurs in the right-hand side and vice versa.
• An RCG is called simple if it is non-combinatorial, linear and non-erasing.
• A simple RCG is called ordered simple if the range variables are ordered the same way in the RHS and the LHS predicates. Ordered simple RCG is equivalent to simple RCG.

RCG parsing: Treatment of terminals

Without loss of generality, we presuppose that all non-$\varepsilon$ clauses contain no terminals in their arguments.

For each $t \in T$, we introduce a new clause $T_t(t) \to \varepsilon$ and for each clause $C \in P$,

• we replace each occurrence $t'$ of $t$ in all arguments of all predicates with a variable $V_{t'}$,
• for each $V_{t'}$, we add the predicate $T_t(V_{t'})$ to the RHS of $C$.

Furthermore, for all clauses we assume that their variables are continuously numbered from 1 to some $j$.

RCG parsing: Range vectors

We will use range vectors similar to those used for MCFG parsing. Range vectors are used to describe variable bindings.

• $\phi = (\langle x_1, y_1 \rangle, \ldots, \langle x_k, y_k \rangle)$ is a range vector in $w$ if all $\langle x_i, y_i \rangle$ are ranges in $w$ for $1 \leq i \leq k$.
• $\phi = (\langle x_1, y_1 \rangle, \ldots, \langle x_k, y_k \rangle)$ is a range constraint vector if it contains pairs $\langle x, y \rangle$ where $x, y \in Pos(w) \cup V_r$ ($V_r$ is a set $\{r_1, r_2, \ldots\}$ of range boundary variables) such that if $\langle x, y \rangle \in Pos(w)^2$ then it is a range.
• $k$ is called the dimension of $\phi$.
• $\phi(i).l$ denotes the first component and $\phi(i).r$ the second component of the $i$th element of $\phi$.
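The terminal-elimination step described under "Treatment of terminals" can be sketched as a small grammar transformation. This is an illustration under stated assumptions: clauses are encoded as `(predicate, LHS arguments, RHS predicates)` where every argument is a tuple of symbols, symbols starting with a lower-case letter are terminals, and the fresh variable names (`Va1`, `Va2`, ...) and predicate names (`Ta`, ...) are hypothetical and assumed not to clash with names already in the grammar. The renumbering of variables from 1 to j is not performed here.

```python
def is_terminal(sym):
    """Encoding assumption: terminals start with a lower-case letter."""
    return sym[0].islower()


def eliminate_terminals(clauses):
    """Replace terminals in clause arguments by fresh variables.

    For every terminal occurrence a fresh variable V is introduced and the
    predicate T_t(V) is appended to the RHS; a clause T_t(t) -> epsilon is
    added once per terminal t that was replaced.
    """
    seen, result = set(), []
    for name, lhs_args, rhs in clauses:
        fresh = []  # (terminal, variable) pairs introduced for this clause

        def rewrite(arg):
            out = []
            for sym in arg:
                if is_terminal(sym):
                    var = f"V{sym}{len(fresh) + 1}"
                    fresh.append((sym, var))
                    out.append(var)
                else:
                    out.append(sym)
            return tuple(out)

        new_lhs = tuple(rewrite(a) for a in lhs_args)
        new_rhs = [(p, tuple(rewrite(a) for a in args)) for p, args in rhs]
        # one T_t predicate per replaced terminal occurrence
        new_rhs += [(f"T{t}", ((var,),)) for t, var in fresh]
        seen.update(t for t, _ in fresh)
        result.append((name, new_lhs, new_rhs))
    # the only clauses that still mention terminals: T_t(t) -> epsilon
    result += [(f"T{t}", ((t,),), []) for t in sorted(seen)]
    return result
```

Applied to the clause `A(a X, a Y) → A(X, Y)`, the sketch produces `A(Va1 X, Va2 Y) → A(X, Y) Ta(Va1) Ta(Va2)` together with `Ta(a) → ε`.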

RCG parsing: Variable constraint vectors

The variable constraint vector $\phi$ of a non-$\varepsilon$ clause $A(\vec{x}) \to \Phi$ is a range constraint vector of dimension $j$, $j$ being the highest variable index in the clause. It contains only pairs from $V_r \times V_r$ and must be consistent with variable adjacencies in the clause.

Formally, the elements of $\phi$ are pairs from $V_r \times V_r$ such that $\phi(h).r = \phi(i).l$ iff $X_h X_i$ occurs as a substring in one of the arguments of the clause.

Update of range vectors

We define an update $\phi'$ of a range constraint vector $\phi$ with respect to an identity $x = y$, $x, y \in Pos(w) \cup V_r$, as follows:

• if $x = y$, then $\phi' = \phi$;
• else if $x \in V_r$ and the result $\psi$ of replacing all occurrences of $x$ in $\phi$ with $y$ is a range constraint vector, then $\phi' = \psi$;
• else if $y \in V_r$ and the result $\psi$ of replacing all occurrences of $y$ in $\phi$ with $x$ is a range constraint vector, then $\phi' = \psi$;
• otherwise, $\phi'$ is undefined.

Directional top-down parsing

Corresponds to the algorithm presented in Boullier (2000).

Item form:

• Active items: $[A(\vec{x}) \to \Phi \bullet \Psi, \phi]$
• Passive items: $[A, \psi, flag]$

where

• $\phi$ is a range vector of dimension $j$, $j$ being the highest variable index in the clause,
• $\psi$ is a range vector of dimension $k$, $k$ being the arity of $A$,
• $flag \in \{p, c\}$ indicates if a passive item is predicted or completed.

Directional top-down parsing (axiom and goal)

• Axiom: $[S, (\langle 0, n \rangle), p]$
• The goal item is $[S, (\langle 0, n \rangle), c]$.

Here $n = |w|$ is the length of the input.
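The update of a range constraint vector with respect to an identity x = y can be written down almost literally. The sketch below assumes an encoding that is not part of the slides: positions are Python integers, range boundary variables are strings such as `"r1"`, a vector is a tuple of pairs, and an undefined update is signalled by `None`. The function names and the extra parameter `n` (the input length, needed to check that positions lie in Pos(w)) are likewise assumptions.

```python
def is_var(boundary):
    """Encoding assumption: boundary variables are strings, positions are ints."""
    return isinstance(boundary, str)


def is_constraint_vector(phi, n):
    """Check that fully specified pairs are ranges in an input of length n."""
    for left, right in phi:
        for b in (left, right):
            if not is_var(b) and not 0 <= b <= n:
                return False
        if not is_var(left) and not is_var(right) and left > right:
            return False
    return True


def substitute(phi, old, new):
    """Replace all occurrences of boundary `old` in phi with `new`."""
    return tuple(
        (new if left == old else left, new if right == old else right)
        for left, right in phi
    )


def update(phi, x, y, n):
    """Update phi with respect to the identity x = y; None if undefined."""
    if x == y:
        return phi
    if is_var(x):
        psi = substitute(phi, x, y)
        if is_constraint_vector(psi, n):
            return psi
    if is_var(y):
        psi = substitute(phi, y, x)
        if is_constraint_vector(psi, n):
            return psi
    return None
```

For instance, starting from `(("r1", "r2"), ("r2", "r3"))`, updating with `r1 = 0` and then `r2 = 3` yields `((0, 3), (3, "r3"))`: the shared boundary variable keeps the two ranges adjacent.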

Directional top-down parsing (predict-rule)

We have two predict operations.

Predict-rule predicts an active item for a previously introduced passive item.

$$\frac{[A, \psi, p]}{[A(\vec{x}) \to \bullet \Psi, \phi]}$$

thereby, the variable bindings in $\phi$ applied to $\vec{x}$ yield $\psi$. Furthermore, $\phi$ respects the adjacency constraints imposed by the variable constraint vector of the clause.

Directional top-down parsing (predict-pred)

Predict-pred predicts a passive item.

$$\frac{[A(\ldots) \to \Phi \bullet B(\vec{x}) \Psi, \phi]}{[B, \psi, p]}$$

thereby, $\psi$ results from applying $\phi$ to $\vec{x}$.

Directional top-down parsing (scan)

Scan:

$$\frac{[A, (\langle l, r \rangle), p]}{[A, (\langle l, r \rangle), c]} \quad A(x) \to \varepsilon,\ \langle l, r \rangle(w) = x$$

Directional top-down parsing (complete)

Complete moves the dot over a predicate in the RHS of an active item if the corresponding passive item has been completed.

$$\frac{[B, \phi_B, c],\quad [A(\ldots) \to \Phi \bullet B(\vec{x}) \Psi, \phi]}{[A(\ldots) \to \Phi B(\vec{x}) \bullet \Psi, \phi]}$$

where $\phi_B$ must be the result of applying $\phi$ to $\vec{x}$.
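Predict-rule, predict-pred and complete all rely on applying the variable bindings of a range vector φ to an argument vector to obtain ψ. A minimal sketch of that operation follows, assuming that terminals have already been removed from the arguments, that variables are referred to by their 1-based index, and that each argument is a non-empty tuple of such indices. The function name `apply_bindings` and the use of `None` for a failed application are assumptions of this illustration.

```python
def apply_bindings(phi, args):
    """Apply range vector phi to an argument vector; return psi or None.

    phi  -- tuple of ranges (l, r); phi[i - 1] is the range bound to X_i
    args -- tuple of arguments, each a tuple of variable indices
    """
    psi = []
    for arg in args:
        if not arg:
            return None  # assumption of this sketch: empty arguments are not handled
        ranges = [phi[i - 1] for i in arg]
        # concatenation is only defined for adjacent ranges
        for (_, right), (left, _) in zip(ranges, ranges[1:]):
            if right != left:
                return None
        psi.append((ranges[0][0], ranges[-1][1]))
    return tuple(psi)
```

With `phi = ((0, 2), (2, 3), (3, 5))`, the argument vector of `S(X1 X2 X3)` is mapped to `((0, 5),)` and that of `A(X1, X3)` to `((0, 2), (3, 5))`, whereas an argument `X1 X3` has no value because the two ranges are not adjacent.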

Directional top-down parsing (convert)

Once the dot has arrived at the right end of the RHS of a clause, we can convert the active item to a passive item.

Convert:

$$\frac{[A(\vec{x}) \to \Phi \bullet, \phi]}{[A, \psi, c]}$$

where $\psi$ results from applying $\phi$ to $\vec{x}$.

Earley-style parsing

Presented in Kallmeyer & Maier (2009) (in preparation).

Item form:

• Active items: $[A(\vec{x}) \to \Phi \bullet \Psi, \phi]$
• Passive items: $[A, \psi, flag]$

where

• $\phi$ is a range constraint vector of dimension $j$, $j$ being the highest variable index in the clause,
• $\psi$ is a range constraint vector of dimension $k$, $k$ being the arity of $A$,
• $flag \in \{p, c\}$ indicates if a passive item is predicted or completed.

Earley-style parsing (initialization and goal)

• Initialize: $[S, (\langle 0, n \rangle), p]$
• The goal item is $[S, (\langle 0, n \rangle), c]$.

Earley-style parsing (predict)

We have two predict operations.

As for the top-down case, predict-rule predicts active items with the dot on the left of the RHS, for a given previously introduced passive item.

$$\frac{[A, \psi, p]}{[A(x_1 \ldots y_1, \ldots, x_k \ldots y_k) \to \bullet \Psi, \phi']}$$

where, starting from the variable constraint vector $\phi$ of the clause, we obtain $\phi'$ by updating with the following identities: $\phi(x_i).l = \psi(i).l$, $\phi(y_i).r = \psi(i).r$ for all $1 \leq i \leq k$.

Note the difference to the top-down case: We are now dealing with range constraint vectors, i.e., some variable boundaries remain unspecified.

Earley-style parsing (predict-pred)

Also as for the top-down case, predict-pred predicts a passive item for the predicate following the dot in an active item.

$$\frac{[A(\ldots) \to \Phi \bullet B(x_1 \ldots y_1, \ldots, x_k \ldots y_k) \Psi, \phi]}{[B, \psi, p]}$$

where $\psi(i).l = \phi(x_i).l$, $\psi(i).r = \phi(y_i).r$ for all $1 \leq i \leq k$.

Earley-style parsing (scan)

Scan:

$$\frac{[A, (\langle l, r \rangle), p]}{[A, (\langle l', r' \rangle), c]} \quad A(x) \to \varepsilon,\ \langle l', r' \rangle(w) = x,\ \langle l, r \rangle \text{ compatible with } \langle l', r' \rangle$$

• Reduce a single terminal to $\varepsilon$ (recall the treatment of terminals).
• Here, "compatible with" means that there is a function $f : \{l, r\} \to \{l', r'\}$ such that $f(l) = l'$, $f(r) = r'$ and $f(x) = x$ if $x \in Pos(w)$.

Earley-style parsing (complete)

Complete moves the dot over a predicate in the RHS of an active item if the corresponding passive item has been completed.

$$\frac{[B, \phi_B, c],\quad [A(\vec{x}) \to \Phi \bullet B(x_1 \ldots y_1, \ldots, x_k \ldots y_k) \Psi, \phi]}{[A(\vec{x}) \to \Phi B(x_1 \ldots y_1, \ldots, x_k \ldots y_k) \bullet \Psi, \phi']}$$

where $\phi'$ is $\phi$ updated with all new constraint information coming from $\phi_B$, i.e., $\phi'$ is an update of $\phi$ wrt. the identities $\phi(x_j).l = \phi_B(j).l$ and $\phi(y_j).r = \phi_B(j).r$ for all $1 \leq j \leq k$.

Earley-style parsing (convert)

Convert turns an active item with the dot at the end of the right-hand side into a completed passive item.

$$\frac{[A(x_1 \ldots y_1, \ldots, x_k \ldots y_k) \to \Psi \bullet, \phi]}{[A, \psi, c]}$$

where $\psi(i).l = \phi(x_i).l$ and $\psi(i).r = \phi(y_i).r$ for all $1 \leq i \leq k$.
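The compatibility condition of the Earley-style scan rule can be made concrete with a short sketch. It assumes the same hypothetical encoding as for range constraint vectors: positions are integers and range boundary variables are strings. The helper `scan`, which enumerates the completed ranges for a single-terminal clause, and both function names are illustrative assumptions rather than part of the presented algorithm.

```python
def compatible(predicted, concrete):
    """Is the predicted range <l, r> compatible with the concrete <l', r'>?

    There must be a function f with f(l) = l', f(r) = r' and f(x) = x for
    every position x; f is built up as a dictionary and checked on the fly.
    """
    f = {}
    for x, target in zip(predicted, concrete):
        if isinstance(x, int) and x != target:
            return False  # a known position must be mapped to itself
        if f.setdefault(x, target) != target:
            return False  # the same boundary cannot map to two values
    return True


def scan(w, terminal, predicted):
    """Ranges <l', r'> of w that spell `terminal` and fit the prediction."""
    return [
        (i, i + 1)
        for i, sym in enumerate(w)
        if sym == terminal and compatible(predicted, (i, i + 1))
    ]
```

A prediction such as `("r1", "r2")` leaves both boundaries open, so every occurrence of the terminal is scanned; a prediction `(2, "r1")` fixes the left boundary and admits only the occurrence starting at position 2.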

RCG as a tool

One can use RCG as an intermediary device, i.e., as a pivot formalism. We will see two applications:

• RCG for TAG parsing
• RCG for Syntax-Directed Machine Translation

TAG → simple RCG

A TAG can straightforwardly be converted into an RCG.

• Introduce a predicate for each elementary tree.
• A predicate corresponding to
  – an auxiliary tree $\beta$ has the form $\beta(L, R)$, where $L$ and $R$ cover the yield of $\beta$ to the left and the right of the foot node, including all material added to it,
  – an initial tree $\alpha$ has the form $\alpha(X)$, with $X$ covering the yield of $\alpha$ and all trees added to it.
• A predicate $\alpha$/$\beta$ reduces the input by determining which parts of the string come from the $\alpha$/$\beta$ respectively and which parts come from substituted/adjoined trees.

TAG → simple RCG: Example

[Figure: elementary trees of the example TAG. Initial tree $\alpha$: $S$ dominating $\varepsilon$. Auxiliary tree $\beta_1$: root $S_{NA}$ with daughters $a$ and $S$, where $S$ dominates the foot node $S^*_{NA}$ and $a$. Auxiliary tree $\beta_2$: the same structure with $b$ in place of $a$.]

Start predicate: $S(X) \to \alpha(X)$

• $\alpha(\varepsilon) \to \varepsilon$
• $\alpha(B_1 B_2) \to \beta_1(B_1, B_2) \mid \beta_2(B_1, B_2)$
• $\beta_1(a B_1, a B_2) \to \beta_1(B_1, B_2) \mid \beta_2(B_1, B_2)$
• $\beta_2(b B_1, b B_2) \to \beta_1(B_1, B_2) \mid \beta_2(B_1, B_2)$
• $\beta_1(a, a) \to \varepsilon$
• $\beta_2(b, b) \to \varepsilon$

RCG for MT

• Binary 2-RCG can be used for efficient syntax-based machine translation.
• Intuitively, the first argument of a clause specifies the source language input, while the parser determines the destination language via string variables, i.e., variables in the parser input that are instantiated by lexical items in parsing.
• Main advantage over previous systems based on synchronous versions of CFG/TAG/etc.: higher expressivity through availability of copying/deleting while still in the same complexity class ($O(n^6)$).
• Refer to Søgaard (2008) (COLING 2008) for a complete presentation.
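The clauses of the TAG → simple RCG example can be read directly as mutually recursive predicates. The sketch below mirrors them one to one. As a simplification it works on substrings instead of ranges, which is an assumption that is harmless here only because the grammar neither copies nor deletes material. The function names are hypothetical.

```python
def beta1(left, right):
    """beta1(a, a) -> eps ; beta1(a B1, a B2) -> beta1(B1, B2) | beta2(B1, B2)."""
    if (left, right) == ("a", "a"):
        return True
    return left[:1] == "a" and right[:1] == "a" and either(left[1:], right[1:])


def beta2(left, right):
    """beta2(b, b) -> eps ; beta2(b B1, b B2) -> beta1(B1, B2) | beta2(B1, B2)."""
    if (left, right) == ("b", "b"):
        return True
    return left[:1] == "b" and right[:1] == "b" and either(left[1:], right[1:])


def either(left, right):
    """The alternative beta1(B1, B2) | beta2(B1, B2) on a right-hand side."""
    return beta1(left, right) or beta2(left, right)


def alpha(x):
    """alpha(eps) -> eps ; alpha(B1 B2) -> beta1(B1, B2) | beta2(B1, B2)."""
    if x == "":
        return True
    # try every way of splitting x into the two ranges B1 and B2
    return any(either(x[:m], x[m:]) for m in range(len(x) + 1))


def start(x):
    """Start predicate: S(X) -> alpha(X)."""
    return alpha(x)
```

Under this reading, `start("abab")` and `start("aa")` hold, while `start("ab")` does not: both arguments of a β predicate must begin with the same terminal, so the accepted strings have the form ww over {a, b}.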

RCG for MT: Example (1)

Example grammar:

• $S(X_1 X_2, Y_1 Y_2) \to NP(X_1, Y_1)\,VP(X_2, Y_2)$
• $VP(X_1 X_2, Y_1 Y_2 Y_3) \to V(X_1, Y_1)\,ObjP(X_2, Y_2)\,Part(X_1, Y_3)$
• $VP(X_1, Y_1 Y_2) \to V(X_1, Y_1)\,Part(X_1, Y_2)$
• $ObjP(X_1, Y_1 Y_2) \to NP(X_1, Y_2)\,Prep(X_1, Y_1)$
• $NP(X_1 X_2, Y_1 Y_2) \to Art(X_1, Y_1)\,N(X_2, Y_2)$
• $NP(\textit{he}, \textit{er}) \to \varepsilon$
• $V(\textit{entered}, \textit{trat}) \to \varepsilon$
• $Part(\textit{entered}, \textit{ein}) \to \varepsilon$
• $Prep(\textit{the room}, \textit{in}) \to \varepsilon$
• $Art(\textit{the}, \textit{das}) \to \varepsilon$
• $N(\textit{room}, \textit{Zimmer}) \to \varepsilon$

RCG for MT: Example (2)

• Call the parser with the input string $w$ = He entered $S$, where $S$ is a string variable, and the start predicate $S(X_1 X_2, Y_1 Y_2)$.
• The algorithm should infer that $S = Y_1 Y_2$ = trat ein in order to reduce $X_1 X_2$ to $\varepsilon$.

[Figure: example derivation aligning the source words "he entered the room" (with the copied range "the room") with the target words "er trat in das Zimmer ein".]

Conclusions

• Range concatenation languages coincide with the class of PTIME recognizable languages.
• We have seen a top-down algorithm and an Earley-style algorithm.
• Other parsing strategies are possible (cf. Kallmeyer & Maier (2009)).
• Range concatenation grammars are used as an intermediary formalism in different applications.