Parsing beyond context-free grammar: Tree Adjoining Grammar ...

Kallmeyer/Maier

ESSLLI 2008

Parsing beyond context-free grammar: Tree Adjoining Grammar Parsing
Laura Kallmeyer, Wolfgang Maier
University of Tübingen
ESSLLI Course 2008, Parsing beyond CFG: TAG Parsing

Overview

1. Tree Adjoining Grammars
2. An Earley parser for TAG
   (a) Introduction
   (b) Items
   (c) Inference Rules
3. LR Parsing
   (a) Introduction
   (b) Construction of the automaton
   (c) The recognizer

Tree Adjoining Grammars (1)

A Tree Adjoining Grammar (TAG) (Joshi & Schabes 1997) is a tree-rewriting system, i.e., a set of elementary trees with two operations:

• adjunction: replacing an internal node with a new tree. The new tree is an auxiliary tree and has a special leaf, the foot node.
• substitution: replacing a leaf with a new tree. The new tree is an initial tree.

Notation: γ[p, γ′] is the tree one obtains from replacing the node at position p in γ with the tree γ′ (by substitution or adjunction).

Tree Adjoining Grammars (2)

(1) John sometimes laughs

Elementary trees (in bracketed notation; VP∗ marks the foot node):

    initial tree:    [S NP [VP [V laughs]]]
    initial tree:    [NP John]
    auxiliary tree:  [VP [ADV sometimes] VP∗]

Derived tree, laugh[1, john][2, sometimes]:

    [S [NP John] [VP [ADV sometimes] [VP [V laughs]]]]
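The two operations can be made concrete with a small sketch for example (1), John sometimes laughs. It is only an illustration under assumptions that are not part of the slides: the `Node` class and the helper names are hypothetical, positions are tuples of 1-based daughter indices (the empty tuple is the root), and adjunction constraints are ignored.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Address = Tuple[int, ...]  # () is the root, (2, 1) the 1st daughter of the 2nd daughter


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    foot: bool = False  # True only for the foot node of an auxiliary tree


def node_at(tree: Node, p: Address) -> Node:
    """Return the node at position p."""
    for i in p:
        tree = tree.children[i - 1]
    return tree


def replace_at(tree: Node, p: Address, new: Node) -> Node:
    """Return a copy of tree in which the node at position p is replaced by new."""
    if not p:
        return new
    kids = list(tree.children)
    kids[p[0] - 1] = replace_at(kids[p[0] - 1], p[1:], new)
    return Node(tree.label, kids, tree.foot)


def substitute(gamma: Node, p: Address, initial: Node) -> Node:
    """gamma[p, initial] by substitution: the target must be a leaf."""
    target = node_at(gamma, p)
    if target.children or target.label != initial.label:
        raise ValueError("substitution needs a leaf carrying the root label")
    return replace_at(gamma, p, initial)


def adjoin(gamma: Node, p: Address, aux: Node) -> Node:
    """gamma[p, aux] by adjunction: the old subtree ends up below the foot node."""
    target = node_at(gamma, p)
    if target.label != aux.label:
        raise ValueError("adjunction needs matching labels")

    def plug(n: Node) -> Node:
        # Copy aux; at the foot node, hang the subtree found at the adjunction site.
        if n.foot:
            return Node(target.label, target.children)
        return Node(n.label, [plug(c) for c in n.children])

    return replace_at(gamma, p, plug(aux))


def leaves(tree: Node) -> List[str]:
    """Labels of the leaves, left to right."""
    if not tree.children:
        return [tree.label]
    return [w for c in tree.children for w in leaves(c)]


laugh = Node("S", [Node("NP"), Node("VP", [Node("V", [Node("laughs")])])])
john = Node("NP", [Node("John")])
sometimes = Node("VP", [Node("ADV", [Node("sometimes")]), Node("VP", foot=True)])

# laugh[1, john][2, sometimes]
derived = adjoin(substitute(laugh, (1,), john), (2,), sometimes)
print(" ".join(leaves(derived)))
```

Both operations return new trees instead of modifying their arguments, so the elementary trees can be reused in further derivation steps.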



A Tree Adjoining Grammar (TAG) is a quadruple G = ⟨N, T, I, A⟩ such that

• T and N are disjoint alphabets of terminals and nonterminals,
• I is a finite set of initial trees, and
• A is a finite set of auxiliary trees.

The trees in I ∪ A are called elementary trees. G is lexicalized iff each elementary tree has at least one leaf with a terminal label.

TAG allows us to specify for each node

1. whether adjunction is mandatory and
2. which trees can be adjoined.

Languages TAG can generate:

• {ww | w ∈ {a, b}*}
• L_4 := {a^n b^n c^n d^n | n ≥ 0}

Languages TAG cannot generate:

• {w^n | w ∈ {a, b}*} for any n > 2.
  ⇒ TAGs generate only a limited amount of cross-serial dependencies.
• L_k := {a_1^n a_2^n a_3^n . . . a_k^n | n ≥ 0} for any k > 4.
  ⇒ TAG can “count up to 4, not further”.
• L := {a^{2^n} | n ≥ 0}.
  ⇒ TAG cannot generate languages whose word lengths grow exponentially.




A derivation starts with an initial tree. In a final derived tree, all leaves must have terminal labels:

Let G = ⟨N, T, I, A⟩ be a TAG. Let γ and γ′ be finite trees.
• γ ⇒ γ′ in G iff there is a node position p and an instance γ0′ of a tree that is either some γ0 ∈ I ∪ A or derived from such a γ0, such that γ′ = γ[p, γ0′].
• ⇒* is the reflexive transitive closure of ⇒.
• The tree language of G is L_T(G) := {γ | there is an α ∈ I such that α ⇒* γ, all leaves in γ have terminal labels and there are no OA (obligatory adjunction) nodes in γ}.

TAGs are mildly context-sensitive:

• TAGs are slightly more powerful than CFGs: they can describe a limited amount of cross-serial dependencies.
• TAGs are polynomially parsable (complexity O(n^6)).
• TALs are of constant growth.

Earley Parsing: Introduction (1)

• The left-to-right CKY parser (Vijay-Shanker & Joshi, 1985) is very slow: O(n^6) in the worst case and in the best case (just as in the CFG version of CKY, too many partial trees not pertinent to the final tree are produced).
• This behaviour is due to the pure bottom-up approach: no predictive information whatsoever is used.
• Goal: an Earley-style parser. The first one appeared in Schabes & Joshi (1988). Here, we present the algorithm from Joshi & Schabes (1997). We assume a TAG without substitution nodes.

• Earley parsing: left-to-right scanning of the string (using predictions to restrict the hypothesis space).
• Traversal of elementary trees, with the current position marked by a dot. The dot can have exactly four positions with respect to a node: left above (la), left below (lb), right above (ra), right below (rb).

General idea: Whenever we are
• left above a node, we can predict an adjunction and start the traversal of the adjoined tree;
• left of a foot node, we can move back to the adjunction site and traverse the tree below it;
• right of an adjunction site, we continue the traversal of the adjoined tree at the right of its foot node;
• right above the root of an auxiliary tree, we can move back to the right of the adjunction site.

Earley Parsing: Items (1)

What kind of information do we need in an item characterizing a partial parsing result?

[α, dot, pos, i, j, k, l, sat?] where
• α ∈ I ∪ A is a (dotted) tree, dot and pos are the address and location of the dot,
• i, j, k, l are indices on the input string, where i, l ∈ {0, . . . , n}, j, k ∈ {0, . . . , n} ∪ {−}, n = |w|, and − means an unbound value,
• sat? is a flag. It controls (prevents) multiple adjunctions at a single node (sat? = 1 means that something has already been adjoined to the dotted node).
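One possible way to store such items in a chart is sketched below. The class and field names are hypothetical. The sketch assumes that the unbound value − is represented by `None`, that nil is represented by `False`, that Gorn addresses are tuples of integers, and that j and k are always bound or unbound together.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

POSITIONS = ("la", "lb", "ra", "rb")


@dataclass(frozen=True)
class Item:
    tree: str              # name of the elementary tree α
    dot: Tuple[int, ...]   # Gorn address of the dotted node
    pos: str               # one of la, lb, ra, rb
    i: int                 # left boundary
    j: Optional[int]       # start of the part below the foot node, None for "−"
    k: Optional[int]       # end of the part below the foot node, None for "−"
    l: int                 # right boundary
    sat: bool = False      # False for nil, True for sat? = 1

    def __post_init__(self) -> None:
        if self.pos not in POSITIONS:
            raise ValueError(f"unknown dot position: {self.pos}")
        if (self.j is None) != (self.k is None):
            raise ValueError("j and k must both be bound or both be unbound")


item = Item("alpha", (), "la", 0, None, None, 0)
print(item)
```

Because the class is frozen, items are hashable and can be kept in a set, so deriving the same item twice does not create a second chart entry.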

What do the items mean?
• [α, dot, la, i, j, k, l, nil]: In α, the part left of the dot ranges from i to l. If α is an auxiliary tree, the part below the foot node ranges from j to k.
• [α, dot, lb, i, −, −, i, nil]: In α, the part below the dotted node starts at position i.
• [α, dot, rb, i, j, k, l, sat?]: In α, the part below the dotted node ranges from i to l. If α is an auxiliary tree, the part below the foot node ranges from j to k. If sat? = nil, nothing was adjoined to the dotted node; sat? = 1 means that adjunction took place.
• [α, dot, ra, i, j, k, l, nil]: In α, the part left of and below the dotted node ranges from i to l. If α is an auxiliary tree, the part below the foot node ranges from j to k.

Some notational conventions:
• We use Gorn addresses for the nodes: 0 is the address of the root, i (1 ≤ i) is the address of the ith daughter of the root, and for p ≠ 0, p · i is the address of the ith daughter of the node at address p.
• For a tree α and a Gorn address dot, α(dot) denotes the node at address dot in α (if defined).
• For a node n, Adj(n) is the set of trees adjoinable at n. nil ∈ Adj(n) signifies that adjunction is not obligatory. Adj(n) = ∅ if n has a terminal or ε as label.

Earley Parsing: Inference Rules (1)

ScanTerm:

    [α, dot, la, i, j, k, l, nil]
    ---------------------------------   α(dot) labelled w_{l+1}
    [α, dot, ra, i, j, k, l + 1, nil]

Scan-ε:

    [α, dot, la, i, j, k, l, nil]
    -----------------------------   α(dot) labelled ε
    [α, dot, ra, i, j, k, l, nil]

PredictAdjoinable:

    [α, dot, la, i, j, k, l, nil]
    -----------------------------   β ∈ Adj(α(dot))
    [β, 0, la, l, −, −, l, nil]

PredictNoAdj:

    [α, dot, la, i, j, k, l, nil]
    -----------------------------   nil ∈ Adj(α(dot))
    [α, dot, lb, l, −, −, l, nil]
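The scan and predict rules (ScanTerm, Scan-ε, PredictAdjoinable, PredictNoAdj) translate almost literally into functions from one item to a list of new items. The following is a minimal sketch under assumptions of its own: items are plain tuples, the root address is the empty tuple rather than 0, − is `None`, nil is `False` (or `None` inside an Adj set), and the dictionaries `labels` and `adj` are hypothetical stand-ins for α(dot) and Adj(α(dot)).

```python
from typing import Dict, List, Optional, Set, Tuple

Address = Tuple[int, ...]
Item = Tuple[str, Address, str, int, Optional[int], Optional[int], int, bool]
EPS = "ε"  # label used in this sketch for empty leaves


def scan(item: Item, labels: Dict[str, Dict[Address, str]], w: List[str]) -> List[Item]:
    """ScanTerm and Scan-ε: move the dot from la to ra over a terminal or ε leaf."""
    tree, dot, pos, i, j, k, l, sat = item
    if pos != "la" or sat:
        return []
    label = labels[tree][dot]
    out: List[Item] = []
    if l < len(w) and label == w[l]:   # w[l] is w_{l+1}, the list is 0-based
        out.append((tree, dot, "ra", i, j, k, l + 1, False))
    if label == EPS:
        out.append((tree, dot, "ra", i, j, k, l, False))
    return out


def predict(item: Item, adj: Dict[Tuple[str, Address], Set[Optional[str]]]) -> List[Item]:
    """PredictAdjoinable and PredictNoAdj for a dot that is left above a node."""
    tree, dot, pos, i, j, k, l, sat = item
    if pos != "la" or sat:
        return []
    out: List[Item] = []
    for beta in adj.get((tree, dot), ()):   # missing entry: Adj(n) is empty
        if beta is None:                    # nil ∈ Adj: go below without adjunction
            out.append((tree, dot, "lb", l, None, None, l, False))
        else:                               # start traversing the auxiliary tree β
            out.append((beta, (), "la", l, None, None, l, False))
    return out


# Toy data, not taken from the slides.
labels = {"alpha": {(): "S", (1,): "a"}, "beta": {(): "S", (1,): "b", (2,): "S"}}
adj = {("alpha", ()): {None, "beta"}}
w = ["a"]

print(scan(("alpha", (1,), "la", 0, None, None, 0, False), labels, w))
print(predict(("alpha", (), "la", 0, None, None, 0, False), adj))
```

The sketch relies on T and N being disjoint: a node label that equals the next input word can only be a terminal.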



PredictAdjoined:

    [β, dot, lb, l, −, −, l, nil]
    ------------------------------   dot = foot(β), β ∈ Adj(α(dot′))
    [α, dot′, lb, l, −, −, l, nil]

Complete I:

    [α, dot, rb, i, j, k, l, nil], [β, dot′, lb, i, −, −, i, nil]
    -------------------------------------------------------------   dot′ = foot(β), β ∈ Adj(α(dot))
    [β, dot′, rb, i, i, l, l, nil]

Complete II (the foot node, if there is one, lies below the dotted node):

    [α, dot, rb, i, j, k, l, sat?], [α, dot, la, h, −, −, i, nil]
    -------------------------------------------------------------   α(dot) ∈ N
    [α, dot, ra, h, j, k, l, nil]

or (the foot node, if there is one, lies to the left of the dotted node):

    [α, dot, rb, i, −, −, l, sat?], [α, dot, la, h, j, k, i, nil]
    -------------------------------------------------------------   α(dot) ∈ N
    [α, dot, ra, h, j, k, l, nil]

Earley Parsing: Inference Rules (6)

Adjoin:

    [β, 0, ra, i, j, k, l, nil], [α, dot, rb, j, p, q, k, nil]
    ----------------------------------------------------------   β ∈ Adj(α(dot))
    [α, dot, rb, i, p, q, l, 1]

sat? = 1 prevents the new item from being reused in another Adjoin application.
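The index bookkeeping of Adjoin, and the effect of the sat? flag, can be checked with a few lines of code. This is only a sketch with hypothetical names: items are tuples, the root address is the empty tuple rather than 0, − is `None`, nil is `False`, and `adjoinable` is an assumed callback that returns the names of the trees in Adj(α(dot)).

```python
from typing import Callable, Optional, Set, Tuple

Address = Tuple[int, ...]
Item = Tuple[str, Address, str, int, Optional[int], Optional[int], int, bool]


def adjoin(beta_item: Item, alpha_item: Item,
           adjoinable: Callable[[str, Address], Set[str]]) -> Optional[Item]:
    """Adjoin: combine a finished auxiliary tree with the subtree below its foot."""
    b_tree, b_dot, b_pos, i, j, k, l, b_sat = beta_item
    a_tree, a_dot, a_pos, a_i, p, q, a_l, a_sat = alpha_item
    if (b_dot == () and b_pos == "ra" and not b_sat    # β completely traversed
            and a_pos == "rb" and not a_sat            # nothing adjoined at α(dot) yet
            and (a_i, a_l) == (j, k)                   # α's span fills the foot span of β
            and b_tree in adjoinable(a_tree, a_dot)):
        return (a_tree, a_dot, "rb", i, p, q, l, True)
    return None


def allow_beta(tree: str, dot: Address) -> Set[str]:
    # Toy constraint for the example: "beta" may adjoin everywhere.
    return {"beta"}


beta_item = ("beta", (), "ra", 0, 1, 2, 3, False)
alpha_item = ("alpha", (1,), "rb", 1, None, None, 2, False)
result = adjoin(beta_item, alpha_item, allow_beta)
print(result)

# A second adjunction at the same node is blocked because sat? is now set.
outer_beta = ("beta", (), "ra", 0, 0, 3, 3, False)
print(adjoin(outer_beta, result, allow_beta))
```

The second call returns `None` even though the string positions would fit, which is the situation the sat? flag is meant to exclude.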


Move the dot to daughter/sister/mother:

MoveDown:

    [α, p, lb, i, j, k, l, nil]
    -------------------------------   α(p · 1) is defined
    [α, p · 1, la, i, j, k, l, nil]

MoveRight:

    [α, p, ra, i, j, k, l, nil]
    -------------------------------   α(p + 1) is defined
    [α, p + 1, la, i, j, k, l, nil]

MoveUp:

    [α, p · m, ra, i, j, k, l, nil]
    -------------------------------   α(p · m + 1) is not defined
    [α, p, rb, i, j, k, l, nil]

Earley Parsing: Inference Rules (8)

Initialize:

    ---------------------------   α ∈ I
    [α, 0, la, 0, −, −, 0, nil]

Goal item: [α, 0, ra, 0, −, −, n, nil], α ∈ I

Earley Parsing: Summary

• We have seen an Earley-type recognition algorithm for TAG. We can turn our recognizer into a parser by storing each item with a set of pairs of other items from which it can be inferred.
• The parser has Complete, Scan and Predict operations plus an Adjunction operation.
• The algorithm has an upper time bound of O(n^6).
• The parser does not have the Valid Prefix Property. Ensuring this property for TAG parsing is costly.

LR parsing: Introduction (1)

• LR parsing: Left-to-right scanning of the input, building a Rightmost derivation in reverse by reductions.
• We compile a finite-state automaton from the grammar (offline) and use it to guide actions during parsing (online).
• What does the automaton represent?
  – States: correspond to sets of items closed under prediction
  – Edges: correspond to scanning a terminal symbol or consuming an already recognized nonterminal

Roughly, LR parsing is Earley parsing with precompiled predictions.

LR parsing: Introduction (2)

An LR automaton is typically represented by two tables.

• The Action table lists what action must be performed (shift or reduce). This action depends on
  – the current state in the automaton
  – the next preterminal to be read
• The Goto table lists the states where the automaton has to go after reducing a production

In CFG LR parsing, we have two operations on a stack at our disposal:

1. shift(k): Scans a terminal, pushes the corresponding pre-terminal on the stack and switches to state k.
2. reduce(A): The RHS of some production A → A_1 . . . A_n has been recognized, i.e., is on the stack. reduce(A) pops A_1, . . . , A_n from the stack and pushes the LHS A on the stack, then switches to the next state (provided by the goto table).

• Nederhof (1998) extends traditional LR parsing to TAG
• His algorithm is based on
  – an LR parse automaton (automaton)
  – a function to scan the next symbol of the input: shift(∆, aw)
  – two functions to reduce partial results on the stack: reduceSubtree(∆, w) and reduceAuxtree(∆, w)
  where w is the input and ∆ is the LR stack.
• Nederhof (1998) mentions an implementation of the parser generator
• LR automaton generation for the XTAG grammar seemed to be feasible

Notations:
• N(t) is the set of nodes of a tree t.
• children(N) is the list of the children of a node N, given in linear precedence order.

LR parsing: Construction of the automaton (2)

Assume that our TAG has no substitution nodes and does not contain empty words.

• The elementary trees are extended with artificial new nodes:
  – For each t ∈ I ∪ A, we add a unique node ⊤ immediately dominating R_t (the root of t).
  – For each t ∈ A, we add a unique node ⊥ immediately dominated by F_t (the foot of t).
• For a t ∈ I ∪ A, (t, N) denotes the subtree of t rooted in N. T = I ∪ A ∪ {(t, N) | t ∈ I ∪ A, N ∈ N(t)} is the set of all subtrees of elementary trees, including the elementary trees themselves.

• The states of the LR automaton are sets of items
• Transitions are labeled with terminals and nonterminals

An item represents a subtree of height 1 (mother node N and its daughters) in one of the τ ∈ T together with a dot • that specifies up to which daughter the subtree has been recognized. This subtree is notated as a dotted production N → α • β.

Items have the form [τ, N → α • β], where
• τ ∈ T,
• N ∈ N(τ), and
• αβ are the daughters of N.

An item is called completed if it has the form
• either [t, ⊤ → R_t •] with t ∈ I ∪ A,
• or [(t, N), N → α •].

• The construction of the set of states of the automaton starts with an initial LR state q_in = {[t, ⊤ → • R_t] | t ∈ I}
• From each state, new states can be computed using functions goto and goto⊥.
• To compute these functions for a given state q, one needs the closure closure(q) of this state.

Intuition: the closure contains all items that can be obtained from an item [τ, . . .] in q by moving down or up in τ or predicting an adjunction or predicting the part below a foot node.



Definition of the closure of a state q: Let q be a set of items. closure(q) is then defined by the following inference rules:

    -----   x ∈ q
      x

    [τ, N → α • M β]
    ----------------   nil ∈ Adj(M), children(M) = γ
    [τ, M → • γ]

    [τ, N → α • M β]
    ----------------   t ∈ Adj(M)
    [t, ⊤ → • R_t]

    [τ, F_t → • ⊥]
    -------------------   t ∈ Adj(N), N ∈ N(t′), children(N) = γ
    [(t′, N), N → • γ]

Intuition behind the goto-functions: goto shifts the dot over a node, goto⊥ shifts the dot over a ⊥ (i.e., a foot node daughter).

Definition of goto and goto⊥: Let q be a set of items, M a terminal leaf or a node with Adj(M) ∩ A ≠ ∅ (no NA constraint).

• goto(q, M) = {[τ, N → α M • β] | [τ, N → α • M β] ∈ closure(q)}
• goto⊥(q, M) = {[τ, F_t → ⊥ •] | [τ, F_t → • ⊥] ∈ closure(q) ∧ t ∈ Adj(M)}

Now we can define the set Q of LR states of our automaton as follows:

    -----
    q_in

      q
    -----   q′ = goto(q, M) ≠ ∅ for some node M
      q′

      q
    -----   q′ = goto⊥(q, M) ≠ ∅ for some node M
      q′

A state is final (in Q_fin) if its closure contains a completed item for some initial tree:

Q_fin = {q ∈ Q | closure(q) ∩ {[t, ⊤ → R_t •] | t ∈ I} ≠ ∅}

For the definition of the recognizer, we also need the notion of reductions(q) for a given state q.

Intuition: If the closure of q contains a completed item, then the LHS node of the dotted production or, if this is a ⊤ in an auxiliary tree, the whole tree is part of the reductions.

Definition of reductions(q) for a given state q:

reductions(q) = {t ∈ A | [t, ⊤ → R_t •] ∈ closure(q)} ∪ {N ∈ N | [(t, N), N → α •] ∈ closure(q)}

LR parsing: The recognizer (1)

• The stack ∆ contains states and symbols. The latter are either terminal nodes or nonterminal nodes equipped with a stack.
• A configuration (∆, w) consists of a stack and a word (the remaining part of the input string).
• There are three operations that allow the automaton to make a transition (i.e., to change configuration): shift, reduce subtree and reduce aux tree.

shift pushes the next input symbol followed by a new state on the stack:

(∆q, aw) ⊢ (∆q a q′, w) if q′ = goto(q, a) ≠ ∅.

We also need the definition of cross-sections through a tree rooted at some node N.

Intuition: the sequences on the stack that can be reduced, i.e., that correspond roughly to the RHS of some completed dotted production, are cross-sections.

A cross-section of a node N is either the node N or a sequence of cross-sections of the daughters of N in linear precedence order.

Furthermore, nodes dominating foot nodes are paired with a stack of nodes (indicating where subsequent adjunctions took place).

Definition of cross-sections CS(N) of a node N: Define M := N ∪ (N × N*).

Then for a given node N:
• N ∈ CS(N) if N does not dominate a foot node,
• (N, L) ∈ CS(N) for each L ∈ N* if N dominates a foot node,
• x_1 . . . x_m ∈ CS(N) if children(N) = M_1 . . . M_m and x_i ∈ CS(M_i) for 1 ≤ i ≤ m.

Furthermore, CS+(N) := CS(N) \ ({N} ∪ {(N, L) | L ∈ N*}) (the cross-sections without the node itself).
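The recursive definition of the cross-sections CS(N) and CS+(N) can be tried out on a small tree. The sketch below makes several assumptions of its own: nodes are plain strings, `children` and `dominates_foot` are hypothetical inputs supplied by the caller, the artificial nodes ⊤ and ⊥ are left out, and the infinitely many pairs (N, L) with L ∈ N* are represented by a single symbolic pair (N, "L").

```python
from itertools import product
from typing import Dict, List, Set, Tuple, Union

Symbol = Union[str, Tuple[str, str]]
STACK = "L"  # placeholder standing for an arbitrary stack L


def own_symbol(n: str, dominates_foot: Set[str]) -> Symbol:
    """N itself, or the pair (N, L) if N dominates a foot node."""
    return (n, STACK) if n in dominates_foot else n


def cross_sections(n: str, children: Dict[str, List[str]],
                   dominates_foot: Set[str]) -> Set[Tuple[Symbol, ...]]:
    """CS(N): the node itself, or cross-sections of its daughters concatenated in order."""
    result: Set[Tuple[Symbol, ...]] = {(own_symbol(n, dominates_foot),)}
    kids = children.get(n, [])
    if kids:
        parts = [cross_sections(m, children, dominates_foot) for m in kids]
        for combo in product(*parts):
            result.add(tuple(x for seq in combo for x in seq))
    return result


def cross_sections_plus(n: str, children: Dict[str, List[str]],
                        dominates_foot: Set[str]) -> Set[Tuple[Symbol, ...]]:
    """CS+(N): the cross-sections without the node itself."""
    return cross_sections(n, children, dominates_foot) - {(own_symbol(n, dominates_foot),)}


# Toy auxiliary tree: root R with daughters A and the foot node F, A dominating the leaf a.
children = {"R": ["A", "F"], "A": ["a"]}
dominates_foot = {"R", "F"}

for cs in sorted(cross_sections_plus("R", children, dominates_foot), key=str):
    print(cs)
```

For this tree, CS+(R) contains the two sequences A (F, L) and a (F, L), i.e. the stack contents that a reduction at R could pop.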



Reduce subtree is applied when having completed a subtree rooted in N such that an adjunction occurs at N. In other words, it recognizes the part below a foot node.

(∆q_0 X_1 q_1 . . . X_m q_m, w) ⊢ (∆q_0 (⊥, [N L]) q′, w) if
• N ∈ reductions(q_m), X_1 . . . X_m ∈ CS+(N), q′ = goto⊥(q_0, N) ≠ ∅, and
• L is defined as follows: if some X_j is of the form (M, L), then this provides L, otherwise L = [ ].

LR parsing: The recognizer (4)

Reduce aux tree is applied once an auxiliary tree has been recognized. We then go back to the node where the adjunction occurred.

(∆q_0 X_1 q_1 . . . X_m q_m, w) ⊢ (∆q_0 X q′, w) if
• there is a t ∈ reductions(q_m) with X_1 . . . X_m ∈ CS+(R_t),
• q′ = goto(q_0, N) ≠ ∅ where N is obtained from the unique X_j of the form M[N L], i.e., (M, [N L]), and
• if L = [ ], then X = N, otherwise X = N[L], i.e., (N, L).

• The stack is initialized with the initial state q_in.
• The stack always contains an alternation of states q ∈ Q and nodes or nodes with stacks X ∈ M.
• A parse is successful if, in a sequence of transitions (i.e., applications of shift, reduce subtree and reduce aux tree), the input is completely consumed and the automaton reaches a final state:

Some input v is recognized if (q_in, v) ⊢* (q_in ∆ q, ε) such that q ∈ Q_fin.

LR parsing: Summary

• LR parsing techniques can be applied to TAG.
• Shift-reduce parser guided by a precompiled automaton.
• General idea: precompile predictions and moves into states and precompile shifts and reductions into transitions of an automaton.
• Problem: LR automata get very big.