a grammar used for parsing and generation

1 downloads 0 Views 263KB Size Report
using the same grammar for parsing and generating ... grammar of a language, which minimizes the linguistic ... example this allows us to express agreement.
A GRAMMAR

Jean-Marie

CAP

LANCEL

SOGETI

w, U n i v e r s i t y i.

(~),

USED

FOR

PARSING

Frangois

INNOVATION,

If,

INTRODUCTION

T h i s t e x t p r e s e n t s the o u t l i n e of a s y s t e m u s i n g the s a m e g r a m m a r for p a r s i n g a n d g e n e r a t i n g s e n t e n c e s in a g i v e n l a n g u a g e . T h i s s y s t e m has b e e n d e v i s e d for a "multilingual document generation" project. M a r t i n K A Y h a s s h o w n t h a t p a r s i n g and g e n e r a t i o n c o u l d be d o n e u s i n g F u n c t i o n a l G r a m m a r s . A P P E L T l A P P 8 5 ] and M c K E O W N [McK82], among others, have used Functional G r a m m a r s for g e n e r a t i o n . U s u a l l y the g r a m m a r f o r m a l i s m is m o r e s u i t e d to p a r s i n g t h a n to g e n e r a t i o n . T h e w a y a g r a m m a r is p r o c e s s e d for p a r s i n g is r a t h e r clear, w h i l e the generating process needs strong assumptions on the i n t e r p r e t e r to be e a s i l y r e a d a b l e . The F u n c t i o n a l G r a m m a r n o t a t i o n d e s c r i b e d h e r e a l l o w s a full s y m m e t r y b e t w e e n p a r s i n g and g e n e r a t i n g . S u c h a g r a m m a r m a y be r e a d e a s i l y f r o m the p o i n t of v i e w of the p a r s i n g and f r o m the p o i n t of v i e w of the g e n e r a t i o n . T h i s a l l o w s to w r i t e o n l y one g r a m m a r of a l a n g u a g e , w h i c h m i n i m i z e s the l i n g u i s t i c c o s t s in a m u l t i l i n g u a l scheme. D e s c r i p t i o n of the F u n c t i o n a l G r a m m a r n o t a t i o n , in c h a p t e r 2, w i l l t h o r o u g h l y r e f e r to F u n c t i o n a l D e s c r i p t i o n s and Functional Unification. For a detailed presentation, the r e a d e r m a y r e f e r to [KAY79] [ROU84] [SIM85].

2. T H E

GRAMMAR

FORMALISM

T h e f o r m a l i s m we h a v e d e f i n e d a l l o w s us to w r i t e a s i n g l e g r a m m a r of a l a n g u a g e w h i c h is u s e d b o t h for a n a l y s i s and g e n e r a t i o n b y m e a n s of t w o s p e c i a l i z e d i n t e r p r e t e r s . S e n t e n c e a n a l y s i s is v i e w e d as the t r a n s i t i o n f r o m a s u r f a c e s t r u c t u r e to a s e m a n t i c r e p r e s e n t a t i o n w h i c h is a Functional Description. Sentence generation is t h e t r a n s f o r m a t i o n of a s e m a n t i c representation i n t o a s y n t a c t i c form. T h i s s y m m e t r y b e t w e e n the two p r o c e s s e s h a s to be c l e a r l y e x p r e s s e d if we w a n t a c l e a r n o t a t i o n , e a s y to r e a d a n d to u n d e r s t a n d f r o m the p o i n t of v i e w of p a r s i n g a n d of generating. A g r a m m a r r u l e is i t s e l f r e p r e s e n t e d as a F u n c t i o n a l D e s c r i p t i o n . T h i s FD h a s t h r e e main "identifiers" : PATTERN, MEANING and CONSTRAINTS.

536

ROUSSELOT

]29,

of S T R A S B O U R G

AND

rue 22,

de

GENERATION

(**),

Nathalie

l'Universit6,

rue

Descartes,

SIMONIN

75007 67084

(~)

- PARIS - STRASBOURG

E x a m p l e of a s i m p l e g r a m m a r r u l e : simple_gn = [ p a t t e r n = (det s u b s t adj) m e a n i n g = [ obj = < s u b s t m e a n i n g > d e f i n i t u d e = q u a l i f = number = ] c o n s t r a i n t s = ( [ e q u a l = ( ) e q u a l = ( ) ] ) det = [ c a t = det] subst = [cat = s u b s t ] adj = [ c a t = adj] ] The ~ p a r t d e s c r i b e s the s y n t a c t i c a l s t r u c t u r e . E a c h i t e m of the l i s t a s s o c i a t e d to p a t t e r n r e f e r s to a r u l e or to a t e r m i n a l . In the a b o v e e x a m p l e the t h r e e t e r m s r e f e r to t e r m i n a l s . O m i s s i o n s a n d r e p e t i t i o n s are a l l o w e d . T h e m e a n i n q p a r t d e s c r i b e s the s e m a n t i c representation a s s o c i a t e d to the s y n t a c t i c a l structure. Bracketed lists represent "paths" r e f e r i n g to F u n c t i o n a l D e s c r i p t i o n s inside the r u l e or in a n o t h e r rule. D u r i n g p a r s i n g , t h e s e p a t h s are u s e d to b u i l d the s e m a n t i c r e p r e s e n t a t i o n w h i l e in g e n e r a t i o n t h e y are u s e d for s p l i t t i n g a s e m a n t i c s t r u c t u r e i n t o different sub-structures. The t w o p r o c e s s e s , p a r s i n g and g e n e r a t i o n , are d e t a i l e d in c h a p t e r s 3 a n d 4. T h e c o n s t r a i n t s p a r t is a l i s t of "set of c o n s t r a i n t s " e x p r e s s e d by F u n c t i o n a l D e s c r i p t i o n s . A t ] e a s t o n e "set of c o n s t r a i n t s " m u s t be f u l f i l l e d . In the a b o v e e x a m p l e this a l l o w s us to e x p r e s s a g r e e m e n t r u l e s u s e d for b o t h p a r s i n g and g e n e r a t i n g . As in M a r t i n K a y d e f i n i t i o n s a r u l e m a y h a v e d i f f e r e n t d e r i v a t i o n s . T h e s e are r e p r e s e n t e d by enclosed braces. Example : simple_phrase = { pattern = (gnl v t r a n s gn2) meaning = [action = s u b j s e m = objsem = ] c o n s t r a i n t s = ( [ e q u a l = ( , ]) pattern meaning

= =

constraints

=

(gnl v i n t r a n s ) [action = s u b j s e m = ] ( [ e q u a l = ( , ~)

}

3.

THE

3.1.

PARSING

PROCESS

Us___e o f th e ~ r a m m a r

4. T H E

4.1o

for

In o r d e r to a n a i y z e a s e n t e n c e , the w o r d s a n d c o m p o u n d s w o r d s a r e c o n v e r t e d in Functional Descriptions, using a morphological analyzer and a dictionary. The r e s u l t is a l i s t o f F D ' s w h i c h w i l l be p r o c e s s e d b y -the p a r s e r . Example (semantic va]ues are expressed by English terms but they are usually e x p r e s s e d as FD) : "]es c h a u s s u r e s Input

list

vertes"

of p a r s e r

is

("the green

here

:

[cat = subst gender = fem number:plural lex=ehaussure meaning=shoe]

Use

of the ~rammar

:for ~ e n e r a t i o n

The generation-takes as i n p u t a s e m a n t i c structure and produces a sentence. A s an e x a m p l e t h e r u l e s i m p l e _ g n (cf c h a p t e r 2), is a c t i v a t e d w i t h t h e s e m a n t i c structure [ obj = box definitude = undefined qualif = white number = plural ]

[ c a t = adj gender = fem number=plural lex = vert meaning:green]

A c o p y of t h e r u l e is b u i l t . T h e p a t h s in ā€¢the F u n c t i o n a l D e s c r i p t i o n a s s o c i a t e d to t h e i d e n t i f i e r " m e a n i n g " a r e u s e d to c o n v e y the s e m a n t i c i n f o r m a t i o n to "the i t e m s r e f e r r e d to b y t h e i d e n t i f i e r " p a t t e r n " ( T h e s e i t e m s are n a m e d " c o n s t i t u e n t s " ) ]~n this

example

identifier This sentence matches with the rule s i m p l e _ g n d e s c r i b e d in c h a p t e r 2, as the f i r s t F D of the l i s t is f u n c h i o n n a l y u n i f i a b ] e w i t h [ c a t = de t], the s e c o n d F D w i t h [cat = s u b s t ] a n d t h e t h i r d F D w i t h

[cat

PROCESS

shoes")

( [cat = det type =defined number:plural l e x : ]e]

GENERATING

Jt g i v e s

:

pointed

path

obj definitude qualif number

value

box undefined white p]ura]



= adj].

The parsing process builds a structure which is a c o p y of t h e r u l e s i m p l e _ g n a n d e n l a r g e s it w i t h t h e actual, w o r d a n a l y z e d . The path descriptions are replaced by their actual values.

The interpretation " b u i l d s " the p a t h , needed identifiers of -the r u l e .

p r o c e s s o f the g r a m m a r w h i c h m e a n s that t h e a r e i n c l u d e d in t h e c o p y

FD

3.2.

Structure

sJmple_gn [ pattern meaning

= = =

det

=

subst

=

adj

=

built

(det s u b s t adj) [obj = s h o e definitude = defined qualif = green number = plural ] [cat = det type = defined number =plurai lex ~ le ] [cat = subst gender = fem number = plural lex = c h a u s s u r e meaning = shoe ] [ c a t = adj gender = fem number = plural lex = v e r t meaning = green ]

T h i s s t r u c t u r e is b u i l t if t h e c o n s t r a i n t s a r e m e t : f o r t h i s r u l e it i m p l i e s a g r e e m e n t of g e n d e r a n d n u m b e r , w h i c h is t h e c a s e f o r "les c h a u s s u r e s v e r t e s " .

for D E T b e c o m e s : det = [ cat = deh type = undefined ~ere "type" has been added.

]

FD

:for S U B S T b e c o m e s : subst = [ cat = subst meaning ~ box number = plural ] where "meaning" and "number" have been added.

FD

for ADJ becomes : adj = [ c a t : adj meaning = white ] where "meaning" has been added.

T h e n t h e c o n s t r a i n t s a r e a p p l i e d . In t h e p a r s i n g p r o c e s s t h e y a r e u s e d to e l i m i n a t e w r o n g c o n s t r u c t i o n s , w h i l e in t h e g e n e r a t i n g p r o c e s s t h e y a r e u s e d to t r a n s m i t information.

Use

of

the

constraints

equal = (



)

537

At t h i s step, t h i s r u l e d o e s n ' t t r a n s m i t any i n f o r m a t i o n b e c a u s e i d e n t i f i e r " g e n d e r " is n o t p r e s e n t in at l e a s t one F u n c t i o n a l Description equal = (



)

T h i s r u l e t r a n s m i t s n u m b e r of s u b s t a n t i v e ( n u m b e r = p l u r a l ) , in the t w o o t h e r F u n c t i o n a l d e s c r i p t i o n s of the o u t p u t l i s t After constraints l i s t is : ([cat = type = number =

are

applied,

the

output

det undefined plural ]

F o r t h i s e x a m p l e the list c o n s t r u c t e d b y morphological g e n e r a t i o n is : ( "des", "boites", "blanches" ) which gives : "des b o i t e s b l a n c h e s "

is

the

[cat = subst meaning = box number = plural

]

T h i s e x a m p l e is a s i m p l e c a s e w h e r e items of a " p a t t e r n " do not r e f e r to o t h e r r u l e s . P r e s e n c e of a r u l e n a m e in a p a t t e r n l e a d s to a c t i v a t i o n of t h i s r u l e w i t h a s u b s e t of the i n i t i a l m e a n i n g ( t r a n s m i t t e d b y a path, as for a t e r m i n a l ) .

[cat = adj meaning = white number = plural

])

4.2.

T h e n e x t s t e p is w o r d s e l e c t i o n : for e a c h t e r m i n a l , the s e m a n t i c s t r u c t u r e a s s o c i a t e d w i t h it is u s e d to c h o o s e a l e x i c a l item. T h i s is d o n e b y u s i n g F u n c t i o n a l U n i f i c a t i o n . F o r e a c h w o r d or c o m p o u n d w o r d s e l e c t e d , " c o n s t r a i n t s " are p r o c e s s e d a g a i n , in o r d e r to t r a n s m i t i n f o r m a t i o n s to F u n c t i o n a l D e s c r i p t i o n s of the list. F o r a g i v e n s t r u c t u r e t h e r e m a y be m o r e o n e a d e q u a t e w o r d . In t h a t c a s e the a p p r o p r i a t e w o r d is c h o s e n by the u s e r interactively.

the

first

item

:

( [cat=det type=undefined number=plural lex = un ]

Generation

models

T h e g e n e r a t i o n of the s e n t e n c e a s s o c i a t e d to a s e m a n t i c s t r u c t u r e m a y lead to v a r i o u s s y n t a c t i c a l c o n s t r u c t s . In o r d e r to r e d u c e the n u m b e r of c o n s t r u c t s , and to a l l o w c o n t r o l of t e x t style, a s p e c i f i c f e a t u r e has b e e n i n t r o d u c e d , n a m e d " g e n e r a t i o n model". A generation model associates a s e m a n t i c p a t t e r n to a p r e c i s e g r a m m a r rule.

than

T h e l i s t of t e r m i n a l s is e n l a r g e d b y the s e l e c t e d l e x i c a l items, as s h o w n in t h e following example : For

At this s t e p e a c h w o r d of the o u t p u t l i s t completely defined. The morphological generation processes each Functional D e s c r i p t i o n u s i n g f i e l d s LEX, G E N D E R , NUMBER, MODE, TENSE and PERSON. The a p p r o p r i a t e f o r m of the l e x i e a l i t e m is constructed using Functional Unification.

[cat = s u b s t [cat = adj meaning=box meaning=white number=plural] number=plural]

Example

:

S e m a n t i c s t r u c t u r e a s s o c i a t e d to the a d v i c e "Do not e x p o s e to rain" in a u s e r ' s m a n u a l : [advice advice-type = directive advice-giver = constructor content = [link = negation argl = [ a c t i o n action-type = expose subjsem = user objsem = machine obj2 = rain ] ] ]

) For

the

second

item

:

( [cat=det type=undefined number=plural lex=un gender=fem ]

[cat=subst [cat=adj meaning=box meaning=white number=plural number=plural lex=boite gender=fem ] gender=fem ]

) For

the

third

item

A m o n g the " g e n e r a t i o n m o d e l s " the f o l l o w i n g is F u n c t i o n n a l y the a b o v e s t r u c t u r e :

[advice advice-type = directive g e n - m o d e l = [ [cat = p r o p - i n f i n i t i v e pattern = (gvinf *comp) meaning = ]

:

( [cat=det type=undefined number=plural lex=un gender=fem ]

)

538

of the system, U n i f i a b l e to

[cat=adj [cat=subst meaning=white meaning=box number=plural number=plural lex=boite gender=fem gender=fem ] lex=blanc ]

[cat = p r o p - m u s t pattern = (gvdir *comp) meaning = ]

] Remark m a y be

: the s y m b o l repeated.

] * means

that

the

rule

This g e n e r a t i o n model is s e l e c t e d by a r e s t r i c t e d v e r s i o n of F u n c t i o n a l U n i f i c a t i o n : i d e n t i f i e r s "advice" and " a d v i c e - t y p e " must be p r e s e n t in the semantic structure.

KAY,M.

In this e x a m p l e two g r a m m a r rules are c a n d i d a t e once the g e n e r a t i o n model is selected. A simple i m p l e m e n t a t i o n is to choose a rule at random, a n o t h e r is to h a v e an e v a l u a t i o n m o d u l e w h i c h choose the most a p p r o p r i a t e rule a c c o r d i n g to stylistic k n o w l e d g e (technical style, t e l e g r a p h i c style, etc).

KAY,N.

"Functional Grammar." Annual Meeting of the Society, 1979.

"Unification 1981.

Proceedings of Firth Berkeley I inguistics

Grammars."

Xerox

publication.

NeKEOWN,K. "GeneraLing Natural Language TexL in Response to Questions about Database Structure." Ph.D. dissertation. University of Pennsylvania. 1982.

5. D E V E L O P M E N T S

P r e v i o u s v e r s i o n of the m u l t i l i n g u a l g e n e r a t i o n system uses a g r a m m a r for parsing, and p r o d u c t i o n rules for generation. P r e s e n t work i s the a d a p t a t i o n of the p a r s e r to the n e w formalism, and the i m p l e m e n t a t i o n of the g e n e r a t i o n interpreter. It includes the a d a p t a t i o n of the m u l t i l i n g u a l d i c t i o n a r y retrieval process.

RITCIIIE,G. "Simulating n -ruring functional unification Piss. 1984.

machine using g r a m m a r . " ECAI

84.

ROUSSELOr,F. "R4alisaLJon d'un programme eomprenant des textes~ en u L J l i s a n L un f o r m a l i s m e unique pour reprdsenLeĀ£ routes les connaissances ndcessaiees." Thbse d'EtaL. University of Paris VI. ]986.

6. R E F E R E N C E S APPELT,D.E. "Planning E n g l i s h U n i v e r s i t y Press.

Sentences." 1985.

Cambridge

K A P L A N , R . M . and BRESNAN,J. " L e x i c a l - F u n c t i o n a l G r a m m a r : A Formal S y s t e m for G r a m m a t i c a l R e p r e s e n t a t i o n . " In : Bresnan,J. (ed) The M e n t a l R e p r e s e n t a t i o n of G r a m m a t i c a l Relations. MIT Press. 1982.

ROUSSEI_OT,I r . and GROSCOT,H. "Un l a n g s g e d 6 c ] e r a t i f uniforme analyseur syntaxico-s~mant:ique." 85. Paris. ]985.

SIMflNZN,N. "Ukilisation des t e x t e s Univet'sity

eL un COGN[TIVA

d'une Expertise pout" s t P u e t t J F ~ s en f r a n g a i s . of Paris VI. 1985.

engendrer " Th~se.

539