Tree Unification Grammar - CiteSeerX

Electronic Notes in Theoretical Computer Science 53 (2001) URL: http://www.elsevier.nl/locate/entcs/volume53.html

Tree Unification Grammar
Problems and Proposals for Topology, TAG, and German

Kim Gerdes

1

Lattice, UFRL, Université Paris 7

Abstract

This work presents a lexicalized grammar formalism which can be seen as a variant of multi-component tree adjoining grammar (TAG). This formalism is well-suited for describing the syntax of German because it relates a syntactic dependency graph with a hierarchy of topological domains. The topological phrase structure encodes the placement of verbal and nominal elements in the (ordered) field structure, just as in the classical topological analysis of the German sentence. This module constitutes an intermediate step between the semantic and the prosodic representations of the sentence and allows deriving cases of scrambling that are problematic for classical TAGs and for some phrase structure based formalisms.

Die deutsche Wortfolge ist nicht „frei“, sondern denkbedingt. 2

1 Introduction

This paper proposes yet another lexicalized tree grammar formalism in the TAG family with the purpose of capturing German word order phenomena, and yet another linearization system for dependency grammars based on the topological model of the German sentence. What sets this work apart from previous work is that the proposed formalism, Tree Unification Grammars, accomplishes both these tasks at the same time. Moreover, we see the purpose of a lexicalized tree grammar to serve as a syntactic correspondence module inside the Meaning-Text-Theory (MTT, Mel’čuk 1987).

1 Email: [email protected] This paper has benefited greatly from discussions with Sylvain Kahane, Igor Mel’čuk, Hi-Yon Yoo, and Patrice Lopez. I assume the customary responsibility for content and shortcomings.
2 The German word order is not ‘free’, but thought-conditioned. Drach 1937, page 26

© 2001 Published by Elsevier Science B. V.


GERDES

In MTT, language is described as a (reversible) process of generating text (the actual spoken or written language) from meaning, which is thought of as a conceptual structure of what to say. On its way to become text, specific correspondence modules transform the meaning into different intermediate representations (see Figure 1). In the present work, we use a slightly revised version of MTT following Gerdes and Kahane 2001a: Between the phonological and the syntactic representation we stipulate the existence of a hierarchy of word domains, called topological phrase structure, to be introduced below, replacing the ‘morphological level’ in usual MTT.

[Figure 1: Structures in MTT. From top to bottom: possibly conceptual graph; semantic graph (encodes theta roles and communicative hierarchy of theme/rheme); possibly other intermediate levels like deep syntax; surface syntax tree (encodes subcategorization and passes through word groups); proposed module; topological phrase structure tree (encodes a hierarchy of word groups); surface string; phonological representation.]

The line of argument of this paper goes as follows: The next two sections intend to justify why an adequate lexicalized tree grammar for German has to go far off the track of usual Tree Adjoining Grammars: In section 2, I recall briefly why a phrase structure view on German will not give satisfying insights in the functioning of the language. In section 3, I summarize the efforts that have been made to adapt TAG to German and vice versa. Section 4 gives a short introduction of an alternative view on phrase structure based on the classical topological model of the German sentence structures. This phrase structure does not itself carry any syntactic or semantic information, but it can easily be linked to a phonological analysis on one side and to an (unordered) syntactic dependency structure on the other side. In section 5, then, I present some phenomena of German constituent formation, which are difficult to handle at the syntactic level. These constituents will be shown to be of a different nature than syntactic constituents, and should be controlled on a semantic level, i.e. by the module in the MTT framework that links the semantic with the syntactic representation. We end up with a ‘lightened’ task of the syntactic module. This task will finally be shown to be accomplished by Tree Unification Grammars, to be defined and illustrated in sections 6 and 7.

The appearance of a topological structure in an MTT grammar, where phonology and syntax have to be joined, seems surprisingly natural: Erich Drach, the father of German sentence topology who coined today’s usual field names, also wrote the following lines in 1937: Um Aufschluß zu geben, „wie eine gesprochene oder niedergeschriebene Äußerung als zutreffender Ausdruck des gemeinten Bedeutungserlebnisses zustande kommt,“ muß „von der Sprechdenk-Funktion, dem Schöpfungsakt des Satzes in der Seele des Sprechenden, ausgegangen werden: von der Beobachtung des Sprechens als Persönlichkeitsleistung und als sozialen Handelns.“ 3 A clear forerunner of the application of a topological structure in an MTT-framework.

2 Phrase structure and its shortcomings

Classical phrase structure tries to collapse syntactic and ordering information. This conception of the syntax of language is erroneous because it assumes that word order is always an immediate reflection of the syntactic hierarchy and that any deviation from this constitutes a problem, denoted by terms like scrambling 4.

Modern linguistic frameworks propose a double structure consisting at least of the basic syntactic structure (valency, functor argument structure, f-structure, deep structure; we will use the term syntactic dependency) and a linear structure (surface structure, c-structure, phrase structure, precedence rules, etc.); some frameworks like LFG put this duality at the basis of their system, others, like HPSG, unify the different structures (e.g. SYNSEM and DTRS) in one sign.

While the syntactic dependency gives rise to little controversy, the different phrase structures proposed for linearization constitute the bone of syntactic contention. The different approaches appear to be caught in the transformational thinking which assumes two closely related structures: An already-ordered deep structure holds functional information and transforms, via movements, into a surface structure which then still carries some functional information hidden in the nodes that have not gotten hold of any of the itinerant elements. Accordingly, a good surface phrase structure is one that carries as much functional information as possible, for it had been easy to obtain the surface structure by transformation from the deep syntax tree. For English, this attempt can go quite far; for languages with case however, the word order serves mainly to represent other communicative goals than functional recovery.

The attempts for a transformational grammar of German resulted in a great variety of surface phrase structures: Evers 1975’s analysis puts all NP’s on the highest possible node after the transformation from a deep structure. Müller 1999, on the contrary, advocates binary and right-combed phrase structure trees for his HPSG grammar. They all end up with surface phrase structures whose subtrees do not correspond to linguistic (functional, prosodic, semantic…) objects of their own; their justification relies on the transformational proximity to the deep structures.

3 “For an explanation of how a spoken or written utterance becomes a correct expression of the meant significant experience, we have to hypothesize on the speech and thought function, on the act of creation of the sentence in the soul of the speaker: on the observation of the act of speaking as a personal achievement and as a social action.” (Drach 1937, p. 7)
4 First used by Ross (1967).

3 German sentence topology

The classical analysis of German sentence structure (Drach 1937, Bech 1955) divides the sentence into a fixed sequence of fields, in which the syntactic elements are placed. We denote by domain a sequence of fields. The main domain of a declarative sentence consists of Vorfeld (VF), left bracket (‘[’), Mittelfeld (MF), right bracket (‘]’) and Nachfeld (NF). We call the fields VF, MF, and NF major fields.

The idea is that words do not position themselves in relation to each other, but that they appear in the fields, which are present at every utterance. The field structure controls the possible orders by constraining the number of elements that they must hold.

Kathol 1995 proposes a formalization of the topological structure in HPSG, refining work of Reape 1994. He shows that this structure is independent of phrase structure and essential for linearizing German. However, based on the HPSG framework, he still needs to keep phrase structure for combining signs. In a sense, he keeps three levels of description: The domain structure (DOM) giving the linearization, the phrase structure tree (DTRS), representing how the structure has been built, and the dependency graph (encoded under SYNSEM), corresponding to the subcategorization.

Recent works in dependency grammar have tried to link directly the dependency structure to the placement of the words in different fields 5, skipping the constituent structure underlying formalisms like HPSG. See Bröker 1998 for a lexicalized description of very simple phenomena based on modal logic, Duchier and Debusman 2001 for a description in constraint programming, and Gerdes and Kahane 2001a for a description of a topological hierarchy seen as a syntactic module of Meaning Text Theory. The development of Tree Unification Grammars is an attempt to lexicalize and fine-tune this latter approach.

5 The word order problem of many dependency analyses is somehow orthogonal to the problems of phrase structure grammars stated above: It is difficult to encode the linear ordering of the words into the dependency graph, since purely local rules (on the ordering of the dependents of a lexeme) are evidently not sufficient. See Lombardo and Lesmo 2000 for a short summary of the different attempts to define the ‘right’ degree of projectivity.

Let us now turn to the ideas underlying the topological phrase structure: The algorithm of Gerdes and Kahane 2001a takes as input an unordered surface syntactic dependency tree 6 with a markup of groupings of the words of the tree: The lexical elements are all part of a second hierarchy, indicating which elements will have to form a topological constituent. The meaning of these constituents and the underlying restrictions on their formation will be discussed in section 5. Out of this marked-up tree, we construct an ordered hierarchy of topological domains, the topological phrase structure, thus linearizing the words of the dependency tree.

[Figure 2: Correspondence syntax - topology. Syntactic dependency tree: hat ‘has’ governs niemand ‘nobody’ (subject) and versprochen ‘promised’ (pp); versprochen governs dem Lehrer ‘to the teacher’ (indirobj) and zu lesen ‘to read’ (zu-inf), which governs den Roman ‘the novel’ (obj). Topological phrase structure tree: a main domain with the fields vf, [, mf, ], nf, whose Vorfeld holds the embedded domain dem Lehrer zu lesen versprochen, followed by hat den Roman niemand.]

As an illustration, consider Figure 2: The linearization will be done by placing the elements of the syntactic dependency tree into the main domain of the declarative sentence. We start from the root of the tree and place the finite verb in the left bracket. Its subject could go in one of the major fields, and it goes, for instance, into the Mittelfeld.

Essential to this analysis is that a verbal complement can be placed in two ways into the topological structure: a verbal complement always goes into the right bracket of a domain, but this domain can either be the existing domain of its verbal governor or, if the verb heads a grouping on its own, it can be a new embedded domain it creates in a major field of its governor or in a higher domain containing its governor. An embedded domain of non-finite verbs consists only of a Mittelfeld, a right bracket, and a Nachfeld. If it creates such a new domain, the domain as a whole behaves like a non-verbal complement of the verb. In the example, versprochen ‘promised’ heads a grouping and therefore opens a new embedded domain. This new domain, as it is headed by a past participle, can only go into the Vorfeld; a zu-infinitive could join any major field.

6 This approach uses a very ‘surfacy’ version of dependency. Since subject placement in German is identical for auxiliaries, raising and control verbs, we only encode actual syntactic sub-categorization: The controlled verb zu lesen does not control its deep subject niemand/nobody, and the subject belongs to the auxiliary or the past participle. See Figure 2.


The non-verbal dependent of the past participle, dem Lehrer ‘to the teacher’, is a part of the grouping of its governor, and consequently it has to stay in its governor’s domain. Between the two major fields in the embedded domain, it chooses the Mittelfeld 7. The next verbal dependent, the infinitive, could again create a new domain in one of the major fields of its governor’s domain or of a domain containing its governor. The other choice is to join the right bracket of its governor’s domain. In the example, the creation of a new domain is not possible because it is part of its governor’s grouping. The infinitive has to stay in its governor’s domain, and in this case, it must go directly to the left of its governor in the right bracket.

Now it remains only to place the last complement, “the novel”. It can again go in one of the major fields of its governor’s domain, or of a domain containing its governor. Since the grouping cuts it out of its governor’s domain, it finds itself naturally in the higher domain, next to niemand in the Mittelfeld. All elements of the dependency tree have been positioned in the topological structure and the derivation is completed.

So we have linearized the (unordered) nodes of the dependency tree into sentence (1), a variation of Rambow 1994’s main example 8.

(1) Dem Lehrer zu lesen versprochen hat den Roman niemand.
    The teacher to read promised has the novel nobody.
    Nobody has promised to the teacher to read the novel.
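The placement steps walked through above can be replayed in a few lines of code. The following is only a minimal sketch, not the algorithm of Gerdes and Kahane 2001a: the class `Domain`, its methods `place` and `words`, and the function `linearize_example_1` are hypothetical names introduced here for illustration, and the placement decisions are hard-coded to follow the prose rather than computed from a dependency tree.

```python
from typing import Dict, List, Union

# Field order of a main domain: Vorfeld, left bracket, Mittelfeld, right bracket, Nachfeld.
MAIN_FIELDS = ["vf", "[", "mf", "]", "nf"]
# An embedded domain of a non-finite verb has only Mittelfeld, right bracket, Nachfeld.
EMBEDDED_FIELDS = ["mf", "]", "nf"]


class Domain:
    """Hypothetical container: an ordered sequence of fields holding words or sub-domains."""

    def __init__(self, field_names: List[str]) -> None:
        self.field_names = list(field_names)
        self.fields: Dict[str, List[Union[str, "Domain"]]] = {n: [] for n in field_names}

    def place(self, field_name: str, item: Union[str, "Domain"], leftmost: bool = False) -> None:
        """Put a word or an embedded domain at the right (default) or left edge of a field."""
        if field_name not in self.fields:
            raise ValueError(f"this domain has no field {field_name!r}")
        if leftmost:
            self.fields[field_name].insert(0, item)
        else:
            self.fields[field_name].append(item)

    def words(self) -> List[str]:
        """Read the surface string off the ordered fields, descending into embedded domains."""
        out: List[str] = []
        for name in self.field_names:
            for item in self.fields[name]:
                out.extend(item.words() if isinstance(item, Domain) else [item])
        return out


def linearize_example_1() -> str:
    main = Domain(MAIN_FIELDS)
    main.place("[", "hat")                 # finite verb goes to the left bracket
    main.place("mf", "niemand")            # subject picks a major field, here the Mittelfeld
    embedded = Domain(EMBEDDED_FIELDS)     # versprochen heads a grouping: new embedded domain
    embedded.place("]", "versprochen")     # verbal complement sits in a right bracket
    main.place("vf", embedded)             # participle-headed domain goes to the Vorfeld
    embedded.place("mf", "dem Lehrer")     # stays in its governor's domain, in the Mittelfeld
    embedded.place("]", "zu lesen", leftmost=True)  # directly left of its governor
    main.place("mf", "den Roman", leftmost=True)    # higher domain, next to niemand as in (1)
    return " ".join(main.words())


print(linearize_example_1())
```

Reading the fields from left to right reproduces the word order of sentence (1), without capitalization or punctuation.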

The other possible surface orders of German can be obtained with other groupings 9. Consider also the other examples in Figure 3: In case A, the sentence is not further subdivided into groups, and the constructed topology consequently has no embedded domains. In case B, the infinitive and its complement form a group, and the corresponding domain could occupy any major field. In this example the domain goes into the Vorfeld, but starting from the same structure, we could also generate the surface string as in case C. This string is identical to the string in case A, i.e. a sentence can be topologically ambiguous. We have shown in Gerdes and Kahane 2001b that this distinction is not a spurious ambiguity, but corresponds to different prosodic patterns of the same string, and thus to different linguistic structures. The pure dependency tree without the group mark-up does not capture this difference 10.

7 NP’s in the Nachfeld have a heaviness constraint, not discussed here. See for example Müller 1999 section 13.1.1.3.
8 Reading the novel is a better example than repairing the fridge, because we avoid confusion with the benefactive dative (ihm das Fahrrad reparieren vs. *ihm den Roman lesen).
9 The constraints on what can form a group are discussed in section 5.
10 Topological structure A requires a very specific discourse context and is therefore more difficult to obtain than the three other examples. See Gerdes and Kahane 2001b for details.
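The notion of topological ambiguity described above can be made concrete with a small sketch. The nested-list encoding of domains, the function `flatten`, and in particular the concrete field assignments below are assumptions of this illustration (they are one plausible reading of the prose, not a transcription of Figure 3): a flat structure without embedded domains, and a structure in which the infinitive and its object form an embedded domain in the Mittelfeld.

```python
from typing import List, Tuple, Union

# A domain is an ordered list of (field name, items); an item is a word or an embedded domain.
Item = Union[str, "DomainT"]
DomainT = List[Tuple[str, List[Item]]]


def flatten(domain: DomainT) -> List[str]:
    """Return the surface string of a domain as a list of words."""
    words: List[str] = []
    for _field, items in domain:
        for item in items:
            if isinstance(item, str):
                words.append(item)
            else:
                words.extend(flatten(item))
    return words


# No grouping: all elements sit in the fields of the main domain.
flat: DomainT = [
    ("vf", ["niemand"]),
    ("[", ["hat"]),
    ("mf", ["dem Lehrer", "den Roman"]),
    ("]", ["zu lesen", "versprochen"]),
    ("nf", []),
]

# Grouping of the infinitive with its object: an embedded domain in the Mittelfeld.
embedded: DomainT = [("mf", ["den Roman"]), ("]", ["zu lesen"]), ("nf", [])]
grouped: DomainT = [
    ("vf", ["niemand"]),
    ("[", ["hat"]),
    ("mf", ["dem Lehrer", embedded]),
    ("]", ["versprochen"]),
    ("nf", []),
]

same_string = flatten(flat) == flatten(grouped)
same_structure = flat == grouped
print(" ".join(flatten(flat)))
print(same_string, same_structure)
```

Under these assumptions the two structures yield the same word string while remaining distinct objects, which is the situation the text attributes to different prosodic patterns of one string.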

[Figure 3: Syntactic dependency tree with group hierarchy and their corresponding topological phrase structure trees. Each panel (A, B, C, and one further panel) pairs the dependency tree of hat ‘has’ (subject niemand ‘nobody’; pp versprochen ‘promised’, governing indirobj dem Lehrer ‘to the teacher’ and zu-inf zu lesen ‘to read’, which governs obj den Roman ‘the novel’) under a different grouping with a topological tree over the fields vf, [, mf, ], nf.]

The rules presented above constitute the backbone of a description of dependency linearization in a topological model. For details and finer-grained rules we refer to Gerdes and Kahane 2001a. I would just like to draw the reader’s attention to the fact that I do not treat the structure of noun phrases in this work. In the syntactic dependency trees, NPs are represented without inner structure, evoking Tesnière’s nuclei (Tesnière 1959), i.e. unstructured clusters of meaning. Of course, in the light of a topological analysis of German, it would be reasonable to explore the possibilities of analysing the NP as a specific kind of domain, from which extraction is possible under certain conditions just as from verbal domains.

4 TAGs and their shortcomings

A lexicalized TAG is a simple mathematical language model with nice computational properties: A lexical entry consists of elementary trees that combine with other trees by very simple rules to form the final phrase structure of the analyzed sentence. Noting down the steps taken yields a derivation tree, interpretable as a semantic dependency structure consisting of the lexical units. A complete analysis consists of the string, the attached derived tree, and the derivation tree. Becker et alii 1991 called obtaining the correct objects weak, strong, and derivational generative power 11, respectively.

Different approaches have tried to construct a semantic structure during the TAG derivation: Synchronous TAGs construct a semantic tree in parallel to the usual derivation tree (which considerably raises the computational complexity of the formalism) (Shieber & Schabes 1990). Joshi and Kallmeyer 1999 give to TAG a restricted multi-component makeup, designed for scope interpretability of the derivation tree.

This latter approach presumes the goal of TAG to be a direct link between the surface string and a semantic structure. Equally, in the two important existing TAG grammars, XTAG (XTAG group, 1995) and FTAG (Abeillé 1991), the derivation trees are supposed to encode dependencies of a more profound level than simple syntactic sub-categorization; e.g. raising verbs, not carrying their subject in their elementary tree, are adjoined into the infinitives, resulting in a derived structure where the finite raising verb is not linked to the subject it agrees with, i.e. the raising verb is given the role of a pure modifier of the “main” verb. However, this “semantic-ambition” 12 of the derived tree falls far short of perfection when, for example, adjectives adjoin to one another and not to the noun (Schabes and Shieber 1994), or when the derived tree of control verb constructions does not encode the “controlled” link between subject and infinitive (Candito and Kahane 1998a). It seems much more reasonable to limit ourselves right from the start to a surface syntax dependency encoding exclusively surface syntactic relations. However, the writers of the LTAG-grammars did not have a choice: For example, the only way to cover long-distance relationships in the (single-component) TAG formalism is the adjunction of the matrix verbs, resulting in derived structures with mixed (semantic and syntactic) information content.

11 This notation disregards the fact that the derivation is part of the analysis that TAG attaches to a sentence, and it should simply be considered as a part of the strong generative power.
12 It is at least a “deep-syntax-ambition”, depending on where we place verbal valency questions.

[Figure 4: Structures in the standard TAG analysis. From top to bottom: semantic tree (obtained with synchronous TAGs); derivation tree (semantic and deep syntax interpretation); deep syntax tree (implicitly defined with the derived tree); derived tree (semantic, syntactic, or prosodic interpretation); surface string.]

In this correspondence from the surface string to the (semantic) derivation tree, the role of the derived tree remains theoretically and computationally unclear: It attempts to resemble GB’s surface syntactic tree as some empty nodes are marked with epsilons, but the deep syntactic tree whose elements have been moved is never calculated. These trees inherit the handicap of GB’s surface syntactic trees: Some nodes are simply computational necessities, others tend to represent semantic, syntactic, or prosodic units. For example, GB’s surface syntactic trees need intermediate landing sites for the moving objects. TAG’s derived tree has even more nodes whose subtrees do not correspond to syntactic entities: Each adjunction adds an additional node level into the derived structure and even elementary trees cannot be flat, because sister adjunction is not available. The nodes allow controlling adjunction between elements, and are vital in TAGs 13 mainly for expressing the linearization rules; they do not stem from linguistic observation 14. The resulting mostly right-branching VP or NP structures are often justified with scope properties of adverbials or adjectives (see for example Schabes and Shieber 1994). In a sense, the raison d’être of the derived tree is that it allowed us to obtain a semantically interpretable derivation tree; its status is not the representation of a linguistic entity.

13 The same holds for DTGs (Rambow et alii 1995) and GAGs (Candito, Kahane 1998b).
14 Of course some linguist might really want to have these intermediate nodes in the phrase structure. All we can really say is that a linguistic description that does not have these nodes is not possible.

However, even this compulsory open-mindedness of the LTAG writers concerning the derived tree does not suffice: No matter which derived tree we take, as long as TAG’s strong cooccurrence constraint 15 is supposed to hold, we cannot obtain the predicate-argument structure of a double matrix construction with fronted inner argument in English (Rambow, Vijay-Shanker, Weir, 1995), and for German, Becker et alii, 1991, 1992 show that TAG cannot describe the ‘scrambling’ phenomena in a satisfying manner.

In spite of these drawbacks we should not give up right away the idea of a lexicalized tree grammar, more precisely a lexical grammar whose lexical entries can be combined in two manners in parallel: to form an ordered phrase structure and an unordered dependency tree. My objective is to use a lexicalized tree grammar as a module in an MTT approach, i.e. as the correspondence module between the topological and the (surface) syntactic structure of our linguistic representation. The existence of the different levels can be justified computationally by the simplicity of the two correspondence modules, one for obtaining a structure, the other for translating it into the following structure. However, the levels can also be validated intuitively and psycho-linguistically by the expressiveness of the structures and their (possible) well-formedness rules. So I suppose a specific phrase structure (a topological hierarchy, which I tried to justify in section 3), and a specific dependency (only simple sub-categorization structures for agreement), and I’m looking for an algorithm that links the two structures compositionally (with corresponding substructures). We could call this capacity of a lexicalized grammar the descriptive strong generative power.

Since TAG cannot analyze German with any phrase structure and with a derived tree that encodes syntactic or semantic dependencies, it is clear a fortiori that TAGs (and their close relatives, which do not allow sister adjunction) lack the descriptive strong generative power for the topological model, i.e. the power to engender the desired topological structure and the surface syntax derivation tree in parallel. My goal is thus to define a lexicalized tree grammar with enough descriptive strong generative power for the relation between a surface syntax dependency and topology.

We will see that this grammar should also remedy another flaw of TAG: Since elementary trees of standard TAG have ordered branches, we obtain a combinatorial explosion of trees undistinguishable from syntactic ambiguity and thus a high information redundancy, in particular for freer word order languages like German 16.
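The combinatorial explosion mentioned in the last paragraph is easy to quantify in a simplified setting. The sketch below assumes, purely for illustration, that every ordering of a verb's arguments needs its own ordered elementary tree; this is an upper bound that ignores all field constraints, and the function name `ordered_variants` and the argument labels are hypothetical.

```python
from itertools import permutations
from math import factorial
from typing import List, Tuple


def ordered_variants(arguments: List[str]) -> List[Tuple[str, ...]]:
    """Enumerate every left-to-right ordering of a verb's arguments.

    With ordered branches, each ordering corresponds to a separate tree,
    whereas an unordered sub-categorization frame is a single entry.
    """
    return list(permutations(arguments))


# Dependents of 'versprochen' in the running example (labels as in Figure 2).
args = ["subject", "indirobj", "zu-inf"]
variants = ordered_variants(args)

print(len(variants), "ordered variants for", len(args), "arguments")
for n in range(1, 6):
    # n arguments yield n! orderings, so the tree count grows factorially.
    print(n, factorial(n))
```

For three freely ordered arguments this gives six ordered variants against one unordered frame, and the count grows factorially with the number of arguments.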

15 A predicate contains in its elementary tree at least one node for each of its arguments.
16 One proposed solution, the metagrammar (Candito 1999), solves the practical aspects of grammar generation, but moves linguistic description out of the tree sets into the metagrammar, reducing the elementary trees to some algorithmic side product. See Gerdes 2002 for details.

5 Communicative groups and topological well-formedness: dividing the tasks among the modules

It is well known that the “freedom” of German (or any other case language’s) word order is only relative to a given sub-categorization, when disregarding the context in which the sentence is uttered. So it is mostly agreed on that the speaker chooses one order of elements over another order, for example in the German Mittelfeld, to distinguish old information from new one, to distinguish what she talks about from what she says about it, to distinguish what she finds important from less important information. We call these distinctions communicative structure; a more commonly used term is ‘information structure’ 17.

5.1. Data on VP Fronting

It is less clear what kind of rules govern the formation of so called (partial) VPs. The problem stands out clearly when the VP takes the Vorfeld, because “das ins Vorfeld verlegte Satzglied – gleichviel, wie es grammatisch verwendet sei – kann beliebig untergliedert werden. Immer jedoch bleibt es ein Ganzes” 18 (Drach 1937, page 21). The examples in (2), (3), and (4) have an NP joining the past (or passive) participle to form one constituent in the Vorfeld. The first question is: What kind of entity is the Vorfeld in sentences like (2) and (3)?

(2) a. Den Roman gelesen hat Peter bisher nicht.
       The novel(acc) read has Peter so-far not.
       So far, Peter has not read the novel.
    b. Ein berühmter Geiger geworden wäre er gerne. 19
       A famous violinist(nom) become, would he with-pleasure.
       He would have liked to become a famous violinist.

17 I prefer the term ‘communicative structure’, because I do not know what information is. Moreover, the complex NP ‘information structure’ is ambiguous between the intended reading ‘structure of the information’ = ‘structured information’ and the reading ‘structure that contains/gives information’ = ‘informative structure’. Interesting discussions on the terms in question can be found in Choi 1999 (section 3.2.2), Vallduví 1992 and Lambrecht 1994.
18 “The phrase that is moved to the Vorfeld, however may be its grammatical use, can be subdivided arbitrarily. It always remains an entity.”
19 The contrast between (2b) and (4a), both fronted constituents with nominative arguments, goes to show that the distinction between the terms subject and nominative argument is worthwhile.


(3) a. Ein Linguist angekommen ist (*sind) bisher nicht. 20
       A linguist(nom) arrived has so-far not.
       So far, no linguist has arrived.
    b. Solche schönen Geschenke gemacht wurden (*wurde) mir noch nie.
       Such nice presents(nom) offered were (*was) to-me so-far never.
       I have never gotten any gifts that beautiful before.
    c. Von Grammatikern angeführt werden auch Fälle mit dem Partizip intransitiver Verben. 21
       By grammarians cited are also cases with the participle of intransitive verbs.
       Cases with the participle of intransitive verbs are also cited by grammarians.

(4) a. ?* Ein Linguist geschlafen hat bisher nicht.
          A linguist(nom) slept has so-far not.
    b. ?* Dieser Frau unterlaufen ist ein Fehler noch nie.
          To-this woman(dat) slipped-in is a mistake so-far never.

5.2. Prosodic and Communicative Interpretation of the Data

One first answer to the question on the quality of these groups is that they certainly are prosodic constituents: The words in the Vorfeld form a group of words that does not support a pause in its midst and that obtains as a whole a typical melodic curve, depending on the context in which the sentence is uttered: As an answer to (5a), (2a) is only possible with a falling contour on the Vorfeld. This context makes the Vorfeld the rheme of the sentence and the falling contour is identified as typical rhematic accent 22. Equivalently, when we put the sentence in a context where den Roman gelesen is of thematic character (5b), the Vorfeld can either have a flat prosodic curve, usually associated with non-prominent theme, or it can have a rising pitch accent on the last lexically stressed syllable, used in many languages for contrast and perseverance of a thematic element. We find identical data for questions (6) with (2b) as an answer.

(5) a. Was hat Peter noch nicht getan?
       What hasn’t Peter done yet?
    b. Hat Peter den Roman gelesen?
       Has Peter read the novel yet?

(6) a. Was wäre er gerne?
       What would he like to be?
    b. Wollte er ein berühmter Geiger werden?
       Did he want to become a famous violinist?

These data indicate clearly that the grouping of the elements in the Vorfeld not only has a specific prosodic appearance, but also that this grouping as a whole plays a specific communicative role. This is another answer to our question.

To sum up we use an analysis based on two basic binary features, similar to Choi 1999’s point of view 23: Theme/rheme and prominence/non-prominence. The communicative role of the fronted VP can be thematic or rhematic; if the constituent is thematic, it can be prominent or non-prominent 24; if it is rhematic, it has to be prominent in order to be placed in the Vorfeld. 25

20 The example is from Haider 1985.
21 The examples are from Müller 1999.
22 See Gibbon 1998.
23 Choi uses the terms topic and focus, ending up with non-prominent focus, which sounds to me like defocalized focus. I prefer theme and rheme, while using her binary feature prominence.
24 This is similar to Vallduví 1992 who distinguishes topic and tail.
25 A rough draft of the possible prosodic and communicative values of the Vorfeld constituent can already be found in Drach 1937: He notes that the Vorfeld can either be occupied by the expressive position (Ausdruckstelle) for “semantically non-empty words with a value of emotion or will”, or by “minor information or a connector with given information”. He does not yet explicitly state the possibility of filling the expressive position with given information, i.e. the prominent theme case.

5.3. Pushing the communicative responsibility on the semantic level

The next question is: What really is a communicative structure? It is clear that the prosodic group, the corresponding string, the corresponding domain, and the corresponding part of the syntactic tree can all be said to possess this communicative feature, but where does it really come from, where does it materialize? At which level, in an MTT view of language, is the communicative structure created? I will not be capable of giving a complete answer to the question, and I refer the interested reader to the book on communicative structure Mel’čuk 2001. I just want to give some indication, important for the justification of what follows: Many treatises on prosody analyze the prosodic patterns on a word-string base. They would say: “Den Roman gelesen carries the prosodic theme marking”, and this is in a sense correct, as the string is one realization of the underlying speech act.

The present analysis of the German sentence, however, relies heavily on the existence of these groups in the syntactic dependency tree. In an MTT analysis of language, we have to wonder at which level the structures are instantiated. For this we have to distinguish the fronted constituents in (2) and (3) from the ungrammatical structures of (4), and we have to ask: At which level should we best instantiate the communicative grouping in order to capture easily the existing restrictions on these groupings?

Generally, all constituents can enter the fronted constituent except for subjects as in (4a), which leads to the idea that German non-finite verbs form VPs. There are nevertheless exceptions to this rule: Some NPs with other case marking than nominative are equally difficult to group with the verb, as demonstrated in (4b), where the dative NP and its verbal head cannot form a communicative entity. This arises in cases where the argument plays a very agentive role. Ergative verbs (3a) and verbs in their passive voice (3b) seem to tolerate being hooked to the subject (and deep object). The difficulty for all phrase structure based approaches, like for example HPSG, is that linearization, agreement, and the construction of the predicate argument structure are based on the phrase. For (3a,b) it has to be explained how and where the subject verb agreement is done: Does the whole fronted VP carry the agreement value or do ‘spirits’ 26 carry the information into the fronted VP? Inversely, the optional PP in (3c) has to be assigned the agent’s θ-role of the verb (see Müller 2000).

In the whole, the semantic relation between predicate and noun appears to play a more important role in the restrictions on VP fronting than the nouns’ actual case marking, as Webelhuth 1985 already observed. Unsurprisingly, it seems that when the speaker decides on the communicative grouping of her speech-act (theme/rheme, prominent/non-prominent), restrictions apply that rely on semantic information. (3c) shows that it does not suffice to simply block all communicative groupings of agent and predicate, but we only need one specific rule in the syntax-semantic interface for capturing the phenomenon:

(A) The absence of an agentive argument as well as the communicative grouping of the agentive argument with its predicate both trigger the passive construction. 27

All this to conclude that our language model should place the emergence of the communicative grouping at the semantic level of representation (or even higher), because at this level the restrictions are easy to capture. The semantic module provides the correspondence between this semantic structure and a surface syntactic dependency tree. We are not concerned here with the detailed description of this module; however, when declaring that the restrictions on VP fronting are a semantic problem, we are obliged to show that the burden we put on the module is not too heavy.
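The two-feature analysis of section 5.2 and rule (A) are simple enough to be written down as two small functions. This is a sketch under stated assumptions: the function names `vorfeld_contour` and `triggers_passive`, the string labels for roles and contours, and the reduction of rule (A) to two boolean inputs are all introduced here for illustration and are not part of the formalism itself.

```python
from typing import Optional


def vorfeld_contour(role: str, prominent: bool) -> Optional[str]:
    """Map the communicative role of a fronted group to the contour described in 5.2.

    Returns None when the combination cannot be placed in the Vorfeld.
    """
    if role == "rheme":
        # A rhematic group has to be prominent to be placed in the Vorfeld.
        return "falling" if prominent else None
    if role == "theme":
        # Prominent theme: rising pitch accent; non-prominent theme: flat curve.
        return "rising" if prominent else "flat"
    raise ValueError(f"unknown communicative role: {role!r}")


def triggers_passive(has_agentive_argument: bool, agent_grouped_with_predicate: bool) -> bool:
    """Rule (A): no agentive argument, or an agentive argument grouped with its predicate."""
    return (not has_agentive_argument) or agent_grouped_with_predicate


# (2a) as an answer to (5a): rhematic, prominent Vorfeld.
print(vorfeld_contour("rheme", True))
# (2a) as an answer to (5b): thematic Vorfeld, with or without prominence.
print(vorfeld_contour("theme", False), vorfeld_contour("theme", True))
# (3c): the agent is grouped with the participle, so the passive is triggered.
print(triggers_passive(True, True))
```

Keeping the two checks separate mirrors the division of labour argued for in the text: the contour is a matter of the prosodic realization, while rule (A) belongs to the syntax-semantics interface.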

²⁶ Meurers 1999: Raising Spirits (and assigning them case).
²⁷ There are, of course, other triggers of the passive construction, like e.g. discourse continuity, that are of no concern in this paper. I am unsure though which communicative, semantic, or discursive feature distinguishes the two surface realizations of Figure 1.

5.4. What's left for the syntactic module?

In the (still unordered) surface dependency tree, agreement can be carried out independently of the subsequent actual surface order. Thus, our syntactic module, the linking of surface syntactic dependency and topological hierarchy, does not have to worry about the restrictions on the formation of embedded VPs. In the direction of synthesis, the module generates the possible word orders, which transform the given groupings into embedded domain structures, while the analysis reports the encountered grouping into the surface syntactic structure. It remains the duty of the semantic module to refuse the ungrammatical structures of the sentences in (4). This corresponds well to our intuition that the ungrammaticality of these sentences is of a different nature than, for example, the agreement clash in the ungrammatical variants of (3a) and (3b); it seems less clear that the sentences in (4) are really ungrammatical, it rather seems difficult to guess what the speaker wants to say.

Figure 5: Schematic representations corresponding to grammatical sentences. Left: active diathesis. Right: passive diathesis. (Panels: one shared semantic graph for LESEN 'read' with time past, agent PETER, and undergoer ROMAN 'novel'; on each side a surface syntax tree and a topological tree, linked by the proposed module.)

As the analysis of the sentences reaches the syntactic module, where agreement was successfully checked, we will call sentences like those in (4) syntactically well-formed and semantically defective. Equally, we were able to assign a topological structure to the ungrammatical versions of (3a) and (3b); the clash arises when agreement is checked on the syntactic level. We will call these sentences topologically well-formed and syntactically defective.
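The terminology just introduced can be read as a staged check: a sentence is named after the last level it passes and the first level it fails. The following is only a minimal sketch of that reading, not part of the formalism; the function name `classify` and the dictionary encoding of the three levels are assumptions made for illustration.

```python
# Hypothetical illustration: the levels are checked in this order, and a
# sentence is labelled by the last level passed and the first level failed.
LEVELS = ("topological", "syntactic", "semantic")
ADVERB = {
    "topological": "topologically",
    "syntactic": "syntactically",
    "semantic": "semantically",
}

def classify(passed):
    """Return a label for a sentence, given which levels it passes.

    `passed` maps a level name to True/False; a missing level counts as failed.
    """
    previous = None
    for level in LEVELS:
        if not passed.get(level, False):
            if previous is None:
                return f"{ADVERB[level]} defective"
            return f"{ADVERB[previous]} well-formed, {ADVERB[level]} defective"
        previous = level
    return "grammatical"

# Sentences like those in (4): agreement is fine, the semantic module refuses.
print(classify({"topological": True, "syntactic": True, "semantic": False}))
# Ungrammatical variants of (3a) and (3b): agreement clash at the syntactic level.
print(classify({"topological": True, "syntactic": False}))
```

Under this reading, a sentence that already fails at the first level is simply called topologically defective.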

5.5. Examples of correspondences

Accordingly, I advocate analyzing a grammatical German sentence as a compositional correspondence between at least three representations with specific well-formedness conditions. Figure 5 shows the simplified structures for two grammatical sentences given as written text. These sentences correspond in our analysis to the same semantic representation. The reason is that our semantic representation is much simplified and not fine-grained enough to capture the choice of the diathesis. Equally, our syntactic representation does not get hold of all word order variation inside one field, and two surface orders can correspond to the same syntactic representation.

As written text contains no indication of the prosodic structure of the Vorfeld constituent, its communicative features remain underspecified among the possible values a Vorfeld constituent can get (non-prominent theme (T), prominent theme (Tp), prominent rheme (Rp)). An input from the prosodic module (a speech analyzer) would specify the melodic pattern, and the value of the communicative feature could be instantiated.²⁸

²⁸ I believe that the fields and domains in the topological hierarchy of the German sentence are reflections of prosodic groupings involved in the linearization process. The topological analysis is in a sense a compromise to capture word order with a prosodic tool. In fact, the Mittelfeld and the Nachfeld are certainly not prosodic units. We could revolutionize the topological model and stipulate for example the replacement of the Mittelfeld by finer grained fields of precise communicative and prosodic value, corresponding to what can happen as melodic scheme in the Mittelfeld: the field for the prominent theme, the field for Rp, the field for the theme, and for the rheme. However, for the moment we want to build an analyzer that analyzes strings and words without prosodic information. The partition of the Mittelfeld would mainly lead to a great amount of ambiguities on the topological level. So we have to stick to what is observable in written text: the verbal brackets, and a heap of many different structures in between.

Figure 6: Left: Schematic representation of a syntactically correct and semantically defective sentence (*Peter gelesen hat den Roman). Right: Schematic representation of a grammatical sentence with passive diathesis (Von Peter gelesen wurde der Roman). (Panels: one shared semantic graph for LESEN 'read' with time past, agent PETER, and undergoer ROMAN 'novel'; on each side a surface syntax tree and a topological tree, linked by the proposed module.)

The left-hand analysis of Figure 6 shows a semantically defective sentence. We can construct a topological phrase structure and we can transform this structure into a syntactic structure. The semantic module, however, fails, as it has to transfer a grouping of verb and subject into a grouping of predicate and agent, which is forbidden by rule (A). This grouping of predicate and agent is possible with the passive construction, as is shown in the derivation on the right-hand side.
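To make the effect of rule (A) on the two analyses of Figure 6 concrete, here is a minimal sketch of the check the semantic module would have to perform. It is an illustration under stated assumptions, not the definition of the semantic module: the class `SemanticInput`, the functions `requires_passive` and `semantic_module_accepts`, and the encoding of a communicative grouping as a set of semantemes are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

@dataclass(frozen=True)
class SemanticInput:
    """Hypothetical, much simplified view of a semantic graph."""
    predicate: str
    agent: Optional[str]
    undergoer: Optional[str]
    # Semantemes that form one communicative group (e.g. a fronted VP).
    grouping: FrozenSet[str] = frozenset()

def requires_passive(sem: SemanticInput) -> bool:
    """Rule (A): no agentive argument, or agent grouped with its predicate."""
    if sem.agent is None:
        return True
    return {sem.predicate, sem.agent} <= sem.grouping

def semantic_module_accepts(sem: SemanticInput, diathesis: str) -> bool:
    """Reject an active realization whenever rule (A) triggers the passive.

    Other triggers of the passive are not modelled, so a passive is
    never rejected by this sketch.
    """
    if requires_passive(sem):
        return diathesis == "passive"
    return True

# Figure 6: agent PETER and predicate LESEN form one communicative group.
fronted = SemanticInput("LESEN", "PETER", "ROMAN", frozenset({"LESEN", "PETER"}))
print(semantic_module_accepts(fronted, "active"))   # left-hand analysis
print(semantic_module_accepts(fronted, "passive"))  # right-hand analysis
```

The sketch only encodes the direction stated in rule (A); it deliberately says nothing about when a passive is excluded.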

Figure 7: Schematic representations corresponding to a syntactically defective sentence: agreement problem (*Peter haben den Roman gelesen). (Panels: surface syntax tree and topological tree, linked by the proposed module; no semantic graph is built.)

Figure 8: Schematic representations corresponding to a syntactically defective sentence: valency problem (*Des Mannes hat den Roman gelesen). (Panels: parts of a surface syntax tree and a topological tree, linked by the proposed module; no semantic graph is built.)

Figures 7 and 8 show syntactically defective sentences. Nonetheless, they are topologically well-formed, for the agreement problem of Figure 7 does not prevent the syntactic module from producing a topological phrase structure and even from constructing a syntactic dependency tree out of it. It is the well-formedness condition of the syntactic level that will check the subject verb agreement.

The case of Figure 8 is different: Since the genitive NP is not a syntactic argument of the auxiliary, we cannot establish a syntactic relation between them. Either we say that in this case a syntactic structure cannot be created, or we conclude that the unconnected parts are a syntactic structure that does not fulfill the well-formedness condition of connectedness.²⁹

Figure 9 shows a topologically defective sentence. Den Roman and Peter cannot create a new domain that could offer a landing site for both of them in the Vorfeld, and a connected topological tree cannot be constructed.

At this point, we have seen a sufficient number of illustrations of the syntactic module at work, the next step being the formalization of the corresponding algorithm.
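The two syntactic well-formedness conditions at stake in Figures 7 and 8, subject verb agreement and connectedness, can be sketched as simple checks on an unordered dependency structure. This is only an illustration: the triple encoding of dependencies, the feature names `nb` and `pers`, and the reduced trees below are my simplified reading of the figures, not the formal definition.

```python
def agreement_ok(features, edges):
    """Every 'subject' dependency must agree in number (nb) and person (pers)."""
    for head, relation, dependent in edges:
        if relation == "subject":
            for feat in ("nb", "pers"):
                if features[head][feat] != features[dependent][feat]:
                    return False
    return True

def connected(nodes, edges):
    """True if the dependencies link all nodes into one single structure."""
    if not nodes:
        return True
    neighbours = {n: set() for n in nodes}
    for head, _relation, dependent in edges:
        neighbours[head].add(dependent)
        neighbours[dependent].add(head)
    seen, stack = set(), [nodes[0]]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(neighbours[node] - seen)
    return seen == set(nodes)

# Figure 7 (agreement problem): plural auxiliary with a singular subject.
fig7_nodes = ["haben", "Peter", "gelesen", "den Roman"]
fig7_edges = [("haben", "subject", "Peter"),
              ("haben", "pp", "gelesen"),
              ("gelesen", "obj", "den Roman")]
fig7_features = {"haben": {"nb": "pl", "pers": 3},
                 "Peter": {"nb": "sg", "pers": 3}}

# Figure 8 (valency problem): the genitive NP is not attached to anything.
fig8_nodes = ["hat", "gelesen", "den Roman", "des Mannes"]
fig8_edges = [("hat", "pp", "gelesen"),
              ("gelesen", "obj", "den Roman")]

print(agreement_ok(fig7_features, fig7_edges), connected(fig7_nodes, fig7_edges))
print(connected(fig8_nodes, fig8_edges))
```

Because both checks operate on the unordered tree, neither depends on the surface order of the words.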

Figure 9: Schematic representations corresponding to a topologically defective sentence. (No syntactic tree is built; the proposed module yields only parts of a topological tree, for hat gelesen 'have read' and for the two Vorfeld candidates den Roman 'the novel' and Peter.)

²⁹ For the sentence of Figure 8, the situation would be different without the genitive NP. The syntactic structure would be connected and well-formed since no agreement feature can clash. The semantic module however will remark the lack of an agentive argument of the LESEN 'read' semanteme. This view allows as well an elegant description of constructions with 'subjectless' verbs like in (i): For the third person singular form hat 'has' the subject remains optional. The semantic module passes a nominative argument into the agent position only if it agrees in number and person.
(i) Mir hat gegraut.
    To me (dat.) has dreaded.
    'I dreaded.'

6 Giving German a TUG

In what follows, I introduce a new lexicalized tree grammar in the TAG family based on superposing and unifying tree structures. We call the formalism Tree Unification Grammar (TUG).

In the preceding sections I defined what the algorithm of the syntactic module is supposed to perform: taking a string of words, building a topological phrase structure on it, and building compositionally in parallel a surface syntactic dependency tree. In addition, the grouping of lexical heads into one topological domain (with its possible communicative feature value) should be passed on, and marked on the dependency tree. We would like to perform the task of building the topological phrase structure with a simple combination procedure of lexicalized tree chunks, taking Tree Adjoining Grammars as a model. Moreover, the construction has to be compositional in the sense that the successful combination of two lexical entries on the topological level should find its immediate reflection in the surface syntactic dependency tree.

TUG is principally a lexicalization of the algorithm for the topological phrase structure analysis of German presented in Gerdes and Kahane 2001a. It borrows notions and ideas from TAGs and their relatives: DTGs (Rambow et alii 1995) address the same problem of the so-called long distance dependencies that are not peripheral (like wh-extraction, which TAGs can handle) but in the middle of the phrase structure (scrambling). The AES of Alexis Nasr 1996 places for the first time the lexical tree grammar in a Meaning-Text framework, and bubble grammars and GAGs (Kahane 1997, Candito and Kahane 1998b), just as TUGs, address the problem of the link between a dependency grammar and an ordered phrase structure. An attempt to describe a complete Meaning-Text grammar lexically can be found in Kahane 2001. The task of TUGs is merely to serve as a correspondence module between syntactic dependency and topological phrase structure:

6.1. The definition

Let V be an alphabet, let D ∈ V be a distinguished letter, let W be the set of words. We call tree nodes atoms if they are distinguished by a label L out of V and by a binary color feature. This feature can take the value full (i.e.