The Lexicon-Grammar of Portuguese Predicate Nouns

NooJ International Conference, Palermo, Italy, 20-22 June 2018

THE LEXICON-GRAMMAR OF PORTUGUESE PREDICATE NOUNS WITH SER DE IN PORT4NOOJ

Cristina Mota (1), Jorge Baptista (1,2), Anabela Barreiro (1)

(1) INESC-ID, Lisbon
(2) Universidade do Algarve

Port4NooJ 3.1 / eSPERTo Smart Paraphrasing System

Port4NooJ Genesis, 2009: OpenLogos
✓  Bilingual PT-EN
✓  Morpho-syntactic Relations
✓  Semantico-syntactic Properties (SAL)

✓  Derivational Relations
✓  Support Verb Constructions
✓  Semantic Relations
✓  SentiLex 2016
✓  Stencil NER 2016

Lexicon-grammars (LG):
•  LG of Human Intransitive Adjectives, 2015
•  LG of Support Verb fazer, 2017
•  LG of Support Verb ser de, 2018

eSPERTo Paraphrasing Applications: QA and Summarization Applications

•  eSPERTo Paraphrasing
•  Edgar, Virtual QA Agent (Fialho et al. 2013)
•  SSNT Summarization (Ribeiro 2011)

eSPERTo Interface
https://esperto.l2f.inesc-id.pt/esperto/esperto/demo.pl

Interactive language learning application that helps learners produce and revise their texts

Predicate nouns with Vsup ser de

•  Lexicon-grammar of 2,085 predicate nouns which co-occur in constructions with the support verb ser de ('be of') (Baptista, 2005)
   –  Many correspond to adjective predicative constructions; in those cases they are linked to the corresponding adjective in the LG table
   –  Classified into 9 classes according to:
      •  the number of arguments (1 or 2) selected by the predicate noun
      •  the syntactic (sentential/nominal) constraints
      •  the distributional (semantic) selection constraints on the nominal argument slots (human/non-human)
      •  Two special classes were established for:
         –  nouns selecting a body-part noun as their subject
         –  symmetrical constructions

•  The process of integrating the LG of predicates with Vsup ser de into Port4NooJ was very similar to that of integrating the LG of predicates with Vsup fazer

Predicate nouns with Vsup ser de: Classification Criteria

Ser de: 9 classes, by construction (1 or 2 arguments) and by constraint on the argument slots

Constraint             N0 ser de N              N0 ser de N Prep N1
N0 =: Nhum             SdH1   328 (19%)         SdH2   54 (3%)
N0 =: Nnhum            SdNH1  363 (17%)         SdNH2  30 (1%)
N0 =: Que F0           SdQ0   820 (39%)         SdQ1   308 (15%)
N0 =: Npc de Nhum      SdNPC  30 (1%)
N1 =: Que F                                     SdQ2   37 (2%)
N1 ser de N Prep N0                             SdSIM  55 (3%)

Transformations based on noun predicates with Vsup ser de

Negation: paraphrases
•  [in-N]
   O Pedro é de uma certa intolerância à lactose
   'Pedro is of a certain intolerance to lactose'
•  [falta de N] ([lack of N])
   O Pedro é de uma certa falta de tolerância à lactose
   'Pedro is of a certain lack of tolerance to lactose'

Vsup Substitution: paraphrases
•  [Vsup=ser de]
   A Ana é de uma alegria contagiante
   'Ana is of a contagious happiness'
•  [Vsup=ter]
   A Ana tem uma alegria contagiante
   'Ana has a contagious happiness'
•  [Vsup=haver]
   Há na Ana uma alegria contagiante
   'There is a contagious happiness in Ana'
•  [Vsup=ser de]
   A Ana é de uma alergia ao pó impressionante
   'Ana is of an impressive allergy to dust'
•  [Vsup=faz]
   A Ana faz uma alergia ao pó impressionante
   'Ana makes an impressive allergy to dust'
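The support-verb substitution can be illustrated with a small string-template sketch. This is not how eSPERTo does it (the deck describes NooJ grammars); the function `substitute_vsup`, its parameters and the contraction table are hypothetical, and the sketch only covers third-person singular present forms. The feature names `Vsupter=ter` and `Vsuphaver=haver` are the ones that appear in the dictionary entries shown in this deck.

```python
from typing import Iterable, Optional

# Contractions of the preposition "em" with the definite article (standard Portuguese).
EM_CONTRACTIONS = {"o": "no", "a": "na", "os": "nos", "as": "nas"}

# Dictionary feature that must be present for each alternative support verb.
# Feature names are taken from the entries in the deck; the lookup itself is an assumption.
LICENSING_FEATURE = {"ter": "Vsupter=ter", "haver": "Vsuphaver=haver"}


def substitute_vsup(det: str, subject: str, noun_phrase: str,
                    vsup: str, features: Iterable[str]) -> Optional[str]:
    """Rebuild the sentence with another support verb, or return None
    when the predicate noun's entry does not license that verb."""
    features = set(features)
    if vsup != "ser de" and LICENSING_FEATURE.get(vsup) not in features:
        return None
    if vsup == "ser de":
        return f"{det.capitalize()} {subject} é de {noun_phrase}"
    if vsup == "ter":
        return f"{det.capitalize()} {subject} tem {noun_phrase}"
    # Remaining case is "haver": the subject becomes a locative "em + det" phrase.
    return f"Há {EM_CONTRACTIONS[det]} {subject} {noun_phrase}"


entry_features = ["Npred", "Vsup=ser", "Vsupter=ter", "Vsuphaver=haver"]
for verb in ("ser de", "ter", "haver"):
    print(substitute_vsup("a", "Ana", "uma alegria contagiante", verb, entry_features))
```

Driving the substitution from the entry's features mirrors the idea that each paraphrase is available only when the LG table marks it for that noun.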

Integration of LG of Vsup ser de
–  Major challenges
   •  50% of the predicate nouns already exist in Port4NooJ
      ➢  Old news: this has been partly addressed since we started integrating the LG with Vsup fazer
         -  consolidate information from the old entry and the LG table
         -  the solution is far from perfect
         -  it needs thorough revision

   •  In 55% of the cases where the predicate noun has an equivalent adjectival construction, the adjective is a homograph of a human intransitive adjective (HIA) already formalized in the LG of HIA
      ➢  New problem! Adjectives equivalent to predicate nouns are being treated by derivation. We are not yet sure how to harmonize those derived entries with the HIA entries.

Integration of LG of PT Vsup ser de
–  From LG tables to NooJ dictionaries
   •  Mostly done automatically with various scripts, different from the scripts used to integrate the tables of Human Intransitive Adjectives

Inputs:
•  LG tables
•  Port4NooJ
   –  Current version (CV)
   –  Version before removing the Npred that derive from verbs (OV)

Steps:
✓  Check if noun exists in Port4NooJ:
   ✓  If noun is neither in CV nor in OV
      –  Create new entry
   ✓  If noun in CV and (not in OV or CV=OV entry)
      –  Merge1 the LG properties with current entry
   ✓  If noun in OV only or (CV≠OV entry) then
      –  Merge2 the LG properties with old entry
      –  Remove nominalization from CV
✓  Create FLX and DRV codes and corresponding rules as needed
✓  Check for missing FLX and DRV codes

Output: npred_vsupserde
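The three-way decision above can be written as a small function. This is a minimal sketch, not the authors' script: the dictionaries `cv` and `ov` (lemma mapped to a list of entry strings) and the names `Action` and `choose_action` are hypothetical.

```python
from enum import Enum
from typing import Dict, List


class Action(Enum):
    CREATE = "create new entry"
    MERGE1 = "merge LG properties with current entry"
    MERGE2 = "merge LG properties with old entry, remove nominalization from CV"


def choose_action(noun: str,
                  cv: Dict[str, List[str]],
                  ov: Dict[str, List[str]]) -> Action:
    """Pick the integration step for one LG noun, following the slide's conditions."""
    in_cv, in_ov = noun in cv, noun in ov
    if not in_cv and not in_ov:
        return Action.CREATE
    if in_cv and (not in_ov or cv[noun] == ov[noun]):
        return Action.MERGE1
    # Remaining cases: noun in OV only, or present in both with different entries.
    return Action.MERGE2


# Toy data built from the example entries shown later in the deck.
cv = {
    "aprumo": ["aprumo,N+FLX=ANO+AB+prop+EN=aplomb"],
    "avidez": ["avidez,N+FLX=LUZ+AB+qual+EN=avidity",
               "avidez,N+FLX=LUZ+AB+qual+EN=greed"],
}
ov = {
    "aprumo": ["aprumo,N+FLX=ANO+AB+prop+EN=aplomb"],
    "avidez": ["avidez,N+FLX=LUZ+AB+strvb+Npred+Nom+EN=acquisitiveness+VRB=ansiar"],
    "capricho": ["capricho,N+FLX=ANO+AB+strvb+Npred+Nom+EN=caprice+VRB=caprichar"],
}

for noun in ("airosidade", "aprumo", "avidez", "capricho"):
    print(noun, "->", choose_action(noun, cv, ov).name)
```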

Integration of LG of PT Vsup ser de
–  From LG tables to NooJ dictionaries
   •  Representation of LG table properties: +Det…, +Vsup…

+Npred+Vsup=ser+Table=SdH1
+N0Nhum+PfxNeg=in+Vsupter=ter
+VsupteroNdeVinf0w=ter

Grammars take care of de

Integration of LG of PT Vsup ser de
–  From LG tables to NooJ dictionaries
   •  Representation of LG table properties: +Prep…, +N1…, +DRV=N2A5:ALTO

•  The DRV code is determined and formalized automatically by finding the radical shared by the noun and its related verb or adjective (the pairs are listed in a separate file)
   activ(idade) => N2A5= o/A

•  The FLX code of the derived word is determined by consulting Port4NooJ
   activo,A+FLX=ALTO+AV+state+EN=brisk+DRV=AVDRV01:RAPIDAMENTE

   If the derived form does not exist, then its code is assigned automatically
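Finding the radical can be sketched as a longest-common-prefix computation. This is an illustration under assumptions: the function `derivation_rule` and its return shape are hypothetical, and the sketch does not reproduce the actual DRV rule syntax or the numbering of codes such as N2A5.

```python
import os
from typing import Tuple


def derivation_rule(noun: str, derived: str) -> Tuple[str, int, str]:
    """Return (radical, number of characters to strip from the noun,
    suffix to append) so that radical + suffix == derived."""
    # os.path.commonprefix compares character by character, so it works on plain words.
    radical = os.path.commonprefix([noun, derived])
    strip_count = len(noun) - len(radical)
    suffix = derived[len(radical):]
    return radical, strip_count, suffix


# Noun/adjective pair from the slide: activ(idade) -> activo
print(derivation_rule("actividade", "activo"))
```

Pairs that yield the same strip count and suffix could then share one derivational paradigm, which is one plausible reading of how codes get reused across nouns.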

Integration of LG of PT Vsup ser de
–  From LG tables to NooJ dictionaries
   •  Integration with eSPERTo dictionary entries

①  Noun not in Port4NooJ (old or current):
   ✓  Create new entry:
      ✓  FLX code is assigned automatically given the ending of the word
      ✓  Entries are checked for missing FLX codes and reviewed by a linguist
      ✓  All other properties come from the LG table
   ✓  Add entry to new standalone dictionary npred_vsupserde.dic

airosidade,N+FLX=CASA+Npred+Vsup=ser+Table=SdH1+N0Nhum+N0Npabst+N0Nclasspessoa+DetEModif+DetUMModif+Vsupter=ter+VsupteroNdeVinf0w=ter+DRV=N2A5:ALTO

Integration of LG of PT Vsup ser de
–  From LG tables to NooJ dictionaries
   •  Integration with eSPERTo dictionary entries

②  Noun exists both in current and old Port4NooJ
   A.  If entries are the same do Merge 1:
       ✓  Blindly add additional properties as specified by the LG tables to current entries
       ✓  Add merged entries to npred_vsupserde.dic

Current entry:
aprumo,N+FLX=ANO+AB+prop+EN=aplomb

Properties added from the LG table:
+Npred+Vsup=ser+Table=SdH1+Negfaltade+N0Nhum+N0Npc+N0Npabst+N0Nclasspessoa+DetEModif+DetUMModif+Vsupter=ter+VsupteroNdeVinf0w=ter+Vsuphaver=haver+DRV=N2A16:ALTO

Merged entry:
aprumo,N+FLX=ANO+AB+prop+EN=aplomb+Npred+Vsup=ser+Table=SdH1+Negfaltade+N0Nhum+N0Npc+N0Npabst+N0Nclasspessoa+DetEModif+DetUMModif+Vsupter=ter+VsupteroNdeVinf0w=ter+Vsuphaver=haver+DRV=N2A16:ALTO
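Merge 1 amounts to appending the LG properties to the existing entry without any filtering. A minimal sketch follows, assuming entries are plain `lemma,CAT+feature+...` strings with no escaped `+` or `,` characters; `parse_entry` and `merge1` are hypothetical helper names.

```python
from typing import List, Tuple


def parse_entry(entry: str) -> Tuple[str, str, List[str]]:
    """Split 'lemma,CAT+f1+f2' into (lemma, category, [features])."""
    lemma, rest = entry.split(",", 1)
    category, *features = rest.split("+")
    return lemma, category, features


def merge1(entry: str, lg_properties: List[str]) -> str:
    """Blindly append the LG properties to the current entry (no de-duplication)."""
    return "+".join([entry, *lg_properties])


current = "aprumo,N+FLX=ANO+AB+prop+EN=aplomb"
lg_props = ["Npred", "Vsup=ser", "Table=SdH1", "Negfaltade", "N0Nhum", "N0Npc",
            "N0Npabst", "N0Nclasspessoa", "DetEModif", "DetUMModif", "Vsupter=ter",
            "VsupteroNdeVinf0w=ter", "Vsuphaver=haver", "DRV=N2A16:ALTO"]

merged = merge1(current, lg_props)
print(parse_entry(current))
print(merged)
```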

Integration of LG of PT Vsup ser de
–  From LG tables to NooJ dictionaries
   •  Integration with eSPERTo dictionary entries

②  Noun exists both in current and old Port4NooJ
   B.  If entries are not the same do Merge 2 with old entries as shown in case 3:
       ✓  Remove previous Npred related properties
       ✓  Blindly add additional properties as specified by the LG tables to old entries
       ✓  Add merged entries to npred_vsupserde.dic
       ✓  Remove nominalization from CV

Entries in CV:
avidez,N+FLX=LUZ+AB+qual+EN=avidity
avidez,N+FLX=LUZ+AB+qual+EN=greed

Entries in OV:
avidez,N+FLX=LUZ+AB+strvb+Npred+Nom+EN=acquisitiveness+VRB=ansiar

Properties added from the LG table:
+Npred+Vsup=ser+Table=SdQ0+N0Nhum+N0NpreddeN+N0NopQueF+N0RestrNopQueF+N0QueFconj+N0OfactodeVinf0w+N0N0Vinf0w+N0RestrVinf0w+N0Nclass+N0Nclasspessoa+DetEModif+DetUMModif+Vsupter=ter+Vsuphaver=haver+DRV=N2A18:ALTO

Merged entry:
avidez,N+FLX=LUZ+AB+strvb+Npred+Nom+EN=acquisitiveness+VRB=ansiar+Npred+Vsup=ser+Table=SdQ0+N0Nhum+N0NpreddeN+N0NopQueF+N0RestrNopQueF+N0QueFconj+N0OfactodeVinf0w+N0N0Vinf0w+N0RestrVinf0w+N0Nclass+N0Nclasspessoa+DetEModif+DetUMModif+Vsupter=ter+Vsuphaver=haver+DRV=N2A18:ALTO

Integration of LG of PT Vsup ser de
–  From LG tables to NooJ dictionaries
   •  Integration with eSPERTo dictionary entries

③  Noun exists only in old Port4NooJ
   ✓  Do Merge 2 with old entries as shown in Case 2-B:
      ✓  Remove previous Npred related properties
      ✓  Blindly add additional properties as specified by the LG tables to old entries
      ✓  Add merged entries to npred_vsupserde.dic
      ✓  Remove nominalization from CV

Old entry:
capricho,N+FLX=ANO+AB+strvb+Npred+Nom+EN=caprice+VRB=caprichar

After removing the previous Npred related properties:
capricho,N+FLX=ANO+AB+strvb+EN=caprice

Properties added from the LG table:
+Npred+Vsup=ser+Table=SdH1+N0Nhum+N0Npabst+N0Nclasspessoa+DetE+Vsupter=ter+VsupserumNdpdNhum=ser+Vsuphaver=haver+DRV=N2A25:ALTO

Merged entry:
capricho,N+FLX=ANO+AB+strvb+EN=caprice+Npred+Vsup=ser+Table=SdH1+N0Nhum+N0Npabst+N0Nclasspessoa+DetE+Vsupter=ter+VsupserumNdpdNhum=ser+Vsuphaver=haver+DRV=N2A25:ALTO
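Merge 2 differs from Merge 1 in that the old nominalization properties are dropped first. The sketch below assumes that the "previous Npred related properties" are `Npred`, `Nom` and any `VRB=...` feature, which is only inferred from the capricho example; `merge2` is a hypothetical name and the entry format is assumed to contain no escaped `+` characters.

```python
from typing import List

# Assumption: inferred from the capricho example, not stated in the deck.
NPRED_RELATED_FLAGS = {"Npred", "Nom"}
NPRED_RELATED_PREFIXES = ("VRB=",)


def merge2(old_entry: str, lg_properties: List[str]) -> str:
    """Strip the old nominalization properties, then append the LG properties."""
    head, *features = old_entry.split("+")  # head is 'lemma,CAT'
    kept = [f for f in features
            if f not in NPRED_RELATED_FLAGS
            and not f.startswith(NPRED_RELATED_PREFIXES)]
    return "+".join([head, *kept, *lg_properties])


old = "capricho,N+FLX=ANO+AB+strvb+Npred+Nom+EN=caprice+VRB=caprichar"
lg_props = ["Npred", "Vsup=ser", "Table=SdH1", "N0Nhum", "N0Npabst", "N0Nclasspessoa",
            "DetE", "Vsupter=ter", "VsupserumNdpdNhum=ser", "Vsuphaver=haver",
            "DRV=N2A25:ALTO"]

print(merge2(old, lg_props))
```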

Integration of LG of PT Vsup ser de
–  From LG to NooJ grammars
   •  New grammars to paraphrase constructions based on specific LG properties, such as the paraphrase of negative constructions
      ➢  Use of more than one LG property: PfxNeg & Negfaltade

Input:
O Pedro é de uma grande falta de sinceridade
'Pedro is of a great lack of sincerity'

Paraphrases:
'Pedro is not of a great sincerity'
'Pedro is of a great insincerity'

Integration of LG of PT Vsup ser de
–  From LG to NooJ grammars
   •  New grammars to paraphrase constructions based on specific LG properties, such as when restructuring with a possessive pronoun
      ➢  Unidirectional paraphrase: rephrasing the possessive with the appropriate noun phrase needs a larger context and a more complex analysis

Input:
(Fazer isso) é do interesse do Pedro
'(To do this) is of interest to Pedro'

Paraphrase:
'(To do this) is of his interest'

Integration of LG of PT Vsup ser de
–  From LG to NooJ grammars
   •  New grammars to paraphrase constructions based on specific LG properties that also exist in other lexicon-grammars, such as the substitution of the support verb by another verb
      ➢  Likely to be common to the three lexicon-grammars, or at least to have shared sub-graphs

Input:
O Pedro é de um certo altruísmo
'Pedro is of a certain altruism'

Preliminary Results

•  2132 predicate nouns with Vsup ser de (1376 different noun lemmas)
   –  An additional 797 entries await revision of the inflectional codes of their derived adjectives, or have format problems that must be fixed, before they can be added to the final dictionary

•  450 new derivational paradigms, but there might be overlap with paradigms created when integrating the LG of Vsup fazer

•  Example grammars for the syntactic parser

•  Half of the nouns already existed in Port4NooJ (50%) → 6% increase in nominal entries and 20% increase in predicate nouns

Table   In Port4NooJ   New    % In   Example
SdH1    183            208    47%    O Zé é de uma alegria contagiante
SdH2    41             14     75%    O Zé é da confiança da Ana
SdNH1   153            211    42%    Este molho é de uma acidez exagerada
SdNH2   16             15     52%    Esta substância é de uma total indissolubilidade em água
SdNPC   7              24     23%    O rosto da Ana era de uma palidez doentia
SdQ0    309            512    38%    Essa medida é de grande abrangência
SdQ1    162            147    52%    O Zé foi de uma agressividade desproporcionada para com a Ana
SdQ2    22             16     58%    O Zé é de uma grande habilidade para tratar das roseiras
SdSIM   34             22     61%    O Zé e a Ana são de um companheirismo exemplar
Total   927            1169   50%

Next Steps

•  Consolidate Dictionaries
•  Review, Review, Review
•  Build LG Paraphrasing Grammars
•  Integrate new LG

Thank you! Grazie!
