Ellipses resolution in WH-constructions in Pashto

Mushtaq Ali, Mohammad Abid Khan, Rahman Ali

Department of Computer Science, University of Peshawar.

Abstract—Constructions in Pashto that consist of a question involving a WH-word equivalent and an answer or response to that question are called WH-constructions. In most cases, both written and spoken WH-constructions contain ellipses. This paper presents the first rule-based algorithm for ellipses resolution in WH-constructions in Pashto text, together with the factors involved in it. The algorithm resolves ellipses occurring in the answers to WH-questions by using the morphological and syntactic information of both the question and its corresponding answer. It handles answers to questions containing WH-words such as څوک (Who-Dir), چا (Who-Obl), څه (What), کوم (Which), چرته (Where) and څومره (How much). The algorithm has been tested on real-world Pashto text taken from various genres such as books, novels, newspapers and online magazines, and achieved an accuracy of 76%.

Keywords—Ellipses; WH-construction; Elliptical construction; Elliptical clause; Antecedent clause.

I. INTRODUCTION

Ellipses resolution is an important research area. All natural languages exhibit ellipses in both text and speech, although the patterns differ from one language to another. Ellipsis is the omission of a portion of a phrase or a sentence [3]. Ellipsis is “The omission of a word or phrase necessary for a complete syntactical construction but not necessary for understanding” [7]. “The antecedent or source clause is complete, whereas the target clause is missing (or contains only vestiges of) material found overtly in the source” [8]. Ellipses are the non-expression of one or more sentence elements whose meaning can be reconstructed either from the context or from a person’s knowledge of the world. Linguists follow different approaches for their resolution, as described by Shalom Lappin [2]. Some of the types of ellipses are:

1. Noun phrase ellipses
2. Verb phrase ellipses
3. Gapping
4. Stripping
5. Sluicing
6. Ellipses in WH-constructions
7. Ellipses in Q-constructions

Ellipses that occur within a single sentence (where the antecedent clause, or a-clause, and the elliptical clause, or e-clause, are in the same sentence) are called intrasentential ellipses, while ellipses that span multiple sentences (where the a-clause is in one sentence and the e-clause in another) are called intersentential ellipses.

“Questions are basic to the give and take of social interaction. In fact, on the average one asks, answers, or at least hears a question once in every 40 words of conversation (In conversation about half of all questions are elliptical). WH-questions are asked when special information is unknown or unclear, such as the location, the participants, the object, the reason, the manner, or the time. In these cases, an appropriate question word (where, who, what, why, how, or when respectively) is placed at the beginning of the sentence. As most of these question-words begin with WH, they are named WH-questions” [5].

A. WH-Constructions

A construction consisting of a question involving a WH-word and an answer or response to that question is called a WH-construction. In most cases, the answers to such questions contain ellipses. If the answering sentence contains an elliptical clause, its corresponding antecedent clause has to be found among the constituents of the questioning sentence.

B. WH-Constructions in Pashto

The pronoun څوک (“Who” in English) is used for animates, څه/څشى (“What” in English) is used for inanimates, and کوم (“Which” in English) is used for both animates and inanimates. A noun is the element of the language from whose set the answer to “Which” (کوم), “What” (څه/څشى) and “Who” (څوک) is chosen in response to a question [13]. For example:

1) Q: دا مور څوک وې؟ [dā mor cok wi] [Who is this mother?]
   A: دا خو هم د چا لور وې. [dā xū hum də čā lūr wi] [This mother is also someone's daughter]

The څوک (Who) in the above question asks for some noun to be provided in the answer. So the answer must contain a noun that supplies the information asked for in the question. The WH-words in Pashto can take several different forms when a prefix or postfix is added to the original word. When the postfix ځائے is added to the WH-word کوم, the response it asks for is Locative. For example:

2) Q: تاسو د کوم ځائے وسېدونکى ېئ؟ [tāsū dá kūm ĵāү wusidunkσү үaү] [From where are you?]
   A: د خوشحالپور. [dá xušālpūr] [From Khushalpur]

Similarly, with څه (What), the postfix شى is added to form څه شى (Which thing). When the postfix ه is added to the WH-word کوم (Which), the WH-word becomes کومه for the feminine and requires a feminine focus-word in the answer.

Several researchers (Fiengo and May, 1990; Haïk, 1987; Hellan, 1988; Lappin and McCord, 1990; Lappin, 1992) followed the syntactic account of ellipses resolution, while others (including Dalrymple et al., 1991; Gawron and Peters, 1990; Klein, 1987; Sag, 1976; Williams, 1977) followed the semantic account [9]. There has been almost no computational work on ellipses resolution in Pashto before this, and only very little work on anaphora resolution. Here a morphological and syntactic approach to ellipses resolution in Pashto text is presented. Section II describes the various factors involved in ellipses resolution. Section III provides an algorithm for the computational resolution of ellipses in Pashto WH-constructions, and Section IV provides an evaluation of the described approach.

II. ELLIPSES RESOLUTION IN WH-CONSTRUCTIONS IN PASHTO

• The pronoun in a question, if one exists, is converted into its appropriate form according to Table-I. That is, if the question of a WH-construction has a personal pronoun in the 1st or 2nd person, it is converted into the corresponding form given in Table-I. The table also shows that a 3rd-person pronoun in the question remains unchanged in the answer sentence.

Table-I. Pronouns-Transformation
Person-Number-Gender | Pronoun | Transformation
P1-Sng-Masc/Fem | زه/ما/زما | ته/تا/ستا
P1-Plu-Masc/Fem | مونږ/مونږه/زمونږ/زمونږه | تاسو/تاسو/ستاسو/ستاسو
P2-Sng-Masc/Fem | ته/تا/ستا | زه/ما/زما
P2-Plu-Masc/Fem | تاسو/تاسو/ستاسو/ستاسو | مونږ/مونږه/زمونږ/زمونږه
P3-Sng-Masc/Fem | هغه/هغهً/هغې | هغه/هغهً/هغې
P3-Plu-Masc/Fem | هغوې | هغوې
Demon (Sng/Plu, Masc/Fem) | دے/دا/دوې | دے/دا/دوې

• A clitic that exists in the question may or may not be provided in the answer. When it is not provided, an appropriate transformation of the question's clitic is made according to Table-II, at the position of the clitic in the question. Computational treatment of clitics in Pashto is discussed by Din and Khan [12]. Consider the following example:

3) Q: څوک نيکبخته دي ورله خوښه کړي ده؟ [cok nekbáxtá daү warla xwaşá kəŗi dá] [Who is the lucky girl you have chosen for him?]
   A: د زیړى ګل کاکا لور. [də zeŗi gul kaka lūr] [The daughter of Uncle Zeeri Gul]

In the above example, the clitic دي (2nd Sng, you) of the question is converted into مي (1st Sng, I) in the answer. So the full answer after ellipses resolution becomes:

د زیړى ګل کاکا لور نيکبخته مي ورله خوښه کړي ده. [də zeŗi gul kaka lūr nekbáxtá maү warla xwaşá kəŗi dá] [I have chosen the lucky daughter of Uncle Zeeri Gul for him]

Table-II. Clitics-Transformation
Person-Number | Clitic occurring | Transformation
P1-Sng | مي | دي
P1-Plu | مو/ام | مو/ام
P2-Sng | دي | مي
P2-Plu | مو/ام | مو/ام
P3-Sng | ي | ي
P3-Plu | ي | ي
Directional | را | در
Directional | در | را
Directional | ور | ور
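The pronoun and clitic transformations of Table-I and Table-II amount to table lookups on tagged tokens. The following is a minimal sketch of that idea, not the authors' implementation; the function name `transform_tokens` and the tag labels `PRON` and `CLITIC` are assumptions made for illustration, and the Pashto forms are transcribed from the two tables.

```python
# Swap tables transcribed from Table-I and Table-II. Entries that the tables
# leave unchanged (3rd person, demonstratives, "مو/ام", "ي", "ور") are omitted.
PRONOUN_SWAP = {
    "زه": "ته", "ما": "تا", "زما": "ستا",        # P1-Sng -> P2-Sng
    "ته": "زه", "تا": "ما", "ستا": "زما",        # P2-Sng -> P1-Sng
    "مونږ": "تاسو", "مونږه": "تاسو",              # P1-Plu -> P2-Plu
    "زمونږ": "ستاسو", "زمونږه": "ستاسو",
    "تاسو": "مونږ", "ستاسو": "زمونږ",            # P2-Plu -> P1-Plu
}
CLITIC_SWAP = {"مي": "دي", "دي": "مي", "را": "در", "در": "را"}


def transform_tokens(tagged_tokens):
    """Apply the pronoun and clitic swaps to a list of (word, tag) pairs.

    The tags "PRON" and "CLITIC" are hypothetical labels. Tagging is needed
    because a surface form such as "دي" can be either a clitic or a copula.
    """
    result = []
    for word, tag in tagged_tokens:
        if tag == "PRON":
            word = PRONOUN_SWAP.get(word, word)
        elif tag == "CLITIC":
            word = CLITIC_SWAP.get(word, word)
        result.append((word, tag))
    return result
```

Keeping the tag alongside each word is the design choice that matters here: the same lookup applied to untagged text would wrongly rewrite the copula دي wherever it occurs.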

In Pashto, the verb in non-past tense agrees with its subject (noun or pronoun) in number and gender. When the sentence is transitive and in the past tense, the verb agrees with the object instead of the subject [6]. For example:

احمد خط ليکى. [ahmad xat leki] [Ahmad writes a letter]
احمد خط وليکهً. [ahmad xat wūlekū] [Ahmad wrote a letter]
احمد چيټئ وليکله. [ahmad čiṭσү wulekəlá] [Ahmad wrote a letter]
زهً خط ليکم. [zá xat lekəm] [I write a letter]
تهً خط ليکے. [tá xat lekү] [You write a letter]
ما خط وليکهً. [mā xat wulekə] [I wrote a letter]
تا خط وليکهً. [tā xat wulekə] [You wrote a letter]

An appropriate verb-ending is attached to each verb according to its agreement with the subject or object. During ellipses resolution in a WH-construction, the appropriate verb-ending transformation is performed in the answering sentence according to Table-III. These verb-endings are discussed by Khattak [11]. In past tense transitive sentences the verb-ending is changed according to the number and gender of the object of the sentence.

Table-III. Transformation of verb-endings (Present Tense)
Person-Number | Verb-Ending Occurring | Verb-Ending Transformation
P1-sng | م | ى
P1-plu | و | ئ
P2-sng | ى | م
P2-plu | ئ | و
P3-sng | ى | ى
P3-plu | ى | ى

For example:

هغه ليکى. [ĥağá leki] [He/She writes]
هغوئ ليکى. [ĥağwi leki] [They write]
تهً ليکے. [tá lekσү] [You write]
تاسو ليکئ. [tāsū lekσү] [You (plural) write]
زهً ليکم. [zə lekəm] [I write]
مونږ ليکو. [mūnř lekū] [We write]
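The agreement rule and the present-tense ending swap of Table-III can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names `agreement_target` and `transform_verb_ending` are hypothetical, the verb is assumed to arrive already analysed with the person-number of its occurring ending, and only the endings listed in Table-III are covered.

```python
# Table-III: person-number of the occurring ending -> (occurring, transformed).
VERB_ENDINGS = {
    "P1-sng": ("م", "ى"),
    "P1-plu": ("و", "ئ"),
    "P2-sng": ("ى", "م"),
    "P2-plu": ("ئ", "و"),
    "P3-sng": ("ى", "ى"),
    "P3-plu": ("ى", "ى"),
}


def agreement_target(tense, transitive):
    """Return which argument the verb agrees with.

    Past-tense transitive sentences agree with the object; every other
    combination agrees with the subject.
    """
    return "object" if tense == "past" and transitive else "subject"


def transform_verb_ending(verb, person_number):
    """Replace the occurring present-tense ending by its Table-III counterpart.

    The person-number must be supplied by the caller because Table-III lists
    the same ending for P2-sng and for the 3rd person.
    """
    occurring, transformed = VERB_ENDINGS[person_number]
    if not verb.endswith(occurring):
        raise ValueError("verb does not carry the expected ending")
    return verb[: len(verb) - len(occurring)] + transformed
```

For instance, a first-person singular form such as ليکم would be passed with the label `P1-sng` and receive the ending that Table-III lists in its transformation column.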

In the answer to a WH-question, the verb is transformed according to the subject when the sentence is in non-past tense (whether transitive or intransitive) or in past tense and intransitive. The verb is transformed according to the object when the sentence is past tense and transitive [6], [11]. Consider the following example:

4) Q: د خون دعوه چا کړي وه؟ [dá xūn daҸwa čā káŗi wá] [Who had sued for the murder?]
   A: دي ښځې. [de şəĵe] [This woman]

• The copula and auxiliary verbs are inflected for person, number and gender [1]. If the corresponding auxiliary for a verb is already provided in the answer to a WH-question, it is used in the resolved sentence; otherwise an appropriate auxiliary is chosen for the verb of the answer from Table-IV. The auxiliary for the verb in the answer may change with the subject (in non-past tense transitive/intransitive and past-tense intransitive sentences) or with the object (in past-tense transitive sentences), based on the person, gender and number of the noun/pronoun. Consider example (2) above. Here the auxiliary ېئ (P2-Plu) of the question sentence, which agrees with the pronoun تاسو (you, P2-Plu), is transformed to یو (P1-Plu) according to Table-IV.A. The pronoun تاسو of the question is converted into مونږ (P1-Plu) in the answer, and the auxiliary ېئ is transformed to یو in agreement with this new pronoun. The ellipsis-resolved sentence becomes:

مونږ د خوشحالپور وسېدونکى یو. [mūnř dá xūšālpūr wusidunkσү үo] [We belong to Khushalpur]

The transformation rules for the auxiliaries are given in Table-IV.

Table-IV.A. Transformation of auxiliary verb “to be”
Tense | Gender | Person-Number | Aux | Transformation
Non-Past | Masculine | P1-sng | يم | ي
Non-Past | Masculine | P1-plu | يو | يئ
Non-Past | Masculine | P2-sng | ي | يم
Non-Past | Masculine | P2-plu | يئ | يو
Non-Past | Masculine | P3-sng | دے | دے
Non-Past | Masculine | P3-plu | دي | دي
Non-Past | Feminine | P1-sng | يم | ي
Non-Past | Feminine | P1-plu | يو | يئ
Non-Past | Feminine | P2-sng | ي | يم
Non-Past | Feminine | P2-plu | يئ | يو
Non-Past | Feminine | P3-sng | ده | ده
Non-Past | Feminine | P3-plu | دې | دي
Past | Masculine | P1-sng | وم | وى
Past | Masculine | P1-plu | وو | وے
Past | Masculine | P2-sng | وى | وم
Past | Masculine | P2-plu | وے | وو
Past | Masculine | P3-sng | وو | وو
Past | Masculine | P3-plu | وو | وو
Past | Feminine | P1-sng | وم | وى
Past | Feminine | P1-plu | وو | وے
Past | Feminine | P2-sng | وى | وم
Past | Feminine | P2-plu | وے | وو
Past | Feminine | P3-sng | وه | وه
Past | Feminine | P3-plu | وى | وى
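The auxiliary transformation of Table-IV.A can likewise be sketched as a lookup keyed by tense, gender and person-number. The sketch below is illustrative only; the name `transform_to_be` and the key spellings (`non-past`, `masc`, `P2-plu`, and so on) are assumptions, and the forms are transcribed from Table-IV.A as printed.

```python
PERSONS = ["P1-sng", "P1-plu", "P2-sng", "P2-plu", "P3-sng", "P3-plu"]

# (tense, gender) -> (occurring forms, transformed forms), ordered as PERSONS.
TO_BE = {
    ("non-past", "masc"): (["يم", "يو", "ي", "يئ", "دے", "دي"],
                           ["ي", "يئ", "يم", "يو", "دے", "دي"]),
    ("non-past", "fem"): (["يم", "يو", "ي", "يئ", "ده", "دې"],
                          ["ي", "يئ", "يم", "يو", "ده", "دي"]),
    ("past", "masc"): (["وم", "وو", "وى", "وے", "وو", "وو"],
                       ["وى", "وے", "وم", "وو", "وو", "وو"]),
    ("past", "fem"): (["وم", "وو", "وى", "وے", "وه", "وى"],
                      ["وى", "وے", "وم", "وو", "وه", "وى"]),
}


def transform_to_be(tense, gender, person_number):
    """Return (occurring auxiliary, transformed auxiliary) for one table row.

    person_number describes the auxiliary as it occurs in the question.
    """
    occurring, transformed = TO_BE[(tense, gender)]
    index = PERSONS.index(person_number)
    return occurring[index], transformed[index]
```

Under these assumptions, a second-person plural non-past auxiliary in the question maps to the first-person plural form, which is the change described for example (2).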

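Table-IV.B. Transformation of auxiliary verb کيدل “to become”
Tense | Gender | Person-Number | Aux | Transformation
Non-Past | Masculine | P1-sng | شم | شى
Non-Past | Masculine | P1-plu | شو | شئ
Non-Past | Masculine | P2-sng | شى | شم
Non-Past | Masculine | P2-plu | شئ | شو
Non-Past | Masculine | P3-sng | شى | شى
Non-Past | Masculine | P3-plu | شى | شى
Non-Past | Feminine | P1-sng | شم | شى
Non-Past | Feminine | P1-plu | شو | شئ
Non-Past | Feminine | P2-sng | شى | شم
Non-Past | Feminine | P2-plu | شئ | شو
Non-Past | Feminine | P3-sng | شى | شى
Non-Past | Feminine | P3-plu | شى | شى
Past | Masculine | P1-plu | شو | شوئ
Past | Masculine | P2-sng | شوى | شوم
Past | Masculine | P2-plu | شوئ | شو
Past | Masculine | P3-sng | شو | شو
Past | Masculine | P3-plu | شو | شو
Past | Feminine | P1-sng | شوم | شوى
Past | Feminine | P1-plu | شو | شوئ
Past | Feminine | P2-sng | شوى | شوم
Past | Feminine | P2-plu | شوئ | شو
Past | Feminine | P3-sng | شوه | شوه
Past | Feminine | P3-plu | شوي | شوي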

5) Q: چاپ شوي کتابونه مو څومره دي؟ [čāp šəwí kitābūná mū comrá di] [How many of your books are published?]
   A: ١٨ دي. [18 di] [These are 18]

In the above example, there is a copula verb دي (P3-Plu) agreeing with the noun کتابونه (P3-Plu). In the answer, a copula دي (P3-Plu) is already provided with a number (i.e. 18) in response to the WH-word څومره (How much), so this copula verb can be used in the resolved sentence and no transformation from the rules table is needed. As example (6) below shows, the 3rd-person masculine singular/plural copula verbs of the questions are not modified in the answers to WH-questions.

• The tense of the answer sentence remains the same as in the question.

6) Q: د ماشوم څه نوم دى؟ [dá māšūm cá nūm de] [What is the name of the child?]
   A: حنظله. [Hanžalá] [Hanzala]

While resolving ellipses in the above elliptical construction, the tense of the answer sentence remains the same as that of the question (present tense), so the resolved answer sentence becomes:

د ماشوم حنظله نوم دى. [dá māšūm Hanžalá nūm de] [The name of the child is Hanzala]

• Adjectives agree with the nouns they modify in gender and number [6]. For example:

درون سړے. [dron saŗσү] [Respected man]
درانهً سړى. [drāná saŗi] [Respected men]
درنه ښځه. [draná şəĵá] [Respected woman]
درنے ښځے. [draní şəĵəү] [Respected women]

Like adjectives, the numerals also agree with their nouns [6]. For example:

دریم سړے. [drσүm saŗσү] [Third man]
دریمه ښځه. [drσүmá şəĵá] [Third woman]

• The demonstratives دا، دې، دغه، اغه (like “This” and “That” in English) are extensively used in Pashto sentences. They remain unchanged when resolving ellipses in the answers to WH-questions. Consider the demonstrative “دا” in the following example:

7) Q: دا خطرناک دشمن څوک دے؟ [dā xaţarnāk dūšmən cok dσү] [Who is this dangerous enemy?]
   A: دا ېوه ښځه ده. [dā үawá xəĵá dá] [This is a woman]

• The possessive phrases (marked by the preposition د/دِ) and other prepositional phrases also precede the noun they modify [10]. When these phrases occur in the question sentence, they remain unchanged in the ellipses-resolved answer. As an example consider (6) above. The answer after ellipsis resolution becomes:

د ماشوم حنظله نوم دى. [dá māšūm hanžalá nūm de]

• The case of the WH-word in the question is the case of the focus in the answer. The focus may be Locative, Nominative or Accusative [4]. Consider the following examples from Pashto:

8) Q: تا سره شکرانه څومره ده؟ [tā sará šukrāná comrá dá] [How much is the endowment with you?]
   A: پنځه روپۍ. [pinĵá ropσү] [Rs. 5]

Here in example (8), څومره (How much) in the question requires the answer to provide information in some quantitative form, which is پنځه روپۍ (Rs. 5). This quantitative word or phrase in the answer is the focus of the WH-word of the corresponding question sentence. To resolve this elliptical construction, the WH-word in the question is replaced by its corresponding focus-word from the answer. Consider example (7), in which the WH-word in the question is څوک (Who), for which a Nominative focus-word is required; in this case it is ښځه (woman). For ellipses resolution, this focus-word from the answering sentence replaces the WH-word of the question.

9) Q: او تهً به څهً کوې؟ [aw tá ba sá kawσi] [And what will you do?]
   A: صدارت. [şadārat] [Presidency]

In example (9), the focus-word صدارت (presidency) is provided in the answer for the WH-word څهً (What) of the question, so the ellipsis is resolved by replacing the WH-word with this focus-word. Similarly in example (6), the WH-word څه (What) requires a Nominative focus-word, حنظله (Hanzala), in the answer. The focus of the WH-word may also be Locative, as in the following example:

10) Q: چېرته مو سره وليدل؟ [čertá mū sará wūledal] [Where did you see each other?]
    A: په پېښور ښار کښى. [pá peşawər şār kşe] [In Peshawar city]

The question in the above construction has the WH-word چېرته (Where), also frequently written چرته. The focus-word for it may be some Locative noun (word or phrase), which is پېښور ښار (Peshawar city) in the answer.
To resolve this elliptical construction, the WH-word is replaced by the focus-word/phrase. Similarly in (2), کوم ځائے (Which place) requires a Locative focus-word, خوشحالپور (Khushalpur, a place), in the answer.

11) Q: جاپان کښى ستاسو فن چا وستائيلو؟ [japān kşe stāsū fan čā wustāσүlū] [Who appreciated your art in Japan?]
    A: ډېرو خلقو وستائيلو.

[ḍerū xalqū wustāəүlū] [Many people appreciated it]

In the above example, the WH-word چا (“Who”) is in the oblique form; its focus-word in the answer is ډېرو خلقو (many people), which replaces the WH-word of the question for ellipses resolution.

III. ALGORITHM

1. Tag the question and answer and identify NPs, VPs and clauses from the parser.
2. Identify the question-word in the question.
3. Identify the focus-word in the answer.
4. Transform the pronoun (if any) of the question according to Table-I.
5. Check for a clitic in the question and answer:
   If there exists a clitic in the question:
      If its corresponding clitic exists in the answer, replace the clitic of the question with the clitic of the answer.
      Otherwise, transform the clitic of the question as per Table-II.
6. Replace the WH-word of the question by the focus-word of the answer.
7. If the verb (with appropriate ending) exists in the answer, replace the verb of the question with the verb of the answer. Otherwise, transform the verb of the question as per Table-III:
   (i) If the question sentence is in non-past tense (transitive/intransitive) or past tense (intransitive), the verb agrees with the subject of the answer.
   (ii) If the question is a past tense (transitive) sentence, the verb agrees with the object of the answer.
8. If the auxiliary and/or copula verbs corresponding to the question exist in the answer, replace those of the question by those of the answer. Otherwise, convert the auxiliary/copula verbs of the question sentence according to Table-IV.
9. Insert the remaining words/phrases of the question and answer in their original order into the resolved sentence.
10. Remove the sign-of-interrogation at the end of the question sentence.
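Steps 2 to 10 can be sketched in code on tokens that have already been tagged (step 1). This is a simplified illustration rather than the authors' implementation: the tag labels (`WH`, `FOCUS`, `PRON`, `CLITIC`, `VERB`, `AUX`, `QMARK`), the function name `resolve_ellipsis`, and the callback parameters standing in for the Table-III and Table-IV lookups are all assumptions. It further assumes a single WH token and ignores any answer words other than the focus, clitic, verb and auxiliary.

```python
CLITIC_SWAP = {"مي": "دي", "دي": "مي", "را": "در", "در": "را"}
PRONOUN_SWAP = {
    "زه": "ته", "ته": "زه", "ما": "تا", "تا": "ما",
    "زما": "ستا", "ستا": "زما", "تاسو": "مونږ", "مونږ": "تاسو",
    "ستاسو": "زمونږ", "زمونږ": "ستاسو",
}


def identity(word):
    """Placeholder for the Table-III / Table-IV transformations."""
    return word


def resolve_ellipsis(question, answer, verb_rule=identity, aux_rule=identity):
    """Build the resolved sentence from (word, tag) lists for Q and A."""
    def first(tag):
        # First answer word carrying the given tag, or None if absent.
        return next((w for w, t in answer if t == tag), None)

    focus = " ".join(w for w, t in answer if t == "FOCUS")    # step 3
    resolved = []
    for word, tag in question:
        if tag == "WH":                                       # steps 2 and 6
            resolved.append(focus)
        elif tag == "PRON":                                   # step 4
            resolved.append(PRONOUN_SWAP.get(word, word))
        elif tag == "CLITIC":                                 # step 5
            resolved.append(first("CLITIC") or CLITIC_SWAP.get(word, word))
        elif tag == "VERB":                                   # step 7
            resolved.append(first("VERB") or verb_rule(word))
        elif tag == "AUX":                                    # step 8
            resolved.append(first("AUX") or aux_rule(word))
        elif tag == "QMARK":                                  # step 10
            continue
        else:                                                 # step 9
            resolved.append(word)
    return " ".join(resolved)


# Example (3): the focus phrase replaces the WH-word and the clitic is swapped.
question = [("څوک", "WH"), ("نيکبخته", "N"), ("دي", "CLITIC"), ("ورله", "PP"),
            ("خوښه", "ADJ"), ("کړي", "VERB"), ("ده", "AUX"), ("؟", "QMARK")]
answer = [("د زیړى ګل کاکا لور", "FOCUS")]
resolved = resolve_ellipsis(question, answer)
```

Passing the verb and auxiliary rules as functions keeps the control flow of the steps separate from the table data, so the lookups can be replaced without touching the loop.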

The first step is to identify the structure of the question and answer sentences from the tagger/parser. The WH-word in the question and the focus-word in the answer are identified in the 2nd and 3rd steps respectively. Step 4 converts the pronoun of the question according to the rules specified in Table-I. Step 5 checks for the presence of a clitic in both the question and the answer: if there is a clitic in the question and a corresponding clitic in the answer, the clitic of the question is replaced with that of the answer. If there is no corresponding clitic in the answer, the clitic of the question is transformed as per Table-II. In step 6 the WH-word of the question is replaced by the focus-word of the answer. Step 7 checks whether the verb of the question, with its appropriate ending, exists in the answer; if an appropriate verb-ending for the verb exists in the answer, the verb-ending of the question is replaced with that of the answer. If the verb with its appropriate ending does not exist in the answer, the verb of the question is transformed based on the tense and transitivity of the sentence. Step 8 checks for auxiliary and copula verbs in the question and the answer. If they exist in the answer in correspondence to those of the question, no transformation is needed and the auxiliary and copula of the answer are used in ellipses resolution. If the corresponding auxiliary and/or copula verbs do not exist in the answer, the auxiliary/copula of the question is transformed according to Table-IV. Step 9 adds the remaining words or phrases of the question and answer in their original order into the resolved sentence. Finally, in step 10, the sign-of-interrogation of the question sentence is removed from the resolved sentence.

List of Abbreviations
Abbreviation | Actual Word
P1 | 1st-Person
P2 | 2nd-Person
P3 | 3rd-Person
Masc | Masculine
Fem | Feminine
Sng | Singular
Plu | Plural
Demon | Demonstrative

IV. EVALUATION

The algorithm was tested on Pashto text examples that were tagged manually, as no part-of-speech tagger is currently available for Pashto. It showed a success rate of 76%. Constructions whose questions involve more than one WH-word make ellipses resolution more complex and require further analysis building on the above work.

V. CONCLUSION

This paper presents a rule-based algorithm for ellipses resolution in WH-constructions in Pashto. The algorithm uses the available information of both the question and the answer of the WH-construction and achieves an accuracy of 76%. Several factors important to ellipses resolution in these WH-constructions are identified and discussed in detail.

REFERENCES

[1] Babrakzai, F., “Topics in Pashto Syntax”, PhD thesis, Linguistics Department, University of Hawai’i, 1999.
[2] Lappin, S., “A Sequenced Model of Anaphora and Ellipsis Resolution”, 2003.
[3] Rav, L., F., “The Understanding and Generation of Ellipsis in a Natural Language System”, Berkeley AI Research Project, Computer Science Division, University of California, Berkeley, California.
[4] Sobha, L., Patnaik, B. N., “VASISTH – An Ellipsis Resolution Algorithm for Indian Languages”, in International Conference MT2000: Machine Translation and Multilingual Applications in the New Millennium, University of Exeter, British Computer Society [London: BCS], 19-22 November 2000.
[5] http://www.clas.ufl.edu/users/rthompso/Interaction6Questions.pdf (retrieved 21st April, 2008)
[6] Khattak, P., Niaz, J., Tair, M. Nawaz, Amin, Rohul, Ahmad, M. Bashir, “Bunyadee Pukhto aw Da kaar Kasab Takee”, Pashto Academy, Peshawar University.
[7] http://www.ask.com/web?q=Definition+of+Ellipsis&qsrc=2419&o=0&l=dir (retrieved 21st April, 2008)
[8] Dalrymple, M., Shieber, S. M., Pereira, F. C. N., “Ellipsis and Higher-Order Unification”, Linguistics and Philosophy, Volume 14, Number 4, August 1991, Springer Netherlands.
[9] Kehler, A., “A Discourse Copying Algorithm for Ellipsis and Anaphora Resolution”, Harvard University, Aiken Computation Laboratory, 33 Oxford Street, Cambridge, MA 02138.
[10] Roberts, T., “Clitics and Agreement”, PhD thesis submitted to the Massachusetts Institute of Technology, June 2000.
[11] Khattak, K. K., “A Case Grammar Study of the Pashto Verb”, PhD thesis submitted to the Department of Phonetics and Linguistics, School of Oriental and African Studies, Faculty of Arts, University of London, England, 1988.
[12] Din, A., Khan, M. A., “Syntax Based De-cliticization of Pashto Text for Better Machine Translation”, CLT07, Department of Computer Science, University of Peshawar, NWFP, Pakistan.
[13] Shinwari, P. K. P., “Pukhto Grammer Landez”.