RULE BASED MACHINE TRANSLATION SYSTEM FOR ENGLISH TO MALAYALAM LANGUAGE

A Thesis

Submitted for the degree of

Master of Science (by research)

in the School of Engineering

By

R. HARSHAWARDHAN

Centre for Excellence in Computational Engineering and Networking Amrita School of Engineering Amrita Vishwa Vidyapeetham Coimbatore - 641112

December, 2011

AMRITA SCHOOL OF ENGINEERING

AMRITA VISHWA VIDYAPEETHAM, COIMBATORE - 641112

BONAFIDE CERTIFICATE

This is to certify that the thesis entitled “RULE BASED MACHINE TRANSLATION SYSTEM FOR ENGLISH TO MALAYALAM LANGUAGE”, submitted by “R. HARSHAWARDHAN” (Reg. No.: CB.EN.M*CEN09002) for the award of the degree of Master of Science (by Research) in the School of Engineering, is a bonafide record of the research work carried out by him under my guidance. He has satisfied all the requirements put forth for the project and has completed all the formalities regarding the same to the fullest of my satisfaction.

Ettimadai, Coimbatore. Date: 

Dr.K.P.Soman, Research Guide and Head, CEN.

AMRITA SCHOOL OF ENGINEERING AMRITA VISHWA VIDYAPEETHAM, COIMBATORE - 641112

Centre for Excellence in Computational Engineering and Networking

DECLARATION

I, R.Harshawardhan (Reg. No.: CB.EN.M*CEN09002), hereby declare that this thesis entitled RULE BASED MACHINE TRANSLATION SYSTEM FOR ENGLISH TO MALAYALAM LANGUAGE is the record of the original work done by me under the guidance of Dr.K.P.Soman, Head, Centre for Excellence in Computational Engineering and Networking, Amrita School of Engineering, Coimbatore and to the best of my knowledge this work has not formed the basis for the award of any degree/ diploma/ associateship/ fellowship or a similar award, to any candidate in any University.

Place: Ettimadai

Date:

Countersigned by

Dr.K.P.Soman, Professor and Head, CEN, Amrita Vishwa Vidyapeetham, Coimbatore.

Signature of the Student

 

ABSTRACT

A rule-based machine translation system for the English to Malayalam language pair has been developed. The system takes an English sentence as input and parses it with the Stanford Parser. The Stanford Parser serves four main purposes in the source (English) side processing of the system: parsing, POS tagging, stemming and morphological analysis. An English to Malayalam bilingual dictionary has been created. Several Malayalam font converters were built to convert the collected data into Unicode (UTF-8) format. The Malayalam words of the dictionary were converted into lexicons with the help of linguists. The system takes the parsed output, separates the source text word by word with its POS category, and searches for the corresponding target words in the bilingual dictionary. The output of this stage is the set of Malayalam words with their POS categories. For named entities, the SVM based English to Malayalam transliterator developed by CEN, Amrita Vishwa Vidyapeetham, is used. Words that are not available in the dictionary are also transliterated. The Malayalam words from the dictionary are romanized with the help of a mapping file, which has been created for English to Malayalam and vice versa. The target word form is synthesized using the morphological information of the English words obtained from the parser. The system processes the words through an FST model which has been developed to incorporate Malayalam morphology. Orthographic rules have been written for Malayalam inflections. The nominal and verbal forms of Malayalam are synthesized by the morphological synthesizer, and the output of this stage is the morphologically inflected Malayalam words. Reordering rules for the Malayalam language have also been written. The transfer rules, which reorder the English parse tree with respect to Malayalam, produce output in the syntactic pattern of the target language. After the reordering rules are applied, the English sentence is syntactically reordered to suit Malayalam. The reordered output after morphological generation of Malayalam words is displayed as the final output of the machine translation system. The system is freely available online at http://nlp.amrita.edu:8080/Eng2Mal/ .

 

CONTENTS

Abstract
List of Tables
List of Figures
Acronyms and Abbreviations

CHAPTER 1 INTRODUCTION
1.1 Need for Translation
1.2 Translation Research

CHAPTER 2 LITERATURE SURVEY
2.1 Machine Translation
    2.1.1 Related Works
2.2 Statistical Machine Translation
    2.2.1 Related Works
2.3 Example Based Machine Translation
    2.3.1 Related works
2.4 Rule Based Machine Translation
    2.4.1 Related Works
2.5 Morphological Synthesizer and Analyzer
    2.5.1 Related Works
2.6 Reordering
    2.6.1 Related Works

CHAPTER 3 STUDY ON ORTHOGRAPHIC RULES OF MALAYALAM NOUNS AND VERBS
3.1 Malayalam Morphology
3.2 Orthographic (Sandhi) Rules
3.3 Malayalam Nouns
3.4 Noun Inflections
3.5 Inflections for plural numbers
3.6 Exceptions in plural formation
3.7 Plural forms of pronouns
3.8 Malayalam Case Markers
3.9 Nominative Case Marker
3.10 Accusative Case Marker
3.11 Dative Case Marker
3.12 Sociative Case Marker
3.13 Locative Case Marker
3.14 Instrumental Case Marker
3.15 Genitive Case Marker
3.16 Benefactive Case Marker
3.17 Ablative Case Marker
3.18 Adjectivization
3.19 Adverbalization
3.20 Clitics
3.21 Emphatic particles
3.22 Interrogative particles
3.23 ‘And’ Coordination
3.24 ‘Or’ Coordination
3.25 Malayalam Verbs - Morphology
3.26 Malayalam Verb Base Forms
3.27 Intransitive (akarmaka)
3.28 Transitive (sakarMaka)
3.29 Causative (prayOjaka)
3.30 Tense Forms
3.31 Past Tense (bhUtaM)
3.32 Present Tense (vartamAnam)
3.33 Future Tense (bhAvi)
3.34 Continuous Tense
3.35 Perfect Tense
3.36 Perfect Continuous Tense
3.37 Voice (prayOga)
3.38 Auxiliary Verbs
3.39 Negation
3.40 Question Verbs
3.41 Infinite Verbs

CHAPTER 4 IMPLEMENTATION OF RULE BASED MACHINE TRANSLATION SYSTEM
4.1 English Parser
    4.1.1 Introduction
    4.1.2 Usage
4.2 English to Malayalam Transliteration
    4.2.1 Introduction
    4.2.2 Preparation
    4.2.3 Usage
4.3 English-Malayalam Bilingual Dictionary
    4.3.1 Introduction
    4.3.2 Preprocessing
    4.3.3 Implementation
4.4 Malayalam Morphological Generator
    4.4.1 Introduction
    4.4.2 Preparation
    4.4.3 Building FST Model
    4.4.4 Writing orthographic rules
    4.4.5 Working of FST
4.5 Malayalam Morphological Analyzer
    4.5.1 Introduction
    4.5.2 Working
4.6 Reordering by Transfer Rules
    4.6.1 Introduction
    4.6.2 Preparation
    4.6.3 Implementation

CHAPTER 5 RESULTS
5.1 Results of Malayalam morphological generator and analyzer
5.2 Discussion about results of Malayalam morphological generator and analyzer
5.3 Results of Rule Based Machine Translation System
5.4 Discussion about the results of Rule Based Machine Translation System
5.5 Screenshots

CHAPTER 6 CONCLUSION
6.1 Limitations
6.2 Applications
6.3 Future Work

REFERENCES

APPENDIX - A
A.1. Penn Treebank Tag set for POS category

APPENDIX - B
B.1. Hand coded Reordering Rules for RBMT

APPENDIX - C
C.1. Inflection Markers used in morphology of Malayalam Nouns
C.2. Inflection Markers used in morphology of Malayalam Verbs

APPENDIX - D
D.1. FST State Transition Table modeled for Noun Morphotactics
D.2. FST State Transition Table modeled for Verb Morphotactics
D.3. Orthographic Rules for Malayalam Nouns
D.4. Orthographic Rules for Malayalam Verbs

APPENDIX - E
E.1. Tested sentences of Machine Translation System with Rankings

PUBLICATIONS

 

LIST OF TABLES

Table.2.1 Some Machine Translation projects in India
Table.3.1 Classification of nouns based on stem ends
Table.3.2 Examples for Plural forms of English and Malayalam nouns
Table.3.3 Exceptions to ‘kaL’
Table.3.4 Exceptions to adding ‘mAr’ as plural marker
Table.3.5 Other exceptions of Plural forms
Table.3.6 Plural forms of pronouns
Table.3.7 Nominative case markers for various stem ends
Table.3.8 Special cases of Past Tense forms for Malayalam Verbs
Table.4.1 Sandhi rules for various stem endings of Malayalam nouns
Table.4.2 Sandhi rules for various stem endings of Malayalam verbs in past tense
Table.5.1 Statistics of morphology for Malayalam nouns
Table.5.2 Statistics of morphology for Malayalam verbs
Table.5.3 Testing results of morph generator for Malayalam nouns
Table.5.4 Testing results of morph generator for Malayalam verbs
Table.5.5 Testing results of morph analyzer for Malayalam nouns
Table.5.6 Testing results of morph analyzer for Malayalam verbs
Table.5.7 Coverage of Malayalam nouns and verbs
Table.5.8 Statistics of Rule Based Machine Translation System
Table.5.9 Testing results for English-Malayalam machine translation system

 

LIST OF FIGURES

Fig.4.1 Block Diagram of English-Malayalam Rule Based Machine Translation System
Fig.4.2 Sample Parse Tree for the sentence “I am writing a book”
Fig.4.3 First Transition of FST
Fig.4.4 Second Transition of FST
Fig.4.5 Third Transition of FST
Fig.4.6 Final Transition of FST
Fig.4.7 Parse Tree of English Sentence ‘I am eating an apple’
Fig.4.8.a Reordered Parse Tree of English Sentence by executing Rule – 1
Fig.4.8.b Reordered Parse Tree of English Sentence by executing Rule – 2
Fig.4.9 Parse Tree of English Sentence ‘I work hard to finish the work and achieve the goal’
Fig.4.10 Reordered Parse Tree of English Sentence - 2 by executing Rule – 1
Fig.4.11 Reordered Parse Tree of English Sentence - 2 by executing Rule – 2
Fig.4.12 Reordered Parse Tree of English Sentence - 2 by executing Rule – 3
Fig.4.13 Reordered Parse Tree of English Sentence - 2 by executing Rule – 4
Fig.5.1 Screenshot of Machine Translation system
Fig.5.2 Screenshot of online translation system with proper font rendering of Malayalam text
Fig.5.3 Screenshot of Morph Generator for Malayalam nouns
Fig.5.4 Screenshot of Morph Analyzer for Malayalam nouns
Fig.5.5 Screenshot of Morph Generator for Malayalam verbs
Fig.5.6 Screenshot of Morph Analyzer for Malayalam verbs

 

ACRONYMS AND ABBREVIATIONS

NLP - Natural Language Processing
MT - Machine Translation
SMT - Statistical Machine Translation
RBMT - Rule Based Machine Translation
EBMT - Example Based Machine Translation
POS - Parts of Speech
RE - Regular Expression
FSA - Finite State Automata
FSM - Finite State Machine
FST - Finite State Transducer
CFG - Context Free Grammar
SVM - Support Vector Machines

CHAPTER 1

INTRODUCTION

1.1 Need for Translation

Have you ever imagined a centralized education system for the whole nation? Such a system requires all study materials in the local languages. There is nothing wrong with creating such an educational database manually in all the languages of India, working from dawn to dusk, but we all know that “Time is Money”. The problem can be tackled more easily by translation. Translations are language specific, however, and manual translation is again time-consuming. Machine translation is the only hope in all these situations. If we could develop a perfect machine translation system, all information from Kindergarten to PhD could be centralized and learnt in the local languages. Not only educational material but also government policies and official documents could easily be made available in local languages with such translation engines. Developing such a system is not an easy task and involves various processes.

1.2 Translation Research

Translation research is carried out all over the world to build efficient translation systems; the basic idea, however, remains the same for all languages. In India, machine translation projects have been carried out at many places for years, but we are still in need of a good translation system. Any translation system requires two main viewpoints: the linguistic point of view and the mathematical point of view. The development of a machine translation system should therefore be carried out jointly by language experts and engineers. Most translation systems worldwide take English as the source language because of its international use, and the data available in English is immense. For machine translation, we need data in both the source and the target language to train the system. The more data is available, the higher the accuracy that can be reached. It is worth mentioning that Google, one of the world’s leading companies, did not have a machine translation system for English to Malayalam at the time of writing.

CHAPTER 2

LITERATURE SURVEY

2.1 Machine Translation

Machine translation is the process of translating sentences from a source language into a target language using computers, with or without human assistance. The idea of building a machine translation system arose during the days of the World War, for encoding and decoding [1]. Human translation of any language is highly time consuming and expensive. With the advent of machine translation systems, the workload of human translators has been greatly reduced. Machine translation systems are being built worldwide, and the field has become an active research area of computer science. There are three approaches to machine translation: statistical, example based and rule based. The three classical techniques of machine translation are direct translation, Interlingua-based translation and transfer-based translation; these and further techniques are listed below.

i. Direct Translation – This is a direct word by word (substitution) translation. No detailed linguistic aspects are followed. This primitive approach is not used in recent times.

ii. Interlingua-based Translation – This technique linguistically analyzes the source text and converts it to an intermediate semantic representation called Interlingua. The advantage of this approach is that the Interlingua can be used for translation into many target languages.

iii. Transfer-based Translation – This technique grammatically analyzes the source text and transfers it to a grammatical representation of the target language using hand written rules. The advantage of this approach is that it is well suited to domain-wise translation.

iv. Knowledge-based Translation – This technique uses a knowledge base (KB) as a source of information for translation. The knowledge base has to be created based on ontology and semantic web.

v. Corpus-based Translation – This technique uses a parallel corpus of source and target sentences. A huge corpus is required to get good results with this technique.

vi. Hybrid Translation – This technique combines two or more of the techniques discussed above.
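
To make the difference between techniques (i) and (iii) concrete, the following minimal Python sketch contrasts word-by-word substitution with a single transfer rule that reorders subject-verb-object into subject-object-verb before substitution. The toy lexicon, the romanized glosses, the role labels and both function names are illustrative assumptions; they are not taken from the dictionary or the rules of the system described in this thesis.

```python
# Toy lexicon with romanized glosses, for illustration only
# (an assumption, not the bilingual dictionary of the thesis).
LEXICON = {"i": "njaan", "read": "vaayikkunnu", "books": "pusthakangal"}


def direct_translate(sentence):
    """Direct technique: substitute word by word, keeping the source order."""
    return [LEXICON.get(word, word) for word in sentence.lower().split()]


def transfer_translate(tagged_words):
    """Transfer technique (sketch): reorder S-V-O to S-O-V, then substitute.

    tagged_words is a list of (word, role) pairs, where role is one of
    "S" (subject), "V" (verb) or "O" (object). The roles are hand-assigned
    here; a real system would derive them from a parse tree.
    """
    subject = [w for w, role in tagged_words if role == "S"]
    verb = [w for w, role in tagged_words if role == "V"]
    obj = [w for w, role in tagged_words if role == "O"]
    return [LEXICON.get(word, word) for word in subject + obj + verb]


if __name__ == "__main__":
    print(direct_translate("I read books"))
    print(transfer_translate([("i", "S"), ("read", "V"), ("books", "O")]))
```

The direct version keeps the English word order, while the transfer version places the verb last, which is the only structural difference this sketch is meant to show.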

2.1.1 Related Works

Some of the MT projects for Indian languages are shown in Table 2.1, taken from [2].

Project | Languages | Domain / Main Application | Approach / Formalism | Strategy
Anglabharati (IIT-K, ER&DCI-N) | Eng-IL (Hindi) | General (Health) | Transfer/Rules | Post-edit
Anusaaraka (IIT-K, UoH) | IL-IL (5IL>Hindi) | General (Children) | LWG mapping/PG | Post-edit
MaTra (NCST) | Eng-IL (Hindi) | General (News) | Transfer/Frames | Pre-edit
Mantra (CDAC) | Eng-IL (Hindi) | Govt. notifications | Transfer/XTAG | Post-edit
UCSG MAT (UoH) | Eng-IL (Kannada) | Govt. circulars | Transfer/UCSG | Post-edit
UNL MT (IIT-B) | Eng/IL (Hindi, Marathi) | General | Interlingual/UNL | Post-edit
Tamil Anusaaraka (AU-KBC) | IL-IL (Tamil-Hindi) | General (Children) | LWG mapping/PG | Post-edit
MAT (JadavpurU) | Eng-IL (Hindi) | News Sentences | Transfer/Rules | Post-edit
Anuvadak (Super Infosoft) | Eng-IL (Hindi) | General | N/A | Post-edit
StatMT (IBM) | Eng-IL | General | Statistical | Post-edit

Table.2.1 Some Machine Translation projects in India

Apart from these, a consortium of ten Indian universities along with CDAC has been working on machine translation for Indian languages under DIT, India. MHRD, India is also taking several measures to make machine translation serve the public.

2.2 Statistical Machine Translation

According to [3], statistical machine translation (SMT) is a machine translation paradigm in which translations are generated on the basis of statistical models whose parameters are derived from the analysis of bilingual text corpora. SMT is a corpus based approach, where a massive parallel corpus is required for training the SMT systems.

SMT systems are built on two probabilistic models: a language model and a translation model. The advantage of an SMT system is that linguistic knowledge is not required to build it. The difficulty lies in creating the massive parallel corpus. SMT systems work well for machine translation from English to European languages because the word order is almost preserved in such translations. For machine translation from English to Indian languages, the parallel corpora have to be preprocessed (by changing the word order) before being used to train the SMT system.
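
The way the language model and the translation model combine is commonly written in the noisy-channel form below. This is the standard textbook formulation and is given here only as an aid to the reader; the symbols s (source sentence) and t (candidate target sentence) are notation assumed for this sketch and do not appear in the cited works.

```latex
\hat{t} \;=\; \arg\max_{t} P(t \mid s)
        \;=\; \arg\max_{t}\;
        \underbrace{P(t)}_{\text{language model}}
        \;\underbrace{P(s \mid t)}_{\text{translation model}}
```

The second equality follows from Bayes' rule, since the denominator P(s) is constant for a given source sentence and does not affect the maximization.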

2.2.1 Related Works

Many SMT systems are being built nowadays for Indian languages. For Malayalam, [4] reports an SMT system built at Amrita – CEN and trained on 10,000 sentences with rule based reordering and morphological processing of Malayalam; it achieved BLEU scores of 13.15 for the baseline, 15.6 for baseline and syntax, and 16.1 for baseline, syntax and morphology. In [5], another SMT system from Amrita – CEN with a reordering approach is reported; trained with 2000 sentences, it achieved BLEU scores of 10.76 for the baseline and 15.6 for baseline and reordering. In [6], an SMT system from Amrita – CEN incorporating morphology and syntax is reported; trained with 1000 sentences, it achieved BLEU scores of 15.9 for the baseline, 19.3 for baseline and syntax, and 24.9 for baseline, syntax and morphology. In [7] and [8], SMT systems for Malayalam trained with 250 sentences are reported, with BLEU scores of 0.48 for the baseline and 0.69 with suffix separation on one set, and 0.22 for the baseline and 0.38 with suffix separation on another set.

2.3 Example Based Machine Translation

Example based machine translation (EBMT) is a corpus based approach without any statistical models. Example based systems are trained with a parallel corpus of example sentences, similar to SMT systems. However, example based systems generally do not learn from the corpus. They store the parallel corpus and use matching algorithms to search for and retrieve sentences.
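
The store-and-retrieve behaviour described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the matching algorithm of any system cited in this chapter: the memory contents, the placeholder targets, the `retrieve` function and the 0.6 threshold are all hypothetical, and token-level similarity from the standard library's `difflib.SequenceMatcher` stands in for whatever matching a real EBMT system uses.

```python
from difflib import SequenceMatcher

# Hypothetical stored parallel corpus: (source sentence, target sentence).
# The targets are placeholders, not real translations.
MEMORY = [
    ("i am writing a book", "<stored translation 1>"),
    ("i am eating an apple", "<stored translation 2>"),
]


def retrieve(query, memory=MEMORY, threshold=0.6):
    """Return (score, source, target) for the closest stored sentence.

    Similarity is computed over word tokens. Returns None when no stored
    sentence reaches the (assumed) threshold.
    """
    query_tokens = query.lower().split()
    best = None
    for source, target in memory:
        score = SequenceMatcher(None, query_tokens, source.split()).ratio()
        if best is None or score > best[0]:
            best = (score, source, target)
    if best is not None and best[0] >= threshold:
        return best
    return None


if __name__ == "__main__":
    print(retrieve("I am writing a book"))
    print(retrieve("I am eating a book"))
    print(retrieve("completely unrelated words"))
```

Nothing is learnt from the stored pairs: a query either finds a sufficiently similar stored sentence or it does not, which is the contrast with SMT drawn in the paragraph above.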

The translation memory is one kind of example based machine translation system. Translation memories (TM) are built to aid human translators by serving as an assisting tool for translation. The advantages of translation memories are that they are easy to implement and that linguistic knowledge is not required. Translation memories are not only used for translation purposes; they are also useful for dictionary search of words, idioms and proverbs, etc.

2.3.1 Related works

EBMT systems are being implemented for Indian languages. In [9], an EBMT system is built for four Indian languages (Tamil, Malayalam, Kannada and Telugu); with 18,000 sentence rules, a BLEU score of 0.7164 is achieved. Translation memories for Tamil and Malayalam have also been built. The TM for Tamil in [10] stores a parallel corpus of 44833 sentences, 761 idioms and phrases and 1776 proverbs, and is loaded with a terminology dictionary of 2,13,202 words. The TM for Malayalam in [11] stores a parallel corpus of 2499 sentences and a dictionary of 11,569 words. Malayalam grammar entries are also stored: Paryayam – 176, Arththa Vyathsyam – 316, Samasam – 47, Ashayavipulanam – 8, Santhi – 113, Lingam – 107, Nanartham – 133, Ottapatham – 90, Vipareetham – 279.

2.4 Rule Based Machine Translation

The rule based machine translation system translates the source text into target text by a set of linguistic rules. Three techniques of machine translation – Direct, Interlingua and Transfer based – are applicable to rule based machine translation systems. A rule based machine translation system is developed with hand coded rules for translation. The system requires good linguistic knowledge to write the rules, and a bilingual dictionary is also needed.

Other MT systems like SMT and EBMT require a huge parallel corpus for training, which is not readily available for Indian languages. The sources of parallel corpora are the internet and printed texts. Parallel texts are not widely available on the internet, and in multi-lingual text books the alignment of sentences varies vastly. On the other hand, rule based systems are highly suited for translation of English to Indian languages because a bilingual dictionary can be collected more easily than a parallel corpus, and the rules can be written well with the help of linguists. The rule based system developed in this work follows the transfer based approach with reordering rules. The drawback of a rule based system is that it is confined to its rules, and the rules have to evolve with the language over time.

2.4.1 Related Works

Rule based systems are also being developed for Indian languages. In [12], a rule based machine translation system for noun phrases of the Punjabi language, trained with 2000 phrases, reaches an accuracy of 75%-85%. AnglaMalayalam from CDAC has been developed for Malayalam in the health and tourism domains with 75-80% accuracy [13]. A rule based system from Amrita – CEN [14] has also been developed by utilizing dependencies from the parser, a POS tagger, transfer link rules for reordering and rules for morphology.

2.5 Morphological Synthesizer and Analyzer

Malayalam is a morphologically rich and highly agglutinative language. A morphological synthesizer and analyzer are required for machine translation of English to Malayalam. There are many methods to develop a morphological synthesizer and analyzer. Some of them are discussed below:

i. Paradigm based method
This approach is suitable for inflectionally rich languages. The words are classified into different paradigms based on their morpho-phonemes, and a paradigm table is created. The word to be analyzed or generated is assigned to a particular paradigm, and the corresponding inflections of that paradigm are applied.

ii. Suffix stripping method
This approach uses a stem dictionary that stores the root words and a suffix dictionary that stores all possible suffixes of nouns and verbs. The morphotactics and orthography can also be stored and used for suffix stripping.

iii. Directed Acyclic Word Graph (DAWG) method
The DAWG data structure is used for both morphological analysis and generation. This approach is language independent: it does not require any morphological rules or any other special linguistic information.

iv. Corpus based method
The corpus based approach needs a large amount of morphologically variant data in order to train the system. The SVM based morphological analyzer and generator belongs to this category. Collecting data in the format required to train the system is difficult in this approach.

v. Finite State Transducer (FST) based method
FST based morphological analyzers and generators are widely implemented for many languages. FST systems are mainly used in speech recognition and speech processing while building the language models. The morphological analyzer and generator can be built in a bidirectional manner using FST.
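The suffix stripping method (ii) can be illustrated with a minimal sketch. The stem and suffix entries below are toy data chosen for illustration (the forms follow the noun inflections discussed in Chapter 3), and the function name `analyze` is hypothetical; this is not the analyzer of any system cited here.

```python
# Toy stem dictionary and suffix dictionary (illustrative entries only).
STEMS = {"amma", "pustakaM"}
SUFFIXES = {"ye": "ACC", "yuTe": "GEN", "yil": "LOC"}

def analyze(word):
    """Return (stem, tag) by stripping the longest matching suffix, or None."""
    # Try longer suffixes first so that 'yuTe' wins over a shorter match.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        stem = word[:-len(suffix)]
        if word.endswith(suffix) and stem in STEMS:
            return stem, SUFFIXES[suffix]
    # A bare stem carries no suffix tag.
    if word in STEMS:
        return word, None
    return None

print(analyze("ammaye"))
print(analyze("ammayuTe"))
```

A real analyzer would also store the morphotactics and orthographic rules mentioned above; the sketch only shows the dictionary lookup.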

2.5.1 Related Works

Morphological analyzers and generators are being developed for many Indian languages with all the above mentioned approaches. Unsupervised learning of morphological analysis for inflectionally rich Indian languages (Hindi) has been proposed in [15], with primitive morph coverage of 32-63% and advanced morph coverage of 96-97%. At Amrita – CEN, a morph analyzer and generator with a rule based FST approach has been built for Tamil [16]. Three approaches to a Malayalam morph analyzer (paradigm, suffix stripping and hybrid) have been compared in [17]. A morph analyzer and generator for Malayalam to Tamil machine translation have also been developed in [18].

2.6 Reordering

Malayalam is a free word-order language. Reordering is essential for machine translation from English to any Indian language. Reordering in rule based systems differs from reordering in statistical systems. Many levels of reordering are possible, such as word level reordering and phrase level reordering. In word level reordering, the source words are reordered into the target order, or the target words are reordered after translation. In phrase level reordering, the phrases are reordered syntactically. In rule based reordering, the source sentence is reordered into the order of the target language.

Reordering is carried out with the help of parse trees. The parser could be a dependency parser or a Context Free Grammar (CFG) based parser. The reordering rules are written based on the parser used. The pattern based reordering for Tamil [19] has been developed at Amrita – CEN and is useful for building the rules for Malayalam.

2.6.1 Related Works

The related works on the reordering aspects of English to Malayalam machine translation are discussed in [4], [5], [6] and [14].

CHAPTER 3

STUDY ON ORTHOGRAPHIC RULES OF MALAYALAM NOUNS AND VERBS

3.1 Malayalam Morphology

In linguistics, morphology is the study of word forms. Morphology varies with the language and its word structures. The smallest meaningful unit of morphology is the morpheme. Morphology plays an important role in the inflection of words in a language, and is therefore an important aspect of machine translation. In this chapter, we discuss only those aspects of the morphology of Malayalam nouns and verbs that are considered for writing orthographic rules to build a morphological generator for Malayalam. Only a handful of the morphological aspects of Malayalam are considered in our system. Each inflection aspect for which a rule is written is detailed under the corresponding inflection type. Most of the chapter is devoted to the transformation of words by writing rules for each type. The morphology of the Malayalam language is well explained in [20], and we followed that book in building the morphological synthesizer.

3.2 Orthographic (Sandhi) Rules

The orthographic or sandhi rules form the backbone of the language morphology. The Sanskrit word ‘sandhi’ means ‘to join together’. Sandhi denotes the phonological changes that occur at morpheme boundaries depending on the grammatical function of neighboring morphemes. For adjacent words, when the last letter of the previous word and the first letter of the next word come in contact, the two words may be unified into a larger word based on the sandhi rules. There are two types of sandhi: Internal Sandhi and External Sandhi. Internal sandhi involves changes within a word, whereas external sandhi involves changes at word boundaries. Sandhi is further classified into four categories, namely: loopa, aagama, dvitva and aadeesa.

i. Elision (Loopa)
It is the reduction in the duration of a phoneme by omitting the last vowel of the first word at the boundary.
Example: padi + ikk → padi+ikk → padikk

ii. Augmentation (Aagama)
It is the insertion of a new consonant in between the two vowels at the boundaries of the adjacent words.
Example: amma+OT → amma+y+OT → ammayOT

iii. Reduplication (Dvitva)
It is the duplication of the consonant that follows the vowel at the word boundary.
Example: pU+kaL → pU+k+kaL → pUkkaL

iv. Substitution (Aadeesa)
It is the substitution of a particular phoneme with another new phoneme at the word boundary.
Example: maraM+kaL → maraM+ngng+kaL → marangngaL
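The four sandhi categories can be sketched as small string operations on the romanized forms used in this chapter. This is only an illustration under simplifying assumptions: each phoneme is treated as a single character, the glide for augmentation is passed in by the caller, and substitution covers only the ‘M’ + ‘k’ case shown in the example. All function names are hypothetical.

```python
VOWELS = set("aAeEiIoOuU")

def elision(first, second):
    """Loopa: drop the final vowel of the first morpheme before a vowel."""
    if first[-1] in VOWELS and second[0] in VOWELS:
        return first[:-1] + second
    return first + second

def augmentation(first, second, glide="y"):
    """Aagama: insert a glide consonant between two vowels."""
    if first[-1] in VOWELS and second[0] in VOWELS:
        return first + glide + second
    return first + second

def reduplication(first, second):
    """Dvitva: double the consonant that follows a vowel-final morpheme."""
    if first[-1] in VOWELS and second[0] not in VOWELS:
        return first + second[0] + second
    return first + second

def substitution(first, second):
    """Aadeesa: 'M' followed by 'k' becomes 'ngng' (the only case sketched)."""
    if first.endswith("M") and second.startswith("k"):
        return first[:-1] + "ngng" + second[1:]
    return first + second

print(elision("padi", "ikk"), augmentation("amma", "OT"),
      reduplication("pU", "kaL"), substitution("maraM", "kaL"))
```

Which operation applies to a given boundary depends on the noun or verb category, as the following sections describe; the sketch leaves that choice to the caller.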

3.3 Malayalam Nouns

In any language, nouns are words that indicate people, beings, things, places, phenomena, qualities or ideas. Nouns that indicate individual entities, such as names of persons, places or organizations, are called proper nouns. Examples of some English and Malayalam nouns are: Arm-കയ്, Language-ഭാഷ, Cheeks-കവിള്, and Mother-അമ്മ. To generate the morphology of Malayalam nouns, the whole list of available nouns is first divided into four main categories based on their stem ending (phonology) and morphological changes. The categories are listed in table.3.1. Apart from these four categories, we consider human nouns like ‘amma’ and non-human nouns as two separate categories to take care of the two different plural markers (‘mAr’ and ‘kaL’).

Table.3.1 Classification of nouns based on stem ends

Stem Endings              Example
Vowels(a,A,e,E,i,I,o,O)   amma, mAla
Vowels(u,U)               guru, pU
Vowel(aM)                 maraM, varaM
Consonant                 vIT, vAtil

3.4 Noun Inflections

In the present work, three main categories of inflections are considered for the nouns. They are: Plural Markers, Case Markers and Clitics.

3.5 Inflections for plural numbers

Malayalam plurals are grammatical numbers, typically referring to more than one of the referent in the real world. In the English language, singular and plural are the only grammatical numbers. Plural and singular play an important role in Malayalam morphology; therefore they need special attention. Some examples are given in table.3.2:

Table.3.2 Examples for Plural forms of English and Malayalam nouns

English Singular   Malayalam Singular   English Plural   Malayalam Plural
book               പുസ്തകം              books            പുസ്തകങ്ങള്
daughter           പുത്രി                daughters        പുത്രിമാര്

There are two separate plural markers, ‘mAr’ and ‘kaL’, chosen according to whether the singular noun is human or non-human. When singular nouns are changed to their corresponding plural form by adding ‘kaL’, the sandhi operates in different ways based on the end phonemes of the singular nouns. For example:

i. tira+N+PL → tira+kaL → tirakaL (sandhi without change)

ii. guru+N+PL → guru+k+kaL → gurukkaL (sandhi with doubling of ‘k’)

In the third category, for words which end in ‘anuswAram’, adding ‘kaL’ results in the change of ‘M’ and ‘k’ into ‘ngng’.

iii. maraM+N+PL → maraM+ngng+kaL → marangngaL (sandhi with substitution)

In the fourth category, for words which end in a consonant, a new phoneme is inserted between the word and ‘kaL’.

iv. kUT+N+PL → kUT+u+kaL → kUTukaL (Augmentation of ‘u’)
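The four regular plural rules can be summarised in a short sketch. It assumes the romanization used in this chapter, takes the human/non-human distinction as an explicit flag (the text does not say how that is decided automatically), attaches ‘mAr’ without any sandhi, and does not handle the exceptions listed in the next section. The function name `pluralize` is hypothetical.

```python
def pluralize(stem, human=False):
    """Attach a plural marker to a romanized Malayalam noun stem."""
    if human:
        # Human nouns take 'mAr'; no sandhi change is modelled here.
        return stem + "mAr"
    last = stem[-1]
    if last == "M":
        # Category 3: 'M' + 'k' is substituted by 'ngng'.
        return stem[:-1] + "ngngaL"
    if last in "uU":
        # Category 2: the 'k' of 'kaL' is doubled.
        return stem + "kkaL"
    if last in "aAeEiIoO":
        # Category 1: 'kaL' is added without change.
        return stem + "kaL"
    # Category 4: consonant-final stems are augmented with 'u'.
    return stem + "ukaL"

for noun in ("tira", "guru", "maraM", "kUT"):
    print(noun, "->", pluralize(noun))
```

Testing the stem ending in this order (anusvAram first, then ‘u/U’, then the other vowels) keeps the categories mutually exclusive.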

3.6 Exceptions in plural formation

There are exceptions to the categorization of nouns into human and non-human on the basis of the plural markers ‘mAr’ and ‘kaL’. The following tables 3.3, 3.4 and 3.5 show these exceptions:

Table.3.3 Exceptions to ‘kaL’

Singular      Plural
kurangngan    kurangnganmAr
kurukkan      kurukkanmAr

Table.3.4 Exceptions to adding ‘mAr’ as plural marker

Human Singular   Plural
makan/makaL      makkaL
peN              peNungngaL
dATA             dATAkkaL
suhartt          suharttukaL
shatru           shatrukkaL
vakkIl           vakkIlanmAr

Table.3.5 Other exceptions of Plural forms

Singular                  Plural
paNikkAran/paNikkAri      paNikkAr
kalAkAran/kalAkAri        kalAkkAr
kaTankAran/kaTankAri      kaTankAr
vITTukAran/vITTukAri      vITTukAr
toTTakAran/toTTakAri      toTTakAr
addhyApakan/addhyApika    addhyApakar
sahOdaran/sahOdari        sahOdarar
snEhitan/snEhiti          snEhitar
dEvan/dEvi                dEvar

3.7 Plural forms of pronouns

The plural forms of pronouns need different treatment. It is better to consider the singular and plural as a pair of forms, rather than deriving one form from the other. Some of the words under this category and their inflections are shown in table.3.6 below:

Table.3.6 Plural forms of pronouns

Root Word   Inflected Plural form
atu         ava
njAn        njangngal
nI          ningngal
avan        avaR
avaL        avaR

3.8 Malayalam Case Markers

The bound case affixes denote the syntactic and semantic functions of nouns and are added to the oblique bases of nouns. The addition of case markers is an inflectional process of nouns. The case markers considered for the Malayalam language are: nominative, accusative, dative, sociative, locative, instrumental and genitive. The changes which occur while adding case suffixes to the oblique noun bases can be captured by sandhi rules. The case markers are added to the singular stems or plural stems (N±PL+Case Suffix).

3.9 Nominative Case Marker

In Malayalam, the nominative case form is unmarked. The base form itself functions as the nominative case form. Table.3.7 below shows that the addition of the null marker for the nominative case does not make any change in the nominal forms.

Table.3.7 Nominative case markers for various stem ends

Noun Category             Example   Nominative   Inflection
Vowels(a,A,e,E,i,I,o,O)   amma      Ф            amma
Vowels(u,U)               guru      Ф            guru
Vowel(aM)                 maraM     Ф            maraM
Consonant                 kUT       Ф            kUT

3.10 Accusative Case Marker

The accusative case marker of Malayalam is ‘e’. The addition of case suffixes to the nominal forms of nouns changes them into oblique forms, which leads to changes due to sandhi. The changes that occur are given below:

i. In the first category of nouns, ending in the vowels ‘a, A, e, E, i, I, o or O’, the on-glide ‘y’ appears between the base form and the suffix ‘e’.
Eg: amma+ACC → amma+y+e → ammaye

ii. In the second category of nouns, ending in the vowels ‘u or U’, the accusative case suffix ‘e’ is added to the nominal base augmented by the inflectional increment ‘vin’.
Eg: guru+ACC → guru+vin+e → guruvine

iii. In the third category of nouns, ending in ‘M’ (AnusvAram), ‘M’ is deleted and ‘tt’ is augmented to the base when the suffix ‘e’ is added.
Eg: maraM+ACC → maraM+tt+e → maratte

iv. In the fourth category of nouns, ending in consonants, ‘in’ is augmented to the base when the case suffix ‘e’ is added.
Eg: kUT+ACC → kUT+in+e → kUTine

v. The accusative case suffix is added to the plural stems without any change due to sandhi.
Eg: pUkkaL+ACC → pUkkaL+e → pUkkaLe

The accusative form normally indicates the object of the verb. Eg:

English: The child asked about his mother (child mother_ACC about ask_PAST)
Malayalam: kuTTi ammaye paRRi cOdiccu

3.11 Dative Case Marker

The dative case marker for Malayalam is ‘kk’, which alternates with ‘in’.

i. The dative case suffix ‘kk’ is added directly, without change, to the first category of nouns ending in the vowels ‘a, A, i, I, e, E, o or O’. ‘(u)’ occurs with ‘an’ ending nouns, elsewhere ‘kk’ occurs as dative suffix.
Eg: amma+DAT → amma+kk → ammakk

ii. In the second category of nouns, ending in the vowels ‘u or U’, the dative suffix ‘in’ is added to the nominal base augmented by the inflectional increment ‘v’.
Eg: guru+DAT → guru+v+in → guruvin

iii. In the third category of nouns, ending in AnusvAram ‘M’, the dative case suffix ‘in’ is added to the nominal base augmented by the inflectional increment ‘tt’, with the deletion of ‘M’.
Eg: maraM+DAT → maraM+tt+in → marattin

iv. In the fourth category of nouns, ending in consonants, the dative case suffix ‘in’ is added to the nominal bases.
Eg: kUT+DAT → kUT+in → kUTin

v. The dative case marker ‘kk’ is added to the plural stems without any sandhi changes.
Eg: pUkkaL+DAT → pUkkaL+kk → pUkkaLkk

The dative form usually indicates the indirect object of the verb. Eg:

English Sentence: I gave her the book (I she_DAT book give_PAST)
Malayalam Sentence: njAn avaLkk pustakaM koTuttu

3.12 Sociative Case Marker

The sociative case marker is ‘OT’. The addition of ‘OT’ makes changes to the base nouns depending on the four categories to which they belong.

i. In the first category of nouns, the on-glide ‘y’ is added when the sociative case suffix is suffixed to the nominal base.
Eg: amma+SOC → amma+y+OT → ammayOT

ii. In the second category of nouns, ending in ‘u or U’, ‘vin’ is augmented to the base when ‘OT’ is added.
Eg: guru+SOC → guru+vin+OT → guruvinOT

iii. In the third category of nouns, ending in ‘M’, ‘ttin’ is augmented to the nominal bases with the deletion of ‘M’.
Eg: maraM+SOC → maraM+ttin+OT → marattinOT

iv. In the fourth category of nouns, ending in consonants, ‘in’ is augmented to them when the sociative case suffix ‘OT’ is added.
Eg: kUT+SOC → kUT+in+OT → kUTinOT

v. The sociative case suffix ‘OT’ is added to the plural bases without any sandhi change.
Eg: pUkkaL+SOC → pUkkaL+OT → pUkkaLOT

The sociative case usually denotes the accompanying person. Eg:

English Sentence: She came with a friend. (she friend_SOC along with come_PAST)
Malayalam Sentence: avaL kUTTukAriyOT kuuTi vannu

3.13 Locative Case Marker

The locative case marker of Malayalam is ‘il’. The addition of the locative case suffix to the nominal forms makes changes to the base depending on the four categories to which they belong.

i. In the first category, the on-glide ‘y’ is added when ‘il’ is suffixed to the noun.
Eg: amma+LOC → amma+y+il → ammayil

ii. In the second category of nouns, the on-glide ‘v’ is added when ‘il’ is suffixed to the base.
Eg: guru+LOC → guru+v+il → guruvil

iii. In the third category of nouns, ending in ‘M’, the nominal bases are augmented by ‘tt’ when suffixed with ‘il’.
Eg: maraM+LOC → maraM+tt+il → marattil

iv. In the fourth category of nouns, ending in consonants, the suffix ‘il’ is added without any sandhi change (optionally the final consonant may geminate).
Eg: kUT+LOC → kUT+il → kUTTil

v. The locative case suffix ‘il’ is added to the plural nominal forms without any sandhi change.
Eg: pUkkaL+LOC → pUkkaL+il → pUkkaLil

The locative forms usually denote the location concerned with the verb. Eg:

English: No one came from home (house_LOC from anyone come_PAST_NEG)
Malayalam: vITTil ninnu AruM vannilla

3.14 Instrumental Case Marker

The instrumental case marker is ‘Al’. The sandhi changes and augmentation occur depending on the four categories of the nouns.

i. In the first category, the on-glide ‘y’ is added when ‘Al’ is suffixed to the noun.
Eg: amma+INS → amma+y+Al → ammayAl

ii. In the second category of nouns, ‘vin’ is augmented to the base when ‘Al’ is suffixed.
Eg: guru+INS → guru+vin+Al → guruvinAl

iii. In the third category of nouns, ending in ‘M’, the nominal bases are augmented by ‘ttin’ when suffixed with ‘Al’.
Eg: maraM+INS → maraM+ttin+Al → marattinAl

iv. In the fourth category of nouns, ending in consonants, ‘in’ is augmented to the base when the suffix ‘Al’ is added.
Eg: kUT+INS → kUT+in+Al → kUTinAl

v. The instrumental case suffix ‘Al’ is added to the plural nominal forms without any sandhi change.
Eg: pUkkaL+INS → pUkkaL+Al → pUkkaLAl

The instrumental forms usually express the instrumental role of the noun concerned with the verb. Eg:

English Sentence: He stabbed a tiger with a knife (He one tiger_ACC knife_INS stab_PAST)
Malayalam Sentence: avan oru puliye kattiyAl kutti

3.15 Genitive Case Marker

The genitive case marker is ‘uTe’. ‘Re’ occurs after nominal bases or oblique bases ending in ‘n’, whereas ‘uTe’ occurs elsewhere.

i. In the first category, the on-glide ‘y’ is added when ‘uTe’ is suffixed to the noun.
Eg: amma+GEN → amma+y+uTe → ammayuTe

ii. In the second category of nouns, ‘vin’ is augmented to the base when ‘Re’ is suffixed.
Eg: guru+GEN → guru+vin+Re → guruvinRe

iii. In the third category of nouns, ending in ‘M’, the nominal bases are augmented by ‘ttin’ when suffixed with ‘Re’.
Eg: maraM+GEN → maraM+ttin+Re → marattinRe

iv. In the fourth category of nouns, ending in consonants, ‘in’ is augmented to the base when the suffix ‘Re’ is added.
Eg: kUT+GEN → kUT+in+Re → kUTinRe

v. The genitive case suffix ‘uTe’ is added to the plural nominal forms without any sandhi change.
Eg: pUkkaL+GEN → pUkkaL+uTe → pUkkaLuTe

The genitive case suffix links a noun with another noun by possession. Eg:

English Sentence: The book is on the table (Book table+GEN on be-PRES)
Malayalam Sentence: pustakaM mESayuTe mEle unTe
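The case rules of Sections 3.9 to 3.15 lend themselves to a table-driven sketch. The table below simply transcribes the increments and suffixes from the examples above; the names `CASE_SUFFIX`, `stem_category` and `inflect` are hypothetical, plural stems are flagged by the caller, and the optional gemination in the locative (kUTTil) is not modelled.

```python
# Increment + suffix per (stem category, case), transcribed from the examples.
CASE_SUFFIX = {
    "vowel":     {"ACC": "ye", "DAT": "kk", "SOC": "yOT",
                  "LOC": "yil", "INS": "yAl", "GEN": "yuTe"},
    "u":         {"ACC": "vine", "DAT": "vin", "SOC": "vinOT",
                  "LOC": "vil", "INS": "vinAl", "GEN": "vinRe"},
    "M":         {"ACC": "tte", "DAT": "ttin", "SOC": "ttinOT",
                  "LOC": "ttil", "INS": "ttinAl", "GEN": "ttinRe"},
    "consonant": {"ACC": "ine", "DAT": "in", "SOC": "inOT",
                  "LOC": "il", "INS": "inAl", "GEN": "inRe"},
    "plural":    {"ACC": "e", "DAT": "kk", "SOC": "OT",
                  "LOC": "il", "INS": "Al", "GEN": "uTe"},
}

def stem_category(stem):
    """Classify a romanized noun stem by its ending (Table 3.1)."""
    last = stem[-1]
    if last == "M":
        return "M"
    if last in "uU":
        return "u"
    if last in "aAeEiIoO":
        return "vowel"
    return "consonant"

def inflect(stem, case, plural=False):
    """Return the case-marked form; 'NOM' is the unmarked base form."""
    if case == "NOM":
        return stem
    category = "plural" if plural else stem_category(stem)
    # The anusvAram 'M' is deleted before the 'tt'/'ttin' increment.
    base = stem[:-1] if category == "M" else stem
    return base + CASE_SUFFIX[category][case]

print(inflect("amma", "ACC"), inflect("guru", "DAT"),
      inflect("maraM", "SOC"), inflect("kUT", "GEN"),
      inflect("pUkkaL", "LOC", plural=True))
```

Keeping the rules in a table rather than in branching code makes it easy to compare the sketch with the examples row by row.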

3.16 Benefactive Case Marker

The benefactive and ablative case suffixes are secondary case suffixes added to the primary dative and locative case markers respectively. Benefactive is expressed by adding the postpositions ‘aayi’ or ‘vENTi’ to a noun suffixed with the dative case marker. There is no sandhi change when ‘vENTi’ is added to the dative forms. Eg:

English Sentence: She lives for her children (she children_DAT for_BEN live_PRES)
Malayalam Sentence: avaL makkaLkk vENTi jIvikkunnu

3.17 Ablative Case Marker

Ablative is expressed by adding the postposition ‘ninnu’ to nouns suffixed with the locative case marker ‘il’. There is no sandhi change when ‘ninnu’ is added to the locative form. Eg:

English Sentence: We took this ball from the basket (we basket_LOC from_ABL this ball take_PAST)
Malayalam Sentence: njangngal sanciyilninnu E pant eduttu

3.18 Adjectivization

Adjectives are formed by adding the adjectivizer ‘Aya’ to the noun.

i. The addition of ‘Aya’ to the first category of nouns inserts an on-glide ‘y’ in between the base and the suffix.
Eg: amma+ADJZ → amma+y+Aya → ammayAya

ii. In the second category of nouns, the on-glide ‘v’ is inserted between the base and the suffix.
Eg: guru+ADJZ → guru+v+Aya → guruvAya

iii. In the third category of nouns, ending in ‘M’, the addition of ‘Aya’ changes ‘M’ to ‘m’.
Eg: maraM+ADJZ → maraM+m+Aya → maramAya

iv. In the fourth category of nouns, ending in consonants, the addition of ‘Aya’ doesn’t make any sandhi changes.
Eg: kUT+ADJZ → kUT+Aya → kUTAya

3.19 Adverbalization

Adverbs are derived by adding ‘Ayi’ to the nouns. The changes that take place due to sandhi are the same as those for adjectivization.

3.20 Clitics

A clitic is a morpheme that is grammatically independent, but phonologically dependent on another word or phrase. It functions like an affix, but works at the phrase level. Clitic elements are of two types: i) Free Clitics; ii) Bound Clitics. Free clitics can occur freely (without being attached to a verb, a noun or another clitic). Bound clitics occur after verbs, nouns and other clitics. The free clitics include interjections and ideophones and, sometimes, manner adverbs (verb attributes). The bound clitics include proclitics, which are mainly prefixes of Sanskrit origin, along with infinite attributive quantifiers and enclitics, which comprise a somewhat miscellaneous group of suffixed elements. There are different categories of clitic forms present in the Malayalam language. Presently, the following types of clitics are considered.

3.21 Emphatic particles

The emphatic clitics ‘tanne’ or ‘E’ can be seen as sentence particles when attached to a verb in sentence-final position. Eg:

amma tanne, guru tanne, maraM tanne, kUT tanne; ammayE, guruvE, maramE, kUTE.

3.22 Interrogative particles

‘ANO’ is the interrogative clitic and it is added after nominal bases, case markers, plural markers, etc.

i. With the first category of nouns, the on-glide ‘y’ appears when ‘ANO’ is suffixed to the nominal base.
Eg: amma+CLI_ANO → amma+y+ANO → ammayANO

ii. With the second category of nouns, the on-glide ‘v’ appears when ‘ANO’ is suffixed to the nominal base.
Eg: guru+CLI_ANO → guru+v+ANO → guruvANO

iii. With the third category of nouns, ending in ‘M’, the suffixation of ‘ANO’ changes ‘M’ to ‘m’.
Eg: maraM+CLI_ANO → maraM+m+ANO → maramANO

iv. With the fourth category of nouns, ending in consonants, the suffixation of ‘ANO’ doesn’t make any sandhi change.
Eg: kUT+CLI_ANO → kUT+ANO → kUTANO

‘A’, which is another interrogative clitic, can replace ‘ANO’, subject to the above mentioned sandhi changes.

Example: ammayA, guruvA, maramA, kUTA

3.23 ‘And’ Coordination

‘um’ is the clitic for denoting coordination between nouns. The sandhi changes are as described in the case of interrogative clitics.

amma+CLI_um → amma+y+um → ammayum
guru+CLI_um → guru+v+um → guruvum
maraM+CLI_um → maraM+m+um → maramum
kUT+CLI_um → kUT+um → kUTum

Eg:

English Sentence: Unni too went to Kottayam
Malayalam Sentence: unniyum kOTTayatteekk pOyi

3.24 ‘Or’ Coordination

For ‘Or’ coordination of Malayalam nouns, ‘O’ is added after nominal bases, similar to the clitics discussed before.

Example: ammayO, guruvO, maramO, kUTO
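The adjectivizer, adverbializer and vowel-initial clitics described above (‘Aya’, ‘Ayi’, ‘E’, ‘ANO’, ‘A’, ‘um’, ‘O’) all follow the same four-way sandhi pattern, so a single helper is enough to sketch them. The function name `attach` is hypothetical and the stem classification assumes the romanization of this chapter.

```python
def attach(stem, suffix):
    """Attach a vowel-initial suffix or clitic to a romanized noun stem."""
    last = stem[-1]
    if last == "M":
        # Category 3: anusvAram 'M' changes to 'm'.
        return stem[:-1] + "m" + suffix
    if last in "uU":
        # Category 2: on-glide 'v'.
        return stem + "v" + suffix
    if last in "aAeEiIoO":
        # Category 1: on-glide 'y'.
        return stem + "y" + suffix
    # Category 4: consonant-final stems take the suffix unchanged.
    return stem + suffix

for clitic in ("ANO", "um", "O"):
    print([attach(noun, clitic) for noun in ("amma", "guru", "maraM", "kUT")])
```

The free emphatic particle ‘tanne’ is written as a separate word in the examples, so it needs no sandhi handling.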

3.25 Malayalam Verbs - Morphology

In a language, there are words which are categorized as verbs. Verbs usually carry tense and function mostly as predicates. In Malayalam, verbs inflect or get modified for tense, mood, negation, aspect and voice. The base forms of Malayalam verbs are arrived at by removing the infinitive suffix ‘uka’ from the infinitive form of the verbs.

Example:
cirikkuka → cirikkuka+V → cirikk
paRayuka → paRayuka+V → paRay

According to the strategy adopted here, ‘cirikk’ and ‘paRay’ are the base forms. The inflections are accounted for by making use of the base forms.
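Deriving the base form as described above amounts to stripping the infinitive suffix. A minimal sketch, with the hypothetical helper name `verb_base` and the assumption that the input is a romanized infinitive:

```python
INFINITIVE_SUFFIX = "uka"

def verb_base(infinitive):
    """Strip the infinitive suffix 'uka' to obtain the verb base form."""
    if infinitive.endswith(INFINITIVE_SUFFIX):
        return infinitive[:-len(INFINITIVE_SUFFIX)]
    # Forms without the suffix are returned unchanged.
    return infinitive

print(verb_base("cirikkuka"), verb_base("paRayuka"))
```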

3.26 Malayalam Verb Base Forms

Three base forms of Malayalam verbs are considered: Intransitive verb, Transitive verb and Causative verb.

3.27 Intransitive (akarmaka)

A verb without an object is called an intransitive verb. The intransitive verb denotes the state, process or action of the subject. The intransitive forms are usually the root forms to which the transitive and causative suffixes can be added to arrive at the respective base or stem forms. A simple rule showing no sandhi change for the intransitive base is written below:

tinn+INT → tinn+ε → tinn

3.28 Transitive (sakarmaka)

Verbs which can take an object noun are called transitive verbs. Not all transitive verbs are derived from an intransitive counterpart. Some verbs are inherently transitive (eg. maRakkuka). Only those verbs which are derived from their intransitive counterparts are considered for derivation. The intransitive verbs are converted into transitive verbs by at least the following fourteen kinds of processes:

i. muR+TRA → muRukk
ii. AT+TRA → ATT
iii. kayaR+TRA → kayaRR
iv. uRangng+TRA → uRakk
v. kUmpu+TRA → kUpp
vi. nIL+TRA → nITT
vii. cuzal+TRA → cuzaRR
viii. tIr+TRA → tIrkk
ix. kariy+TRA → karikk
x. paRakk+TRA → paRatt
xi. viz+TRA → vIztt
xii. irikk+TRA → irutt
xiii. nilkk+TRA → nirtt
xiv. poTT+TRA → poTTikk

3.29 Causative (prayOjaka)

If the cause of the action denoted by the verb is the subject of the sentence and is reflected in the verb form by the causative marker, then the verb concerned is a causative verb. The transitive verbs are converted into causative verbs by at least three kinds of processes, examples of which are listed below:

i. ceyy+CAU → ceyyikk
ii. uNN+CAU → UTT
iii. kAN+CAU → kATT
iv. kELkk+CAU → kELppikk

In general, the transitive forms are accompanied by the tense markers of Malayalam verbs. Consider the example sentence where the transitive form comes.

3.30 Tense Forms

The time at which the action of the verb takes place is denoted by tense. There are three different tenses:

• Past (bhUtaM)
• Present (vartamAnam)
• Future (bhAvi)

The morphotactics of Malayalam verbal forms are as follows:

Verb+TRA+CAU+TENSE

3.31 Past Tense (bhUtaM)

The past tense of the verb denotes an action that has already taken place. There are three sets of past tense suffixes: ‘i, tu, ntu’. Morphophonemic change occurs when these suffixes are added to the verbal bases. Accordingly, we get the following alternants of the past tense suffix: ‘i, t, T, R, njnj, NT, N, nn, tt, and cc’.

i. kayaR+PAST → kayaR+i → kayaRi
ii. ceyy+PAST → ceyy+tu → ceytu
iii. kAN+PAST → kAN+tu → kaNTu
iv. viT+PAST → viT+tu → viTTu
v. peR+PAST → peR+tu → peRRu
vi. paRay+PAST → paRay+ntu → paRanjnju
vii. curuL+PAST → curuL+ntu → curuNTu
viii. vIz+PAST → vIz+ntu → vINu
ix. cEr+PAST → cEr+ntu → cErnnu
x. tar+PAST → tar+ntu → tannu
xi. koTukk+PAST → koTukk+ttu → koTuttu
xii. vilkk+PAST → vilkk+tu → viRRu
xiii. kELkk+PAST → kELkk+tu → kETTu
xiv. kaTikk+PAST → kaTikk+tu → kaTiccu

There are verbs whose past tense formations are not predictable. They are handled as special cases before any generalized orthographic rule is applied. Some of the special cases of past tense forms of Malayalam verbs are given in table.3.8.

Table.3.8 Special cases of Past Tense forms for Malayalam Verbs

Verb base   Past tense form
nOv         nonTu
vEv         venTu
taLL        taLLi
coll        colli
koll        konnu
pO          pOyi
vA          vannu
tA          tannu
pOk         pOyi
kAN         kantu
var         vannu
tar         tannu
tinn        tinnu
koT         koTuttu
vey         veccu
veykk       veccu
cAv         cattu
cAk         cattu
nilkk       ninnu
uNN         unTu
kakk        kaTTu
nakk        nakki
cikk        cikki
uzh         uzhutu
vIzh        vINu
tAzh        tAzhNu
koLL        koNTu
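The strategy of checking special cases before any generalized rule can be sketched as a dictionary lookup with a fallback. Only a few entries of Table 3.8 are copied here; the names `SPECIAL_PAST`, `past_tense` and `add_i` are hypothetical, and the fallback shown covers only the ‘i’ suffix set (as in kayaR → kayaRi), not the ‘tu’ and ‘ntu’ sets with their morphophonemic changes.

```python
# A few of the unpredictable forms from Table 3.8 (not the full table).
SPECIAL_PAST = {
    "nOv": "nonTu",
    "pO": "pOyi",
    "var": "vannu",
    "tinn": "tinnu",
    "koLL": "koNTu",
}

def add_i(base):
    """Regular rule for verbs taking the 'i' past tense suffix."""
    return base + "i"

def past_tense(base, regular_rule=add_i):
    """Look up special cases first, then fall back to a regular rule."""
    if base in SPECIAL_PAST:
        return SPECIAL_PAST[base]
    return regular_rule(base)

print(past_tense("var"), past_tense("kayaR"))
```

Passing the regular rule as a parameter leaves room for separate rules per suffix set without changing the lookup step.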

3.32 Present Tense (vartamAnam)

The present tense denotes the current state of action. The present tense is marked by adding the suffix ‘unnu’ to the verbal bases.

Intransitive: ceyy+INT+PRES → ceyy+unnu → ceyyunnu
Transitive: ceyy+TRA+PRES → ceyy+ikk+unnu → ceyyikkunnu
Causative: ceyy+CAU+PRES → ceyy+ippikk+unnu → ceyyippikkunnu

The unpredictable present tense formations of verbs are handled separately.

var+INT+PRES → var+unnu → vannu

3.33 Future Tense (bhAvi)

The future tense denotes an action that is going to take place. The future tense is marked by adding the suffix ‘um’ to the verbal bases.

Intransitive: ceyy+INT+FUT → ceyy+um → ceyyum
Transitive: ceyy+TRA+FUT → ceyy+ikk+um → ceyyikkum
Causative: ceyy+CAU+FUT → ceyy+ippikk+um → ceyyippikkum
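For the regular present and future forms, the morphotactics Verb+TRA+CAU+TENSE reduce to concatenating suffix strings. The sketch below uses only the markers that appear in the ‘ceyy’ examples (‘ikk’, ‘ippikk’, ‘unnu’, ‘um’); it is an assumption that other verbs take the same markers, and the name `conjugate` is hypothetical. Unpredictable formations would need a separate special-case table.

```python
# Derivational markers as they appear in the 'ceyy' examples.
FORM_SUFFIX = {"INT": "", "TRA": "ikk", "CAU": "ippikk"}
TENSE_SUFFIX = {"PRES": "unnu", "FUT": "um"}

def conjugate(base, form, tense):
    """Build Verb + (TRA | CAU) + TENSE for the regular cases."""
    return base + FORM_SUFFIX[form] + TENSE_SUFFIX[tense]

for form in ("INT", "TRA", "CAU"):
    print(conjugate("ceyy", form, "PRES"), conjugate("ceyy", form, "FUT"))
```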

3.34 Continuous Tense

The continuous tense denotes the continuity of the action denoted by the verb. The continuous tense is marked by adding the compound auxiliary verb ‘konTirikk’ to the past participle form of the main verb.

ceyy+CONT+PAST → ceytu+konTirikk+nnu → ceytukonTirunnu
ceyy+CONT+PRES → ceytu+konTirikk+unnu → ceytukonTirikkunnu
ceyy+CONT+FUT → ceytu+konTirikk+um → ceytukonTirikkum

3.35 Perfect Tense

Perfect tense is realized by adding to the past participle form of the verbs three types of compound auxiliaries, ‘iTTuNTAyirunnu, iTTuNT and iTTuNTAvum’, forming past perfect, present perfect and future perfect respectively.

ceyy+PERF+PAST → cey+tu+iTTuNTAyirunnu → ceytiTTuNTAyirunnu
ceyy+PERF+PRES → cey+tu+iTTuNT → ceytiTTuNT
ceyy+PERF+FUT → cey+tu+iTTuNTAvum → ceytiTTuNTAvum

3.36 Perfect Continuous Tense

It is realized by adding a complex auxiliary string ‘koNTirukkukayAyiru’ to the past participle form of the verb.

paTi+PERFCONT+PAST → paTi+cci+konTirukkukayAyiru+unnu → paTiccikonTirukkukayAyirunnu
paTi+PERFCONT+PRES → paTi+cci+konTirukkukayAyirikku+unnu → paTiccikonTirukkukayAyirikkunnu
paTi+PERFCONT+FUT → paTi+cci+konTirukkukayAyirikku+um → paTiccikonTirukkukayAyirikkum

3.37 Voice (prayOga)

Voice is divided into two categories: Active and Passive. Passive voice is marked and active voice is unmarked. The passive voice is realized by adding the auxiliary verb ‘peT’ to the infinitive form of a transitive or causative verb.

ceyy+INF+PASS+PAST → ceyy+a+peT+Tu → ceyyappeTTu
ceyy+INF+PASS+PRES → ceyy+a+peT+unnu → ceyyappeTunnu
ceyy+INF+PASS+FUT → ceyy+a+peT+um → ceyyappeTum

3.38 Auxiliary Verbs

The auxiliary verbs such as can, may, should and could are handled by adding the suffixes ‘AM, EkkAM, aNaM and Anpatti’ respectively.

ceyy+AUX_CAN -> ceyy+AM -> ceyyAM
ceyy+AUX_MAY -> ceyy+tu+EkkAM -> ceytEkkAM
ceyy+AUX_SHOULD -> ceyy+aNaM -> ceyyaNaM
ceyy+AUX_COULD -> ceyy+Anpatti -> ceyyAnpatti

3.39 Negation

The negation is marked by adding the suffix ‘illa’ to the tense forms of the verbs.

ceyy+PAST+NEG_NOT -> ceyy+tu+illa -> ceytilla

3.40 Question Verbs

The ‘yes’ or ‘no’ questions are marked by adding the suffix ‘O’ to the tense forms of the verbs.

ceyy+PAST+QUES -> ceyy+tu+O -> ceytO

3.41 Infinitive Verbs

The infinitive forms of verbs are marked by adding the suffix ‘An’ to the base forms of the verbs, without any tense.

ceyy+INF -> ceyy+An -> ceyyAn

CHAPTER 4

IMPLEMENTATION OF RULE BASED MACHINE TRANSLATION SYSTEM

A rule-based machine translation system for the English to Malayalam language pair has been developed (Model). Each module of the system is discussed in detail in this chapter. The block diagram of the system is shown in fig.4.1 below:

English sentence (input source text) -> Stanford English Parser -> English-Malayalam Bilingual Dictionary -> English-Malayalam Transliterator -> Malayalam Morphological Synthesizer -> Reordering by Transfer Rules -> Malayalam sentence (output target text)

Fig.4.1 Block Diagram of English-Malayalam Rule Based Machine Translation System

4.1 English Parser

4.1.1 Introduction

A parser is an algorithm which produces a syntactic structure for a given input. The parser is the first component of the rule based machine translation system and it is used on the source (English) side. The statistical Stanford Parser, based on a probabilistic context free grammar (PCFG), is used in the system. The English sentence is given directly to the parser without preprocessing. The Penn Treebank tagset of POS tags used in the Stanford Parser is given in Appendix – A.1.

4.1.2 Usage

The Stanford Parser is used for four main purposes in the machine translation system.

i. The parser is used for syntactic analysis of the English sentence, giving the parse tree structure of the sentence according to the context free grammar. An example of parsing is shown in fig.4.2 below:

(ROOT (S (NP (PRP I)) (VP (VBP am) (VP (VBG writing) (NP (DT a) (NN book))))))

Fig.4.2 Sample Parse Tree for the sentence “I am writing a book”

This tree structure is required for re-ordering the source (English) sentence with respect to the target (Malayalam) sentence by transfer rules.

ii. The parser is used for Parts of Speech (POS) tagging of the English sentence, giving the English words and their corresponding POS tags based on the Penn Treebank tag set. An example of POS tagging is as follows:

I will bring the pen -> I(PRP) will(MD) bring(VB) the(DT) pen(NN)

These POS tagged words are used to search for the target equivalent of each English word in the bilingual dictionary, to synthesize the morphology of Malayalam words and also to reorder the English text with respect to the Malayalam text.

iii. The parser is used for stemming the words of the English sentence, to get their corresponding root words. An example of stemming is as follows:

children are playing -> (children-child) (are-be) (playing-play)

The root words of English obtained after stemming are used to find the equivalent Malayalam words in the bilingual dictionary.

iv. The parser is used for the morphological analysis of the words in the English sentence, to get the morphology of the English words.

I am going to Chennai -> (I) (go+V+PRES+CONT) (Chennai+N+DAT)

The morphological information of English is used in the morphological synthesis of the equivalent Malayalam words.
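Downstream modules consume the tagged words as (word, tag) pairs. As a minimal sketch, assuming the tagged sentence is available as a string in the `word(TAG)` layout of the example above, the pairs could be recovered as follows; the function name `parse_tagged` and the string layout are assumptions made for illustration, not the parser's own output API.

```python
import re

# One token looks like word(TAG); tags are upper-case Penn Treebank labels.
TOKEN = re.compile(r"(\S+?)\(([A-Z$]+)\)")


def parse_tagged(text):
    """Return a list of (word, tag) pairs from a 'word(TAG) word(TAG)' string."""
    return TOKEN.findall(text)


pairs = parse_tagged("I(PRP) will(MD) bring(VB) the(DT) pen(NN)")
print(pairs)
```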

4.2 English to Malayalam Transliteration

4.2.1 Introduction

Transliteration is the process of labeling the text of one language with the characters of another. In English to Malayalam transliteration, the English text is replaced with Malayalam text while preserving its spelling. The SVM based Multilingual Amrita English-Malayalam Transliteration tool [21], developed by Amrita – CEN, is used in the machine translation system.

4.2.2 Preparation

The Amrita English-Malayalam transliteration system is implemented using SVMTool. First, a corpus of English words is collected and preprocessed. The preprocessing involves two-level romanization, segmentation and alignment. The English words are romanized into Malayalam words by English-Malayalam mapping. The romanized Malayalam words are then romanized back to English by Malayalam-English mapping. The regular English words and the romanized English words are segmented phonetically into digrams, trigrams, etc. In the segmented parallel corpus, one-to-one alignments are kept as such, while other alignments are made with the help of empty variables (^). The system is trained with 20000 names using SVMlight and the model is tested with 1000 names using SVMTeval, giving an accuracy of 90% [21].

4.2.3 Usage

In machine translation, proper nouns such as person names and place names, and named entities, may not have equivalent Malayalam words in the bilingual dictionary. In such cases, the translation system will not produce good output. Such words should not be translated; they have to be transliterated. The Amrita English-Malayalam Transliteration tool is used for transliteration in the rule based machine translation system. The transliteration is invoked after parsing the English sentence with the Stanford Parser, because only after parsing can the proper nouns be identified by their POS tags, NNP (proper noun singular) or NNPS (proper noun plural); an easier way is to identify the words written with capital letters. Any word with the ‘NNP’ or ‘NNPS’ POS category is directly transliterated without entering the other translation modules. The transliteration serves three main purposes in the machine translation system:

i. The transliteration is used for transliterating proper nouns such as place names and person names. Since we are using the machine translation system for Indian languages, the transliteration system is trained with local places and persons in India and Kerala. The sample output for transliteration of proper nouns is as follows:

Harshawardhan -> ഹ ഷവ ധന്
Pondichery -> െപാnിെഛര ്

ii. The transliteration is used for transliterating named entities such as organization names and university names. The machine translation system should not translate such names, because their words could be mapped to their equivalent meanings in Malayalam and the prepositions in between might also be morphologically processed with the words. The sample output for transliteration of named entities is as follows:

CEN -> െസന്
Renault -> െരെനൗ ്

iii. The transliteration is used for transliterating the words that don’t have equivalent Malayalam words. Some English words don’t require equivalents in Malayalam because they might be contributing to the morphological processing of Malayalam. The machine translation system will not give any equivalent for a word that is not available in the dictionary. In such cases, those words are transliterated to notify the user that the word has to be added to the dictionary. This transliteration helps to improve the system accuracy by aiding manual testing. The sample output for transliteration of words not available in the dictionary is as follows:

Cytoplasm -> െ ാp മ്
Chromosome -> െ ഛാെമാെസാെമ
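The three uses of transliteration reduce to one routing decision per word: proper nouns bypass the dictionary, and dictionary misses fall back to transliteration. The sketch below illustrates that decision under stated assumptions: `transliterate` is a hypothetical stand-in for the Amrita tool that only marks the word, the dictionary is a plain Python mapping, and the single entry (`pen` to romanized `pEna`) is illustrative.

```python
PROPER_NOUN_TAGS = {"NNP", "NNPS"}


def transliterate(word):
    """Hypothetical stand-in for the transliteration tool: just marks the word."""
    return f"<translit:{word}>"


def translate_word(word, tag, dictionary):
    """Route one POS-tagged English word to transliteration or dictionary lookup."""
    if tag in PROPER_NOUN_TAGS:
        # Proper nouns and named entities never enter the other modules.
        return transliterate(word)
    target = dictionary.get((word.lower(), tag))
    if target is None:
        # Missing entry: transliterate so the gap is visible to the user.
        return transliterate(word)
    return target


# Illustrative dictionary keyed by (root word, POS tag).
demo_dictionary = {("pen", "NN"): "pEna"}
print(translate_word("Harshawardhan", "NNP", demo_dictionary))
print(translate_word("Cytoplasm", "NN", demo_dictionary))
print(translate_word("pen", "NN", demo_dictionary))
```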

4.3 English-Malayalam Bilingual Dictionary

4.3.1 Introduction

The dictionary contains words and their corresponding meanings. A bilingual dictionary has the words in one language and their meanings in the other. The English-Malayalam bilingual dictionary is used in the machine translation system for translating English words to equivalent Malayalam words.

4.3.2 Preprocessing

Around 21,000 English-Malayalam bilingual entries and more than 40,000 Malayalam-English bilingual entries have been collected. The bilingual data is manually typed and preprocessed. The preprocessing of the dictionary goes through various stages depending on the data.

i. Font-converting – The Malayalam data has to be in a Unicode font for the system to process it. Font converters are built for converting many Malayalam fonts into Unicode (UTF-8) format by distinct mappings.

ii. Aligning – The English words have to be aligned with the equivalent Malayalam words with respect to their meanings. The inflections in the data are also removed and only the root words are taken.

iii. POS tagging – The POS category of each English-Malayalam bilingual pair has to be tagged. The same word may belong to many POS categories, and all the categories of that word have to be tagged.

iv. Lexicalizing – This is the most important stage of preprocessing. If the Malayalam meaning of an English word is a detailed description, there will be ambiguity in the morphology generation of the word, and the reordered words may not give a readable output. The challenge in creating a bilingual dictionary for machine translation is to find a one-word Malayalam equivalent for each English word. So, most of the Malayalam words of the dictionary are manually lexicalized.

v. Adding synonyms – One English word may have one or more equivalent Malayalam words (depending on the sense) in the dictionary. The primary lexicon (first sense) of the Malayalam word is stored as the equivalent target word for English and all other secondary lexicons (other senses) are stored as synonyms in the dictionary along with the primary lexicon. The synonyms of the Malayalam word are stored along with the secondary lexicons in the dictionary.

vi. Removing duplicates – Duplicate English words with the same Malayalam equivalents have to be removed so that a single English word can be matched and its equivalent retrieved.

After preprocessing, the entire dictionary has to be checked manually.

4.3.3 Implementation

The preprocessed bilingual dictionary is loaded into a database served by a MySQL server. Based on the POS categories, the dictionary is separated into seven different databases: Noun, Verb, Adjective, Adverb, Pronoun, Preposition and General (excluding reordering rules, auxiliary tense and Stanford dependencies). Their names suggest the content of each database, while the General category stores all other POS categories such as conjunctions, interjections, determiners, particles, cardinals, etc. Each database has five fields: source, target, category, feature and synonym.

i. The field ‘source’ stores the English words.

ii. The field ‘target’ stores the Malayalam words.

iii. The field ‘category’ stores the POS category of the source and target words.

iv. The field ‘feature’ stores the person-number-gender (PNG) marker, which is not required for Malayalam, so the feature column is left empty.

v. The field ‘synonym’ stores the synonyms of the Malayalam words for that particular English word.

In all the databases the fields are set to the type ‘varchar’ and the source field is set as the primary key, because the English words have to be unique for better search and retrieval.
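The five-field layout with `source` as primary key can be sketched as follows. This is only an illustration under stated assumptions: Python's built-in `sqlite3` module stands in for the MySQL server, the table name `noun`, the column sizes and the helper names are hypothetical, and the single entry (`tree` to romanized `maraM`) is illustrative.

```python
import sqlite3


def build_noun_table(conn):
    """Create one POS-specific table with the five fields described above."""
    conn.execute(
        """CREATE TABLE noun (
               source   VARCHAR(255) PRIMARY KEY,  -- English word, must be unique
               target   VARCHAR(255),              -- Malayalam equivalent
               category VARCHAR(32),               -- POS category
               feature  VARCHAR(32),               -- PNG marker, left empty
               synonym  VARCHAR(255)               -- secondary Malayalam lexicons
           )"""
    )


def lookup(conn, word):
    """Return the Malayalam target of an English root word, or None."""
    row = conn.execute(
        "SELECT target FROM noun WHERE source = ?", (word,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
build_noun_table(conn)
conn.execute("INSERT INTO noun VALUES (?, ?, ?, ?, ?)", ("tree", "maraM", "NN", "", ""))
print(lookup(conn, "tree"))
```

Because `source` is the primary key, inserting a second row for the same English word is rejected, which mirrors the duplicate-removal requirement of the preprocessing stage.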

4.4 Malayalam Morphological Generator

4.4.1 Introduction

The morphological synthesizer adds morphology to the words. A bi-directional Morphological Generator cum Morphological Analyzer has been developed for Malayalam, for synthesizing the morphology of Malayalam words. A Finite State Transducer (FST) is used to model the morphology, and the orthographic rules of Malayalam are written. The FST based Malayalam morphological synthesizer is used in the machine translation system.

4.4.2 Preparation

Around 1000 Malayalam nouns and 1500 Malayalam verbs belonging to various stem ends have been collected. The rule based approach is followed here. The words are distributed based on their stem endings. The collected words are manually inflected for different types of inflections. There are approximately 7500 inflected forms for Malayalam nouns and 12500 inflected forms for Malayalam verbs.

The inflections considered for nouns are: Plural markers; Case Markers: Accusative, Dative, Locative, Instrumental, Genitive, Ablative, Sociative, Benefactive; Adjectivization, Adverbalization, Cliticization.

The inflections considered for verbs are: Transitive marker, Causative marker; Tense Markers: Simple Present, Simple Past, Simple Future, Continuous Present, Continuous Past, Continuous Future, Present Perfect, Past Perfect, Future Perfect, Present Perfect Continuous, Past Perfect Continuous, Future Perfect Continuous; Passive, negative, question types and infinitive verbs.

The inflected word forms are manually analyzed in order to model the Morphological Generator for Malayalam, which is implemented in the open source software OpenFst-1.2.7. The linguistic aspects of Malayalam morphology and the orthographic rules are discussed in the previous chapter.

4.4.3 Building FST Model

The morph synthesizer is used in the rule based machine translation system for English to Malayalam translation. The morphological information about the English words is transferred to the Malayalam words. The Stanford Parser is used to stem the English words of the input sentence and also to get the morphologically analyzed information. The equivalent Malayalam words are extracted from the English to Malayalam bilingual dictionary. The target words are romanized by Unicode to Roman character mapping and given as input to the FST. We therefore have the romanized target word along with the morphological information from the source side. The dependency information from the parser is converted to the format required for morphological synthesis, and this dependency transfer information is stored separately for nouns and verbs in the database. The FST model is studied from [22], where FST based morphology is used for speech recognition.

The input to the FST is considered as a regular expression. For example, the plural marking for the noun maraM is written as:

maraM+NOUN+PLURAL
FST input: maraM+N+PL

Here ‘+’ is used for our convenience and is stored separately in a variable to distinguish it from the regular expression operator ‘+’. All the romanized alphabets of Malayalam are categorised as variables for use in the FST. The next step is to build the FST. Let us consider the plural marker of a noun as an example to build the FST. The working of the FST will also be explained later, with the example of a verb in present perfect continuous form.

The romanized Malayalam word along with the morphological information from the source parser is taken as input. The romanized characters are stored in a variable and the FST transition from state ‘a’ to state ‘b’ takes place, which is given in fig.4.3.

a --[maraM]--> b

Fig.4.3 First Transition of FST

In the next transition, the Parts of Speech category (N for noun) of the word is taken. No inflection occurs for Malayalam nouns in this transition from state ‘b’ to ‘c’, which is given in fig.4.4, so it is marked with ‘epsilon’:

a --[maraM]--> b --[+N : ε]--> c

Fig.4.4 Second Transition of FST

For the next transition, the morphological information from the source language is considered, in our case the plural marking. The inflection for plural has to be added to the Malayalam noun during the transition from state ‘c’ to ‘d’, which is given in fig.4.5: for ‘human’ nouns ‘mAr’ has to be appended and for the rest ‘kaL’. Since we use more words other than ‘human’ nouns, we define the plural marking as ‘kaL’. Also, the ‘+’ sign is replaced with the ‘~’ sign for identification of markings:

a --[maraM]--> b --[+N : ε]--> c --[+PL : ~kaL]--> d

Fig.4.5 Third Transition of FST

The final step is to define the final state ‘d’, which is given in fig.4.6:

a --[maraM]--> b --[+N : ε]--> c --[+PL : ~kaL]--> d (final state)

Fig.4.6 Final Transition of FST

It is observed that it is always better to have separate final states for the FST for future use, if we are going to extend the model.

Output of FST: maraM+N+PL -> maraM~kaL

Thus the FST model for plural marking of Malayalam nouns has been built. The important aspect in modeling the FST is to follow the morphotactics. The morphotactics of Malayalam are discussed in the previous chapter.
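The plural-marking transducer built above can be imitated with a small transition table. The sketch below is written in plain Python rather than with OpenFst and is only an illustration under stated assumptions: the table layout and the function `run_fst` are hypothetical, the stem is copied to the output on the first transition, ‘+N’ emits nothing (epsilon), and ‘+PL’ emits ‘~kaL’.

```python
# state -> {input tag: (next state, emitted output)}; state 'a' -> 'b' copies the stem.
TRANSITIONS = {
    "b": {"+N": ("c", "")},        # POS category: epsilon output
    "c": {"+PL": ("d", "~kaL")},   # plural marker, '+' replaced by '~'
}
FINAL_STATES = {"d"}


def run_fst(analysis):
    """Map e.g. 'maraM+N+PL' to 'maraM~kaL' by walking the transition table."""
    stem, *tags = analysis.split("+")
    state, output = "b", stem      # transition a -> b consumes and copies the stem
    for tag in tags:
        state, emitted = TRANSITIONS[state]["+" + tag]
        output += emitted
    if state not in FINAL_STATES:
        raise ValueError(f"input {analysis!r} does not end in a final state")
    return output


print(run_fst("maraM+N+PL"))
```

The resulting ‘maraM~kaL’ still carries the ‘~’ marker; turning it into a surface form is the job of the orthographic rules.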

4.4.4 Writing orthographic rules

As Malayalam is an inflectionally rich language, Malayalam words have to be classified into different categories by defining them with different sets of orthographic rules. The linguistic aspects of the orthographic rules are seen in the previous chapter. The computational aspects of the orthographic rules are discussed in this chapter. The rule notation of Chomsky and Halle is followed for Malayalam orthographic rules. The rule ‘a -> b / c _ d’ states that ‘a’ is replaced with ‘b’ in the position between ‘c’ and ‘d’, when a word ending with ‘c’ joins a word starting with ‘d’.

For plural marking, the nouns are categorized based on the following phonological endings of stems and semantic forms: (consonants), (u|U), (M), (general), (human); their orthographic rules are given in table.4.1 as follows:

Table.4.1 Sandhi rules for various stem endings of Malayalam nouns

i. Stem end [human]: amma[b2]kaL -> ammamAr / __
ii. Stem end M: M[b2]k -> angng / __ aL
iii. Stem end [u or U]: [b2] -> k / [u|U] __ kaL
iv. Stem end [consonants]: [b2] -> u / [CON] __ kaL
v. Stem end [general]: [b2] -> [] / __ kaL

Here the variable [b2] denotes the ‘~’ sign, the variable [CON] denotes all consonants of Malayalam, and [] (epsilon) is an empty variable. The suffixes are phonologically, morphologically as well as semantically conditioned.

The rule (i) states that the human nouns are marked with ‘mAr’ and the ‘kaL’ that is added during pluralization has to be replaced with ‘mAr’. Example: accan~kaL -> accanmAr

The rule (ii) states that the nouns with ‘M’ stem ends are marked with ‘kaL’, where ‘M~k’ is replaced with ‘angngaL’. Example: maraM~kaL -> marangngaL

The rule (iii) states that the nouns with ‘u’ or ‘U’ stem ends are marked with ‘kaL’, where ‘~’ is replaced with ‘k’. Examples: pasu~kaL -> pasukkaL ; pU~kaL -> pUkkaL

The rule (iv) states that the nouns with ‘consonant’ stem ends are marked with ‘kaL’, where ‘~’ is replaced with ‘u’. Example: kall~kaL -> kallukaL

The rule (v) states that the nouns with all other stem ends are marked with ‘kaL’, where ‘~’ is replaced with nothing in general.

The four categories (consonants), (u|U), (M), (general) are common for all inflections of Malayalam nouns. The two other categories to be considered are (a|A,i|I,e|E,o|O) and (aL), which results from the plural marking by the morphotactics of Malayalam. Therefore six categories are sufficient for marking each inflection of Malayalam nouns.

The order of the orthographic rules is important. The special rules have to be considered before the general rules. The exceptions are taken as special rules as they don’t follow the general rules. For example, in the case of ‘accan’ the special rule overrides the general rule, even though the word ends with a consonant sound.
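The rule ordering described above, special rules before general ones, can be sketched as an ordered chain of checks on the ‘~kaL’-marked form. This is a minimal illustration in plain Python, not the OpenFst rule files of the system: the function name, the small `HUMAN_NOUNS` set and the vowel inventory are assumptions, rule (ii) is implemented as its worked example shows (maraM~kaL becoming marangngaL), and any romanized letter that is neither a vowel nor ‘M’ is treated as a consonant.

```python
HUMAN_NOUNS = {"accan", "amma"}   # illustrative list of 'human' nouns
VOWELS = "aAiIuUeEoO"             # assumed romanized vowel inventory


def apply_plural_sandhi(marked):
    """Resolve 'stem~kaL' into a surface plural, trying special rules first."""
    stem, _, suffix = marked.partition("~")
    if not stem or suffix != "kaL":
        raise ValueError(f"expected 'stem~kaL', got {marked!r}")
    if stem in HUMAN_NOUNS:        # rule i: human nouns take 'mAr'
        return stem + "mAr"
    if stem.endswith("M"):         # rule ii: maraM~kaL -> marangngaL
        return stem[:-1] + "ngngaL"
    if stem[-1] in "uU":           # rule iii: '~' becomes 'k'
        return stem + "kkaL"
    if stem[-1] not in VOWELS:     # rule iv: consonant end, '~' becomes 'u'
        return stem + "ukaL"
    return stem + "kaL"            # rule v: general case, '~' is dropped


for form in ("accan~kaL", "maraM~kaL", "pasu~kaL", "pU~kaL", "kall~kaL"):
    print(form, "->", apply_plural_sandhi(form))
```

If the human check were moved below the consonant check, ‘accan~kaL’ would wrongly come out as ‘accanukaL’, which is why the ordering matters.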

The same FST strategies that are applied to Malayalam nouns are also applicable to Malayalam verbs, and an FST model is built for Malayalam verbs too. The categories considered for Malayalam verbs are different from the groups of Malayalam nouns. For Malayalam nouns, a total of 35 stem ends are considered and grouped into 7 categories in order to optimize the rules of all inflections, whereas for Malayalam verbs, a total of 71 stem ends are considered and grouped into 10 categories, the majority of which cause inflections in the past tense markings.

The ‘uka’ forms of Malayalam verbs are taken as base forms because they help in explaining the morphophonemic changes of the verbal inflections, for example ‘ceyyuka’. Before the morph is generated, the ‘uka’ suffix has to be removed. Consider the Malayalam word ‘paRayuka’: its root is ‘paRa’ and the corresponding future tense form is ‘paRayum’. The form ‘paRayum’ can be easily derived from ‘paRayuka’ as it has ‘y’ in it. As already mentioned, the past tense marker brings most of the inflections in Malayalam verbs. Some of the rules for the past tense marker are given in table.4.2:

Table.4.2 Sandhi rules for various stem endings of Malayalam verbs in past tense

i. [b2] -> Ayirunn / iTTuNT __ u
ii. [b2] -> unn / koNTirikk __ u
iii. koNTAyirikk[b2] -> koNTAyirunn / __ u
iv. pOk[b2]u -> pOyi / __
v. kAN[b2] -> kaNT / __ u
vi. var[b2] -> vann / __ u
vii. tar[b2] -> tann / __ u
viii. tinn[b2] -> tinn / __ u
ix. r[b2] -> rnn / __ u
x. zh[b2] -> N / __ u
xi. ll[b2] -> nn / __ u
xii. yy[b2] -> yt / __ u
xiii. y[b2] -> nj / __ u
xiv. iT[b2] -> iTT / __ u
xv. ikk[b2] -> icc / __ u
xvi. rkk[b2] -> rtt / __ u
xvii. ukk[b2] -> utt / __ u
xviii. akk[b2] -> ann / __ u
xix. lkk[b2] -> nn / __ u
xx. [b2]u -> i / __

The first eight rules are written as special rules that deviate from the general rules, whereas the next twelve are general rules common to most of the verbs that are considered. The first three rules are written to aid the continuous tense marker, perfect tense marker and perfect continuous tense marker, because these three markers first convert the verb into the past tense, then create inflections depending on continuous or perfect or both, and then move again to the tense marker. All other Malayalam verb tense markers and inflections have only three to four common rules.

4.4.5 Working of FST

The working of the FST by following the Sandhi rules is discussed below with the example of the Malayalam verb ‘ceyyuka’ in present perfect continuous form.

Input: ceyyuka + Present Perfect Continuous Tense
Input to FST: ceyyuka+V+PAST+PERFCONT+PRES

i. FST reads the input:
Output: ceyyuka

ii. FST goes to the next state (POS category verb, ‘V’ state), checks for the rule, and then truncates ‘uka’ by the rule:
Rule 1: uka[b2]uka -> [] / __
Output: ceyyuka => ceyy

iii. FST moves to the ‘PAST’ tense state and replaces ‘yy’ by ‘yt’ by the rule:
Rule 2: yy[b2] -> yt / __ u
Output: ceyy => ceytu

iv. FST transitions to the ‘PERFCONT’ tense state and appends ‘koNTAyirikk’ by the rule:
Rule 3: [b2] -> [] / __ koNTAyirikk
Output: ceytu => ceytukoNTAyirikk

v. FST comes back to the tense states, now the ‘PRESENT’ state, and appends ‘unnu’ by the rule:
Rule 4: [b2] -> [] / __ unnu
Output: ceytukoNTAyirikk => ceytukoNTAyirikkunnu

FST final output of Malayalam Morph Generator: ceytukoNTAyirikkunnu

By employing the four rules in morphotactic order, we get the correct output for the given input. The built FST model is directly used in the rule based machine translation system. The parser dependencies are converted into the required format of inflection marker inputs of the FST.
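The four-rule walk-through can be traced step by step in code. The sketch below hard-codes only the rules used in this example and is an assumption-laden illustration, not the FST itself: the function name is hypothetical, only the ‘yy’ stem class (Rule 2) is handled for the past tense, and the ‘~’ marker bookkeeping is omitted.

```python
def generate_present_perfect_continuous(base):
    """Return the intermediate forms for a 'yy'-class verb given in its 'uka' form."""
    steps = [base]
    # Rule 1: truncate the 'uka' suffix of the base form.
    stem = base[:-3] if base.endswith("uka") else base
    steps.append(stem)
    if not stem.endswith("yy"):
        raise ValueError("only the 'yy' stem class is sketched here")
    # Rule 2: past tense, 'yy' is replaced by 'yt' before 'u'.
    past = stem[:-2] + "yt" + "u"
    steps.append(past)
    # Rule 3: append the perfect continuous auxiliary.
    perfcont = past + "koNTAyirikk"
    steps.append(perfcont)
    # Rule 4: append the present tense suffix.
    steps.append(perfcont + "unnu")
    return steps


print(" => ".join(generate_present_perfect_continuous("ceyyuka")))
```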

4.5 Malayalam Morphological Analyzer

4.5.1 Introduction

The morphological analyzer analyzes the morphology of words. With a similar FST model, the morph analyzer is built for Malayalam nouns and verbs based on the morphotactics. This is the added advantage of using FST in building the Malayalam morphological generator. The Malayalam morph analyzer gives multiple outputs in the order of expected results because of the multiple connected paths in the FST model. Although we have built the Malayalam morph analyzer, it is not used in the rule based machine translation system. But the morph analyzer could be useful for future purposes such as Malayalam to English machine translation.

4.5.2 Working

In general, the first output of the morph analyzer is taken as the exact output amongst all. The morph analyzer is discussed with the example of a Malayalam noun and verb as follows:

FST Morph Analyzer Noun Input: pUkkaL
Expected Output: pU+N+PL
FST Outputs:
i. pU+N+PL
ii. pUkkaL+V
iii. pUkkaL+V+INT
iv. pUkkaLuka+V
v. pUkkaLuka+V+INT

The first output is taken as the exact output. The second and third outputs are logically incorrect. The last two outputs (iv & v) are entirely wrong. This problem of wrong outputs could be solved if we build the morph analyzer separately for nouns and verbs.

FST Morph Analyzer Verb Input: ceytu
Expected Output: ceyy+V+PAST
FST Outputs:
i. ceyy+V+PAST
ii. ceyy+V+INT+PAST
iii. ceyyuka+V+PAST
iv. ceyyuka+V+INT+PAST
v. ceytu+V
vi. ceytu+V+INT
vii. ceytuuka+V
viii. ceytuuka+V+INT

The first four outputs are correct outputs and the first output is taken as the expected output. The last two outputs (vii & viii) are wrong outputs. This problem of wrong outputs in analyzing the Malayalam verbs could be solved by developing the morph generator and analyzer separately.

4.6 Reordering by Transfer Rules

4.6.1 Introduction

In machine translation, reordering denotes the change in the syntactic structure of the source text with respect to the target text. Reordering can be machine learned, executed by rules, or both. Reordering by rules is followed in the machine translation system to rearrange the English sentence into the order of the Malayalam sentence.

4.6.2 Preparation

English is a Subject-Verb-Object (SVO) language, whereas Malayalam is a Subject-Object-Verb (SOV) language. Therefore reordering is necessary for English to Malayalam translation. The pattern based reordering approach proposed for Tamil in [19] is followed for Malayalam. The pattern based reordering is based on the pattern of CFG rules from the parser (Stanford Parser). The transfer rules are written as transfer links, which carry the order of the child nodes from the parse tree. First, a parallel corpus of English and Malayalam has been collected from the open source Kerala Government school texts. The corpus is manually typed, converted to Unicode format and aligned manually, and has around 2500 parallel sentences. These sentences are analyzed for all possible combinations of sentence structures such as simple, compound and complex. Then the reordering rules are written based on the syntactic variation between the structure of the English sentence and that of the Malayalam sentence. At present, we have written around 100 reordering rules for Malayalam. Since Malayalam is a free word-ordered language, the reordering rules are flexible. The rules are loaded into a separate database with two fields: source pattern and target pattern. The source pattern stores the English CFG rules and the target pattern stores the corresponding transfer CFG rules of Malayalam.

4.6.3 Implementation

Reordering forms the last component of the machine translation system. The syntactic information of the English sentence from the Stanford Parser is checked for a match in the database of reordering rules. If the syntactic pattern of the English sentence matches a source rule, then the corresponding Malayalam rule is taken and the source tree structure of the parser is modified with respect to the target rule. If the pattern matches, the transfer rule is applied to the child nodes of all branches in the parse tree. The system output is then syntactically reordered to suit the Malayalam language. We do not transform the parse tree as a whole, which would require one rule for each type of sentence, and the number of sentence types is unbounded. Instead, the tree is transformed branch-wise with respect to the child nodes, so that several transfer rules can be applied to reorder one sentence and the same rules can be reused for other sentences. The reordered output after morphological generation of Malayalam words is displayed as the final output of the machine translation system. Let us see some of the English-Malayalam reordering rules for two different types of sentences and see how the rules are applied step by step in the following examples.

i. The sentence “I am eating an apple” is reordered with respect to Malayalam by executing two re-ordering rules:

Rule – 1: VP (VBP: VP) -> VP (VP: VBP)
Rule – 2: VP (VBG: NP) -> VP (NP: VBG)

(ROOT (S (NP (PRP I)) (VP (VBP am) (VP (VBG eating) (NP (DT an) (NN apple))))))

Fig.4.7 Parse Tree of English Sentence ‘I am eating an apple’

By applying rule – 1, the reordered parse tree is shown in Fig.4.8.a, and by applying rule – 2, the reordered parse tree is shown in Fig.4.8.b.

(ROOT (S (NP (PRP I)) (VP (VP (VBG eating) (NP (DT an) (NN apple))) (VBP am))))

Fig.4.8.a Reordered Parse Tree of English Sentence by executing Rule – 1

(ROOT (S (NP (PRP I)) (VP (VP (NP (DT an) (NN apple)) (VBG eating)) (VBP am))))

Fig.4.8.b Reordered Parse Tree of English Sentence by executing Rule – 2

After applying the above two rules, the English sentence is in the order of the required Malayalam sentence, as shown in Fig.4.8.b.
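The branch-wise application of transfer rules to child nodes can be sketched as a recursive tree walk. The code below is a minimal illustration under stated assumptions, not the system's implementation: the tree is held as nested `(label, children)` tuples instead of parser output, the rule table is a Python dictionary standing in for the rule database, and the names `RULES`, `reorder` and `leaves` are hypothetical.

```python
# (parent label, child label pattern) -> reordered child labels
RULES = {
    ("VP", ("VBP", "VP")): ("VP", "VBP"),   # Rule 1
    ("VP", ("VBG", "NP")): ("NP", "VBG"),   # Rule 2
}


def reorder(node, rules=RULES):
    """Apply matching transfer rules to the children of every branch."""
    label, children = node
    if isinstance(children, str):           # leaf: (POS tag, word)
        return node
    children = [reorder(child, rules) for child in children]
    pattern = tuple(child[0] for child in children)
    target = rules.get((label, pattern))
    if target is not None:
        remaining = list(children)
        ordered = []
        for wanted in target:               # pick children in the target order
            index = next(i for i, c in enumerate(remaining) if c[0] == wanted)
            ordered.append(remaining.pop(index))
        children = ordered
    return (label, children)


def leaves(node):
    """Read the words of a tree from left to right."""
    label, children = node
    if isinstance(children, str):
        return [children]
    return [word for child in children for word in leaves(child)]


TREE = ("ROOT", [("S", [
    ("NP", [("PRP", "I")]),
    ("VP", [("VBP", "am"),
            ("VP", [("VBG", "eating"),
                    ("NP", [("DT", "an"), ("NN", "apple")])])]),
])])

print(" ".join(leaves(reorder(TREE))))
```

Because rules are keyed on a parent label and its child pattern rather than on whole sentences, the same two entries reorder any sentence containing these branches.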

ii. The sentence “I work hard to finish the work and achieve the goal”, whose parse tree is shown in fig.4.9, is reordered with respect to Malayalam by executing the following four re-ordering rules. The reordered parse trees obtained by executing rule-1, rule-2, rule-3 and rule-4 are shown in fig.4.10, 4.11, 4.12 and 4.13 respectively.

Rule – 1: VP (VB: ADJP) -> VP (ADJP: VB)
Rule – 2: ADJP (JJ: S) -> ADJP (S: JJ)
Rule – 3: VP (TO: VP) -> VP (VP: TO)
Rule – 4: VP (VB: NP) -> VP (NP: VB)

The complete set of reordering rules that are written is given in Appendix – B.1. The rules have certain limitations when applied to the source sentences. Some sentences may have the same structure but need to be reordered in different ways. In such cases, the only option is to apply the transfer rule that is written, and such sentences create ambiguity in the machine translation output. This problem is not handled in the present system. It is yet to be handled by separating such ambiguous sentences from the other sentences that follow the general rule: they have to be identified by their lexicons and structure and then pre-processed or post-processed.

ROOT 

S  NP 

VP  VBP 



work 

ADJP JJ 

op

y

PRP 

S

hard 

C

VP

VP

ot

TO to

N

CC

VP 

D

o

VP

VB 

finish 

and NP

VB

DT

NN

the

work

NP 

achieve DT 

NN

the 

goal

Fig.4.9 Parse Tree of English Sentence ‘I work hard to finish the work and achieve the goal’

52   

ROOT 



PRP 

ADJP  JJ 



hard 

work

VP 

CC

N

to 

VP

ot

TO 

C



VBP

y

VP 

op

NP 

VP

VP 

D

o

VB 

finish 

and

NP

VB

DT 

NN

the 

work

NP 

achieve DT

NN 

the

goal 

Fig.4.10 Reordered Parse Tree of English Sentence - 2 by executing Rule – 1 (VP (VB: ADJP)   VP (ADJP: VB))

Fig.4.11 Reordered Parse Tree of English Sentence - 2 by executing Rule – 2 (ADJP (JJ: S) -> ADJP (S: JJ))

Fig.4.12 Reordered Parse Tree of English Sentence - 2 by executing Rule – 3 (VP (TO: VP) -> VP (VP: TO))

Fig.4.13 Reordered Parse Tree of English Sentence - 2 by executing Rule – 4 (VP (VB: NP) -> VP (NP: VB))

CHAPTER 5

RESULTS

5.1 Results of Malayalam morphological generator and analyzer

The details and statistics of the morphological generator system are given in the following tables. Table.5.1 shows the statistics of nouns and table.5.2 shows the statistics of verbs.

Table.5.1 Statistics of morphology for Malayalam nouns

INFLECTIONS COVERED     INFLECTION MARKER
Noun                    -
Plural                  kaL/mAr
Case Markers            -
Nominative case         -
Accusative case         e
Dative case             in
Locative case           il
Sociative case          OT
Instrumental case       Al
Genitive case           uTe
Ablative case           ilninnu
Benefactive case        inuvENTi
Adjectivization         Aya
Adverbalization         Ayi
Clitics                 -
Clitics_um              um
Clitics_E               E
Clitics_A               A
Clitics_O               O
Clitics_tanne           tanne
Clitics_ANO             ANO

NOUN STEMS (END) COVERED / STEMS GROUPED: kha; amma; A; kurunn; aval; al; M; makaL; aL; makaL; M; amma; ngng; u,U; App; nj; a,A,i,I,e,E,o,O; ar; Ol; aL; at; OL; consonants; atu; u; njAn; av; U; ni; Ay; ub; Azh; Ud; chcha; ud; chchat; UN; a; ER; Im; i; k; I; r; Ih; ni; njAn

NOUN STEMS (END) COVERED: TOTAL = 37
STEMS GROUPED: TOTAL = 10
INFLECTIONS COVERED: TOTAL = 21

Total Number of Sandhi Rules Written = 152
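The surface markers of Table.5.1 can be read as suffixes that a generator appends to a noun stem. The sketch below is only a naive illustration under stated assumptions: the slot order (plural, then case or derivation, then clitic), the function name `generate` and the sample stem `kuTTi` are hypothetical, only the `kaL` plural is used, and none of the 152 sandhi rules are applied, so many real forms would need further orthographic changes.

```python
# Surface markers copied from Table.5.1; the tag names are those listed in Appendix C.1.
MARKERS = {
    "PL": "kaL",
    "ACC": "e", "DAT": "in", "LOC": "il", "SOC": "OT", "INS": "Al",
    "GEN": "uTe", "ABL": "ilninnu", "BEN": "inuvENTi",
    "ADJZ": "Aya", "ADVZ": "Ayi",
    "CLI_um": "um", "CLI_E": "E", "CLI_A": "A", "CLI_O": "O",
    "CLI_tanne": "tanne", "CLI_ANO": "ANO",
}

# Assumed slot order: at most one marker per slot, slots taken left to right.
SLOTS = [
    {"PL"},
    {"ACC", "DAT", "LOC", "SOC", "INS", "GEN", "ABL", "BEN", "ADJZ", "ADVZ"},
    {"CLI_um", "CLI_E", "CLI_A", "CLI_O", "CLI_tanne", "CLI_ANO"},
]

def generate(stem, tags):
    """Concatenate the markers for `tags` onto `stem`, enforcing the slot order."""
    word = stem
    slot = 0
    for tag in tags:
        # Skip forward to the first remaining slot that accepts this tag.
        while slot < len(SLOTS) and tag not in SLOTS[slot]:
            slot += 1
        if slot == len(SLOTS):
            raise ValueError(f"tag {tag!r} is unknown or out of order")
        word += MARKERS[tag]
        slot += 1
    return word

print(generate("kuTTi", ["PL", "LOC"]))
```

A real generator would pass the concatenated string through the orthographic rules before output; the sketch stops at plain concatenation.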

Table.5.2 Statistics of morphology for Malayalam verbs

INFLECTIONS COVERED     INFLECTION MARKERS
Verb                    uka
Intransitive            -
Transitive              ikk
Causative               ippikk
Tense                   -
Past Tense              u
Present Tense           unnu
Future Tense            um
Continuous              koNTirikk
Perfect                 iTTuNT
Perfect Continuous      koNTAyirikk
Passive                 appeT
Auxiliary-can           AM
Auxiliary-may           EkkAM
Auxiliary-should        aNaM
Auxiliary-could         Anpatti
Negative                illa
Question                O
Infinite                An

VERB STEMS (END) COVERED / VERBS GROUPED: Azh; ll; ikk; ak; enc; opp; ar; Akk; eNN; Or; rkk; akk; eNT; oRR; yy; al; ER; ott; tt; aL; ERR; rkk; akk,Akk,ukk,; alk; ES; rtt; Okk,ykk; aLL; eT; tt; iT,eT,oT; AN; ett; uk; ng,ngng; aNG; ikk; ukk; y; ANG; Ikk; Ukk; others; ar; iLK; UL; AR; inn; UL; aRR; Int; umm; ARR; ippi; uNG; as; iT; upp; AT; ITT; uR; att; iy; uRR; ATT; iZh; avv; Lmp; ay; ng; Okk; ngng; yy; ntt; zhtt; US; utt; ykk; L; oT; okk; Ak

VERB STEMS (END) COVERED: TOTAL = 71
VERBS GROUPED: TOTAL = 10
INFLECTIONS COVERED: TOTAL = 19

Total Number of Sandhi Rules Written = 103

Table.5.3 Testing results of morph generator for Malayalam nouns

Morphological Generator for Malayalam Nouns
Number of nouns taken for testing               100
Number of inflections considered for testing    7
Total number of nouns in testing corpus         700
Number of correct outputs                       498
Number of wrong outputs                         202
Accuracy (in %)                                 71.14

Table.5.4 Testing results of morph generator for Malayalam verbs

Morphological Generator for Malayalam Verbs
Number of verbs taken for testing               100
Number of inflections considered for testing    12
Total number of verbs in testing corpus         1200
Number of correct outputs                       872
Number of wrong outputs                         328
Accuracy (in %)                                 72.66

Table.5.5 Testing results of morph analyzer for Malayalam nouns

Morphological Analyzer for Malayalam Nouns
Number of nouns taken for testing                   100
Number of inflections considered for testing        7
Total number of inflected nouns in testing corpus   700
Number of correct outputs                           475
Number of wrong outputs                             225
Accuracy (in %)                                     67.85

Table.5.6 Testing results of morph analyzer for Malayalam verbs

Morphological Analyzer for Malayalam Verbs
Number of verbs taken for testing                   100
Number of inflections considered for testing        12
Total number of inflected verbs in testing corpus   1200
Number of correct outputs                           786
Number of wrong outputs                             414
Accuracy (in %)                                     65.5

Table.5.7 Coverage of Malayalam nouns and verbs

Coverage for Malayalam nouns
Number of nouns taken        1000
Number of nouns inflected    7500

Coverage for Malayalam verbs
Number of verbs taken        1500
Number of verbs inflected    12500
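The accuracy figures in table.5.3 to table.5.6 can be reproduced from the counts of correct outputs and test items. The sketch below is illustrative only: the function name and the dictionary are hypothetical, and truncating to two decimals (rather than rounding) is an assumption that happens to match the reported values.

```python
def truncated_accuracy(correct: int, total: int) -> float:
    """Percentage of correct outputs, truncated (not rounded) to two decimals."""
    # Integer arithmetic avoids floating point surprises before the final division.
    return (correct * 10000 // total) / 100

# (correct outputs, items in testing corpus) as reported in the tables.
RESULTS = {
    "noun generator (table.5.3)": (498, 700),
    "verb generator (table.5.4)": (872, 1200),
    "noun analyzer (table.5.5)": (475, 700),
    "verb analyzer (table.5.6)": (786, 1200),
}

for name, (correct, total) in RESULTS.items():
    wrong = total - correct
    print(f"{name}: {correct} correct, {wrong} wrong, {truncated_accuracy(correct, total)} %")
```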

5.2 Discussion about results of Malayalam morphological generator and analyzer

A testing corpus of 100 nouns and 100 verbs has been taken. The nouns include all types of human nouns and stem suffixes, and the verbs include all the different kinds of special categories. The inflections considered are 7 for nouns and 12 for verbs. Therefore there are 700 inflections for nouns and 1200 inflections for verbs for testing. The difference between the testing of the morph generator and the morph analyzer is the testing corpus. For the morph generator, the testing corpus contains word stems and inflection information separately, whereas the testing corpus of the morph analyzer consists of the inflected words: 100 nouns inflected into 700 forms and 100 verbs inflected into 1200 forms. Table.5.3, table.5.4, table.5.5 and table.5.6 show the testing results of the morph generator and analyzer for Malayalam nouns and verbs, and table.5.7 shows their coverage.

At present, the morphological generator system is developed for plural markers, case markers, adjectivization, adverbalization and clitics markers for Malayalam nouns. For Malayalam verbs, it covers base forms, tense markers in all forms, voice, auxiliary, negative types, question types and infinitive verbs.

The morphological synthesizer for Malayalam nouns gives an accuracy of 71.14 percent for the inflections considered. The accuracy could be improved when more human nouns are considered. The morphological synthesizer for Malayalam verbs gives an accuracy of 72.66 percent for the inflections considered. The accuracy could be improved when more special cases of past inflections of verbs are considered. Also, the question and negative types have to be effectively handled for all cases.

The morphological analyzer gives an accuracy of 67.85 percent for Malayalam nouns and 65.5 percent for Malayalam verbs. The factors that affect the accuracy of the morphological analyzer are:

i. When the same type of inflection occurs for different categories, i.e., clitics ‘um’ and future tense ‘um’. Example: ‘pAmpum’ and ‘Odum’.

ii. When different words end up as the same word after morphological generation. Example: ‘avan’, ‘avaL’ => ‘avar’.

iii. When special cases and general cases vary completely with inflections. Example: ‘mAr’ and ‘kaL’. When a human form is given, the output would be with ‘kaL’.

iv. In the case of Malayalam nouns, the first output amongst all is taken as the morph analysed output. In the case of verbs, the output is given in the order of the rules written and ambiguity arises in choosing the best possible output.

These problems of the Malayalam morph analyzer could be solved by developing the morph analyzer and generator separately and having a separate FST model and orthographic rules for each. Inflection markers used in the morphology of Malayalam nouns and verbs are given in Appendix – C.1, C.2. FST state transition tables of Malayalam nouns and verbs are given in Appendix – D.1, D.2. Orthographic rules written for Malayalam nouns and verbs are given in Appendix – D.3, D.4 respectively.
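Factors i and iv above can be illustrated with a toy suffix-stripping analyzer. This is a hedged sketch, not the FST used in the thesis: the `SUFFIXES` table, the function `analyze` and the tuple layout are hypothetical, and the stripped stems are not restored by any orthographic rule.

```python
# One surface suffix shared by two categories: clitic 'um' on nouns, future 'um' on verbs.
SUFFIXES = [
    ("um", "noun", "CLI_um"),
    ("um", "verb", "FUT"),
]

def analyze(word):
    """Return every (stem, category, tag) reading whose suffix matches the word ending."""
    readings = []
    for suffix, category, tag in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            readings.append((word[: -len(suffix)], category, tag))
    return readings

for word in ("pAmpum", "Odum"):
    readings = analyze(word)
    # Taking the first reading regardless of category mirrors the "first output" policy.
    print(word, readings, "chosen:", readings[0])
```

Both words receive the same two readings, so a policy that always returns the first one labels ‘Odum’ as a noun with a clitic; choosing correctly needs information beyond the suffix, such as a stem lexicon.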


5.3 Results of Rule Based Machine Translation System

The details and statistics of the English-Malayalam rule based machine translation system are given in table.5.8.

Table.5.8 Statistics of Rule Based Machine Translation System

Bilingual Dictionary Database Statistics
Adjectives                                3576
Adverbs                                   832
Pronouns                                  90
Prepositions                              100
Verbs                                     4276
Nouns                                     11909
Others                                    195
Total words in Database                   20,978

Dependency for nouns considered           40
Dependency for verbs considered           292
Number of Reordering rules                99
Number of Orthographic rules for nouns    152
Number of Orthographic rules for verbs    103

Table.5.9 Testing results for English-Malayalam machine translation system

Testing of Rule Based Machine Translation system
Number of sentences tested                          757
Number of correctly translated sentences            406
Number of understandable (readable) translations    253
Number of wrong translations                        98
Overall System Accuracy (in %)                      53.63

5.4 Discussion about results of Rule Based Machine Translation System

The testing results of the English-Malayalam rule based machine translation system are given in table.5.9. The test corpus consists of 757 random simple sentences collected from short stories. While testing, the sentences are ranked into three categories: 1) Exact translations, 2) Understandable or readable translations and 3) Wrong translations.

Complex sentences that involve various clauses, exclamations, negations and questions give wrong reordering outputs because they are not handled in the system. Words that are not available in the dictionary, and phrases in sentences, resulted in semantically wrong translations. Consider the example below, where the translation is in the wrong order and the word ‘how’ also got translated.

Eg: He teaches them how to read » aവന് aവെരtെnെയ വായിkുക e െന പഠിpിkുnു

There are some sentences which are semantically correct and correctly reordered but are wrong in that particular context. Also, some translations convey the meaning but not in a clear sense. These kinds of sentences are considered understandable, readable or acceptable translations. Consider the example below, where the translation is correct word by word, but ‘powerful’ and ‘strikes’ are wrong in context.

Eg: Powerful earthquake strikes Philippines » ഫിലിpിേനസകള് വീരനായ ഭൂമികുലുkം-പണിമുടkുകള്

The correct outputs of sentences without any errors are considered exact translations. Consider the example below, where the translation is semantically, morphologically and syntactically exact.

Eg: We celebrated his birthday » ഞ ള് aവന്െറ ജന്മദിനം ആേഘാഷിc

The major problem with the rule based system is the execution of the same rule for different sentences with the same structure that require different transfer rules in the target.

Eg: Please open the door » കതകിെന തുറn സേnാഷിpിkുക

The sentence follows the reordering rule (VP:VB PRT NP) -> (VP:NP PRT VB), which gives syntactically correct output for all other sentences with the same structure, like ‘Wake up the kid’, but the same rule gives wrong output for the above sentence.

The system gives an overall accuracy of 53.63 percent for the exact translations of the test sentences used. The complete list of tested sentences with rankings is given in Appendix – E.1. The system accuracy could be improved by increasing the number of lexical items in the bilingual dictionary, increasing the number of rules and considering more inflections of Malayalam morphology. The accuracy can also be improved by taking care of semantic ambiguity among the lexical items. The semantic ambiguity could be solved by linking the dictionary with the Malayalam WordNet.

The screenshot of the sample output of the machine translation system is shown in fig.5.1. The first text box in the translation window is for typing the input sentence for translation. The second text box gives the corresponding Malayalam translated output when the ‘Translate’ button is pressed. The third box shows the parsed output with the mapping of English words to equivalent Malayalam words along with the reordered structure. The fourth text box shows the morphological generation of Malayalam nouns and verbs. The fifth and final box shows the occurrence of the words of the given input sentence in the bilingual dictionary along with their synonyms, if available. The font used in the ‘Netbeans IDE’ is ‘Arial Unicode MS’ in order to display both Malayalam and English words. The Malayalam alphabets are displayed in a separated fashion because of a font rendering problem with the ‘jdk’. But when the output is displayed in any text editor or html document, the font will be rendered with combinational alphabets of Malayalam, by setting the default font to ML-TT Karthika in the text editor or UTF-8 encoding in html. The screenshot of the online translation system is shown in fig.5.2, where the font of the Malayalam text is rendered properly. The system is available online at http://nlp.amrita.edu:8080/Eng2Mal/ .

The screenshots of sample outputs of the Malayalam morph generator and analyzer for nouns are shown in fig.5.3 and fig.5.4 respectively, and the screenshots of the Malayalam morph generator and analyzer for verbs are shown in fig.5.5 and fig.5.6 respectively. Here the command ‘mg.sh’ executes the morph generator for the given input and the command ‘ma.sh’ executes the morph analyzer for the given input. The input is given along with the executable command and the outputs are displayed on the lines following the input.

5.5 Screenshots

Fig.5.1 Screenshot of Machine Translation system

Fig.5.2 Screenshot of online translation system with proper font rendering of Malayalam text

Fig.5.3 Screenshot of Morph Generator for Malayalam nouns

Fig.5.4 Screenshot of Morph Analyzer for Malayalam nouns

Fig.5.5 Screenshot of Morph Generator for Malayalam verbs

Fig.5.6 Screenshot of Morph Analyzer for Malayalam verbs

CHAPTER 6

CONCLUSION

The rule based machine translation system for English to Malayalam has been developed. The main focus of the thesis in developing the rule based system is the morphological synthesizer. The morphology is modeled as a state diagram with transitions. The reason for this focus is that all other components depend only on source text processing, for which tools are already developed and available as open source to aid the development of machine translation systems from English into various languages. The goal of developing such a translation system is to make the resources available to everyone.

6.1 Limitations

The machine translation system has many unavoidable limitations arising from the NLP modules that are being used. Multiple parse trees are not handled by the Stanford parser and the dependency parser is also not used in the translation system, so handling verbal phrases is not possible in the system. The transliterator is limited to Indian names, so international names give wrong transliterations. The bilingual dictionary lacks word sense information, so semantic ambiguity arises in the system for many words. The morph generator is implemented for certain cases, but the dependency information of many inflectional categories is not given by the parser; such cases work well in the morph generator but not in the translation of sentences. The reordering rules are confined to the nodes of the branches, and the same rule cannot be handled differently for different cases with the same syntactic structure. The system has to be improved by resolving all the ambiguities discussed earlier.

6.2 Applications

The translation system is flexible enough to be further developed for speech access and speech to speech translation with human computer interaction. If a user gives a speech input, the system has to be smart enough to recognize it, translate the documents into the user’s language and read them aloud. Such systems would improve the literacy rate in the country if they are installed at public places like ATM machines or carried as a source of education in mobile schools. With all these hopes we are releasing our machine translation system as open source software, to be used by anyone who wants to contribute to society. The machine translation system will be useful in schools for pedagogical purposes in teaching grammar. The corpus for SMT systems could be created using this RBMT system. The system could be installed in hand held devices such as mobile phones, for easy access to language. The system might be installed in restaurants and hotels for translation of food menus. A good translation system will be able to provide all the educational resources available on the internet in the local language to improve our society.

6.3 Future Work

i. The system can be utilized as such for developing speech systems. A smart and intelligent speech to speech translation system with human computer interaction will have a high impact on society.

ii. The system can be further enhanced by using a massive bilingual dictionary database for a better choice of words.

iii. The major part of morphology is covered; more morphological categories can be handled and further reordering rules can be added.

iv. The phrase based approach can be used to develop a translation system.

v. Rules always evolve as the language evolves. Therefore it is always better to have a statistical machine translation system for prolonged usage.

vi. A Word Sense Disambiguation system can be developed on the target side for Malayalam to English translation to avoid semantic ambiguities.

vii. Translation memories could be used for handling idioms, phrases and proverbs to aid children.

viii. The word alignment system in [23] could be used for handling question and negative type sentences.

ix. More of the data available for the Indian languages could be made available on the internet in the public interest, to favor translation oriented research.

x. Every student should take up translation work as an initiative by contributing data, which still remains a coveted resource for research.

 

REFERENCES

[1] Philipp Koehn, Statistical Machine Translation, Cambridge University Press, 2010.

[2] Durgesh Rao, “Machine Translation in India: A Brief Survey”, in Proc. of SCALLA 2001 Conf., Bangalore, India, 2001.

[3] Chris Manning and Hinrich Schutze, Foundations of Statistical Natural Language Processing, MIT Press, Cambridge, MA, May, 1999.

[4] C. Rahul, K. Dinunath, Remya Ravindran and K. P. Soman, “Rule Based Reordering and Morphological Processing For English-Malayalam Statistical Machine Translation”, in Proc. of Int. Conf. on Advances in Computing, Control, and Telecommunication Technologies (ACT), Trivandrum, India, December, 2009.

[5] P. G. Raji, “Reordering Approach in English-Malayalam Statistical Machine Translation”, Master’s Thesis, Department of Computational Engineering and Networking, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India, July, 2010.

[6] P. Unnikrishnan, P. J. Antony and K. P. Soman, “A Novel Approach for English to South Dravidian Language Statistical Machine Translation System”, in Int. Journal on Computer Science and Engineering (IJCSE), ISSN: 2749-2759, Vol. 2, No. 8, November, 2010.

[7] Mary Priya Sebastian, K. Sheena Kurian and G. Santhosh Kumar, “English to Malayalam Translation: A Statistical Approach”, in Proc. of the 1st Amrita ACM-W Celebration on Women in Computing, India, September, 2010.

[8] Mary Priya Sebastian, K. Sheena Kurian and G. Santhosh Kumar, “Alignment Model and Training Technique in SMT from English to Malayalam”, in Contemporary Computing, Vol. 94, pp. 305-315, 2010.

[9] Rashmi Gangadharaiah and N. Balakrishnan, “Application of Linguistic Rules to Generalized Example Based Machine Translation for Indian Languages”, in Proc. of the First National Symposium on Modeling and Shallow Parsing of Indian Languages (MSPIL), Mumbai, India, April, 2006.

[10] R. Harshawardhan, Mridula Sara Augustine and K. P. Soman, “Phrase based English – Tamil Translation System by Concept Labeling using Translation Memory”, in Int. Journal of Computer Applications (IJCA), ISSN: 0975 – 8887, Vol. 20, No. 3, April, 2011.

[11] R. Harshawardhan, Mridula Sara Augustine and K. P. Soman, “Advanced English – Malayalam Translation Memory for Natural Language Processing Applications”, in Proc. of Nat. Conf. on Indian Language Computing (NCILC), February, 2011.

[12] Kamaljeet Kaur Batra and G. S. Lehal, “Rule Based Machine Translation of Noun Phrases from Punjabi to English”, in Int. Journal of Computer Science Issues (IJCSI), ISSN: 1694-0814, Vol. 7, Issue 5, September, 2010.

[13] Centre for Development of Advanced Computing (CDAC), Annual Report, 2007-2008 [Online]. Available at: http://www.cdacindia.com/html/about/annual.aspx

[14] Remya Rajan, Remya Sivan, Remya Ravindran and K. P. Soman, “Rule Based Machine Translation from English to Malayalam”, in Proc. of Int. Conf. on Advances in Computing, Control and Telecommunication Technologies (ACT), Trivandrum, India, December, 2009.

[15] Rajeev Sangal, S.M. Bendre, Pavan Kumar and Aishwarya, “Unsupervised Improvement of Morphological Analyzer for Inflectionally Rich Languages”, in Proc. of 6th NLP Pacific Rim Symposium, Tokyo, November, 2001.

[16] A.G. Menon, S. Saravanan, R. Loganathan and K. P. Soman, “Amrita Morph Analyzer and Generator for Tamil: a Rule based approach”, in Proc. of Int. Tamil Internet conference 2009, Univ. of Cologne, Germany, October, 2009.

[17] Jisha P. Jayan, R.R. Rajeev and S. Rajendran, “Morphological Analyser for Malayalam - A Comparison of Different Approaches”, in Int. Journal of Computer Science and Information Technology (IJCST), Vol. 2, No. 2, pp. 155-160, December, 2009.

[18] Jisha P. Jayan, R. R. Rajeev and S. Rajendran, “Morphological Analyser and Morphological Generator for Malayalam - Tamil Machine Translation”, in Int. Journal of Computer Applications (IJCA), ISSN: 0975 – 8887, Vol. 13, No. 8, January, 2011.

[19] S. Saravanan, A. G. Menon and K. P. Soman, “Pattern Based English-Tamil Machine Translation”, in Proc. of Tamil Internet conference, Chap. 4, No. 6, pp. 295 - 300, Coimbatore, India, 2010.

[20] R. E. Asher and T. C. Kumari, Malayalam, Routledge, 1997.

[21] Sumaja Sasidharan, R. Loganathan and K. P. Soman, “English to Malayalam Transliteration Using Sequence Labeling Approach”, in Int. Journal of Recent Trends in Engineering, Vol. 1, No. 2, May, 2009.

[22] Daniel Jurafsky and James H. Martin, SPEECH and LANGUAGE PROCESSING - An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Prentice Hall, Second Edition, 2009.

[23] R. Harshawardhan, Mridula Sara Augustine and K. P. Soman, “A Simplified Approach to Word Alignment Algorithm for English-Tamil Translation”, in Indian Journal of Computer Science and Engineering (IJCSE), ISSN: 0976-5166, Vol. 2, No. 1, 2011.

APPENDIX - A

A.1. Penn Treebank Tag set for POS category

Tag     Description                                     Examples
''      closing quotation mark                          ' ''
--      dash                                            --
$       dollar                                          $ -$ --$ A$ C$ HK$ M$ NZ$ S$ U.S.$ US$
(       opening parenthesis                             ( [ {
)       closing parenthesis                             ) ] }
,       comma                                           ,
.       sentence terminator                             . ! ?
:       colon or ellipsis                               : ; ...
``      opening quotation mark                          ` ``
CC      conjunction, coordinating                       & 'n and both but either et for less minus neither
CD      numeral, cardinal                               mid-1890 nine-thirty one-tenth ten million 0.5
DT      determiner                                      all an another any each neither no some such that the
EX      existential there                               there
FW      foreign word                                    gemeinschaft hund ich jeux habeas
IN      preposition or conjunction, subordinating       astride among upon whether out inside pro despite on
JJ      adjective or numeral, ordinal                   third ill-mannered pre-war regrettable oiled calamitous
JJR     adjective, comparative                          bleaker braver breezier briefer brighter brisker broader
JJS     adjective, superlative                          calmest cheapest choicest classiest cleanest
LS      list item marker                                A A. B B. D E F First I J K One SP-44001 SP-44002 Third
MD      modal auxiliary                                 can cannot could couldn't dare may might must
NN      noun, common, singular or mass                  common-carrier cabbage knuckle-duster Casino
NNP     noun, proper, singular                          Motown Venneboerger Czestochwa Ranzer Conchita
NNPS    noun, proper, plural                            Americans Americas Amharas Amityvilles Amusements
NNS     noun, common, plural                            undergraduates scotches bric-a-brac products
PDT     pre-determiner                                  all both half many quite such sure this
POS     genitive marker                                 ' 's
PRP     pronoun, personal                               hers herself him himself hisself it itself me myself
PRP$    pronoun, possessive                             her his mine my our ours their thy your
RB      adverb                                          occasionally unabatingly maddeningly adventurously
RBR     adverb, comparative                             further gloomier grander graver greater grimmer harder
RBS     adverb, superlative                             best biggest bluntest earliest farthest first furthest
RP      particle                                        aboard about across along apart around aside at away
SYM     symbol                                          % & ' '' ''. ) ). * + ,. < = > @ A[fj] U.S U.S.S.R \* \*\* \*\*\*
TO      to as preposition or infinitive marker          to
UH      interjection                                    Goodbye Goody Gosh Wow Jeepers Jee-sus Hubba Hey
VB      verb, base form                                 ask assemble assess assign assume atone attention
VBD     verb, past tense                                dipped pleaded swiped regummed soaked tidied
VBG     verb, present participle or gerund              telegraphing stirring focusing angering judging
VBN     verb, past participle                           multihulled dilapidated aerosolized chaired
VBP     verb, present tense, not 3rd person singular    predominate wrap resort sue twist spill cure
VBZ     verb, present tense, 3rd person singular        bases reconstructs marks mixes displeases seals
WDT     WH-determiner                                   that what whatever which whichever
WP      WH-pronoun                                      that what whatever whatsoever which who whom
WP$     WH-pronoun, possessive                          whose
WRB     Wh-adverb                                       how however whence whenever where whereby

APPENDIX - B

B.1. Hand coded Reordering Rules for RBMT

op

C

N

D


1:3 2:2 3:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:1 1:3 2:2 3:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:3 3:1 1:3 2:1 3:2 1:2 2:1 1:2 2:1 1:2 2:3 3:4 4:5 5:1 1:2 2:1 1:2 2:1 1:3 2:2 3:1 1:3 2:2 3:1 1:4 2:1 3:2 4:3 1:2 2:1 1:2 2:3 3:1 1:2 2:3 3:1 1:2 2:3 3:1 1:2 2:3 3:4 4:5 5:1 1:2 2:1 1:2 2:3 3:4 4:1 1:2 2:3 3:1 1:3 2:1 3:2 1:3 2:2 3:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:1 1:4 2:2 3:3 4:1 1:2 2:3 3:4 4:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:1

y

SBAR NP PP NP SBAR NP VP NP NP IN PP IN S IN NP TO S IN RB S WHADVP S WHNP S WHPP S IN ADVP VP MD VP MD RB VP MD VP TO ADVP S VB* ADJP VB* ADVP VBZ PP ADVP VB* VP ADVP VB* PP VB* CC VB* NP VB* NP ADVP VB* NP NP VB* NP PP VB* NP PP , PP VB* PP VB* PP , PP VB* PP SBAR VB* ADJP VB* RB VP RB VB* S VB* SBAR VB* VP VB* WHNP IN SBAR NP NP VB* ADVP ADJP PP VB* PP ADJP PP JJ S JJ PP VB* SBAR ADVP

ot

NP SBAR NP PP NP SBAR NP VP IN NP IN PP IN S TO NP RB IN S WHADVP S WHNP S WHPP S IN S MD ADVP VP MD RB VP MD VP TO VP VB* ADVP S VB* ADJP VB* ADVP VB* ADVP PP VB* ADVP VP VB* CC VB* PP VB* NP VB* NP ADVP VB* NP NP VB* NP PP VB* NP PP , PP VB* PP VB* PP , PP VB* PP SBAR VB* RB ADJP VB* RB VP VB* S VB* SBAR VB* VP IN WHNP VB* NP NP SBAR VB* ADVP ADJP PP ADJP PP JJ PP JJ S VB* PP ADVP SBAR

o

NP NP NP NP PP PP PP PP SBAR SBAR SBAR SBAR VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP WHPP VP VP ADJP ADJP ADJP ADJP ADVP

RB RB NP PP NP-TMP DT NN CD DT NN S JJ NN S NN S NP NP NP NP QP RB RB CD NN CC NP IN ADVP VB* NP IN CD TO CD RB JJR IN CD ADVP NP VP NP ADVP VP IN S XS WHNP SQ PP VP NP VB* NP ADJP MD NP VP MD NP VP MD RB NP VP

RB RB PP NP-TMP NP CD DT NN S NN DT S JJ NN S NN NP NP NP NP RB QP CD NN RB NP CC ADVP IN NP VB* CD TO CD IN CD IN RB JJR NP ADVP VP NP VP ADVP S IN SX SQ WHNP PP NP PP NP ADJP VB* NP VP MD NP VP MD NP VP RB MD

SQ

S MD NP VP

S NP VP MD

SQ SQ SQ SQ SQ VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP

VB* NP ADJP VB* NP NP VB* NP PP VB* NP VP VB* NP VP ADVP VB* NP ADVP VB* SBAR NN ADVP SBAR VB* ADJP , SBAR VB* ADJP ADVP VB* ADJP PP VB* ADVP ADJP VB* ADVP ADVP VB* ADVP NP VB* ADVP NP-TMP VB* FRAG VB* NP NP-TMP VB* NP PP PP VB* NP SBAR VB* NP-TMP VB* PP ADVP VB* PP NP-TMP

NP ADJP VB* NP NP VB* PP NP VB* NP VP VB* NP VP VB* NP ADVP VB* ADVP SBAR VB* ADVP NN SBAR ADJP VB* , SBAR ADVP ADJP VB* PP ADJP VB* ADVP ADJP VB* ADVP ADVP VB* ADVP NP VB* NP-TMP ADVP VB* FRAG VB* NP-TMP NP VB* PP PP NP VB* NP SBAR VB* NP-TMP VB* ADVP PP VB* PP NP-TMP VB*

op C

ot

N

o D


1:2 2:1 1:2 2:3 3:1 4:4 1:3 2:1 3:2 1:3 2:2 3:1 1:3 2:1 3:2 1:2 2:1 1:2 2:1 1:2 2:1 3:3 1:2 2:1 1:2 2:3 3:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:3 3:4 4:1 1:4 2:3 3:1 4:2 1:2 2:1 3:3 4:4 1:1 2:3 3:2 4:4 1:2 2:1 1:2 2:1 1:2 2:1 3:3 1:1 2:3 3:2 4:4 1:2 2:3 3:1 1:2 2:3 3:1 1:2 2:3 3:1 4:4 1:3 2:4 3:2 4:1 5:5 1:1 2:2 3:4 4:5 5:3 6:6 1:2 2:3 3:1 4:4 1:2 2:3 3:1 4:4 1:3 2:2 3:1 1:2 2:3 3:1 1:2 2:3 3:1 4:4 1:3 2:1 3:2 1:1 2:3 3:2 1:2 2:1 3:3 1:2 2:1 3:3 4:4 1:3 2:2 3:1 1:3 2:2 3:1 1:2 2:3 3:1 1:3 2:2 3:1 1:2 2:3 3:1 1:3 2:2 3:1 1:2 2:1 1:3 2:2 3:1 1:3 2:4 3:2 4:1 1:2 2:3 3:1 1:2 2:1 1:3 2:2 3:1 1:2 2:3 3:1

y

ADVP FRAG NP NP NP NP NP NP NP NP PP PP PP QP QP S S SBAR SBAR SBARQ SINV SINV SQ SQ SQ

VP VP VP VP VP VP VP

VB* PP PP VB* PP S VB* PRT VB* PRT NP VB* PRT NP-TMP PP VB* PRT PP VB* RB NP

PP PP VB* S PP VB* PRT VB* NP PRT VB* NP-TMP PRT PP VB* PP PRT VB* NP VB* RB

1:2 2:3 3:1 1:3 2:2 3:1 1:2 2:1 1:3 2:2 3:1 1:3 2:2 3:4 4:1 1:3 2:2 3:1 1:3 2:1 3:2

   

y

   

op

   

C

     

ot

 

   

N

 

D

 

o

 

             


APPENDIX - C

C.1. Inflection Markers used in morphology of Malayalam Nouns

NOMINAL INFLECTION      INFLECTION MARKER
Noun                    N
Plural                  PL
Accusative case         ACC
Dative case             DAT
Locative case           LOC
Sociative case          SOC
Instrumental case       INS
Genitive case           GEN
Ablative case           ABL
Benefactive case        BEN
Adjectivization         ADJZ
Adverbalization         ADVZ
Clitics_um              CLI_um
Clitics_E               CLI_E
Clitics_A               CLI_A
Clitics_O               CLI_O
Clitics_tanne           CLI_tanne
Clitics_ANO             CLI_ANO

C.2. Inflection Markers used in morphology of Malayalam Verbs

VERBAL INFLECTION       INFLECTION MARKER
Verb                    V
Intransitive            INT
Transitive              TRA
Causative               CAU
Past Tense              PAST
Present Tense           PRES
Future Tense            FUT
Continuous              CONT
Perfect                 PERF
Perfect Continuous      PERFCONT
Passive                 PART
Auxiliary-can           AUX_CAN
Auxiliary-may           AUX_MAY
Auxiliary-should        AUX_SHOULD
Auxiliary-could         AUX_COULD
Auxiliary-would         AUX_CAN
Auxiliary-might         AUX_MAY
Auxiliary-shall         FUT
Negative                NEG_NOT
Question                QUES
Infinite                INF

APPENDIX - D

D.1. FST State Transition Table modeled for Noun Morphotactics

e e f1 f2 f3 f4 f5 f6 f7 f8 g g g g g g g g

ACCUSATIVE DATIVE LOCATIVE SOCIATIVE INSTRUMENTAL GENETIVE ABLATIVE BENEFICIAL

kaL

: : : : : : : :

e in il OT Al uTe ilninnu inuvENTi

C

ot

o D

: :

h h

ADJECTIVIZATION ADJECTIVIZATION

: :

Aya Aya

i i

ADVERBALIZATION ADVERBALIZATION

: :

Ayi Ayi

j j j k1 k2 k3 k4 k5 k6 l l l l l

CLITICS1 CLITICS2 CLITICS3 CLITICS4 CLITICS5 CLITICS6

: : : : : :

um E A O tanne ANO


SURFACE TAPE

y

LEXICAL TAPE CHARACTER SET NOUN PLURAL

op

NEXT STATE b c d

N

CURRENT STATE a b c d: c d e e e e e e e e f1 f2 f3 f4 f5 f6 f7 f8 g: c d h: c d i: c d g j j j j j j k1 k2 k3 k4 k5

k6 l:

l



D.2. FST State Transition Table modeled for Verb Morphotactics a

b

b c:

c

LEXICAL TAPE CHARACTER SET VERB

c d d d e1 e2 e3 f:

d e1 e2 e3 f f f

INTRANSITIVE TRANSITIVE CAUSATIVE

c f k m1 o q1 g g g h1 h2 h3 i:

g g g g g g h1 h2 h3 i i i

D

o

:

uka

: : :

ikk ippikk

: : :

unnu um u

C

N

ot

PRESENT FUTURE PAST

i j k:

j k

CONTINUOUS

:

koNTirikk

c f l m1:

l l m1

PARTICIPLE

:

appeT

i n o:

n o

PERFECT

:

iTTuNT


SURFACE TAPE


NEXT STATE


CURRENT STATE

i

p

p

q1

PERFECT CONTINUOUS

:

koNTAyirikk

: : :

AM AnpaRRi aNaM

q1: r r r r r r s1 s2 s3 t t t

AUX_CAN AUX_COULD AUX_SHOULD

i u v:

u v

AUX_MAY

:

EkkAM

i t v w x1:

w w w x1

NEG_NOT

:

illa

y1 y1 y1 y1 z

QUES

:

O

aa aa bb

INF

:

An


i t v x1 y1 z: c f aa bb:



c f k m1 o q1 r r r s1 s2 s3 t:

D.3. Orthographic Rules for Malayalam Nouns

#Plural "kaL"
ivan[b2]kaL -> ivar / __
avan[b2]kaL -> avar / __
ivaL[b2]kaL -> ivar / __
avaL[b2]kaL -> avar / __
tAn[b2]kaL -> tAngngaL / __
nI[b2]kaL -> ningngaL / __
njAn[b2]kaL -> njangngaL / __
nAM[b2]kaL -> nammaL / __
amma[b2]kaL -> ammamAr / __
accan[b2]kaL -> accanmAr / __
cETTan[b2]kaL -> cETTanmAr / __
cEcci[b2]kaL -> cEccimAr / __
makan[b2]kaL -> makkaL / __
makaL[b2]kaL -> makkaL / __
ammAvan[b2]kaL -> ammAvanmAr / __
ammAyi[b2]kaL -> ammAyimAr / __
muttaccan[b2]kaL -> muttaccanmAr / __
muttaSSi[b2]kaL -> muttaSSimAr / __
ayAL[b2]kaL -> avar / __
iyAL[b2]kaL -> ivar / __
mantri[b2]kaL -> mantrimAr / __
manushyan[b2]kaL -> manushyar / __
rAjAv[b2]kaL -> rAjAkkanmAR / __
tampurAn[b2]kaL -> tampurAkkannmAR / __
tampurATTi[b2]kaL -> tampurATTikaL / __
kalAkAraN[b2]kaL -> kalAkAranmAR / __
anujan[b2]kaL -> anujanmAR / __
anujatti[b2]kaL -> anujattimAR / __
AngngaLa[b2]kaL -> AngaLamAR / __
pengngaL[b2]kaL -> pengnganmAR / __
AN[b2]kal -> ANungngaL / __
peNN[b2]kal -> peNNungngaL / __
AtmAv[b2]kal -> AtmAkkaL / __
addhyApakan[b2]kaL -> addhyApakanmAR / __
addhyApika[b2]kaL -> addhyapikamAR / __
atu[b2]kaL -> ava / __
itu[b2]kaL -> iva / __
bhaTan[b2]kaL -> bhaTanmAR / __
vakkIl[b2]kaL -> vakkIlanmAR / __
purushan[b2]kaL -> purushanmAR / __
bhArya[b2]kaL -> bhAryamAR / __
sahOdaran[b2]kaL -> sahOdaranmAR / __
sahOdari[b2]kaL -> sahOdarimAR / __
ay[b2] -> aya / __ kaL
NN[b2] -> NNu / __ kaL
M[b2]k -> ngng / __ aL
[b2] -> k / (u|U) __ kaL
[b2] -> u / [CON] __ kaL
[b2] -> [] / (a|A|i|I|e|E) __ kaL
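The plural rules above list word-specific forms first and general boundary rules last. One way to apply them is to look up the exceptions before trying the general rewrites. The sketch below illustrates that ordering under stated assumptions and is not the thesis implementation. `[CON]` is taken to mean any symbol outside the listed vowels. The stems `maraM`, `pashu`, `kuTTi` and `kAl` are illustrative transliterations that do not appear in the rule list. The names `EXCEPTIONS`, `GENERAL_RULES` and `apply_plural_rules` are hypothetical.

```python
import re

# Word-specific plural forms copied from the rule list (a small subset).
EXCEPTIONS = {
    "avan[b2]kaL": "avar",
    "nI[b2]kaL": "ningngaL",
    "makan[b2]kaL": "makkaL",
    "amma[b2]kaL": "ammamAr",
}

B = re.escape("[b2]")
# Assumption: [CON] stands for any symbol that is not one of these vowels.
CON = "[^aAiIuUeEoO]"

# General rules in the order given in the rule list: (pattern, replacement).
GENERAL_RULES = [
    (re.compile("ay" + B + "(?=kaL)"), "aya"),
    (re.compile("NN" + B + "(?=kaL)"), "NNu"),
    (re.compile("M" + B + "k(?=aL)"), "ngng"),
    (re.compile("(?<=[uU])" + B + "(?=kaL)"), "k"),
    (re.compile("(?<=" + CON + ")" + B + "(?=kaL)"), "u"),
    (re.compile("(?<=[aAiIeE])" + B + "(?=kaL)"), ""),
]


def apply_plural_rules(form):
    """Rewrite an intermediate 'stem[b2]kaL' form into a surface plural."""
    if form in EXCEPTIONS:
        return EXCEPTIONS[form]
    for pattern, replacement in GENERAL_RULES:
        new_form, count = pattern.subn(replacement, form, count=1)
        if count:
            return new_form  # the first matching rule wins
    return form


for word in ["avan[b2]kaL", "maraM[b2]kaL", "pashu[b2]kaL", "kuTTi[b2]kaL", "kAl[b2]kaL"]:
    print(word, "->", apply_plural_rules(word))
```

Checking the exception table first keeps irregular nouns such as `avan` from being caught by the consonant-final rule.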

# CASE MARKERS

# Accusative "e"
njAn[b2]e -> enne / __
nI[b2]e -> ninne / __
[b2] -> [] / (L) __ e
[b2] -> y / (a|A|i|I|e|E) __ e
[b2] -> v / (u|o) __ e
[b2] -> vin / (U|O) __ e
M[b2] -> tt / __ e
[b2] -> in / [CON] __ e

# Dative "n"
njAn[b2]n -> enikk / __
nI[b2]n -> ninakk / __
[b2]e -> u / n __
[b2]n -> kk / (L) __
[b2]n -> kk / (a|A|i|I|e|E) __
[b2] -> vi / (u|U|o|O) __ n
M[b2][] -> tti / __ n
[b2] -> i / [CON] __ n

# Locative "il"
njAn[b2]il -> ennil / __
nI[b2]il -> ninnil / __
[b2] -> y / (a|A|i|I|e|E) __ il
[b2] -> v / (u|U|o|O) __ il
M[b2] -> tt / __ il
[b2] -> [] / [CON] __ il

# Sociative "OT"
njAn[b2]OT -> ennOT / __
nI[b2]OT -> ninnOT / __
[b2] -> [] / (L) __ OT
[b2] -> y / (a|A|i|I|e|E) __ OT
[b2] -> vin / (u|U|o|O) __ OT
M[b2][] -> ttin / __ OT
[b2] -> in / [CON] __ OT

# Instrumental "Al"
njAn[b2]Al -> ennAl / __
nI[b2]Al -> ninnAl / __
[b2] -> y / (a|A|i|I|e|E) __ Al
[b2] -> v / (u|U|o|O) __ Al
M[b2] -> tt / __ Al
[b2] -> [] / [CON] __ Al

# Genitive "e"
njAn[b2]Te -> enRe / __
nI[b2]Te -> ninRe / __
[b2]Te -> uTe / (L) __
[b2]Te -> yuTe / (a|A|i|I|e|E) __
[b2]Te -> vinRe / (u|U|o|O) __
M[b2]Te -> attinRe / __
[b2]Te -> inRe / [CON] __

# Benefactive "inuvENTi"
njAn[b2]inuvENTi -> enikkuvENTi / __
nI[b2]inuvENTi -> ninakkuvENTi / __
[b2]in -> kk / (L) __ uvENTi
[b2]in -> kk / (a|A|i|I|e|E) __ uvENTi
[b2] -> v / (u|U|o|O) __ inuvENTi
M[b2][] -> tt / __ inuvENTi
[b2] -> [] / [CON] __ inuvENTi

# Ablative "ilninn"
njAn[b2]ilninn -> ennilninn / __
nI[b2]ilninn -> ninnilninn / __
[b2] -> y / (a|A|i|I|e|E) __ ilninn
[b2] -> v / (u|U|o|O) __ ilninn
M[b2] -> att / __ ilninn
[b2] -> [] / [CON] __ ilninn
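The case-marker rules share one notation, `LHS -> RHS / LEFT __ RIGHT`, where `[]` is the empty string and `__` marks the position of the rewritten material. The sketch below shows one way such a rule string could be compiled into a regular expression and applied. It is an illustration, not the rule compiler used in the thesis. It assumes that `[CON]` means "any symbol that is not a listed vowel" and that rules are tried in listed order with the first match winning. `compile_rule` and `apply_rules` are hypothetical names, and the test stems are illustrative transliterations.

```python
import re

# Assumption: [CON] covers every symbol outside this vowel set.
CON = "[^aAiIuUeEoO]"


def compile_rule(rule):
    """Compile 'LHS -> RHS / LEFT __ RIGHT' into a (regex, replacement) pair."""
    body, context = rule.split("/")
    lhs, rhs = (part.strip() for part in body.split("->"))
    left, right = (part.strip() for part in context.split("__"))
    if rhs == "[]":          # '[]' denotes deletion
        rhs = ""
    if left == "[CON]":
        left = CON
    pattern = ""
    if left:
        # Left contexts such as (a|A|i) are already valid fixed-width regexes.
        pattern += "(?<=" + left + ")"
    pattern += re.escape(lhs)
    if right:
        pattern += "(?=" + re.escape(right) + ")"
    return re.compile(pattern), rhs


def apply_rules(form, rules):
    """Apply the first rule whose pattern matches; return the form unchanged otherwise."""
    for pattern, rhs in rules:
        new_form, count = pattern.subn(lambda m: rhs, form, count=1)
        if count:
            return new_form
    return form


# The four general locative rules, in the order M[b2] before [CON].
LOCATIVE_RULES = [compile_rule(r) for r in [
    "[b2] -> y / (a|A|i|I|e|E) __ il",
    "[b2] -> v / (u|U|o|O) __ il",
    "M[b2] -> tt / __ il",
    "[b2] -> [] / [CON] __ il",
]]

for word in ["kuTTi[b2]il", "pashu[b2]il", "maraM[b2]il", "kAl[b2]il"]:
    print(word, "->", apply_rules(word, LOCATIVE_RULES))
```

Placing the `M[b2]` rule ahead of the `[CON]` rule matters in this sketch, because `M` would otherwise satisfy the consonant context and the boundary would simply be deleted.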

# RootWord + N + ADJECTIVIZATION
njAn[b2]Aya -> njAnAya / __
nI[b2]Aya -> nIyAya / __
[b2] -> y / (a|A|i|I|e|E) __ Aya
[b2] -> v / (u|U|o|O) __ Aya
M[b2] -> m / __ Aya
[b2] -> [] / [CON] __ Aya

# RootWord + N + ADVERBALIZATION
njAn[b2]Ayi -> njAnAyi / __
nI[b2]Ayi -> nIyAyi / __
[b2] -> y / (a|A|i|I|e|E) __ Ayi
[b2] -> v / (u|U|o|O) __ Ayi
M[b2] -> m / __ Ayi
[b2] -> [] / [CON] __ Ayi

# RootWord + N + Clitics " um "
njAn[b2]um -> njAnum / __
nI[b2]um -> nIyum / __
[b2]u -> yu / (a|A|i|I|e|E) __ m
[b2]u -> vu / (u|U|o|O) __ m
M[b2]u -> vu / __ m
[b2]u -> u / [CON] __ m

# RootWord + N + Clitics " E "
njAn[b2]E -> njAnE / __
nI[b2]E -> nIyE / __
[b2] -> y / (a|A|i|I|e|E) __ E
[b2] -> v / (u|U|o|O) __ E
M[b2] -> tt / __ E
[b2] -> [] / [CON] __ E

# RootWord + N + Clitics " A "
njAn[b2]A -> njAnA / __
nI[b2]A -> nIyA / __
[b2] -> y / (a|A|i|I|e|E) __ A
[b2] -> v / (u|U|o|O) __ A
M[b2] -> m / __ A
[b2] -> [] / [CON] __ A

# RootWord + N + Clitics " O "
njAn[b2]O -> njAnO / __
nI[b2]O -> nIyO / __
[b2] -> y / (a|A|i|I|e|E) __ O
[b2] -> v / (u|U|o|O) __ O
M[b2] -> m / __ O
[b2] -> [] / [CON] __ O

# RootWord + N + Clitics " tanne "
njAn[b2]tanne -> njAntanne / __
nI[b2]tanne -> nItanne / __
[b2] -> [] / __ tanne

# RootWord + N + Clitics " ANO"
njAn[b2]ANO -> njAnANO / __
nI[b2]ANO -> nIyANO / __
[b2] -> y / (a|A|i|I|e|E) __ ANO
[b2] -> v / (u|U|o|O) __ ANO
M[b2] -> m / __ ANO
[b2] -> [] / [CON] __ ANO

D.4. Orthographic Rules for Malayalam Verbs

######********VERB********########
uka[b2]uka -> [] / __
[b2]uka -> [] / __

######********INTRANSITIVE********######## 

[b2]i [b2]ikk [b2]ikk [b2]

tt



r __

I



T __ kk

[] [] []


‐>  ‐>  ‐>  ‐>  ‐> 

[b2]ikk


######********TRANSITIVE********########

ikk __



akk __



__ ikk


######********CAUSATIVE********########
[b2] -> tt / r __ ippikk
[b2]i -> I / T __ ppikk
ikk[b2] -> [] / __ ippikk
akk[b2]i -> a / __ ppikk
[b2] -> [] / __ ippikk

######********CONTINUOUS********########
[b2] -> [] / __ koNTirikk

######********PERFECT CONTINUOUS********########
[b2] -> [] / __ koNTAyirikk

######********PASSIVE********########
[b2] -> [] / __ appeT

####PRESENT#####
[b2] -> kk / iri __ unnu
pO[b2] -> pOk / __ unnu
vA[b2] -> var / __ unnu
tA[b2] -> tar / __ unnu
koT[b2] -> koTukk / __ unnu
vey[b2] -> veykk / __ unnu
[b2]unnu -> T / iTTuN __
[b2] -> [] / __ unnu

####FUTURE#####
[b2] -> kk / iri __ um
[b2] -> TAv / iTTuN __ um
[b2] -> [] / __ um

####PAST#####
######## SPECIAL CASES #############
[b2] -> TAyirunn / iTTuN __ u
ikk[b2] -> unn / koNTirikk __ u
koNTAyirikk[b2] -> koNTAyirunn / __ u
nOv[b2]u -> nontu / __
vEv[b2]u -> ventu / __
taLL[b2]u -> taLLi / __
coll[b2]u -> colli / __
koll[b2] -> konn / __ u
pO[b2]u -> pOyi / __
vA[b2] -> vann / __ u
tA[b2] -> tann / __ u
pOk[b2]u -> pOyi / __
kAN[b2] -> kaNT / __ u
var[b2] -> vann / __ u
tar[b2] -> tann / __ u
tinn[b2] -> tinn / __ u
koT[b2] -> koTutt / __ u
vey[b2] -> vecc / __ u
veykk[b2] -> vecc / __ u
cAv[b2] -> catt / __ u
cAk[b2]u -> cattu / __
nilkk[b2] -> ninn / __ u
uNN[b2] -> uNT / __ u
kAkk[b2]u -> kAttu / __
nakk[b2]u -> nakki / __
cikk[b2]u -> cikki / __
uzh[b2] -> uzhut / __ u
vIzh[b2] -> vIN / __ u
tAzh[b2] -> tAzhn / __ u
koLL[b2] -> koNT / __ u
pUkk[b2] -> pUtt / __ u
iri[b2] -> irunn / __ u
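The special-case past rules above are written in two shapes. Some consume the tense suffix (`nOv[b2]u -> nontu / __`), while others rewrite only the stem and keep the following `u` as right context (`koll[b2] -> konn / __ u`). The sketch below treats both shapes uniformly as whole-word lookups. It assumes that these rules match the entire intermediate form, which the listing does not state explicitly. `SPECIAL_PAST_RULES` and `past_special` are hypothetical names, and only a few rules are copied in.

```python
# (left-hand side, replacement, right context kept unchanged)
SPECIAL_PAST_RULES = [
    ("nOv[b2]u", "nontu", ""),
    ("coll[b2]u", "colli", ""),
    ("pO[b2]u", "pOyi", ""),
    ("koll[b2]", "konn", "u"),
    ("vA[b2]", "vann", "u"),
    ("kAN[b2]", "kaNT", "u"),
    ("iri[b2]", "irunn", "u"),
]


def past_special(form):
    """Return the irregular past form, or None when no special rule applies."""
    for lhs, rhs, right in SPECIAL_PAST_RULES:
        # Assumption: the rule must cover the whole intermediate form.
        if form == lhs + right:
            return rhs + right
    return None


for word in ["koll[b2]u", "nOv[b2]u", "kAN[b2]u", "paThikk[b2]u"]:
    print(word, "->", past_special(word))
```

A form that returns `None` here would fall through to the general-case past rules that follow.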

‐>  ‐>  ‐> 

nn

__ u



LL __

NT



__ u

‐>  ‐> 

yt



__ u

njnj



__ u

‐>  ‐>  ‐>  ‐> 

iTT



__ u

eTT



__ u

oTT



__ u

aTT



__ u

‐>  ‐> 

icc



__ u

IRR



__ u

ykk[b2]

‐> 

ycc



__ u

rkk[b2]

rtt



__ u

ukk[b2]

‐>  ‐> 

utt



__ u

akk[b2]

‐> 

ann



__ u

lkk[b2]

‐>  ‐> 

RR



__ u

TT



__ u

L[b2] yy[b2] y[b2]

eT[b2] oT[b2] aT[b2]


iT[b2]


ikk[b2]


Ikk[b2]

Lkk[b2]

i




[b2]u


__ u


ll[b2]



__ u


############GENERAL CASE#############  ‐>  R[b2] RR ‐>  r[b2] rnn

‐> 

[b2]u

i



__

[]



__ iTTuN

y



i __ iTTuN

[]



__ iTTuN

‐>  ‐>  ‐> 

u[b2] [b2] [b2]

y

######********PERFECT********######## 

kk

[b2]

‐>  ‐>  ‐>  ‐> 

[b2]

‐> 

[]

[b2]

[] TAk


[b2]

[]

iri __ AM



koNTAyirikk __ AM



koNTirikk __ AM



iTTuN __ AM



__ AM

‐>  ‐>  ‐>  ‐> 

kk



iri __ AnpaRRi

[]



koNTAyirikk __ AnpaRRi

[]



koNTirikk __ AnpaRRi

TAk



iTTuN __ AnpaRRi

‐> 

[]



__ AnpaRRi

kk



iri __ aNaM

[]



koNTAyirikk __ aNaM

[]



koNTirikk __ aNaM

[b2]

‐>  ‐>  ‐>  ‐> 

TAk



iTTuN __ aNaM

[b2]

‐> 

[]



__ aNaM

u[b2]

‐>  ‐> 

[]



__ EkkAM

[]



__ EkkAM

[b2] [b2] [b2]


[b2]


[b2]


[b2] [b2] [b2]

i[b2]





[b2]


######********AUXILIARY********########

‐>  ‐>  ‐>  ‐> 

Ekk



__ illa

AnpaRR



__ illa

AnpaRR



__ illa

aNTa



__

[]



y



um[b2]

‐>  ‐>  ‐> 

y

######********NEGATIVE********######## 

[b2]

‐> 

[]

AM[b2] AnpaRRi[b2] aNaM[b2]illa u[b2]

[]

i __ illa



__ illa



__ illa


[b2]

__ illa


EkkAM[b2]

######********QUESTION********######## 

iTTuNTAyirunnu[b2] koNTirunnu[b2] irunnu[b2] unnu[b2]



illa __ O


u[b2]


[b2] [b2]

‐>  ‐>  ‐>  ‐> 

iTTuNTAyirunn



__ O

koNTirunn



__ O

koNTAyirunn



__ O

irunn



__ O

‐>  ‐>  ‐> 

unnuNT



__ O

[]



__ O

y



i __ O

‐> 

[]



__ O

kk



iri __ An

[]



__ An


koNTAyirunnu[b2]

y


‐> 

[b2]

######********INFINITE********########  [b2] [b2]

‐>  ‐> 


APPENDIX - E

E.1. Tested sentences of Machine Translation System with Rankings

RANK    ENGLISH INPUT SENTENCE    MALAYALAM OUTPUT SENTENCE

She loves her work

1

He put his towel on the towel rack

aവള് aവള െട േജാലിെയ സേനഹിkുnു aവന് aവന്െറ തൂവാലെയ തൂവാല-പലകt ില് i

1

It was a new red bike

aത് oരു പുതിയ ചുവn േമ ാ ൈസkിള് ആയിരുnു

1

The bike lock was on the ground

േമാ ാര്ൈസkിള്-പൂ ് തറയില് ആയിരുnു

1

It was an old book

aത് oരു പഴയ പുസതകം ആയിരുnു

1 1

Buy a goat She chopped off the two ends of the carrot

1

She had a mole on her face

oരു ആടിെന വാ ുക aവള് ശീമമുll ിെcടിയുെട ര ട് a നുറുtു aവ k് aവള െട മുഖtില് oരു മറുക് u ടായിരുnു

1

She hated it

aവള് aതിെന െവറുkുnു

1

It was a dark brown circle

aത് oരു iരു

1

It killed people

1

I love that store

1

She was not married

1

He lived in Los Angeles

aവന് െലാസ്-a

1

They were in love

aവര് സേനഹtില് ആയിരുnു

1

Your family is in Los Angeles

നി

1

You have your brothers

നി

1

He looked outside

aവന് പുറേtk് േനാkി

1

I hate it

1

His car was dark blue

aവന്െറ കാര് iരു

1

She walked outside

aവള് പുറേtk് നടnു


1


ട തവി നിറമുll വ ം ആയിരുnു

aത് ആള കള കെള െകാnു ഞാന് ആ കലവറെയ സേനഹിkുnു


aവള് കല ാണംകഴിkെp ില െചെലസയില് ജീവിc

ള െട കുടുംബം െലാസ്-a k് നി

െചെലസയില് ആണ്

ള െട സേഹാദര മാര് u

ട്

ഞാന് aതിെന െവറുkുnു ട നീല ആയിരുnു

But it had a big mouth

enാലും aതിന് oരു വലിയ വായ് u

He sliced an onion

aവന് oരു uളളിെയ േഛദിc

He looked at the kitchen floor

aവന് aടുkള-തറയില് േനാkി

1

Water was on the kitchen floor

െവളളം aടുkള-തറയില് ആയിരുnു

1

He put it on the kitchen floor

aവന് aതിെന aടുkള-തറയില് i

1

Fifty stars are on the flag

amത് നk ത

1

They live on the street

aവര് െതരുവില് ജീവിkുnു

1

Everything was on sale

സകലതും വി പനയില് ആയിരുnു

1

I hate tomatoes

ഞാന് തkാളികെള െവറുkുnു

1

She fell down

aവള് താെഴ വീണു

1

She hit her head on a rock

aവള് aവള െട തലെയ oരു പാറയില് iടിc

1

Where is the mustard

e

1

The name of the magazine was Time

മാസികയുെട േപര് സമയം ആയിരുnു

1 1


1


െള കള

ടായിരുnു

ള് പതാകയില് ആണ്

് ആണ് കടുക്



1

A cop saw her The farmer put a dozen apples into a bag

oരു േപാലീ േകാ s ള് aവെള ക ടു കര്ഷകന് oരു പ n ട് ആpി pഴ െള oരു സ ചിk് ullില് i

1

I love my newspaper

ഞാന് eന്െറ വ tമാനപ തെt സേനഹിkുnു

1

We live in a rainbow world



1

A lemon is yellow

oരു െചറുനാര

1

An apple is red or green

oരു ആpിള്pഴം ചുവn aഥവാ പc ആണ്

1

Your teeth are white

നി

1

She started screaming

aവള് കൂകിവിളിkുക ആരംഭിc

1

Money has germs

പണtിന് aണുkള് u

1

People have germs

1

Germs live on people for months

ആള കള ക k് aണുkള് u ട് aണുkള് മാസ kുേവ ടി ആള കള കളില് ജീവിkുnു

1

Two people died



1

Eight people were hurt

e ് ആള കള കള് വണെpടുtെp

1

He had a job

aവന് oരു േജാലി u

1

It was a bad job

aത് oരു ചീtയായ േജാലി ആയിരുnു

1

He was a waiter

aവന് oരു േഹാ

1

Luggage hit people in the face

യാ താസാമാന

ള് ആള കള കെള മുഖtില് iടിc

1

Luggage hit people in the head

യാ താസാമാന

ള് ആള കള കെള തലയില് iടിc

1

The police came

1

He was helpful

1

This was her money

iത് aവള െട പണം ആയിരുnു

1

She bought five lottery tickets

aവള് a

1

She looked for dark red

aവള് iരു

1

She liked cherry pink

aവള് രk പസാദമുll പാടലവ

1

She looked in a mirror

aവള് oരു ദ pണtില് േനാkി

1

She gave the cashier $20

aവള് പണംസൂkിkുnവന് $--5 20 െകാടുtു

1

He gave her a little change

aവന് aവ k് oരു െചറിയ മാ െt െകാടുtു

1

They are not taking vacations

o

aവര് aവധിkാല

1

she asked again He sells the corn at his vegetable stand

aവള് വീ ടും േചാദിc aവന് aവന്െറ പckറി-ത ില് ധാന െt വി kുnു

1

It is bright yellow

aത് േശാഭമയമായ മ

1

The cow loves the corn

പശു ധാന െt സേനഹിkുnു

1

A winter storm is coming

oരു ശീതകാലം-െകാടു ാ ് വnുെകാ

1

He has pretty pictures and maps

aവന് നിസര്gസുnരമായ ചി തം uം ഭൂപടം u

1

He talks and talks

aവന് സംസാരിkുnു uം സംസാരിkുnു

1

The wind started to blow

കാ ് കാ വ്ീശുക ആരംഭിc

1

Paper flew through the air

വര്tമാനp തം വായു മുേഖന പറnു

1

The rain started to fall

മഴ വീഴുക ആരംഭിc

1

It was a flood

aത് oരു പവാഹം ആയിരുnു

1

Then he saw lightning

പിnീട് aവന് മിnലിെന ക

ആണ്

ള െട പല കള് െവll ആണ്


ട്


ട് ആള കള കള് മരിc

ടായിരുnു

പരിചാരകന് ആയിരുnു


 



aവന് സഹായകമായ ആയിരുnു


1

ള് oരു മഴവില്-ഭൂമിയില് ജീവിkുnു

േപാലീസ് വnു


1

ച് ഭാഗ kുറി-കുറിമാന

െള വാ

ി

ട ചുവpിനുേവ ടി േനാkി െള i െp

െള eടുtുെകാ

ടിരിkുnില

ആണ്

ടു

ടിരിkുnു ട്

1

It was a very cold night

1

His head feels like it will explode

aത് oരു വളെര തണുt രാ തി ആയിരുnു aവന്െറ തല aത് െപാ ിെtറിkുമ് േപാെല aനുഭവിkുnു

1

They liked their jobs

aവര് aവരുെട േജാലികെള i െp

1

His head ache started an hour ago

aവന്െറ തല-േവദന oരു മണിkൂര് മുm് ആരംഭിc

1

They started shooting

aവര് െവടിവ kുക ആരംഭിc

1

The cop gave him the ticket

േപാലീ േകാ

1

He brushed his teeth

aവന് aവന്െറ പല കെള തുടc വൃtിയാkി

1

He went into his bedroom

aവന് aവന്െറ കിടpറk് ullില് േപായി

1

He drove fast

aവന് േവഗtില് വ

1

She had a new friend

aവ k് oരു പുതിയ കൂ കാരന് u

1

They put a big towel on the sand

aവര് oരു വലിയ തൂവാലെയ മണലില് i

1

They watched the sun go down

aവര് സൂര ന് താെഴ േപാകുnു ക

1

It was huge and orange

aത് ഗംഭീരമായ uം ഓറ

1

I like the beach

ഞാന് കട tീരെt i െpടുnു

1

A triangle has three sides

oരു തിേകാണtിന് മൂn് വശ

1

Faces have various shapes

മുഖ

1

Clouds have various shapes

േമഘ

1 1

Paul is a good man Some people stand on the edge of cliffs

1

They killed their friend

1

They feel the wind

1

Their marriage ends

aവരുെട വിവാഹം aവസാനിkുnു

1

We have a nice house



1

It has three bedrooms

aതിന് മൂn് കിടpറകള് u

1

It has three bathrooms

aതിന് മൂn് കുളിമുറികള് u

1

We live on a quiet street

1

The baby had a rare disease

ഞ ള് oരു ശാnമായ െതരുവില് ജീവിkുnു െചറിയകു ിk് oരു aസാധാരണമായ േരാഗം u ടായിരുnു

ള് aവന് കുറിമാനെt െകാടുtു

ടിേയാടിc

ടു


ച് ആയിരുnു

k് വിവിധ

ള് u

ട്

ളായ ആകൃതികള് u


k് വിവിധ

ട്

ളായ ആകൃതികള് u

ട്

േപാള് oരു നല പുരുഷന് ആണ് a പം ആള കള കള് കിഴുkാംതൂkായപാറകള െട സീമയില് നി kുnു


aവര് aവരുെട കൂ കാരനിെന െകാnു aവര് കാ ിെന aനുഭവിkുnു k് oരു ഹൃദ മായ വീട് u

ട്

ട് ട്

1


ടായിരുnു

He looked in the paper for another job

ഓേരാ രാ തി aവന് oരു സ നം u ടായിരുnു aവന് വ tമാനp തtില് മെ ാരു േജാലിkുേവ ടി േനാkി

1

She was a nurse

aവള് oരു ആയ ആയിരുnു

1

Her husband was a doctor

aവള െട ഭര്tാവ് oരു ൈവദ ന് ആയിരുnു

1

She saw another aeroplane

aവള് മെ ാരു വിമാനെt ക

1

They bought their tickets

aവര് aവരുെട കുറിമാന

1

He looked at the bubbles

aവന് കുമിളകളില് േനാkി

1

aവന് ക ാടിെയ േമശയില് i പണkാരനായ-ആള കള കള് aമിതവിലയുll കുറിമാന െള വാ ി

1

He put the glass on the table Rich people bought the expensive tickets The name of the lake was Yellow Lake

1

It had many colors

aതിന് ധാരാളമായ നിറ

Every night he had a dream


1

1

കായലി െറ േപര് മ


s

ടു

െള വാ

ി

-കായല് ആയിരുnു ള് u

ടായിരുnു

1

But its belly was white

enാലും aതിന്െറ uദരം െവll ആയിരുnു

1

But usually they lose

enാലും പതിവായി aവര് േതാ kുnു

1

It was her birthday

aത് aവള െട ജ മദിനം ആയിരുnു

1

Mumbai is a big city

മുംൈബ oരു വലിയ നഗരം ആണ്

1

He examined the wife

aവന് ഭാര െയ പരിേശാധിc

1

He saw the baby

aവന് െചറിയകു ിെയ ക

1

They both smiled

aവര് iരുവരും പു

1

It attacked Donna

aത് െദാnെയ ആ കമിc

1

He loved his plants

aവന് aവന്െറ െചടികെള സേനഹിc

1

His plants were in pots

aവന്െറ െചടികള് പാ ത

1

It was Friday

aത് െവllിയാ ച ആയിരുnു

1

He went outside

aവന് പുറേtk് േപായി

1

The airplane had two engines

വിമാനtിന് ര

1

A big bird flew into each engine

oരു വലിയ പkി ഓേരാ യ ntിന് ullില് പറnു

1

She examined the pot

aവള് പാ തെt പരിേശാധിc

1

They ran

aവര് ഓടി

1

It was a problem

aത് oരു പ നം ആയിരുnു

1

Herman agreed

െഹര്മന് സmതിc

1

He had bought a ticket

1

The baby smiles

1

Air comes through the window

1

It is warm air

1

The boy goes to the kitchen

ആണ്കു ി aടുkളk് േപാകുnു

1

The ball is on the floor

പn് തറയില് ആണ്

1

It is a red ball

aത് oരു ചുവn പn് ആണ്

1

He sits down

aവന് താെഴ iരിkുnു

1

It is a rubber ball

aത് oരു റbര്-പn് ആണ്

1

The dog barks

നായ കുര kുnു

1

The bird sings

പkി പാടുnു

He throws the ball

aവന് പnിെന eറിയുnു

ചിരിc


ളില് ആയിരുnു

ള് u

ടായിരുnു


ട് യ n

aവന് oരു കുറിമാനെt വാ െചറിയകു ി പു

ിയി

ചിരിkുnു

aത് iളംചൂടുളള വായു ആണ്


വായു ജനല് മുേഖന വരുnു

1

The cat licks its paws

പൂc aതിന്െറ ൈകptികെള നkുnു

1

The cat licks its belly

പൂc aതിന്െറ uദരെt നkുnു

1

The dog licks its paws

നായ aതിന്െറ ൈകptികെള നkുnു

1

He sees a package

aവന് oരു െപാതിെk ിെന കാണുnു

1

He opens the refrigerator

aവന് ശീതീകരണയ nെt തുറkുnു

1

It has a red cover

aതിന് oരു ചുവn ആവരണം u

1

She likes animals

aവള് മൃഗ

െള i െpടുnു

1

She has two cats

aവ k് ര

ട് പൂcകള് u

1

She likes her cats

aവള് aവള െട പൂcകെള i െpടുnു

1

He has a job

aവന് oരു േജാലി u

1

He is a teacher

aവന് oരു ad ാപകന് ആണ്


1


ടു

ട്

ട്

ട്

ടായിരുnു

1

He teaches kids

aവന് കു ികെള പഠിpിkുnു

1

He likes his job

aവന് aവന്െറ േജാലിെയ i െpടുnു

1

He likes kids

aവന് കു ികെള i െpടുnു

1

He chews it

aവന് aതിെന ചവ kുnു

1

He sees an airplane

aവന് oരു വിമാനെt കാണുnു

1

The airplane is in the sky

വിമാനം ആകാശtില് ആണ്

1

The baby cried again

െചറിയകു ി വീ

1

It has two wings

aതിന് ര

1

It has a tail

aതിന് oരു വാല് u

1

She has a doll

aവ k് oരു പാവ u

1

The doll has long hair

പാവk് നീളമുളള തലമുടി u

1

It crawls slowly

aത് മnമnം iഴയുnു

1

She watches it

aവള് aതിെന കാണുnു

1

She puts it in her mouth

aവള് aതിെന aവള െട വായില് iടുnു

1

She likes the monkeys

aവള് കുര

1

They have long tails

aവ k് നീളമുളള വാലുകള് u

ട്

1

There are six monkeys in the cage

aവിെട തടവറയില് ആറ് കുര

ുകള് ആണ്

1

The snow falls



1

It covers the ground

1

He goes outside

1

He looks at his bicycle

1

It is an old bike

1

She watches the ants

aവള് uറുmുകെള കാണുnു

1

She has a dog

aവ k് oരു നായ u

1

The snow falls from the sky



1

He sees a butterfly

aവന് oരു പൂmാ െയ കാണുnു

1

The music starts

സംഗീതം ആരംഭിkുnു

1

She looks in the mirror

aവള് ദ pണtില് േനാkുnു

1

It has white hair

aതിന് െവള t തലമുടി u

They always hug her

aവര് aവെള eേpാഴും െക ിpിടിkുnു

The big fish eat the small fish

വലിയ മt ം െചറിയ മt െt തിnുnു

1

She loves him

aവള് aവെന സേനഹിkുnു

1

The hero solves the problem

കഥാനായകന് പ നെt പരിഹരിkുnു

1

The story has a happy ending

കഥk് oരു സnുഷ്ടമായ സമാ തി u

1

It is blue

aത് നീല ആണ്

1

He talks with his wife

aവന് aവന്െറ ഭാര േയാട് സംസാരിkുnു

1

He sneezes

aവന് തുmുnു

1

It was dark green

aത് iരു

1

He loved animals

aവന് മൃഗ

1

She went under the water

aവള് െവളളtിന് aടിയില് േപായി

1

She was a little girl

aവള് oരു െചറിയ െപ

ട്

ട് ട്


ട്

ുകെള i െpടുnു


് വീഴുnു

aത് തറെയ മൂടുnു

aവന് aവന്െറ ൈസkിളില് േനാkുnു aത് oരു പഴയ േമ ാ ൈസkിള് ആണ്


ട് ചിറകുകള് u


1



aവന് പുറേtk് േപാകുnു


1

ടും കര

ട്

് ആകാശtി നിn് വീഴുnു

ട്

ട്

ട പc ആയിരുnു െള സേനഹിc കു ി ആയിരുnു

1

His new car is green

aവന്െറ പുതിയ കാര് പc ആണ്

1

Dora loved her mom

െദാര aവള െട amെയ സേനഹിc

1

He loves his red bicycle

aവന് aവന്െറ ചുവn ൈസkിെള സേനഹിkുnു

1

He was at school

aവന് വിദ ാലയtില് ആയിരുnു

1

His friend laughed

aവന്െറ കൂ കാരന് ചിരിc

1

His brother took the apple

aവന്െറ സേഹാദരന് ആpി pഴെt eടുtു

1

The baby cried

െചറിയകു ി കര

1

I love my mom

ഞാന് eന്െറ amെയ സേനഹിkുnു

1

She stood in the water

aവള് െവളളtില് നിnു

1

It was a big lake

aത് oരു വലിയ കായല് ആയിരുnു

1

Wash your hands

നി

1

Be a good boy

oരു നല ആണ്കു ി ആകുക

1

Crow is a bird

കാk oരു പkി ആണ്

1

It was a good book

aത് oരു നല പുസതകം ആയിരുnു

1

It was closed

aത് aട kെp

1

It has three legs

aതിന് മൂn് കാലുകള് u

1

He has sheep

aവന് െചmരിയാട് u

1

She has blue eyes

aവ k് നീല ക

1

He loved her

1

It was his money

1

It was a big nail

1

He followed her

1

It was under the tree

aത് മരtിന് aടിയില് ആയിരുnു

1

I am a teacher

ഞാന് oരു ad ാപകന് ആണ്

1

I am from Delhi

1

He killed a police officer

aവന് oരു േപാലീസ്-തലവെന െകാnു

1

I hate her

ഞാന് aവെള െവറുkുnു

1

My friend went to Delhi

eന്െറ കൂ കാരന് െഡ ഹിk് േപായി

1

My dad is in Mumbai

eന്െറ acന് മുംൈബയില് ആണ്

I have three brothers

eനിk് മൂn് സേഹാദര മാര് u

It was a good question

aത് oരു നല േചാദ ം ആയിരുnു

1

My brother is a teacher

eന്െറ സേഹാദരന് oരു ad ാപകന് ആണ്

1

This is a good place

iത് oരു നല iടം ആണ്

1

She is dancing

aവള് നൃtംെച തുെകാ

1

He saw a cow

aവന് oരു പശുവിെന ക

1

It was a big house

aത് oരു വലിയ വീട് ആയിരുnു

1

Eight people stayed there

e ് ആള കള കള് aവിെട ത

1

My grandmother lived there

eന്െറ മുt ി aവിെട ജീവിc

1

It is a village

aത് oരു ഗാമം ആണ്

1

I went there

ഞാന് aവിെട േപായി

1

He is a good cook

aവന് oരു നല പാചകkാരന് ആണ്


ട്


കള് u

ട്


aത് oരു വലിയ നഖം ആയിരുnു aവന് aവെള പിnുട nു

ഞാന് െഡ ഹിയി നിn് ആണ്


ട്

aത് aവന്െറ പണം ആയിരുnു


1

ള െട ൈകകള് കഴുകുക

aവന് aവെള സേനഹിc


1



ട്

ടിരിkുnു ടു ി

1

He called his wife

aവന് aവന്െറ ഭാര െയ വിളിc

1

James is a lawyer

െജയിംസ് oരു വkീല് ആണ്

1

He was a soldier

aവന് oരു േയാdാവ് ആയിരുnു

1

He went by train

aവന് തീവ

1

He is a cobbler

aവന് oരു െചരുp കുtി ആണ്

1

Rain is coming

1

She is working in an office

മഴ വnുെകാ ടിരിkുnു aവള് oരു കാര ാലയtില് പണിെയടുtുെകാ ടിരിkുnു

1

He saw a police officer

aവന് oരു േപാലീസ്-തലവെന ക

1

She had fever

aവ k് പനി u

1

She stood there

aവള് aവിെട നിnു

1

I love my grandmother

ഞാന് eന്െറ മുt ിെയ സേനഹിkുnു

1

She is a widow

aവള് oരു വിധവ ആണ്

1

He plays well

aവന് തൃ തികരമായി കളിkുnു

1

He has a boat

aവന് oരു വ

1

I went there

ഞാന് aവിെട േപായി

1

I met her

ഞാന് aവെള ക

1

He has my books

aവന് eന്െറ പുസതക

1

I gave my pen to him

1

He is my friend

1

We both went to Chennai

1

ടു


ടായിരുnു

ചി u

ട്

ടുമു ി


ള് u

ട്

ഞാന് eന്െറ േപനെയ aവന് െകാടുtു aവന് eന്െറ കൂ കാരന് ആണ് ള് iരുവരും െചൈnk് േപായി

We met him at the station



ള് aവെന താവളtില് ക

1

Then we went out

പിnീട് ഞ

1

He is my nephew

1

She is my aunt

1

Economy is in a good state

സാmtികവ വs oരു നല നിലയില് ആണ്

1

My name is Geetha

1

My daughter is working in a school

eന്െറ േപര് ഗീത ആണ് eന്െറ മകള് oരു വിദ ാലയtില് പണിെയടുtുെകാ ടിരിkുnു

1

Today is a holiday

in് oരു aവധിദിവസം ആണ്

This is a red ball

iത് oരു ചുവn പn് ആണ്

ടുമു ി

ള് പുറt് േപായി

aവന് eന്െറ aനnരവന് ആണ് aവള് eന്െറ amായി ആണ്

1




He asked for a ball

aവന് oരു പnിനുേവ

1

She was new in town

aവള് െചറുപ ണtില് പുതിയ ആയിരുnു

1

She is enjoying

aവള് ആസ ദിc െകാ

1

He replied to her

aവന് aവ k് മറുപടിന കി

1

He is singing

aവന് പാടിെകാ

1

An idea can change your life

oരു ആശയം നി

1

It is a good idea

aത് oരു നല ആശയം ആണ്

1

This is a red flower

iത് oരു ചുവn പൂവ് ആണ്

1

I saw a beautiful girl

ഞാന് oരു സുnരിയായ െപ കു ിെയ ക

1

She has three goats

aവ k് മൂn് ആടുകള് u

1

It was a lie

aത് oരു നുണ ആയിരുnു


1


ടിയില് േപായി

ടി േചാദിc ടിരിkുnു

ടിരിkുnു ള െട ജീവിതകാലെt മാ ാം

ട്

ടു

1

Sunday is a holiday

ഞായറാഴ്ച oരു aവധിദിവസം ആണ്

1

You should wait for her

നി

1

Chief minister went to Delhi

മുഖ -മ nി െഡ ഹിk് േപായി

1

The rain was very loud

മഴ വളെര uറെkയുളള ആയിരുnു

1

They are playing

aവര് കളിc െകാ

1

This is a cart

iത് oരു കാളവ

1

He did not eat meat

aവന് iറcിെയ തിnില

1

He is a writer

aവന് oരു eഴുtുകാരന് ആണ്

1

He is a famous doctor

aവന് oരു േപരുേക ൈവദ ന് ആണ്

1

I saw a lion in the zoo

ഞാന് oരു സിംഹെt മൃഗശാലയില് ക

1

I saw a yellow parrot

ഞാന് oരു മ

1

She had a beautiful mom

aവ k് oരു സുnരിയായ am u

1

It was an interesting story

aത് oരു രസകരമായ കഥ ആയിരുnു

1

He was a good farmer

aവന് oരു നല ക ഷകന് ആയിരുnു

1

He worked in a hotel

aവന് oരു േഭാജനശാലയില് പണിെയടുtു

1

She lived in a village

aവള് oരു ഗാമtില് ജീവിc

1

This is a hill

iത് oരു കുn് ആണ്

1

She is coming from a city

aവള് oരു നഗരtി നിn് വnുെകാ

1

He is a stranger

1

He went to his room

1

This aeroplane had two engines

1

I saw another aeroplane

ഞാന് മെ ാരു വിമാനെt ക

1

He is the owner of the house

aവന് വീടി െറ uടമസഥന് ആണ്

1

I bought a goat

ഞാന് oരു ആടിെന വാ

1

Colors are beautiful

നിറ

1

He is a merchant

aവന് oരു കcവടkാരന് ആണ്

1

My father is a professor

eന്െറ acന് oരു ആചാര ന് ആണ്

1

His elder brother is a teacher

aവന്െറ മൂt സേഹാദരന് oരു ad ാപകന് ആണ്

1

She did not believe her doctor

aവള് aവള െട ൈവദ നിെന വിശ സിcില

He failed in the examination

aവന് പരീkയില് േതാ േപായി

He became a doctor

aവന് oരു ൈവദ ന് ആയിtീ nു

1

His brother is a lawyer

1

Kerala seeks permission for new dam

aവന്െറ സേഹാദരന് oരു വkീല് ആണ് േകരള പുതിയ aണെk ിനുേവ ടി aനുമതിെയ ആരായുnു

1

They were killed in an accident

aവര് oരു aപകടtില് െകാലെp

1

We celebrated his birthday



1

The man did not come out

പുരുഷന് പുറt് വnില

1

It was written in Hindi

aത് ഹിnിയില് eഴുതെp

1

I have a sister

eനിk് oരു സേഹാദരി u

1

I invited her

ഞാന് aവെള kണിc

1

He would have rushed into the street

aവന് െതരുവിന് ullില് പാ

1

He is a student

aവന് oരു വിദ ാ tി ആണ്

ടി ആണ്

ടു


തtെയ ക

ടു


ടായിരുnു


ടിരിkുnു

aവന് aവന്െറ മുറിk് േപായി iത് വിമാനtിന് ര


ടിരിkുnു

ട് യ n


1

ടി കാtിരിkണം

aവന് oരു aപരിചിതന് ആണ്


1

ള് aവ kുേവ

ള് u

ടായിരുnു

ടു

ി

ള് സുnരിയായ ആണ്

ള് aവന്െറ ജന്മദിനം ആേഘാഷിc

ട് ുകയ ി

ടാകാം

1

He saw a strange sight

aവന് oരു aസാധാരണമായ കാ ചെയ ക

1

Viju was the son of a gardener

വിജു oരു േതാ kാരനി െറ മകന് ആയിരുnു

1

Kochi is a famous city

െകാcി oരു േപരുേക നഗരം ആണ്

1

He was the chief guest

aവന് മുഖ മായ aതിഥി ആയിരുnു

1

A scientist came here

oരു ശാ

1

We made arrangements



1

I saw an eagle

ഞാന് oരു കഴുകനിെന ക

1

He did not see the red light

aവന് ചുവn െവളിcെt ക

1

She is an angel

aവള് oരു മാലാഖ ആണ്

1

This is an island

iത് oരു തുരുt് ആണ്

1

Geetha is my aunt

ഗീത eന്െറ amായി ആണ്

1

He is our servant

aവന് നmുെട ഭൃത ന് ആണ്

1

The speaker was Gabriel

പസംഗകര്tാവ് ഗ ബിേയല് ആയിരുnു

1

She was very sad

aവള് വളെര ദുഃഖകരമായ ആയിരുnു

1

I saw a crocodile in the lake

ഞാന് oരു മുതലെയ കായലില് ക

1

Something was wrong

eേnാവസ്തു െത ായ ആയിരുnു

1

It was a strange town

aത് oരു aസാധാരണമായ െചറുപ ണം ആയിരുnു

1

It is delicious

aത് സ ാദുളള ആണ്

1

The water was not hot

1

He drove to the hotel

1

Shankar felt sorry for him

1

They were very scared

aവര് വളെര േപടിkെp

1

It was late at night

aത് രാ തിയില് കാലംെത ിയ ആയിരുnു

1

I am hungry

1

He did not like jail

1

He had tried

aവന് ശമിcി

1

My name is Aazaad

eന്െറ േപര് ആസാട് ആണ്

1

Old news was interesting

പഴയ വാര്t രസകരമായ ആയിരുnു

1

Rama lived in a small village The small village had a small school

രമ oരു െചറിയ ഗാമtില് ജീവിc െചറിയ ഗാമtിന് oരു െചറിയ വിദ ാലയം u ടായിരുnു

It was a forest

aത് oരു വനം ആയിരുnു

1

He went to his room

aവന് aവന്െറ മുറിk് േപായി

1

Everyone laughed

eലാവരും ചിരിc

1

They had short legs

aവ k് കുറിയ കാലുകള് u

1

Life is hard

ജീവിതകാലം ദൃഢമായ ആണ്

1

He was a carpenter

aവന് oരു ആശാരി ആയിരുnു

1

She saw an ad in the paper

aവള് oരു പരസ െt വ tമാനp തtില് ക

1

My brother is in hospital

1

My brother is coming from Mumbai

eന്െറ സേഹാദരന് ആശുപ തിയില് ആണ് eന്െറ സേഹാദരന് മുംൈബയി നിn് വnുെകാ ടിരിkുnു

1

Thaar is a desert

താര് oരു മരുഭൂമി ആണ്

ടാkി ടു ടില


ള് പdതികെള u

ടു

െവളളം ചൂടുll ആയിരുnു aവന് േഭാജനശാലk് വ ഷ ര് aവനുേവ

ടിേയാടിc

ടി നിn മായ aനുഭവിc

ഞാന് വിശp ll ആണ് aവന് ജയിലിെന i െp ില


തjന് iവിേടk് വnു


1


1

ടു

ടായിരുnു

ടായിരുnു

ടു

1

Wait until tomorrow

നാെള വെര കാtിരിkുക

1

A woman came out of her house

oരു

1

He was a great man

aവന് oരു മഹാനായ പുരുഷന് ആയിരുnു

1

He was a Prince

aവന് oരു യുവരാജാവ് ആയിരുnു

1

It was a rabbit

aത് oരു മുയല് ആയിരുnു

1

Autumn arrived in the forest

ശരല്kാരം വനtില് etിേc nു

1

I saw a deer in the forest

ഞാന് oരു മാനിെന വനtില് ക

1

He was a blacksmith

aവന് oരു െകാലpണിkാരന് ആയിരുnു

1

I hope

ഞാന് പതീkിkുnു

1

I have some books

eനിk് a പം പുസതക

1

your father called yesterday

നി

1

I am a housewife

ഞാന് oരു വീ m ആണ്

1

I am fine

ഞാന് സുഖം ആണ്

1

I am bold

ഞാന് ധീരമായ ആണ്

1

Forget the past

ഭൂതകാലെt മറkുക

1

Lock the door

കതക് പൂ ്

1

Reduce the volume

വ ാ തെt കുറ kുക

1

Return it safely

1

Put on your shirt

1

send him inside

1

they will win

1

They will not listen to me.

aവര് eെnk് േക kില

1

I like reading books

ഞാന് പുസതക

1

I like walking in the morning sun

ഞാന് പഭാതം-സൂര നില് നടkുക i െpടുnു

1

I like listening to music

ഞാന് സംഗീതtിന് േകള്kുക i െpടുnു

1

I like travelling by train

ഞാന് തീവ

1

I keep my books here

ഞാന് eന്െറ പുസതക

1

I wait for him at the station

ഞാന് aവനുേവ

I have a scooter

eനിk് oരു

We have a Maruthi Car



1

I have two brothers

eനിk് ര

1

I have three sisters and a brother

eനിk് മൂn് സേഹാദരി uം oരു സേഹാദരന് u

1

I will never forget your help I have come to the end of my patience.

ഞാന് നി

1

Don't try my patience. You are always complaining about something.

eന്െറ kമെയ ശമിkുക നി ള് eേnാവസ്തു പ ി പരാതിെp െകാ ടിരിkുnു eേpാഴും

1

go to hell

നരകtിന് േപാകുക

1

Anyone can make a mistake

ഏെത ിലുെമാരാള് oരു പിഴെയ u

1

I too have the same problem

eനിk് aധികമായി തുല മായ പ നം u

1 1

ട്


ള െട acന് inെല വിളിc

ള െട uടുpില് iടുക

aവെന aകേtk് aയ kുക


aവര് വിജയിkും െള വായിkുക i െpടുnു

ടിയില് യാ തെചy ക i െpടുnു െള iവിേടk് വ kുക

ടി താവളtില് കാtിരിkുക

കൂ ര് u

ട്

k് oരു മാരുതി-കാര് u

ട്

ട് സേഹാദര മാര് u

ട്

ള െട സഹായെt മറkില

ഞാന് eന്െറ kമയുെട a tിന് വnി


ള് u

നി


1

ടു

aതിെന സുരkിതമായി തിരിെctുക


1

തീ aവള െട വീടി െറ പുറt് വnു

ട്

ടാkാം ട്

ട്

1

He came to my office

aവന് eന്െറ കാര ാലയtിന് വnു

1

He came to meet my father

aവന് eന്െറ acനിെന ക

1

He came along with his wife

aവന് aവന്െറ ഭാര േയാട് േചര്n് വnു

1

He came here alone

aവന് തനിcാkി iവിേടk് വnു

1

I give her a pen

ഞാന് aവ k് oരു േപനെയ െകാടുkുnു

1

My friend Ravi has a scooter

eന്െറ കൂ കാരന്-രവിk് oരു

1

I would have come yesterday

ഞാന് inെല വnി

1

He is on leave

aവന് aവധിയില് ആണ്

1

He has gone out.

aവന് പുറt് േപായിയി

1

Milk is white

പാല് െവll ആണ്

1

I am eating an apple

ഞാന് oരു ആpി pഴെt തിnുെകാ

2 2

Her shoes are old All shoes were on sale at the shoe store

aവള െട പാദരkകള് പഴയ ആണ് eലാ പാദരkകള് പാദരk-കലവറയില് വി പനയില് ആയിരുnു

2

They were very comfortable

aവര് വളെര സുഖ പദമായ ആയിരുnു

2

They felt good

aവര് നല aനുഭവിc

2

He opened his travel bag

aവന് aവന്െറ യാ ത-സ

2

His bike was gone

e

2

It was cut in two

2

It was delicious

2

She thought about it

2

Your dog is lazy

2

It was delicious

2 2

He was so happy He turned on the water and rinsed his face

aവന് a പകാരം സnുഷ്ടമായ ആയിരുnു aവന് െവളളtില് തിരി ു uം aവന്െറ മുഖെt കഴുകി

2

She rinsed the carrot peeler

2

I am going to adopt two baby children

aവള് ശീമമുll ിെcടി-േതലുരിkുnവെന കഴുകി ഞാന് ര ട് െചറിയകു ി-കു ികെള ദെtടുkുക േപായിെകാ ടിരിkുnു ഞാന് oരു െചറിയ െപണ്കു ി uം oരു െചറിയ ആണ്കു ി ആവശ െpടുnു aവ k് ര ട് െചറിയകു ിക kുേവ ടി കാtിരിkുക u ടായിരുnു

ടാകാം ട്

ടിരിkുnു

ചിെയ തുറൈn

ടില് മുറിkെp

aവള് aത് പ ി ചിnിc

I want a little girl and a little boy She had to wait for the two babies

നി

ള െട നായ aലസനായ ആണ്

aത് സ ാദുളള ആയിരുnു

2

I can wait one year Every night she scrubbed her cheek extra hard

2

He sat down and read the newspaper

ഞാന് on് െകാലം കാtിരിkാം ഓേരാ രാ തി aവള് aവള െട കദളെt aതിയായ ദൃഢമായ uര c aവന് താെഴ iരിc uം വ tമാനപ തെt വായിkുnു

2

She poured the beans into the grinder

aവള് പയറുകെള ആ കലിന് ullില് oഴിc

2

Today I bought a winter cap

ഞാന് oരു ശീതകാലം-െതാpി വാ

2

The reception was bad

സ കാരം ചീtയായ ആയിരുnു

2

He went to jail

aവന് ജയിലിന് േപായി

2

It barked in the morning

aത് പഭാതtില് കുര kുക

2

He washed the plate

aവന് പാ തെt കഴുകി

2

ട്

aത് സ ാദുളള ആയിരുnു

2

കൂ ര് u

് ആയിരുnു aവന്െറ േമാ ാര്ൈസkിള്

aത് ര

2

ടുമു ക വnു

ുക

2

She said she was fine

aവള് aവള് സുഖം ആയിരുnു പറ

2

Something was leaking

eേnാവസ്തു േചാ nുെകാ ടിരിc

2

He called his landlord

aവന് aവന്െറ ഭൂവുടമസഥനിെന വിളിc

2

What was wrong with him

ഏത് aവേനാട് െത ായ ആയിരുnു

2

He closed his eyes

aവന് aവന്െറ ക

2

He was sitting in his chair

aവന് aവന്െറ പീഠtില് iരിc െകാ

2

I am an adult

2

I want to drink a soda

ഞാന് oരു പായപൂ tിയായ ആണ് ഞാന് oരു േസാഡാെവllെt കുടിkുക ആവശ െpടുnു

2

I want to work

ഞാന് പണിെയടുkുക ആവശ െpടുnു

2

They were going to the airport

aവര് വിമാനtാവളtിന് േപായിെകാ

2

Something can always go wrong

eേnാവസ്തു eേpാഴും െത ായ േപാകാം

2

Where are we

e

2

I will send the police

ഞാന് േപാലീസെയ aയ kുമ്

2

The magazine was one year old

മാസിക on് െകാലം പഴയ ആയിരുnു

2

She was angry

aവള് കുപിതനായ ആയിരുnു

2

She was angry at her husband

aവള് aവള െട ഭ tാവില് കുപിതനായ ആയിരുnു

2

She was crying

aവള് കര

2

He walked into his house

2

She cooked the raw apples

2

She saw the flashing light

2

Superman was strong

സുെപര്മന് ശkമായ ആയിരുnു

2

Superman could pick up a house

സുെപര്മന് oരു വീടിെന േമേല eടുkാ പ ി

2

He looked down the street

aവന് െതരുവിെന താെഴ േനാkി

2

He sat down on the bench

aവന് െബ

2

The wind was blowing

2

The bus accident was near a dam

കാ ് കാ വ്ീശിെകാ ടിരിc ബസ്-aപകടം oരു aണെk ് aരിെകയുളള ആയിരുnു

2

He was poor

aവന് ദരി ദമായ ആയിരുnു

2

The meals were cheap

ഊണുകള് വിലകുറ

Floating is so easy

oഴുകുക a പകാരം eള pമായ ആണ്

Your mother said to tell the truth

നി

2

They say that lying is evil

aവര് നുണ േദാഷകരമായ ആണ് പറയുnു

2

That is a big lie

ആ oരു വലിയ നുണ ആണ്

2

Look at the people around you

നി

2

The weather got cold

കാലാവs തണുt കി ി

2

He turned up the volume

aവന് വ ാ തെt േമേല തിരി

2

He looked outside his door

aവന് aവന്െറ കതക് പുറം േനkി

2

Then they pulled out guns

പിnീട് േതാkുകള് പുറt് വലിkുക aവര്

2

He was late

aവന് കാലംെത ിയ ആയിരുnു

2

They sat on the towel

2

He talked to them for a minute

aവര് തൂവാലയില് iരിc aവന് aവെരtെnk് oരു kണtിനുേവ ടി സംസാരിc

ടിരിc

ടിരിc

ള്

ടിരിc

ുെകാ

aവള് aപക മായ ആpി pഴ

െള പാചകംെചy ക

aവള് മിnിkുക െവളിcെt ക

ടു

ചില് താെഴ iരിc

ആയിരുnു

ള െട am േനരിെന പറയുക പറ



ള് ചു പാടും ആള കള കളില് േനാkുക

കെള aട c

aവന് aവന്െറ വീടിന് ullില് നടnു

2

2

് ആണ് ഞ





2

They are alone

aവര് തനിcാkി ആണ്

2

I must find another job

ഞാന് മെ ാരു േജാലിെയ ക

2

They had four children

aവ k് നാലാം കു ികള് u

2

She was always sick

aവള് eേpാഴും ദീനമായ ആയിരുnു

2

It was nice and cold

aത് ഹൃദ മായ uം ചീരാp് ആയിരുnു

2

They drove to the lake

aവര് കായലിന് വ

2

They got out of the car

aവര് കാരി െറ പുറt് കി ി

2

Then they went home

പിnീട് aവര് വീടിെന േപായി

2

She played for several hours

aവള് പേത കമായ മണിkൂരുക kുേവ

2

They were crying

aവര് കര

2

powerful earthquake strikes philipines

വീരനായ ഭൂമികുലുkം-പണിമുടkുകള് ഫിലിപിെനunു

2

Your dog does not eat grass

നി

2

He did not listen to his friends

aവന് aവന്െറ കൂ കാരനുക k് േക ില

2

Her husband could not believe it

aവള െട ഭര്tാവ് aതിെന വിശ സിkാ പ ില

2

She could not handle it

aവള് aതിെന ൈകകാര ംെചyാ പ ില

2

I am doing good

ഞാന് നല െച തുെകാ

2

Few persons seemed to love him

ചുരുkം ആള കള് aവെന

2

He pushed me to the door

aവന് eെnെയ കതകിന് തllി

2

I reached my home

2

Grandmother took her to the garden

2

Life was difficult for Somu

2

He had made a new friend on his way

മുt ി aവെള പൂേnാ tിന് eടുtു ജീവിതകാലം െസാമുവിനുേവ ടി ദു കരമായ ആയിരുnു aവന് oരു പുതിയ കൂ കാരനിെന aവന്െറ വഴിയില് u ടാkിയി ടായിരുnു

2

Bablu did not go to school

ബാ

2

I do not want to go to school

ഞാന് വിദ ാലയtിന് േപാകുക ആവശ െpടുnില

2

I will teach you to read and write

ഞാന് നി

2

Somu loved to read ghost stories

െസാമു ഭൂതം-കഥകെള വായിkുക സേനഹിc

2

They loved everyone

2

A farmer had some puppies

aവര് eലാവരുെt സേനഹിc oരു ക ഷകനിന് a പം നാ kു u ടായിരുnു

ടിേയാടിc

ടി കളിc

ടിരിc

ള െട നായ പുലിെന തിnുnില

ടിരിkുnു

േനഹിkുക േതാnി

ഞാന് eന്െറ വീടിെന etിേc nു

I threw the brick

ള വിദ ാലയtിന് േപായിയില ള് വായിkുക uം eഴുതുക പഠിpിkുമ്

ുകള്

I planted a tree in my garden

2

He called me yesterday

aവന് inെല eെnെയ വിളിc

2

He explained to me

aവന് eെnk് വിവരിc

3

Then she turned on the stove

3 3

She took an egg out of the refrigerator He picked up the can of shaving cream

പിnീട് aടുpില് തിരിയുക aവള് aവള് eടുtു oരു a െt പുറt് ശീതീകരണയ natി െറ aവന് പാ pാടെയ kൗരംെചy ക കഴിയുകെയ േമേല eടുtു

3

Then he shaved his upper lip

3

He had red bites all over his body

3

She told Kim to fill out many forms

2

ഞാന് i ികെയ eറി ു ഞാന് oരു മരെt eന്െറ പൂേnാ tില് ന പിടിpിc

പിnീട് aവന്െറ upറ് ചിറി kൗരംെചy ക aവന് aവന് ചുവn കടികള് eലാ േമല് aവന്െറ ശരീരം u ടായിരുnു aവള് കിം ധാരാളമായ രൂപ െള പുറt് നിറ kുക പറ ു

ടായിരുnു

2

ുെകാ

െടtുക

3

She asked her mom to cut the mole off with a razor

aവള് aവള െട amെയ മുറിkുക മറുകിെന കള oരു kരktിേയാട് േചാദിc

3

He got up

3

3

It was still frozen He cut the brown spots out of the apple She took the bag of coffee beans out of the freezer She put a paper filter into a plastic cone She waited until the cup was full of hot coffee She sipped her coffee while she read the newspaper

3

I can pull the cap down over my ears

aവന് േമേല കി ി aത് നി ലമായിരിkുn തണുp െകാ ടുറc േപാകെp aവന് മുറിc തവി നിറമുll പുളളികെള പുറt് ആpി pഴatി െറ aവള് പുറt് fെരezeര് കാpി-പയറ് സ ചിെയ eടുtു aവള് oരു വര്tമാനp തം-ഫി തറിെന oരു ള തിക് കൂ പിന് ullില് i aവള് കp് ചൂടുll കാpിയുെട നിറ ആയിരുnു വെര കാtിരിc aവള് aവള െട കാpിെയ aവള് വ tമാനപ തെt വായിkുക aേpാള് മുtിkുടിc ഞാന് വലിkാം െതാpിെയ താെഴ eന്െറ െചവികള് േമല്

3

Paula said she wasn't pretty

3

I do not have anyone

3

I do not have any brothers

3

What was the noise

3

He drove it out to the street He wants to play golf with me next week

3

3

3

3

െപൗല aവള് നിസര്gസുnരമായ ആയിരുnു പറ ു eനിk് ഏെത ിലുെമാരാള് u ടായിരിkുക iല െചy ക eനിk് ഏെത ിലും സേഹാദര മാര് u ടായിരിkുക iല െചy ക

3

ശbം ആയിരുnു ഏത്

aവന് വ ടിേയാടിc aതിെന പുറt് െതരുവിന് aവന് കളിkുക െഗാ ഫ്~e eെnേയാട് aടുt ആഴ്ച ആവശ െpടുnു ഞാന് നി െള സഹായിkുക aത് eടുkുമ് ധാരാളമായ ആ ചകള് നി ള െട െഗാല്ഫ്-തൂkിയിടുകെയ പുേരാഗമിkുക

His friends didn't believe him Will you help me improve my golf swing

aവന്െറ കൂ കാരനുകള് aവെന വിശ സിkുക നി ള് eെn eന്െറ െഗാല്ഫ്-തൂkിയിടുകെയ പുേരാഗമിkുക സഹായിkുമ്

Don't drink and drive Let me make you a cup of hot chocolate The landlord said he would talk to the lady

കുടിkുക uം വ ടിേയാടിkുക eെn നി ള് ചൂടുll െചാെകാലെ യുെട oരു കp് ആയി u ടാkുക aനുമതിെകാടുkുക ഭൂവുടമsന് aവന് മാന സ തീk് സംസാരിkാം പറ ു enാലും aവന് െചy ക

3

But he never did The dog's mouth was bigger than the dog

3

One day he yelled at the dog

aവന് നായയില് aലറുക on് പകല്

3

The lady got angry

മാന

3

aവള് മറുപടിപറയുക

3

She couldn't answer He knew that electricity was dangerous The teacher walked into the classroom

3

He didn't know what to do

aവന് െചy ക ഏത് aറിയുക

3

What was wrong

െത ായ ആയിരുnു ഏത്

3

But he didn't tell them how he felt

enാലും aവന് aവെരtെnെയ aവന് aനുഭവിc

3 3 3 3

3

3

3

I can't help you It will take many weeks to improve your golf swing

3

3

നായ ് വായ് നായ കാള് ബിgറ് ആയിരുnു തീ കുപിതനായ കി ി

aവന് ആലkികത ആപല്kരമായ ആകുക aറിയുക ീചറ് kസ സൂtിന് ullില് നടnു



e

െന പറയുക

3

He didn't say anything

aവന് eെn ിലുെt പറയുക

3

Many planes were behind it

ധാരാളമായ നിരpായകള് aത് പിnില് ആയിരുnു

3 3

Planes are for flying, not sitting The doctor asked her a lot of questions

നിരpായകള് പറkുക , iരിkുക ആണ് ൈവദ ന് aവള് േചാദ ള െട oരു eലാം ആയി േചാദിc

3

The doctor asked for a blood sample

ൈവദ ന് oരു രkം-ആദ ശtിനുേവ

3

You need to exercise

നി

3

Walk up stairs every day

േകാണികെള േമേല നടkുnു ഓേരാ പകല്

3

She didn't believe her doctor

aവള് aവള െട ൈവദ നിെന വിശ സിkുക

3

She didn't exercise

aവള് aഭ സിkുക

3

I am not a kid

ഞാന് oരു കു ി ആണ്

3

I have no food

eനിk് iലാെത ഭkണം u

3

But water has no taste

enാലും െവളളtിന് iലാെത രുചി u

3

My car has no bed

3

Many people do not have a car

eന്െറ കാരിന് iലാെത കിടk u ട് ധാരാളമായ ആള കള ക k് oരു കാര് u iല െചy ക

3

A street has no bed

oരു െതരുവിന് iലാെത കിടk u

3

I don't know what to do

ഞാന് െചy ക ഏത് aറിയുക

3

I don't know where to go

3

Carol said they must leave early

3

You never know what can go wrong

3

They left two hours early

3

The train made many stops

തീവ

3

Target was having a sale

unം oരു വി പനെയ u

3

We have no more mustard



3

She told him to take a seat

aവള് aവന് oരു iരിpിടെt eടുkുക പറ

3

He looked at the date on the magazine

aവന് ദിനtില് മാസികയില് േനkി

3

He didn't mind

aവന് ശdിkുക ആള കള കള് സുെപ മനിന് ക k eലാ െള െകാടുtു

He was at the bus stop

ട്

ട്

ടായിരിkുക

ട്

നി ള് െത ായ േപാകാം ഏത് aറിയുക aവര് ര ട് മണിkൂരുകള് iളം പായtില് uേപkിkലി ടി ധാരാളമായ ബs്~കെള u

ടാkി

ടായിരുnുെകാ

k് iലാെത െമാെര കടുക് u

ടിരിc

ട് ു

ടatി െറ

aവന് ബസ്-ബs്~iല് ആയിരുnു aത് etിേcരുക ബസkുേവ ടി സമയം ആയിരുnു

3

It was time for the bus to arrive

3

He stood up again

3 3

I also get the news from my friends I can read the newspaper any time I want

3

I can read any story I want

aവന് നിnു േമേല വീ ടും ഞാന് aതുകൂടാെത കി nു വാ tെയ eന്െറ കൂ കാരനുകളി നിn് ഞാന് വായിkാം വ tമാനപ തെt ഏെത ിലും സമയം ഞാന് ആവശ െpടുnു ഞാന് ഞാന് ആവശ െpടുക ഏെത ിലും കഥെയ വായിkാം

3

But it didn't stop

enാലും aത് നിര്tിവ kുക

3

It kept going I have only one problem with my newspaper

aത് േപാകുക വ c

3

eനിk് on് പ നം onുമാ തമായ eന്െറ

ടിവരിെകൗnു

ഞാന് േപാകുക e ് aറിയുക ഭkിഗീതം aവര് iളം പായtില് uേപkിkല് പറ ു

People gave Superman lots of candy

3

3

ള് aഭ സിkുക േവ

ടി േചാദിc

വര്tമാനപ തം u

3

People like good news He told her that he was in love with her

നല വാര്t േപാെല ആള കള കള് aവന് aവള് aവന് aവേളാട് സേനഹtില് ആയിരുnു പറ ു

3

He did not even know her

aവന് േപാലും aവെള aറി

3

He didn't know anything about her

aവന് eെn ിലുെt aവള് പ ി aറിയുക

3

She did not see him or hear him

aവള് aവെന ക

3 3

Everything you touch has germs Wash your hands after you touch other people

നി ള് aണുkള് u ട് െതാടുക കഴുകുക നി ള് പിnീടു നി ള െട ൈകകള് പര്ശിkുക മേ തായ ആള കള കള്

3

It is not a wide road

aത് oരു വി താരമുളള നിരt് ആണ്

3

I will plug it in

ഞാന് aട kുമ് aതിെന aകേtk്

3

But it wasn't a good job

enാലും aത് oരു നല േജാലി ആയിരുnു

3

That made him angry

ആ aവന് കുപിതനായINF u

3

He didn't want to go to jail

aവന് ജയിലിന് േപാകുക ആവശ െpടുക

3

Right now life was bad

ipം aര്ഹത ചീtയായ ആയിരുnു ജീവിതകാലം

3

But he would make it better

enാലും aവന് aത് കൂടുത നലതായINF u

3

This was not their money She went across the street to the liquor store

iത് aവരുെട പണം ആയിരുnു aവള് െതരുവ് വിരുdമായ ദാവകം-കലവറk് േപായി

3

ില

ടില aഥവാ aവെന േകള്kുക

ടാkി

ടാkാം

3

a പകാരം aവള് aവള െട ബസെയ ന െpടുക a പകാരം aവള െട തലവന് aവള് കാലംെത ിയ ആയിരുnു aറി ില

3

But she couldn't find dark red

enാലും aവള് iരു

3

Are you coming or not

നി

3

Why wouldn't I

ഊi െവൗല്ദ് ഞാന്

3

Your grandma told you not to lie

നി

3

No one can tell the truth all the time

iലാെത oരു eലാം സമയം േനരിെന പറയാം

3

Everyone lies sometimes

eെവര്െയാെന കൂെടkൂെട നുെണൗnു

3

You lie to be polite

You lie to get something you want

നി ള് നുണ സഭ മായINF ആകുക നി ള് നുണ നി ള് േനഹിkുക െസാെമാെനെയ പതിേരാധിkുക നി ള് നുണ നി ള് ആവശ െpടുക eേnവസതുവിെന കി ക

3

You lie to be popular

നി

3

They say that they never lie

aവര് aവര് നുണ പറയുnു

3

തറയില് മ

3

He plants yellow corn in the ground He plants the yellow corn in the spring

3

They don't pay for it

aവര് aതിനുേവ

3

They eat it while it is in the field

aവര് aത് aത് പാടtില് ആകുക aേpാള് തിnുnു

3

They don't cook it

aവര് aതിെന പാചകംെചy ക

3

They eat it raw

aവര് aത് aപക മായINF തിnുnു

3

The farmer doesn't get angry

കര്ഷകന് കുപിതനായ കി ക

3

Is that true

ആണ് ആ തു

You lie to protect someone you love

3

3

3

So she didn't miss her bus So her boss did not know that she was late

3

ട ചുവpിെന ക

െടtുക

ള് aഥവാ iലെയ വരുക ആണ് ള െട ഗന് മ നി

ള് നുണ പറ



ള് നുണ ജനകീയമായINF ആകുക ധാന ം aവന് െചടികള്

വസnകാലtില് മ

ട്

ധാന ം aവന് െചടികള്

ടി വിലെകാടുkുക

3

We do not have fancy technology

ഞ k് ഭാവന പയുkശാ iല െചy ക

3

Then it got louder

പിnീട് െലൗദറ് കി ക aത്

3

It was a storm

aത് oരു െകാടു ാ ് ആയിരുnു

3

The mountains will have snow

മലക k് മ

3

3

Then he can read again But right now he must live with the pain They both worked for the same supermarket But you can't make a snow man out of hail They planned to get married and live together

3

But they needed a down payment

പിnീട് വീ ടും വായിkുക aവന് enാലും ipം aര്ഹത aവന് േവദനേയാട് ജീവിkുക aവര് iരുവരും പണിെയടുkുക തുല മായ സുെപ മ െക ിനുേവ ടി enാലും നി ള് u ടാkുക oരു മ ്-പുരുഷനിെന പുറt് വിളിയുെട aവര് മറീട് കി ക uം െ ാെഗtറ് ജീവിkുക uേdശ ംi enാലും aവര് oരു പkിcിറകിെലമൃദുേരാമം വീ ലിെന േവ ടിവരിൈക

3

Two people were dead



3

Time always went too fast

സമയം eേpാഴും േപായി aധികമായി േവഗtില്

3

They shook the sand out of the towel Thank you for taking me to the beach today

aവര് ചലിc മണലിെന പുറt് തൂവാലയുെട നി െള eെnെയ കടല്tീരം-inിന് eടുkുക കൃതjതകാ ക

3

3

ട് ആള കള കള് പൂര് മായി ആയിരുnു

3

Other things have weird shapes

aവര് aവരുെട കാലുകള് നന INF കി ി aവര് ക ടു ധാരാളമായ ആള കള കെള തമാശെയ u ടായിരിkുക െകൗ ര്ീസക k് മ nവാദസംബnമായ ആകൃതികള് u ട്

3

3

Then he visited her at home But it was the best thing that ever happened to you Prisoners can't hide in their orange uniforms A recession is a time when people have only a little money

പിnീട് aവെള വീ ില് സn ശിkുക aവന് enാലും aത് തുടര്cയായി നി k് സംഭവിkുക ത ് െബs് സംഗതി ആയിരുnു തടവുകാരനുകള് aവരുെട മധുരനാര യൂണിേഫാറ ളില് oളിkുക oരു പിന്വാ ല് ആള കള ക k് onുമാ തമായ oരു െചറിയ പണം u ട് eേpാള് oരു സമയം ആണ്

3

They don't buy new things

aവര് പുതിയ സംഗതികെള വാ

3

They don't go to Disneyland

aവര് ദിസെനഐല ഡിന് േപാകുക

They point the gun at a friend

aവര് േതാkിെന oരു കൂ കാരനില് നിയമിkുnു

They get cancer

3 3

3

3

3

They got their feet wet They watched many people having fun

3

ുക

It does not have any stairs

3

It doesn't have a second floor They had to take her to the doctor often He said she would be healthy in a few years

aതിന് oരു aടിtറ u ടായിരിkുക ന് ് െചy ക aവ k് eടുkുക aവെള ൈവദ നിന് കൂെടkൂെട u ടായിരുnു aവന് aവള് oരു ചുരുkം െകാല ളില് ആേരാഗ കരമായ ആകുക പറ ു

3

The ice cubes floated to the top He took a cigarette out of the Marlboro box

iെക-സമചതുരഷ ഭുജ ള് പ പരtിന് oഴുകി aവന് eടുtു oരു ചിഗെരെtെയ പുറt് മര്ല്െബാെരാ-െപ ിയുെട

3

But it was not gold

enാലും aത് സ

3

Then one of the kids caught a fish

പിnീട് oരു മt െt പിടിkുക കു ികള െട on്

3

aവര് ഞ ടിെന കി ക aതിന് ഏെത ിലും േകാണികള് u െചy ക

3 3 3

ടായിരിkുക വില്ല്

3

ടായിരിkുക

3

്u

തം u

ടായിരിkുക iല

ം ആയിരുnു

3

They took a chance with their money

3

People said they don't have money

aവര് oരു aവസരെt aവരുെട പണtിേനാട് eടുtു ആള കള കള് aവ k് പണം u ടായിരിkുക ന് ് െചy ക പറ ു

3

They have no money

aവ k് iലാെത പണം u

3

3

The wife couldn't believe it They passed a new law to protect children They jumped when they heard a strange sound

ഭാര aതിെന വിശ സിkുക aവര് കടnുേപായി oരു പുതിയ നിയമെt കു ികെള പതിേരാധിkുക aവര് aവര് oരു aസാധാരണമായ ആേരാഗ മുllെയ േക eേpാള് ചാടി

3

I think she likes to argue

ഞാന് aവള് വാദിkുക i െpടുnു ചിnിkുnു

3

I wanted it closed

3 3

I guess my vocabulary is not so good The pilot said he felt sorry for the two dead birds

ഞാന് aത് aട c ആവശ െp ഞാന് eന്െറ പദാവലി a പകാരം നല ആണ് ഊഹിkുnു ൈവമാനികന് aവന് ര ട് മരിc പkിക kുേവ നിn മായ aനുഭവിc പറ ു

3

No one died

iലാെത oരു മരിc

3

she didn't care

aവള് ശd

3

But it didn't stop

enാലും aത് നിര്tിവ kുക

3

But Richard did not pull over

enാലും രിചര്ദ് oവറ് വലിcില

3

He never changed his car clock

aവന് aവന്െറ കാര്-ഘടികാരെt മാ ക

3

It wasn't broken

3

He put it back on the stove

3

I hope she is all right

3 3

Their mom wasn't going She takes another cookie out of the package

3

He teaches them how to read

3

She takes it out of her mouth

aവരുെട am േപാകുക aവള് പുറt് െപാതിെk ് മെ ാരു കൂകീെയ eടുkുnു aവന് aവെരtെnെയ വായിkുക e െന പഠിpിkുnു aവള് eടുkുnു aതിെന പുറt് aവള െട വായി െറ

3

They are pictures with words

aവര് പദ

3

He takes his glasses off

aവന് i aതിെന പിnില് aടുpില് ഞാന് aവള് aര്ഹത ആണ് പൂര് മായി പതീkിkുnു

േളാട് ചി തം ആയി ആണ്

3

He did not have any friends in school

3

She jumped in again

aവള് ചാടി aകേtk് വീ ടും

3

The other was the manager Delhi Police files FIR against Kiran Bedi You can not wash your hands too often

മേ തായ േമ േനാ kാരന് ആയിരുnു െഡല്ഹി-േപാലീസ് കിരണ്-െബദി eതിേര ശ കപാദപെt രാണ് നി ള് നി ള െട ൈകകെള കൂെടkൂെട aധികമായി കഴുകാ പ ില

Her mom told her not to worry James took the milk out of the refrigerator

aവള െട am aവള് വ ാകുലെpടുക പറ െജയിംസ് eടുtു പാലിെന പുറt് ശീതീകരണയ natി െറ

3

He needs his glasses to read She does not have many germs on her hands

aവന് eടുkുnു aവന്െറ ക ാടികെള കള ് aവന് േവ ടിവരിെകൗnു aവന്െറ ക ാടികെള വായിkുക aവ k് aവള െട ൈകകളില് ധാരാളമായ aണുkള് u ടായിരിkുക iല െചy ക aവന് വിദ ാലയtില് ഏെത ിലും കൂ കാരനുകള് u ടായിരിkുക iല െചy ക

3

3 3 3 3

ടി

aത് െപാ ിkുക

3

ട്



3

Both mayors were happy

iരുവരും നഗരപതികള് സnുഷ്ടമായ ആയിരുnു

3 3

You can not miss them My mother wished me to have a good education

3

I don't need new workers right now

3

I will teach you how to be even better

നി ള് aവെരtെnെയ ന െpടാ പ ില eന്െറ am eെnk് oരു നല വിദ ാഭ ാസം u ടായിരിkുക ആ ഗഹിc ഞാന് പുതിയ പണിkാരിെന ipം aര്ഹത േവ ടിവരിക ഞാന് നി െള േപാലും െബ റ്INF ആകുക e പഠിpിkുമ്

3

He didn't like that idea Place your hands firmly on the ground A group of frogs were traveling through the woods

aവന് ആ ആശയെt i െpടുക

Some people say that lying is bad You cannot talk about colors to blind people He had many arms and multiple heads.

a പം ആള കള കള് നുണ ചീtയായ ആണ് പറയുnു നി ള് നിറ ള് പ ി anമായ ആള കള ക k് സംസാരിkാ പ ില

The waves at the beach will be high The next morning the weather was clear

കട tീരtില് തിരമാലകള് uയര്n ആകുക

3 3 3 3 3 3

She was also having fun He had one thousand children with his wives

3

3

വ kുക നി ള െട ൈകകെള മുറുെക തറയില് തവളകള െട oരു സംഘം തടികള് മുേഖന യാ തെച തുെകാ ടിരിc

aവന് ധാരാളമായ ൈക uം മട

ടായിരുnു

കാലാവs വ k ആകുക aടുt പഭാതം aവള് തമാശെയ aതുകൂടാെത u ടായിരുnുെകാ ടിരിc aവന് on് ആയിരം കു ികള് aവന്െറ ഭാര മാര് u ടായിരുnു

aത് oരു മാനുഷികമായ oc ആയിരുnു കു ികള് ഓേരാ മേ തായെയ ക ടുമു ക സnുഷ്ടമായ ആയിരുnു

3

But they can be dangerous

3

I know what you are thinking

enാലും aവര് ആപല്kരമായ ആകുക ഞാന് നി ള് ചിnിc െകാ ടിരിkുnു ഏതിെന aറിയുnു

3

They were meeting after 10 years Your father had an accident while driving to office

3

3

It made him feel very proud They waited till everyone else was asleep it became difficult to preserve the peace

3

3

It was a human voice The children were never happy to meet each other

3

3 3 3 3 3

Rama gave Hanuman Sita's ring Rotate your fist clockwise and anticlockwise. Lower you head to face your navel. Place your hands firmly on the ground

aവര് 10 െകാല ള് പിnീടു ക ടുമു ിെകാ ടിരിc നി ള െട acനിന് കാര ാലയtിന് വ ടിേയാടിkുക oരു aപകടം-aേതസമയം u ടായിരുnു aത് aവന് വളെര aഭിമാനമുll aനുഭവിkുnു u ടാkി aവര് eലാവരും aലായിരുnുെവ ില് നി ദയില് ആയിരുnു aതുവെര കാtിരിkുക aത് ശാnിെയ പുലര്tുക ദു കരമായ ആയിtീ nു രമ ഹനുമന്-സീത ് വളയെt െകാടുtു നി ള െട ൈകpട ഘടികാരദിശയില് uം aണ് ിെ ാk ിെസ കറkുക നി ള് െലാവറ് നി ള െട െപാkിെള േനരിടുക തലവ kുnു വ kുക നി

ള െട ൈകകെള മുറുെക തറയില്

ൈദവ

3

The gods lived in Heaven In the skies there were magical creatures.

3

It was inhabited by humans.

aത് മനുഷ ഗുണ

3

ആകാശ

ായ തല u

3

െന

ള് ആശാശtില് ജീവിc ളില് aവിെട മഗികല് ആകുക ജnുkള്. ള llകളാല് പാ kെp

3

He went about disguised as Indra

aത് രാജാവ്-െ ാസെരാ ാല് ഭരിkെp aത് oരു ക ാടി-ആരാധനാസഥലtില് iലാtയാള് aവെന െകാലാ പ ി a പകാരം വ kെp aവന് സംബnിc് ദിസഗ ിെസദിെന iന് ദ a പകാരം േപായി

3

He seduced many of the goddesses. He had one thousand children with his wives The demons became gradually more aggressive and powerful Janaka gave her the name Sita and brought her up

aവന് േദവികള െട മനിെയ വഴിെത ിc aവന് on് ആയിരം കു ികള് aവന്െറ ഭാര മാര് u ടായിരുnു പിശാചുകള് പടിപടിയായി െമാെര aെ gsിവ് uം വീരനായ ആയിtീ nു ജനക aവ k് േപര്-സീതെയ െകാടുtു uം െകാ ടുവരിൈക aവെള േമേല

He was educated by Shiva Hanuman's mother had told him to join Rama, He advised Rama against proceeding towards Lanka on his own. They would need a powerful army to succeed Hanuman mobilised support from friendly kings, Rama instructed Hanuman to find the way to Lanka and to carry a message for Sita.

aവന് ശിവയാല് പഠിpിkെp ഹനുമന് ് am aവന് രമെയ േയാജിpിkുക പറ ി ടായിരുnു , aവന് ല േലk് aവന്െറ o നില് മുേnാ നീ eതിേര രമെയ uപേദശിc aവര് േവ ടിവരികആം oരു വീരനായ േസനെയ തുട cയായിവരിക ഹനുമന് ചാെയ സൗഹാര്dപരമായ രാജാk മാരി നിn് പടെയരുkംെച തു

3 3 3 3 3 3 3

രമ ഹനുമന് വഴിെയ ക െടtുക നിര്േdശിkുക ല aന്ഡ് സീതkുേവ ടി oരു സേnശെt ചുമkുക

aവന് സീത ് വളയെt പിnില് രമ െകാടുtു രാവണാ സീതെയ ല k് െകാ ടുവരിൈകയി ടായിരുnു uം utരവാകുക aവന്െറ ആയിരം മkെള aവെള കാkുക രമ uം ലkമന് സീതയുെട aേന ാഷിkുകയില് പുറt് സഥാപിc aടിയnിരമായി ഹനുമന് oരു രാജകുമാരികള െട മകന് a പകാരം സഹിkെp ഹനുമന് aവന്െറ acന് ് ബലം uം മഗികല് ശkി aനnരവകാശമായികി ി

3

Sadayu attacked Ravana in the air

he tried to lure her to leave Rama when she refused, Ravana shifted back to his true shape. Ravana abducted her and flew away with her towards Lanka

സദയു വായുവില് രാവണാെയ ആ കമിc രാവണാ oരു ചാkുഷവിദ വളയെt സീത ് ൈകവിരലി നിn് eടുtു uം aതിെന സദയുവില് eറി ു രാവണാ aവnെnെയ oരു മുനിk് ullില് പരിവ tിc aവന് aവള് രമെയ uേപkിkല് പേലാഭിpിkുക ശമിc aവള് നിരസിc eേpാള് , രാവണാ aവന്െറ തു ആകൃതിk് പിnില് സഥാനംമാ ്റു രാവണാ aവെള ത ിെkാ ടുേപാകുക uം പറnു ദൂെര aവേളാട് ല േലk്

he decided to seize her he went to the forest together with one of his men They went near the place where Rama lived with his wife and his brother. Mareet transformed himself into a beautiful golden deer.

aവന് aവെള കyടkുക തീരുമാനിc aവന് വനtിന് െ ാെഗtറ് aവന്െറ പുരുഷ മാരി െറ on് േപായി aവര് iടം aരിെകയുളള രമ aവന്െറ ഭാര uം aവന്െറ സേഹാദരന് ജീവിc e ് േപായി മരീ aവnെnെയ ullില് പരിവ tിkുക a സുnരിയായ കനകമയമായ മാന്

3

3

Ravana took a magic ring from Sita's finger and threw it at Sadayu Ravana transformed himself into a hermit

3

3

3

3

he gave Sita's ring back to Rama. Ravana had brought Sita to Lanka and ordered his thousand sons to guard her. Rama and Lakshaman immediately set out in search of Sita. Hanuman was born as the son of a princess Hanuman inherited his father's strength and magical powers.

3

3 3 3 3 3 3 3

ുക

3

3

3

It was ruled by King Tosarot. It was kept in a glass shrine so that nobody could kill him.

3

3

Rama went hunting for the deer he cried out for help with the voice of Rama

രമ ഹു ടി aവന് കര ocേയാട്

3

Lakshaman heard the cry

ലkമന് കരയുകെയ േക

3

He and Rama remained friends,

aവന് uം രമ കൂ കാരന് ആയി aവേശഷിc

3

he refused to accept the throne. Rama would not break his father's promise. He went to live in the deep forests together with Sita and his younger brother, Lakshaman.

aവന് സിംഹാസനെt സ ീകരിkുക നിരസിc

Barata accepted the throne He vowed to kill himself after fourteen years

ബരത സിംഹാസനെt സ ീകരിc aവന് aവnെnെയ പതിnാല് െകാല െകാല ക െവാവി

3

Rama did not return to retrieve it Rama and Sita lived happily together for a while. Kaiyakesi was the second wife of King Tosarot of Ayudhya

3

King Tosarot, had promised her to fulfil any of her wishes.

3

she wished that her son Barata should rule as a king for fourteen years.

രമ aതിെന രkെpടുtുക തിരിെctിയില രമ uം സീത ജീവിc സേnാഷേtാെട െ ാെഗtറ് oരു aേതസമയtിനുേവ ടി ൈകയെകസി aയുധ ായുെട രാജാവ്-െ ാസെരാ ി െറ aധികമായ ഭാര ആയിരുnു രാജാവ്-െ ാസെരാ ് , u ടായിരിkുക aവള് നിറേവ ക പതിൈj aവള െട ഏെത ിലും ആ ഗഹിkുnു. aവള് aവള െട മകന്-ബരത പതിnാല് െകാല kുേവ ടി oരു രാജാവ് a പകാരം ഭരിkണം ആ ഗഹിc

3

King Tosarot became very unhappy He would not eat nor drink until he died of grief.

രാജാവ്-െ ാസെരാ ് വളെര aസnു ആയിtീ nു aവന് തിnാ പ ില aതുമില aവന് ദുഖatി െറ മരിc വെര കുടിkുnു

3

Once he heard the story.

aവന് കഥെയ േക

3

3

3

ള് പിnീടു

3

3

aവന് വന k് െ ാെഗtറ് സീത uം aവന്െറ െയൗ റ് സേഹാദരന് , ലkമന് േപായി

3

ിെന മാനിനുേവ ടി േപായി ു പുറt് സഹായtിനുേവ ടി രമയുെട

രമ aവന്െറ acന് ് വാkിെന െപാ ിkാ പ ില

3

3

oരിkല് .

PUBLICATIONS

[1] R. Harshawardhan, Mridula Sara Augustine, K. P. Soman, “Phrase based English – Tamil Translation System by Concept Labeling using Translation Memory”, in Int. Journal of Computer Applications (IJCA), ISSN: 0975 – 8887, Vol. 20, No. 3, April, 2011.

[2] R. Harshawardhan, Mridula Sara Augustine, K. P. Soman, “A Simplified Approach to Word Alignment Algorithm for English-Tamil Translation”, in Indian Journal of Computer Science and Engineering (IJCSE), ISSN: 0976-5166, Vol. 2, No. 1, 2011.

[3] R. Harshawardhan, Mridula Sara Augustine, K. P. Soman, “Advanced English – Malayalam Translation Memory for Natural Language Processing Applications”, in Proc. of Nat. Conf. on Indian Language Computing (NCILC), February, 2011.