
Transformation of Bangla to English Word/Sentences for Proficient Net-Searching +

Depok Chakma+, Md. Sazzadur Ahamed+, Mahmudul Hasan~

Daffodil International University, Dhaka; ~Comilla University, Comilla, Bangladesh

Abstract: Over 245 million people converse in Bangla, and a very large number of them use the internet to find information. Many of them do not know English well, which becomes a problem when they look for information through search engines: some do not get appropriate results because of their poor command of English. To overcome this drawback, we developed an interface that converts Bangla to English for use with search engines [8].

Keywords: Bangla Net-Searching, Browser apps, Language Converter.

I. INTRODUCTION

In this age of information technology, we use search engines and web browsers to find the information we want for almost every purpose. We can hardly imagine a day without searching the internet, and to use the internet we need a search engine and a web browser (such as Internet Explorer, Mozilla Firefox, or Google Chrome) [8]. People who lack knowledge of English very often struggle to search the internet. We therefore offer users an additional facility through an interface in the web browser, so that they can run a search using the Bangla alphabet (typed with the Bijoy or Avro keyboard [5]), Bangla words written in the English alphabet, or the regular search language [6][7], which is English. In every case, the search results appear in English.

The main idea behind this interface is to give search-engine capability to users who rely on search engines every day but do not have enough vocabulary to write a search request correctly and find approximately relevant information. These users can browse by themselves with our interface, because they can search by writing in the Bangla alphabet (e.g. বাংলাদেশ ইতিহাস), in Bangla written in the English alphabet (Bangladesh etihas), or, if they are able, in English (Bangladesh history). As a result, despite limited language proficiency, a user can find the desired information using any of these formats.

II. DESIGN METHODOLOGY

We provide a database that is connected to the developed interface. This database contains a table called ‘bangla_english’, which holds the data used for Bangla to English conversion. The database resembles an English to Bengali conversion dictionary. The main difference between this database and a typical dictionary lies in the conversion process: a typical dictionary translates only individual words, whereas our system can convert a whole sentence using prior grammatical knowledge.

When someone searches by writing in Bangla, for example “বাংলাদেশ ইতিহাস”, the Bangla words are first sent to the database to check whether they are available. If so, the necessary grammatical rules are applied for an appropriate conversion, and finally the converted English sentence, “Bangladesh history”, is sent to the search engine as its parameter. After receiving this parameter, the default browser (the one the user is currently using) displays the relevant search results.
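The lookup-then-search flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: only the table name ‘bangla_english’ comes from the paper, while the column names, the sample rows, the helper names (`build_demo_db`, `convert_query`, `search_url`) and the search URLs are assumptions made for illustration. Words that are missing from the table are passed through unchanged, and no grammatical rules are applied at this stage.

```python
import sqlite3
from urllib.parse import urlencode

# Assumed search endpoints; the paper only names the engines.
SEARCH_URLS = {
    "google": "https://www.google.com/search",
    "bing": "https://www.bing.com/search",
}


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the 'bangla_english' table."""
    conn = sqlite3.connect(":memory:")
    # Column names are hypothetical, not the authors' schema.
    conn.execute(
        "CREATE TABLE bangla_english (bangla TEXT PRIMARY KEY, english TEXT)"
    )
    conn.executemany(
        "INSERT INTO bangla_english VALUES (?, ?)",
        [
            ("বাংলাদেশ", "Bangladesh"),
            ("ইতিহাস", "history"),
            ("etihas", "history"),  # Bangla written in the English alphabet
        ],
    )
    return conn


def convert_query(conn: sqlite3.Connection, query: str) -> str:
    """Translate each word found in the table; keep unknown words as typed."""
    converted = []
    for word in query.split():
        row = conn.execute(
            "SELECT english FROM bangla_english WHERE bangla = ?", (word,)
        ).fetchone()
        converted.append(row[0] if row else word)
    return " ".join(converted)


def search_url(english_query: str, engine: str = "google") -> str:
    """Attach the converted sentence to the search engine as its parameter."""
    return SEARCH_URLS[engine] + "?" + urlencode({"q": english_query})
```

With this sketch, both “বাংলাদেশ ইতিহাস” and “Bangladesh etihas” are converted to the same English query before the URL is built, which mirrors the three input formats listed in the Introduction.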

Fig. 1: Developed Search engine converter

The important features and characteristics of this interface are the following:

• The developed interface can be used without first opening a web browser. It automatically opens the browser when the Enter button is pressed.
• The desired content can be written in Bengali, in Bengali pronunciation (using the English alphabet), or in English.
• The targeted search results are always in English.
• Drag-and-drop search engine selection (e.g. Google, Yahoo, Bing) is available.
• The interface can be used as a Bengali to English dictionary.
• Word and sentence suggestions are available when typing in the text box.

When users click the exit button of this interface, it is minimized to the taskbar as a taskbar icon. To exit permanently, right-click the icon and click Exit. If someone double-clicks the icon, or right-clicks it and clicks Open, the interface is shown on top again.

III. WORKING PRINCIPLE

As shown in Fig. 2, Bangla words are stored in the database together with the English word, the tense (if the word is a verb), the part of speech, and the person (if the word is a noun, pronoun, or other pronoun). During translation, all data about the words are taken from the database.

Fig. 2: Converter working Procedures

A Bangla sentence can be entered either in Bangla Unicode or in the English alphabet, by changing the language selection option. When someone types a Bangla word, our application searches the database for the equivalent English word. If it is available in the database, the application fetches the data; if not, the word remains the same. If more than one word is typed, the application checks the proper grammatical rules and then converts the whole Bangla input into the corresponding English sentence.

IV. GRAMMATICAL RULES FOR WORD CONVERSION

Translated words are classified into categories depending on their part of speech.

Noun Words: Nouns are a part of speech typically denoting a person, place, thing, animal or idea [3]. Words that are declared in the database as nouns are used as nouns.
Noun Word = Word {Noun}
Example: boi -> book

Words that have no part-of-speech information, or that are not found in the database, are also considered nouns.
Noun Word = Word {null}
Example: Raju -> Raju, রাজু -> Raju.

Prepositional Words: Prepositions show the relationship between a noun or a pronoun and some other word in the rest of the sentence. Examples: of, for, at, in, under, towards, before, about, above, as, as far as, as of, etc. [2]. If the given sentence contains a preposition word, the next word is checked.

Rule 1.1: If no word is found after the preposition word, the words before the preposition are placed after it, and the words are considered a single portion after concatenation.
Prepositional Word = Word {preposition} + Words {before preposition}
Example: desh ar -> of country, bisshobiddaloy ar -> of university, boi ar -> of book.

Rule 1.2: If a word that is not an adjective is found after the preposition, the words before and after the preposition are exchanged. That is, the words before the preposition are treated as the words after it, and the words after the preposition are treated as the words before it.
Prepositional Word = Word {after preposition} + Word {preposition} + Word {before preposition}
Example: desh ar boi -> book of country, bisshobiddaloy ar chatro -> student of university

Rule 1.3: If an adjective word is found after the preposition, the adjective rules are applied. The resulting words are then placed to the left of the preposition word, and the words that were to the left of the preposition are placed to its right.
Prepositional Word = Words {after preposition (from adjective rule)} + Word {preposition} + Words {before preposition}
Example: bisshobiddaloy ar valo chatro -> good student of university

Adjective Words: Adjectives are one of the traditional eight English parts of speech. Adjectives describe or qualify (add something to) the meaning of a noun or pronoun [2]. Examples: good, small, dark, yellow, etc. If an adjective word is found, the next word is checked.

Rule 2.1: If no next word is found, the single adjective word is stored as an adjective.
Adjective Word = Word {adjective}
Example: valo -> good

Rule 2.2: If the next word is not a preposition, the two words are considered a single portion.
Adjective Word = Word {adjective} + Word {next}
Example: valo bisshobiddaloy -> good university

Rule 2.3: If a prepositional word is found directly after the adjective, or after one further word that follows the adjective, the preposition rules take effect and the words are considered prepositional words. The adjective word and the word after it (if any) are considered the same portion before the preposition.
Example: valo bisshobiddaloy ar -> of good university

Other Pronoun: Apart from personal pronouns, all other types of pronoun (including relative, demonstrative, indefinite, reflexive, intensive, interrogative, and possessive pronouns) are considered other pronouns. When an other pronoun is found, the next word is checked.

Rule 3.1: If no word is found after the other pronoun, that single word is considered an other pronoun.
Other Pronoun Word = Word {Other Pronoun}
Example: amar -> my, আমার -> my.

Rule 3.2: If a word is found and that word is an adjective, the adjective rules are applied. If adjective Rule 2.1 or 2.2 applies, the resulting words are concatenated with the other pronoun.
Other Pronoun Word = Word {Other Pronoun} + Word {from adjective Rules 2.1 & 2.2}
Example: amar valo -> my good, amar valo bisshobiddaloy -> my good university
If adjective Rule 2.3 applies, the words are considered prepositional words.
Example: amar valo bisshobiddaloy ar

Rule 3.3: If the second or third word after the other pronoun is a prepositional word, the prepositional rules are applied and the words are considered prepositional words.
Example: amar valo bisshobiddaloy ar -> for my better university

Pronoun Words: A pronoun is a word that takes the place of a noun or noun phrase. Examples: he, she, I [2]. Here, only personal pronouns are considered pronoun words. Words that are declared in the database as pronouns are used as pronouns.
Pronoun Word = Word {Pronoun}
Example: ami -> I, tumi -> you

Questionable Words: Words that are used to ask questions are known as questionable words. Examples: who, what, which, when, where, how, etc. Words that are declared in the database as questionable are used as questionable words.
Questionable Word = Word {Questionable}
Example: ki -> what, konti -> which

Verb: A verb is a word that expresses an action or a state of being [4]. Examples: study, read, writes, shopping, etc. Words that are assigned as verbs in the database are considered verbs.
Verb Word = Word {Verb}

Subject: The first categorized word is taken as the subject, with some additional information (person and category).
Subject Word = Word {first categorized word}

Fig. 3: Identical word Conversion using rules
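The word-ordering effect of the prepositional, adjective and other-pronoun rules in this section can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' code: the `LEXICON` dictionary stands in for the database, its entries and the names `tag` and `convert_phrase` are hypothetical, and only the first preposition in a phrase is handled. The sketch relies on one observation from the rules: adjective and other-pronoun groups keep their order, while the words before and after a preposition swap sides.

```python
# Hypothetical mini-lexicon: romanized Bangla -> (English word, part of speech).
LEXICON = {
    "boi": ("book", "noun"),
    "desh": ("country", "noun"),
    "bisshobiddaloy": ("university", "noun"),
    "chatro": ("student", "noun"),
    "valo": ("good", "adjective"),
    "amar": ("my", "other_pronoun"),
    "ar": ("of", "preposition"),
}


def tag(word):
    """Look a word up; unknown words stay unchanged and count as nouns."""
    return LEXICON.get(word, (word, "noun"))


def convert_phrase(phrase):
    """Convert a romanized Bangla phrase into English word order."""
    tagged = [tag(word) for word in phrase.split()]
    english = [eng for eng, _ in tagged]

    # Position of the first preposition, if any.
    prep_at = next(
        (i for i, (_, pos) in enumerate(tagged) if pos == "preposition"), None
    )
    if prep_at is None:
        # Rules 2.1, 2.2, 3.1, 3.2: adjective and other-pronoun groups
        # are concatenated in their original order.
        return " ".join(english)

    before = english[:prep_at]
    after = english[prep_at + 1:]
    # Rule 1.1 (nothing after the preposition) and Rules 1.2, 1.3, 2.3, 3.3:
    # words after the preposition move to the front, words before it move
    # behind it.
    return " ".join(after + [english[prep_at]] + before)
```

For example, `convert_phrase("bisshobiddaloy ar valo chatro")` yields "good student of university", matching the example given for Rule 1.3.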

V. CREATION OF SENTENCES BY GRAMMATICAL RULES

After categorization, all words are sequenced by grammatical rules to create a sentence.

Auxiliary verb: An auxiliary verb is a verb that adds functional or grammatical meaning to the clause in which it appears. It is used to express tense, aspect, modality, voice, emphasis, etc. [2]. Examples: am, is, are, were, have, has, will, etc.

Rule 1: The auxiliary verb is determined from the person of the subject and the tense of the verb.
Example: am, is, are, have, has, was, were, etc.

Rule 2: If no verb is used, the tense is considered present and the be verb is determined by Rule 1.

Rule 4: If a questionable word is used and there is no verb, the person of the subject is determined and the auxiliary verb is concatenated after the questionable word.
Questionable Word = Word {Question} + Auxiliary verb
Example: tader bisshobiddaloy ar naam ki ? -> What is the name of their university?

Rule 5: If both a questionable word and a verb are present, the be verb is determined by Rule 1 and the auxiliary verb is used instead of the questionable word.
Questionable Word = Auxiliary verb
Example: chatro ki boi porche -> is student reading book

Rule 6: If there is an other pronoun and a questionable word but no verb, the auxiliary verb is determined by Rule 2 and added before the other pronoun.
Other Pronoun Word = Word {Question} + Auxiliary verb + Word {Other Pronoun}
Example: tar naam ki -> what is his name

Rule 7: If there is an adjective and a questionable word but no verb, the auxiliary verb is taken from Rule 2 and added before the adjective words.
Adjective Word = Word {Question} + Auxiliary verb + Word {Adjective}
Example: valo boi konti -> which is good book

Rule 8: If there is a verb but no questionable word, the auxiliary verb is identified by Rule 1 and added before the verb.
Verb Word = Auxiliary verb + Word {Verb}
Example: sey porche -> he is reading

Rule 9: If there is a prepositional word but no verb, Rule 2 takes effect to determine the auxiliary verb, which is added before the prepositional word.
Prepositional Word = Auxiliary verb + Word {prepositional word}
Example: boi ar naam ki -> what is the name of the book

Rule 10: If there is a noun and a subject but no verb or any other word, the auxiliary verb is identified by Rule 2 and added before the noun.
Noun Word = Auxiliary verb + Word {Noun}
Example: sey chatro -> he is a student

After a complete sentence is generated, it is sent to the selected search engine. The results are then displayed by the selected search engine in the web browser.
Example: amra valo bisshobiddaloy ar chatro -> we are the students of good university.

VI. CONCLUSION

Our main aim in developing this work was to enable better use of search engines with only a fundamental knowledge of writing. The interface performs well when it is used for individual conversions. We tried to make it optimal. In future, we want to improve the search option by considering more grammatical rules. We plan to enlarge the database to take all complex Bengali words into account. We will extend this interface to Android devices. We will also provide our own Bangla keyboard, which will reduce the clumsiness of typing Bangla.
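As an illustration of the sentence-creation rules of Section V, the following sketch shows how an auxiliary verb could be chosen and placed for Rules 1, 2, 4, 5, 6, 8 and 10. It is a sketch under assumptions rather than the authors' method: the inputs are taken to be English words already produced by the word-conversion rules, the `agreement` labels, the `BE_VERB` table and the function names are hypothetical, only forms of "be" are covered, and articles such as "a" or "the" are not inserted.

```python
from typing import Optional

# Assumed be-verb table keyed by (subject agreement, tense).
BE_VERB = {
    ("first_singular", "present"): "am",
    ("third_singular", "present"): "is",
    ("plural", "present"): "are",
    ("first_singular", "past"): "was",
    ("third_singular", "past"): "was",
    ("plural", "past"): "were",
}


def auxiliary(agreement: str, tense: Optional[str]) -> str:
    """Rule 1, with the Rule 2 default: a missing tense means present."""
    return BE_VERB[(agreement, tense or "present")]


def build_sentence(
    subject: str,
    agreement: str,
    verb: Optional[str] = None,
    tense: Optional[str] = None,
    question: Optional[str] = None,
    complement: str = "",
) -> str:
    """Order subject, auxiliary verb, verb and complement into a sentence."""
    # Rule 2: without a verb, the tense is treated as present.
    aux = auxiliary(agreement, tense if verb else None)
    if question and verb is None:
        # Rules 4 and 6: auxiliary verb directly after the questionable word.
        parts = [question, aux, subject]
    elif question:
        # Rule 5: the auxiliary verb replaces the questionable word.
        parts = [aux, subject, verb, complement]
    elif verb:
        # Rule 8: auxiliary verb before the verb.
        parts = [subject, aux, verb, complement]
    else:
        # Rule 10: auxiliary verb before the noun.
        parts = [subject, aux, complement]
    return " ".join(part for part in parts if part)
```

For instance, `build_sentence("his name", "third_singular", question="what")` produces "what is his name", the output shown for Rule 6.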


REFERENCES

Nandkishor Vasnik, Shriya Sahu, Devshri Roy, “Talash: A Semantic and Context Based Optimized Hindi Search Engine”, IJCSEIT, Vol. 2, No. 3, June 2012.
English Grammar and its Different Rules.
J.C. Nesfield, P.C. Wren, H. Martin, “Applied English Grammar and Composition”, 4th Edition.
Nafid Haque, M. Hammad Ali, Matin Saad Abdullah, Mumit Khan, “Infrastructure for Bangla Information Retrieval in the Context of ICT for Development”.
S. Dasgupta and M. Khan, “Morphological Parsing of Bangla Words using PC KIMMO”, International Conference on Computer and Information Technology (ICCIT), 2004.
P. Bhattacharyya, “Multilingual Information Processing Using Universal Networking Language”, Indo UK Workshop on Language Engineering for South Asian Languages (LESAL), Mumbai, India, 2001.
Sergey Brin, Lawrence Page, “The Anatomy of a Large-Scale Hypertextual Web Search Engine”, Computer Science Department, Stanford University, Stanford, CA 94305, USA.
Manjira Sinha, Sakshi Sharma, Tirthankar Dasgupta, Anupam Basu, “New Readability Measures for Bangla and Hindi Texts”, Proceedings of COLING 2012: Posters, pages 1141–1150, Mumbai, December 2012.

Fig.4: Conversion of Complete sentences by Rule