

Towards Extended Machine Translation Model for Next Generation World Wide Web

Gulisong Nasierding, Yang Xiang, and Honghua Dai

School of Information Technology, Deakin University, Melbourne Campus, Burwood 3125, Australia
{gn,yxi,hdai}

Abstract. In this paper, we propose a data translation model that is potentially a major web service for the next generation World Wide Web. The technique is somewhat analogous to traditional machine translation, but it goes well beyond past and present machine translation in scope and content. To illustrate the new concept of web services based data translation, we present a multilingual machine translation electronic dictionary system and its web services based model, which includes generic services and multilingual translation services. The proposed data translation model aims to deliver web services that are easier to use, more convenient, more efficient, more accurate, scalable, self-learning, and self-adapting.

1 Introduction

As data grow more complex, both computationally and descriptively, a technique that can greatly simplify the processing, storage, transmission and communication of data becomes essential and is in great demand. The World Wide Web is increasingly used for application-to-application communication. The programmatic interfaces made available are referred to as Web services. An increasing number of software systems extend their capability to encompass web services technology. Web services provide a platform for developing automatic application systems based on cross-organizational and heterogeneous software components. This characteristic of web services is intrinsically similar to Machine Translation (MT) and Data Translation technology. Machine translation refers to the use of computer systems to translate texts among various languages and to help human translators in their translation work. Machine translation has been researched for decades. However, today's MT systems still face challenges in providing reliable, high-accuracy and generic portal translations. Laurie [1] stated in her article that the majority of people worldwide are not satisfied by simply translating a web site, since customers need to communicate with each other through different language channels or exchange ideas through their preferred type of data. As a result, the web services based data translation proposed in this paper has become important and necessary.

H. Jin, Y. Pan, N. Xiao, and J. Sun (Eds.): GCC 2004, LNCS 3251, pp. 1017–1020, 2004. © Springer-Verlag Berlin Heidelberg 2004

2 Machine Translation Systems

2.1 Review of MT Systems

According to Andy Way [2], machine translation methods can be classified into two major branches: (1) rule-based MT and (2) data-driven MT [3]. In Arul's research work, machine-learning techniques are used to acquire translation knowledge from bilingual datasets, which improved the quality of translation. An Example-Based Machine Translation via the Web system is proposed by Nano [4]. Contemporary web-based MT systems transmit HTML pages in order to translate large amounts of data. For example, on UNIX systems, the 'wget' utility is used to pass the source web pages to the translation system. After translation is complete, the web server returns the translated web pages to the requester. At IBM, the LMT machine translation system has been implemented based on rules of word formation rather than relying on entries in the bilingual core lexicons. Our system belongs to the rule-based MT systems: it implicitly contains rule bases in its main translation model. An instance model of the MT system can be seen in previously presented work [6].

2.2 A Model of Web Services Based Multilingual Translation System

The system model is shown in Figure 1. It includes five layers: the Generic Multilingual Translation Web Services layer, the Specific Translation Web Services layer, the Multilingual Translation Engine layer, the Knowledge Base layer, and the Database and Rule Base layer. The main function of the multilingual web services model is to provide remote services in response to a client's request for multilingual translation. It currently focuses on translating words, terms or phrases with phonetic notation; translating sentences and paragraphs is left to future work. The generic multilingual translation web services layer is an open portal that bridges the different translation services. According to the requirements, a specific request is directed to the corresponding specific translation web services layer. The request is then passed to the translation engine. The key role of the multilingual translation engine is to translate between any two languages and to add notation to the translated items based on production rules.
Machine learning techniques are used for searching, extracting and retrieving the corresponding translated words and phrases. Translation rules differ between languages according to their characters or scripts and the way their words are formed. All relevant language translation rules are kept in the rule bases, and the lexical information for each language is stored in separate databases. To retrieve translated words efficiently, the lexical data can be indexed and associated retrieval techniques adopted.

Our multilingual translation web services model can be applied in different scenarios. For example, staff in different branches of a company may have to use the same management information system belonging to the same group. They need to manage the same source information but face different language interfaces. With our system, they do not need to purchase different language versions of the system, which could bring financial benefits to users who register for our web services.
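The request flow through the upper layers of the model can be illustrated with a minimal sketch, assuming a single in-memory lexicon per language pair. The class names (`LexicalEntry`, `TranslationEngine`, `GenericTranslationService`) and the sample entry are hypothetical and are not taken from the system described in this paper; the sketch only shows a generic portal routing a request to the engine for a language pair, which then looks up an indexed lexical entry and returns it with phonetic notation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class LexicalEntry:
    """One dictionary record: a translated term and its phonetic notation."""
    target: str
    notation: str


@dataclass
class TranslationEngine:
    """Hypothetical engine layer for one language pair."""
    source_lang: str
    target_lang: str
    # The dict serves as the index: lookup by normalized source term.
    lexicon: Dict[str, LexicalEntry] = field(default_factory=dict)

    def add(self, term: str, target: str, notation: str) -> None:
        self.lexicon[term.strip().lower()] = LexicalEntry(target, notation)

    def translate(self, term: str) -> Optional[LexicalEntry]:
        return self.lexicon.get(term.strip().lower())


class GenericTranslationService:
    """Hypothetical generic layer: routes each request to the engine
    registered for the requested language pair."""

    def __init__(self) -> None:
        self._engines: Dict[Tuple[str, str], TranslationEngine] = {}

    def register(self, engine: TranslationEngine) -> None:
        self._engines[(engine.source_lang, engine.target_lang)] = engine

    def request(self, term: str, source_lang: str, target_lang: str) -> str:
        engine = self._engines.get((source_lang, target_lang))
        if engine is None:
            return f"unsupported language pair: {source_lang}->{target_lang}"
        entry = engine.translate(term)
        if entry is None:
            return f"no entry for '{term}'"
        return f"{entry.target} [{entry.notation}]"


# Illustrative data only: one English-to-Chinese entry with pinyin notation.
en_zh = TranslationEngine("en", "zh")
en_zh.add("book", "书", "shū")

portal = GenericTranslationService()
portal.register(en_zh)
print(portal.request("Book", "en", "zh"))
```

In a deployed system each layer would sit behind its own web service interface and the lexicon would live in a database; the in-memory dictionary here simply stands in for the indexed lexical data.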

3 Data Translation (DT)

In existing usage, 'Data Translation' refers to an 'ISO 9001-certified supplier to the data acquisition, imaging and machine vision markets with expertise in the design of high-accuracy, high-quality and reliable analogue-to-digital products' [7]. We extend the concept of data translation to translating or transforming data between different types, that is, between different patterns of visualized information such as text, images or pictures (still or moving), and audio data (whose spectrum can be visualized).


Fig. 1. A model for multilingual translation web services


Fig. 2. Data translation Web services model

The data translation web services model is an extension of the multilingual translation web services model. The main methods and techniques we apply here are similar to those of the previous model, but the databases are expanded to contain image databases as well as the lexical database. The knowledge base is built from databases and rule bases. Descriptions and annotations of image shapes are stored in the databases. The rules for searching and retrieving specific images from the image database, and the matching rules between images and their text annotations or descriptions, are stored in the rule bases. The data translation engine functions as a reasoning machine based on production rules. Decision tree learning algorithms need to be applied in order to achieve accurate and efficient translation between different types of visual data. In further work, we could also treat object recognition as machine translation when translating from images or moving pictures to text, or vice versa.

Currently, web-based translation services are mainly for website translation and can only handle uncomplicated context. Our approach is to build an open architecture for data translation web services. It is designed not only to translate websites but also to handle software systems, file systems and even operating systems. This is a great advantage for speakers of minority languages, because translating a system into every minority language costs too much; with these web services, people can easily access information beyond their own language knowledge. All kinds of information, such as text, images, audio and video, can be shared with people all around the world through the data translation web services.
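The idea of a reasoning machine driven by production rules can be sketched as a small forward-chaining loop. This is an illustrative sketch under stated assumptions, not the engine described in this paper: the `Rule` type, the `forward_chain` function, the fact strings and the sample rules are all hypothetical, and the decision tree learning step mentioned above is not covered.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Rule:
    """Production rule: IF every condition is known THEN add the conclusion."""
    conditions: FrozenSet[str]
    conclusion: str


def forward_chain(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Fire rules repeatedly until no rule contributes a new fact."""
    memory = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.conditions <= memory and rule.conclusion not in memory:
                memory.add(rule.conclusion)
                changed = True
    return memory


# Hypothetical rule base: image-shape annotations are matched to an object,
# and the object is matched to text labels in two languages.
RULES = [
    Rule(frozenset({"shape:round", "color:red"}), "object:apple"),
    Rule(frozenset({"object:apple"}), "text:en:apple"),
    Rule(frozenset({"object:apple"}), "text:fr:pomme"),
]

derived = forward_chain({"shape:round", "color:red"}, RULES)
texts = sorted(fact for fact in derived if fact.startswith("text:"))
print(texts)  # ['text:en:apple', 'text:fr:pomme']
```

Keeping the rules as data separate from the inference loop mirrors the separation of rule bases and translation engine in the model: new matching rules can be added without changing the engine code.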

4 Conclusions and Future Work

In this paper, we proposed a data translation web service model that is extended from machine translation. As an instance of data translation, a multilingual translation web services model is presented. It is designed to achieve high accuracy, scalability and intelligence. To support remote communication and the retrieval and extraction of information in different patterns and in different languages, we apply data translation to our web services system. Data translation for the new generation World Wide Web, with knowledge communicated in different data formats, can be a direction of our future work.

References

1. Laurie, G.: Supporting a Multilingual Online Audience, Machine Translation: From Research to Real Users, Proc. of AMTA (2002)
2. Way, A.: Web-Based Machine Translation,, 2003
3. Arul, M.: Better Contextual Translation Using Machine Learning, Machine Translation: From Research to Real Users, Proc. of AMTA (2002)
4. Nano, G.: Example-Based Machine Translation via the Web, Machine Translation: From Research to Real Users, Proc. of AMTA (2002)
5. Nasierding, G.: Application of Artificial Intelligence in Uighur-Chinese Bilingual Translation Electronic Dictionary, Proc. of the First International Conference on Success and Failures of Knowledge-Based Systems in Real-World Applications (1996)
6. Nasierding, G., Tu, X.: Research on Uighur, Chinese, English, Japanese and Russian multilingual two-way translation Electronic Dictionary, Proc. of The Second Chinese World Congress on Intelligent Control and Intelligent Automation (1997)
7., (2003)
8. W3C, Extensible Markup Language (XML), (2003)
9. Sergei, N.: Machine Translation: Theoretical and methodological issues (1987)
