Semi Automatic Ontological Knowledge Base Construction From Learning Materials in eLearning Management System

Ahmad Mukhlason
Computer And Information Sciences Department
Universiti Teknologi PETRONAS

Ahmad Kamil Mahmood
Computer And Information Sciences Department
Universiti Teknologi PETRONAS

Noreen Izza Arshad
Computer And Information Sciences Department
Universiti Teknologi PETRONAS

Abstract

Manual ontology construction has proven to be a very difficult task and has become a bottleneck in knowledge acquisition for ontology engineers. This article presents a framework for semi-automatic ontological knowledge base construction and evaluates several aspects of a semi-automatic ontology generation tool in an eLearning Management System setting. The framework combines an ontology learning tool, Text2Onto, which constructs an ontology from scratch, with an ontology mapping tool, PROMPT, which maps the new ontology to the existing ontology. In an experiment on generating an ontology from a text corpus, human experts accepted 86.67% of the extracted knowledge concepts.

1. Introduction

Ontology, as a formal specification of a shared conceptualization, is the backbone of semantic web technology [1] and of related areas such as knowledge management. The success of semantic web technology depends heavily on the successful development of formal ontologies that structure data for comprehensive and transportable machine understanding [2]. Several ontology representation languages exist, e.g., SHOE [3], DAML+OIL [4], RDF [5]/RDFS [6], and OWL [7]. There are also ontology development frameworks and tools such as Protégé [8], pOWL [9], OntoEdit [10], WebODE [11], and others. However, manual ontology construction has proven to be a hard and tedious task and can create a bottleneck in the ontology acquisition process for ontology engineers.

To address this problem, we have developed a framework for semi-automatic ontological knowledge base construction in an eLearning Management System setting. The ontological knowledge base is constructed from learning materials uploaded by the user to the eLearning Management System. At the end of this paper we also evaluate the performance of the ontology learning tool in semi-automatic ontology generation by comparing it with the performance of a human expert.

978-1-4244-2328-6/08/$25.00 © 2008 IEEE

2. Related Work

(Semi-)automatic ontology construction, usually called ontology learning, is a cross-disciplinary research area that draws on NLP, machine learning, data mining, and other fields. Alexander Maedche and Steffen Staab [2] distinguish ontology learning approaches by the type of input: ontology learning from text, from dictionaries, from knowledge bases, from semi-structured schemata, and from relational schemata. The approach discussed in this article is ontology learning from text.

Previous work has investigated (semi-)automatic ontology construction frameworks and tools based on different methods and techniques, such as Text2Onto [13], OntoExtractor [23], OntoLancs [24], frame semantics [25], Granular [26], and Event-Based Knowledge Acquisition [27]. There are also numerous ontology mapping tools and frameworks for combining distributed and heterogeneous ontologies, and several studies have evaluated and compared these tools. Noy and Musen [14] evaluated PROMPT [15][32], ONION [16], Chimaera [17], FCA-Merge [18][33], GLUE [19][37], and OBSERVER [20]. Kaza et al. [29] evaluated PROMPT [15][32], Chimaera [17], and LOM [30]. Choi et al. [31] evaluated MOMIS [34], LSD [35], CTXMATCH [36], GLUE [19][37], MAFRA [38], LOM [30], ONION [16], PROMPT [15][32], and FCA-Merge [18][33].

Most evaluations of ontology learning tools and ontology mapping tools concluded that Text2Onto and PROMPT are the most powerful ontology learning tool and the most powerful ontology mapping tool, respectively. Based on these evaluations, this article presents a framework that uses Text2Onto and PROMPT to combine ontology learning and ontology mapping, which previous work usually treats as separate frameworks. The combination is intended to significantly improve (semi-)automatic ontology generation.

3. Semi-Automatic Ontological Knowledge Base Construction Framework

Our semi-automatic ontological knowledge base construction framework is depicted in Figure 1. Semi-automatic ontology generation starts when a user (lecturer) uploads new learning material to the eLearning Management System (in this case Moodle [12], a free, open source eLearning/Course Management System). The uploaded learning materials are the input for the ontology learning tool, which constructs the ontology using an appropriately selected algorithm. The primary obstacle, however, is interoperation between the new ontology and the existing ontology. To be consistent with the existing ontology, the new ontology needs to be mapped to it, i.e., semantically related to it at the conceptual level, based on semantic relations. The ontology mapping task outlined in the framework is performed by the ontology mapping tool using several mapping algorithms, which are described in Section 5. At the final stage, the new ontology is merged with (updates) the existing ontology.

Figure 1 - Semi Automatic-Ontology Construction Framework
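The three stages of the framework (learning, mapping, merging) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Ontology` class, the function names, and the toy rules inside them are hypothetical stand-ins, not the interfaces of Moodle, Text2Onto, or PROMPT.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Ontology:
    """A toy ontology: just a set of concept names (illustrative only)."""
    concepts: set[str] = field(default_factory=set)


def learn_ontology(text: str) -> Ontology:
    """Stand-in for the ontology learning tool.

    Hypothetical rule: every word longer than three letters is a concept.
    """
    words = (w.strip(".,;:()").lower() for w in text.split())
    return Ontology({w for w in words if len(w) > 3})


def map_ontologies(new: Ontology, existing: Ontology) -> set[str]:
    """Stand-in for the ontology mapping tool.

    Returns the concepts that both ontologies share, matched by name.
    """
    return new.concepts & existing.concepts


def merge(new: Ontology, existing: Ontology) -> Ontology:
    """Final stage: update the existing ontology with the new one."""
    return Ontology(existing.concepts | new.concepts)


def on_upload(text: str, knowledge_base: Ontology) -> tuple[Ontology, set[str]]:
    """Run the three stages for one uploaded learning material."""
    new = learn_ontology(text)
    overlap = map_ontologies(new, knowledge_base)
    return merge(new, knowledge_base), overlap


kb = Ontology({"variable", "loop"})
kb, overlap = on_upload("A loop repeats a statement.", kb)
```

In this sketch the mapping stage only reports overlapping concept names. In the framework itself that stage is semi-automatic and involves the user.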

4. Ontology Generation from Learning Materials (LMs)

The purpose of ontology generation from Learning Materials (LMs) is to realize our vision in [39] of a semantic web-enhanced e-learning management system. Because ontology is the backbone of the semantic web, it is very important to generate ontologies (semi-)automatically in an e-learning management system environment. In our framework, automatic ontology generation is performed by the ontology learning tool, which builds an ontology from scratch, or enriches or adapts an existing ontology, in a semi-automatic fashion from several sources. We adopted the Text2Onto [13] architecture depicted in Figure 2.

Figure 2 - Architecture of Text2Onto

As described in [13], the architecture of Text2Onto is centered around the Probabilistic Ontology Model (POM), which stores the results of the different ontology learning algorithms. The algorithms are initialized by a controller, the purpose of which is (i) to trigger the linguistic preprocessing of the data, (ii) to execute the ontology learning algorithms in the appropriate order, and (iii) to apply the algorithms' change requests to the POM. The fact that none of the algorithms is permitted to manipulate the POM directly guarantees maximum transparency and allows for the flexible composition of arbitrarily complex algorithms.

The execution of each algorithm consists of three phases. First, in the notification phase, the algorithm learns about recent changes to the corpus. Second, in the computation phase, these changes are mapped to changes with respect to the reference repository, which stores all kinds of knowledge about the relationship between the ontology and the data (e.g., pointers to all occurrences of a concept). Finally, in the result generation phase, requests for POM changes are generated from the updated content of the reference repository.

The algorithms provided by the Text2Onto framework can be classified according to two aspects: task, i.e., the kind of modeling primitive they produce, and type, i.e., the method employed to extract instances of the respective primitive from the text. For each kind of modeling primitive, several algorithms of different types can be applied and their requests for POM changes combined in order to obtain a more reliable probability for each instantiated primitive. Various pre-defined strategies specify how the individual probabilities are combined.

Figure 3 - Text2Onto User Interface

The Graphical User Interface (GUI) of Text2Onto is depicted in Figure 3. The GUI has three main parts. The first part is the set of ontology generation algorithms, which can be selected individually or combined with one another, as presented in Table 1.
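The controller-centered design described above, in which algorithms only emit change requests and the controller alone applies them to the POM, can be sketched as follows. This is a minimal illustration under stated assumptions, not Text2Onto's actual code: all class and method names are hypothetical, and the frequency-based algorithm is a toy replacement for the real extraction algorithms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRequest:
    """A request to add or update one primitive in the POM."""
    kind: str           # e.g. "concept"
    label: str          # e.g. "variable"
    probability: float


class ProbabilisticOntologyModel:
    """Toy POM: maps (kind, label) to a probability."""

    def __init__(self):
        self._entries = {}

    def apply(self, request):
        self._entries[(request.kind, request.label)] = request.probability

    def probability(self, kind, label):
        return self._entries.get((kind, label), 0.0)


class FrequencyConceptAlgorithm:
    """Hypothetical algorithm: relative word frequency as probability."""

    def __init__(self):
        self._counts = {}  # plays the role of the reference repository

    def notify(self, new_text):
        # Notification phase: learn about recent changes to the corpus.
        return new_text.lower().split()

    def compute(self, tokens):
        # Computation phase: update the reference repository.
        for token in tokens:
            self._counts[token] = self._counts.get(token, 0) + 1

    def generate_requests(self):
        # Result generation phase: turn repository content into requests.
        top = max(self._counts.values())
        return [ChangeRequest("concept", word, count / top)
                for word, count in sorted(self._counts.items())]


class Controller:
    """Runs the algorithms in order; only the controller touches the POM."""

    def __init__(self, pom, algorithms):
        self.pom = pom
        self.algorithms = algorithms

    def corpus_changed(self, new_text):
        for algorithm in self.algorithms:
            algorithm.compute(algorithm.notify(new_text))
            for request in algorithm.generate_requests():
                self.pom.apply(request)


pom = ProbabilisticOntologyModel()
controller = Controller(pom, [FrequencyConceptAlgorithm()])
controller.corpus_changed("loop variable loop statement")
```

Because the algorithm never holds a reference to the POM, swapping in or chaining further algorithms only requires changing the list passed to the controller.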

Table 1 Algorithms implemented in Text2Onto

Ontology Notion   Algorithms
Concept           1. EntropyConceptExtraction
                  2. ExampleConceptExtraction
                  3. RTFConceptExtraction
                  4. TFIDFConceptExtraction
Instance          1. ExampleInstanceExtraction
                  2. TFIDFInstanceExtraction
Similarity        1. ContextExtraction
SubclassOf        1. PatternConceptExtraction
                  2. SpanishVerticalRelationsConceptClassification
                  3. SpanishWordnetRelationsConceptClassification
                  4. VerticalRelationsConceptClassification
                  5. WordnetRelationsConceptClassification
InstanceOf        1. ContextInstanceClassification
                  2. GoogleInstanceClassification
                  3. GoogleInstanceClassification2
                  4. PatternInstanceClassification
Relation          1. SubcatRelationExtraction
SubtopicOf        1. SubtopicOfRelationConversion
                  2. SubtopicOfRelationConversion

Moreover, each ontology generation algorithm can be configured using the configuration algorithms for context extraction presented in Table 2.

Table 2 Configuration Algorithm

No.   Algorithm
1     ContextExtractionWithFrame
2     ContextExtractionWithFrameV2
3     ContextExtractionWithoutStopWords
4     ContextFeaturesExtraction
5     ExampleContextExtraction

If more than one ontology generation algorithm is selected, the algorithms can be combined using one of the combiner algorithms presented in Table 3.
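As an illustration of what combining algorithms means, the sketch below merges the probabilities that two algorithms assign to the same concepts using an average, maximum, or minimum strategy. It is a sketch under assumptions: the function and dictionary names are hypothetical, treating a missing concept as probability 0.0 is a choice made for this example, and the behavior of the Default Combiner is not modeled because it is not described here.

```python
from statistics import mean

# Hypothetical stand-ins for the Average, Maximum, and Minimum combiners.
COMBINERS = {
    "average": mean,
    "maximum": max,
    "minimum": min,
}


def combine(results, strategy):
    """Merge per-algorithm probabilities into one value per concept.

    results: one dict per algorithm, mapping concept -> probability.
    A concept missing from an algorithm's output counts as 0.0
    (an assumption of this sketch).
    """
    combiner = COMBINERS[strategy]
    concepts = sorted(set().union(*results))
    return {c: combiner([r.get(c, 0.0) for r in results]) for c in concepts}


# Made-up outputs of two concept extraction algorithms.
tfidf = {"loop": 0.8, "variable": 0.4}
entropy = {"loop": 0.6, "statement": 0.2}

averaged = combine([tfidf, entropy], "average")
strictest = combine([tfidf, entropy], "minimum")
```

With the minimum strategy a concept keeps a non-zero probability only if every selected algorithm proposed it, while the maximum strategy keeps any concept that at least one algorithm rated highly.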

Table 3 Algorithm Combiner

No.   Combiner
1     Average Combiner
2     Default Combiner
3     Maximum Combiner
4     Minimum Combiner

The final output of ontology learning is a new ontology, represented in OWL (Web Ontology Language), constructed from the input Learning Materials (this input is usually called the corpus). This new ontology then needs to be integrated, merged, and aligned with the existing ontology. These processes are called ontology mapping and are described in the following section.

5. Ontology Mapping

Ontology mapping is the process in which two ontologies are semantically related at the conceptual level and the source ontology instances are transformed into target ontology entities according to those semantic relations. After ontology mapping, ontology integration, merging, and alignment also need to be carried out; together these processes are regarded as ontology reuse. In our case, ontology mapping establishes correspondences between the existing ontology in the knowledge base and the new ontology generated from the learning material, and determines the sets of overlapping concepts, synonyms, and concepts unique to each source. This mapping identifies similarities and conflicts between the various source (local) ontologies to be merged or aligned [31].

For ontology mapping we adopt PROMPT, developed at Stanford University. PROMPT is a semi-automatic ontology merging and alignment tool. It starts with linguistic-similarity matches for the initial comparison, then generates a list of suggestions for the user based on linguistic and structural knowledge and points the user to the possible effects of these changes. The core of the PROMPT approach is illustrated in Figure 4 [40]. The gray boxes indicate actions performed by PROMPT, and the white box indicates the action performed by the user.


Figure 4 - PROMPT Algorithm

In detail, the PROMPT algorithm works as follows. PROMPT takes two ontologies as input and guides the user in the creation of one merged ontology as output. First, PROMPT creates an initial list of matches based on class names. Then the following cycle is repeated: (1) the user triggers an operation, either by selecting one of PROMPT's suggestions from the list or by using an ontology-editing environment to specify the desired operation directly; and (2) PROMPT performs the operation, automatically executes additional changes based on the type of the operation, generates a list of suggestions for the user based on the structure of the ontology around the arguments to the last operation, determines conflicts that the last operation introduced in the ontology, and finds possible solutions for those conflicts.
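The suggest-and-decide cycle can be sketched in a few lines of Python. This is a simplified illustration, not PROMPT's implementation: the function names are hypothetical, the initial matches use exact comparison of normalized class names as a stand-in for linguistic similarity, and the structure-based suggestions and conflict detection of step (2) are not modeled.

```python
def normalize(name):
    """Lower-case a class name and drop separators before comparing."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def initial_matches(classes_a, classes_b):
    """Suggest merging classes whose normalized names are identical."""
    index = {normalize(b): b for b in classes_b}
    return [(a, index[normalize(a)])
            for a in classes_a if normalize(a) in index]


def merge_with_user(classes_a, classes_b, accept):
    """Run the suggest/decide cycle and return the merged class list.

    accept: callback standing in for the user; it returns True to merge
    a suggested pair into one class and False to keep both classes.
    """
    merged = list(classes_a)
    suggestions = initial_matches(classes_a, classes_b)
    matched_b = set()
    while suggestions:
        a, b = suggestions.pop(0)   # (1) the user picks a suggestion
        if accept(a, b):            # (2) the tool performs the merge
            matched_b.add(b)
    merged += [b for b in classes_b if b not in matched_b]
    return merged


existing = ["Variable", "Loop", "DataType"]
generated = ["loop", "data_type", "Compiler"]
result = merge_with_user(existing, generated, accept=lambda a, b: True)
```

Passing a different `accept` callback models a user who rejects some suggestions, in which case both classes of a rejected pair stay in the merged list.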

6. Experiment, Result and Discussion

The experiment was conducted in the Introduction to Problem Solving and Programming (IPSP) course to generate an IPSP ontology from two Learning Materials (LMs), which are available online at [21] and [22]. The first LM (LM1) presents an introduction to computer programming and explores fundamental programming concepts; it consists of 285 words and 1697 characters. The second LM (LM2) presents an introduction to C programming and consists of 275 words and 1621 characters. Both LMs are in text format.

The results of ontology learning using the extractor algorithms depicted in Figure 5 and all configuration algorithms in Table 2 are presented in Tables 5.1-5.3 and shown graphically in Figure 6.

Figure 5 - Algorithm for Ontology Learning Experiment

The values in Table 5.1 are the numbers of each ontology notion extracted by the ontology learning tool. T, F, and D stand for TRUE, FALSE, and DON'T KNOW, respectively, meaning that from the perspective of the domain (human) expert the extracted ontology notion is correct, is incorrect, or cannot be judged either way. In this experiment the T, F, and D judgments were made by the domain experts, who are the authors themselves. For example, according to Table 5.1, 60 concepts were extracted from the first learning material (LM1), of which 54 are T (correct), 4 are F (wrong), and 2 are D (don't know). Table 5.2 expresses the values from Table 5.1 as percentages, which we call the acceptance notion rate (ANR), computed using (1).

$$\mathrm{ANR} = \left(\frac{T}{T + F + D}\right) \times 100 \qquad (1)$$

Table 5.1 Experiment Ontology Learning Result

               LM1*               LM2
Onto Notion    T     F     D      T     F     D
Concept        54    4     2      40    7     1
Instance       0     2     0      2     5     0
Similarity     0     0     0      0     0     0
SubClassOf     11    94    3      3     55    3
InstanceOf     0     0     0      0     1     0
Relation       1     6     0      1     1     0
SubTopicOf     11    300   50     1     355   3

* Learning Material
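As a worked example of (1), the short Python sketch below computes the ANR from T, F, and D counts. The function name `anr` and the handling of an empty row (returning 0.0 when nothing was extracted) are assumptions made for this illustration; the counts are the Concept rows of Table 5.1.

```python
def anr(t, f, d):
    """Acceptance notion rate: percentage of extracted notions judged T."""
    total = t + f + d
    if total == 0:
        return 0.0  # nothing was extracted (assumption of this sketch)
    return t / total * 100


# Concept counts (T, F, D) as listed in Table 5.1.
lm1_concept = anr(54, 4, 2)
lm2_concept = anr(40, 7, 1)

# Similarity row: no notion was extracted from either LM.
similarity = anr(0, 0, 0)
```

The same function applies unchanged to the F and D shares by passing the corresponding count as the first argument together with the other two.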

These results, depicted in Figure 6, show that the human experts (the authors) accepted 86.67% of the concepts extracted by the ontology learning tool. Acceptance was much lower, 32% or less, for instances and for relations between concepts (similarity, subclassof, instanceof, relation, and subtopicof).

Table 5.2 Experiment Ontology Learning Result in Percentage

               LM1                     LM2
Onto Notion    T(%)   F(%)   D(%)     T(%)    F(%)    D(%)
Concept        90     28     15       83.34   14.58   2.08
Instance       0      100    0        28.57   71.43   0
Similarity     0      0      0        0       0       0
SubClassOf     10     87     3        5.08    93.24   5.08
InstanceOf     0      0      0        0       100     0
Relation       14     86     0        50      50      0.20
SubTopicOf     3      84     13       0.28    98.89   0.83

The average ANR over learning material 1 and learning material 2 is presented in Table 5.3.

Table 5.3 Experiment Ontology Learning Result (Average Percentage)

               LM
Onto Notion    T(%)     F(%)     D(%)
Concept        86.67    21.29    8.54
Instance       14.285   85.715   0
Similarity     0        0        0
SubClassOf     7.54     90.12    4.04
InstanceOf     0        50       0
Relation       32       68       0.1
SubTopicOf     1.64     91.445   6.915

Figure 6 - Graphics of Experimental Ontology Learning Result
(Bar chart of T, F, and D for each ontology notion; vertical axis: ANR, 0% to 100%; horizontal axis: Onto Notion.)

Figure 6 is a graphical representation of the experimental ontology learning results presented in Table 5.3. Each vertical bar represents the percentage (ANR value) of an ontology notion extracted from the Learning Materials.

7. Evaluation

Based on the experiment in Section 6, our results in (semi-)automatic ontology generation from two LMs show that a human expert agreed with a very large fraction (86.67% on average) of the concepts produced by ontology learning in our framework. The tool extracted a large number of concepts, thus saving the expert time and effort. However, ontology learning in our framework still showed low performance in extracting instances and relations between concepts (similarity, subclassof, instanceof, relation, and subtopicof). Improving this requires future work that combines research areas such as linguistics, NLP, and text mining.

8. Conclusion

In this article we have presented a semi-automatic ontological knowledge base construction framework. The experiment was conducted in an e-learning management system environment to generate an ontology (semi-)automatically from learning materials. The results show that the ontology learning tool (Text2Onto) performed well compared with a human expert in extracting knowledge concepts, thus saving the expert time and effort.


9. References

[1] He Hu and Da-You Liu, "Learning OWL Ontologies from Free Text", Proceedings of 2004 International Conference on Machine Learning and Cybernetics, IEEE, vol. 2, 2004, pp. 1233-1237.
[2] Maedche, A. and Staab, S., "Ontology Learning for the Semantic Web", IEEE Intelligent Systems, vol. 16, no. 2, Mar-Apr 2001, pp. 72-79.
[3] SHOE, http://www.cs.umd.edu/projects/plus/SHOE/
[4] DAML+OIL, http://www.w3.org/TR/daml+oil-reference
[5] RDF, http://www.w3.org/RDF/
[6] RDFS, http://www.w3.org/TR/rdf-schema/
[7] OWL, http://www.w3.org/TR/owl-features/
[8] Protégé Ontology Editor and Knowledge-based Framework, http://protege.stanford.edu/.
[9] Auer, S., "pOWL – A Web Based Platform for Collaborative Semantic Web Development", http://powl.sourceforge.net/overview.php.
[10] OntoEdit, an Ontology Engineering Environment, http://www.ontoknowledge.org/tools/ontoedit.shtml.
[11] Corcho, O. et al., "WebODE: An Integrated Workbench for Ontology Representation, Reasoning, and Exchange", book chapter in Knowledge Engineering and Knowledge Management: Ontologies and the Semantic Web, Springer, vol. 2473/2002, 2002, pp. 295-310.
[12] Moodle, http://moodle.org/
[13] Cimiano, P. and Voelker, J., "Text2Onto - A Framework for Ontology Learning and Data-driven Change Discovery", Proceedings of the 10th International Conference on Applications of Natural Language to Information Systems (NLDB'2005), June 2005.
[14] Natalya F. Noy and Mark A. Musen, "Evaluating Ontology-Mapping Tools: Requirements and Experience", 13th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2002), Siguenza, Spain, 30 September 2002.
[15] PROMPT, http://protege.stanford.edu/plugins/prompt/prompt.html
[16] ONION, http://ontolog.cim3.net/cgi-bin/wiki.pl?ONION
[17] Chimaera, http://www-ksl.stanford.edu/software/chimaera/
[18] FCA-Merge, http://www.aifb.uni-karlsruhe.de/WBS/gst/presentations/2001-05-04-DBFusion.pdf
[19] GLUE, http://www.cs.washington.edu/homes/pedrod/papers/hois.pdf
[20] OBSERVER, http://sid.cps.unizar.es/OBSERVER/
[21] Corpus: Introduction to Programming, http://www.geocities.com/ahmad.mukhlason/research_corpus_01.pdf
[22] Corpus: Introduction to Programming, http://www.geocities.com/ahmad.mukhlason/research_corpus_02.pdf
[23] Nie, X., Zhou, J., A Domain Adaptive Ontology Learning Framework, IEEE International Conference on Networking, Sensing and Control, 2008 (ICNSC 2008), pp. 1726 – 1729.
[24] Gacitua, R., Sawyer, P., Ensemble Methods for Ontology Learning - An Empirical Experiment to Evaluate Combinations of Concept Acquisition Techniques, Seventh IEEE/ACIS International Conference on Computer and Information Science, 2008 (ICIS 08), pp. 328 – 333.
[25] Chen, E., Wu, G., An Ontology Learning Method Enhanced by Frame Semantics, Seventh IEEE International Symposium on Multimedia, 2005, pp. 8.
[26] Qiu, T., Chen, X., Liu, Q., Huang, H., A Granular Space Model for Ontology Learning, IEEE International Conference on Granular Computing, Fremont, CA, USA, 2007, pp. 61 – 61.
[27] Zhou, W., Liu, Z., Liu, Y., Zhao, Y., Event-Based Knowledge Acquisition for Ontology Learning, 6th IEEE International Conference on Cognitive Informatics, Lake Tahoe, CA, 2007, pp. 498 - 501.
[28] Mitra, P., Noy, N. F., Jaiswal, A. R., OMEN: A Probabilistic Ontology Mapping Tool, The Semantic Web – ISWC 2005, Berlin: Springer, 2005, pp. 537-547.
[29] Kaza, S., Chen, H., Evaluating Ontology Mapping Techniques: An Experiment in Public Safety Information Sharing, Elsevier Journal of Decision Support Systems, 2008.
[30] J. Li, LOM: A Lexicon-based Ontology Mapping Tool, presented at Information Interpretation and Integration Conference (I3CON), 2004.
[31] Choi, N., Song, I., and Han, H., A Survey on Ontology Mapping, SIGMOD Rec. 35, 3 (Sep. 2006).
[32] N. Noy and M. Musen, "PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment", Proceedings of the National Conference on Artificial Intelligence (AAAI), 2000.
[33] Gerd Stumme, Alexander Maedche, "FCA-Merge: Bottom-Up Merging of Ontologies", Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI'01), Seattle, USA, 2001.
[34] Domenico Beneventano, Sonia Bergamaschi, Francesco Guerra, Maurizio, "Synthesizing an Integrated Ontology", IEEE Internet Computing, September-October 2003.
[35] AnHai Doan, Pedro Domingos, Alon Halevy, "Learning to Match the Schemas of Data Sources: A Multistrategy Approach", Machine Learning, 50 (3): 279-301, March 2003.
[36] Paolo Bouquet, Luciano Serafini, Stefano Zanobini, "Semantic Coordination: A New Approach and an Application", ISWC 2003, LNCS 2870, pp. 130-145, 2003.
[37] AnHai Doan, Jayant Madhavan, Pedro Domingos, Alon Halevy, "Learning to Map between Ontologies on the Semantic Web", VLDB Journal, Special Issue on the Semantic Web, 2003.
[38] Nuno Silva, Joao Rocha, "MAFRA – An Ontology Mapping FRAmework for the Semantic Web", Proceedings of the 6th International Conference on Business Information Systems, UCCS, Colorado Springs, CO, May 2003.
[39] Mukhlason, A., Mahmood, A.K., and Arshad, N.I., "Enhanced Moodle CMS using Semantic web Technology: Developing an Architecture of Ontology-driven Automatic Question Answering System", Proceeding: MMU International Symposium on Information and Communication Technologies, 2007.
[40] N.F. Noy and M. Musen, PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment, Proceedings of the 17th National Conference on Artificial Intelligence (AAAI'00), Austin, TX, USA, July 2000.