MCCSIS Proceedings of the IADIS International Conferences

Web Based Communities and Social Media 2011 Collaborative Technologies 2011 Internet Applications and Research 2011

EDITED BY Piet Kommers, Nik Bessis and Pedro Isaías


PROCEEDINGS OF THE IADIS INTERNATIONAL CONFERENCES WEB BASED COMMUNITIES AND SOCIAL MEDIA 2011 COLLABORATIVE TECHNOLOGIES 2011 INTERNET APPLICATIONS AND RESEARCH 2011

Part of the IADIS MULTI CONFERENCE ON COMPUTER SCIENCE AND INFORMATION SYSTEMS 2011

Rome, Italy
JULY 22 - 24, 2011
Organised by IADIS
International Association for Development of the Information Society

Copyright 2011 IADIS Press All rights reserved This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Permission for use must always be obtained from IADIS Press. Please contact [email protected]

Volume Editors: Piet Kommers, Nik Bessis and Pedro Isaías
Computer Science and Information Systems Series Editors: Piet Kommers and Pedro Isaías

Associate Editor: Luís Rodrigues

ISBN: 978-972-8939-40-3


TABLE OF CONTENTS

FOREWORD ix

PROGRAM COMMITTEE xi

KEYNOTE LECTURES xix

FULL PAPERS

SOA BASED APPROACH FOR INTERCONNECTING WORKFLOWS ACCORDING TO THE SUBCONTRACTING ARCHITECTURE

3

Saida Boukhedouma, Zaia Alimazighi, Mourad Oussalah and Dalila Tamzalit

ORGANIZATIONAL KNOWLEDGE MAPPING BASED ON LIBRARY INFORMATION SYSTEM (IRANDOC CASE STUDY)

13

Ammar Jalalimanesh and Elaheh Homayounvala

WIKI VS. EMAIL - UNDERSTANDING COLLABORATION WITHIN VIRTUAL COMMUNITIES

20

Osama Mansour

A SYSTEM CONCEPT TO SUPPORT ASYNCHRONOUS AHP-BASED GROUP DECISION MAKING

29

Heiko Thimm

DIGITAL LIBRARIES AND SOCIAL WEB: INSIGHTS FROM WIKIPEDIA USERS’ ACTIVITIES

39

Asta Zelenkauskaite and Paolo Massa

TOWARDS A REFERENCE ARCHITECTURE MODEL FOR VIRTUAL COMMUNITIES

48

Richard Braun and Werner Esswein

MEDIA AND MULTICULTURALISM

57

Kishwar Sultana

DECENTRALISING ATTACHMENT: DYNAMIC STRUCTURE ANALYSIS IN TWITTER AS A FLOW-TYPE INFORMATION MEDIUM

65

Kiyonobu Kojima and Hideyuki Tokuda

THE ROLE OF ONLINE SOCIAL MEDIA APPLICATIONS IN INITIAL TRUST FORMATION TOWARDS UNKNOWN E-RETAILERS

73

Farhod Karimov and Malaika Brengman

VANDALISM AND CONFLICT RESOLUTION IN WIKIPEDIA. AN EMPIRICAL ANALYSIS ON HOW A LARGE-SCALE WEB-BASED COMMUNITY DEALS WITH BREACHES OF THE ONLINE PEACE

81

Thomas Roessing

ANALYZING ONLINE DISCUSSION FORUMS – WHAT DO PEOPLE SHARE?

87

Alton Y.K Chua, Radhika Shenoy Balkunje and Dion Hoe-Lian Goh

LINGUISTIC CONTROL OF SELF-PRESENTATION DATA IN A CORPUS OF FORUM POSTS BY LEARNERS OF ITALIAN AS L2

94

Mirko Tavosanis and Francesco Possemato

IDENTIFYING ON-LINE GROUPS BASED ON CONTENT AND COLLECTIVE BEHAVIORAL PATTERNS

101

Dave Engel, Michelle Gregory, Eric Bell and Liam McGrath

LEARNING COMMUNITY USING SOCIAL NETWORK SERVICE

109

Maomi Ueno and Masaki Uto

RICH INTERNET APPLICATION FOR SIMPLIER TIMETABLING

120

Florent Devin, Yannick Le Nir and Peio Loubière

INSIGHTS IN USAGE OF MULTIMEDIA STREAMING SERVICES

127

Amela Karahasanović, Marika Lüders, Elena Terradillos, María Alejandro, Juan Rodríguez, José Manuel Núñez and David Flórez Rodríguez

PUBLICATION OF WEB SERVICES WSDL FILE INTO UDDI REGISTRY OPTIMIZATION

135

Milen Petrov, Lyudmil Latinov and Adelina Aleksieva-Petrova

AUTOMATIC WEB TABLE TRANSCODING FOR MOBILE DEVICES BASED ON TABLE CLASSIFICATION

143

Chichang Jou

FLEXIBLY MANAGED USER INTERFACES FOR MOBILE APPLICATIONS

151

Pekka Sillberg, Janne Raitaniemi, Petri Rantanen, Jari Soini and Jari Leppäniemi

SHORT PAPERS

LITERATURE INTEGRATING AND ANALYZING OF INTERNET LITERACY: THE CASE OF 2000-2010 THESES IN TAIWAN

163

Po-Yi Li, Shinn-Rong Lin and Eric Zhi-Feng Liu

BEHAVIORAL ANALYSIS OF SNS USERS WITH REGARD TO DIET

167

Masashi Sugano and Chie Yamazaki

THE USE OF FACEBOOK BY AMERICAN AND GERMAN TEENAGERS

171

Annamarie Krcmar, Carol Krcmar and Helmut Krcmar

SNAP: THE SOCIAL NETWORK ADAPTIVE PORTAL

177

Alexiei Dingli, Mark Scerri, Brendan Cutajar, Kristian Galea, Saviour Agius, Mark Anthony Cachia, Justin Saliba, Jeffrey Cassar, Erica Tanti, Sarah Cassar, Shirley Cini and Mariya Koleva

ONLINE WOM USAGE MODEL FOR TOURISM

182

Masato Nakajima, Kosuke C. Yamada and Muneo Kitajima

ISOLATING CONTENT AND METADATA FROM WEBLOGS USING CLASSIFICATION AND RULE-BASED APPROACHES

187

Eric J. Marshall and Eric B. Bell

SENTIMENT ANALYSIS IN SOCIAL WEB ENVIRONMENTS ORIENTED TO E-COMMERCE

192

Luigi Lancieri and Eric Lepretre

TOWARD INTEGRATING SOCIAL NETWORKING SERVICE AND JAPANESE MANGA IN STRATEGIC CONSUMER GENERATED DESIGN

197

Anak Agung Gede Dharma, Hiroyuki Kumamoto, Shogo Kochi, Natsuki Kudo, Wei Guowei and Kiyoshi Tomimatsu

INNOVATION MANAGEMENT IN ENTERPRISES: COLLABORATIVE TREND ANALYSIS USING WEB 2.0 TECHNOLOGIES

203

Iris Kaiser and Michael Durst

METADATA FOR A REUSABLE BUSINESS VOCABULARY ELEMENT

209

N. Ghatasheh, D. Storelli and A. Corallo

EXPLORING THE RELATIONSHIP BETWEEN IMPRESSION MANAGEMENT AND INTERPERSONAL ATTRACTION IN SOCIAL NETWORKING SITE

213

Hueiju Yu, Pei-Shan Wei, Hsi-Peng Lu and Jen-Chuen Tzou

EVALUATION OF THE SERVICE COMPOSITION TECHNIQUES: A TOOL AND CRITERION

217

Abrehet Mohammed Omer and Alexander Schill

MODELING XML CONTENT EXPLAINED

222

Harrie Passier and Bastiaan Heeren

COOPERATIVE INFOTAINMENT SERVICES PLATFORM FOR AMBIENT ASSISTED LIVING

227

Markus Hager, Mais Hasan, Karsten Renhak, Maik Debes and Jochen Seitz

REFLECTION PAPERS

THE STYLES OF ONLINE WOM-SENDERS AND ONLINE WOM-RECEIVERS AMONG HOT SPRINGS TOURISTS

235

Kosuke C. Yamada, Masato Nakajima and Muneo Kitajima

THE ENTANGLEMENT OF HUMAN AND TECHNOLOGICAL FACETS IN THE INVESTIGATION OF WEB-BASED COMMUNITIES

239

Laura Carletti and Tommaso Leo

ZOOTECHNICS E-SCIENCE - A MANAGEMENT TOOL RELATED TO LIVESTOCK RESEARCH

244

Adriano Rogério Bruno Tech, Aldo Ivan Céspedes Arce, Max Vicente, Gustavo de Sousa Silva, Ana Carolina de Sousa Silva and Ernane José Xavier Costa

A BUSINESS INTELLIGENCE VIRTUAL COMPETENCY COMMUNITY OF PRACTICE PROPOSAL

249

Diana Târnăveanu and Mihaela I. Muntean

DIGITIZATION OF THE GREEK NATIONAL THEATRE ARCHIVE

253

Nick Hatzigeorgiu and Nikos Sidiropoulos

POSTERS

PORTAL DEVELOPMENT APPROACHES. PROPOSAL FOR COLLABORATIVE COMMUNITIES

259

Mihaela I. Muntean

CONTENT MANAGEMENT IN RUBY ON RAILS

263

Antonio Tapiador and Joaquín Salvachúa

WORKSHOP PAPERS

CCBS – A METHOD TO MAINTAIN MEMORABILITY, ACCURACY OF PASSWORD SUBMISSION AND THE EFFECTIVE PASSWORD SPACE IN CLICK-BASED VISUAL PASSWORDS

269

Haider Al-Khateeb and Carsten Maple

THE IMPACT OF CYBERSTALKING: REVIEW AND ANALYSIS OF THE ECHO PILOT PROJECT

277

Emma Short and Carsten Maple

CYBERSTALKING: FEAR OF PHYSICAL CONFRONTATION AND ITS EFFECTS ON TRAUMA

285

Antony Brown and Emma Short

SOCIAL MEDIA AND CYBER-HARASSMENT: A LEGAL PERSPECTIVE

291

Chris Bryden and Michael Salter

AUTHOR INDEX

FOREWORD

These proceedings contain the papers of the IADIS International Conferences Web Based Communities and Social Media 2011, Collaborative Technologies 2011 and Internet Applications and Research 2011, which were organised by the International Association for Development of the Information Society in Rome, Italy, 22 – 24 July 2011. These conferences are part of the Multi Conference on Computer Science and Information Systems 2011, 20 – 26 July 2011, which received a total of 1402 submissions.

The IADIS Web Based Communities and Social Media 2011 conference covers two main themes in detail: Web Based Communities and Social Media. Social media are growing rapidly and play an increasingly important role in the development of online communities. Web based communities announce themselves in both professional and private life through several new media such as LinkedIn, Twitter, Plaxo, etc. Social media allow dynamic roles in participation, virtual presence and online communities. These new ways to communicate via online social media are having great societal effects and are motivating the creation of best practices that help individuals, corporations and authorities make the best of them. Aware of the growing impact of social media and the influence of web based communities on today's user and consumer behaviour, many organisations spend an increasing share of their budget on online social marketing strategies. The mission of this conference is to publish and integrate scientific results and to act as a catalyst for the fast-developing culture of web communities, while helping to disseminate and understand the latest developments in social media and their impact.

The IADIS Collaborative Technologies 2011 conference focuses on issues related to the concepts, theory, modeling, specification, implementation and evaluation of collaborative systems and technologies, and their 'wider' applications in the information society. It pays particular attention to the 'wider' dimension as a means to diversify and broaden the applicability and scope of the current body of knowledge in the applied collaborative domain, including emerging and next-generation methods and technologies. The aim is to cover both technical and non-technical aspects of the collaborative nature of today's information society, as well as to prompt future directions for the advancement of the community.

The IADIS Internet Applications and Research 2011 conference aims to address the main issues of concern within the Internet and WWW.

These events received 134 submissions from more than 24 countries. Each submission was anonymously reviewed by an average of four independent reviewers, to ensure that accepted submissions were of a high standard. Consequently, only 23 full papers were approved, which means an acceptance rate of about 17%. A few more papers were accepted as short papers, reflection papers and posters. Extended versions of the best papers will be published in selected journals, especially the International Journal of Web Based Communities (IJWBC; ISSN 1477-8394, 4 issues per year), the International Journal of Distributed Systems and Technologies (IJDST; ISSN 1947-3532) and the IADIS International Journal on WWW/Internet (ISSN 1645-7641), as well as other selected journals, including journals from Inderscience.

Besides the presentation of full papers, short papers, reflection papers and posters, the conferences also included two keynote presentations from internationally distinguished researchers. We would therefore like to express our gratitude to Professor Gráinne Conole, The Institute of Educational Technology, The Open University, UK, and Professor Carsten Maple, University of Bedfordshire, UK, for accepting our invitation as keynote speakers.

This year the Collaborative Technologies 2011 conference also had a satellite workshop, the First IADIS International Workshop on the Transgressive Uses of Collaborative Systems 2011 (TUCS 2011). We wish to thank Emma Short, University of Bedfordshire, UK, for organising this successful workshop.

As we all know, organising these conferences requires the effort of many individuals. We would like to thank all members of the Program Committees for their hard work in reviewing and selecting the papers that appear in these proceedings. This volume has taken shape as a result of the contributions of a number of individuals. We are grateful to all authors who have submitted their papers to enrich the conferences' proceedings. We wish to thank all members of the organizing committees, delegates, invitees and guests, whose contribution and involvement are crucial for the success of the conferences.

Last but not least, we hope that everybody will have a good time in Rome, and we invite all participants to the next editions of the IADIS International Conferences Web Based Communities and Social Media 2012, Collaborative Technologies 2012 and Internet Applications and Research 2012, which will be held in Lisbon, Portugal.
Piet Kommers, University of Twente, The Netherlands
Pedro Isaías, Universidade Aberta (Portuguese Open University), Portugal
Web Based Communities and Social Media 2011 Conference Program Co-Chairs

Nik Bessis, University of Bedfordshire, United Kingdom
Collaborative Technologies 2011 Conference Program Chair

Pedro Isaías, Universidade Aberta (Portuguese Open University), Portugal
Internet Applications and Research 2011 Program Chair

Piet Kommers, University of Twente, The Netherlands
Pedro Isaías, Universidade Aberta (Portuguese Open University), Portugal
MCCSIS 2011 General Conference Co-Chairs

Rome, Italy
July 2011


PROGRAM COMMITTEE

WEB BASED COMMUNITIES AND SOCIAL MEDIA CONFERENCE PROGRAM CO-CHAIRS

Piet Kommers, University of Twente, The Netherlands
Pedro Isaías, Universidade Aberta (Portuguese Open University), Portugal

COLLABORATIVE TECHNOLOGIES CONFERENCE PROGRAM CHAIR

Nik Bessis, University of Bedfordshire, United Kingdom

INTERNET APPLICATIONS AND RESEARCH CONFERENCE PROGRAM CHAIR

Pedro Isaías, Universidade Aberta (Portuguese Open University), Portugal

MCCSIS GENERAL CONFERENCE CO-CHAIRS

Piet Kommers, University of Twente, The Netherlands
Pedro Isaías, Universidade Aberta (Portuguese Open University), Portugal

WEB BASED COMMUNITIES AND SOCIAL MEDIA COMMITTEE MEMBERS Adolfo Estalella, Universitat Oberta De Catalunya, Spain Adriana Berlanga, Open University Of The Netherlands, Netherlands Agisilaos Konidaris, Technological Educational Institute Of Ionian Isla, Greece Alberto Cattaneo, Swiss Federal Institute For Vocational Education A, Switzerland Allan Yuen, The University of Hong Kong, Hong Kong Anna Hannemann, Rwth Aachen University, Germany Annabell Preussler, FernUniversität in Hagen, Germany Antonio Fini, University Of Florence, Italy Apostolos Gkamas, University Of Patras, Greece Carlo Torniai, Oregon Health&Science University, United States Chien-sing Lee, Multimedia University, Malaysia Christin Seifert, Graz University of Technology, Austria Christos Bouras, University Of Patras, Greece Christos Georgiadis, University Of Macedonia, Greece Claudia Thurner-Scheuerer, Know-center, Austria Deniz Deryakulu, Ankara University, Turkey


Elias Pimenidis, University Of East London, United Kingdom Eliza Stefanova, St. Kl. Ohridski University Of Sofia, Bulgaria Enyedi Szilard, Technical University Of Cluj-napoca, Romania Eshaa Alkhalifa, Royal University for Women, Bahrain Fabio Nascimbeni, Menon Network, Belgium Flavio Correa Da Silva, University Of Sao Paulo, Brazil Francis Brouns, Open University, Netherlands Frederico Figueiredo, Mobbit Systems, Portugal Gayle Davidson Shivers, University Of South Alabama, USA George Dafoulas, Middlesex University, United Kingdom Gisela de Clunie, Universidad Tecnológica de Panamá, Panamá Giuliana Dettori, Istituto Per Le Tecnologie Didattiche, Italy Huang-Yao Hong, National Chengchi University, Taiwan Ilias Karasavvidis, University Of Thessaly, Greece Ismael Peña-López, Universitat Oberta De Catalunya, Spain Jan Frick, University of Stavanger, Norway Jesus Arias Fisteus, Universidad Carlos III De Madrid, Spain John Murnane, The University Of Melbourne, Australia Jon Dron, Athabasca University, Canada Jose Jesus Garcia Rueda, Universidad Carlos III De Madrid, Spain Jose Luis Sierra Rodriguez, Universidad Complutense De Madrid, Spain Kirsti Lindh, University Of Tampere, Finland Konstantinos Giotopoulos, University Of Patras, Greece Krassen Stefanov, St. Kl. 
Ohridski University Of Sofia, Bulgaria Lawrie Hunter, Kochi University Of Technology, Japan Ljuan Marko Gashi, University Of Novi Sad, Serbia Lorna Uden, Staffordshire University, United Kingdom Mandy Schiefner, University Zurich, Switzerland Marcello Sarini, University Of Milan-bicocca, Italy Marco Kalz, Open University Of The Netherlands, Netherlands Marcus Specht, Open University The Netherlands, Netherlands Maria Grazia Lerardi, Cnr, Italy Marlies Bitter-Rijpkema, Open University Of The Netherlands, Netherlands Martin Gonzalez, Tecnológico De Monterrey, Mexico Martin Llamas-Nistal, Etsi Telecomunicación, Spain Michael Filsecker, University Duisburg-essen, Germany Michael Kerres, University Duisburg-essen, Germany Michalis Xenos, Hellenic Open University, Greece Nikolina Nikolova, St. Kl. Ohridski University Of Sofia, Bulgaria Oliver Bohl, Kassel University, Germany Panayotis Fouliras, University Of Macedonia, Greece Patrick Hoefler, Know-Center, Austria Peter Albion, University Of Southern Queensland, Australia Peter Kraker, Know Center, Austria Peter Sloep, The Netherlands Open University, Netherlands Radojica Petrovic, Technical Faculty Of Cacak, Serbia Sandra Lovrencic, University Of Zagreb, Croatia xii

Sibren Fetter, Open Universiteit Nederland, Netherlands Slavi Stoyanov, The Netherlands Open University, Netherlands Sobah Abbas Petersen, Norwegian University Of Science And Technology, Norway Spyros Polykalas, Techological Education Institute Of The Ionian Isl, Greece Stefanie Lindstaedt, Know Center Graz, Austria Stylianos Hatzipanagos, King's College London, United Kingdom Thomas Köhler, Dresden University of Technology, Germany Tiberiu Letia, Technical University Of Cluj-napoca, Romania Tobias Ley, Know-Center Graz, Austria Ulrich Thiel, Fraunhofer , Germany Vaggelis Kapoulas, Research Academic Computer Technology Institute, Greece Vanessa Dennen, Florida State University, USA Vassilis Kollias, University Of Thessaly, Greece Vicente Luque Centeno, Universidad Carlos III De Madrid, Spain Viktoria Pammer, Graz University Of Technology, Austria Violeta Damjanovic, Salzburg Research, Austria Violeta Vidacek-Hains, University Of Zagreb, Croatia Vladimir Bures, University Of Hradec Kralove, Czech Republic Walid Maalej, Technische Universitat Munchen, Germany Witold Abramowicz, Poznan University Of Economics, Poland Yiwei Cao, Rwth Aachen University, Germany Zinayida Petrushyna, Rwth Aachen University, Germany

COLLABORATIVE TECHNOLOGIES COMMITTEE MEMBERS A.v. Senthil Kumar, Bharathiar University, India Alejandro Fernandez, La Plata University, Argentina Alton Chua, Nanyang Technological Universit, Singapore Andreas Menychtas, National Technical University Of Athens, Greece Andrei Semeniuta, Belarussian Trade Economic University, Belarus Areti Manataki, University Of Edinburgh, United Kingdom Arianna D'ulizia, National Research Council - Irpps, Italy Azzelarabe Taleb-bendiab, Liverpool John Moores University, United Kingdom Barbara Benito Crosetti, University Of Balearic Islands, Spain Bjorn Gottfried, University Of Bremen, Germany Bogdan Ghita, Plymouth University, United Kingdom Carlos Pinheiro, Dublin City University, Ireland Chaoying Ma, University Of Greenwich,, United Kingdom Charalampos Karagiannidis, University of Thessaly, Greece Claudia Raibulet, University Of Milano-bicocca, Italy Daniel Sanchez, European Centre for Soft Computing, Spain Darijus Strasunskas, Norwegian University of Science And Technology, Norway David Cook, American University of Iraq-Sulaimani, Iraq Dimitris Kotzinos, Technical Educational Institution Of Serres, Greece Dimosthenis Kyriazis, National Technical University Of Athens, Greece Dongwan Shin, New Mexico Tech University, USA xiii

Dorel Dusmanescu, Petroleum-Gas University Of Ploiesti, Romania Ejub Kajan, University Of Nis, Serbia El Hassan Abdelwahed, University Cadi Ayyad, Morocco Elaheh Pourabbas, National Research Council, Italy Eleana Asimakopoulou, University Of Bedfordshire, United Kingdom Elena Mugellini, University Of Applied Sciences Western Switzerland, Switzerland Fadila Bentayeb, University Of Lyon 2, France Frederic Hubert, Laval University, Quebec, Canada Gabriel Baum, La Plata University, Argentina Gabriela Moise, Petroleum-gas University Of Ploiesti, Romania Georgios Oikonomou, Loughborough University, United Kingdom Gert-jan De Vreede, University Of Nebraska At Omaha, USA Grigore Albeanu, Spiru Haret University, Romania Guillermo Jimenez, Instituto Tecnologico De Monterrey (itesm), Mexico Haifeng Chen, NEC Laboratories America, USA Hakikur Rahman, Schoolnet Foundation, Bangladesh Hani Qusa, University Of Roma, Italy Hao Cheng, Yahoo Inc, USA Ian Grimstead, Cardiff University, United Kingdom Ikram Bououd, Telecom & Management Sudparis, France Ilias Karasavvidis, University Of Thessaly, Greece Imed Boughzala, Institut Telecom Sud Paris, France Jabar H. Yousif, Sohar University, Oman Jameela Al-jaroodi, United Arab Emirates University, United Arab Emirates Jérôme Darmont, Université de Lyon (ERIC Lyon 2), France Jesus Salinas, University Of Balearic Islands, Spain Jose Santos, University Of Ulster, Northern Ireland Kirti Ruikar, Loughborough University, United Kingdom Konstantinos Tserpes, Harokopio University of Athens, Greece Liz Bacon, University Of Greenwich, United Kingdom Louise Cooke, Loughborough University, United Kingdom M. 
Antonia Martínez-carreras, University Of Murcia, Spain, Spain Manolis Tzagarakis, Computer Technology Institute, Greece Marcello Sarini, University Of Milano-bicocca, Italy Maria Malek, Eisti, France Marin Vlada, University Of Bucharest, Romania Marina Mondin, Politecnico Di Torino, Italy Mario Vacca, University "La Sapienza", Italy Martin Molhanec, Czech Technical University In Prague, Czech Republic Massimiliano Laddomada, Texas A&m University-Texarkana, USA Maytham Safar, Kuwait University, Kuwait Michael Vassilakopoulos, University Of Central Greece, Greece Mihaela Muntean, West University Of Timisoara, Romania Mitul Shukla, University Of Bedfordshire, United Kingdom Monica Vladoiu, PG University Of Ploiesti, Romania Mudasser Wyne, National University, USA Nikos Karacapilidis, University Of Patras, Greece xiv

Olfa Chourabi, Telecom & Management Sudparis, France Pankaj Kamthan, Concordia University, Canada Pilar Manchon, Intelligent Dialogue Systems S.l. (indisys), Spain Raquel Trillo, University Of Zaragoza, Spain Roman Povalej, Karlsruhe Institute of Technology (KIT), Germany Rossi Setchi, Cardiff Universit, United Kingdom Roula Michaelides, University Of Liverpool, United Kingdom Rushed Kanawati, University Paris Nord, France Samira Si-said Cherfi, Cnam Of Paris, France Samuel Wamba, University Of Wollongong, Australia Sarah Wilson-medhurst, Coventry University, United Kingdom Sergio Ilarri, University Of Zaragoza, Spain Sharon Cox, Birmingham City University, United Kingdom Stefano Ferretti, University Of Bologna, Italy Stelios Sotiriadis, University Of Bedfordshire, United Kingdom Stephen Emmitt, Loughborough University, United Kingdom Susmit Bagchi, Samsung Electronics (siso), India Thierry Badard, Laval University, Quebec, Canada Tim French, University Of Bedfordhsire, United Kingdom Ting Yu, University Of Sydney, Australia Tony Valsamidis, University Of Greenwich, United Kingdom Vladimir Dyo, University Of Bedfordshire, United Kingdom Wael El-medany, University Of Bahrain, Bahrain Weidong (Tony) Huang, CSIRO ICTCentre, Australia Wolfgang Prinz, Fraunhofer FIT, Germany Ye Huang, University Of Fribourg, Switzerland

INTERNET APPLICATIONS AND RESEARCH COMMITTEE MEMBERS Ai-Chun Pang , National Taiwan University, Taiwan Alberto Corrales Garcia, Universidad Castilla La Mancha, Spain Allan Macleod, University Of Abertay Dundee, Scotland Alvaro Fernandez, Universidad De Granada, Spain Andres Soto, Universidad Autónoma Del Carmen, Mexico Anna Goy, Universita Di Torino , Italy Antonia Maria Reina Quintero, University Of Sevilla, Spain Antonio Gabriel Lopez Herrera, University Of Granada, Spain Bin Guo, Institut Telecom Sudparis, France Carlos Porcel, Escuela Politécnica Superior Campus De Las Lagunil, Spain Carlos Rodriguez Dominguez, Universidad De Granada, Spain Carmen Martinez Cruz, Universidad De Jaén, Spain Christos Grecos, University Of West Of Scotland, United Kingdom Constantine Kotropoulos, Aristotle University Of Thessaloniki, Greece Cristian Mihaescu, University Of Craiova, Romania Danco Davcev , Sts Cyril And Methodius University, Macedonia Daniel Thalmann, Nanyang Technological University, Singapore xv

David Pinto, Benemérita Universidad Autónoma De Puebla, Mexico Demetrios Sampson, University Of Piraeus, Greece Dimitru Burdescu, University Of Craiova, Romania Edgardo Aviles-lopez, Cicese Research Center, USA Eduardo Peis, Universidad De Granada, Spain Enrique Herrera-viedma, University Of Granada, Spain Fabiana Vernero, University Of Turin, Italy Federica Cena, University Of Turin, Italy Florian Daniel, University Of Trento, Italy Francisco Pascual Romero, Universidad De Castilla La Mancha, Spain Garmpis Aristogiannis, Technological Educational Institution Of Messolong, Greece Geoff Lund, University Of Abertay Dundee, United Kingdom Gerald Schaefer, Loughborough University, United Kingdom Giovanna Petrone, University Of Turin, Italy Habib Zaidi, Geneva University Hospital, Switzerland Harry Agius, Brunel University, United Kingdom Hsin-mu Tsai, National Taiwan University, Taiwan Hui Yu, University Of Glasgow, United Kingdom Huiyu Zhou, Queen's University Belfast, United Kingdom Irena Mlynkova, Charles University in Prague, Czech Republic Isabel Ramos Roman , University Of Sevilla, Spain Javier Jesus Gutierrez Rodriguez, University Of Sevilla, Spain Jesus Serrano Guerrero, University Of Castilla La Mancha, Spain Jesus Torres, University Of Sevilla, Spain Jing Dong, Chinese Academy Of Sciences(casia), China José Alfredo Ferreira Costa, Federal University, UFRN, Brazil Jose Luis Martinez, University Of Castilla La Mancha, Spain José Manuel Morales Del Castillo, Universidad De Granada, Spain Juan Gabriel Gonzalez Serna, Centro Nacional De Investigacion Y Desarrollo Tecn, Mexico Juan Pablo Soto Barrera, Universidad De Sonora, Mexico Kate Ching-ju Lin, Academia Sinica, Taiwan Kevin Curran, University Of Ulster, United Kingdom Laila Benhlima, Ecole Mohammadia D'ingénieurs, Morocco Liliana Ardissono, University of Torino, Italy Luca Cernuzzi, Universidad Católica “nuestra Señora De La Asunció, Paraguay Manuel Mejias Risoto, Escuela Técnica Superior De Ingeniería 
Informática, Spain Marco Furini, University Of Modena And Reggio Emilia, Italy Marcus Specht, Open Universiteit Nederland, Netherlands Maria Visitacion Hurtado, University Of Granada, Spain Maria Jose Escalona Cuaresma, Universidad De Sevilla, Spain Maria Jose Rodriguez Fortiz, Escuela Técnica Superior De Ingenieria Informatica, Spain Maria Luisa Rodriguez-almendros, Escuela Técnica Superior De Ingeniería Informática, Spain Martin Gaedke, Chemnitz University Of Technology, Germany Michael Weber, University Of Ulm, Germany Miguel Onofre Martinez Rach, Universidad Miguel Hernandez, Spain xvi

Milos Kravcik, Open Universiteit Nederland, Netherlands Morten Goodwin, Tingtun AS, Norway Nawaporn Wisitpongphan, King Mongkut's University Of Technology, Thailand Noureddine Bouhmala, Vestfold University College, Norway Octavio Martin-Diaz , Universidad De Sevilla, Spain Ole-Christoffer Granmo, University of Agder, Norway Oscar Mario Rodriguez Elias, Instituto Tecnológico De Hermosillo, Mexico Otoniel Lopez, Universidad Miguel Hernandez, Spain Pablo Pinol Peral, Universidad Miguel Hernandez, Spain Paolo Ferragina, Università Di Pisa, Italy Pingkun Yan, Chinese Academy of Sciences, China Qi Wang, University Of The West Of Scotland, United Kingdom Ryszard S. Choras, University Of Technology & Life Sciences, Poland Sam Suppakkul, University Of Texas, USA Shuai Zheng, Institute Of Automation, Chinese Academy Of Scienc, China Shun-ren Yang, National Tsing Hua University, Taiwan Sok-ian Sou, National Cheng Kung University, Taiwan Spiros Sirmakessis, Technological Educational Institution of Messolong, Greece Swee Keow Goo, University Of Abertay Dundee, United Kingdom Tayeb Lemlouma, IRISA / IUT Of Lannion (University of Rennes 1), France Tran Khanh Dang, HCMC University of Technology, VNUHCM, Vietnam Victor SOSA-SOSA, Cinvestav, Mexico Wolfgang Deiters, Fraunhofer-institut Fur Software Und Systemtechnik, Germany Zhang Zhang, Chinese Academy Of Sciences, China Zhaoxiang Zhang, Beihang University, China



KEYNOTE LECTURES

RETHINKING LEARNING AND TEACHING IN A DIGITAL AGE

Professor Gráinne Conole, The Institute of Educational Technology, The Open University, UK

ABSTRACT

There is no doubt that social and participatory media have enormous potential to transform learning, teaching and research. However, this potential has not yet been realised. The talk will review the characteristics of new technologies and consider how they might be used in an educational context. It will consider some of the paradoxes associated with these new technologies and put forward some strategies for their more effective use.

TECHNIQUES AND CHALLENGES FOR ENSURING EFFECTIVE USE OF COLLABORATIVE SYSTEMS

Professor Carsten Maple, University of Bedfordshire, UK

ABSTRACT

There have been significant technological advances in the development of collaborative systems in recent years, leading to widespread adoption of such systems. While this has clear benefits for rapid communication and the development of ideas, the value of the information held means that there is an increasing number of attacks on such systems. This talk will present some of the latest challenges in ensuring the proper use of collaborative technologies, and methods for protecting these systems.



Full Papers


SOA BASED APPROACH FOR INTERCONNECTING WORKFLOWS ACCORDING TO THE SUBCONTRACTING ARCHITECTURE

Saida Boukhedouma¹, Zaia Alimazighi¹, Mourad Oussalah² and Dalila Tamzalit²

ABSTRACT

In the area of business processes, the services needed and provided by organizations are increasing steadily, especially with the emergence of new technologies such as workflow and web services supported by Service Oriented Architectures (SOA). The two technologies aim to provide flexibility, scalability and efficiency for business applications and to improve collaboration between business partners. This paper is situated at a conceptual level: it proposes an approach to connecting the workflows of several partners using services. The approach is supported by a process meta-model which combines workflow concepts and SOA concepts for modeling inter-organizational processes, particularly those built according to a subcontracting architecture. The advantage of a service-based approach is that the resulting process models are flexible enough to allow easier adaptation to new business needs, because services are loosely coupled components. Our approach is illustrated by instantiating its concepts on a simple example of an inter-organizational process.

KEYWORDS

Inter-organizational process, process meta-model, SOA, web service, workflow.

1. INTRODUCTION

Workflow technology has been widely used in organizational environments to support the automation of part or all of a business process according to predefined rules. This has led to considerable improvement of these processes, which are therefore called workflow processes (Aalst, 2002). Today, companies face many challenges: the exceptional growth of the services they must offer to their customers, the increased need to provide better quality of service, and the necessity of cooperation and collaboration with other business partners. In the workflow area, this cooperation was initially supported by the concepts and tools of Inter-Organizational Workflow (IOWF) (Aalst, 1999), (Aalst, 2000). Since the year 2000, with the emergence of Service Oriented Architectures (SOA) (Papazoglou, 2007) and web services (Alonzo et al, 2004), many research works, such as (Leymann et al, 2002), (Crusson, 2003) and (Gorton et al, 2009), have been directed towards combining workflow and web service technologies for the development of collaborative business applications implementing inter-organizational processes, in order to benefit from the advantages offered by both technologies.

In our research, we focus on structured inter-organizational processes mainly based on the architectures of cooperation well defined in the IOWF literature: capacity sharing, chained execution, subcontracting, (extended) case transfer, and the loosely coupled architecture (Aalst, 1999), (Aalst, 2000). In our opinion, these architectures cover the various forms of cooperation that can exist between business partners in the context of a structured cooperation, so they can be considered basic patterns of IOWF models. Structured cooperation means that the inter-organizational process model is clearly defined and all process instances are executed according to this model.
Also, in a context of increased globalization and in order to meet new market demands, businesses often face stressful situations such as a breach of contract with a partner, a failure of the business process or a need for additional resources. Faced with these situations, companies must review their systems,


ISBN: 978-972-8939-40-3 © 2011 IADIS

their business processes and their cooperation with other business partners in order to make the necessary adjustments. In our research, we focus on the adaptation of inter-organizational process models. Our medium-term objective is to achieve the adaptation of inter-organizational process models in order to support new business requirements and changes. To this end, we aim to interconnect workflows in such a way that they remain flexible and easily adaptable. The approach we adopt is based on services because of their characteristics: they are loosely coupled, invocable and business oriented. In this paper, we consider workflow processes obeying the subcontracting architecture and propose an SOA-based approach for interconnecting workflows according to this architecture. Conceptually, our approach rests on a process meta-model that regroups concepts of workflow technology and concepts of the SOA paradigm. The rest of the paper is structured as follows: section 2 defines the context of the work and introduces some basic concepts. Section 3 briefly reviews related work and explains the motivation of this paper. Section 4 presents our conceptual approach, that is, the scheme for interconnecting workflows and the process meta-model covering the different, complementary views of process modeling. Section 5 illustrates our approach on a simple example of an IOWF implementing a subcontracting cooperation between two partners that provide internet access to customers.

2. CONTEXT OF THE WORK

A workflow process is the automation of all or part of a business process in which information flows from one activity to another (respectively, from one participant to another) according to a set of predefined rules. An inter-organizational workflow (IOWF) can be defined as a manager of activities involving two or more autonomous, possibly heterogeneous and interoperable workflows (affiliated with business partners) in order to achieve a common business goal.

Figure 1. Scheme of subcontracting architecture

One of the architectures defined in (Aalst, 1999) for IOWF is the subcontracting architecture. This model of cooperation connects two or more business partners, each of which implements its own workflow process (see Figure 1). One main workflow (WF1) subcontracts some activities that are not implemented locally (such as A1 and A2) to one or more secondary workflows (WF2 and WF3) involved in the subcontracting relationship.

3. RELATED WORKS AND MOTIVATION

With the emergence of SOA and web service standards, many research works deal with the orchestration and choreography of web services (Peltz, 2003), (Decker et al, 2007), (Amirreza, 2009), especially based on BPEL4WS (Business Process Execution Language for Web Services) (Jordan et al, 2006), in order to build


IADIS International Conferences Web Based Communities 2011, Collaborative Technologies 2011 and Internet Applications and Research 2011

business processes by service composition. Most of the SOA-oriented research works (Casati et al, 2000), (Chen et al, 2005), (Arsanjani et al, 2008) focus on the technical aspects of the proposed approaches and consider collaborative business applications that are fully automatic, without taking human intervention in the process into account. Other research works such as (Leymann et al, 2002), (Crusson, 2003), (Gorton et al, 2009) and (Pedraza Ferrera, 2009) show the interest of combining BPM (Business Process Management), workflow and SOA for the re-use of services to construct dynamic business processes. In (Gorton et al, 2009), the authors propose an approach combining workflow and SOA for business process modeling. The work presented in (Crusson, 2003) describes a complete environment for collaborative processes allowing interaction between business partners via web services. Many platforms and approaches based on workflow and web service technologies have also been proposed in the context of structured B2B cooperation, for example CoopFlow (Chebbi et al, 2006), (Chebbi, 2007), CrossWork (Mehandjiev et al, 2005) and Pyros (Belhajjame et al, 2005), (Perrin et al, 2004). These approaches provide a certain degree of flexibility since they allow the internal adaptation of workflows without compromising the coherence of the overall workflow. In our recent works, we have focused on combining services and workflow for the construction of collaborative applications (Boukhedouma et al, 2010). Currently, we are interested in structured inter-organizational workflows mainly based on the architectures of cooperation already defined in the literature: capacity sharing, chained execution, subcontracting, (extended) case transfer and loosely coupled (Aalst, 1999). These architectures describe various forms of cooperation between business partners in the context of structured B2B cooperation and can be considered as basic patterns of IOWF models.
In the medium term, we plan to achieve the adaptation of structured inter-organizational process models (based on IOWF patterns) in order to support changes arising from new business requirements. For interconnecting workflows, we adopt a service-based approach: since services are loosely coupled, invocable and business-oriented software components, the resulting IOWF models remain flexible enough to allow easier adaptation. The current work addresses the subcontracting architecture, which is, in our opinion, fairly common in B2B relationships. The paper exposes an approach to connecting workflows using services and proposes a meta-model for process definition. The meta-model we propose supports two kinds of activities: internal activities, which can be automatic, semi-automatic or manual, for the support of the intra-organizational aspect, and interaction activities (invocations/replies) for the support of the inter-organizational aspect. Generally, a meta-model based approach ensures that the generated process models comply with the concepts defined in the meta-model and facilitates the adaptation of models in case of new business requirements.

4. OUR APPROACH

In this section, we describe a scheme for interconnecting workflows using services and the conceptual supports of our approach, mainly the process meta-model, the generated models and the supporting formalisms at each level.

4.1 Scheme of Interconnecting Workflows

Figure 2 below describes the scheme of interconnecting WFs according to a subcontracting architecture. Each partner involved in the inter-organizational workflow implements and hosts its workflow locally. Interactions between the main WF and the secondary WF(s) are done through service invocation/reply operations. Thus, each secondary WF involved in the cooperation is encapsulated within a service (particularly a web service) and has two main activities: an input activity for service invocation and the input data flow, and an output activity for returning the results provided by the service to the main workflow.
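The interconnection scheme above can be sketched as follows. This is an illustrative sketch, not part of the paper: class and field names are hypothetical, and the service call is mocked as a local method instead of a real web service invocation.

```python
# Sketch of the subcontracting scheme: the main WF delegates a subcontracted
# activity to a secondary WF exposed as a service whose only visible parts are
# its input (invoke) and output (reply) operations.

class SecondaryWorkflowService:
    """Secondary WF encapsulated as a service: a black box with visible I/O only."""

    def invoke(self, input_data):
        # Internal activities of the secondary WF run here, hidden from the caller.
        result = {"request_id": input_data["request_id"], "status": "processed"}
        return self.reply(result)

    def reply(self, result):
        # Output activity: return the results to the main workflow.
        return result


class MainWorkflow:
    def __init__(self, subcontractor):
        self.subcontractor = subcontractor

    def run(self, request):
        # An internal activity (local), then an interaction activity (invocation).
        request["checked"] = True
        reply = self.subcontractor.invoke(request)  # blocks until the service replies
        return reply["status"]


main = MainWorkflow(SecondaryWorkflowService())
print(main.run({"request_id": 42}))  # -> processed
```

In a real deployment the `invoke` call would be a web service invocation between partners; here it only illustrates the consumer/provider roles described above.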



Figure 2. Scheme of interconnecting WFs according to SOA approach

In the following, we explain the conceptual supports of our approach, mainly the process meta-model. Through an instantiation mechanism, the meta-model is used to generate process models, which can be of two kinds: PIM (Platform Independent) or PSM (Platform Specific) models (see Figure 3).

4.2 Conceptual Supports of the Approach

Figure 3. Conceptual levels of our approach

The meta-model level exhibits the principal concepts, and the links between them, for the support of process modeling. The meta-model ensures a conforming definition of process models and, where necessary, the redefinition of these models when changes must be made. The meta-model is described using UML class diagrams. The model level represents the different views of a particular process related to a real-world case study. Process models are built in conformity with the concepts identified in the meta-model. These models can be platform independent (PIM models) or related to a specific platform (PSM models) (see Figure 3). At the PIM level of business process modeling, we can use appropriate UML diagrams as modeling tools: mainly sequence diagrams for modeling interactions, activity diagrams for modeling the activities of the process and the control flow between them, class diagrams to describe the informational aspect, and object diagrams to show the instantiation of the meta-model. At the PSM level, we use a business process specification language to specify the process to be implemented. In the following, we describe the meta-model of the overall process and then detail the different modeling perspectives covered.



4.3 The Process Meta-model of Inter-Organizational Processes

The meta-model described in Figure 4 as a UML class diagram regroups the main concepts attached to workflow process definition according to the WFMC standards, and some concepts attached to the SOA paradigm according to the OASIS standards. The goal is to exhibit the relations between the main concepts of the two technologies in order to guarantee a correct interconnection between workflow processes using services, in the context of a subcontracting relationship.

Figure 4. Global Meta-model according to the subcontracting architecture

In fact, an IOWF is a WF composed of a main WF and a set of one or more secondary WF(s). The main WF defines the scheme of the overall process and triggers the execution of each secondary WF through interaction activities (invocations). Seen from the main WF, a secondary WF is a simple activity, but its implementation at the partner that hosts it can be very complex. In our approach, each secondary WF is encapsulated within a service because it is perceived from outside as a black box with only its inputs and outputs visible. The main WF interacts with each secondary WF via interaction activities (invocation/reply); the other activities are internal and not visible from outside. With this vision, the partner that hosts the main WF becomes the consumer of the service and the partner that hosts a secondary WF becomes the provider of the service. This follows an SOA architecture limited to static cooperation, since in our case the partners involved in the cooperation are known a priori. For WF process modeling, the WFMC and some works in the area like (Saikali, 2001) identified modeling aspects covering all views of the process: usually the process, organizational and informational aspects. The meta-model of Figure 4 covers all the aspects mentioned above through a combination of WF concepts and SOA concepts and, additionally, an interactional aspect in order to support

1 WorkFlow Management Coalition – http://www.wfmc.org
2 OASIS – http://www.oasis.org



the interaction between the main WF and the secondary WF(s). From the SOA paradigm, we introduce only the concepts that provide visibility and invocation of the service encapsulating the secondary WF.

4.3.1 The Process Aspect

The process aspect covers the functional and behavioral views of process modeling. As shown in Figure 4, this aspect describes, on the one hand, the aggregation of the inter-organizational WF (IOWF) in terms of one main workflow and one or more secondary workflows. Each WF process is structured into activities; the activity is the central concept of the meta-model, linking the four views of process modeling. On the other hand, the process aspect describes the control flow of activities, namely the disjunction, conjunction and synchronization points imposed by transition conditions, which are the pre-conditions for triggering activities. The execution order of activities is expressed through appropriate control-flow operators supported by the modeling formalism. A condition can be simple or composed: a simple condition is either a logical expression on the workflow data or an event (the end of an activity, an expiry time or an external event such as a service invocation); a composed condition is expressed through two or more simple conditions and appropriate logical control-flow operators. An activity is specialized into either an internal activity or an interaction activity. An internal activity is controlled locally by the WFMS and supported by the private resources of the partner that implements it. An internal activity can be manual, semi-automatic or automatic; the latter can be achieved by invoking an application or a service from the local information system. A secondary WF has the same structure as the main WF but is completely encapsulated within a service.
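The distinction between simple and composed transition conditions can be illustrated with a small sketch. This is not from the paper: field names and the combinator helpers are hypothetical, and conditions are simply modeled as predicates over the workflow data.

```python
# Transition conditions as predicates over workflow data, composed with
# logical control-flow operators (AND/OR).

def AND(*conds):
    """A composed condition that holds when all sub-conditions hold."""
    return lambda data: all(c(data) for c in conds)

def OR(*conds):
    """A composed condition that holds when at least one sub-condition holds."""
    return lambda data: any(c(data) for c in conds)

# Simple conditions: logical expressions over the workflow data
# (field names are illustrative only).
is_technical = lambda d: d.get("kind") == "technical"
is_checked   = lambda d: d.get("checked", False)
is_escalated = lambda d: d.get("escalated", False)

# Composed pre-condition for triggering a hypothetical "diagnose" activity.
precondition = AND(is_technical, OR(is_checked, is_escalated))

print(precondition({"kind": "technical", "checked": True}))  # -> True
print(precondition({"kind": "connection"}))                  # -> False
```

Events (end of activity, expiry time, service invocation) could be modeled the same way, as predicates over an event log instead of the data dictionary.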

4.3.2 The Organizational Aspect

The organizational aspect highlights the participants involved in the achievement of the inter-organizational process. On the one hand, it exhibits the partners involved in the cooperation, namely the service provider and the service consumer. On the other hand, this aspect shows the internal resources of each locally implemented WF (see Figure 4). Each partner has a set of resources that are specialized into human, material or software resources. A human resource (or human actor) takes one or more roles; the concept of role gives flexibility at process runtime when assigning activity instances to human actors.

4.3.3 The Interactional Aspect

Figure 5. Meta-model for the interactional aspect

The activities enabling communication between the main WF and a secondary WF are interaction activities: an invocation from the main WF, to which corresponds a reply from the secondary WF (see Figure 5). From the service perspective, two operations allow interaction: an input operation which receives the data for invocation



and an output operation which transmits the service results. Input and output data are public artifacts manipulated by interaction activities. The inter-organizational aspect is supported by the concept of a contract, defined through a set of constraints of three kinds: syntactic constraints, semantic constraints and QoS (quality of service) constraints. The service interface provides the necessary and sufficient information for communicating with the service; it specifies the syntactic constraints (message formats and technological information). The service description contains complementary information that ensures correct interaction; it contains the semantic and QoS constraints.

4.3.4 The Informational Aspect

The informational aspect is supported by the generic concept of artifact, which can be specialized into data, file, form or any other document. An artifact represents any information used or produced when performing activity instances. In the context of an IOWF, some artifacts are public while others are private (see Figure 4). A public artifact can be seen and manipulated by interaction activities since it is accessible by all workflows involved in the cooperation; in our case, a public artifact is any information conveyed by messages during the invocations/replies between the main WF and the secondary WF(s). By contrast, a private artifact is visible only within one organization and can be handled only by the activities implemented locally, meaning that internal activities can handle both types of artifacts (public or private).

5. ILLUSTRATION OF THE APPROACH

In order to illustrate our approach, we present a simple example of an inter-organizational process obeying the subcontracting architecture and show the instantiation of the main concepts defined in our meta-model. This section mainly illustrates the model levels shown in Figure 3. For the PIM models, we exhibit an activity diagram (see Figure 6) describing the activities, roles (using swimlanes), web service invocations, control flow and data flow in the process, and an object diagram highlighting the concepts instantiated in our example. We also express the correspondence between the main concepts of our meta-model and the concepts of jPDL, the language we use to specify the process of our example.

5.1 Description of the Process and Instantiation of Concepts

The process that we consider involves two business partners, respectively called EEPAD and AT, that provide internet access to customers. The EEPAD company implements a workflow processing customer requests; it subcontracts some of these requests to the AT company, especially when dealing with an application for a new internet connection. The inter-organizational process is outlined in Figure 6. A client request arrives at the EEPAD hotline agent and is immediately checked. If it concerns a technical problem, it is diagnosed by the hotline agent, who solves it locally if the problem is resolvable. Otherwise, the request is forwarded to the hotline supervisor, who issues a complaint and sends it to the appropriate technician. The technician studies the customer's request and tries to solve the problem. If the client request is a request to connect to the internet, it is forwarded (via web service invocation) to the AT company to be processed through its local workflow, in the context of the subcontracting cooperation between the two companies. The workflow process of AT is encapsulated in a web service; it sends the results of processing a request back to the EEPAD company, which undertakes to complete the processing locally and finally informs the client of the outcome of his application for an internet connection. Figure 7 below is an object diagram that shows the instantiation of the key concepts of the meta-model on the simple example previously described.



Figure 6. Activity diagram of the inter-organizational workflow EEPAD/AT

Figure 7. Instantiation of the meta-model on the example –workflow EEPAD/AT



5.2 Implementation of the Process

To implement the inter-organizational process, we use a process definition language that allows the implementation of the key concepts identified in the proposed meta-model. We have specified the process of the previous example using jPDL, a process definition language enabling the expression of automatic, manual and semi-automatic activities. This language is interpretable by the jBPM workflow engine, which runs under the jBoss environment and allows interfacing with web applications (mainly web services) in order to specify any application or service invoked by the process. Table 1 shows the correspondence between some jPDL concepts and the main concepts identified in our meta-model.

Table 1. Correspondence of concepts

Concept of the meta-model           jPDL concept
Process                             Process
Manual/semi-automatic activity      task-node
Automatic activity                  Task
Role/human resource                 Swimlane/Actor
Artifact                            Variable
Sequence                            Transition
Operator of conjunction             Fork
Operator of synchronization         Join
Operator of disjunction             Decision

jPDL proposes the concept of "process-state", which allows the invocation of any application or service considered as a sub-process of the parent process (the invoker). The instance of the parent process stops running during the execution of the invoked sub-process and resumes execution when it finishes.
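The process-state mechanism might look like the following fragment. This is a hedged sketch only, assuming the element names of jBPM 3's jPDL (process-state, sub-process, variable); the process and node names are illustrative, not taken from the EEPAD/AT implementation.

```xml
<process-definition name="EEPAD-main">
  ...
  <!-- Interaction activity: delegate connection requests to the AT workflow.
       The parent instance suspends here until the sub-process ends. -->
  <process-state name="subcontract-to-AT">
    <sub-process name="AT-connection-process"/>
    <!-- Public artifacts exchanged with the sub-process. -->
    <variable name="clientRequest" access="read"/>
    <variable name="connectionResult" access="write"/>
    <transition to="complete-processing"/>
  </process-state>
  ...
</process-definition>
```

On resumption, the `connectionResult` variable would carry the reply of the subcontracted workflow back into the parent process.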

6. CONCLUSION

In our research, we are interested in structured IOWF mainly based on the architectures of cooperation defined in (Aalst, 1999), (Aalst, 2000): capacity sharing, chained execution, subcontracting, (extended) case transfer and loosely coupled. These architectures, which can be considered as basic patterns of IOWF models, cover the various forms of cooperation that can exist between business partners in the context of structured B2B cooperation. Our ultimate objective is to achieve the adaptation of structured inter-organizational process models (based on IOWF patterns) in order to support model changes in case of new business requirements. To this end, we first focus on the issue of workflow interconnection; we adopt a service-based approach for interconnecting workflows since services are loosely coupled, invocable and business-oriented software components, so the IOWF models remain flexible enough to allow easier adaptation. In this paper, we have presented a service-based approach for interconnecting workflows according to the subcontracting architecture. We have described a scheme for interconnecting workflows using services, conceptually supported by a process meta-model. The meta-model regroups workflow concepts and SOA concepts and supports two kinds of activities: internal activities, and interaction activities (invocation/reply) supporting the interactional aspect. Through an instantiation mechanism, the meta-model is used to generate process models related to a specific real-world process. These models are of two kinds: PIM models described by appropriate UML diagrams and PSM models expressed through an appropriate specification language. For illustration, we have presented a simple example of an inter-organizational process and shown the instantiation of the main concepts using an object diagram.
We have specified the process with jPDL, a language for business process definition allowing the definition of manual, semi-automatic and automatic activities and the invocation of web services under the jBPM workflow engine and the jBoss environment.

3 jPDL: www.jboss.com/products/
4 jBPM: www.jboss.com/products/jbpm/
5 jBoss: www.jboss.org



We are currently working to generalize our approach for interconnecting workflows using services to the other specific architectures of structured IOWF, namely the chained execution, (extended) case transfer and loosely coupled architectures. Once we have established all the schemes of workflow interconnection, we intend to focus on the definition of a formalism for expressing basic IOWF patterns, and we will then work on the adaptation of IOWF models.

REFERENCES

(Aalst, 1999): Aalst W.V.D., 1999. Process-oriented architectures for electronic commerce and interorganizational workflow. Information Systems journal, Elsevier.
(Aalst, 2000): Aalst W.V.D., 2000. Loosely coupled interorganizational workflows: modeling and analyzing workflows crossing organizational boundaries. Information and Management, 37(2): 67-75.
(Aalst, 2002): Aalst W.V.D., 2002. Workflow Management: Models, Methods and Systems. The MIT Press, Cambridge, Massachusetts, London, England.
(Alonso et al, 2004): Alonso G., Casati F., and Kuno H., 2004. Web Services: Concepts, Architectures and Applications. Springer Verlag, Heidelberg, Germany.
(Amirreza, 2009): Amirreza T., 2009. Web Service Composition Based Interorganizational Workflows. Südwestdeutscher Verlag für Hochschulschriften, ISBN 9783838106700.
(Arsanjani et al, 2008): Arsanjani A., Ghosh S., Allam A., Abdollah T., Ganapathy S., and Holley K., 2008. A Method for Developing Service Oriented Solutions. IBM Systems Journal, 47(3): 377-396.
(Belhajjame et al, 2005): Belhajjame K., Vargas-Solar G., and Collet C., 2005. Pyros - an environment for building and orchestrating open services. In SCC '05: Proceedings of the 2005 IEEE International Conference on Services Computing, pages 155-164, Washington, DC, USA. IEEE Computer Society.
(Boukhedouma et al, 2010): Boukhedouma S., Alimazighi Z., 2010. A process meta-model based method for the development of collaborative applications built on workflow and SOA. In Proceedings of EMCIS'2010, UAE.
(Casati et al, 2000): Casati F., Ilnicki S., Jin L., Krishnamoorthy V., Shan M.C., 2000. Adaptive and Dynamic Service Composition in eFlow. HP Laboratories, HPL-2000-39.
(Chebbi et al, 2006): Chebbi I., Dustdar S., Tata S., 2006. The view-based approach to dynamic inter-organizational workflow cooperation. Data and Knowledge Engineering, 56: 139-173.
(Chebbi, 2007): Chebbi I., 2007. CoopFlow: An approach for workflow ascending cooperation in virtual enterprises. PhD thesis, National Institute of Telecoms, France.
(Chen et al, 2005): Chen M., Zhang D., Zhou L., 2005. Empowering collaborative commerce with web services enabled business process management systems. Decision Support Systems, www.sciencedirect.com.
(Crusson, 2003): Crusson T., 2003. Business Process Management: from modeling to execution, positioning in relation to SOA. www.intalio.com.
(Decker et al, 2007): Decker G., Kopp O., Leymann F., and Weske M., 2007. BPEL4Chor: Extending BPEL for modeling choreographies. In Proceedings of the 2007 IEEE International Conference on Web Services (ICWS 2007): 296-303. IEEE Computer Society.
(Gorton et al, 2009): Gorton S., Montangero C., Reiff-Marganiec S., and Semini L., 2009. StPowla: SOA, Policies and Workflows. ICSOC Workshops, LNCS 4907: 351-362.
(Jordan et al, 2006): Jordan D., and Evdemon J., 2006. Web Services Business Process Execution Language V2.0. OASIS.
(Leymann et al, 2002): Leymann F., Roller D., and Schmidt M.T., 2002. Web Services and Business Process Management. IBM Systems Journal, 41(2).
(Mehandjiev et al, 2005): Mehandjiev N., Stalker I., Fessl K., and Weichhart G., 2005. Interoperability contributions of CrossWork. Invited short paper in Proceedings of the INTEROP-ESA'05 Conference, Geneva. Springer-Verlag.
(Papazoglou et al, 2007): Papazoglou M.P., and Heuvel W., 2007. Service-oriented architectures: approaches, technologies and research issues. The VLDB Journal, 16: 389-415.
(Pedraza Ferrera, 2009): Pedraza Ferrera G.R., 2009. An Extensible Framework for the Construction of Process-Oriented Applications. PhD thesis, University of Grenoble I, France.
(Peltz, 2003): Peltz C., 2003. Web Services Orchestration and Choreography. IEEE Computer, 36(10): 46-52.
(Perrin et al, 2004): Perrin O., and Godart C., 2004. A model to support collaborative work in virtual enterprises. Data and Knowledge Engineering, 50(1): 63-86.
(Saikali, 2001): Saikali K., 2001. Flexible workflows using the object approach: 2Flows, a framework for flexible workflow. PhD thesis, École Centrale de Lyon, France.



ORGANIZATIONAL KNOWLEDGE MAPPING BASED ON LIBRARY INFORMATION SYSTEM (IRANDOC CASE STUDY)

Ammar Jalalimanesh and Elaheh Homayounvala
Information Engineering Department, Iranian Research Institute for Information Science and Technology
1090, Enqelab St., PO Box 13185-1371, Tehran, Iran

ABSTRACT

One of the most popular techniques for identifying knowledge in organizations is knowledge mapping, which can help decision makers better understand the knowledge flow within an organization. Mapping organizational knowledge, especially in research institutes, has attracted much attention from senior management in recent years. Libraries, among the most important parts of research institutes, have a significant role in scientific advances; due to this role, many knowledge operations take place in collaboration with libraries. All library transactions, including users' borrowing and returning logs as well as book metadata, are recorded in library information systems. Users' transaction logs are therefore rich resources from which to extract information about knowledge operations in an organization. In this paper we propose a new methodology for drawing a knowledge map based on library information system logs. Our proposed methodology contains five steps: data collection and building a data warehouse, data preprocessing and refinement, applying the knowledge mapping algorithm to extract the input data for mapping, drawing the knowledge map and, finally, analyzing the results. Following this methodology, we have drawn the IRANDOC knowledge map, emphasizing interdisciplinary domains, based on the library information system users' logs. The IRANDOC knowledge map shows the most studied subjects and the interrelations between them, an invaluable source of knowledge for IRANDOC decision makers when initiating research projects.

KEYWORDS

Knowledge mapping, library information system, log analysis, decision making.

1. INTRODUCTION

Knowledge is one of the most important strategic resources of organizations (McLure Wasko and Faraj, 2000, Hult and Ketchen, 2006). Due to this importance, knowledge management has become a central issue for organizations. Knowledge management requires the identification, generation, acquisition and diffusion of knowledge, and the capture of its benefits, which provide strategic advantages to the organization (Dalkir, 2007). Knowledge identification is an important phase in knowledge management and a prerequisite for the other steps. One of the most popular techniques for identifying knowledge in an organization is knowledge mapping. Knowledge mapping can help decision makers better understand the knowledge flow within the organization. These maps can be built to illustrate knowledge sources, sinks and constraints (Chan and Liebowitz, 2006, Grey, 1999). Much research has focused on the importance of knowledge mapping in organizations and society (Eppler, 2001, Huijsen et al., 2004, Jafari et al., 2009, Wexler, 2001). Some of these works try to build a framework for method selection (Jafari et al., 2009, Wexler, 2001) while others focus on defining conceptual and practical methods for drawing knowledge maps (Eppler, 2001, Huijsen et al., 2004). Eppler has classified organizational knowledge maps into five categories: knowledge source, asset, structure, application and development maps (Eppler, 2001). In his classification, knowledge source maps answer questions about an organization's capability for handling projects. Knowledge asset maps visually exhibit the existing stock of knowledge of an individual, a team, a unit or the whole organization. Knowledge structure maps depict the global architecture of a knowledge domain as well as the way in which its parts are interconnected. Application maps demonstrate which type of knowledge



should be applied in a specific business situation and, finally, knowledge development maps can be used to illustrate the stages necessary to develop a certain competence. Mapping the knowledge of an organization, especially in research institutes, has attracted much attention from senior management in recent years. Decision makers in these institutions are interested in figuring out the areas that best fit their backgrounds. However, organizational knowledge has several dimensions and it is hard to draw a comprehensive knowledge map. Data analysis of organizational databases can assist decision makers in depicting such maps. Libraries are among the most important parts of research institutes and have a significant role in scientific advances; due to this role, many knowledge operations take place in collaboration with libraries. We believe that users' transaction logs are rich sources of data from which an organization's knowledge flows can be extracted. Based on this hypothesis, we drew the Iranian Research Institute for Information Science and Technology (IRANDOC) knowledge map, emphasizing interdisciplinary domains, from library information system logs. In this paper we propose a novel methodology for drawing a knowledge map based on library information system logs. According to Eppler's categories, our knowledge map is a combination of knowledge source, asset and structure maps. The aim of this research is to find interdisciplinary domains on which IRANDOC can establish future research projects. We analyzed the borrowing logs of the LIS and extracted the relations between knowledge fields according to users' subjects of study, based on the Library of Congress (LC) classification. We assume that when a user borrows from any pair of subjects, he or she has the knowledge to connect those subjects.
More borrowing from both subjects shows that IRANDOC has a strong chance of success, defining interdisciplinary research projects connecting these subjects. The organization of this paper is as follows. In the second section, we describe our proposed methodology for drawing knowledge map based on LIS logs, including data collection procedure and pre-processing of raw data, as well as the logic behind the mapping algorithm. In section three, we explain our case study for drawing IRANDOC knowledge map. In this section we demonstrate implementation steps and results of the project. In section four, we make discussion and conclusion and finally we present a number of interesting topics for future research.

2. LIS-BASED ORGANIZATIONAL KNOWLEDGE MAPPING

Since knowledge in organizations is hidden, we need a methodology for discovering it from facts. Some key resources in an organization can supply these facts and reveal knowledge flows. The organization's library is one such key resource, especially in research or academic organizations. Since the most important knowledge assets of such organizations are their researchers, tracking researchers' information seeking behavior is a good way to understand the state of organizational knowledge in every field. All library transactions, including users' borrowing and returning logs as well as book metadata, are recorded in the library information system. By analyzing these logs, we can therefore extract valuable information about the information seeking behavior of library users.

Our methodology for drawing a knowledge map from LIS logs consists of five phases, depicted in figure (1):
1. Data collection and building a data warehouse
2. Data preprocessing and refinement
3. Applying the knowledge mapping algorithm to extract input data for the map
4. Drawing the knowledge map
5. Analyzing the results

1 Library Information System


IADIS International Conferences Web Based Communities 2011, Collaborative Technologies 2011 and Internet Applications and Research 2011

Figure 1. LIS knowledge mapping phases.

2.1 Data Collection and Preprocessing

The first step is to collect data from the LIS and establish a data warehouse. A data warehouse is an integrated data repository containing an organization's historical data to support decision-making processes (Song and LeVan-Shultz, 2010). In this phase, library transaction data, user data, and book metadata including subjects are extracted from the LIS, and a new data model, designed around our decision-making needs, is built in the form of a relational database. In the second phase, the data are preprocessed and prepared for mapping: in the refinement step, incorrect, incomplete, and outlier records are removed from the repository. Then, to prepare exact input data for the mapping process, queries are formulated to be answered by the data warehouse. At the next stage, the mapping algorithm is applied to the collected data to draw the organizational knowledge map.
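As an illustration of the refinement step, the sketch below drops incomplete or incorrectly entered borrowing records before mapping. The field names (`patron_id`, `book_id`, `borrow_date`) and the date format are assumptions for illustration, not the actual LIS schema.

```python
from datetime import datetime

def refine(transactions):
    """Refinement phase: drop incomplete or incorrectly entered records."""
    clean = []
    for t in transactions:
        # incomplete: a critical field is missing or empty
        if not all(t.get(f) for f in ("patron_id", "book_id", "borrow_date")):
            continue
        # incorrect: date not in the expected format
        try:
            datetime.strptime(t["borrow_date"], "%Y-%m-%d")
        except ValueError:
            continue
        clean.append(t)
    return clean

raw = [
    {"patron_id": "u1", "book_id": "b1", "borrow_date": "2010-03-05"},
    {"patron_id": "u2", "book_id": "b2", "borrow_date": "05/03/2010"},  # bad date
    {"patron_id": "",   "book_id": "b3", "borrow_date": "2010-04-01"},  # incomplete
]
print(len(refine(raw)))  # 1
```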

2.2 Knowledge Mapping Algorithm and Visualization

We assume that if a researcher borrows a book, he or she has knowledge about its topic or is interested in it. Then, if the researcher studies two books from different subjects, he or she has knowledge in both of them and may be able to do research in interdisciplinary fields connecting the two subjects. To map organizational knowledge, we need a template for partitioning knowledge into distinct areas, one that also supports knowledge flows in the organization. Based on this requirement, and since books and other library documents are classified by LC code, we decided to apply Library of Congress (LC) categories as the knowledge partitioning template. From this point of view, we calculated the relation between every binary combination of subjects by counting the books that each user studied from those subjects. Furthermore, we calculated the size of every subject, as a measure of organizational knowledge on that subject, by aggregating the number of books borrowed by all library users in that field.

Formulas (1) and (2) form the core of our algorithm for drawing the organizational knowledge map from the borrowing logs. O_i denotes the size of subject i in the map, where N_im is the number of books studied by user m from subject i; the books studied by all users in that subject are aggregated to calculate its size. To reduce the difference between the largest and smallest subject sizes, the result is transformed by the natural logarithm (ln). In formula (2), W_ij denotes the weight of the relation between subjects i and j, where N_im and N_jm are the numbers of books that user m borrowed from subjects i and j. We assume that this researcher has interdisciplinary knowledge connecting those subjects equal to the minimum number of books that he or she studied from each of them. This yields a cross matrix W of relations between subjects and a matrix O of subject sizes for drawing the knowledge map.

The final output of this algorithm is an undirected graph composed of K nodes representing subjects, with a connection between nodes i and j whenever W_ij > 0.


O_i = ln( Σ_{m=1}^{k} N_im )                  (1)

W_ij = Σ_{m=1}^{k} min(N_im, N_jm)            (2)
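A minimal sketch of formulas (1) and (2) in Python. The nested mapping `logs[user][subject] -> number of borrowed books` is an assumed input structure for illustration, not the paper's actual data model.

```python
import math
from itertools import combinations

def subject_sizes(borrow_counts):
    """O_i = ln(sum over users m of N_im), as in formula (1)."""
    totals = {}
    for per_user in borrow_counts.values():
        for subject, n in per_user.items():
            totals[subject] = totals.get(subject, 0) + n
    return {s: math.log(t) for s, t in totals.items()}

def relation_weights(borrow_counts):
    """W_ij = sum over users m of min(N_im, N_jm), as in formula (2)."""
    subjects = sorted({s for u in borrow_counts.values() for s in u})
    weights = {}
    for i, j in combinations(subjects, 2):
        w = sum(min(u.get(i, 0), u.get(j, 0)) for u in borrow_counts.values())
        if w > 0:  # only connected subject pairs become edges
            weights[(i, j)] = w
    return weights

logs = {"user1": {"HD": 3, "Z": 2}, "user2": {"HD": 1, "Z": 4, "T": 2}}
O = subject_sizes(logs)      # e.g. O["HD"] = ln(3 + 1)
W = relation_weights(logs)   # W[("HD", "Z")] = min(3, 2) + min(1, 4) = 3
```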

3. IRANDOC CASE STUDY

The Iranian Research Institute for Information Science and Technology (IRANDOC) is an institute affiliated with the Ministry of Science, Research, and Technology (MSRT), established to work in the fields of information science and technology and librarianship. The IRANDOC library is organized according to the Library of Congress Classification and run as an open-shelf system. Its users comprise university professors, students, researchers, and IRANDOC staff. Based on the library collection policy, the IRANDOC library at present covers the following subjects: Information Science and subjects related to Library Science, Information Systems Management, Information Technology, Information Analysis, Knowledge and Information Management, Linguistics, Computerized Terminology, and Technology.

Since research institutes are knowledge-intensive organizations, it is crucial for them to track and explore their knowledge flows. They also need to know their abilities in order to plan a vision for the future. Most of IRANDOC's previous research projects were interdisciplinary and related to information science and technology; moreover, information science itself has an interdisciplinary soul (Saracevic, 1995). Hence, deciding about future research requires knowledge of the interdisciplinary domains in which IRANDOC has the most experience and knowledge assets. We therefore decided to draw the IRANDOC knowledge map, based on our algorithm, to find the interdisciplinary knowledge capital of the organization. As noted before, we assume that if more researchers study more books from two subjects, then those subjects are more strongly connected than others, and IRANDOC can establish its future research projects on them.

3.1 Data Collection and Preprocessing

The IRANDOC library holds about 14,000 Latin books. Librarians manage the collection using LIS software, which has a Microsoft SQL Server database and a report-generating module. We started by querying the LIS and exporting the data to Microsoft Excel files. Two main queries were formed, on Latin book metadata and on borrowing transaction data from the last five years. Based on the gathered data, a new data model was designed. Figure (2) shows the entity relationship diagram of this data model. It consists of four tables in a star-schema architecture. The Book, Transaction, and Patron tables are general tables; we built the LC Categories table based on the Library of Congress Classification (LCC) to help formulate better queries involving subjects. All tables are connected, directly or via other tables, so we can make complex queries to find relations between all dependent fields.


Figure 2. Entity relationship diagram of books borrowing logs.

After building the data-model skeleton in MS Access, we populated it with the gathered data from the Microsoft Excel files. There were some problems with data types, as well as with data that had been entered incorrectly. Some records were inconsistent with others and had to be omitted, and some critical incomplete fields had to be filled with correct data. In the end, we had data on 12,000 Latin books, about 140 users, and 4,655 borrowing transactions from the last five years.
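The star schema described for Figure (2) can be sketched in SQLite as follows. All table and column names here are illustrative guesses, not the actual data model; the paper's Transaction fact table is named Borrowing because TRANSACTION is a reserved word in SQL. The final query aggregates borrowings per LC class, the raw input to formula (1).

```python
import sqlite3

# Hypothetical star schema mirroring the four tables of the ER diagram:
# one fact table (Borrowing) and three dimensions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Patron        (patron_id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE LC_Categories (lc_class  TEXT PRIMARY KEY, description TEXT);
CREATE TABLE Book          (book_id   TEXT PRIMARY KEY, title TEXT,
                            lc_class  TEXT REFERENCES LC_Categories(lc_class));
CREATE TABLE Borrowing     (txn_id    INTEGER PRIMARY KEY,
                            patron_id TEXT REFERENCES Patron(patron_id),
                            book_id   TEXT REFERENCES Book(book_id),
                            borrow_date TEXT);

INSERT INTO LC_Categories VALUES ('HD', 'Industries, land use, labor'),
                                 ('Z',  'Library science');
INSERT INTO Patron VALUES ('u1', 'A researcher');
INSERT INTO Book VALUES ('b1', 't1', 'HD'), ('b2', 't2', 'Z'), ('b3', 't3', 'Z');
INSERT INTO Borrowing VALUES (1, 'u1', 'b1', '2010-01-01'),
                             (2, 'u1', 'b2', '2010-01-02'),
                             (3, 'u1', 'b3', '2010-01-03');
""")

# Borrowings per LC class: the aggregate fed into the subject-size formula.
totals = dict(conn.execute("""
    SELECT b.lc_class, COUNT(*)
    FROM Borrowing t JOIN Book b ON t.book_id = b.book_id
    GROUP BY b.lc_class
"""))
print(totals)  # {'HD': 1, 'Z': 2}
```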

3.2 Algorithm Implementation and Visualization

Our proposed knowledge mapping algorithm was implemented in the case study using MS Access and the SQL scripting language. To build the matrix of subject sizes (O), a view was created that aggregates all books borrowed in each field by all users over the last five years. Building the weight matrix of relations between binary combinations of subjects was more complicated: a complex script was developed to produce the cross matrix over all combinations. Table 1 shows a slice of this matrix for some of the most studied subjects. The HD category contains books in the fields of industries, land use, and labor; Z includes bibliography, library science, and general information resources; HF and T are commerce and technology, respectively.

Table 1. Matrix of relation between subjects.

LC  | HD  | HF  | T   | Z
HD  |     | 71  | 74  | 124
HF  | 71  |     | 43  | 48
T   | 74  | 43  |     | 54
Z   | 124 | 48  | 54  |

The SQL script for producing subject relations yielded a 71×71 cross matrix. To draw the final IRANDOC knowledge map from the extracted data, an open-source add-in for MS Excel named NodeXL2 was used. Figure (3) shows the initial map drawn with the aid of NodeXL: a complex graph in which many vertices are connected to each other, directly or indirectly, via edges.

2 http://www.codeplex.com/NodeXL


Figure 3. Initial IRANDOC knowledge map based on library users borrowing logs.

Figure (4) shows the same graph filtered to show the most related subjects. The size of the sphere at each node indicates the total number of books studied by IRANDOC researchers in that subject. The width of the bars connecting subjects shows the strength of the relation, as defined in formula (2).

Figure 4. Filtered graph of IRANDOC most related subjects.
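The filtering behind the graph of most related subjects can be approximated by thresholding the relation weights. The sketch below uses the pairwise values from Table 1; the threshold of 60 is arbitrary, chosen only for illustration.

```python
def filter_edges(weights, min_weight):
    """Keep only subject pairs whose relation weight W_ij reaches the threshold."""
    return {pair: w for pair, w in weights.items() if w >= min_weight}

# Pairwise weights taken from Table 1 (HD, HF, T, Z subjects).
W = {("HD", "HF"): 71, ("HD", "T"): 74, ("HD", "Z"): 124,
     ("HF", "T"): 43, ("HF", "Z"): 48, ("T", "Z"): 54}
strong = filter_edges(W, 60)
print(sorted(strong))  # [('HD', 'HF'), ('HD', 'T'), ('HD', 'Z')]
```

Note that every surviving edge involves HD, which matches the paper's observation that HD acts as a hub connecting other topics.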

4. CONCLUSION AND FUTURE WORK

Examining the output graphs reveals useful information about the state of knowledge in IRANDOC, especially in interdisciplinary fields. For example, figure (4) shows that the HD category (Industries, land use and labor), which


contains books about management science, policy making, decision making, organizational productivity, and industrial management, plays an important role in connecting other topics together. HD is thus a hub relating other topics, and IRANDOC can invest in its subcategories in order to handle future projects. It also means that researchers who have knowledge in this area would be good assets for managing research projects. The figure also shows that class Z (Bibliography, Library Science and General Information Resources) is the most studied subject according to the historical data, and that its relation with the HD class is stronger than with other subjects. This evidence validates IRANDOC's background in library science and shows decision makers that their organization has enough knowledge to succeed in research projects combining these two subjects.

The map can also be used for other purposes, such as finding ways to handle interdisciplinary research projects. As another example, if IRANDOC wants to carry out a research project that needs knowledge of psychology and industrial management, analyzing the visual knowledge map can provide good guidance. According to figure (4), these two subjects are not strongly connected directly, but sociology connects them. It can thus be inferred that researchers who have knowledge in sociology might play a critical role in such projects.

In conclusion, knowledge mapping with this methodology gives valuable information to organizational decision makers. It also clarifies the information seeking behavior of knowledge workers and reveals deviations between core strategies and knowledge operations. Knowledge mapping can help managers and stakeholders recognize their organization's knowledge assets. In this research, we used a straightforward way of mapping knowledge based on real data. Adding more metadata, or connecting external databases such as those of research projects, would enrich the map.

In this study we treated subjects as knowledge nodes. However, research groups or even individual books might be considered as graph vertices, so that the network shows how groups are connected. One of the most useful techniques recently applied in knowledge management is data mining; techniques such as clustering based on users, books, or projects may yield more knowledge about the organization's knowledge capital.

REFERENCES
Chan, K. and Liebowitz, J., 2006. The synergy of social network analysis and knowledge mapping: a case study. International Journal of Management and Decision Making, Vol. 7, pp. 19-35.
Dalkir, K., 2007. Knowledge Management in Theory and Practice. Elsevier, Butterworth-Heinemann.
Eppler, M. J., 2001. Making knowledge visible through intranet knowledge maps: concepts, elements, cases. IEEE, 9 pp.
Grey, D., 1999. Knowledge mapping: a practical overview. SWS Journal.
Huijsen, W. O., Driessen, S. and Jacobs, J., 2004. Explicit conceptualizations for knowledge mapping.
Hult, G. T. M. and Ketchen, D. J., 2006. Knowledge as a strategic resource in supply chains. Journal of Operations Management, Vol. 24, pp. 458-475.
Jafari, M., Akhavan, P., Bourouni, A. and Amiri, R. H., 2009. A framework for the selection of knowledge mapping techniques. Journal of Knowledge Management Practice, Vol. 10.
McLure Wasko, M. and Faraj, S., 2000. "It is what one does": why people participate and help others in electronic communities of practice. Journal of Strategic Information Systems, Vol. 9, pp. 155-173.
Saracevic, T., 1995. Interdisciplinary nature of information science. Ciência da Informação, Vol. 24, pp. 36-41.
Song, I. Y. and LeVan-Shultz, K., 2010. Data warehouse design for e-commerce environments. Advances in Conceptual Modeling, pp. 374-387.
Wexler, M. N., 2001. The who, what and why of knowledge mapping. Journal of Knowledge Management, Vol. 5, pp. 249-264.


WIKI VS. EMAIL - UNDERSTANDING COLLABORATION WITHIN VIRTUAL COMMUNITIES Osama Mansour Department of Computer and Information Science, Linnaeus University DFM, SE – 351 95, Växjö, Sweden

ABSTRACT

Email has for many years been an indispensable organizational tool for personal communication and group collaboration. Recently, however, the evolution of wiki technology has introduced novel forms of open collaboration and flexible communication, and more organizations are adopting and using this technology at the workplace. This paper reports results from an interpretive case study which explored this evolution in collaborative and communicative practices. It examined how members of communities of practice perceive the differences between using a wiki and email for collaboration within their communities. The case was primarily based on 16 interviews at a large multinational organization. The paper concludes with rich insights into five themes which characterize the major differences between wiki and email collaboration: nature or purpose of use, patterns or forms of collaboration, technological characteristics, representation of content, and habitual behavior.

KEYWORDS

Wiki, Email, Collaboration, Communities of Practice (CoPs), Knowledge Sharing.

1. INTRODUCTION

The advent of social media has introduced novel ways of collaboration, communication, and knowledge sharing in organizations. Social media is defined as a group of internet-based applications that build on the ideological and technological foundations of Web 2.0 and allow the creation and exchange of user-generated content (Kaplan & Haenlein, 2010). In this respect, Web 2.0, the platform on which social media evolved, describes technologies like wikis, blogs, social networks, etc. Many scholars argue that social media is changing the way people interact and work together (Hirschheim & Klein, 2010; Majchrzak, 2009). For instance, Stenmark (2008) explained that social media involves new ideas, services, and attitudes on the web. Others, such as Kaplan & Haenlein (2010), maintained that while Web 2.0 is argued to reach back to the old roots of the web, it still involves technical advances which enable novel forms of virtual collaboration and communication that are fundamentally different from earlier technologies.

Long before social media, however, email existed as a major, established communication channel, ingrained in organizations as the most widely used communication technology after the telephone (Wagner, 2004). Email has a long history as a communication tool for both personal and group interactions (Muller & Gruen, 2005; Whittaker & Sidner, 1996). Email, or electronic mail, is a one-to-one or one-to-many tool used for asynchronous interactions, with no central knowledge repository or knowledge organization facility (Wagner, 2004; Markus, 1994). In this respect, email, together with other technologies such as electronic discussion forums and bulletin boards, is often used by electronic networks of practice to extend their reach of collaboration and sharing on the web (Wasko & Faraj, 2005).

Each of these technologies has different characteristics and allows for various forms of collaboration and interaction on the web. Hence, this paper focuses on understanding perceived differences in using the wiki and email to enable collaborative practices among distributed members of CoPs in a large multinational organization. It seeks to find out how members of CoPs use wikis and email for knowledge collaboration and sharing within an organizational setting. The paper eventually contributes rich insights into the differences between using


the wiki and the email for collaboration based on five major themes: nature or purpose of use, patterns or forms of collaboration, technological characteristics, representation of content, and habitual behavior.

2. RELATED LITERATURE

2.1 Web-based Collaboration

Web-based collaboration has several synonyms in the literature, including electronic collaboration (eCollaboration), virtual collaboration, and computer-supported collaboration. In this respect, Kock (2005) defined eCollaboration as collaboration using electronic technologies (e.g. wikis, email) among different individuals to accomplish a common task. He further identified six key conceptual elements of eCollaboration: the collaborative task, the eCollaboration technology, the individuals involved in the task, the mental schemes processed by those individuals, the physical environment surrounding the individuals, and the social environment surrounding them. Nowadays, organizations adopt collaborative technologies, such as email, discussion forums, listservs, etc., in order to enable and support collaboration within their virtual and distributed teams or CoPs (Wasko & Faraj, 2000). More recently, the evolution of social media and Web 2.0 technologies has brought various collaborative technologies which facilitate flexible collaboration and participation as well as the development of user-generated content on the web (Kaplan & Haenlein, 2010; Stenmark, 2008). Against this background, many scholars have argued that these technologies are changing the way people work and interact together and are introducing new possibilities for organizations to collaborate and share (Hirschheim & Klein, 2010; Majchrzak, 2009; McAfee, 2006).

2.2 Collaboration Tools

2.2.1 Wiki

A wiki is often described as a simple tool which allows anyone to freely and openly collaborate in the creation of knowledge (Yates et al., 2010; Stenmark, 2008; Hasan & Pfaff, 2006). It consists of a set of web pages which are dynamically updated by a group of collaborating users who continually add and edit content in these pages and determine the relationships among them (Hasan & Pfaff, 2006; Wagner, 2004). Generally, a wiki can best be described as a collaborative authoring tool which allows the creation of multiple documents by a large number of people (Happel & Treitz, 2008; Augar, 2005).

Openness is one of the major aspects of the wiki. It implies that anyone should have access to the wiki and be able to collaborate freely with others to share and edit content (Wagner, 2006; Stenmark, 2005). This ability to freely edit and change wiki documents is often referred to as open editing, which has a substantial effect on maintaining a democratic use of the wiki (Rafaeli & Ariel, 2008; Augar, 2005). In this respect, Hasan & Pfaff (2006) and Wagner (2006) described the wiki as an emergent conversational technology that allows for a democratic use of information systems in organizations through conversations, and particularly for a bazaar-like collaboration in which people voluntarily engage together to share and manage their knowledge collaboratively. Another important aspect of the wiki, versioning or version control, allows people to see recent changes and the history of changes on the wiki; this is important for maintaining the quality of contributions (Korica et al., 2006; Wagner, 2006). Further aspects of the wiki include linking and creating pages, a simple markup language, a fluid structure, and incremental development (Wagner, 2006). In addition, Yates et al. (2010) described a process called shaping, which allows people not only to contribute and edit their knowledge collaboratively but also to shape it by rewriting, reorganizing, and integrating. Shaping is considered an important property of the wiki since it transforms the wiki into an evolving knowledge platform through continued open editing and flexible interactions among people. However, this has also created challenges for organizations with respect to the quality and reliability of contributed knowledge (Danis & Singer, 2008).


2.2.2 Email

Email is considered "the father" of all electronic collaboration technologies and is arguably the most dominant and widely used technology in organizations after the telephone (Kock, 2005). Email, or electronic mail, is described as a one-to-one or one-to-many asynchronous communication and conversation tool without a central knowledge repository or knowledge organization facility (Wagner, 2004; Markus, 1994). Jarvenpaa & Staples (2000) described email as a computer-based collaborative system that allows information sharing within and across organizations, encouraging the free sharing of ideas as well as sharing in the form of structured repositories.

In their early days, emails were conceived as a means to replace or extend communication possibilities in organizations (Haythornthwaite, 2001). In this respect, Haythornthwaite (2001) believed that email could shift communications from face-to-face to electronic among weakly tied communicators, as well as reinforce a strong social network among strongly tied communicators. Others, such as DeSanctis & Monge (1999), further explained that email complements general work networks and allows for a more participative and diverse environment as well as less formal relationships and hierarchy. In this respect, knowledge culture and values can affect how electronic mail flattens communication hierarchies in organizations (Jarvenpaa & Staples, 2000; DeSanctis & Monge, 1999).

Furthermore, Lee (1994) discussed information and communication richness in email and its capacity to process rich information. He explained that email is a lean medium because it lacks the capability for immediate feedback, uses only a single channel, filters out significant cues (e.g. body language), and tends to be impersonal. In the same vein, Dennis & Kinney (1998) discussed the immediacy of feedback, defined as the extent to which a medium enables users to give rapid feedback on the communications they receive. They divided it into two types: concurrent feedback, provided simultaneously with the delivery of the message, and sequential feedback, which often takes the form of non-verbal gestures and occurs when the sender pauses and communicates to confirm or redirect his or her presentation of the message. Further, Sarbaugh-Thompson & Feldman (1998) discussed two effects of the lack of cues in email communication. One was the limited range of communication with email: negotiations, for example, might be difficult through email and require a richer medium. The other concerned communication equality, in the sense that existing social hierarchies do not greatly affect people's electronic communication. Similarly, Markus (1994), drawing on information richness theory, examined senior managers' perceptions of electronic mail for organizational communication. He found that managers perceived email as a lean rather than a rich medium; nevertheless, it was also the primary medium for internal work-related communication.

3. METHOD

The current paper is part of a larger case study at a large multinational organization called Consolidated Contractors Company (CCC). CCC specializes in civil and mechanical construction and ranks 13th on the Engineering News-Record (ENR) list of international contractors. It has over 170,000 employees distributed among various projects across the globe. The case in this paper concerns a recent initiative to introduce the wiki in order to enable flexible collaboration and knowledge sharing among dispersed employees at CCC. This initiative was motivated by the complex distribution of a large number of employees and the lack of flexible systems that would allow dynamic collaboration and sharing among them. CCC had long used several traditional document management systems to provide a basis for all content in the organization, but these systems were insufficient for the flexible collaboration and knowledge sharing that would help leverage the knowledge and experience of distributed employees. Accordingly, the company decided to establish a Knowledge Management (KM) department, charged with developing a collaborative platform which would allow these employees to collaborate and share knowledge with each other. Equally important, the KM department was also charged with establishing CoPs to provide a basis for employees sharing common professional interests and to stimulate collaboration among them through the wiki. While the wiki has been the primary focus of our investigation, we also sought to understand how other tools, particularly email, were used to enable and support collaboration among members of CoPs.


To address this issue, an interpretive case study was used as the vehicle for our empirical inquiry. The choice of an interpretive case study was motivated by the need to emphasize human beings' meanings and interpretations (Walsham, 1995; Walsham, 2006), which are necessary to understand how community members perceived their collaboration and interaction using a wiki or email. The interpretive case study also helped us obtain data from participants in their real-life settings (Yin, 2009; Eisenhardt, 1989; Trauth, 2001). Primary empirical data were obtained through sixteen interviews with junior and senior employees as well as top managers at CCC. These interviewees were self-selected after an email invitation had been sent by the KM department to a random sample of community members. The selection criteria included age, seniority level, activity within CoPs, technological background, gender, and geographical distribution. As a result, we had five junior employees, three senior employees, and eight managers; of these, four were lurkers and twelve were active contributors. It is worth mentioning that the manager participants had different roles within their communities. Some were community captains, responsible for driving discussions on the wiki, motivating others to join the community, and suggesting topics for discussion, while others were community managers, whose role was to establish the foundation of the community, nominate community captains, etc. Our sample was also geographically diverse, covering Australia, Europe, the Middle East, and CIS countries.

Pertaining to the actual data collection, the case study protocol included questions derived from the literature on wikis and email, together with general information about the study such as time, purpose, and confidentiality issues. The protocol was used to guide our conversations with the interviewees, in an informal rather than rigid way, to allow a fluid stream of questions and the emergence of new aspects that might further support our investigation (cf. Yin, 2009). The analysis of the interview data was based on a hermeneutic approach, which emphasizes a holistic understanding of the whole through an understanding of the parts of a situation and their interrelationships with respect to the whole (Cole & Avison, 2007; Klein & Myers, 1999). In other words, the data were read through to identify important parts linked to the purpose of the study. We then used open coding (Rowland, 2003; Trauth & Jessup, 2000) to develop codes representing each part, and combined these codes through axial coding to develop general themes interrelated with each other. Each interview transcript was subjected to both open and axial coding to allow an iterative understanding of the parts with respect to the whole. Furthermore, cross-transcript analysis was conducted to establish further themes across transcripts. Finally, the validity of the data was tested by member checks to increase its robustness.

4. FINDINGS

While community members at CCC used their email to receive notifications about newly added content on the wiki, email was still used for other primary purposes, mainly personal communication and group collaboration. At the same time, members also used the wiki to support collaboration and interaction within their communities. As such, the wiki and email were both used to enable collaboration and interaction among employees at CCC. However, they differed in the way community members used each tool to collaborate and interact, as well as in their particular affordances. A mechanical construction manager explained these differences as follows:

"....The email…is a quick, versatile means of communication that has replaced the telex, fax… I think this is the number one tool for communication. The wiki is another communication tool but it has certain characteristics which, for the application of KM, is indispensable and cannot be replaced. You cannot replace it with emails; you would never reach the recipients, and the recipient would never have access to information that the wiki could provide if the wiki was not there…"

The head of IT Systems described these differences with an example from his daily fieldwork, showing the different approaches to collaboration and interaction using email or a wiki. He said:

"I receive an email stating that there is a bug in application X. Two months later the same bug is discovered again and a request is sent on how to fix it. So most of the time we receive the same requests, many times, from different locations. If we have this solution on the wiki then any person can go and search


ISBN: 978-972-8939-40-3 © 2011 IADIS

for a certain, let’s say, bug or error … They can search for the solution. If they have any further questions, they can elaborate on the same page and it will be saved for later.”

Further, a group technical manager described his general perception of the differences between an email and a wiki. His view of these differences is primarily determined by the kinds of interactions each tool is suitable for. In his words:

“…The email is for day-to-day routine communication needs, its versatility …in every corner of the globe; you can read an email, so the email is indispensable. But the wiki for specific applications is also indispensable. What the wiki can do, the email cannot do.”

In this respect, we sought to learn more about how and why community members used the wiki to share knowledge and experience with each other. We asked a group technical manager about this and he said: “…All information is there and to tap that information, I have to use it [the wiki]. It is the platform that carries all the information available. There are no other means.” We then asked him why he thought there were no other means. He said:

“One could say, for example, we can use normal emails, but again there is a limitation in the size of emails you can transfer and emails are individual pieces of information which you cannot trace back in an easy manner. In contrast, in the wiki you can search, you can find any subject by just putting a word on it, and it will give you all the related articles that have been written on that, and it is an easy, practical tool; it has lots of advantages over any other means.”

At that point, a mechanical construction manager commented on the impact of emails and wikis by saying that: “...
the impact of the wiki, I must say, is not as big as the impact of the email system…Our habits have changed as a result of emails, but our habits have not changed as a result of the wiki.”

He further explained the habitual use of email communications in everyday life:

“...Everybody communicates with emails. Emails have now been accessible in a practical manner through specialized mobile phones, like BlackBerry and iPhones, which have email facility. So you have the tools that make emails more practical. But that is for specific communication items. The subjects that we handle through the wiki, which is KM, are different. It’s also a communication tool but not all people go to the wiki every day, every morning, as a first job they do. Their first job is not to open the wiki; the first job is to open emails.”

This perception emphasizes important habitual practices that determine the greater use of emails compared to wikis. The ubiquity of emails in everyday life and the accessibility of email applications through regular daily devices, such as mobile phones, are major factors that motivate people to use their email more than the wiki. At the same time, this perception also implies that wikis and emails are each suited for specific and different forms of communication and interaction. For instance, one of the major aspects of the wiki is that it allows for open social interactions among community members, providing them with opportunities to meet new people and establish new professional relationships. This might not be supported by private emails, where communications and interactions are personal and restricted to a specific number of people. The head of IT Systems explained social interaction as follows:

“Social interaction, for example, if I am discussing any point, I sometimes know people based on my experience during project visits.
But there are other people whom I don’t know, so it will give me an opportunity to know more about them and their titles or functions within the project or the company. So it is introducing more people through this media instead of just sitting and knowing the persons around you.”

In a similar sense, a group plant manager said:

“...You can also go to the wiki, visit these communities, look up a certain article, and you see faces or friends. It’s completely different from emails... You are sharing with others...you feel, especially when categorizing these communities into different disciplines, you are part of a group or family. That feeling you don’t have with other means of communication.”

The mechanical construction manager added:

“…I have access to a lot of information, which was previously hard to get, and it might have been difficult to find easily who had had the best information within our company. But now I can easily access this man, and this piece of information [using the wiki]…”


IADIS International Conferences Web Based Communities 2011, Collaborative Technologies 2011 and Internet Applications and Research 2011

Further, the group plant manager explained that the open and informal nature of a wiki stimulates collaboration and increases its scope compared to the formal nature of an email. He said:

“...To a certain extent it is a less formal means of communication so people would really voluntarily be more open to write things. You know if I want to send an article by email I will be more selective.... People see ... that they are free to put ideas and select things and contribute.”

A senior systems administrator further described the free and open nature of a wiki and its influence on motivating people to collaborate and share their knowledge with others. She said: “...When you see more people online and more people sharing their opinions, posting things, and so on, you feel more motivated.”

However, a group quality manager explained the problems and barriers that have impeded sharing and collaboration using a wiki. These relate to what we have previously said about the ubiquity and accessibility of emails in everyday life:

“...Emails are available everywhere so a lot of knowledge is being shared through emails and not necessarily through the wiki ... Emails probably are easier and faster so there is a lot of knowledge and exchanges through the normal email route.”

One mechanical manager elaborated on this and explained how an email could be more useful for particular forms of interaction that differ from the interactions supported by the wiki:

“With emails, you can discuss more problems and get direct answers...Because, through emails, you discuss day-to-day problems.
Any time you have a problem at site, you send an email or receive comments...but the wiki system is used as a reference for information, already discussed and put there for a reference.”

A construction manager added briefly: “...The email, Outlook in particular, is faster than the wiki when it comes to getting information.”

As such, an email is more suited to addressing daily problems and issues that require immediate feedback and quick answers. The wiki, in contrast, is more suited for discussing problems and sharing content that can be used for later purposes. For instance, it can be used to read about specific work methods discussed by the community. In this respect, a control project manager (Australia) explained his perception of shared knowledge on the wiki as follows: “The wiki is more detailed ... To me it is like a base of information related to the job.” He further gave an example to illustrate how shared knowledge on the wiki could help members learn from other community members who share and discuss their experiences and knowledge on the wiki. He said:

“...It’s learning ... Maybe I was doing some work which I used to do in a particular way in the last ten years. And now probably there is somebody in South Africa who is doing it in a different way using efficient equipment, and they post it into the system. Of course I am learning.”

In addition, a control project manager (Oman) predicted different directions in which emails and wikis could be used. He argued:

“I think even with more development and improvement of the wiki, email will still be there. But it will not be maybe in the same amount that we have now. Maybe ... for confidential things and private things, maybe you will use emails. For things that can be shared, then you will use the wiki.”

Though the group plant manager agreed with that, he voiced his concerns about the confidentiality of sharing knowledge on the wiki and the potential risk of it becoming accessible to competitors: “...
There was really a debate about, let's say, whether we should do that or I mean, let’s say, what sort of knowledge will be shared ... I mean one of the things ... is that our competitors may use this wealth of knowledge, which I am against.”

5. DISCUSSION AND CONCLUSIONS

In our investigation of how members of CoPs used wikis and emails for knowledge collaboration and sharing, we sought to increase our understanding of several aspects of collaboration that describe the differences in using the wiki and email. We found five major themes that characterized these differences:



nature or purpose of use, patterns of collaboration, technological characteristics, representation of content, and habitual practices. Each theme represents particular differences with respect to how community members perceived collaboration with either technology.

The first theme, the nature or purpose of using a collaborative technology, is related to what Kock (2005) described as the collaborative task. Kock (2005) explained that the nature of the collaborative task can have a strong effect on its outcomes when certain electronic collaboration technologies are used. In this respect, both wikis and emails share the purpose of enabling collaboration, communication, and knowledge sharing among people (cf. Yates et al., 2010; Jarvenpaa & Staples, 2000). However, participants in this research revealed several variations in the nature of the collaborative tasks that a wiki or an email could support. For instance, an email was described as a tool useful for communicating immediate feedback, in the sense of sharing daily project problems with others and eventually receiving quick answers. While this is an asynchronous process of collaboration, where feedback often requires some time to be delivered or received, it is still faster than the wiki. Many participants described the wiki as a conversational technology (cf. Hasan & Pfaff, 2006; Wagner, 2004) suitable for discussions and negotiations, and noted that it might be useful for future referencing purposes rather than for getting quick feedback or answers. The immediacy of feedback (cf. Dennis & Kinney, 1998) represents a major difference with respect to the nature or purpose of using the wiki or the email. In other words, the collaborative tasks supported by an email are often expected to provide or intend immediate feedback.
In the wiki, the nature of collaborative tasks depends on rich conversations and negotiations among community members, and thus feedback is not expected to be immediate but rather gradual and indirect.

The second theme concerns patterns or forms of collaboration. This theme is directly related to the third theme: technological characteristics. Emails and wikis enable and support various forms of collaboration and communication depending on their technological characteristics. We found that, for instance, the open and free nature of the wiki (Yates et al., 2010; Happel & Treitz, 2008; Cunningham, 2004) allowed for a flexible, informal, transparent, and community-based approach to collaboration and interaction. Our participants explained that the openness of the wiki facilitated free collaboration and sharing among community members, in which anyone could freely and openly create and shape knowledge (cf. Yates et al., 2010). Most importantly, the openness of the wiki (cf. Mansour et al., forthcoming) allows for transparent interactions and open sharing within the community. The ability to openly contribute knowledge that is accessible by anyone creates opportunities for recognizing frequent contributors in particular domains, as well as for establishing relationships with them. While an email can strengthen relationships among members with strong ties (Haythornthwaite, 2001), open interactions using a wiki allow members to meet new people and establish a wider social network at work. Further, our participants explained the importance of this kind of open sharing for facilitating transparency of interactions within the community. Such transparency was considered a means to motivate collaboration and interaction, since community members tend to be more motivated to share and collaborate when they can see others actively participating in sharing their knowledge with the community.
In contrast, the email was described as more personal and confidential: participants tended to be selective when they communicated and shared knowledge with others, and thus collaboration was limited to one-to-one or one-to-many communication channels (cf. Wagner, 2004; Sarbaugh-Thompson & Feldman, 1998; Markus, 1994). By and large, the different technological characteristics of emails and wikis determine and shape several variations in the patterns or forms of collaboration and interaction through these technologies.

Fourth, the representation of content in both wikis and emails is also linked to the technological characteristics of these tools. Many participants explained that the way content was represented or communicated had a determining impact on shaping and driving collaboration. In an email, for instance, content is scattered across several email messages. Our participants explained that this unstructured representation of content made it difficult to find necessary information. It also implies that knowledge is owned by individuals rather than by the community. On the contrary, content on a wiki is created and developed by the community. It is also located in one place, so that all community members can participate in creating and shaping knowledge collaboratively. In fact, the way in which knowledge is communicated and shared through a wiki or an email determines how knowledge is represented. Email communications are most often individual and thus result in scattered information lacking a common repository, integrity, and consistency (cf. Wagner, 2004; Markus, 1994). Wiki-based communications, in contrast, are collaborative, involving many people who work with each other to develop shared content.



Finally, our participants considered habitual behavior a major factor determining whether to use a wiki or an email to collaborate and share with others. Habitual practices were observed to represent a difference in why community members would choose the wiki or the email. Our participants further explained that the ubiquity and accessibility of email applications through daily devices, such as mobile phones, made people more accustomed to using emails for collaboration and communication in their everyday lives. Moreover, since emails are designed to send and receive quick information, people tend to favor them over wikis for day-to-day communications. Accordingly, our participants explained that the fact that emails existed long before wikis was a major factor in developing habitual practices, leading them to use the email more than the wiki in their everyday lives.

REFERENCES

Augar, N. et al., 2005. Towards building web based learning communities with wikis. Proceedings of the IADIS International Conference on Web Based Communities, Algarve, Portugal, pp. 207 – 214.
Cole, M. and Avison, D., 2007. The potential of hermeneutics in information systems research. European Journal of Information Systems, Vol. 16, No. 6, pp. 820 – 833.
Cunningham, W., 2004. Wiki design principles. Available at: http://c2.com/cgi/wiki?WikiDesignPrinciples [Accessed March 2011].
Danis, C. and Singer, D., 2008. A wiki instance in the enterprise: Opportunities, concerns, and reality. Computer Supported Cooperative Work, San Diego, USA, November 8 – 12.
Dennis, A. and Kinney, S., 1998. Testing media richness theory in the new media: The effects of cues, feedback, and task equivocality. Information Systems Research, Vol. 9, No. 3, pp. 265 – 274.
DeSanctis, G. and Monge, P., 1999. Communication processes for virtual organizations. Journal of Computer Mediated Communication, Vol. 10, No. 6, pp. 693 – 703.
Eisenhardt, K., 1989. Building theories from case study research. Academy of Management Review, Vol. 14, No. 4, pp. 532 – 550.
Happel, H.-J. and Treitz, M., 2008. Proliferation in enterprise wikis. Proceedings of the 8th International Conference on the Design of Cooperative Systems (COOP), Carry-le-Rouet, France.
Hasan, H. and Pfaff, C., 2006. Emergent conversational technologies that are democratizing information systems in organizations: The case of the corporate wiki. Proceedings of the Information Systems Foundations (ISF): Theory, Representation and Reality Conference, Australian National University, Canberra, Australia.
Haythornthwaite, C., 2001. The strength and the impact of new media. Proceedings of the 34th Hawaii International Conference on Systems Sciences, Island of Maui, Hawaii, USA.
Hirschheim, R. and Klein, H., 2011. A short and glorious history of the information systems field. To appear in the Journal of the Association for Information Systems.
Jarvenpaa, S. and Staples, D., 2000. The use of collaborative electronic media for information sharing: An exploratory study of determinants. Journal of Strategic Information Systems, Vol. 9, pp. 129 – 154.
Kaplan, A. and Haenlein, M., 2010. Users of the world, unite! The challenges and opportunities of social media. Business Horizons, Vol. 53, pp. 59 – 68.
Klein, H. and Myers, M., 1999. A set of principles for conducting and evaluating interpretive field studies in information systems. Management Information Systems Quarterly, Vol. 23, No. 1, pp. 67 – 94.
Kock, N., 2005. What is e-collaboration? International Journal of e-Collaboration, Vol. 1, No. 1, pp. i – vii.
Korica, P. et al., 2006. The growing importance of e-communities on the web. Proceedings of the IADIS International Conference on Web Based Communities, San Sebastian, Spain, February 26 – 28, pp. 165 – 174.
Lee, A., 1994. Electronic mail as a medium for rich communication: An empirical investigation using hermeneutic interpretation. Management Information Systems Quarterly, Vol. 18, No. 2, pp. 143 – 157.
Majchrzak, A., 2009. Comment: Where is the theory of wiki? Management Information Systems Quarterly, Vol. 33, No. 1, pp. 18 – 20.
Mansour, O. et al., forthcoming. Wiki-based community collaboration in organizations. Proceedings of the 5th International Conference on Communities and Technologies, Brisbane, Australia.
Markus, M., 1994. Electronic mail as the medium of managerial choice. Organization Science, Vol. 5, No. 4, pp. 502 – 527.
McAfee, A., 2006. Enterprise 2.0: The dawn of emergent collaboration. MIT Sloan Management Review, Vol. 47, No. 3, pp. 21 – 28.
Muller, M. and Gruen, D., 2005. Working together inside an emailbox. In Gellersen, H. et al. (eds.), ECSCW 2005: Proceedings of the Ninth European Conference on Computer-Supported Cooperative Work, September 18 – 22, Paris, France, pp. 103 – 122.
Rafaeli, S. and Ariel, Y., 2008. Online motivational factors: Incentives for participation and contribution in Wikipedia. In Barak, A. (ed.), Psychological Aspects of Cyberspace: Theory, Research, Applications. Cambridge University Press, Cambridge, UK.
Rowlands, B., 2003. Employing interpretive research to build theory of information systems practice. Australasian Journal of Information Systems, Vol. 10, No. 2, pp. 3 – 22.
Sarbaugh-Thompson, M. and Feldman, M., 1998. Electronic mail and organizational communication: Does saying “Hi” really matter? Organization Science, Vol. 9, No. 6, pp. 685 – 698.
Stenmark, D., 2008. Web 2.0 in the business environment: The new intranet or a passing hype? Proceedings of the 16th European Conference on Information Systems, Galway, Ireland.
Trauth, E., 2001. Qualitative Research in IS: Issues and Trends. Idea Group Publishing, London.
Trauth, E. and Jessup, L., 2000. Understanding computer-mediated discussions: Positivist and interpretive analysis of group support system use. Management Information Systems Quarterly, Vol. 24, No. 1, pp. 43 – 79.
Wagner, C., 2004. Wiki: A technology for conversational knowledge management and group collaboration. Communications of the Association for Information Systems, Vol. 13, No. 9, pp. 265 – 289.
Wagner, C., 2006. Breaking the knowledge acquisition bottleneck through conversational knowledge management. Information Resources Management Journal, Vol. 19, No. 1, pp. 70 – 83.
Walsham, G., 1995. Interpretive case studies in IS research: Nature and method. European Journal of Information Systems, Vol. 4, No. 2, pp. 74 – 81.
Walsham, G., 2006. Doing interpretive research. European Journal of Information Systems, Vol. 15, pp. 320 – 330.
Wasko, M. and Faraj, S., 2000. “It is what one does”: Why people participate and help others in electronic communities of practice. Journal of Strategic Information Systems, Vol. 9, No. 2 – 3, pp. 155 – 173.
Wasko, M. and Faraj, S., 2005. Why should I share? Examining social capital and knowledge contribution in electronic networks of practice. Management Information Systems Quarterly, Vol. 29, No. 1, pp. 35 – 57.
Whittaker, S. and Sidner, C., 1996. Email overload: Exploring personal information management of email. Proceedings of CHI’96, pp. 276 – 283.
Yates, D. et al., 2010. Factors affecting shapers of organizational wikis. Journal of the American Society for Information Science and Technology, Vol. 61, No. 3, pp. 543 – 554.
Yin, R., 2009. Case Study Research: Design and Methods. SAGE Publications, London.



A SYSTEM CONCEPT TO SUPPORT ASYNCHRONOUS AHP-BASED GROUP DECISION MAKING

Heiko Thimm
Pforzheim University, School of Engineering
Tiefenbronner Straße 65, 75175 Pforzheim, Germany

ABSTRACT

Group decision making makes it possible to handle complex decision problems. For more than three decades the Analytic Hierarchy Process (AHP) has been successfully applied to such problems, especially when many decision alternatives and criteria are to be considered. We first present a process model for AHP based group decision making that we derived from a practical point of view. We then analyze the limitations of today's available AHP based group decision support systems (GDSS) with respect to decision scenarios where participants are separated in space and/or time. Finally, we propose a GDSS concept that can automatically complete specific group moderation tasks through Group Decision Scripts and that enables effective information sharing through Group Decision Wikis.

KEYWORDS

AHP, Group Decision Making, Collaborative Decision Making, Group Decision Support System

1. INTRODUCTION

Many decisions in today's globalized business world, and also in other organizational contexts, are complex decisions that are difficult for a single person to make. To overcome these difficulties, organizations can request a group of persons (often experts in the subjects related to the decision) to perform a corresponding group decision process in a collaborative effort. Watson et al. (1998) described the ultimate aim of this effort as the achievement of a consensus for one of the possible decision alternatives. This group result is obtained by interpersonal communication (i.e. the exchange of information) among the members for detecting and structuring a problem, generating alternative solutions to the problem, and evaluating the solutions (DeSanctis and Gallupe 1987). The Analytic Hierarchy Process (AHP) developed by Thomas Saaty in the 1970s (Saaty 1980) has been applied successfully in such group decision making processes for more than three decades (Vaidya and Kumar 2006). Even though the AHP was not explicitly devised for group decisions, it offers enormous potential for multi-person decisions through capabilities such as support for conflict resolution, problem and preference structuring, alternative development, group facilitation, consensus building, and fairness (Schmoldt et al. 2001). Several dozen commercial and open source software solutions have been developed that are centered on the AHP. The general type of software under which these solutions can be subsumed is referred to as a Group Decision Support System, or GDSS for short (Gray et al. 2011; Gray 2008; Nunamaker and Deokar 2008; French and Turoff 2007). The first GDSS solutions were mostly developed as single-user desktop software systems.
Today's available software support options for AHP-based group decision making include web versions of these early solutions, enabling multiple users to directly participate in group decisions. However, these web versions are often the result of the usual webification approach (Shen et al., 2007). The principal goal of this approach is to re-organize and enhance pre-existing GDSS software such that users can connect from a browser over the Internet to a backend system that performs the core AHP algorithm. In many cases of AHP-based software this evolutionary approach was followed instead of a radical redesign of the initial (desktop) version that considers the specific requirements of group decision making over the Internet. For that reason, today's available AHP-based GDSS solutions that can be deployed in web scenarios impose some limitations on effective group decision making processes.



Given this situation, in the ongoing research reported in this article we investigate, from a practical point of view, the limitations of AHP-based GDSS. We first develop a process model that describes the main elements of group decision making processes, such as the different phases and activities of such processes and also the main actors. Drawing upon the results of our investigation, we develop a system concept for AHP-based group decision making over the Internet with two unique features. First, through dynamically generated scripts referred to as Group Decision Scripts, the system offers active capabilities that free the decision process moderator from otherwise time-consuming administrative and monitoring tasks. These lower-level active capabilities are complemented by more advanced active capabilities that deal with AHP-specific moderation tasks. Second, functionality specialized for sharing information related to the group, the AHP model, and intermediate and further AHP processing results is directly integrated within the system. This functionality is made available per group decision process in a separate collaboration wiki environment through the concept of the Group Decision Wiki.

Following this introduction, the next section contains our analysis of the broader context of AHP based group decision making. Section three describes our system concept, followed by a discussion of related work in section four. Finally, section five contains our conclusions and the main aspects of our future research.

2. AHP BASED GROUP DECISION MAKING IN PRACTICE

In this section we analyze the broader context of AHP based group decision making in practice. The analysis does not address the relatively rare cases where all steps of the decision making process, and especially the execution of the core AHP algorithm, are completed without any support software. In the following we assume that the core AHP algorithm is completed by a corresponding software solution referred to as the AHP Processing System (APS). In the first part of this investigation we focus on the general core process structure of AHP based group decision making. Thereafter, we identify the main actors and describe their engagement in the different process phases. We then extend our investigation by considering different collaboration modes of AHP based group decision making. In this context we also review the major software support options available today.

2.1 General Core Structure of AHP Based Group Decision Making

A frequently cited model of the group decision process is Simon's five-phase model (Simon 1960), which structures the group decision process into five phases: (1) pre-decision phase, (2) intelligence phase, (3) design phase, (4) choice phase, and (5) post-decision phase. Our research builds on our own, similar model shown in Figure 1, which structures AHP based group decision making into only three phases: (1) set up phase, (2) core AHP execution phase, and (3) conclusion phase. In contrast to Simon's model, our model is specialized to the AHP method and reflects experimental results that we obtained for a set of real-world group decision scenarios to which we applied the AHP method using the easy-mind AHP software.

Set Up Phase: build group; create logical AHP model.
Core AHP Execution Phase: map logical AHP model to physical model; calculate inconsistency ratio; manage inconsistent votes; complete AHP algorithm; perform sensitivity analysis.
Conclusion Phase: build consensus; develop action plan; conclude decision process.

Figure 1. Our three-phase model of AHP based group decision making.

The set up phase consists of two activities that may partially be completed in parallel. The group-building activity starts first. How this activity is executed largely depends on the organizational context. For example, the execution can follow a command-and-order style as common in the military, an open community style with people applying to become group members, a by-invitation style as common for peer-reviewed research publications, and many other styles.



The second activity of the set up phase is referred to as “create logical AHP model”. The objective of this activity is to prepare the AHP model, which includes group members, alternatives, criteria, and rating scales. Different model building approaches can be followed in this activity, depending again on the organizational context and the kind of decision to be made. The common model building alternatives include the development of a completely new AHP model from scratch, the use of a model template, and the re-use of an existing model with some modifications. It again mainly depends on the organizational context to which degree the group members are involved in the model building activity. Whether the objective is to achieve a group consensus about the AHP model to be used, or whether the model is dictated by some authority, is also mainly determined by the organizational context. For example, in emergency response handling contexts the tight time constraints often require that the AHP model is “dictated” by the command office.

In the second process phase, referred to as “core AHP execution”, the (logical) AHP model is mapped into a corresponding physical model. By this physical model we mean the specific model representation of the APS software being used. Following that, the AHP steps are completed according to the general AHP algorithm. First, the pairwise criteria comparisons are performed and the respective votes from all group members are elicited and collected. An inconsistency analysis of the ratings follows, which can result in revisions of the previous criteria comparison results. Then the pairwise alternative comparisons are completed, where the group members' votes are obtained and further processed to yield the final group decision result. The subsequent sensitivity analysis either directly concludes the second process phase or leads to a re-iteration over some of the earlier activities of the second phase.
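The core AHP computation outlined above — pairwise comparison matrices, priority derivation, and the inconsistency check — can be sketched in a few lines. The sketch below is ours, not the internals of any particular APS such as easy-mind: it uses the standard eigenvector method for priorities, Saaty's consistency ratio for the inconsistency analysis, and the common element-wise geometric mean for aggregating individual judgments into a group result.

```python
import numpy as np

# Saaty's random consistency index (RI) for matrix orders 1..10
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
      6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def priorities(A):
    """Priority vector of a pairwise comparison matrix via the
    principal eigenvector (Saaty's eigenvector method)."""
    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)          # index of the principal eigenvalue
    w = np.abs(eigvecs[:, k].real)       # eigenvector sign may be flipped
    return w / w.sum(), eigvals[k].real

def consistency_ratio(A):
    """CR = CI / RI with CI = (lambda_max - n) / (n - 1);
    judgments with CR > 0.1 are conventionally sent back for revision."""
    n = A.shape[0]
    _, lam_max = priorities(A)
    ci = (lam_max - n) / (n - 1)
    return ci / RI[n]

def aggregate_group(matrices):
    """Element-wise geometric mean of individual judgment matrices
    (the common AIJ aggregation rule; it preserves reciprocity)."""
    return np.prod(np.stack(matrices), axis=0) ** (1.0 / len(matrices))

# Two group members compare three criteria on Saaty's 1-9 scale.
m1 = np.array([[1, 3, 5], [1/3, 1, 3], [1/5, 1/3, 1]])
m2 = np.array([[1, 2, 4], [1/2, 1, 2], [1/4, 1/2, 1]])
G = aggregate_group([m1, m2])   # aggregated group judgment matrix
w, _ = priorities(G)            # group priority weights for the criteria
print("weights:", w, "CR:", consistency_ratio(G))
```

In a group setting the facilitator would typically ask members whose individual matrices exceed the CR ≈ 0.1 threshold to revise their judgments before aggregation, which corresponds to the "manage inconsistent votes" activity of the core AHP execution phase.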
The activities of the third phase target the conclusion of the entire group decision process in a more or less formalized way. Typically it is attempted to achieve a group decision consensus (Schmoldt et al. 2001) and to develop an agreed action plan and recommendations. For example, in the case of a financial planning problem where a strategic budget allocation decision was obtained as the group decision result, a possible recommendation is the refinement of this result into a detailed budget allocation plan. Note that it can also be recommended to revise the group constellation and/or the AHP model and to repeat the group decision process once more. Which style is followed in the execution of this third phase again depends on the organizational context.

2.2 Central Actors of AHP Based Group Decision Making

In practice, group decision making based on the AHP method involves two main types of actors, which we in the following refer to as "facilitator" and "group member". In Table 1 these actor types are characterized in terms of their roles in the three phases of AHP based group decision making. The columns of the table contain the activities performed by these actors in the three process phases. Note that in practice a person can participate in a group decision process as facilitator and at the same time also as group member.

Table 1. Actors and their activities in the different phases of AHP based group decision making.

Facilitator:
- Set up: manage the forming of the group by communication with potential participants; manage AHP model development by sharing a first model proposal and by moderating the further revision steps
- Core AHP execution: prepare the APS by specifying the AHP model; moderate and steer the process execution; elicit judgments from participants and enter them into the APS; handle the inconsistency analysis; moderate the sensitivity analysis
- Conclusion: moderate the formal conclusion; moderate the development of recommendations and an action plan

Group member:
- Set up: communicate with the process owner concerning participation; engage in AHP model development
- Core AHP execution: state judgments; perform inconsistency improvements; participate in the sensitivity analysis
- Conclusion: engage in the development of recommendations and an action plan


ISBN: 978-972-8939-40-3 © 2011 IADIS

The facilitator is the person in charge of the complete group decision process. Her/his responsibility for the process is often explicitly defined in the organization's business process landscape or other defining artifacts. In terms of the three core phases of group decision making the facilitator is clearly the most important actor and the "driving and steering engine" for a smooth process execution and effective information sharing (Dennis 1993). In fact, a recent study (Kolbe and Boos 2009) found evidence that suboptimal decision performance of groups is often directly related to ineffective group coordination activities.

In the set up phase it is the facilitator's responsibility to establish the group and to build the AHP model so that the expected output will be achieved. Obviously, this can involve considerable communication effort. In the core AHP execution phase the facilitator is expected to prepare the APS system by creating the specific (physical) AHP model. The facilitator's further responsibility is to lead and moderate the group through the various execution steps. This includes, for example, that the votes are elicited from the group members and entered into the APS system in a well organized manner, and that the inconsistency analysis and sensitivity analysis are completed in a structured way. In the conclusion phase the facilitator is responsible for moderating the group with the objective to bring the process to a formal conclusion and to obtain recommendations and an action plan.

Every person who acts as group member is expected to complete several tasks. In the set up phase the group members agree to participate in the group decision process, possibly after some negotiations with the facilitator. In the first phase they also participate in the development of the specific AHP model, for example by giving review feedback on the facilitator's model proposal.
In the core AHP execution phase the group members are expected to carefully perform judgments for the respective AHP elements based on the available knowledge and to state their votes (possibly anonymously). During the conclusion phase the group members are requested to interact with the facilitator and the other group members. These interactions are intended to lead to a formal conclusion of the group decision process and to result in recommendations and an action plan.
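Once the individual votes have been collected they must be combined into a single group judgment. A common choice in the AHP literature (not prescribed by this paper) is the element-wise geometric mean of the members' comparison matrices, since it preserves the reciprocal property of the matrices; a minimal sketch:

```python
import numpy as np

def aggregate_judgments(matrices):
    """Element-wise geometric mean of individual pairwise comparison
    matrices, yielding one reciprocal group comparison matrix."""
    stack = np.asarray(matrices, dtype=float)
    return np.prod(stack, axis=0) ** (1.0 / len(stack))

# Two hypothetical members judging the same pair of criteria
member_a = [[1, 4], [1/4, 1]]
member_b = [[1, 2], [1/2, 1]]
group = aggregate_judgments([member_a, member_b])
# group[0][1] is sqrt(4 * 2); the result stays reciprocal: a_ij * a_ji = 1
```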

2.3 Collaboration Scenarios and Available System Support Options

In the process model described above we so far abstracted from details concerning the collaboration among the actors. In the following we address this aspect by investigating the major characteristics of synchronous and asynchronous collaboration scenarios. For each scenario we also consider concrete system support options that are available today. Note that in our process model these available system options have been abstracted by the notion of the AHP Processing System (APS).

Synchronous Collaboration Scenarios. In a synchronous group decision making scenario the actors are separated neither by time nor by space. That is, the facilitator(s) and the group members are co-located in the same room and participate together in a face-to-face meeting. By using digital presentation devices for shared information viewing with integrated moderation functionality, such as beamers and interactive whiteboards, it is possible to improve the efficiency of the decision making process. The use of such (optional) peripheral equipment also adds convenience for the participants.

A relatively straightforward approach to improve the efficiency of synchronous group decision processes is the use of one of the available spreadsheet software products with a corresponding AHP extension (i.e. plug-in). Through this approach the data related to the AHP model, as well as some but not all of the data related to the group decision making process, can be administered in tabular form. The functional range of these spreadsheet based solutions is typically focused on the activities of the core AHP execution phase, while the other phases are often simply ignored. Despite the functional limitations, in some cases a spreadsheet based support system can still lead to good group decision results (Syamsuddin and Hwang 2010).
Compared to spreadsheet based AHP software, more advanced system support can be obtained by using a specialized decision support software package that builds upon the AHP method. The desktop versions of Decision Lens™ and Expert Choice™ are two examples of such commercial packages; corresponding open source examples include RightChoiceDSS and HIPRE 3+. It can generally be observed that, apart from core AHP processing functionality, these products often also offer functionality targeted at other aspects of the decision process. Typical examples are functions to document the process execution and functions to obtain interactive context-sensitive help, orientation information, and recommendations. Functions for reporting and analysis purposes, including the generation of diagrams



and charts, are usually supported. Some systems also offer more sophisticated evaluation functionality, for example to detect and improve voting inconsistencies. Moreover, some systems support peripheral devices specialized on group decision making, such as voting keypads to enable anonymous votes.

Asynchronous Collaboration Scenarios. In asynchronous collaboration scenarios the actors are separated by time and/or space. Therefore it is difficult, and sometimes even impossible, for the facilitator to directly interact with the group members. Because AHP based group decision making requires a considerable amount of interaction between the facilitator and the group members, and also among the group members themselves, a solution to overcome this separation is inevitably required. In general, with today's available information and communication technology it is relatively easy to overcome the separation in time, in space, or in both dimensions. If the actors are only separated in space, the group decision process can be completed through an online meeting support system with application sharing functionality, such as studied in the research of Mark et al. (1999). Through application sharing, joint viewing and editing of AHP data can be enabled for the dislocated actors.

Our research, however, is focused on the case where the actors are separated in the time and space dimensions in parallel. Obviously, sharing an AHP support system in an online meeting is not a proper approach for this case. A possible approach to overcome the two dimensional separation is the use of one of today's available AHP based decision support software packages developed for Intranet/Internet scenarios, such as easy-mind, AHPproject, Web-HIPRE, and the web versions of Decision Lens™ and Expert Choice™.
The main characteristic of this system support option is that the users connect through a standard Internet browser to a backend system which can be reached over an Internet/Intranet connection. The AHP model and all data related to the execution of the group decision process are centrally stored in this backend system. The model execution is also performed at the backend system. The asynchronous execution of the group decision implies that the backend system supports information sharing as well as information management functionality for the facilitator. We believe that today's available web based AHP systems with the above described characteristics can lead to good results for certain asynchronous group decision scenarios. It appears, however, that the strength of these systems lies in the core AHP model execution, and that they offer only limited support functionality for group facilitation. These limitations can often be observed for web based AHP systems that were derived from a pre-existing AHP software package through the usual webification approach (Shen et al. 2007).

3. TOWARDS AN ACTIVE GDSS FOR ASYNCHRONOUS GROUP DECISION MAKING OVER THE INTERNET

In the previous section we analyzed AHP based group decision making in practice. Our analysis led to the observation that available Internet based GDSS software solutions do not fully address the practical needs of AHP based group decision making processes. A more detailed analysis of the functional limitations and their implications for group decision making processes follows next. Drawing upon the results of this investigation we propose a system concept. This concept has been devised with the goal to support the practical needs of today's typical asynchronous group decision making scenarios and also those scenarios that can be envisioned for the future.

3.1 Analysis of AHP Based Group Decision Making over the Internet

Asynchronous group decision making over the Internet based on today's available GDSS requires that at least one participant (most likely the facilitator) coherently administers and moderates the completion of the group decision process. The asynchronous mode of collaboration implies several essential tasks to be managed, including the organization, facilitation, and coordination of the group interactions, progress and time management, and conclusion management. Compared to a synchronous collaboration context, considerably more effort is needed for these moderation tasks when the participants are separated in time and/or space. This larger effort is related to the fact that computer mediated interaction among separated group members naturally requires extra work such as the editing of text messages or the recording of audio messages. Another



reason is that if members are separated, time management usually requires from the moderator a larger number of direct interactions with members, for example in the form of reminder messages, as compared to synchronous collaboration scenarios.

In addition to these general reasons, the specific structure of the core AHP method can require considerably more effort from the moderator (i.e. the facilitator) in asynchronous scenarios as opposed to synchronous scenarios. For example, when a too high inconsistency ratio occurs in the core AHP execution phase, an extra (possibly recurring) coordination task is implied. In this extra task it is attempted to achieve an acceptable inconsistency ratio by revising the most inconsistent votes. This may require several revision management iterations where voters are requested to revise their earlier votes. When the revised votes result in a new inconsistency ratio which is still too high, a further iteration needs to be completed. In synchronous scenarios it is possible to efficiently handle these extra tasks in an ad hoc one-to-many interaction mode with all group members together. In contrast, in asynchronous scenarios the handling of inconsistencies requires a possibly large set of one-to-one communication activities, which can include individualized notifications and reminders for each group member.

Today's available web based AHP systems address only some (and only to a very limited degree, if at all) of the needed support functionality for the above identified moderation tasks. It still seems possible for an experienced person to achieve acceptable results for asynchronous AHP based group decision making over the Internet with such limited functionality. However, the functional limitations will most likely require considerable time resources, for example for information sharing and interaction with the remote group members.
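One iteration of such a revision round can be made concrete: the system recomputes the consistency ratio from the collected votes and, if it is still too high, identifies the judgment that deviates most from the priorities implied by all votes together. The sketch below uses Saaty's standard formulas; the selection heuristic (largest log deviation from the consistent ratio w_i/w_j) is one common choice, not a method prescribed by this paper.

```python
import numpy as np

RI = [0.0, 0.0, 0.58, 0.90, 1.12, 1.24, 1.32, 1.41, 1.45, 1.49]

def priorities(A):
    """Priority vector via the normalized row geometric mean."""
    w = np.prod(A, axis=1) ** (1.0 / len(A))
    return w / w.sum()

def consistency_ratio(A):
    n = len(A)
    w = priorities(A)
    lam = float(np.mean(A @ w / w))
    return ((lam - n) / (n - 1)) / RI[n - 1]

def revision_request(A, threshold=0.1):
    """Return None if the matrix is acceptably consistent; otherwise the
    indices (i, j) of the judgment the facilitator would send back to its
    author for revision in the next asynchronous round."""
    A = np.asarray(A, dtype=float)
    if consistency_ratio(A) <= threshold:
        return None                       # acceptable; no revision needed
    w = priorities(A)
    dev = np.abs(np.log(A) - np.log(np.outer(w, 1.0 / w)))
    np.fill_diagonal(dev, 0.0)
    return tuple(np.unravel_index(np.argmax(dev), dev.shape))

# Four criteria; the judgment between criteria 0 and 3 contradicts the
# otherwise consistent votes, so it is flagged for revision
A = [[1, 2, 4, 1/2],
     [1/2, 1, 2, 4],
     [1/4, 1/2, 1, 2],
     [2, 1/4, 1/2, 1]]
print(revision_request(A))
```

In an asynchronous setting each call to `revision_request` corresponds to one round of one-to-one messages to the affected voter, repeated until the ratio falls below the threshold.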
Some of the challenges implied by the moderation management tasks can be handled with specialized information and communication technology solutions such as email, phone, messaging systems, collaboration portals, wikis, and blogs. However, the use of such tools in parallel to a web based AHP system leads to the typical problems of disintegrated systems, including error-prone multiple entry of the same data, limited global status information, and process inefficiencies due to switching between different systems.

3.2 Overview of the Proposed System Concept

As a central design principle we target a system concept that, through active capabilities, can perform some of the administrative tasks of group decision making. We envision the possibility that facilitators of group decision processes can choose a set of tasks from a predefined set of options and delegate these tasks to the system. Thus the facilitator can spend more time on the more sophisticated activities of group decision making.

The completion of facilitation tasks often involves and also depends on certain conditions. The conditions can, for example, refer to deadlines: time management for group decision processes requires, among others, monitoring deadlines and, in case of overdue votes, sending reminders to group members. Some facilitation tasks involve pre-conditions. We especially consider pre-conditions for the inconsistency analysis, the sensitivity analysis, and the computation of the final group decision result. The pre-conditions for these main steps of the group decision process often imply an evaluation of the responses received from the participants. For example, it is a pre-condition for the inconsistency analysis that the votes of all participants need to be available before the system may automatically continue with the evaluation of the inconsistency ratio. Another condition example is that a minimum number of votes, or a specific set of participants' votes, need to be available before the determination of the highest ranking decision alternative can proceed.

As the foundation of our system concept we used the above described approach of Internet based GDSS.
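The pre-condition checks just described can be expressed as simple predicates over the votes received so far. The names and thresholds in this sketch are illustrative assumptions, not taken from the paper's prototype.

```python
from dataclasses import dataclass, field

@dataclass
class VotingRound:
    expected_voters: set
    received_votes: dict = field(default_factory=dict)   # voter -> judgment

    def ready_for_inconsistency_analysis(self):
        """Pre-condition: all participants' votes must be available before
        the system may continue with the inconsistency evaluation."""
        return self.expected_voters <= set(self.received_votes)

    def ready_for_ranking(self, minimum_votes=2, required_voters=()):
        """Pre-condition: a minimum number of votes (and optionally a
        specific set of participants' votes) must be available before the
        highest ranking alternative is determined."""
        have = set(self.received_votes)
        return len(have) >= minimum_votes and set(required_voters) <= have

# Hypothetical round with three invited voters, two of whom have voted
r = VotingRound(expected_voters={"ann", "ben", "kim"})
r.received_votes = {"ann": 3, "ben": 5}
print(r.ready_for_inconsistency_analysis())   # False: kim has not voted yet
print(r.ready_for_ranking(minimum_votes=2))   # True: two votes suffice here
```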
Given this basis, our further conceptual elaboration focused on two goals: (1) enhancement of this basis towards a system that can autonomously perform pre-specified moderation tasks, and (2) extension of this basis by comprehensive and seamlessly integrated functionality for information sharing specialized on the needs of the set up phase and the conclusion phase. An overview of the resulting system concept is given in Figure 2.

A standard web browser is considered as the front end component for the typical web style of interaction of the system with the users. The core part of the system is the backend environment with a central data repository through which the components of the environment share common data. The Control Handler of the backend environment handles the communication with the front ends. Note that the typical components needed for a web style of communication (e.g. an HTTP server) from a conceptual



point of view belong to the Control Handler. In the further description, however, we abstract from them and rather focus on the other main tasks of the Control Handler in our specific group decision context. The Control Handler handles and controls the processing of the backend environment in response to user requests. Instructions as well as reply messages from the users are received and processed. Whenever other components of the backend need to be involved in the further processing, the respective components are activated by the Control Handler.

The required active system capabilities for specific administrative moderation management tasks are addressed by means of dynamically generated control scripts referred to as Group Decision Scripts, or GD Scripts in short. In principle these control scripts imply system performed monitoring and execution tasks at runtime. The GD Scripts are derived from user-specified parameters that are declared in an initial step by the facilitator and then submitted to the system, where the respective runtime system activities are prepared. Table 2 contains a first set of areas of moderation management with corresponding system actions.
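How a GD Script's runtime actions could be derived from facilitator-declared parameters can be illustrated as follows. The parameter names and the two derived actions are assumptions made for the sake of the sketch, not the actual script format of the proposed system.

```python
import datetime as dt

# Hypothetical facilitator-declared parameter set for one decision process
script_parameters = {
    "vote_deadline": dt.datetime(2011, 7, 1, 12, 0),
    "reminder_lead_time": dt.timedelta(days=2),
    "max_inconsistency_ratio": 0.10,
    "min_votes_for_result": 3,
}

def due_actions(params, now, votes_received):
    """Derive the monitoring actions the system should perform at 'now'."""
    actions = []
    in_reminder_window = now >= params["vote_deadline"] - params["reminder_lead_time"]
    if in_reminder_window and votes_received < params["min_votes_for_result"]:
        actions.append("send_reminders")      # too few votes close to the deadline
    if now >= params["vote_deadline"]:
        actions.append("close_voting")
    return actions

now = dt.datetime(2011, 6, 30, 9, 0)
print(due_actions(script_parameters, now, votes_received=1))
# inside the reminder window with too few votes -> ['send_reminders']
```

A GD Script would evaluate such rules repeatedly at runtime, relieving the facilitator of routine monitoring.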

[Figure 2 shows browser based front ends connected over the Internet/Intranet to a backend environment consisting of the Control Handler, Interaction Manager, Message Manager, Model Manager, the Model Processor with its GD Scripts, the GD Wiki, and a central Data Repository.]
Figure 2. Proposed concept for an active system to support AHP based group decision making over the Internet.

The Interaction Manager, among others, supports information sharing functions that are helpful for the set up phase and the conclusion phase. These functions are offered based on a web wiki approach where the complete interactions within a group decision making process over all three phases are integrated in a wiki context. That is, the Interaction Manager administers the occurring set of interactions separately for each group decision process in its own logically separated wiki, referred to as Group Decision Wiki or GD Wiki in short. By this approach the typical functions of web wikis for information sharing, asynchronous interaction between users, and also the completion of surveys and votes are made available to the users. Apart from administering and sharing information entered by the human participants, the Interaction Manager is also used by other system components of the backend environment for information sharing with the users. For example, results obtained by the Model Processor, such as the inconsistency ratio or the final AHP result, are entered by the Interaction Manager into the corresponding GD Wiki.

The Model Manager handles the creation of new AHP models based on corresponding user specifications. This specification task can be supported by generic model templates and by the possibility to reuse existing models. In addition to the AHP model, the user specification also includes the selection of the set of moderation management tasks to be delegated to the system. The Model Manager administers the model and the selected parameter settings in the central data repository of the backend. This includes the AHP model data, contact and further data about the participating group members, and the set of execution and monitoring parameters.



Table 2. Initial set of areas for active system capabilities.

- Facilitation of set up phase: System prepares a GD Wiki with corresponding initial information, guidelines, artifacts, and templates specific to the set up phase; this can also include voting functionality.
- Management of sensitivity analysis: System makes proposals for revisions of votes as implied by a sensitivity analysis; system issues revision requests to participants and handles the response and non-response.
- Time management: System monitors deadlines concerning replies to invitations to participate in the group decision process, replies to requested votes and revisions of votes, and requested responses to proposals for an action plan; system issues reminder messages.
- Management of voting inconsistencies: System computes proposals for revisions of votes as implied by an inconsistency analysis; system issues revision requests to participants and handles the response and non-response; depending on the resulting inconsistency ratio these actions may be repeated several times.
- Provisioning of update information: System delivers progress status information to the process owner and group members, configurable concerning the information content, level of detail, and update frequency.
- Facilitation of conclusion phase: System extends the GD Wiki with results and background information of the core AHP execution phase; also provided are relevant templates, guidelines, and artifacts.
- Information transfer management: System transforms the AHP model and group description from the more user friendly logical format into the specific physical representation of the AHP model processor; system transforms results of the AHP model processor into different graphical formats.

Based on the parameter settings, the Model Processor generates a corresponding GD Script. The script is executed by the Model Processor, which can span relatively long periods such as several months. The data resulting from the AHP model execution that are prepared for viewing by the group members are not directly returned to the users. Instead, these data are provided to the users through the Interaction Manager based on the Web 2.0 like approach described earlier. Note that this approach also concerns the preparation of data resulting from the intermediate steps of the AHP process.

The Message Manager performs active notification services over various communication channels, including email and SMS, in order to inform participants. The notification activities of the Message Manager are driven by the Model Processor. For example, users are informed when new data resulting from the AHP model execution have been published by the Interaction Manager. The Message Manager also monitors scheduled responses from the participants and issues reminders if deadlines for responses are missed. These scheduled responses are targeted at the specific needs of the three phases of the AHP process. For example, if the participants have been notified about needed revisions of their votes in order to improve the inconsistency ratio, the Message Manager keeps track of the reply messages from the participants.
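The Message Manager's deadline monitoring can be sketched as a simple bookkeeping function over outstanding requests; the data layout is an illustrative assumption, and the actual delivery channels (email, SMS) are left out.

```python
import datetime as dt

def overdue_participants(requests, replies, now):
    """requests maps each participant to the deadline of her scheduled
    response; replies is the set of participants who already responded.
    Returns the participants who should now receive a reminder."""
    return sorted(p for p, deadline in requests.items()
                  if p not in replies and now > deadline)

# Hypothetical revision requests issued after an inconsistency analysis
requests = {
    "ann": dt.datetime(2011, 6, 1),
    "ben": dt.datetime(2011, 6, 10),
    "kim": dt.datetime(2011, 6, 1),
}
replies = {"ann"}
now = dt.datetime(2011, 6, 5)
print(overdue_participants(requests, replies, now))   # ['kim']
```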

4. RELATED WORK

GDSS are typically designed to aid groups in analyzing problem situations and in performing group decision making processes. According to DeSanctis and Gallupe (1987), a GDSS can support groups on three levels: it provides process facilitation (technical features), operative process support (group decision techniques), and logical process support (expert knowledge). The research reported in this article mainly focuses on process facilitation and operative process support with the AHP method as underlying basis. With the decision for the AHP method as foundation we parallel the work of several other research groups (Schmoldt et al. 2001).

Different types of GDSS have been described by Power (2000), who defines communications driven GDSS as supporting more than one person working on a shared task through support for communication, cooperation, and coordination. Data driven GDSS emphasize access to and manipulation of time series of internal



company and external data. Document driven GDSS manage, retrieve, summarize, and manipulate unstructured electronic information. Knowledge driven GDSS offer expertise in problem solving, while model driven GDSS emphasize statistical or financial models for optimizing or analyzing a situation. With respect to this classification, one focus area of our current research is the communication driven aspects of GDSS. Through the concept of Group Decision Scripts for automating specific moderation management tasks we strive to improve the efficiency of group decision making processes. The consideration of information sharing across the different phases of group decision processes through the concept of Group Decision Wikis is another focus area of our research. This area is especially related to Power's category of document driven GDSS.

Our Group Decision Wiki concept is especially directed at the intelligence phase and the design phase, and to a lesser degree at the post-decision phase, of Simon's group decision process model (Simon 1960). In these phases information is acquired, structured, augmented, prepared, and shared in a controlled manner (Dennis 1993). The different information acquisition patterns during group decision making in synchronous and asynchronous scenarios and their impact on decision quality have been studied in (Saunders and Miranda 1998). The general information exchange patterns in group decision making processes have also been researched under consideration of theories on group dynamics (Troyer 2003). In the planned refinement of our system concept we will define specific functions to address these and other results reported in the research literature.

5. CONCLUSIONS AND OUTLOOK

In this article we presented first results of our long term research on IT support for group decision making. In the current stage of our research the focus is on asynchronous group decision making over the Internet, with the AHP being used as the multi-criteria decision making algorithm. Drawing on our analysis of the structure and the various actors of AHP based group decision making processes, central GDSS requirements were identified. The two most important requirements that distinguish our system approach from existing web based GDSS are (1) active system capabilities for specific moderation management tasks and (2) a seamless integration of Web 2.0 like collaboration support functionality. We address these requirements through a backend system environment that, in addition to an AHP model processor, consists of further components that automate specific moderation management tasks. Users can dynamically configure, through the activation of parameters, system activities that contribute to effective group decision processes.

A prototype implementation of the system concept using the MySQL database management system and state of the art web technologies such as the PHP programming environment is currently under development. We will use the prototype to gain, through laboratory experiments, more insights into the interaction patterns of asynchronous group decision making and into the set of moderation management tasks that can be automated by the system. For our lab experiments we will consider different domains of group decision making, including collaborative design in virtual corporations, natural resources planning, telemedicine, and rescue and disaster management. While most of the experimental research on group support systems, as reported by Fjermestad and Hiltz (1998) and other researchers, is focused on brainstorming tasks, we will furthermore use our prototype to experiment with group evaluation tasks.
Sample areas that imply group evaluation tasks include quality audits in companies and outcome assessments in university accreditation processes. The broad range of domains that we intend to address in laboratory experiments reflects our general assumption that the proposed GDSS can deliver benefits for many different types of group decision problems.

Our more long term oriented future research will address two extensions of our current focus on asynchronous group decisions. The first extension concerns system concepts to support group decision processes where the group collaboration mode may dynamically change across the different phases of a group decision process. This can be further extended to seamless transitions between different collaboration modes at any time within a group decision process. Furthermore, such transitions may concern all group members or only a subset of them. An example of such a subgroup is a set of group decision participants that in a spontaneous face-to-face meeting focus on a sub-issue of the decision problem. We also intend to investigate concepts that enable different group members to work on different parts of the AHP hierarchy.



Another focus of our future research is the integration of mobile devices, where the communication patterns need to be optimized in terms of energy consumption. Furthermore, in order to support different types of mobile devices, the active system capabilities need to be extended by functionality to adapt information to the specific limitations of the receiving device, such as the display size.

REFERENCES

DeSanctis, G. and Gallupe, R.B., 1987. A foundation for the study of group decision support systems. Management Science, Vol. 33, No. 5, pp. 589-609.
Dennis, A.R., 1993. Information Processing in Group Decision Making: You Can Lead a Group to Information, but You Can't Make It Think. MIS Quarterly, 20(4), pp. 433-457.
Fjermestad, J. and Hiltz, S.R., 1998. An assessment of group support systems experimental research: methodology and results. Journal of Management Information Systems, Vol. 15, No. 3, pp. 7-149.
French, S. and Turoff, M., 2007. Decision support systems. Communications of the ACM, Vol. 50, No. 3, pp. 39-40.
Gray, P., 2008. The Nature of Group Decision Support Systems. In Burstein, F. and Holsapple, C. (Eds.), Handbook on Decision Support Systems 1 - Basic Themes, Springer, pp. 371-389.
Gray, P. et al., 2011. GDSS Past, Present, and Future. In Schuff, D. et al. (Eds.), Decision Support, Annals of Information Systems 14, Springer Science+Business Media, pp. 1-24.
Kolbe, M. and Boos, M., 2009. Facilitating Group Decision-Making: Facilitator's Subjective Theories on Group Coordination. Forum: Qualitative Social Research, 10(1), Art. 28, http://nbn-resolving.de/urn:nbn:de:0114fqs0901287 (accessed April 4th, 2011).
Mark, G. et al., 1999. Meeting at the Desktop: An Empirical Study of Virtually Collocated Teams. Proceedings of the 6th European Conference on Computer Supported Cooperative Work, Copenhagen, Denmark, Kluwer, pp. 156-179.
Nunamaker, J. and Deokar, A.V., 2008. GDSS Parameters and Benefits. In Burstein, F. and Holsapple, C. (Eds.), Handbook on Decision Support Systems 1 - Basic Themes, Springer, pp. 391-414.
Power, D.J., 2000. Web based and model-driven decision support systems: concepts and issues. Proceedings of the Americas Conference on Information Systems, Long Beach, California.
Power, D.J. and Kaparthi, S., 2002. Building Web-based decision support systems. Studies in Informatics and Control, Vol. 11, pp. 291-302.
Saaty, T.L., 1980. The Analytic Hierarchy Process. McGraw Hill, New York.
Saunders, C. and Miranda, S., 1998. Information acquisition in group decision making. Information & Management, Vol. 34, Elsevier Science, pp. 55-74.
Schmoldt, D.L. and Peterson, D.L., 2000. Analytical group decision making in natural resources: Methodology and application. Forest Science, Vol. 46, pp. 62-75.
Schmoldt, D.L., Mendoza, G.A. and Kangas, J., 2001. Past Developments and Future Directions for the AHP in Natural Resources. In Schmoldt, D.L. et al. (Eds.), The Analytic Hierarchy Process in Natural Resource and Environmental Decision Making, Kluwer Academic Publishers.
Shen, H. et al., 2007. Collaborative Web Computing: From Desktops to Webtops. IEEE Distributed Systems Online, Vol. 8, No. 4, p. 3.
Simon, H.A., 1960. The New Science of Management Decision. Harper and Row, New York.
Syamsuddin, I. and Hwang, J., 2010. The Use of AHP in Security Policy Decision Making: An Open Office Calc Application. Journal of Software, Vol. 5, No. 10, Academy Publishers, pp. 1162-1169.
Troyer, L., 2003. Incorporating Theories of Group Dynamics in Group Decision Support System (GDSS) Design. Proceedings of the International Parallel and Distributed Processing Symposium, IEEE Computer Society, p. 108.
Vaidya, O.S. and Kumar, S., 2006. Analytic hierarchy process: An overview of applications. European Journal of Operational Research, Vol. 169, No. 1, pp. 1-29.
Watson, R.T. et al., 1998. Using a GDSS to facilitate group consensus: Some intended and unintended consequences. MIS Quarterly, pp. 463-478.


IADIS International Conferences Web Based Communities 2011, Collaborative Technologies 2011 and Internet Applications and Research 2011

DIGITAL LIBRARIES AND SOCIAL WEB: INSIGHTS FROM WIKIPEDIA USERS’ ACTIVITIES

Asta Zelenkauskaite1 and Paolo Massa2

1 Indiana University, 1229 East Seventh Street, Bloomington, IN 47405, USA
2 Fondazione Bruno Kessler, Via Sommarive 18, Trento, 38123, Italy

ABSTRACT

The social aspects of large-scale knowledge depositories such as digital libraries have grown in importance over the last decade, as the number of digital depositories and of their users keeps increasing. Although this digital trend affects a multitude of users, little is known about how users navigate these online platforms. In this study Wikipedia is used as a lens to analyze user activities within a large-scale online environment, in order to better understand user needs in online knowledge depositories. The study analyzed user activities in a real setting: the editing activities of 686,332 active contributors to the English Wikipedia over a period of ten years. Their editing behaviors in Wikipedia’s content-oriented versus social-oriented namespaces were compared across different periods of permanence (longevity). The results show that users with less than 21 days of longevity were more likely to interact in namespaces designated for social purposes, whereas users who remained from two to ten years were more likely to exploit functionalities related to content discussion. The implications of these findings are positioned within the collaborative learning framework, which postulates that users with different expertise levels have different exigencies. Since social functionalities were used more frequently by users who stayed for short periods of time, the inclusion of such functionalities in online platforms can support this segment of users. This study aims at contributing to the design of online collaborative environments such as digital libraries, where social-oriented design would allow creating more sustainable environments built around the specific needs of diverse users.

KEYWORDS

Digital libraries, Social web, Wikipedia, User-centric design.

1. INTRODUCTION

Over the past twenty years, the widespread diffusion of information and communication technologies has transformed academic and institutional repositories into digital archives. Digital libraries and museums are used daily by millions of users with varying levels of expertise and ranges of interests. The increasing number of digital depositories thus not only serves archival purposes; digitalization also provides better access and a wider range of services to users. The need for digital libraries to better serve their users remains highly relevant (Bearman, 2007). The importance of digital libraries has been emphasized by the European Union with the launch of the i2010 Digital Libraries Initiative, one of three flagship projects under the EU’s i2010 strategy. From the content digitalization point of view, more than 10 million digital objects of European cultural heritage are going to be served by Europe's largest digital library, Europeana (European Commission, n.d.). Europeana was built to serve as a single access point for consulting digital copies of the materials held by libraries, museums and archives. Similarly, in the United States, digital libraries have been embraced from the infrastructure point of view as “a national challenge application area” where digital libraries become a priority (Borgman, 1999, p. 228). The digitalization of libraries has opened a window for multiple areas of research. Bearman (2007), reviewing the last decade of research on digital libraries, noted that there is no agreement about which knowledge architecture is best suited for digital libraries. He indicated two alternatives: ontologies and collaboration. Researchers proposing the first approach assert that ontologies could help in


ISBN: 978-972-8939-40-3 © 2011 IADIS

overcoming the differences between the schemas of distributed datasets, and even linguistic boundaries, making it possible to deal with multilingual datasets. The other line of research concerns collaboration among users. The relevance of users rests on the premise that collections of digitalized content are gathered to fulfill the needs of users (Borgman, 1999). Similarly, it has been emphasized that collaboration toolsets can foster the rise of a more useful digital library for each user community (Bieber et al., 2002; Renda & Straccia, 2002). Given these two lines of research, ontologies and collaboration, this study considers digital libraries as part of the services dedicated to users, defining them in the following way: a digital library is a distributed technology for creation, dissemination, manipulation, storage, integration and reuse of digital information (Semeraro et al., 2001, p. 45). The accessibility issue has been especially capitalized on in the context of the World Wide Web, since the web enables users to access information more rapidly and in more geographically dispersed areas than traditional library settings. The digitalization process has drastically affected the user experience of online services. One critical point is the ambivalence of digitalization: on the one hand it increased access; on the other hand it left users more remote than ever from traditional library services such as consulting. Users of online platforms are therefore increasingly heterogeneous, with different levels of skills and distinct needs (Borgman et al., 2005). However, the digitalization of libraries represents more than just a technological change in the affordances of the system: it also entails changes in social uses, which are described in terms of social affordances.
Social affordances have been defined as analogous to technological affordances: the “[p]roperties of a CSCL (computer-supported collaborative learning) environment that act as social-contextual facilitators relevant for the learner’s social interaction” (Kreijns et al., 2002, p. 13). In recent years, social interaction has become a leading trend in ICT. Gruber (2008) defined it as “an ecosystem of participation, where value is created by the aggregation of many individual user contributions,” emphasizing that the social web is one of the two key ingredients, along with the semantic web, for creating collective knowledge systems. In line with the user-centric trend reflected in the social web and the increased need for interaction within online systems, digital libraries face new challenges, such as how to create platforms that better fit the needs of their users. User-centric components of the social web are entering the realm of all online environments, including digital libraries. Digital libraries within the social web context have been proposed as “the application of web-based library services and collections” (Maness, 2006). The concept of library in the social web era is constructed around user-centric elements to increase accessibility and to benefit digital library communities (Maness, 2006). Therefore, in addition to the primary services that libraries provide, digital libraries face further challenges, such as making their services accessible and easily available to users. Digital library communities serve to share the know-how of large-scale depositories, where librarians are not always readily available for consultation. In line with this social-oriented trend in online environments, user experience, such as navigation and interaction with the services, becomes an increasingly important component of the vision of the future of digital libraries.
In the context of digital library design, the emphasis is placed on a user-centric approach, and there is a notable need to relate users to the digital library community (Buckland, 2003). Given this emphasis on users, issues of personalization have been highlighted by Renda and colleagues (2002). They addressed the value of personalization, claiming that digital libraries serve not only as an information resource where users may submit queries to satisfy their information needs, but also as a collaborative working and meeting space. In their prototype of a digital library environment, users may “organize the information space according to their own subjective view, they may become aware of each other, exchange information and knowledge with each other, may build communities” (Renda et al., 2002, p. 1). Yet, although digital libraries foster library community building, a better understanding of the needs of digital library users has been neglected (Bearman, 2007; Farooq et al., 2009; Gazan, 2008; Pomerantz, 2008). Therefore, understanding how various types of users navigate large-scale online depositories and how they benefit from the services is of crucial importance. The goal of this study is to provide a systematic analysis of user engagement with different social-oriented and content-oriented functionalities in the context of a large collaborative environment, Wikipedia. Overall, the study aims at contributing to the design of digital library platforms that, in addition to content-oriented facilities, endorse social-oriented functionalities in order to meet the needs of their users.



1.1 Towards a User-Centric Approach

A user-centric approach to digital libraries has been argued for by Marchionini (1999), who viewed the digital library as a space for sharing knowledge. He acknowledged the diverse needs of users and provided a model emphasizing community-based sharing and its benefits. Marchionini (1999) suggested that a digital library could serve as a sharing environment by “combin[ing] elements of learning communities, scientific collaboratories, and special libraries to facilitate communication and distribute the load of solving information problems” (Marchionini, 1999, p. 1). Problem-solving has become part of the future vision of digital libraries. Marchionini situated problem-solving in the context of information seeking and collaboration, with substantial advantages for the community. Such a system would allow (a) experts to share their knowledge and time in digital reference, question-answering, and recommendation services; (b) easy contribution and sharing of (digital) content by library patrons and the user community; and (c) better support of services (Marchionini, 1999, p. 3). It has been emphasized that services to the user community “have traditionally not been as strong” in digital libraries as in physical library environments (Pomerantz & Gary, 2007, p. 10). However, Pomerantz and Gary (2007) argued that the value of digital libraries consists in providing users a space for sharing: the role of the digital library as a space for individuals, for collaborative work, and for social activity will become increasingly important. Wikipedia has been considered an exemplar case of online collaboration, where up-to-date resources are created through the unpaid, volunteer work of millions of users coordinated via a single web site (Bryant et al., 2005; Viégas et al., 2007).
Wikipedia serves as a unique case study because it involves a large number of users who log on to the system every day to receive and provide information, as well as to use the various functionalities provided by the platform. By January 2011, more than 13 million registered users had contributed to the English Wikipedia (Wikipedia, n.d.). Moreover, the system has been operating for around 10 years and has hence accumulated a vast amount of evidence about user activities in a large online setting. To coordinate its large flux of users, Wikipedia provides different functionalities, grouped in different namespaces (Viégas et al., 2007). These can be distinguished into two main types of activities: the creation and maintenance of content articles, which are primary by design in the context of an encyclopedia, and secondary spaces, which serve the purposes of communication. The distinction between activities that are primary by design and secondary ones dedicated to user-centric activities has been observed in other forms of media, especially convergent media such as interactive television with user contributions via short message service (Zelenkauskaite & Herring, 2008a). In the studied interactive television context, for example, television programming was considered a primary activity, while the short message service, where interactions between users occurred, was considered secondary by design. The differentiated needs of users in online communities of practice have been studied in the context of Wikipedia by Bryant and colleagues (2005), who argued that membership in a community of practice is mediated by access to the different functionalities that the platform provides.
Based on the idea that various types of users may benefit from each other in digital library systems (Bryant et al., 2005; Marchionini, 1999), this study considered user activities in Wikipedia’s various spaces, both those dedicated to content and those serving social purposes, and posed the following research question:

RQ: Did users differ in their use of social-oriented and content-oriented functionalities based on their length of stay in Wikipedia?

2. DATA

To address the research question focused on user behavior in digital spaces, the contributions of users to the different namespaces were extracted from datasets provided by the Wikimedia Foundation, which contain information about every change that occurred in every page of Wikipedia. A contribution was operationalized as a single edit made by a Wikipedia user. The analyzed dataset comprised all the edits made by users to every page over a period of ten years, from 2001 (the inception of the English Wikipedia) up to 2nd August, 2010. This study considered only the edits of registered users, because they are uniquely



identifiable by their nicknames, whereas anonymous users could edit multiple times under different identifiers (in the form of IP addresses). The unit of analysis of this study is the user. The English Wikipedia sample contained more than 13 million registered users, but only 3,762,277 of them performed at least one edit. For the purposes of this analysis, we considered only those with more than 10 edits, in order to obtain a meaningful sample of users with a minimum level of involvement in Wikipedia. The number of users meeting this criterion was N=686,332, and their total activity level (the number of edits executed by these users across all namespaces) was N=253,502,255 edits. This large number of edits makes the English Wikipedia a very interesting setting for the study of user behavior in online platforms.

2.1 Longevity

Users' editing activity was compared based on their longevity in the system, i.e., how long they remained active in Wikipedia. Longevity, the number of days spent in Wikipedia, was computed from the dates of the first and the last edit of each user. Users were then divided into four longevity groups based on quartiles. The 1st quartile comprised users who remained in Wikipedia less than 21 days; the 2nd quartile contained users who remained in the system from 21 to 226 days; the 3rd quartile included users who remained from 226 to 848 days (up to more than two years); the 4th quartile included users who remained from 848 days up to 10 years (see Table 1).
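To make the procedure concrete, here is a minimal, hypothetical sketch of the longevity computation and quartile grouping described above; the edit log and user names are made up, and the quartile cut points are computed with Python's statistics module rather than SPSS.

```python
from datetime import date
from statistics import quantiles

# Illustrative edit log: user -> dates of their edits (in the study these
# come from the Wikimedia dump, restricted to users with more than 10 edits).
edits = {
    "Alice": [date(2001, 3, 1), date(2001, 3, 15)],
    "Bob":   [date(2004, 6, 2), date(2006, 1, 20), date(2010, 8, 1)],
    "Carol": [date(2008, 2, 5), date(2008, 2, 9)],
    "Dave":  [date(2002, 1, 1), date(2005, 5, 5)],
}

# Longevity = number of days between a user's first and last edit.
longevity = {u: (max(ds) - min(ds)).days for u, ds in edits.items()}

# Quartile cut points over all users' longevity values.
cuts = quantiles(longevity.values(), n=4)  # [Q1, Q2, Q3]

def quartile_group(days):
    # 1st..4th longevity quartile for a given longevity in days.
    return 1 + sum(days > c for c in cuts)

groups = {u: quartile_group(d) for u, d in longevity.items()}
```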

2.2 Namespaces of Wikipedia

Since the main goal of the study was to compare users' activity levels in content-oriented and social-oriented namespaces, these were operationalized as follows. The study excluded editing behavior in the Wikipedia article pages, the main namespace, which constitutes the core of activity of the online encyclopedia. Instead, the study focused on three secondary namespaces, Article Talk, User, and User Talk, since these are the namespaces where the majority of coordination and user interaction occurs.

Article Talk namespace. This namespace contains pages in which users are invited to discuss improvements to the articles. Every page (such as “Digital Library”) has an associated talk page (i.e., “Talk:Digital Library”). If a user does not fully agree with a certain article, she has basically two choices: either editing the article itself (main namespace) or proposing a change by editing the related talk page, initiating a discussion with others.

User namespace. User profile pages are located in this namespace. For example, user Mary has her user page at “User:Mary”. By editing this namespace, users provide information about themselves.

User Talk namespace. Every user page also has an associated talk page (seen as “User_talk:Mary”). By editing this page, other users can leave messages for a given user; hence, edits in this namespace can be considered communicative acts between users.

The User and User Talk namespaces are hence social namespaces related to users’ interactions and self-representation, while the Article Talk namespace is related to discussions of the content of the encyclopedia. Beyond these, there are other, less frequently used namespaces, such as Wikipedia, File, MediaWiki, Template, Help, Category, Portal, Book and their related talk namespaces. This study excluded these marginal namespaces due to their minor use and focused on the three namespaces described above, because they are more heavily used and clearly identify social- and content-oriented spaces.
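As a toy illustration of the namespace distinction, the page-title conventions from the examples above can be mapped to the studied namespaces (the function and its labels are ours, not a Wikipedia API):

```python
# Toy classifier mapping a Wikipedia page title to one of the namespaces
# discussed above, using the title prefixes from the examples in the text.
def namespace(title):
    if title.startswith("Talk:"):
        return "Article Talk"   # content-oriented discussion of an article
    if title.startswith("User_talk:"):
        return "User Talk"      # social: messages left for a user
    if title.startswith("User:"):
        return "User"           # social: self-presentation
    return "Main"               # article (main) namespace
```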

2.3 Analytical Procedures

For each of the 686,332 Wikipedia users we analyzed (those who executed more than 10 edits), the percentage of edits performed in each of the three namespaces under consideration (Article Talk, User and User Talk) was computed. The percentage was calculated as the ratio between the edits in the specific namespace and the total number of edits performed by the user. The differences between groups (independent variable: longevity) were assessed on the dependent variables, the previously described percentages of edits in the Article Talk, User and User Talk namespaces. The



data were analyzed in the following way. To test the differences between the four longevity groups, analysis of variance (ANOVA) was performed with the statistical package SPSS. Since there are more than two groups, pairwise differences between groups were further assessed using a Tukey post-hoc test. The results are presented in the next section.
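The two computational steps, per-user namespace percentages and the one-way ANOVA, can be sketched in plain Python (illustrative numbers only; the paper's analysis was run in SPSS on the full dataset, and the Tukey post-hoc step is omitted here):

```python
# Illustrative per-user counts: (edits in one namespace, total edits, quartile).
user_edits = {
    "Alice": (5, 20, 1), "Bob": (2, 40, 4), "Carol": (9, 30, 1),
    "Dave":  (1, 50, 4), "Eve": (4, 25, 1), "Frank": (3, 60, 4),
}

# Dependent variable: percentage of edits in the namespace, per user.
pct = {u: 100.0 * ns / tot for u, (ns, tot, _) in user_edits.items()}

# Group the percentages by the independent variable (longevity quartile).
groups = {}
for u, (_, _, q) in user_edits.items():
    groups.setdefault(q, []).append(pct[u])

def anova_f(samples):
    """One-way ANOVA F statistic for a list of groups."""
    k = len(samples)
    n = sum(len(g) for g in samples)
    grand = sum(sum(g) for g in samples) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in samples)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in samples)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

f_stat = anova_f(list(groups.values()))
```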

3. RESULTS

The first level of analysis provides descriptive statistics of the studied sample, namely the users actively contributing to Wikipedia (i.e., who made more than 10 edits). Descriptive statistics of the users divided by longevity quartiles (by number of days) are summarized in Table 1.

Table 1. Number of users by longevity

Longevity quartile   Longevity (N of days)   N of users per quartile   Mean edits per user   SD of edits per user
1st quartile         0-21                    172,755                   26.98                 100.17
2nd quartile         21-226                  170,430                   91.3                  1,670.27
3rd quartile         226-848                 171,579                   254.76                4,104.77
4th quartile         848-3,426               171,568                   1,104.93              14,177.48
Total                                        686,332

Table 1 shows that about half of the users of the English Wikipedia remained for less than a year. In particular, 25% of the users remained active in the system for less than 21 days, while the last quartile represents the users who remained active for more than two years. For example, the user “Larry Sanger”, co-founder of Wikipedia, remained in the system for the longest period (3,426 days, almost 10 years), and the other founder of Wikipedia, the user “Jimbo Wales”, remained 3,415 days. Table 1 thus shows that the majority of users in the English Wikipedia contribute for a short period of time. This finding indicates that it is important for users to get oriented quickly and learn how to interact with the interface in an effective way, so that they can benefit from the different affordances offered by the platform. Table 1 also shows the mean number of edits (in every namespace, both the primary one and secondary ones such as Article Talk, User, User Talk and others) performed by the users of each quartile. For example, users in the first longevity quartile performed around 27 edits on average, while users in the last quartile performed more than 1,100. Figure 1 summarizes the editing behavior of users. Not surprisingly, users who remained active longer performed a larger number of edits.

Figure 1. Mean number of edits performed by users of different longevity quartiles within the English Wikipedia

To test whether there was a statistically significant correlation between longevity and the mean number of edits made by Wikipedia users, a Pearson correlation test was performed. The results revealed a statistically significant positive correlation between the two variables, the number of edits and the number of days spent in Wikipedia (r=.073, N=686,332, p

WSDL->portType -> UDDI->tModel
WSDL->binding -> UDDI->tModel
WSDL->service -> UDDI->businessService
WSDL->port -> UDDI->bindingTemplate

Figure 5. The UDDI data model defines several objects used for representing information (adopted from [OASIS-2004])



Classification systems are modeled as UDDI tModels with specific keyed references (Figure 5). For a tModel to be identified as a classification system in UDDI, it must carry a keyed reference to uddi:uddi.org:categorization:types with the value “categorization”, and a keyed reference to its own tModel key whose value states whether the classification values are checked or unchecked (i.e., whether they should be validated for existence or not). A third keyed reference, to a specific uddi:latinov.net:categorization:types tModel, marks the classification system as unique. In this way the application finds only the classification tModels it published itself, while its classification tModels can still be found by other applications thanks to the two standardized reference keys. Classification values are added as keyed references in the classification tModel and must be unique. A service is classified by republishing its UDDI businessService entity and adding keyed references whose tModel key comes from the classification system and whose value is the classification value selected by the customer. Searching by classification then becomes a standard UDDI search for a businessService by keyed reference.
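A sketch of what such a classification tModel might look like in UDDI v3 XML, following the description above. The tModel key, the name, and the keyValue of the uniqueness marker are illustrative assumptions; the uddi:uddi.org:categorization:types and uddi:latinov.net:categorization:types references come from the text.

```xml
<tModel tModelKey="uddi:example.org:classification:region">
  <name>Region classification</name>
  <categoryBag>
    <!-- identifies this tModel as a categorization (classification) scheme -->
    <keyedReference tModelKey="uddi:uddi.org:categorization:types"
                    keyValue="categorization"/>
    <!-- self-reference stating whether values are validated ("checked") -->
    <keyedReference tModelKey="uddi:example.org:classification:region"
                    keyValue="checked"/>
    <!-- marker tModel distinguishing classification systems published
         by this application (keyValue is an assumption) -->
    <keyedReference tModelKey="uddi:latinov.net:categorization:types"
                    keyValue="classification"/>
  </categoryBag>
</tModel>
```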

4.2 Software Implementation and Open Source Components

The system described in the previous section has been implemented. It eases the publication of WSDL into a UDDI registry and provides additional capabilities: classification of web services; management of classification systems and values; and search for web services by name and/or by classification. The system was developed using the following open source technologies:
• jUDDI (http://juddi.apache.org/) – an open source reference implementation of the UDDI v3 standard. The distribution runs on Tomcat with a Derby database.
• Tomcat (http://tomcat.apache.org/) – an open source software implementation of the Java Servlet and JavaServer Pages technologies.
• WSDL4J (http://sourceforge.net/projects/wsdl4j/) – an open source Java library for manipulating WSDL documents.
• Java (http://www.java.com) and JSP – widely used programming languages.

4.3 Service Usage Scenario

The test scenario is depicted in Figure 6 below. In the first step, one or more available services, labeled WSDL1, WSDL2 and WSDL3, are parsed by the “WSDL to UDDI” processing block. The processed data are then scheduled and executed as a sequence of UDDI calls.

Figure 6. The process of publishing a WSDL file in the UDDI registry

The result of executing a WSDL file publication through the web application, after registration in UDDI, is a set of technical keys, which can be used to retrieve further information about, and to analyze, the published services (Figure 7).



Figure 7. Results from the publication of a service

The reference keys depicted in the figure are also important artifacts for further technical use of the meta-information about web services published in UDDI. Most existing tools require such detailed technical information in order to make their calls.

5. CONCLUSIONS AND FUTURE WORK

An application for WSDL parsing, mapping to UDDI objects and publication in a UDDI registry was introduced. After a reasonable number of publications of different WSDL file definitions via the developed web service (more than 70 tests), the resulting web service is considered reliable. Nevertheless, further tests with WSDL files from different domains are possible and desirable. The developed software component exposes its functionality as a web service to simplify the publication of web service WSDL files. This is a first step toward the design and implementation of a classification system located on the UDDI registry side. Future development of the application could add operations supporting the management of classification systems, the classification of services, and the search for services via classification. Adding authentication to the application is another point of future work. Search by web service description can also be targeted for additional design and improvement. Classification systems could be designed as exportable units, saved to a data file and imported on another system or kept as a backup.

ACKNOWLEDGEMENTS

This research is partially funded by the scientific research projects: eduPub, number 183/2011, funded from the state budget of Sofia University “St. Kliment Ohridski”; the ADOPTA project no. D002/155, started 01.2009, funded by the Bulgarian Ministry of Education and Science; and supported by the SISTER project, funded by the European Commission in FP7-SP4 Capacities via agreement no. 205030.

REFERENCES

Cerami, E., 2002. Web Services Essentials: Distributed Applications with XML-RPC, SOAP, UDDI & WSDL. O’Reilly, ISBN 0-596-00224-6, 304 pages.
Ivanova, E., 2004. WSDL Interface of Services for Distributed Search in Databases. International Conference on Computer Systems and Technologies - CompSysTech’2004, pp. II.14-1 - II.14-6.
Petrov, M., 2007. Event-driven interoperability framework for interoperation in e-learning information systems monitored repository. IADAT Journal of Advanced Technology on Education (IJAT-e), March 2007, Volume 3, Number 1, ISSN: 1885-6403, ISSN (print): 1698-1073, pp. 332-335.



Petrov, M. and Vlaykov, V., 2010. Software architecture components of an abstract framework for assessment in e-learning. Proceedings of the European Conference of Computer Science (ECCS'2010), Puerto De La Cruz, Tenerife (Spain), November 30 - December 2, ISBN: 978-960-474-250-9, pp. 290-294.
Skonnard, A., 2003. Understanding WSDL. Northface University, available online at http://www.cnblogs.com/MayGarden/archive/2010/01/20/1652212.html
Skonnard, A. and Gudgin, M., 2001. Essential XML Quick Reference: A Programmer's Reference to XML, XPath, XSLT, XML Schema, SOAP, and More. DevelopMentor Books, Addison-Wesley Pub. Co.
W3C, 2001. Web Services Description Language (WSDL) 1.1, available at http://www.w3.org/TR/wsdl
Ryman, A., 2003. Understanding web services. IBM Toronto Lab, available at http://www.ibm.com/developerworks/websphere/library/techarticles/0307_ryman/ryman.html
OASIS, 2004. UDDI v3.02 Specification, available at http://uddi.org/pubs/uddi_v3.htm
Microsoft, 2005. UBR Shutdown FAQ, available at http://uddi.microsoft.com/about/FAQshutdown.htm
Joseph, A., 2008. Generating a web service proxy client from WSDL file, available at http://www.albeesonline.com/blog/2008/01/29/generating-a-web-service-proxy-client/
Eclipse Foundation, 2011. Eclipse WTP home page, available at http://www.eclipse.org/webtools/ws/
OASIS, 2004b. UDDI Executive Overview: Enabling Service-Oriented Architecture, pp. 1-13.
OASIS, 2004c. Using WSDL in a UDDI Registry, Version 2.0.2 - Technical Note, available at http://www.oasis-open.org/committees/uddi-spec/doc/tn/uddi-spec-tc-tn-wsdl-v202-20040631.pdf, pp. 1-43.
Apache, 2010. jUDDI home page, available at http://juddi.apache.org/
Oracle, 2011. JavaEE JSP Specification.



AUTOMATIC WEB TABLE TRANSCODING FOR MOBILE DEVICES BASED ON TABLE CLASSIFICATION

Chichang Jou
Department of Information Management, Tamkang University
151 Ying-Chuan Road, Tamsui, New Taipei City, Taiwan

ABSTRACT

Many techniques have been proposed to improve web browsing experiences on mobile devices by transcoding the original web content. However, the original semantics of web tables tend to be broken in the transcoded results. We capture basic features of web tables from their DOM-tree (Document Object Model tree) semantic information. We propose a new table feature called Cell Extension Direction (CED) to capture the extension direction of cell content as one-directional or bi-directional. CED is computed by checking the difference between the average composite object type (ACOT) of rows and that of columns. These features are used to classify web tables into data tables and layout tables. The classification results, along with the CC/PP configurations of the mobile device, are then utilized to guide the application of the following three transcoding strategies for tables: zooming, transposition, and one-column-view. We demonstrate that the table semantics can be preserved in the transcoding results.

KEYWORDS

Table Classification, Web Table Transcoding, Mobile Computing

1. INTRODUCTION

Along with the fast development of wireless and computing technologies, more and more people use diversified mobile devices to receive emails, browse web sites, and handle business. These devices, such as tablets, smart phones and PDAs, have miscellaneous hardware and software configurations. Since most web pages are designed for the large screens of desktop and notebook computers, on mobile devices most web pages are either distorted or shown with broken images, which hinders their comprehension.
Web tables were originally designed to embed important static data, like time tables and exchange rates, in a two-dimensional structure. Nowadays, they are also frequently used to control the layout of arbitrary content, and to exhibit dynamic database content. Due to changes of trends in web page design and in embedding cell content, previously proposed table features could not correctly classify tables. Although W3C has provided guidelines1 for designing web tables that transform gracefully, the transcoded tables on mobile devices either could not convey the original semantics, or would hinder the comprehension of the semantics. In some cases, even after manipulating the vertical and horizontal scroll bars back and forth, users still could not capture the table semantics.
We use the following example to explain the broken table semantics issue: Fig. 1 is a currency exchange table of four countries and exchange dates. Fig. 2 is the one-column-view of this table on a mobile device. With the loss of relative positions among cells, the correlation between the exchange dates and country names is lost.

1 http://www.w3.org/TR/WAI-WEBCONTENT/wai-pageauth.html


ISBN: 978-972-8939-40-3 © 2011 IADIS

Figure 1. Original web table

Figure 2. One-column-view of the original table

From the above example, the binding degree of relationship among cell content in a table would greatly affect whether the table semantics could be preserved in the transcoded result. If tables could be classified to capture their functionality, then proper transcoding strategies could be selected, not only to fit the hardware and software configurations of each mobile device, but also to preserve the original table semantics. We design and implement a “Web Table Transcoding based on Classification” system, abbreviated as WTTC, to classify web tables, and then transcode these tables based on the classification result and the CC/PP configurations of the mobile devices. In addition to extended table features extracted from the DOM-tree semantic information, we propose a new table feature called Cell Extension Direction (CED) that represents the extension direction of cell content as one-directional or bi-directional. CED is computed based on the difference of average composite object type (ACOT) of rows and ACOT of columns. Based on the above features, web tables are classified into data tables and layout tables. Along with the client’s CC/PP device configurations, the classification result would then be applied to guide the applications of zooming, transposition, and one-column-view transcoding strategies for tables. Our automatic transcoding results would thus be customized for each device, and preserve the structure and semantics of the original tables. The rest of the article is organized as follows: Section 2 is about related work. Section 3 is the system architecture of WTTC. Table classification and the classification results of WTTC are presented in Section 4. Section 5 illustrates the use of the classification results in applying proper transcoding strategies. Section 6 concludes the article by summarizing our achievements and identifying future research directions.

2. RELATED WORK

2.1 Transcoding of Web Tables

Bickmore et al. [2] employed a heuristic planning algorithm and a set of structural page transformations to produce the 'best' looking document for a given display size. For tables that could not be directly sent to the client, their system output one sub-page per table cell. Their table transformation also determined 'navigational sidebar columns' and moved those cells to the end of the list of sub-pages.
Chen et al. [3] developed a page-adaptation technique that split a page into smaller, logically related units that could fit onto a mobile device's screen. The web page could then be adapted to form a two-level hierarchy, with a thumbnail representation at the top level providing a global view and an index to a set of subpages at the bottom level providing detailed information.
Hwang et al. [7] introduced two new heuristics, the generalized outlining transform and the selective elision transform, to preserve web page structures during transcoding. Both exploited layout characteristics of complex web pages. They attempted to preserve the original table structure: in addition to table attributes such as cell width, this transform used syntactic attributes such as font size to decide whether to elide a table cell.
Artail and Raydan [1] introduced a method that applied the device type and screen size to render web pages that fit the display area of the requesting device. It employed CSS elements to reduce the size of the web page's building blocks, and used scripting for hiding and re-showing parts of the page's textual items and for converting tables to text. They applied the zooming strategy to layout tables by adjusting the table width. For data tables, they rearranged cell contents so that the cell content would be presented as pure text in a row.



He et al. [4] proposed a rule-based content adaptation system to facilitate extensible content adaptation for miscellaneous clients. They classified HTML objects into structure, content, and pointer objects. They used fuzzy logic to capture the measurement of distortion caused by zooming and partitioning, and the user satisfaction for each cell and each row, so that the adaptation quality could be used to guide the adaptation decision.
Tajima and Ohnishi [9] introduced the concept of keys and developed a method of automatically discovering attributes and keys in tables. They then proposed three modes for browsing tables: normal mode, record mode, and cell mode. Each mode was provided with interaction features like hiding unnecessary rows and columns.
Most of the above studies tried to adjust the original tables so that users could browse the result with the vertical scroll bar. However, this principle does not fit all cases, and may easily destroy the table's semantics.

2.2 Use of Machine Learning in Table Classification

Hurst [6] modeled tables in terms of geometry, simple hierarchies of strings, and database-like relational structures. He used the simple two-dimensional geometry of tables to denote both the organization of their terms and the relations that hold between these terms. Geometric relationships were employed to express the structure (and, by implication, the meaning) of the table.
Wang and Hu [10,11] applied machine learning techniques to classify web tables into data tables and layout tables. To increase classification precision, they extracted a special feature, called content length consistency (CLC), to reflect whether there is string length consistency among cells in a row or in a column. Similarly, they extracted a feature, called content type consistency (CTC), to reflect the average cumulative dominant content type for cells in a row or in a column.
Wang et al. [12] extracted extended visual and content features to classify web tables into data tables and layout tables. To preserve the structures of data tables, the classification result was then applied to transform data tables into one-column view through zooming and rotation. Their features were mostly based on ratios of various objects, which are not suitable for describing the diversified usage of tables.
Okata and Miura [8] proposed a classifier that checks whether pages include layout-purpose TABLE tags using the ID3 technique. Their result showed that the tags could be classified with attribute values of border, number of rows, number of tags that appear ahead of the TABLE tag, and the nesting of TABLE tags.

3. SYSTEM ARCHITECTURE AND DATA FLOW OF WTTC

[Figure 3 shows the system architecture: a mobile device issues an HTTP request (1) to the WTTC service listening component, which dispatches the request (2) to the web content extraction and parsing component; that component requests (3.1) and receives (3.2) the web content from the internet, and passes the parsed web content and service request (4) to the web table classification component; the parsed content, service request and classification result (5) go to the web table transcoding component, which returns the HTTP response (6) to the device.]

Figure 3. System architecture and data flow of WTTC



The design concept of WTTC is to preserve table semantics by keeping the relative positions of highly correlated cells. We extend an existing proxy server2 to handle the web content extraction, the web content transcoding, and the delivery of the transcoded web content to the mobile devices. Fig. 3 displays the system architecture and the data flow among the modules of WTTC. The numbers indicate the sequence of data flow or message transmissions. The operations of WTTC are as follows:
1) The mobile device sends out a web page request. The client's device information, including hardware platform, software platform, and browser user agent, is embedded inside the request through a CC/PP diff, which is a modified version of the predefined CC/PP profile from the hardware manufacturer. Many protocols have been proposed to enhance the HTTP 1.1 protocol to include a CC/PP profile diff. We adopt CC/PP-ex3 in this framework.
2) The service listening component receives the request and dispatches it to the web content extraction and parsing component.
3) The web content extraction and parsing component obtains the requested content from the internet, uses JTidy4 to reformat the web page into the XHTML format, and then builds the DOM-tree for the page. The parsed web content and service request are then sent to the web table classification component.
4) The web table classification component extracts features of the leaf tables (tables with no nested table inside) from the DOM-tree to perform the classification. The classification result is sent to the web table transcoding component.
5) The web table transcoding component performs the transcoding according to the classification result.
6) The transcoded result is sent back to the mobile device.
Since this article focuses on web table transcoding, we skip detailed descriptions of the service listening component and the web content extraction and parsing component.
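The six-step flow above can be sketched as a small dispatch function. All function and key names below (handle_request, leaf_tables, and the stub components) are illustrative assumptions for this sketch, not the actual WTTC implementation, which builds on a proxy server, JTidy and CC/PP-ex.

```python
# Minimal sketch of the WTTC request pipeline (steps 1-6).
# Every name here is a hypothetical stand-in for the real components.

def handle_request(url, ccpp_profile, fetch, parse, classify, transcode):
    """Run one mobile request through the WTTC-style components."""
    raw_html = fetch(url)              # step 3: web content extraction
    dom = parse(raw_html)              # reformat to XHTML, build the DOM-tree
    out = []
    for table in dom["leaf_tables"]:   # only leaf tables (no nested tables)
        kind = classify(table)         # step 4: "data" or "layout"
        out.append(transcode(table, kind, ccpp_profile))  # step 5
    return out                         # step 6: HTTP response payload

# Stub components standing in for the real extraction/classification logic.
fetch = lambda url: "<html>...</html>"
parse = lambda html: {"leaf_tables": [{"id": "t1"}, {"id": "t2"}]}
classify = lambda t: "data" if t["id"] == "t1" else "layout"
transcode = lambda t, kind, prof: (t["id"], kind, prof["screen_width"])

result = handle_request("http://example.org", {"screen_width": 240},
                        fetch, parse, classify, transcode)
# result pairs each leaf table with its class and the device screen width
```

Passing the components as parameters mirrors the loose coupling between the listening, parsing, classification and transcoding modules in Fig. 3.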

4. WEB TABLE CLASSIFICATION

We adopt the following two table definitions from previous studies [6,11,12,13]:
1. Data table: the content in its cells is highly correlated. Once the relative positions of the cells are changed, the correlation hierarchy is damaged and the information validity is lost. For this type of table, the relative positions of correlated cells should not be changed.
2. Layout table: these tables are considered tables only by their appearance as rendered by a browser. Their functionality is just to give a better layout. Even if the relative positions of their cells are changed, these tables are still readable and their meaning is not lost.
Fig. 1 is an example data table. Fig. 4 is an example layout table in a web page that provides Google search functionality for both the WWW and its own web site. The content of the four cells in the table is independent.

Figure 4. Example layout table

4.1 Table Features for Classification

Many previously proposed features, like the border width of the table, textual content ratio, width of column span, and length of row span, could no longer provide distinguishing power in classifying tables. For example, although texts are still used in many cells for displaying highly correlated information, many cells with correlated content contain composite multimedia objects. Thus, the textual content ratio by itself could not provide the same distinguishing power as before. The table features extracted in this study are listed in Table 1.

2 http://www.cs.technion.ac.il/Labs/Lccn/projects/spring97/project1
3 http://www.w3.org/TR/NOTE-CCPPexchange
4 http://jtidy.sourceforge.net/



Hurst [6] and Wang et al. [12] observed that the textual content ratios for data tables normally were greater than a threshold, while the link content ratios and image content ratios were below a threshold. Our cells-without-span ratio emphasizes the difference in the frequencies of cells with the colspan and rowspan attributes. It was normally lower than a threshold in data tables, so as to emphasize the correlation of cell content, and normally higher than a threshold in layout tables, so as to improve the appearance.
Hurst [6] and Wang et al. [12] checked whether the number of cells with colspan or rowspan is within an interval. We observed that, due to the change in web table usage, these two numbers have large deviations among tables, and it is difficult to find an interval that is suitable for table classification. Thus, we changed the data types of these features to Boolean, to simply indicate whether there exists at least one cell with colspan or rowspan.

Table 1. Table features used in WTTC

Feature name                   | Description                                                                                                  | Sources
Link content ratio             | Whether the ratio of the number of cells containing links to the total number of cells is greater than a threshold   | Hurst [6], Wang et al. [12]
Textual content ratio          | Whether the ratio of the number of cells containing texts to the total number of cells is greater than a threshold   | Hurst [6], Wang et al. [12]
Image content ratio            | Whether the ratio of the number of cells containing images to the total number of cells is greater than a threshold  | Hurst [6], Wang et al. [12]
Cells-without-span ratio       | Whether the ratio of the number of cells without the span attribute to the total number of cells is greater than a threshold | This research
Colspan existence              | Is there at least one cell with the colspan attribute?                                                        | This research
Rowspan existence              | Is there at least one cell with the rowspan attribute?                                                        | This research
CED (Cell Extension Direction) | "one-directional" or "bi-directional"                                                                         | This research

Table 2. Composite cell object types and their values

Composite Cell Object Type | Type Value
Non-numeric string         | 1
Numeric string             | 2
Image                      | 3
Link                       | 4
String+image               | 5
String+link                | 6
Image+link                 | 7
String+image+link          | 8
Form                       | 9
Other                      | 10
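The composite-type encoding can be sketched as a small lookup. The boolean content flags passed in (has_string, numeric, has_image, has_link, is_form) are assumed inputs; deriving them from an actual DOM cell node is not shown here.

```python
# Sketch of the composite cell object type encoding of Table 2.
# The boolean flags are hypothetical inputs describing a cell's content.

def composite_type(has_string=False, numeric=False, has_image=False,
                   has_link=False, is_form=False):
    """Return the type value (1-10) for a cell's composite content."""
    if is_form:
        return 9
    key = (has_string, has_image, has_link)
    table = {
        (True,  False, False): 2 if numeric else 1,  # numeric / non-numeric string
        (False, True,  False): 3,                    # image
        (False, False, True):  4,                    # link
        (True,  True,  False): 5,                    # string+image
        (True,  False, True):  6,                    # string+link
        (False, True,  True):  7,                    # image+link
        (True,  True,  True):  8,                    # string+image+link
    }
    return table.get(key, 10)                        # other
```

For example, a cell holding a country name hyperlinked to a detail page would encode as string+link, type value 6.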

We extend CTC [10,11] to compute the Average Composite Object Type of rows and columns (ACOTrow and ACOTcol) of a table, respectively. These two values are then used to determine a new table feature called Cell Extension Direction (CED). The steps for determining CED through computing ACOTrow and ACOTcol of a table are summarized as follows:
1) Record the number of rows, the number of columns, the colspan of cells, and the rowspan of cells in the table. Based on this information, extract the Maximal Continuous Block without rowspan and colspan (MCB) of the table.
2) For the MCB, compute ACOTrow and ACOTcol.
3) CED of the table is then determined based on the difference between ACOTrow and ACOTcol.
Details of the above three steps are as follows. We define six basic object types of cell content: numeric string, non-numeric string, image, link, form, and other. Since objects in a cell may be of different object types, we extend the three basic object types string, image, and link to their combinations. In total, we thus have ten composite cell object types, which are listed in Table 2.
Suppose there are r rows in the table. We use Ri (1 ≤ i ≤ r) to denote the i-th row, and RCi the number of cells in Ri. For 1 ≤ j ≤ RCi, we use RCi,j to denote the j-th cell (from left to right) in Ri. RCi,j,col and RCi,j,row denote the existence of the colspan and rowspan attributes for RCi,j, respectively. To obtain the MCB of the table, we find the smallest row number (i) and then the largest number (ni) that satisfy the following four conditions:
1) RCi ≥ 2
2) RCi = RCi+1 = RCi+2 = … = RCni
3) for all 1 ≤ j ≤ RCi, RCi,j,col = 0 and RCi,j,row = 0
4) i + 1 ≤ ni ≤ r
Then, MCB is the block of rows from Ri to Rni. Condition 1 ensures that there is more than one cell in a row. Condition 2 ensures that the numbers of cells for all rows in MCB are equal. Condition 3 ensures that there is no colspan or rowspan for any cell in MCB.
Condition 4 ensures that the rows in MCB extend from top to bottom. Thus, the number of rows in MCB is ni − i + 1, and the number of columns in MCB is RCi. Note that some tables may not have a pair of i and ni satisfying all four conditions.
Suppose there are rM rows in the MCB of a table. We use Ri (1 ≤ i ≤ rM) to denote the i-th row of the MCB, and RCi the number of cells in Ri. Let T denote the set of the 10 composite cell object types. For all t in T, we use Nit to denote the number of cells in row Ri with type t. We then compute the maximal composite object type for row Ri, denoted MCOTi, as follows:

    MCOT_i = max_{t ∈ T}(N_i^t) / RC_i                                     (1)

The average composite object type of rows for MCB is computed as follows:

    ACOT_row = (1 / r_M) * Σ_{i=1}^{r_M} MCOT_i                            (2)

The same procedure is applied to compute ACOT_col. For a small threshold value δ, CED is then determined from the difference between ACOT_row and ACOT_col:

    CED = "bi-directional",  if |ACOT_row − ACOT_col| ≤ δ, or if there is no MCB
          "one-directional", if |ACOT_row − ACOT_col| > δ                  (3)
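The three steps can be sketched on a simplified table model in which each cell is a (type_value, has_colspan, has_rowspan) triple using the Table 2 encoding. This is an illustrative reading of equations (1)-(3), not the WTTC source code, and the MCB search is a straightforward interpretation of the four conditions.

```python
# Sketch of MCB extraction and CED computation (equations 1-3).
# Cells are (type_value, has_colspan, has_rowspan) triples; all
# function names are illustrative.
from collections import Counter

def find_mcb(rows):
    """Maximal Continuous Block: >= 2 rows of equal width >= 2, no spans."""
    def clean(row):
        return len(row) >= 2 and all(not c and not r for _, c, r in row)
    i = 0
    while i < len(rows):
        if clean(rows[i]):
            ni = i
            while (ni + 1 < len(rows) and clean(rows[ni + 1])
                   and len(rows[ni + 1]) == len(rows[i])):
                ni += 1
            if ni > i:                 # condition 4: at least two rows
                return rows[i:ni + 1]
            i = ni + 1
        else:
            i += 1
    return None                        # no MCB exists

def acot(rows):
    """Average of per-row MCOT values, equations (1) and (2)."""
    mcots = []
    for row in rows:
        counts = Counter(t for t, _, _ in row)
        mcots.append(max(counts.values()) / len(row))   # equation (1)
    return sum(mcots) / len(mcots)                      # equation (2)

def ced(rows, delta=0.0):
    """Equation (3): bi-directional unless the row/column ACOTs differ."""
    mcb = find_mcb(rows)
    if mcb is None:
        return "bi-directional"
    cols = list(zip(*mcb))   # transpose the MCB to score the columns
    diff = abs(acot(mcb) - acot(cols))
    return "bi-directional" if diff <= delta else "one-directional"
```

A table whose rows are each homogeneous but whose columns mix types gets ACOT_row = 1 and a lower ACOT_col, hence "one-directional".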

4.2 Table Classification

We follow the method in [10,11,12] to train and test our proposed table classification in the business and news directory of the DMOZ open directory project. We use the following keywords: "exchange rate", "score", "sport", "finance", "stock", "weather", "report", "shopping", "table", "results", and "value" to extract 1502 unique tables from 236 pages of 105 web sites. These 1502 tables are manually classified into 1198 (79.76%) layout tables and 304 (20.24%) data tables. To decide the thresholds for the four table features with ratios, five-fold cross-validations are applied to obtain the thresholds in Table 3. We use the following two classical classification techniques: Bayesian and ID3 decision tree. To illustrate the importance of the table feature CED, we also perform classifications without CED. Treating data tables as the desired class, with δ in (3) set to 0, the precision, recall, and F-measure values of the classification results are listed in Table 4.

Table 3. Thresholds for table features with ratios

Feature name              | Classification with CED | Classification without CED
Link content ratio        | 50%                     | 50%
Textual content ratio     | 20%                     | 80%
Image content ratio       | 80%                     | 40%
Cells-without-span ratio  | 70%                     | 80%

Table 4. Classification effectiveness

               | Bayesian                  | ID3
               | without CED | with CED    | without CED | with CED
Precision      | 0.4604      | 0.9801      | 0.7576      | 0.9114
Recall         | 0.8366      | 0.9673      | 0.1634      | 0.9412
F-measure      | 0.5939      | 0.9737      | 0.2688      | 0.9261

From Table 4, it is easy to see that the inclusion of CED boosts the precision and recall tremendously.
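The way an ID3-style classifier would favor CED can be illustrated with an information-gain computation over boolean features. The four-sample dataset below is made up purely for demonstration; the paper's experiments use 1502 tables collected from DMOZ, and the feature names are illustrative.

```python
# Illustrative ID3-style feature scoring over boolean table features.
# The tiny dataset and feature names are hypothetical.
import math

def entropy(labels):
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def info_gain(samples, labels, feature):
    """Information gain of splitting on one boolean feature."""
    gain = entropy(labels)
    for value in (True, False):
        idx = [i for i, s in enumerate(samples) if s[feature] is value]
        if idx:
            gain -= len(idx) / len(samples) * entropy([labels[i] for i in idx])
    return gain

samples = [
    {"ced_one_directional": True,  "colspan_existence": False},
    {"ced_one_directional": True,  "colspan_existence": True},
    {"ced_one_directional": False, "colspan_existence": False},
    {"ced_one_directional": False, "colspan_existence": True},
]
labels = ["data", "data", "layout", "layout"]
best = max(samples[0], key=lambda f: info_gain(samples, labels, f))
# CED separates the two classes perfectly here, so it maximizes the gain
```

A feature that splits data tables cleanly from layout tables, as CED does in this toy set, is chosen first by ID3; a feature uncorrelated with the class contributes zero gain.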

5. WEB TABLE TRANSCODING

In WTTC, the web table transcoding component calculates the table width based on font size information, the number of characters in a string, image metadata, and the CC/PP of the mobile device. It then performs the transcoding based on the classification result. For layout tables, we use one-column-view and zooming as the main strategies, so that the whole table is preserved together. For data tables, we use zooming and transposition as the main strategies.
When the calculated table width exceeds the screen width of the mobile device, the zooming strategy is applied. Its procedures include (1) decreasing the distances between cell content and cell borders, and (2) shrinking the images inside cells.
The procedures for the one-column-view strategy adjust tables by adopting the sequential placement methods [6,16] used to place semantic blocks sequentially for web pages on mobile devices. We adjust the <td> and <tr> positions in the <table> node of the DOM-tree, so that in the transcoded result each row has only one <td> node. The one-column-view strategy can make the transcoded result large enough to prevent the unclearness caused by too much shrinking.
When the ratio of the width to the height of a data table exceeds a threshold (1.5 in WTTC), the transposition strategy is applied. Its procedures are similar to those of the one-column-view strategy: by adjusting the <td> or <th> tags in the <tr> nodes of its DOM-tree, the rows of the original table are transposed into their corresponding positions in the transcoded columns. For example, we would transpose the currency exchange table in Figure 5(a) into Figure 5(b).
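The transposition step can be sketched on a row-major grid of cell contents with illustrative values; the real component rearranges the corresponding tr/td nodes of the DOM-tree rather than Python lists.

```python
# Sketch of the transposition strategy on a row-major grid.
# The exchange-rate values below are illustrative only.

def transpose_table(rows):
    """Rows of the original table become columns of the result."""
    return [list(col) for col in zip(*rows)]

exchange = [
    ["Date", "2011-07-22", "2011-07-23"],
    ["USD",  "28.9",       "29.0"],
    ["EUR",  "41.5",       "41.6"],
]
narrow = transpose_table(exchange)
# narrow[0] == ["Date", "USD", "EUR"]: the original first column
# becomes the header row, so each date now occupies one narrow row
```

Because transposition is its own inverse, applying it twice restores the original table, which is why cell correlations survive the transform.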

Figure 5. Example of the transposition strategy: (a) original table; (b) transposed table

Figure 6. Transcoding example for layout tables: (a) original table; (b) transposed table

5.1 Transcoding Layout Tables

For layout tables, to keep a flexible zooming effect, the one-column-view strategy is applied first. Then the zooming strategy is applied to every cell. Figure 6 is a transcoding example for layout tables. With the flexibility in the shrinking ratio, this handling prevents generating unreadably tiny transcoded results, especially for cases with large cell content in a row.

5.2 Transcoding Data Tables

Figure 7. Transcoding example for data tables: (a) original table; (b) transposed table

Figure 8. Transcoding example with column splitting: (a) example large table; (b) upper transcoded result; (c) lower transcoded result

For small data tables with ACOTrow − ACOTcol > δ, we transpose the table first. If the resulting table is too wide, the zooming strategy is then applied. For small data tables with ACOTcol − ACOTrow > δ, since the table is extended in the column direction, the zooming strategy alone is enough. Figure 7 is a transcoding example for a data table with ACOTrow − ACOTcol > δ. Since it is extended in the row direction, there is no span attribute for its cells in the two rows. Since both the image and the name are accompanied by a link, its link content ratio is 100%. WTTC performs the transposition strategy to obtain the one-column-view without losing the semantics of the original table.
For large data tables with CED value "bi-directional" or ACOTcol − ACOTrow > δ, the above strategies could not achieve a satisfying result. Normally the number of columns in these large data tables is much larger than the number of rows. In the transcoding, a splitting of columns is performed first. The first column of the original table is treated as identifiers and copied to all split fragments. Finally, a zooming strategy is applied to each fragment so that the transcoded results fit the screen width. For large data tables with ACOTrow − ACOTcol > δ, the general transcoding strategy is appropriate. Figure 8 is a transcoding example of a table with 77 columns and 7 rows, in which a splitting procedure is applied before transcoding.
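The splitting step can be sketched as follows, assuming a row-major grid in which column 0 holds the identifiers. Choosing three data columns per fragment is an arbitrary stand-in for the number of columns the device's CC/PP screen width would permit.

```python
# Sketch of the column-splitting step for large data tables: the first
# (identifier) column is copied into every fragment.

def split_columns(rows, data_cols_per_fragment):
    """Split a wide row-major table, repeating the identifier column."""
    fragments = []
    n_cols = len(rows[0])
    for start in range(1, n_cols, data_cols_per_fragment):
        stop = min(start + data_cols_per_fragment, n_cols)
        fragments.append([[row[0]] + row[start:stop] for row in rows])
    return fragments

wide = [["city", "d1", "d2", "d3", "d4", "d5"],
        ["Rome", 1, 2, 3, 4, 5]]
parts = split_columns(wide, 3)
# parts[0] carries city with d1..d3, parts[1] carries city with d4..d5
```

Repeating the identifier column in every fragment is what keeps each split fragment interpretable on its own, before zooming fits it to the screen.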

6. CONCLUSIONS AND FUTURE WORK

We designed and implemented a web table transcoding system based on table classification (WTTC) so that the semantics of the original table could be preserved. More specifically, we proposed a new table feature, Cell Extension Direction (CED), which represents the direction of correlated cell content. CED is determined by the difference between the average composite object type for rows and that for columns. We demonstrated the power of CED in classifying web tables into data and layout tables. The classification result and the CC/PP configurations of the mobile device were then used to apply the one-column-view, zooming, and transposition transcoding strategies customized for the mobile devices. For large data tables, we use the transposition and zooming strategies. For layout tables, we use the one-column-view and zooming strategies. The flexibility provided by WTTC in the zooming ratios helps improve users' browsing experience.
This methodology focuses on leaf tables; nested tables are treated as layout tables. More investigation in verifying this handling is ongoing. Besides the <table> tag, we would like to extend this methodology to other HTML elements, such as <div> tags and CSS files. <div> tags are becoming popular in rendering tabular data, and are used for many miscellaneous purposes. Without clear semantics from the cell content, the semantics of the <div> tag needs more investigation. Meanwhile, CSS files are commonly used in many web pages for presenting consistent styles. Features obtained from CSS files would improve the classification result. More interestingly, CSS files themselves may also need to be transcoded.

REFERENCES

[1] Artail, H. & Raydan, M. (2005). Device-aware Desktop Web Page Transformation for Rendering on Handhelds. Personal Ubiquitous Computing, 9(6), 368-380.
[2] Bickmore, T., Girgensohn, A., & Sullivan, J. (1999). Web Page Filtering and Re-authoring for Mobile Users. The Computer Journal, 42(6), 334-346.
[3] Chen, Y., Xie, X., Ma, W. & Zhang, H. (2005). Adapting Web Pages for Small-Screen Devices. IEEE Internet Computing, 9(1), 50-56.
[4] He, J., Gao, T., Hao, W. & Yen, I.L. (2007). A Flexible Content Adaptation System Using a Rule-Based Approach. IEEE Transactions on Knowledge and Data Engineering, 19(1), 127-140.
[5] Hurst, M. (2002). Classifying Table Elements in HTML. Proceedings of the 11th International World Wide Web Conference, 7-11.
[6] Hurst, M. (2006). Towards a Theory of Tables. International Journal on Document Analysis and Recognition, 8(2-3), 123-131.
[7] Hwang, Y., Kim, J. & Seo, E. (2003). Structure-Aware Web Transcoding for Mobile Devices. IEEE Internet Computing, 7(5), 14-21.
[8] Okata, H. & Miura, T. (2007). Detection of Layout-Purpose TABLE Tags Based on Machine Learning. Lecture Notes in Computer Science 4556, Part 3, 116-123.
[9] Tajima, K. & Ohnishi, K. (2008). Browsing Large HTML Tables on Small Screens. Proceedings of the 21st Annual ACM Symposium on User Interface Software and Technology, 259-268.
[10] Wang, Y. & Hu, J. (2002). Detecting Tables in HTML Documents. Proceedings of the 5th International Workshop on Document Analysis Systems, 249-260.
[11] Wang, Y. & Hu, J. (2002). A Machine Learning Based Approach for Table Detection on the Web. Proceedings of the 11th International World Wide Web Conference, 242-250.
[12] Wang, C., Xie, X., Wang, W. & Ma, W.Y. (2004). Improving Web Browsing on Small Devices Based on Table Classification. Proceedings of the 12th International World Wide Web Conference, 20-24.
[13] Wu, J.J. & Yang, J.H. (2007). Using Content Analysis Technique to Enhance Content Adaptation System. Proceedings of the 9th IEEE International Symposium on Multimedia Workshops, 23-28.



FLEXIBLY MANAGED USER INTERFACES FOR MOBILE APPLICATIONS

Pekka Sillberg, Janne Raitaniemi, Petri Rantanen, Jari Soini and Jari Leppäniemi
Tampere University of Technology, Pori
P.O. Box 300, FI-28101 Pori, Finland

ABSTRACT

Fluent and impressive user interfaces (UI) have become more important, especially in mobile applications. This creates challenges for updating and modernizing UIs and, as a whole, can make software management and maintenance more difficult. This paper presents an approach to distributing UIs created with declarative languages based on technologies such as the Extensible Markup Language (XML) and JavaScript, which can be used for rapidly designing and implementing graphical content in mobile and desktop applications. The main focus of this paper is on the possibility of using adapter software and a centralized management system to update UIs dynamically. The proposed approach is explained using two example cases. The presented cases describe how to retrieve a UI embedded with content, and show how the same approach can be used to distribute software updates.

KEYWORDS

Adapter software, Adaptive user interfaces, Declarative languages.

1. INTRODUCTION

Traditionally, software has been updated either by installing completely new versions or by installing patch files. Many current operating systems include automatic updating tools (e.g. Windows Update and MeeGo's Zypper), but often these tools are limited to specific programs and require some form of user interaction to be useful. In practice, most applications need to be updated manually and individually, which can cause a lot of extra work and sometimes difficulties for an average user. If the updating tools are automatic, they usually rely on scheduled updates, performed for example each day or once a week. From the user's point of view, a better and more convenient approach would be run-time, on-demand software updates. In this kind of approach – or model – the software components are updated just-in-time and the user always has the most recent version of the software.
The starting point of this study is the ongoing MOP (Mukautuvat Ohjelmistopalvelut, in English: Adaptive Software Services) research project, coordinated by the Tampere University of Technology Pori Unit (TUT 2011). The general goal of this two-year research project is to research software technologies that can be useful for searching, sorting and processing information, and for presenting data using context-based adaptive user interfaces. The project also examines distribution and deployment solutions in the scope of user interfaces. The usability and maturity of the studied technologies will be validated and tested by creating a prototype system for flexible adaptive user interface (UI) distribution. The project is funded by the Finnish Funding Agency for Technology and Innovation (Tekes 2011), a consortium of Finnish ICT companies, and the City of Pori. In this paper we introduce the preliminary results and the research topics of our ongoing project.
In the scope of this paper we present the possibilities offered by adapter software to display user interfaces created with declarative languages. The paper presents a comparison between native applications, web applications (excluding plug-ins) and adapter software, but the main focus is on the latter two; browser plug-ins are not included in our study. In the context of this paper, native applications are any – binary or bytecode – applications created with the primary programming language of the target platform; web applications are any Internet content that can be displayed on a normal web browser; and adapter software is a middleware application that has access to device capabilities and accessories, and that can be used to execute dynamic content offered by a remote service. The content shown by adapter software can be made with web technologies (e.g. JavaScript), but in the scope of this paper only declarative languages are considered. The aim of the adapter software is to combine the best sides of both the native and web applications, while offering high platform independence in mobile environments. This means that the same application could be used on different mobile operating systems such as iOS, Android and Symbian.
The structure of the paper is as follows: in Section 2 we give a short introduction to the technologies involved in this study and make a comparison between web applications and the adapter software approach. The section provides insight into the advantages and improvements that the adapter software approach can offer over traditional web applications. Section 3 illustrates the possibilities of adapter software by describing two example cases. The first case illustrates problems related to binding data to user interface elements and proposes some solutions offered by declarative languages. The second example describes how run-time updates can be used to keep the software up to date without any user interaction. In the final sections (Sections 4 and 5), we summarize the contents of this paper and present the next steps we are going to take in our research project.

2. FLEXIBLE USER INTERFACES

Typically, user interfaces are designed for very specific tasks. For example, an instant messaging client has a user interface optimized for reading incoming messages and offers some form of input field for sending messages to other users. Another example is an image viewer that displays pictures and may offer various image-editing options. Both programs work well for their own tasks, but neither can perform the other's. This task-specific approach simplifies software design and in turn lowers development costs, and in many cases it works well enough. From the end-user perspective, however, it can be problematic: what if the user wants to receive a picture from a friend through the instant messaging client? The picture can be received, but another program is needed to view it. If the user has no program that can open the received picture, either the picture cannot be viewed at all or the user has to find a compatible image viewer. The problem is obvious: one program cannot transform into another kind of program depending on the context.
The World Wide Web (WWW) offers one solution to this problem. There are many well-known web-based alternatives (e.g. Google Inc. 2011a; Google Inc. 2011b; Microsoft Corporation 2011a) that can be used when desktop versions of the software are not available. The advantages, disadvantages, problems, and considerations of web applications have been discussed at length in previous studies (e.g. Siponen and Oinas-Kukkonen 2007; Wartel 2008) and this paper does not delve deeper into those matters. In the scope of this paper, a more intriguing aspect is the role of the web browser in web applications.
The browser can be thought of as a single application that offers the functionality of multiple applications by working as a mere frame filled with whatever content is required. Through the standardization of protocols and technologies, independent browsers on different platforms enable the cross-platform compatibility of web applications. Despite its multipurpose use, the browser is not without limitations; these are discussed in subsection 2.2.

2.1 On Declarative UI Languages

This section gives a brief overview of the current state of declarative UI languages. Most of these languages are based on Extensible Markup Language (XML), like XML User Interface Language (XUL) (Mozilla Foundation 2011) and Extensible Application Markup Language (XAML) (Microsoft Corporation 2011b). There are a few exceptions, such as Qt Meta-Object Language (QML) (Nokia Corporation 2010, Nokia Corporation 2011a), which is based on JavaScript. XUL, still quite popular, has been around since the end of the 1990s, while XAML was released in 2008 and QML, the newest of the three, appeared in 2009. Their common factor is that they specify UIs using text-based formats, which makes modification and deployment faster because there is no need to compile the whole application as in traditional software development. Work on similar techniques has also been done in User Interface Markup Language (UIML), which has currently reached Committee Draft (Version 4.0, 2008) status in the Organization for the Advancement

of Structured Information Standards (OASIS) standardization process (Abrams et al. 1999, OASIS 2008, UIML.org 2009). Research on eXtensible Interface Markup Language (XIML) (Puerta and Eisenstein 2002, XIML Forum 2004) has been carried out, but further development appears to have stopped, as the latest update to its specification dates from 2004. This listing of UI languages is not meant to be comprehensive; the main point is that much work has been done in this area and no clear leader has emerged. Studies (e.g. Bishop and Horspool 2006; Mitrovic and Mena 2002; Mitrovic et al. 2004) have given examples of using XML-based UI languages in simple applications and have noted that a programmer can embed event-handling code (JavaScript in XUL, .NET in XAML) in the UI document. In QML, the same is possible with JavaScript (Nokia Corporation 2010).
With Extensible Stylesheet Language Transformations (XSLT) (W3C 1999), it is possible to reformat XML documents into other notations, even ones that are not XML. If the difference is purely syntactic, writing an XSL transformation is quite straightforward, and as long as the original XML document carries enough information, a correct transformation is possible. Problems emerge when the target notation requires more information than is available in the original and the missing information cannot be “guessed”. UIs share many common components (labels, text edits, combo boxes etc.) and properties (size, name, value etc.), so a working XSLT should be feasible, at least when the UIs are not too complex. For example, Mitrovic and Mena (2002) used XSLT to transform an XUL document into Java Swing and HyperText Markup Language (HTML). With this in mind, one could write a transformation from XUL into QML. Our approach to the topic is to have adapter software on the target device with the capability to interpret the UI file.
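The kind of purely syntactic mapping discussed here can be made concrete with a small sketch. The following Python fragment is a hypothetical illustration (not part of the systems cited above): it applies per-element translation rules – the same rules an XSL transformation would express as template matches – to a minimal XUL-like fragment, producing HTML form markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-element translation rules; a real XSLT stylesheet would
# express the same mapping as template matches on element names.
XUL_TO_HTML = {
    "label": lambda e: "<label>%s</label>" % e.get("value", ""),
    "textbox": lambda e: '<input type="text" id="%s"/>' % e.get("id", ""),
    "button": lambda e: '<button id="%s">%s</button>'
                        % (e.get("id", ""), e.get("label", "")),
}

def xul_to_html(xul_source):
    """Translate a flat XUL-like fragment into an HTML form body."""
    root = ET.fromstring(xul_source)
    parts = []
    for element in root:
        convert = XUL_TO_HTML.get(element.tag)
        if convert is not None:  # elements with no known mapping are skipped
            parts.append(convert(element))
    return "\n".join(parts)

xul = """<vbox>
  <label value="Name"/>
  <textbox id="name"/>
  <button id="submit" label="Submit"/>
</vbox>"""
print(xul_to_html(xul))
```

The dictionary of rules works only because the syntactic difference is small; as the section notes, the approach breaks down as soon as the target notation needs information the source document does not contain.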
Our adapter software is to UI files what a WWW browser is to web pages. The adapter can, when needed, update its UIs and extend its functionality by fetching newer specification files from a system that manages the UIs and their distribution. The adapter software can be built in many ways and does not have to rely on any particular programming language or declarative UI format; as mentioned before, notation may be transformed into the required format on the system managing the UIs. The goal is software that is both platform- and toolkit-independent. With this approach, the final user interfaces have the target device’s native look and feel.

2.2 Web Browsers and Device Capabilities

The original purpose of a web browser was to show content downloaded over the Hypertext Transfer Protocol (HTTP). Downloaded content (e.g. web applications) is typically sandboxed inside the browser and has no direct access to the Application Programming Interfaces (APIs) offered by the operating system. For this reason, downloaded content cannot access device accessory or sensor information. This limitation can be bypassed with a special browser plug-in, a native application, or adapter software that can execute code written in a declarative language (see subsection 2.1); another method is to use, for example, Web Runtime (WRT) applications (Nokia Corporation 2011b). Any of these methods can provide access to APIs that expose special device capabilities and functionality not normally available. Access to device capabilities gives the application developer a wider range of possibilities and enables richer application content and services compared to web applications.
A special case of adapter software is what is known as a WebKit hybrid (Rosenthal 2010; The WebKit Open Source Project 2011). Depending on the software framework used to create the adapter software, it may be possible to run normal web applications and show web content without writing additional code. Some frameworks – like Qt (Nokia Corporation 2011c) – provide built-in APIs for running all content supported by the WebKit engine inside native applications. In terms of functionality, WebKit hybrids sit somewhere between web applications running in a web browser and adapter software.
They do not offer the same access to the underlying operating system or device capabilities as other adapter software, but depending on the case they may offer easier and faster application development, especially when the developers are already familiar with popular web technologies such as HTML and JavaScript.
Using any of the aforementioned methods, it is possible to create adapter software that retrieves device-specific information and combines it with web content or with a user interface created in a declarative language. Figure 1 (below) shows how XUL can be extended to access device data. In the example code, the textbox component is extended with a call to a custom function provided by the adapter software.


ISBN: 978-972-8939-40-3 © 2011 IADIS

Figure 1. Example code for accessing device data.

The type attribute is passed to the getDeviceData function, which returns device-specific data based on the attribute's value – in this case position information and temperature. These results can then be included, for example, in a response sent to a server. The getDeviceData function can also include error handling for cases in which the device is unable to resolve the required information. In a client/server system, the client device can additionally inform the server of its capabilities when initially retrieving the user interface (the XUL code); this way the server can profile the client and omit information requests that cannot be executed on the device.
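One possible adapter-side design for such a function can be sketched as follows. This is our own illustrative assumption, not the paper's actual implementation: capabilities are looked up by the type value passed from the declarative UI code, and an unsupported type yields an error marker instead of raising, mirroring the graceful error handling described above.

```python
# Hypothetical adapter-side dispatcher for getDeviceData-style requests.

def read_position():
    # Stub: a real adapter would query the platform positioning API here.
    return {"lat": 61.48, "lon": 21.79}

def read_temperature():
    # Stub: a real adapter would read an attached temperature sensor here.
    return {"celsius": 21.5}

CAPABILITIES = {
    "position": read_position,
    "temperature": read_temperature,
}

def get_device_data(data_type):
    reader = CAPABILITIES.get(data_type)
    if reader is None:
        # Unsupported capability: report it rather than failing the whole form.
        return {"error": "unsupported capability: %s" % data_type}
    return reader()

def capability_profile():
    # The adapter can advertise this list to the server, which can then omit
    # information requests the device cannot fulfil.
    return sorted(CAPABILITIES)

print(get_device_data("position"))
print(get_device_data("compass"))   # gracefully reports the missing capability
print(capability_profile())
```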

2.3 Software Update and Version Management

Ideally, web application version management is simple because the web browser downloads the newest version every time it connects to the server; in practice, caches and other browser optimizations may affect the retrieval of updated content. In the ideal case, client devices always have the most recent version, which helps the software developer, eases version management, and simplifies system design, as there is no need to worry about clients running several different versions of the same application.
Native applications installed on the operating system are usually not aware of whether they are the most recent version. Updates can be integrated into the application itself, handled by the operating system, or performed through dedicated software management such as mobile device management. Note that even when automatic updates are available, the user may choose not to use them. So, unlike web applications, native applications may lag behind the newest version offered by the service, which can complicate version management. The advantage of native applications is that previously downloaded versions can be used offline.
In both cases – web and native application – the lack of a network connection may pose a problem. A web browser could use content from its cache but would be unable to send data back to the service, risking data loss; a native application can use the older version and deliver the data later, once the network connection has been re-established, but the data might then be in an outdated format. Some modern technologies, such as HTML5, provide Web Storage APIs to mitigate the risk of data loss in web applications (W3C 2011; W3C 2009).
Adapter software offers the possibility to work without a network connection, but in that case the issues associated with native applications, such as offline data storage and outdated data formats, must be taken into consideration. Adapter software can also take advantage of the distribution methods and technologies used with web applications, making content delivery and version management more flexible: the files that define the content shown by the adapter software can be distributed by the same server solutions used for web applications, which reduces the need to develop new content management systems from scratch.

3. EXAMPLE CASES

In this chapter we introduce two example cases that illustrate the possibilities of adapter software. The first illustrates problems related to binding data to UI elements and proposes solutions offered by declarative languages. The second describes how run-time updates can keep the software at its most up-to-date version without any user interaction. As this is an ongoing study, no definitive results are available yet; these examples should be considered preliminary test cases for the actual prototype system currently under development.


IADIS International Conferences Web Based Communities 2011, Collaborative Technologies 2011 and Internet Applications and Research 2011

3.1 Example 1: Data Binding in the Scope of the User Interface

Figure 2 illustrates a fairly common event on the web. The user wants to access content provided by a service, which can be anything from a simple news feed to a complex social media portal. The service in turn requires additional information from the user before it can deliver the requested content; after the user has provided the necessary information, the actual content is delivered.

Figure 2. Transaction between user and service.

For the most part the transaction is quite simple, and the same techniques and protocols can be used whether the case is implemented with web applications or with adapter software. The real difference appears when additional information is required. On the Internet this could be done, for example, by showing an HTML form in the user's web browser; after filling in the required information, the user submits the form back to the server. For basic information this works well, but things get more complicated when the required information is something the user may not directly know. One example is the user's position – very few people know their exact geographical coordinates. Accessing this kind of device-specific information is often problematic, if not impossible, from a web browser. The Geolocation API (W3C 2010) could be used, but not all web browsers support it; some third-party AJAX or JavaScript API might also serve, but again not in all browsers. The problem is that it is difficult to know whether the user's web browser supports the required functionality. With adapter software this problem does not exist, because it is known exactly which capabilities are available.
Compared to traditional desktop applications, positioning can be solved in the same way as with adapter software. The problem with desktop applications arises when the information requested by the service needs to change – either through customization for individual users or groups, or through service updates. In that case the form – or more precisely the user interface – shown to the user has to be adapted either at run-time or by first updating the client software. The latter option is sufficient if the interface rarely needs updating, but provides no real flexibility for interface manipulation.
Here a simple information form is used, but the case would be considerably more complex if the content itself were adapted, or if the content were provided by a third-party developer or service provider.
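The capability profiling mentioned above – the server omitting information requests the client device cannot answer – can be sketched in a few lines. The field names and structure below are hypothetical illustrations, not part of the described prototype.

```python
# Hypothetical server-side request profiling: when the client first fetches
# the UI it reports its capabilities, and the server drops any
# additional-information fields the device could not answer anyway.

REQUESTED_FIELDS = [
    {"name": "email", "capability": None},           # plain user input
    {"name": "position", "capability": "position"},  # needs device support
    {"name": "heading", "capability": "compass"},    # needs device support
]

def profile_request(fields, client_capabilities):
    """Keep plain-input fields plus fields the client can actually resolve."""
    supported = set(client_capabilities)
    return [f["name"] for f in fields
            if f["capability"] is None or f["capability"] in supported]

# A phone with positioning but no compass gets a trimmed-down form.
print(profile_request(REQUESTED_FIELDS, ["position"]))
```

The trimmed field list can then drive which UI elements the generated declarative form contains, so the user is never asked for information the device cannot supply.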

Figure 3. XML fragment and the corresponding user interface.

Figures 3 and 4 illustrate how a traditional user interface can be customized by using declarative languages. Figure 3 (above) shows an XML fragment and corresponding UI created with Qt. The example code used in Figure 4 (below) is XUL, but any of the languages mentioned previously (subsection 2.1.) could be used just as well.


Figure 3 shows the main problem with customization: the data – in this case the short XML fragment on the left – can easily be modified by adding a few lines of code, but the same is not true for the user interface on the right. The XML elements (Figure 3, left) must be pre-mapped to specific user interface elements (Figure 3, right) before they can be shown, and there are currently no widely used standards for mapping the elements of raw XML data. In Figure 3, the option elements could be renamed radio elements to better reflect the UI components the data is bound to and to create the binding for the radio buttons. The same kind of binding must be done individually for every data type, which can require a great deal of design and implementation work (Abrams et al. 1999; McLaughlin and Edelson 2006). It also means that even a small change or extension to the data format – such as adding another option element to Figure 3 or replacing the radio buttons with check boxes – may break the application's user interface. The more complex the data, the more difficult it becomes to create the data mapping for the user interface. It would of course be possible to update the application every time modifications are made, but this is inconvenient for the user.
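The fragility of such pre-mapped bindings can be demonstrated with a small sketch (hypothetical code of our own, not the paper's implementation): a hard-coded mapping renders the option elements of Figure 3 correctly, but silently drops any element type it was not written for – exactly what happens when the data format later evolves.

```python
import xml.etree.ElementTree as ET

def render_static(xml_source):
    """Pre-mapped binding: only element types known at build time survive."""
    widgets = []
    for element in ET.fromstring(xml_source):
        if element.tag == "option":            # the only mapping that exists
            widgets.append(("radiobutton", element.text))
        # any other tag (e.g. a later-added <checkbox>) is silently dropped
    return widgets

original = "<choices><option>Yes</option><option>No</option></choices>"
extended = ("<choices><option>Yes</option><option>No</option>"
            "<checkbox>Remember me</checkbox></choices>")

print(render_static(original))   # both options appear as radio buttons
print(render_static(extended))   # the new checkbox is lost
```

A declarative UI description, by contrast, carries the widget types with the data, so the adapter needs no per-format mapping table at all.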

Figure 4. User interface declared by XUL code.

Figure 4 shows how the same data from Figure 3 can be presented in XUL code and displayed by XUL-compatible adapter software. Here both the data itself and the user interface are declared in the code, which is interpreted and executed at run-time by the adapter software, and the result is shown to the user. In this way even the complete user interface can be changed whenever needed without any modifications to the user's client adapter software: the only change required is to modify the data (the additional information request or content in Figure 2) returned by the service.

3.2 Example 2: Software Updates with Declarative Languages

Figures 5 and 6 (below) are screenshots from an early development prototype. In both pictures, the left side shows an Integrated Development Environment (IDE) written in PHP Hypertext Preprocessor (PHP) and running on an Apache HTTP Server on a Linux server. The IDE in the figures does not represent the final look and feel of the tool, but it gives a good impression of what can be done with it. In the final version, PHP might be replaced by some other language, but the basic idea is to offer a development tool that can easily be used in a web browser. Currently the IDE is a simple way of creating and updating forms used on an end-user device – in this case an Android phone (right side of Figures 5 and 6) running adapter software written in Java. In the IDE – or developer view – the user can drag items from the list of Controls with the mouse and drop them freely anywhere in the preview window next to the Controls list. In this case three Labels, two Textboxes, one Combobox and one Button (Submit) have been selected, and the items have been named.

Figure 5. Developer view (left) and user interface (right).
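The save step of such a developer view – turning the chosen controls into declarative code – can be sketched briefly. The prototype's IDE is written in PHP; purely for illustration, and under our own assumed data structures, an analogous step in Python might look like this:

```python
# Illustrative sketch: a list of (control type, id, label) tuples chosen in a
# developer view is serialized into a XUL-like text document that is ready to
# be delivered to the client device over standard network protocols.

def save_application(controls):
    lines = ["<vbox>"]
    for kind, ident, label in controls:
        if kind == "label":
            lines.append('  <label value="%s"/>' % label)
        elif kind == "textbox":
            lines.append('  <textbox id="%s"/>' % ident)
        elif kind == "button":
            lines.append('  <button id="%s" label="%s"/>' % (ident, label))
    lines.append("</vbox>")
    return "\n".join(lines)

ui = save_application([
    ("label", None, "Name"),
    ("textbox", "name", None),
    ("button", "submit", "Submit"),
])
print(ui)
```

Because the output is plain text, no compilation step is needed before transfer, and adding or removing a control is just a change to the list fed into the generator.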


After the desired items have been chosen from the list of Controls, the Save Application button can be clicked to generate the actual code to be transferred to the client device. In this case XUL is used as the declarative language to represent the UI components, but by applying XSL transformations (see subsection 2.1) the output could be produced in any language; the IDE could thus generate different code for different devices if needed. Because the declarative code is essentially just text – no compilation to binary code is needed – it can be transferred to the client device over standard network protocols. The current prototype system uses a client–server model in which the client device checks for a new version of the user interface every time the interface needs to be shown. In future versions, the functionality may be extended with push functionality on the server side, or with the ability to send only the updated elements of the user interface. The amount of data to be transferred is usually on a par with simple web pages (in this case 1–4 kilobytes), and because of its textual nature it can be effectively compressed to save bandwidth. As mentioned above (subsection 2.1), many declarative languages allow executable code to be embedded, which here means that, for example, the functionality of the Submit button could be inserted into the UI code and also updated when necessary, without making any modifications to the adapter software that displays the actual UI.
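The claim that textual UI definitions compress well is easy to check with a general-purpose algorithm such as DEFLATE; the repetitive markup below is a stand-in of our own for a generated declarative document, not the prototype's actual output.

```python
import zlib

# A synthetic, repetitive UI document in roughly the 1-4 kilobyte range
# described for the prototype's transfers.
ui_document = "".join(
    '<textbox id="field%d" label="Field %d"/>\n' % (i, i) for i in range(50))

raw = ui_document.encode("utf-8")
packed = zlib.compress(raw, 9)   # maximum DEFLATE compression level

print("raw bytes:   ", len(raw))
print("packed bytes:", len(packed))
```

Markup is highly redundant (repeated tag and attribute names), so such documents typically shrink to a small fraction of their raw size, which matters on metered or slow mobile connections.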

Figure 6. Developer view (left) and modified user interface (right).

The difference between Figures 5 and 6 is that in the latter, a new Label and Textbox have been added to the preview window and the modifications have been saved to be shown on the device. The interface can be freely edited, and once saved, the changes take effect on the device immediately. It should be noted, however, that the changes cannot be downloaded to the client device if no network connection is available; in that case the form also cannot be submitted back to the server, so the user cannot send back data through an old interface. This minimizes the need to worry about different software versions returning data in different formats.

4. DISCUSSION

The first example presented in this paper illustrated problems related to data binding and showed a solution offered by declarative languages. The second example showed a simple way to update the interface and related components on an end-user device. Neither approach removes the need to update the server components when modifying the user interface or adding new elements to it, but both significantly simplify version management and increase the flexibility of the client software.
Keeping the adapter software as simple as possible and offering only rudimentary functionality reduces the maintenance costs of the adapter, but it may also decrease the usability of the adapter software and erode the advantages gained over individual applications or web applications. How much functionality should be built into the adapter software and how much should be delivered in the content depends heavily on the situation, the resources available for development, and the planned service model; the software framework used may also limit the functionality of the adapter software. If the content is to be offered by third-party developers, it might be beneficial to offer a wider variety of functionality in the adapter, whereas if the content is delivered by the adapter's own developer, a more specific set of functionality might suffice. An example of third-party content would be a web store where content can be freely purchased, downloaded, and executed on the adapter software


– if a web browser is regarded as an adapter, the closest example of this kind of functionality in the world of web applications is the Chrome Web Store (Google Inc. 2011c).
The choice between adapter, native, and web applications should be made based on the purpose of the application. As a general rule, the more specific the application's needs in terms of device and accessory requirements, the more closely the design should follow the native application approach. If the application requires little or no interaction with the underlying operating system and client device, the web application approach is a viable option. Adapter software tries to combine the best parts of native and web applications. The most appropriate approach is not always clear, however, and one aspect of our research is to study which features from native and web applications can be used practically with adapter software. The bandwidth requirements for transferring user interfaces written in declarative languages are similar to those of lightweight HTML page transfers, and compression can reduce them further.
The next steps in our research are improving the prototype system’s version management and client-side caching, and enabling more flexible component management. One of our goals is better updateability of individual software components together with optimized network usage; another is to investigate allowing end users to customize the user interfaces. The current prototype (Example 2 in subsection 3.2) is a very early development version and, for example, does not yet include any security mechanisms. In practice, the same security and reliability considerations that apply to typical web development should be taken into account in user interface distribution. Declarative UI files are plain text by default, and when handling sensitive information it is highly recommended to use encryption (e.g.
Transport Layer Security, TLS). In addition, the device capabilities revealed to the adapter software may pose security issues that must be addressed. We will continue developing and testing our prototype system, and the progress and results of the project will be reported in future conference and research publications.

5. SUMMARY

Flexibly manageable user interfaces are becoming an important trend in mobile and ubiquitous applications. Traditional software maintenance procedures – updating applications by installing new versions or patch files – often demand unnecessary attention from users, who must download and accept the changes. In the near future there will be another option, based on declarative languages, which can be used for the rapid design and implementation of context-specific and adaptive user interfaces.
In this paper we introduced the research topics of our ongoing project. We presented the possibilities offered by early prototype adapter software for displaying user interfaces created with declarative languages, and we gave a short comparison between native applications, web applications, and adapter software. We also gave a short overview of the current state of declarative user interface languages; based on this short survey, we concluded that they specify user interfaces using text-based formats that allow fast development cycles. The emerging possibilities of adapter software were illustrated by two examples: 1) binding data to elements of a user interface, and 2) keeping the software up to date through run-time updates without user interaction. After describing the examples, we discussed the possible trade-offs between the available development options. Finally, we listed further research topics covering version management, bandwidth optimization, and security; we are also interested in end-user customization of user interfaces. Since this is an ongoing study, no final research results are available yet. However, the definition of our examples can be considered a preliminary verification planning phase – in other words, the test cases for the actual prototype system currently being developed.
The intention is to realize the validation phase of our results by means of the prototype system. The results of the trial will also be important for specifying detailed research topics that target better updateability of individual software components of a system and optimized network usage.


REFERENCES

Abrams, M. et al., 1999, UIML: an appliance-independent XML user interface language, Computer Networks, vol.31, issues 11-16, pp 1695-1708.
Bishop, J. and Horspool, N., 2006, Cross-Platform Development: Software that Lasts, Computer, vol.39, no.10, pp 26-35.
Google Inc., 2011a, Google Docs - Online documents, spreadsheets, presentations, surveys, file storage and more, http://docs.google.com, Retrieved Feb. 4 2011.
Google Inc., 2011b, Google Talk - Chat with friends and family on the internet using Google Chat, http://talk.google.com, Retrieved Feb. 4 2011.
Google Inc., 2011c, Chrome Web Store - Apps, Extensions and Themes, https://chrome.google.com/webstore, Retrieved Feb. 4 2011.
McLaughlin, B. and Edelson, J., 2006, Java & XML, O’Reilly Media, Inc., ISBN: 059610149X.
Microsoft Corporation, 2011a, Windows Live SkyDrive - Online document storage and file sharing, http://explore.live.com/windows-live-skydrive, Retrieved Feb. 4 2011.
Microsoft Corporation, 2011b, Extensible Application Markup Language Overview, http://msdn.microsoft.com/en-us/library/ms752059.aspx, Retrieved Feb. 4 2011.
Mitrovic, N. and Mena, E., 2002, Adaptive User Interface for Mobile Devices, Interactive Systems: Design, Specification, and Verification, Lecture Notes in Computer Science, vol.2545, pp 29-43.
Mitrovic, N. et al., 2004, ADUS: indirect generation of user interfaces on wireless devices, In Proceedings of the 15th International Workshop on Database and Expert Systems Applications, pp 662-666.
Mozilla Foundation, 2011, XML User Interface Language - MDC Doc Center, http://www.mozilla.org/projects/xul, Retrieved Feb. 4 2011.
Nokia Corporation, 2010, Qt Meta-Object Language - Introduction to the QML language, http://doc.qt.nokia.com/4.7/qdeclarativeintroduction.html, Retrieved Feb. 4 2011.
Nokia Corporation, 2011a, Qt Quick, http://qt.nokia.com/products/qt-quick, Retrieved Feb. 4 2011.
Nokia Corporation, 2011b, Forum Nokia - Web Runtime, http://www.forum.nokia.com/Develop/Web, Retrieved Feb. 4 2011.
Nokia Corporation, 2011c, Qt - Cross-platform application and UI framework, http://qt.nokia.com, Retrieved Feb. 4 2011.
Organization for the Advancement of Structured Information Standards (OASIS), 2008, User Interface Markup Language (UIML) Version 4.0, Committee Draft, January 23, http://www.oasis-open.org/committees/download.php/28457/uiml-4.0-cd01.pdf, Retrieved Feb. 4 2011.
Puerta, A. and Eisenstein, J., 2002, XIML: a common representation for interaction data, In Proceedings of the 7th International Conference on Intelligent User Interfaces, San Francisco, California, USA, Jan 13-16.
Rosenthal, N., 2010, Hybrid Qt/WebKit applications: Pushing the Limits of Web Development [Video], MeeGo Conference 2010, Dublin, Ireland, Nov. 15-17, http://conference2010.meego.com/session/hybrid-qtwebkitapplications-pushing-limits-web-development, Retrieved Feb. 4 2011.
Siponen, M.T. and Oinas-Kukkonen, H., 2007, A review of information security issues and respective research contributions, ACM SIGMIS Database, vol.38, issue 1, pp 60-80.
Tekes, 2011, The Finnish Funding Agency for Technology and Innovation, http://www.tekes.fi/en/community/Home/351/Home/473, Retrieved Feb. 4 2011.
TUT, 2011, Tampere University of Technology Pori Unit, http://www.pori.tut.fi, Retrieved Feb. 4 2011.
UIML.org, 2009, User Interface Markup Language, http://www.uiml.org, Retrieved Feb. 4 2011.
Wartel, R., 2008, Web applications security, HEPiX Spring 2008, CERN, Geneva, Switzerland, May 5-9, http://cdsweb.cern.ch/record/1180831, Retrieved Feb. 4 2011.
The WebKit Open Source Project, 2011, http://webkit.org, Retrieved Feb. 4 2011.
World Wide Web Consortium (W3C), 1999, XSL Transformations (XSLT) Version 1.0, W3C Recommendation, Nov 16, http://www.w3.org/TR/xslt, Retrieved Feb. 4 2011.
World Wide Web Consortium (W3C), 2009, Web Storage, W3C Working Draft, Dec 22, http://www.w3.org/TR/webstorage, Retrieved Feb. 4 2011.
World Wide Web Consortium (W3C), 2010, Geolocation API Specification, W3C Candidate Recommendation, Sep 07, http://www.w3.org/TR/geolocation-API, Retrieved Feb. 4 2011.
World Wide Web Consortium (W3C), 2011, HTML5 - A vocabulary and associated APIs for HTML and XHTML, W3C Working Draft, Jan 13, http://www.w3.org/TR/html5, Retrieved Feb. 4 2011.
XIML Forum, 2004, eXtensible Interface Markup Language, http://www.ximl.org, Retrieved Feb. 4 2011.


Short Papers

IADIS International Conferences Web Based Communities 2011, Collaborative Technologies 2011 and Internet Applications and Research 2011

LITERATURE INTEGRATING AND ANALYZING OF INTERNET LITERACY: THE CASE OF 2000-2010 THESES IN TAIWAN
Po-Yi Li, Shinn-Rong Lin and Eric Zhi-Feng Liu
Graduate Institute of Learning and Instruction, National Central University
300 Jung Ta Rd., Chung-li, Taiwan, ROC

ABSTRACT
Information and Internet knowledge have become indispensable abilities, and it is increasingly important to equip ourselves with correct concepts and literacy. This study used "Internet Literacy" as a keyword for searching and obtained 59 theses in total. The researchers analyzed these theses by subject domain, research method, operational definition, and number of theses per year. The analysis found that the education domain produced most of the papers on Internet literacy issues (62.7%) and that, in terms of research participants, researchers focused on student groups (elementary, junior, and senior high school students) more than on others. Researchers from other domains are also beginning to enter the field. In terms of research methods, 54 theses used the questionnaire survey method, four used experimental methods, and one applied action research. Finally, a stable cyclical pattern was discovered that can be used to predict the number of theses in the following year; the researchers will add further variables and dimensions in future analyses to uncover the reasons behind and meanings of these findings.
KEYWORDS
Internet Literacy, Literature Analysis, Literature Review

1. INTRODUCTION
With today's fast-changing information development, the ability to use the Internet has become important to cultivate. The American Library Association (ALA) (1989) defined information literacy as an individual's understanding of how to inquire for, evaluate, and effectively utilize acquired information when it is needed. In this era of information explosion, cultivating students' information literacy and enabling them to apply it in practice is an important task, because information literacy is also a foundation of lifelong learning. McClure (1994) integrated four components of it: (1) Traditional Literacy (The National Literacy Act, 1991): the individual can read, write, speak English, compute, and solve problems. (2) Computer Literacy (McClure, 1994): an extension of traditional literacy to completing tasks on a computer, for instance word processing, or creating and manipulating data with other software. (3) Media Literacy (Aufderheide and Firestone, 1993): a media-literate person should be capable of decoding, evaluating, analyzing, and producing both print and electronic media. (4) Network Literacy (McClure, 1994): the ability to identify, access, and use electronic information from the network, which will be an important skill in the future. In sum, information literacy involves four aspects: traditional literacy, computer literacy, media literacy, and network literacy. In addition, Bawden (2001) categorized network literacy and Internet literacy as synonyms, so the two terms are combined as "Internet Literacy" in this study. As Internet penetration rates increase, the importance of network literacy receives ever more attention. Therefore, through a review of the literature on network literacy and Internet literacy, the researchers hope to understand the state of research in Taiwan over the eleven years from 2000 to 2010 and to identify future research directions.


ISBN: 978-972-8939-40-3 © 2011 IADIS

2. LITERATURE ANALYSIS
The researchers used "Network Literacy" and "Internet Literacy" to search for theses that used either term in the thesis title between 2000 and 2010 in the National Digital Library of Theses and Dissertations in Taiwan. In total, 59 theses related to the Internet literacy category were found, as follows:

2.1 Subjects Analysis

The subjects analysis categorizes the participants of these 59 theses as shown in Table 1:

Table 1. The Number of Internet Literacy Theses (2000-2010) (N=59)

Year (no. of theses)   Subjects' categories
2000 (3)   Elementary School Student (1), Senior High School Student (1), Vocational School Student (1)
2001 (1)   Elementary School Administrator (1)
2002 (8)   Elementary School Student (3), Junior High School Student (1), Senior High School Student (1), University Administrator (1), Elementary School Faculty (1), Government Worker (1)
2003 (7)   Senior High and Vocational School Student (1), Company Staff (2), University Administrator (1), Regional Faculty (Kaohsiung) (1), Vocational School Faculty (1), Elementary Curriculum Design (1)
2004 (4)   Elementary School Student (1), Junior High School Student (1), Military School Student (1), Online Shopper (1)
2005 (8)   Elementary School Student (3), Junior High School Student (2), Vocational School Student (1), Military School Student (1), Elementary and Junior High School Principal (1), Company Staff (1)
2006 (7)   Elementary School Student (2), Junior High School Student (3), Elementary School Faculty (1), University Administrator (1)
2007 (4)   Elementary School Student (1), Junior High School Student (1), Senior High School Student (1), University of Science and Technology Student (1)
2008 (8)   Elementary School Student (2), Elementary School Faculty (1), Elementary School Parents (1), Nurses (1), Government Worker (1), Journalist (1), Online Anonym (1)
2009 (6)   Elementary School Student (2), Senior High School Student (1), Vocational School Student (1), University Student (1), High School Administrator (1)
2010 (3)   Elementary School Student (2), Elementary Curriculum Design (1)
* The number in parentheses after each category is the number of theses in that category.

According to Table 1, Internet literacy subjects are mainly composed of student groups (elementary, junior high, senior high, and vocational school students), together exceeding 50% of all theses. However, since 2008 research has no longer focused on faculty and students in the education domain. The researchers surmise that most of these studies examined students' Internet literacy in order to discover how students cultivate Internet use skills and morality. Moreover, participants from other domains, such as journalists, nurses, and government workers, began to appear after 2008. This means that Internet literacy research has stepped beyond the school, and that anyone able to get on the Internet has a chance to become a research subject. The authors note with curiosity, however, that in the eleven years 2000-2010 no research concerned junior or senior high school teachers or university faculty. Perhaps secondary school teachers are unable to cooperate with research because of the pressure of students' grades, while university faculty may be too authoritative for researchers to investigate their Internet literacy. These two groups remain to be studied.

2.2 Research Methods Analysis
Regarding research methods, fifty-two theses used the questionnaire survey method, two used both a scale and a questionnaire, four implemented experimental research, and one applied action research. Most theses, fifty-four in total, used survey methods, including questionnaire and scale instruments. These researchers collected data by questionnaire and scale and then analyzed them with percentages, means, standard deviations, independent t-tests, one-way ANOVA, Scheffé's post-hoc test, Pearson correlation, and


multiple regression. In addition, the four experimental studies conducted pre-post evaluations of Internet literacy learning experiments to examine the effects of learning strategies. The remaining thesis, an action research study, was directed mainly at implementing curriculum design; through the action research cycle of reflection and continuous improvement, the researcher developed and retained further knowledge. In general, Internet literacy studies involve hundreds of subjects or more, so most researchers used questionnaire or scale surveys to collect and analyze data from larger groups of participants. Only one paper employed action research, observing and interviewing twenty-five students at an elementary school to understand the effects of implementing Internet literacy instruction and the students' reflections. Although action research collects data from fewer subjects, it captures individual differences deeply and closely. Furthermore, two theses applied a one-group pretest-posttest design; the authors believe this method can detect whether a significant difference exists between pre- and post-test, but it cannot capture the real learning situation. It might therefore be improved by adding open-ended items so that subjects can express their opinions, or by conducting focus-group interviews to collect more perspectives. Quantitative methods can collect an enormous amount of data at once, but qualitative methods come closer to the facts. A study that combines the advantages of both methods would therefore be richer and more realistic.
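As a concrete illustration of the group-comparison statistics listed above, an independent-samples t-test can be sketched in plain Python. The data below are invented for the example, not drawn from the surveyed theses:

```python
import math

def independent_t_test(a, b):
    """Pooled-variance independent-samples t-test (equal variances assumed).
    Returns the t statistic and the degrees of freedom."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    # Sample variances (denominator n - 1)
    var_a = sum((x - mean_a) ** 2 for x in a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (nb - 1)
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    t = (mean_a - mean_b) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

# Hypothetical Internet literacy scores for two groups of respondents
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
t, df = independent_t_test(group_a, group_b)
print(t, df)  # t = -1.0 with df = 8 for these data
```

The t value would then be compared against the critical value for the given degrees of freedom; the surveyed theses typically delegate this step to statistics software.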

2.3 Operational Definition Analysis
The operational definitions in the Internet literacy theses fall into two kinds: "three-element dimensions" and "Internet use ability". Thirty-seven theses (62.7%) apply the three-element dimensions of Internet knowledge, skill, and attitude as their operational definition; the other twenty-two (37.3%) use Internet use ability, such as Internet safety, Internet copyright, Internet law and manners, and information-searching ability. Although the names of the two operational definitions differ, their contents overlap in several parts, for example skill vs. information-searching ability, or attitude vs. law and manners. However, current Internet literacy research relies on expert opinion for its definitions; that is, experts and researchers decide which kinds of literacy and ability are right or wrong, and this becomes the judging standard. It is more important that these definitions correspond to learners' needs and help them increase practical Internet literacy, rather than serving as an Internet literacy "test" on which learners must know how to answer every item to get a better grade. Instructors would therefore do better to infuse Internet-related knowledge and real cases into daily class lectures, so that the knowledge takes root in students' minds naturally, rather than relying on a short one- to two-week "Internet literacy curriculum" or a "post-test" to evaluate students' Internet literacy.

2.4 Theses Number Analysis

Figure 1. Theses Calculation 2000-2010

From Figure 1, a three-year cycle of theses can be observed: 2002-2004, 2005-2007, and 2008-2010. The numbers of theses in the three "first years", 2002 (8), 2005 (8), and 2008 (8), are identical. The numbers in the three "second years", 2003 (7), 2006 (7), and 2009 (6), are almost the same. Finally, the numbers in the three "third years", 2004 (4), 2007 (4), and 2010 (3), are also similar. This indicates that the cycle remains stable, consistent with what Li, Chen, Lin, and Liu (2010) predicted. Extrapolating from this cycle, perhaps 7 to 8 further studies will be produced.
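The cyclical pattern described above can be checked programmatically. A minimal sketch using the per-year thesis counts reported in Table 1 (the naive mean-of-cycle forecast is our own illustration, not the authors' method):

```python
# Thesis counts per year as reported in Table 1 (2000-2010)
counts = {2000: 3, 2001: 1, 2002: 8, 2003: 7, 2004: 4, 2005: 8,
          2006: 7, 2007: 4, 2008: 8, 2009: 6, 2010: 3}

# Group the three observed cycles (2002-2004, 2005-2007, 2008-2010)
cycles = [[counts[y + k] for k in range(3)] for y in (2002, 2005, 2008)]
print(cycles)  # [[8, 7, 4], [8, 7, 4], [8, 6, 3]]

# A naive forecast for the first year of the next cycle (2011): the mean
# of the corresponding position in the previous cycles, i.e. (8+8+8)/3
forecast_2011 = sum(c[0] for c in cycles) / len(cycles)
print(forecast_2011)  # 8.0
```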


3. CONCLUSION
This study examined the direction of Internet literacy research in Taiwan during 2000-2010. The main current approach is to measure subjects' Internet knowledge, skill, and attitude first, and then analyze the impacts of environment, age, and usage hours with statistical software. Among these fifty-nine studies, two combined the inquiry with Internet addiction, and one analyzed elementary school teachers' Internet literacy and online consumption. Because Internet addiction and online consumption both require a certain degree of Internet literacy, this suggests that users now engage more easily with all kinds of Internet activity, such as online shopping, than in the past. As information development spreads, its influence also strengthens. Internet literacy research may thus come to address addiction, online shopping, and Web 2.0 issues such as weblogs, Facebook, instant messaging, and synchronous video instruction, and it holds potential in the domains of information, journalism, and communications in Taiwan. However, the researchers can as yet only predict the number of theses in the next year, rather than uncovering deeper reasons and meanings, because several dimensions have not yet been examined. The authors therefore plan to add variables, including "keywords" and "number of citations", and to extend the analysis backward (before 2000) and forward (after 2010), in order to find out not only the reasons for, importance, and meaning of the three cycles (2002-2004, 2005-2007, 2008-2010), but also how Taiwanese researchers regard Internet literacy and even Web 2.0 learning.

REFERENCES
American Library Association, 1989. Final Report of the American Library Association Presidential Committee on Information Literacy. Retrieved August 17, 2009, from http://www.ala.org/ala/acrl/acrlpubs/whitepapers/presidential.htm.
Bawden, D., 2001. Information and Digital Literacies: A Review of Concepts. Journal of Documentation, Vol. 57, No. 2, pp. 218-259.
Li, P.-Y., Chen, Y.-Y., Lin, S.-R. and Liu, E. Z.-F., 2010. A Review and Analysis of the Literature about Internet Literacy: An Example of Thesis from 2000 to 2009. Journal of Scientific and Technological Studies, Vol. 44, No. 2, pp. 51-62.
McClure, C., 1994. Network Literacy: A Role for Libraries? Information Technology and Libraries, Vol. 13, No. 2, pp. 115-126.


BEHAVIORAL ANALYSIS OF SNS USERS WITH REGARD TO DIET
Masashi Sugano and Chie Yamazaki
School of Comprehensive Rehabilitation, Osaka Prefecture University
3-7-30, Habikino, Habikino-shi, Osaka 583-8555, Japan

ABSTRACT
In this paper, we examine the behavior of Social Networking Service (SNS) users who focus on dieting and clarify how such an SNS can be used more effectively. We analyzed the use of Yahoo Diet Diary by 2500 users. As a result, we clarified the characteristics of long-time users of the SNS and found that weight loss over the first 90 days of using the service was significant. Moreover, we identified the characteristics of users who achieved significant weight loss through active communication via the SNS; significant weight loss and frequent message posting characterized users who actively communicated through the SNS.
KEYWORDS
Social network service, health care, diet, behavior change

1. INTRODUCTION
Communication via Social Networking Services (SNSs), community-type websites that promote and support relationships between people, has spread in recent years. SNSs such as Facebook and Twitter offer a means and place for smooth communication between friends and acquaintances. Moreover, new relationships can be built on commonalities between users, for example hobbies, area of residence, alma mater, or a friend's friend. SNS applications can be used not only for communication but also as tools for health care or for improving diet and lifestyle (Hawn, 2009). For diet or lifestyle improvement, aspects of one's personal life, such as meals and exercise, can be recorded and validated by oneself for later reflection (De Silva, 2007). Furthermore, releasing this information through an SNS facilitates information exchange with others and the sharing of lifestyle-improvement results (Swan, 2009). When interaction with this type of SNS promotes behavioral modification, a marked effect on lifestyle improvement can be expected. In this research, we investigate the validity of such an SNS by examining the situations and behavioral modifications of diet-focused users.

2. METHOD
We chose Yahoo Diet Diary as the SNS for investigation. Yahoo Diet Diary is an SNS equipped with a self-management tool for recording items such as weight and meals daily over a continuous period of 90 days (these 90 days are defined as one "stage"), and it enables communication through user diaries. It is also possible to create a group community called a "club" or to register a user of similar age and build as a "rival". First, for each number of continuously used stages from 1 to 10, we randomly selected 50 women who had used the system for six months or more (500 women in total). From the diary content, we investigated weight loss, meal recording points, age, and weight at the time of starting the diary. To clarify the relationship between the number of rivals and user characteristics, we investigated the number of continuous utilization stages of the roughly 2000 users ranked highest by number of rivals. Next, we classified users into those with many rivals and those with few rivals


and investigated significant differences in weight loss, meal recording points, diary entry frequency, number of viewable photographs, and the number of clubs to which each user belonged.

3. RESULTS
Figure 1 shows the relationship between the number of stages in which the system was used continuously and weight loss per stage. Weight loss in stage 1 was the highest (an average of 3.0 kg) and decreased as the stages progressed. After stage 5, changes in weight loss were negligible. Next, we investigated the relationship between weight loss and the number of continuous stages while paying attention only to weight loss in stage 1 (Fig. 2), and a correlation was revealed (Spearman rank correlation coefficient: rs=0.73, p

                                          AMERICAN N=30    GERMAN N=154
                                          18.5             18.6
                                          22 (71%)         103 (67%)
                                          16.2 years       17 years
  < 15 years old                          10 (33.3%)       18 (11.7%)
  15 to < 20 years old                    18 (60%)         121 (78.6%)
  ≥ 20 years old                          2 (6.7%)         12 (7.8%)
Member of other social networks:
  Yes                                     10 (33.3%)       100 (65%)
  No                                      20 (64.5%)       54 (35%)

3.2 Social Aspects of Using Facebook
We found that all study participants frequently used Facebook to communicate with their Facebook Friends. While no American and few German teenagers accessed Facebook once a week or less, differences between the groups were more pronounced at the other end of the scale, connecting to Facebook more than once per day (Table 2): 71% of American teenagers versus 41.6% of German teenagers responded that they accessed Facebook more than once per day. The comparative study "Discover how the world lives online" (TNS, 2010) showed that daily use of a social network is higher among American than German 16- to 20-year-old youths (72% vs. 52%, respectively). A study of German teens found that 40% of 1208 respondents accessed a social network more than once per day (MPFS, 2010). American and German teenagers alike spend 20 minutes on Facebook per visit and 60 minutes on Facebook per day. American teens use Facebook longer during a week than German teens do: the median time per week was 450 minutes (7.5 hours) for American teenagers compared to 360 minutes (6 hours) for German teenagers. Although the numerical differences indicate that American teens spend more time on Facebook than German teens, the difference between groups was not significant (p=0.403, Mann-Whitney test), and thus Hypothesis 2 is not supported. We assumed that American teenagers have more Friends on Facebook than German teenagers (Hypothesis 3.1), and this assumption was supported by the results (Table 2); the difference between groups in the number of Facebook Friends was significant (p=0.0000, Mann-Whitney test). 67% of American respondents had > 350 Facebook Friends compared to 23% of German respondents. The range in the number of Friends was considerable: from 100 to 1070 for American and from 50 to 795 for German teenagers.
The number of Facebook Friends reported by both groups is staggering and it would be impossible to imagine these numbers of teenagers meeting regularly at one location to socialize. The statement by boyd (2008b), “a social network allows a teen’s social world to extend beyond a physical boundary and thus supports broader engagement with peers” is certainly supported by our results. In Hypothesis 3.2, we assumed that American teens, more than German teens, would have Facebook Friends with whom they shared an online friendship only. This assumption was not supported by the data. On the contrary, Americans claim to have an offline friendship with 99% of their Facebook Friends while Germans have an offline friendship with 90% of their online Friends (p=0.219, Mann-Whitney test). When respondents were asked if their Facebook Friends reflect their offline friends, fewer American teenagers (29%) than German teenagers (39%) noted that there was little or no similarity between the two types of friendships. Although teenagers can use Facebook to connect with strangers, the literature indicates that American teens are not likely to develop new friendships online (Lenhart, 2007). Responses of the two groups were similar when asked if Facebook was used to establish a friendship with someone met on a social basis. About 50% in each group (American, 52%; German, 48%) responded that they often or very often became Friends on Facebook with people they met in a social setting. Collecting Friends on a social network is socially valuable for teens as a marker of status and as a sign that they are “in the loop” (boyd, 2008b). In our study, American teens sent and received more Friend requests than German teens. Two or more Friend requests were received per week by 73% of American and by 59% of German teenagers. And, two or more Friend requests were sent per week by 40% of American and by 26% of German teenagers.
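The two-group comparisons in this study rely on the Mann-Whitney test. As a rough illustration of how such a test works (not the authors' code, and with invented Friend counts), the U statistic and its normal-approximation p-value can be computed in plain Python:

```python
import math

def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test using the normal approximation
    (adequate for moderate sample sizes; ties receive mid-ranks)."""
    combined = sorted(a + b)
    # Average rank for each distinct value (handles ties)
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        ranks[combined[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n_a, n_b = len(a), len(b)
    r_a = sum(ranks[x] for x in a)
    u_a = r_a - n_a * (n_a + 1) / 2
    u = min(u_a, n_a * n_b - u_a)
    mu = n_a * n_b / 2
    sigma = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p via normal tail
    return u, p

# Hypothetical numbers of Facebook Friends for two small groups
american = [405, 380, 520, 610, 350, 700, 450, 320, 500, 420]
german = [260, 180, 305, 150, 220, 310, 200, 240, 275, 190]
u, p = mann_whitney_u(american, german)
print(u, p)  # complete separation of the groups gives U = 0, p << 0.05
```

For the small and moderate samples in studies like this one, statistics packages would normally apply an exact test or a continuity correction; the sketch above omits both for brevity.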


Table 2. Responses indicative of social activities on Facebook

                                          AMERICAN TEENS   GERMAN TEENS
                                          N=30             N=154
Frequency of logging onto Facebook:
  Every few weeks                         0                1 (0.6%)
  Once per week                           0                3 (1.9%)
  A couple times per week                 3 (9.7%)         32 (20.8%)
  Once per day                            6 (19.4%)        59 (38.3%)
  More than once per day                  22 (71%)         64 (41.6%)
Length of activity on Facebook per day:
  < 30 minutes                            2 (6.7%)         31 (20%)
  30 minutes to < 1 hour                  8 (26.7%)        32 (20.8%)
  1 hour to < 2 hours                     11 (36.7%)       43 (28%)
  ≥ 2 hours                               9 (30%)          41 (26.6%)
Number of Friends on Facebook:
  median (range)                          405 (100-1070)   260 (50-795)
  ≤ 150                                   2 (6.7%)         28 (18.2%)
  151 to 250                              2 (6.7%)         42 (27.3%)
  251 to 350                              6 (20%)          47 (30.5%)
  > 350                                   20 (66.7%)       36 (23.4%)
We assumed that American teenagers would feel more left out if they could not access Facebook than German teenagers would (Hypothesis 3.3). While results showed a numerical difference between the groups with 9.7% of Americans and 2.5% of Germans answering that they felt completely isolated or lonely if they could not access Facebook, differences between the groups were not statistically significant (p=0.404).

3.3 Privacy and Security Issues in Relation to Using Facebook Protection of private data has been a recurrent theme in popular German media (see Focus, 2010 and WirtschaftsWoche, 2010). With this in mind, our assumption was that German teenagers are more concerned about the security of personal information on Facebook than American teenagers (Hypothesis 4.1). Indeed, we found that 7% of the American respondents worry often or very often about privacy issues on their Facebook account compared with 28% of German respondents (p=0.028, Mann-Whitney test). In line with this result, fewer American (13%) than German teens (19%) update their privacy settings often or very often. Further on the issue of privacy, we assumed that American teenagers allow Friends, Friends of Friends, and Everyone to view their personal profile whereas German teens are more restrictive in who can view their personal profile (Hypothesis 4.2). What we found was that the overwhelming majority of both American and German respondents allow only Friends to view their personal profile (93% and 89%, respectively). In contrast, far fewer American teen respondents in the PEW study, 59%, show their profile only to Friends (Lenhart, 2007). American respondents

German respondents

97

94 % of Respondents

100 65 57 60 40 20

84

77

71

80

55

69 59

50

48

52

56

61

61

48 35

32 20 6

0

3

4

0

Figure 1. Private information displayed on Facebook profile by American and German teenagers

Overall, American teenagers display more private information on their Facebook profile than German teenagers do (Figure 1). Specifically, more American than German teens show their E-mail address, provide details of their high school, name their hometown, and show their relationship status. The most striking difference between the two groups is the presentation of a personal biography, which is displayed by the majority of American teens (61%) compared with only a small percentage of Germans (4%). Statistical testing of Hypothesis 4.2 was not possible (Cronbach’s alpha < 0.6).


3.4 Forms of Communication Used on Facebook
Hypothesis 5 proposes that American teens prefer public communication features on Facebook, such as wall posts and comments, while German teens prefer private communication via private messages, chat, and poke. The data showed that 19% of American teens, compared with nearly half of German teens (48%), use private messaging often or very often (Figure 2). The chat function is used often or very often by fewer American than German teens, and the third private communication feature, the poke, is not very popular in either group. Twice as many American as German respondents use wall posts and commenting (both public forms of communication) often or very often. The numerical differences between the two groups support our hypothesis; statistical testing of the significance of the difference was not possible (Cronbach's alpha < 0.6).

Figure 2. Forms of communication used often or very often on Facebook by American and German teenagers

4. DISCUSSION
Our study results indicated subtle rather than obvious differences between American and German teenage users of Facebook. While the results revealed numerical differences between the groups, tests of significance provided substantiation for accepting only two of our nine hypotheses. We surmise that our population uses Facebook to reinforce existing on- and offline relationships rather than to establish new relationships, a result similar to that reported for American law students (Lampe et al, 2006). German teens were more likely to connect with strangers and to define their online Friends as being different from their offline friends. These results indicate a potential cultural difference in the way teenagers use social networks which has not previously been researched. Facebook allows teens to stay in touch with a large, diverse offline community, which increases their social capital (Resnick, 2001). The American teenagers in our study had a significantly higher number of Friends than the German teens. The sheer number of American teens registered on Facebook, coupled with the Friends of Friends feature, probably accounts in part for this difference between the two groups. Subtle differences between the two groups were also seen with respect to privacy issues. American teens present more personal information on their profile than their German counterparts, seem less worried about privacy issues, and indicated a preference for the public messaging features of communication over private communication. Both groups showed selectivity in who could view their personal information by limiting access to Friends only. Although the age of respondents might be considered a reason for a nonchalant attitude toward showing private information, a study of US college students revealed similar results (Acquisti, 2006). Facebook gives users various options to control the privacy of the information available for viewing by others.
If a member followed the recommended privacy settings, a great deal of personal information would be available for view by “Everyone” and by “Friends of Friends”. Some authors have suggested that corporations behind social networking sites should act more responsibly to protect the privacy of their members (Wallbridge, 2009). Our results seem to indicate that teens are making a conscious and deliberate effort to protect their privacy by customizing their privacy settings to control the information revealed. The field of research on teenagers and social networks is steadily growing. However, research on the use of social networks by non-American teenagers is uncommon. As the popularity of Facebook increases worldwide, it would be of interest to compare cultural differences among teenagers in their use of Facebook, their attitudes toward Facebook, and the possible differences between users and non-users -- if there are any non-connected Facebook teens to study! A limitation of our study is that the sample may not be representative of the populations of interest, namely, the millions of American and German teenagers who are members of Facebook. Further, the


questionnaire was not pre-tested and testing for reliability and validity was not conducted. The survey collected information on self-reported activities in regard to Facebook but did not capture the “real” behavior of the participants when using Facebook. It is plausible that actual use differs from self-report of activities.

5. CONCLUSION
Facebook is part of the "new normal" among many teenagers. Still, despite this new normality, little seems to be known about how teenagers use Facebook, and extensive research has yet to be done on how Facebook is used by teenagers of different cultures. Many teenagers cannot imagine a world without Facebook. Facebook makes distances shrink and allows large numbers of teens to communicate with a click of a mouse. Whether chatting with a school friend close by or checking up on a relative living on another continent, using Facebook makes teens feel connected and brings about a sense of belonging.

ACKNOWLEDGEMENT We thank Patrick Hoberg for his assistance with the statistical analysis of the results. A word of thanks is extended to all Facebook Friends who participated in the survey.

REFERENCES
Acquisti, A. and Gross, R., 2006. Imagined Communities: Awareness, Information Sharing and Privacy on the Facebook. Workshop on Privacy-Enhancing Technologies (PET) 2006.
Albert, M., Hurrelmann, K. and Quenzel, G. (Eds.), 2010. TNS Infratest Sozialforschung: 16. Shell Jugendstudie.
boyd, d., 2007. Friendships. In Ito, M. et al (Eds.), Hanging Out, Messing Around, and Geeking Out. The MIT Press, Cambridge, MA, USA, pp 79-115.
boyd, d., 2008. Why youth [heart] social network sites: The role of networked publics in teenage social life. In Buckingham, D. (Ed.), Youth, Identity, and Digital Media. The MIT Press, Cambridge, MA, USA, pp 119-142.
boyd, d., 2008b. Taken out of context: American teen sociality in networked publics. PhD Dissertation, UC Berkeley.
Focus (Ed.), 19 July 2010. Er weiß alles über Sie! Wollen Sie das? Mark Zuckerbergs Facebook erobert die Welt.
Lampe, C. et al, 2006. A Face(book) in the crowd: Social searching vs. social browsing. CSCW'06, pp 167-170.
Leidner, D.E. and Kayworth, T., 2006. A review of culture in information systems research: Toward a theory of information technology culture conflict. MIS Quarterly, Vol. 30, No. 2, pp 357-399.
Lenhart, A. and Madden, M., 2007. Social networking websites and teens: An overview. PEW Internet and American Life Project, Washington, DC.
Lenhart, A. et al, 2010. Social media and young adults. PEW Internet and American Life Project, Washington, DC.
Medienpädagogischer Forschungsverbund Südwest (Eds.), 2010. Online-Communities, Jugend, Information, (Multi-)Media Studie (JIM-Studie). Stuttgart, pp 41-45.
Palfrey, J. and Gasser, U., 2008. Born Digital. Basic Books, New York, NY.
Resnick, P., 2001. Beyond bowling together: SocioTechnical capital. In Carroll, J. (Ed.), HCI in the New Millennium. Addison-Wesley, Reading, MA, USA.
The Economist (Ed.), 2010. Status update: Facebook has become third-largest nation. www.economist.com/node/16660401, retrieved December 18, 2010.
TNS (Ed.), 2010. Discover how the world lives online.
Volkswagen/MTV/Nielsen, 23 September 2010. Mepublic – A global study on social media youth. http://www.bvdw.org/medien/volkswagen---mtv---nielsen-mepublic--a-global-study-on-social-media-youth/media=2371.
Wallbridge, R., 2009. How safe is your Facebook profile? Privacy issues of online social networks. ANU Undergraduate Research Journal, Vol. 1, pp 85-92.
WirtschaftsWoche (Ed.), 15 November 2010. Verkaufsmaschine Facebook: Das Millionengeschäft mit den Freunden.


IADIS International Conferences Web Based Communities 2011, Collaborative Technologies 2011 and Internet Applications and Research 2011

SNAP: THE SOCIAL NETWORK ADAPTIVE PORTAL Alexiei Dingli, Mark Scerri, Brendan Cutajar, Kristian Galea, Saviour Agius, Mark Anthony Cachia, Justin Saliba, Jeffrey Cassar, Erica Tanti, Sarah Cassar, Shirley Cini and Mariya Koleva Faculty of Information and Communication Technology (ICT), University of Malta, Msida, MSD 2080, MALTA

ABSTRACT The boom of social networking has led people to maintain multiple accounts on many platforms in order to keep in touch with hundreds of contacts, so managing one's contacts risks becoming a burden for many users. Following and finding information about friends and family has become an issue too. Guided by these observations and by careful research of existing adaptive web technologies, our team developed SNAP, an adaptive social network integrator which aims to amalgamate four social networks (Facebook, Twitter, Flickr and Buzz) in one adaptive environment that unobtrusively sorts the user's feed according to his or her preferences. To achieve data transfer and authorisation, SNAP uses the social networks' APIs and the newest version of the OAuth protocol. Adaptivity is achieved through statistical filtering. Despite our efforts, initial field tests show that the system is not yet ready to be launched for wider use. However, there is room for improvement in terms of social network integration, and test users expressed an interest in the idea of using an adaptive social integrator such as SNAP. KEYWORDS Social networking, adaptive systems, adaptive hypermedia systems, social network portal, social network integration

1. INTRODUCTION More and more people use social networking sites (SNS) every day to interact with their family and friends, and for various reasons many of these people use more than one social network. People maintain a large number of contacts (friends) on their chosen social networks, and since most social networks today offer many functionalities, such as status updates, media sharing (text, photo, audio, video), third-party applications and games, users may have difficulty finding the content in which they are interested. Another characteristic of SNS usage is that users are not a homogeneous mass but individuals with distinct preferences over which social network is best for them. These preferences are not static either; over time they tend to change in favour of one network or another. Managing several accounts over different portals may become a burden for many users. For this purpose the SNAP adaptive web portal was developed, allowing the amalgamation of several popular social networks in one integrated environment. SNAP monitors and learns the browsing behaviour of its users and adapts to it, giving prominence to their preferred social networks. In the following section we discuss previous research on adaptive systems and some issues with their evaluation. In Section 3 we present the implementation of SNAP. Section 4 outlines the results from the testing of SNAP and Section 5 concludes with some possible future developments.

2. LITERATURE REVIEW The function of an adaptive system is to offer a personalized experience by analyzing data from the user's interaction with the system (Brusilovsky and Milan, 2007). It aims to improve the organization and presentation of websites (Perkowitz and Etzioni, 1997, cited in Mican and Tomai, 2010, p. 86) by


ISBN: 978-972-8939-40-3 © 2011 IADIS

adaptively selecting, prioritizing and manipulating links and content (Brusilovsky and Milan, 2007). The reasons which gave rise to the development of adaptive systems are varied: for instance, to give personalized recommendations (Balabanović, 1997), to personalize the learning experience in tutoring systems (Baena et al., 2000), or even to adapt information for terminally-ill patients (Bental et al., 2000). Furthermore, there are many approaches to adaptivity. In this review we only aim to present overlay user modelling, which is particularly pertinent to our work on SNAP.

The user model (a model of a user's behaviour) is at the heart of an adaptive system. The topic receives prominence in Brusilovsky and Milan (2007) in a discussion of adaptive educational systems (AES), but the same principles can be used for other adaptive systems as well. Brusilovsky and Milan (2007) describe an overlay model as a structural model which presents the domain knowledge as a set and the user knowledge as its subset. The elements of the user knowledge are assigned values (boolean, numeric or a probability) to represent the level of knowledge. When the domain knowledge is modelled, what is modelled can broadly be called concepts; these can be facts, rules or constraints. Depending on whether these concepts are independent or related, the domain models are vector (set) models or network models respectively.

Evaluating adaptive websites can be a difficult task, which was also pertinent to our research. Bauer and Scharl (2000) investigate the possibilities for automatic content and structure evaluation of websites and raise important concerns about the validity of manual (subjective) website evaluation, but their techniques were not immediately applicable to the work on SNAP, one reason being that Bauer and Scharl's research is aimed at non-adaptive websites.
Sadat and Ghorbani (2004) propose a hierarchy of features specifically aimed at the evaluation of adaptive hypermedia systems. The authors classify the features into three main groups: runtime features, technology, and software engineering. Runtime features refer to the way the system works and behaves, technology features include all the algorithms and techniques employed in the system, and software engineering refers to features of the software development process. These have been useful in the process of evaluating SNAP.
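As an illustration, the overlay modelling discussed above can be sketched in a few lines of code. This is a hypothetical toy example; the concept names, update rule and threshold are our own assumptions and are not taken from the cited systems.

```python
# Toy sketch of an overlay user model: the domain knowledge is a set of
# concepts, and the user model overlays a numeric value on each concept
# representing the estimated level of knowledge.

class OverlayUserModel:
    def __init__(self, domain_concepts):
        # Domain knowledge: the full set of concepts.
        self.domain = set(domain_concepts)
        # User knowledge: concept -> estimated mastery in [0, 1].
        self.knowledge = {c: 0.0 for c in self.domain}

    def observe(self, concept, evidence):
        """Move the estimate toward new evidence (simple exponential update)."""
        old = self.knowledge[concept]
        self.knowledge[concept] = 0.7 * old + 0.3 * evidence

    def known(self, threshold=0.5):
        """The subset of the domain the user is believed to know."""
        return {c for c, v in self.knowledge.items() if v >= threshold}

model = OverlayUserModel(["loops", "recursion", "classes"])
for _ in range(5):
    model.observe("loops", 1.0)
# After repeated positive evidence, "loops" crosses the threshold
# while the unobserved concepts remain at zero.
print(sorted(model.known()))
```

Because the concepts here are treated as independent, this corresponds to the vector (set) flavour of domain model; a network model would additionally record relations between concepts.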

3. METHODOLOGY 3.1 Basic Information and Functionalities The SNAP project was developed following a spiral software development life cycle in an agile development framework. The front-end layer, which handles user requests, was designed using layouts. The back-end communicates with the social networks via specialized classes: there is one parent class, and all specialized social network classes extend it. The general functionalities of SNAP are defined in this parent class, and the network classes can also offer unique functionalities. In its final form, SNAP is designed to support different social networks. For the prototype, we attempted to implement Facebook, Flickr, Twitter and Buzz. At the time of writing not all planned functionalities are available to the users. The planned functionalities are as follows. The parent class retrieves the user's posts, photos and inbox; it also has functionalities to search (e.g. for other users). The Facebook class should retrieve the user's feed, wall, message inbox, message threads, albums and profile picture, and should also allow the user to post on the Wall. The Twitter class should retrieve the user and home timelines and the user's received direct messages; it has functionalities to send a direct message to one or more recipients, to search for a user, to update (post) a tweet, to retweet, and to delete a tweet. The Flickr class has functionalities to retrieve photos, display photos in four formats, retrieve information about photos, and search for photos; activities on Flickr are commenting, tagging, adding to favourites, adding to a gallery, and adding notes. The Buzz functionalities which were attempted are: retrieving posts, commenting on posts, linking posts, and posting text, text and links, or text and photos.
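SNAP itself is implemented in C# on the .NET framework; purely for illustration, the parent/subclass layout described above can be sketched as follows. This is a Python sketch with stubbed, hypothetical method bodies and return values; the real classes call the networks' APIs.

```python
# Sketch of the back-end class layout: one parent class defines the shared
# functionality, and each social network class extends it, optionally
# adding network-specific functionality.

class SocialNetwork:
    """Parent class: functionality common to all supported networks."""
    def get_posts(self):
        raise NotImplementedError
    def get_photos(self):
        raise NotImplementedError
    def get_inbox(self):
        raise NotImplementedError
    def search(self, query):
        raise NotImplementedError

class Twitter(SocialNetwork):
    def get_posts(self):
        # Would call the Twitterizer API; stubbed here with placeholder data.
        return ["home timeline tweet"]
    def send_direct_message(self, user, text):
        # Network-specific functionality beyond the parent interface.
        return f"DM to {user}: {text}"

class Facebook(SocialNetwork):
    def get_posts(self):
        # Would call the Graph API; stubbed here with placeholder data.
        return ["wall post"]
    def post_to_wall(self, text):
        return f"posted: {text}"

# The front end can treat every network uniformly through the parent interface.
networks = [Twitter(), Facebook()]
feed = [post for n in networks for post in n.get_posts()]
print(feed)
```

Dictating the shared method names and return shapes in the parent class is what later allows the filter to iterate over accounts of different networks without special-casing each one.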

3.2 Front End Specifications When first coming across the site, a new user registers an account with the system in a very straightforward procedure. When the new user logs in for the first time, they are redirected to a page where they can


manage social networks. Here the user can add the social network accounts they would like to integrate into SNAP (at least one). On the user's homepage, the user's feed is displayed in an ordered fashion: the social network with the highest priority is displayed first, then the next in priority, and so on until all the spaces on the screen are filled. The user can also choose to see a particular social network using the filter button. At any time the user can delete a social network or add new ones to the system, making SNAP very useful for people who have multiple accounts on different social networking sites. The homepage also contains a share section, where the user can share a status with one or multiple social networks. Messages, photos, and profile edit pages are deprecated in the current version.

3.3 Database and Adaptation Implementation The database is the foundation of this solution. The project depends on storing the user's social network credentials only once throughout their entire use of the website: each of the user's account access IDs and tokens is stored in the database for quick access, so the user needs to log in only once and the system handles the rest. The database uses stored procedures to perform its functions. The adaptation is provided by a filter class, designed to interface the front end with the back end of the system; the front end comprises all the displayable sections of the solution, while the back end concerns itself with data gathering and representation. The filter class satisfies requests from the front end by processing each request and then executing all the necessary back-end method calls and object creation. Once this is done, the results are gathered and filtered using simple adaptivity algorithms to relay the correct set of elements to the front end for display. This class therefore effectively handles the adaptive part of the solution. The filter is specifically intended to use statistical data collected from the use of the site to make assumptions about the user's preferences. Currently, the filter only sorts the social networks into order of preference when they are displayed on the front end and attempts to allocate more space to the more preferred social network accounts. The following concern emerged when displaying features on the front end: some social networks do not receive the same bulk of updates that others do. If all the available space were given to the heaviest social network, all other social networks would be given much less prominence than they should. We had to ensure that the user sees most of their preferred material while other material is not omitted entirely.
To solve this we implemented a filter that retrieves the user's social network accounts from the database and iterates through each, requesting data. How much of the returned data is added to the final result depends entirely on the user's preferences. Thus a filtering effect is achieved: a large amount of data from one particular network is allowed to pass while some of another is blocked. To handle such data, the return types as well as the method names of each requested feature had to be uniform. This helps to produce code efficiently, reduce maintenance effort, and improve the readability and overall quality of the code. The methods that are absolutely necessary are dictated by the parent class which all social network classes must extend. The return objects are in turn dictated by a set of "generic" classes that ensure the returned objects give the necessary data to the front end of the system.
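The space-allocation behaviour described above might look like the following. This is a hypothetical Python sketch; the real filter is a C# class, and the click-count heuristic, slot numbers and function name here are our own assumptions.

```python
# Sketch of preference-proportional space allocation: feed slots are split
# across networks in proportion to observed usage statistics, while at
# least one slot is reserved per network so none is omitted entirely.
# Assumes total_slots >= number of networks.

def allocate_slots(click_counts, total_slots):
    networks = list(click_counts)
    # Reserve one slot per network, then split the rest by preference.
    slots = {n: 1 for n in networks}
    remaining = total_slots - len(networks)
    total_clicks = sum(click_counts.values()) or 1
    # Largest-remainder apportionment of the remaining slots.
    shares = {n: remaining * click_counts[n] / total_clicks for n in networks}
    for n in networks:
        slots[n] += int(shares[n])
    leftovers = remaining - sum(int(shares[n]) for n in networks)
    by_fraction = sorted(networks, key=lambda n: shares[n] - int(shares[n]),
                         reverse=True)
    for n in by_fraction[:leftovers]:
        slots[n] += 1
    return slots

# A user who clicks Facebook far more than Flickr still sees some Flickr items.
print(allocate_slots({"facebook": 80, "twitter": 15, "flickr": 5}, 10))
```

The reserved slot per network is what prevents the "heaviest" network from crowding the others out of the display entirely.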

3.4 Authorization Sharing content and data between SNAP and the social network sites requires authentication, to ensure the confidentiality and integrity of communications. This has been achieved by utilizing the specific APIs of the social networks, where such mechanisms were available, together with the OAuth protocol (Hammer-Lahav, n.d.), which handles the access control. In the prototype the following APIs were used. The Facebook class uses the Graph API, Facebook's social graph view, which contains objects such as people, photos and events, and the connections between them (e.g., friend relationships, tags, etc.) (Facebook, 2011). The Twitter class uses the Twitterizer API (Twitterizer, 2011). The Flickr class uses the Flickr API (Yahoo! Inc., 2011), which consists of methods and API endpoints, to send and retrieve data in any of the following formats: REST, XML-RPC, SOAP, JSON, and PHP. There was no Google Buzz API for C#, which was an issue because SNAP was implemented in the .NET framework. An attempt was made to implement a Google Buzz API based on the Java client found on the Google Code page (Google, 2011), but it had to be abandoned due to problems with signing requests.
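For illustration, one widely used variant of OAuth request signing at the time was the OAuth 1.0a HMAC-SHA1 scheme (RFC 5849), sketched below in Python. SNAP itself used .NET libraries, and this sketch is simplified: a real request also carries oauth_nonce, oauth_timestamp, oauth_consumer_key, etc. among the parameters.

```python
# Sketch of OAuth 1.0a HMAC-SHA1 request signing (RFC 5849): a signature
# base string is built from the request and signed with a key derived from
# the consumer secret and the token secret.
import base64
import hashlib
import hmac
import urllib.parse

def pct(s):
    # Percent-encode per RFC 3986 (OAuth requires "~" to stay unencoded).
    return urllib.parse.quote(s, safe="~")

def sign_request(method, url, params, consumer_secret, token_secret=""):
    # 1. Normalize parameters: percent-encode, sort, join with "&".
    norm = "&".join(f"{pct(k)}={pct(v)}" for k, v in sorted(params.items()))
    # 2. Build the signature base string: METHOD&URL&PARAMS, each encoded.
    base_string = "&".join([method.upper(), pct(url), pct(norm)])
    # 3. The signing key is consumer_secret&token_secret.
    key = (pct(consumer_secret) + "&" + pct(token_secret)).encode()
    # 4. HMAC-SHA1 over the base string, base64-encoded.
    digest = hmac.new(key, base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()

sig = sign_request("GET", "http://example.com/photos",
                   {"oauth_token": "token", "size": "medium"},
                   "consumer-secret", "token-secret")
```

Because only the secrets and the signature travel-independent base string are needed, the stored access tokens mentioned in Section 3.3 are sufficient to sign later requests without asking the user to log in again.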


4. EVALUATION The tests planned for SNAP were black-box testing (testing the system with an external view), white-box testing (testing whether the correct data constructs are created while parsing and receiving data from the social networks), and field (alpha) testing, where users were asked to fill in a questionnaire to provide feedback on their use of the website. Due to time constraints, the alpha testing performed was brief and with fewer test users than would have been desired. However, the developers' team thoroughly tested the functionalities of the website using the black-box and white-box techniques; in the process, some of the planned features were deprecated in this version of the website. During the test period, the system was viewed with the following browsers: Microsoft Internet Explorer, Mozilla Firefox, Opera and Google Chrome. The system performed without problems on Opera and with some issues in displaying Twitter images in Firefox. Google Chrome displayed the website and the implemented social network functionalities appeared to be working, but adaptation did not seem to be possible with this browser at the time of testing. At the time of testing, Microsoft Internet Explorer did not work with the system. Black-box tests were planned on five groups of features of the system: initial log-in and registration, integration of a social network account, deletion of a social network account, posting, and photo functionalities. Four of the planned aspects were tested, the exception being the photo functionalities, since these had to be deprecated from the system. The log-in testing involved testing registration, first-time log-in and subsequent log-ins. In the final version of the website, the output of the system matched the expected output, with the exception of the registration procedure, where some formatting errors occurred on the return to the login page.
In the next tests (adding a social network and granting permissions), Twitter and Facebook accounts were successfully integrated; Flickr, however, could not be. It was known from earlier development stages that a Buzz account could not yet be implemented without some compromise of the privacy and security of the system. Tests for deleting a social network account were unsuccessful: the accounts appeared deleted, but in reality persisted. This test was performed on Facebook and Twitter accounts only, for the reasons described above. The posting tests were meant to establish whether the system checks for posts in all social networks added, whether it checks for posts in a specific network (chosen by the user), and whether a user status shared across all social networks is visible on all of them. When checking whether all posts from a user's social networks are displayed in order of preference, the test was successful (Chrome had some issues with accounting for preference). With regard to specific social networks, the system displayed all posts from the user's social network account in order of most recent, with comments, as it was meant to. Sharing a status across all social networks was successful with regard to Facebook and Twitter. White-box testing was designed to examine the functions of the system as they were coded. The functions examined were Facebook Get Photos, Filter get Posts, Filter get user posts, and Facebook getInbox. All the tests were successful, with the exception of Filter get user posts: with the SNAP user as input, it was supposed to return a list of post objects containing post data, sorted according to preference, but the test failed with an empty array. To complement the developers' testing, 24 users (11 male, 13 female) were asked to review their initial experience with the system in an informal questionnaire.
They rated the success of social network integration and various site attributes, namely: overall experience, ease of use, social network functionality, feed, initial account setup, and privacy and security. The integration of Facebook was rated best, and that of Twitter, Flickr and Buzz neutral to negative. The site attributes received neutral scores, with the exception of the design and layout, which received high scores, and the site speed, which received low scores. The users also left free-text comments, in which they most frequently reported problems with displaying the Twitter timeline. Some of the testers indicated in their comments that they would be interested in using a social network integrator of this kind.


5. CONCLUSION The SNAP adaptive web portal was developed out of the need for a system which relieves the user of the burden of visiting different social network sites separately in order to maintain contact with their friends, family and colleagues. While amalgamating portals exist, SNAP is designed to also respond to the changing interests and information needs of its users. In other words, SNAP was designed as an adaptive portal which displays the user's social networks in an order of preference that the system learns unobtrusively on its own, with the help of statistical filtering. As SNAP was developed within a tight time frame, only basic testing could be done before launching the website, and only a brief initial test with participants was possible. During the final testing by the developers it was discovered that while the core functionalities are in order, there is still room for improvement in terms of the social network functionalities which had to be deprecated, as well as some browser discrepancies. Future work on the project remains a viable possibility, since we have retained a number of ways to further improve our solution in terms of features, reliability and flexibility. Some possible new avenues include photo management and uploading, photo album management, and further adaptivity through layout changes to emphasize specific functionality. Efforts need to be made to resolve the issues with the integration of Buzz in a way which does not compromise the privacy and security of our users.

REFERENCES
Baena, A. et al., 2000. An Intelligent Tutor for a Web-Based Chess Course. AH '00 Proceedings of the International Conference on Adaptive Hypermedia and Adaptive Web-Based Systems. Springer-Verlag, London, UK, pp. 17-26.
Balabanović, M., 1997. An Adaptive Web Page Recommendation Service. Proceedings of the First International Conference on Autonomous Agents. Marina del Rey, California, USA, pp. 378-385.
Bauer, C. and Scharl, A., 2000. Quantitative Evaluation of Web Site Content and Structure. Internet Research: Electronic Networking Applications and Policy, pp. 31-44.
Bental, D. et al., 2000. AH '00 Proceedings of the International Conference on Adaptive Hypermedia and Adaptive Web-Based Systems. Springer-Verlag, London, UK, pp. 27-37.
Brusilovsky, P. and Milan, E., 2007. User Models for Adaptive Hypermedia and Adaptive Educational Systems. In: Brusilovsky, P. et al., eds. The Adaptive Web. Springer-Verlag, Berlin Heidelberg, pp. 3-53.
Facebook, 2011. Graph API Reference Documentation. Available: http://developers.facebook.com/docs/reference/api/. [Accessed 24 March 2011].
Google, 2011. Google Buzz API. Available: http://code.google.com/apis/buzz/docs/libraries.html. [Accessed 24 March 2011].
Hammer-Lahav, E. et al., n.d. OAuth. Available: http://oauth.net. [Accessed 24 March 2011].
Mican, D. and Tomai, N., 2010. Association-Rules-Based Recommender System for Personalization in Adaptive Web-Based Applications. Proceedings of the 10th International Conference on Current Trends in Web Engineering. Vienna, Austria. Springer-Verlag, Berlin Heidelberg, pp. 85-90.
Sadat, H. and Ghorbani, A. A., 2004. On the Evaluation of Adaptive Web Systems. Proceedings of the Workshop on Web-Based Support Systems. Beijing, China, pp. 127-136.
Twitterizer, 2011. Twitterizer: We Want to Give Your App Twitter. (Updated 22 March 2011.) Available: http://www.twitterizer.net/. [Accessed 24 March 2011].
Yahoo! Inc., 2011. The App Garden. Available: http://www.flickr.com/services/api/misc.overview.html. [Accessed 24 March 2011].


ONLINE WOM USAGE MODEL FOR TOURISM Masato Nakajima, Kosuke C. Yamada and Muneo Kitajima Center for Service Research, National Institute of Advanced Industrial Science and Technology (AIST) 2-3-6 Aomi, Kouto-Ku, Tokyo, JAPAN

ABSTRACT Nowadays, service providers in tourism cannot utilize online word-of-mouth (WOM) information to improve their services for tourists, because they do not understand how tourists use WOM information. In this paper, we investigated how tourists utilize online WOM as a consumer-to-consumer (C2C) interaction for choosing travel destinations, using a methodology based on Cognitive Chrono-Ethnography (CCE) (Kitajima, Nakajima and Toyota, 2010). In particular, we focused on senders' and receivers' styles of using online WOM. Based on these observations, we built an online WOM usage model for tourism. The model helps service providers evaluate the effects of online WOM on tourists' choices. KEYWORDS Online WOM, Tourism, C2C interaction, service provider, Cognitive Chrono-Ethnography (CCE)

1. INTRODUCTION Nowadays, website information in tourism, one of the largest service industries, plays a significant role in tourists' choices of destinations. An online WOM is an especially valuable source for visitors selecting the hotel they will stay at and the local spots they will visit. An online WOM is also valuable for service providers, because it includes feedback on tourists' experiences, expectations, etc., which providers can use to improve their services for tourists. However, tourism service providers cannot yet utilize this information, primarily because it is not easy for them to extract the information they need from an online WOM, due to a lack of understanding of how a WOM is written and read by tourists. In this study, we investigated how tourists use information in an online WOM to choose travel destinations. In particular, we focused on senders' and receivers' styles of using online WOMs. Finally, we built a model that businesses can use to evaluate how an online WOM affects tourists' choices. This model describes the relationships among senders, receivers, and service providers in terms of how each agent uses the online WOM.

2. METHOD We adopted Cognitive Chrono-Ethnography (CCE; for the methodology and theoretical background of CCE, see Kitajima, et al., 2010) as the methodology to conduct the study. Step 1. Selecting monitors Purpose: We conducted a web survey to select receiver monitors and sender monitors, 18 in total, who would participate in the in-depth interview sessions. Method: The Web survey consisted of two questionnaires: their attitudes on using online WOM and contents they were interested in as receiver and sender. The questionnaire for receivers included four attitude items: 1) frequency of using daily online WOM, 2) acceptance of information on an online WOM, 3) influence of information on an online WOM, and 4) contribution. Attitude items for senders included 1) frequency, 2) desire for communication, 3) intention to influence receivers’ activity, and 4) intention to contribute to receivers. The respondents were required to answer both types of questions using a scale of high, middle, low and none. The questionnaire included seven content of interest items: spa, meals, hotel,


town, shopping, sightseeing, and others. The respondents were asked to indicate the items in which they were interested. Results: First, we extracted 636 monitor candidates from about 10 thousand respondents who had visited Kinosaki onsen (one of the most famous Japanese traditional spas, located in Hyogo Prefecture) using an online WOM. Second, we applied Hayashi's quantification method type III to categorize the responses. We then categorized the monitor candidates into three attitude groups (high, middle, and low) for receiver and sender. Similarly, we categorized the candidates into four content groups for receiver (all-over type, semi-all-over type, spa type, and hotel type) and three content groups for sender (all-over type, semi-all-over type, and hotel type). The difference between the all-over and semi-all-over types is the quantity of answers for each question. In this way, we defined 12 groups for receivers and 9 groups for senders. Since we were not interested in low-attitude senders, we chose 18 study monitors from the 12 receiver groups and 6 sender groups. Step 2. In-depth interview We conducted in-depth interviews twice with each monitor to determine the styles of using online WOM for senders and receivers. The two interviews were separated by two or three weeks. The monitors were required to record their daily use of online WOMs in writing; this was called the diary memo and was used in the second interview. In the interviews, we asked about monitors' current and past use of online WOMs.
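Hayashi's quantification method type III is closely related to correspondence analysis, and the grouping step above can be illustrated with a toy sketch. This is a Python/NumPy example with invented data, not the survey's: respondents and answer items are scored on a latent axis via an SVD of the standardized residuals of a 0/1 respondent-by-item matrix, so respondents with similar answer patterns receive nearby scores and can be grouped.

```python
# Toy sketch of quantification method type III / correspondence analysis:
# score respondents and items on a latent axis so that respondents with
# similar answer patterns cluster together. Data below is invented.
import numpy as np

X = np.array([  # rows: respondents; cols: interest items (spa, meals, hotel)
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
    [0, 0, 1],
], dtype=float)

P = X / X.sum()                      # correspondence matrix
r, c = P.sum(axis=1), P.sum(axis=0)  # row and column masses
E = np.outer(r, c)                   # expected frequencies under independence
S = (P - E) / np.sqrt(E)             # standardized residuals
U, sv, Vt = np.linalg.svd(S, full_matrices=False)
row_scores = U[:, 0] / np.sqrt(r)    # first-axis scores for respondents
col_scores = Vt[0] / np.sqrt(c)      # first-axis scores for items

# Respondents 0 and 1 (spa-oriented) land on one side of the axis,
# respondents 2 and 3 (hotel-oriented) on the other; cutting the axis
# into intervals yields content groups like those used in the study.
```

In the study this kind of scoring, applied to the real survey responses, is what underlies the partition into attitude and content groups.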

3. RESULTS AND DISCUSSION Table 1 presents typical answers given by most monitors to the main questions in the in-depth interview.

Table 1. Answers of receivers and senders in the in-depth interview

RECEIVER
What is the purpose of using an online WOM?
 To choose sight-seeing spots and hotels.
 To plan for real and future travel.
What information do receivers want to obtain?
 Raw opinions, new information, brief and concise information.
 Negative information for mental preparation and eliminating candidates.
 Contents (rooms, facilities, meals, staff service, tour spots, etc.).
From whom?
 Persons with similar characteristics (age, sex, style of travel).
 Persons experienced in travel and in sending online messages.
What types of online WOM messages are credible?
 Quantity and quality of online WOM messages.
 Quantity of similar opinions.
 Evaluation score of a hotel.

SENDER
What is your motivation for writing online WOM messages?
 To indicate satisfaction or complaints (mainly satisfaction).
 To provide information that they want to tell tourists.
What messages do senders write on the website?
 Impressions (evaluation) or information (facts).
 Positive or negative messages.
 Contents (rooms, facilities, meals, staff service, tour spots, etc.).
For whom?
 Staff members of the hotel where the tourist stayed.
 Other tourists.
What points does the sender consider in the messages?
 Ways of writing phrases (conciseness and usage of language).
 Length of a message.
 Difference from the other messages.

3.1 Styles of Senders 1) Most tourists were motivated to send a message on an online WOM when they were satisfied with their travel. For example, they sent messages when the hotels they stayed in were very clean and comfortable, the meals were very good, and the service staff were very kind and polite. Therefore, almost all senders wrote


positive messages on online WOMs. In contrast, senders seldom wrote negative messages unless they were very dissatisfied. 2) What do senders write in their messages, and to whom do they send them? Many senders indicated that they write online WOM messages to hotel staff members and to other tourists. To the staff members, senders express their appreciation and satisfaction for the kind and polite service received during their stay. To the tourists, senders write about the cleanliness of the hotel, the quantity of the served meals, and the service; such facts are difficult for tourists to obtain from published media and the hotel website.

3.2 Styles of Receivers 1) Receivers exhibited two uses of online WOMs: choice and planning. Receivers using WOMs to choose where to visit and stay tend to be heavy internet users and get many kinds of information from websites in their daily lives. In contrast, receivers using WOMs for planning their travel tend to be light internet users. They get information about travel mostly from published media such as books, magazines, and brochures, and they use online WOMs to obtain the latest local information to complement what they found exploring published media. However, both types of receivers want facts that impressed everyone rather than personal feelings. 2) What style of online WOM messages do receivers want and accept? Receivers trusted the quantity of messages written in online WOMs and the content of messages related to them. In particular, similar opinions written by different people become more trustworthy, regardless of whether they are positive or negative. However, positive information seldom influences a tourist's final decision, whereas negative information will. Many receivers regard an online WOM that does not include negative messages as untrustworthy. A preference for negative information on WOM is also reported by other studies (e.g., Herr, Kardes, and Kim, 1991).

3.3 Online WOM Usage Model for Tourism We arranged the relations among sender, receiver and service provider in using online WOMs to build a model for understanding tourist expectations and needs; we call this the online WOM usage model for tourism (Fig. 1).

Figure 1. Schematic image of online WOM usage model for tourism


This model is explained as follows. 1) Structure: The online WOM is at the center of the model and consists of sites and contents. There are three types of sites on which online WOM messages are written. Three agents representing the sender, the receiver, and the business are placed around the central elements of the online WOM. The motivation for using online WOMs and the information required by each agent are described in boxes. 2) Sender: A tourist would like to send an online WOM message to hotel staff members or other tourists when they were satisfied with the services of the hotel in which they stayed. Almost all the messages are positive, expressing appreciation and positive impressions about rooms, meals and service. 3) Sender: Tourists write messages on websites managed by tour companies and the hotel where they stayed, on their own weblogs, and on social networking services (SNS) such as Mixi. Most tourists write messages on websites managed by tour companies because these sites provide exposure to many people and are trustworthy, since they restrict submitters to those who actually reserved a room through the site. These tourists intend to express their appreciation to hotel staff members and to recommend the hotel and their favourite spots to other tourists. Other tourists often write messages on their own weblogs and SNS to convey a personal message. 4) Receiver: Receivers explore an online WOM for information on their travel. Some receivers use WOMs to decide the spots they will visit and the hotel where they will stay. Others use WOMs to plan their itinerary when they cannot obtain information from published media and TV. Receivers who use online WOMs to make decisions explore negative information not only to exclude unfavourable candidates but also to confirm that an online WOM site has credible information. Moreover, they ignore negative information unrelated to them even when there is much such information.
5) Staff members: Staff members in tourism use online WOMs to confirm whether tourists were satisfied with the service provided and how the tourists evaluated them. Some hotel staff members respond to a message as soon as the tourist sends it; such immediate responses tend to be evaluated favorably by future visitors.

4. CONCLUSION
In this paper, we investigated how tourists use online WOMs as a form of C2C interaction for choosing travel destinations, and we built an online WOM usage model for tourism. The model should help service providers appropriately evaluate the effect of online WOMs on tourists' choices. For example, the model shows that receivers tend to explore negative information, whereas almost all senders prefer to write positive messages; it is interesting that online WOM thus consists of asymmetric communication between receiver and sender. In future work, we will investigate how service providers use online WOMs, in order to complete a model that so far covers only C2C interactions. Interest in C2C interactions has recently increased in online WOM studies (e.g., Maclaran and Catterall, 2002; Nah, Siau, Tian and Ling, 2002; Xue and Phelps, 2004), and many companies face similar problems: it has been suggested that companies need to utilize the feedback from online C2C interaction (Maclaran and Catterall, 2002). We therefore intend to apply the methodology adopted in this study to other service industries, constructing models centered on online WOM so that they can improve their quality of service.

ACKNOWLEDGEMENT
This research was commissioned by the Ministry of Economy, Trade and Industry (METI).


ISBN: 978-972-8939-40-3 © 2011 IADIS

REFERENCES
Herr, P.M., Kardes, F.R. and Kim, J. (1991): "Effects of Word-of-Mouth and Product-Attribute Information on Persuasion: An Accessibility-Diagnosticity Perspective", Journal of Consumer Research 17(4): 454-462.
Kitajima, M., Nakajima, M. and Toyota, M. (2010): "Cognitive Chrono-Ethnography: A Method for Studying Behavioral Selections in Daily Activities", Proceedings of The Human Factors and Ergonomics Society 54th Annual Meeting 2010, 1732-1736.
Maclaran, P. and Catterall, M. (2002): "Researching the Social Web: Marketing Information from Virtual Communities", Marketing Intelligence & Planning 20(6): 319-326.
Nah, F., Siau, K., Tian, Y. and Ling, M. (2002): "Knowledge Management Mechanisms in E-Commerce: A Study of Online Retailing and Auction Sites", The Journal of Computer Information Systems 42(5): 119-128.
Xue, F. and Phelps, J.E. (2004): "Internet-Facilitated Consumer-to-Consumer Communication: The Moderating Role of Receiver Characteristics", International Journal of Internet Marketing and Advertising 1(2): 121-136.


ISOLATING CONTENT AND METADATA FROM WEBLOGS USING CLASSIFICATION AND RULE-BASED APPROACHES

Eric J. Marshall and Eric B. Bell
Pacific Northwest National Laboratory

ABSTRACT
The emergence and increasing prevalence of social media, such as internet forums, weblogs (blogs), and wikis, has created a new opportunity to measure public opinion, attitude, and social structures. A major challenge in leveraging this information is isolating the content and metadata in weblogs, as there is no standard, universally supported, machine-readable format for presenting this information. We present two algorithms for isolating this information. The first uses web block classification, where each node in the Document Object Model (DOM) for a page is classified according to one of several pre-defined attributes from a common blog schema. The second uses a set of heuristics to select web blocks. These algorithms perform at a level suitable for initial use, validating this approach for isolating content and metadata from blogs. The resulting data serves as a starting point for analytical work on the content and substance of collections of weblog pages.
KEYWORDS
Weblog, Extraction, Classification

1. INTRODUCTION
The explosion in popularity of social media, such as internet forums, weblogs (blogs), and wikis, in the past decade has created a new opportunity to measure public opinion, attitude, and social structures (Agichtein et al. 2008). Reaction to news or events in this medium is often nearly instantaneous, providing the opportunity to make quick measurements of the short-term impact of specific stimuli (Qualman 2010). A major challenge in leveraging this information is extracting it automatically from web pages, as no standard, universally supported, machine-readable format for presenting it exists. In this paper, we examine one form of social media, blogs, and explore methods for automatically isolating common attributes from arbitrary blogs. This research describes techniques to distill raw HTML data into a common, structured schema that isolates textual content and metadata. In Section 2, we discuss blogs in general and the shortfalls of the current syndication formats. We present our two approaches for isolating content and metadata in Sections 3 and 4, and our experimental results in Section 5. We close with a discussion of future work in Section 6. We begin with a brief description of some of the previous work in this area.

1.1 Previous Work
One previously used approach to isolating content and metadata from web pages is to manually write rules for each format of each site from which information is to be extracted (Hammer et al. 1997). This approach works well when there is a small number of sites of interest with known formats, but it does not scale to a large number of sites. Besides being labor intensive, it has the further problem that the extraction rules fail whenever a site author changes the format. Other approaches to automated information extraction from web pages have focused on extracting data records by finding repeated patterns in the HTML tags that make up a web page and creating


generalized wrappers (i.e., wrapper induction) to locate those repeated patterns (Wang, and Lochovsky 2003; Algur, and Hiremath 2006). This approach requires the site to present multiple data records in a regular way on a single page. However, this model does not apply well to blogs, as much of the useful information from a blog is spread out across many pages.

2. BLOGS
To limit the scope of our research, we focus on the following attributes found on a typical blog post: title, content, and date. This list is clearly not exhaustive, as additional information that could be useful for extraction is available, such as comments on the posts, but we focus on these attributes because they are common to all or most blog sites. Given the data-rich pages from a blog site (referred to below as "post pages"), this research aims to extract the title, content, and date attributes for each post page provided. This task would be trivial if each site provided this information in a standard way. Although methods for doing exactly this have been developed and are quite mature, we discuss in Section 2.1 why they are inadequate.

2.1 Syndication Formats
Some work has been done to create standard syndication formats, such as RSS and Atom, which can be used to publish new information in a common, machine-readable format (Hammersley 2005). When implemented properly, these formats can be used to extract much of the information from social media. However, many blog authors restrict their feeds by truncating the content contained in the feed, forcing users to visit the post page directly to access the full content of the post. Furthermore, some of the less-popular blog authoring packages do not support these syndication formats by default. In light of the possible absence or incompleteness of RSS and Atom, they cannot be relied upon alone to provide the information that we seek.

2.2 Identifying Relevant Content
One of the challenges of information extraction from web pages is separating boilerplate content (e.g., headers, footers, navigational elements, and ads) from the relevant content unique to each page. Some recent research has been done in this area, including (Kohlschutter, Fankhauser, and Nejdl 2010; Wang, and Lochovsky 2002). Kohlschutter et al. (2010) published an open source implementation of their algorithm, Boilerpipe, which we found to perform well on selected index and post pages from our dataset. The Boilerpipe algorithm uses a combination of structural features, shallow text features, and densitometric features to classify each atomic block of a web page as either relevant or not relevant. We use this relevance classification when we compute the features for each web block, which we discuss in more detail in Section 3.

3. WEB BLOCK CLASSIFICATION
We begin by dividing a web page into a series of blocks. We define a block to be any structural HTML element of a web page, where structural elements include body, div, span, h[1-6], table, tr, td, and p tags. Unlike (How, Kan, and Lai 2004), which visits blocks in a bottom-up manner, we process the blocks of a page in a top-down manner. This allows us to identify the most general element containing the metadata of interest. We found that a bottom-up approach is prone to a sub-optimal division of elements, as some relevant blocks may be left out and not classified with a neighboring element under certain conditions. By viewing a page as a collection of elements rather than an explicitly formatted container for information, our approach easily handles changes to the format of a site. Additionally, the predictive


nature of the classification and heuristic-based approaches provides the ability to handle unseen blogs, including those with formats notably different from those previously encountered.
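As an illustration, the top-down traversal over structural elements might look like the sketch below. It assumes the DOM has already been parsed into simple (tag, children, text) tuples; the tree-building step and the tag names beyond the paper's structural list are omitted, so this is a sketch of the traversal order, not the authors' implementation.

```python
# Structural tags from the paper: body, div, span, h1-h6, table, tr, td, p.
STRUCTURAL = {"body", "div", "span", "table", "tr", "td", "p",
              *(f"h{i}" for i in range(1, 7))}

def blocks_top_down(node, path=""):
    """Yield (xpath-like path, node) for every structural element,
    visiting parents before their children (top-down)."""
    tag, children, text = node
    here = f"{path}/{tag}"
    if tag in STRUCTURAL:
        yield here, node
    for child in children:
        yield from blocks_top_down(child, here)

# A tiny hypothetical post page as a nested tuple tree.
page = ("body", [
    ("div", [("h1", [], "Post title"),
             ("div", [("p", [], "First paragraph.")], "")], ""),
], "")

paths = [p for p, _ in blocks_top_down(page)]
# parents appear before their children, e.g. /body before /body/div
```

Because parents are yielded before children, the first block whose contents match a target attribute is also the most general element containing it, which is the property the text motivates.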

3.1 Web Block Classification Features
Once we have divided a page into a series of blocks, we generate a feature vector for each block. The novelty of our approach lies in the variety of features we compute for each block: structural and stylistic features, similar to (How, Kan, and Lai 2004); relevancy features, which capture the structural, shallow text, and densitometric features of (Kohlschutter, Fankhauser, and Nejdl 2010); and features based on the relative size of each block, the values used by the algorithm of (Hiremath et al. 2005b) for identifying the most relevant section of a page.
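A minimal sketch of what such a per-block feature vector could contain is given below. The block representation and the particular features (DOM depth, token count, average word length, relative size) are illustrative assumptions standing in for the feature families named above, not the paper's exact feature set.

```python
def block_features(text, depth, page_total_words):
    """Compute a small illustrative feature vector for one web block."""
    words = text.split()
    n_words = len(words)
    avg_word_len = sum(len(w) for w in words) / n_words if words else 0.0
    return {
        "depth": depth,                 # structural: nesting depth in the DOM
        "n_words": n_words,             # shallow text: token count
        "avg_word_len": round(avg_word_len, 2),
        # relative size of the block compared to the whole page
        "rel_size": n_words / page_total_words if page_total_words else 0.0,
    }

# Hypothetical (text, depth) pairs for three blocks of one page.
blocks = [("Post title", 2), ("First paragraph of the post body.", 3), ("", 1)]
total = sum(len(t.split()) for t, _ in blocks)
vectors = [block_features(t, d, total) for t, d in blocks]
```

Each vector would then be labeled ("Title", "Content", "Date", or "Other") and handed to a classifier, as described in Section 5.1.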

4. HEURISTIC-BASED APPROACH
The heuristic-based approach begins by identifying post pages from the index page of a site. This involves identifying the largest block on the page, dividing that block into post blocks, and comparing the content of the post block candidates to other local pages in the directory structure. Identification of the blog elements (content, title, and date) begins by pulling title candidates from the post pages and comparing them to the content of the post blocks on the index page. Candidate XPaths are scored in a weighted voting scheme across multiple post pages to select the optimal XPath for extracting a post's title across the website. Locating the content works like locating the title, except that content candidates rather than title candidates are pulled from the post pages. Date extraction works differently: it begins by searching all n post pages for regular expressions matching known date formats, starting at the DOM nodes containing the content and title and then branching outward. Weighted voting allows a pattern of dates occurring at a specific element in the same format across post pages to become the XPath and format for the date element.
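The cross-page weighted voting can be sketched as follows. The candidate XPaths, their weights, and the date pattern are invented for illustration; the actual system derives its candidates and weights from the index-page comparisons described above.

```python
from collections import Counter
import re

def select_xpath(candidates_per_page):
    """Accumulate weighted votes for candidate XPaths across post pages
    and return the path with the highest total weight."""
    votes = Counter()
    for candidates in candidates_per_page:
        for xpath, weight in candidates:
            votes[xpath] += weight
    return votes.most_common(1)[0][0]

# Hypothetical (xpath, weight) candidates from three post pages.
pages = [
    [("/html/body/div/h1", 1.0), ("/html/body/div/span", 0.3)],
    [("/html/body/div/h1", 0.9)],
    [("/html/body/div/h2", 0.8), ("/html/body/div/h1", 0.5)],
]
best = select_xpath(pages)  # /html/body/div/h1 accumulates the most weight

# Date detection then scans text near the chosen elements for known formats;
# one illustrative pattern for mm/dd/yyyy or mm-dd-yyyy dates:
DATE_RE = re.compile(r"\b(\d{1,2})[/-](\d{1,2})[/-](\d{4})\b")
m = DATE_RE.search("Posted on 07/22/2011 by the author")
```

The same voting structure applies to dates: a (DOM location, date format) pair that recurs across post pages accumulates weight and is selected as the date element.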

5. EXPERIMENTAL RESULTS
We created a baseline system evaluated on 60 posts across three different blog sites. Two of the sites used a fixed format, while the third site's posts varied in format. We hand-created an extraction template for each site based on the content of a single post page, then compared the output of that template with the gold-standard annotation for each of the 60 post pages. The results are shown in Table 1.

Table 1. Precision, recall, and F-measure for a fixed-template baseline system

                   Precision   Recall   F-Measure
Baseline System    0.683       0.683    0.683

5.1 Classification Results
We generated a dataset by creating a feature vector for each node in each of the 60 blog posts. Each post contained 371 structural nodes on average, which were labeled "Title", "Content", "Date", or "Other". In all, we generated 22,285 feature vectors: 60 were labeled with each of the three targeted attributes, and 22,105 received the generic label. We used the Weka machine learning software package (Hall et al. 2009) for our classification system, running our dataset against a subset of the available and applicable algorithms in Weka, selected for substantial differences in implementation. We tested each algorithm using 10-fold cross-validation. All of the algorithms performed well on classifying the date, while all generally struggled with the content field. Initially this may seem counter-intuitive; however, we suspect it is due to the difficulty of selecting the appropriate depth for the content node, as many of our features would likely be quite similar across depths. We discuss this in more detail in Section 6.


Table 2. The average precision, recall, and F-measure of the five classification algorithms

Algorithm        Precision   Recall   F-Measure
NB Updateable    0.979       0.888    0.924
Bayes Net        0.988       0.969    0.976
Ada Boost M1     0.941       0.967    0.953
BF Tree          0.993       0.993    0.993
Decision Table   0.998       0.998    0.998

We present the overall precision, recall, and F-measure scores for each algorithm in Table 2. Each algorithm performed quite well on these metrics; however, we believe the scores may be slightly misleading, as the vast majority of examples labeled "Other" inflate the true-positive counts. This is most evident in the scores of Naïve Bayes Updateable, the only algorithm with relatively low accuracy on "Other" examples. To remove this inflation, we computed the scores presented in Table 3 by averaging the F-measures for the three remaining class labels, ignoring the examples labeled "Other". This provides an alternative overview of how well each algorithm predicted the attributes of interest. While automated classification tends to perform well on unseen sites, the requirement for training data makes it impractical for fully automated systems.

Table 3. Average F-measures of the five classification algorithms for the attributes "title", "content", and "date"

Algorithm        F-Measure
NB Updateable    0.687
Bayes Net        0.741333
Ada Boost M1     0.052
BF Tree          0.889333
Decision Table   0.961333

5.2 Heuristic-Based Selection of DOM Elements
To test the heuristic-based approach, we selected a set of 10 blogs from various domains. We ran the heuristic-based algorithm against a mirror of each site and compared the results to the gold standard. The results of this analysis are presented in Table 4, where Pass/Fail indicates whether the XPath produced by the heuristic system exactly matched the XPath in the gold standard. Of the ten blogs tested, the template was correctly matched for all four elements (Posts, Content, Title, and Date) on eight blogs. For the remaining two, the algorithm failed to correctly identify the XPath for post pages and was therefore unable to select correct XPaths for the subsequent elements. This is a disadvantage of the heuristic-based approach: failure at any point in the process results in continued failure at subsequent steps.

Table 4. Heuristic-based system results

ID   Domain       Posts   Content   Title   Date
1    Business     Pass    Pass      Pass    Pass
2    Technology   Pass    Pass      Pass    Pass
3    Technology   Pass    Pass      Pass    Pass
4    Politics     Fail    Fail      Fail    Fail
5    Politics     Fail    Fail      Fail    Fail
6    Politics     Pass    Pass      Pass    Pass
7    News         Pass    Pass      Pass    Pass
8    Science      Pass    Pass      Pass    Pass
9    Science      Pass    Pass      Pass    Pass
10   Sports       Pass    Pass      Pass    Pass


6. DISCUSSIONS AND FUTURE WORK
We have shown that, when paired with the right classification algorithm, our web block classification approach can achieve very good results. We believe the relatively low scores for isolating the content field result from what we call "DOM padding", where any one of a series of nodes can sufficiently isolate the desired information. For example, if the target node is located at /html/div/div, but the second div is the only child of the first, then there is effectively no difference between the two as far as extraction is concerned; however, only one of them carries the target label in the training set, leading to lower scores. Evaluation of this research began with an initial set of three blogs with sixty labeled examples; additional analysis was conducted on a set of 10 unseen blogs from a variety of domains and in vastly different formats. In the future, we would like to test our algorithms on a larger sample of blogs to see how they hold up as the diversity of the data increases. In addition, we would like to expand the number of attributes we target.

REFERENCES
Agichtein, E. et al, 2008. Finding High-Quality Content in Social Media. In Proceedings of the International Conference on Web Search and Web Data Mining (WSDM '08), 183-194. Palo Alto, Calif.: ACM Press.
Algur, S. P., and Hiremath, P. S. 2006. Extraction of Flat and Nested Data Records from Web Pages. In Proceedings of the Australasian Data Mining Conference. Sydney, Australia: Australian Computer Society, Inc.
Hall, M., et al, 2009. The WEKA Data Mining Software: An Update. SIGKDD Explorations, 11(1): 10-18.
Hammer, J. et al, 1997. Extracting Semistructured Information from the Web. In Proceedings of the Workshop on Management of Semi-structured Data, 18-25. Tucson, Arizona: ACM Press.
Hammersley, B. 2005. Developing Feeds with RSS and Atom. Sebastopol, Calif.: O'Reilly.
Hiremath, P. S., Benchalli, S. S., Algur, S. P., and Udapudi, R. V. 2005b. Mining Data Regions from Web Pages. In Proceedings of the International Conference on Management of Data. Goa, India: Computer Society of India.
How, L. C. et al, 2004. Stylistic and Lexical Co-training for Web Block Classification. In Proceedings of the International Workshop on Web Information and Data Management (WIDM '04). Washington, DC: ACM Press.
How, L. C. 2004. PARCELS: PARser for Content Extraction and Logical Structure (Stylistic Detection). Honours Year Project Report. Department of Computer Science, School of Computing, National University of Singapore.
Kohlschutter, C. et al, 2010. Boilerpipe. http://code.google.com/p/boilerpipe. Retrieved on May 1, 2010.
Kohlschutter, C. et al, 2010. Boilerplate Detection using Shallow Text Features. In Proceedings of the International Conference on Web Search and Data Mining (WSDM 10). New York, NY: ACM Press.
Qualman, E. 2010. Socialnomics. New York, NY: Wiley.
Raggett, D. 2010. HTML Tidy Library Project. http://tidy.sourceforge.net. Retrieved on May 1, 2010.
Wang, J., and Lochovsky, F. H. 2002. Data-rich Section Extraction from HTML Pages. In Proceedings of the 3rd International Conference on Web Information Systems Engineering (WISE '02). Atlanta, Georgia: IEEE Press.
Wang, J., and Lochovsky, F. H. 2003. Data Extraction and Label Assignment for Web Databases. In Proceedings of the Twelfth International World Wide Web Conference (WWW2003). Budapest, Hungary: ACM Press.


SENTIMENT ANALYSIS IN SOCIAL WEB ENVIRONMENTS ORIENTED TO E-COMMERCE

Luigi Lancieri and Eric Lepretre
Lille 1 University, Cité Scientifique, Villeneuve d'Ascq, France

ABSTRACT
The purpose of this paper is to gain a better understanding of users' behavior in a social web environment. The data were collected from web sites containing reviews of web merchants. On such sites, users provide a short description of the quality of the merchant as well as a degree of satisfaction. The idea underlying this study is to explore the textual structure of the reviews (e.g., their length) and the perceived ambiguity of the opinion they express. One result of this study is that there is a correlation between the degree of satisfaction and the length of the user's contribution. Another is that the ambiguity of an opinion differs depending on whether the opinion is positive or negative. We analyze the meaning of these findings in the growing context of social web interactions.
KEYWORDS
Measure, ambiguity, opinion, sentiment.

1. INTRODUCTION
Social web environments are nowadays very common but are not always easy to understand. Platforms such as Facebook, Twitter, or sites dedicated to reviews of commercial products have increasing influence on "real life" society. A DoubleClick study shows that more than half of consumers (up to 75% for travel purchases) look for information (reviews, etc.) before making an online purchase (Qiang et al, 2009). From a general point of view, we may wonder how the content of such reviews can inform us about the way people express their feelings within social web environments. Content here refers not to the strict meaning of the words but rather to the text structure and the indirect expression conveyed by the text. For example, in face-to-face interactions, the tone of voice or the rhythm of speech can often give us more information than the words themselves. Likewise, it is well known that physical posture and facial expression (smiling, etc.), in a word the indirect forms of communication, are extremely expressive. Even if the social web does not transpose all aspects of real interactions, it is an interesting context for analyzing indirect communication, thanks to the capacity of these environments to memorize traces of interactions. Computer scientists (among others) have studied the huge quantity of data accessible in such a milieu in order to compute behavioral indicators from raw data. In this paper, we first discuss the meaning of the length of users' input when such a contribution conveys sentiments, opinions, or reputations. This part of the work is achieved through descriptive statistics of review content compared to the degree of satisfaction provided by the author of the review. We then investigate the relation between the ambiguity of a review and the polarity (negative or positive) of the opinion.
This second part is achieved by comparing the degree of satisfaction expressed by the user with that obtained from a computer-based sentiment classifier. We assume that the greater the difference between the two, the higher the ambiguity of the review. We analyze the results of both parts of this work as well as the link between them. It is important to note that some social web environments place strong constraints on the size of users' contributions; the best example is Twitter, which is dedicated to very short texts. Other platforms (forums, blogs, etc.), on which we focus, are more open. This paper is organized as follows. First we describe our methodology and the results we obtain; then we review similar works; finally we discuss the outcome.


2. METHODOLOGY AND RESULTS
We begin by analyzing the statistical properties of the reviews, including the relation between review length and the polarity (negative or positive) of the opinion expressed. We then investigate the relation between the ambiguity of the reviews and their polarity.

2.1 Statistics of the Reviews
We collected a set of 5,014 reviews, each including the degree of satisfaction (from 0 to 10) expressed by the reviewer. The reviews came from online web sites dedicated to consumer expression (e.g., rateitall.com). The average length of a review is 66.7 words (SD = 66), and 90% of the reviews are shorter than the average plus one standard deviation. The reviews total 335,246 words.

Figure 1. Quantity of reviews per class of length

Figure 2. Degree of satisfaction versus length of the review

To show the relation between the length of a review and the degree of satisfaction of its author, we first rank all the reviews by length class (see the abscissa of Figure 2) and then compute the average degree of satisfaction within each class. Figure 2 shows this relation: for example, the first bar tells us that reviews of under 10 words have an average rating of 8.5. The figure suggests that the less satisfied the reviewer, the more he expresses himself. Indeed, satisfied users (ratings near 10) wrote shorter reviews (left of the plot), whereas unsatisfied users (ratings near zero) wrote longer reviews (right of the figure). This remark might lead us to conclude that people express themselves mainly when they are unsatisfied, but this is not the case: a distinction should be made between whether someone expresses himself at all and the length of his expression. This is confirmed by another view of our dataset. Figure 3 shows the distribution of the number of reviews per degree of satisfaction (from 1 to 10) and corroborates that most reviews are positive (1,425 with a score below 5, 3,410 with a score above 5).
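The binning behind Figure 2 amounts to grouping reviews into fixed-width length classes and averaging the satisfaction score per class. A minimal sketch on invented data (the reviews and scores below are made up for illustration):

```python
from collections import defaultdict

def satisfaction_by_length_class(reviews, bin_width=10):
    """Group (text, score) reviews into length classes of bin_width words
    and return the average satisfaction score per class."""
    sums, counts = defaultdict(float), defaultdict(int)
    for text, score in reviews:
        cls = (len(text.split()) // bin_width) * bin_width  # 0, 10, 20, ...
        sums[cls] += score
        counts[cls] += 1
    return {cls: sums[cls] / counts[cls] for cls in sums}

# Hypothetical reviews: two short positive ones, one long complaint.
reviews = [
    ("great hotel thanks", 9),
    ("room was fine and staff polite overall", 8),
    ("the order arrived late the box was damaged and support never "
     "answered my three emails so I asked for a refund", 2),
]
avg = satisfaction_by_length_class(reviews)
# short reviews fall in class 0 with a high average; the complaint in class 20
```

On real data this produces the bar heights of Figure 2: one average rating per length class.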

Figure 3. Distribution of reviews depending on the degree of satisfaction

We see that most users (around two-thirds) are satisfied. Furthermore, most of the unsatisfied users (again around two-thirds) are strongly unsatisfied (the peak at rating 1). This is logical and consistent with the fact that these users continue to trust and use online commerce. On the other hand, the situations that leave users unsatisfied are most of the time complex and need long explanations, unlike situations where everything is fine and there is nothing special to say.


2.2 Measuring Review Ambiguity
Another question concerns the relation between a review's length or polarity and the ambiguity of its content. Ambiguity can be defined as the degree to which a text clearly expresses an opinion (negative or positive). To evaluate ambiguity, we compare the degree of satisfaction given by the user with that assigned by an artificial sentiment classifier (which is assumed to be objective). The idea is that ambiguity is proportional to the level of disagreement between the classifier and the human. Both rated each review between 1 (disappointment) and 10 (satisfaction). For this study we used a sentiment classifier developed within our lab. Describing this classifier is beyond the scope of this study, but we provide a set of references with detailed information about the underlying techniques (Cui et al, 2006), (Gindl et al, 2008), (Qiang et al, 2009), (Xia et al, 2010). Our sentiment classifier achieves 89% precision and 85% recall, performances consistent with this category of tools. Precision can be seen as a measure of exactness or fidelity, whereas recall is a measure of completeness. First, we focus on the distribution of the amplitude of the disagreement, which is simply the absolute value of the difference between the two ratings. In Figure 4 the abscissa represents this difference; for example, the fourth bar means that in 9% of the cases the classifier gives a rating of 5 while the human gives 9 (or the opposite).
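The disagreement amplitude underlying Figure 4 is simply the absolute difference between the two ratings for each review. A sketch on invented scores (the actual ratings come from the users and the lab's classifier):

```python
from collections import Counter

# Hypothetical parallel ratings on the 1-10 scale for six reviews.
user_scores       = [9, 2, 5, 10, 1, 6]
classifier_scores = [8, 3, 9,  9, 1, 5]

# Ambiguity proxy: absolute difference between human and classifier ratings.
diffs = [abs(u - c) for u, c in zip(user_scores, classifier_scores)]
distribution = Counter(diffs)  # number of reviews per disagreement level

# Share of reviews with at most two points of difference,
# the quantity reported as "more than 66%" in the text.
close = sum(1 for d in diffs if d <= 2) / len(diffs)
```

Normalizing `distribution` by the number of reviews gives the percentages plotted on the ordinate of Figure 4.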

Figure 4. Percentage of reviews according to the difference between the classifier and the user notations.

Figure 5. Percentage of disagreement within each class of opinion notation

First of all, we can see that most of the time the classifier agrees with users: more than 66% of the reviews show only one or two points of difference between the two ratings. Let us now see how this difference is distributed over the classes of sentiment, in order to see where the disagreement is most visible. We consider, for both the classifier and the human, that a rating between 1 and 5 expresses disappointment and that a higher rating expresses satisfaction. Then, for each sentiment level (1 to 10), we compute the ratio of reviews on which there is a disagreement. For example, in Figure 5 the bar at 5, with a value of 34%, means that the classifier disagrees with 34% of the messages that users rated 5. As mentioned before, this level of disagreement is taken as a level of ambiguity. As expected, bars 5 and 6 show the highest ambiguity, corresponding to the middle-range position of these ratings. On the other hand, as Table 1 shows, whereas the level of disagreement is globally equal (88%) for each polarity (negative, positive), its distribution is very different within each polarity class (Figure 5). We observe that the level grows steadily from rating 1 to 6 and then drops quickly, suggesting that positive opinions rapidly become unambiguous, in contrast to negative opinions. We may view these relations as a dynamic model of ambiguity progression (i.e., from one class to the next).

Table 1. Level of disagreement depending on the polarity of reviews

                              Negative (1-5)   Positive (6-10)
Number of messages            1604             3410
Number of disagreements       1412             3001
Percentage of disagreement    88%              88%

Recall our earlier suggestion that negative messages are the longer ones because of the complexity of the discourse needed to express problems or reasons for dissatisfaction. In that case, it is not surprising that ambiguity disappears less rapidly in negative messages.


3. RELATED WORKS
The relation between the length of expression and customer satisfaction has already been studied in connection with word-of-mouth. Anderson, for example, stated that unsatisfied customers engage in greater word-of-mouth than satisfied ones (Anderson, 1998). In the context of collaborative work, several studies observe a relation between the number of contributors in a group and the length of individual contributions. Probably because of differing methodologies, there is no single conclusion regarding the direction of this relation: some works report that an increase in group size tends to reduce individual participation, while others find the opposite, and the structure of the group appears to have a crucial influence on the level of individual contribution (Lancieri, 2008). Avouris et al remark, for example, that larger groups produce better results and generate greater activity, but that this activity is less homogeneously distributed among the members of the group (Avouris et al, 2004). Valacich et al report an even more precise finding: the activity rate per member increases with group size when the group is composed of members with diverse skills, whereas it decreases when the group is homogeneous (Valacich et al, 1993). Apart from the structure of the group, other researchers note that the modality of the communication, and especially the user interface, also influences user contribution. Jensen, for example, showed that users' contribution to a group activity, such as an online game, was higher when the mode of communication was more evolved (voice, speech synthesis, text chat, no communication) (Jensen et al, 2000). The author points out that artificial speech generates more contributions than the equivalent text (chat), but still fewer than human conversation. It is also often observed that the length of a contribution depends on the modality of the interaction.
Questions and answers, for example, do not have the same length. Although Twitter allows status updates of up to 140 characters, and Facebook of up to 423, the collected questions had a mean length of only 75.1 characters (13.8 words). Morris et al. found that question length influences response relevance: questions with fewer sentences receive more helpful responses than those with many sentences (r = -0.13) (Morris et al, 2010). A large body of work, especially in the field of language theory, has long analyzed the influence of text complexity and structure on understanding. These works show that text length, linked with the level of lexical redundancy, is directly related to text comprehensibility. Details, discussion and references can be found in (Lancieri, 2008).

4. DISCUSSION

As we have seen, the length of a textual contribution may depend on several factors. First of all, there are personal reasons: the education or psychological profile of individuals may lead them to say as little as possible or, on the contrary, to give their opinion spontaneously. Social factors are also influential, as observed in face-to-face interactions; the opinions of other contributors or of the group often motivate personal expression. Unlike web environments built around communities and long-term interactions (hobbies, open source software, etc.), e-commerce and review sites are more focused on the expression of a final opinion. In such environments, users rarely expect responses or long-term interactions. The specificity of this kind of context is important to keep in mind when discussing the reasons for, or the implications of, text length. If we exclude or limit social influence, the direct relation between the length of a review and the degree of user disappointment may stem from two main reasons. The first, as noted previously, is that a problem is more complex to explain than a situation occurring as expected. The second may be related to the anger and frustration generated by the feeling of disappointment, which creates a need for expression. Regarding the ambiguity of the reviews, it is clear and quite expected for middle-range opinions (degrees 5 and 6). What is interesting is that the ambiguity disappears more rapidly in positive opinions than in negative ones. We assumed that this is linked to the difficulty of explaining situations leading to disappointment. In short, this would mean that negative opinions are not only difficult to explain but also difficult to understand, which may also explain why conflicts linked to these situations are not easy to resolve.


ISBN: 978-972-8939-40-3 © 2011 IADIS

REFERENCES

Anderson, E.W., 1998. Customer satisfaction and word of mouth. Journal of Service Research.
Avouris, N. et al., 2004. The effect of group size in synchronous collaborative problem solving activities. Proceedings of the AACE Conference ED-MEDIA 2004, Lugano, June 2004. http://hci.ece.upatras.gr/pubs_files/c80_Avouris_Margaritis_Komis_2004_ED_MEDIA.pdf
Cui et al., 2006. Comparative Experiments on Sentiment Classification for Online Product Reviews. Proceedings of the 21st National Conference on Artificial Intelligence (2006).
Gindl et al., 2008. Evaluation of different sentiment detection methods for polarity classification on web-based reviews. 18th European Conference on Artificial Intelligence (ECAI 2008), Patras, Greece, Workshop on Computational Aspects of Affectual and Emotional Interaction, pp. 35-43.
Jensen, C. et al., 2000. The Effect of Communication Modality on Cooperation in Online Environments. Proceedings of CHI 2000, The Hague, Netherlands, March 2000. http://research.microsoft.com/scg/papers/dilemmachi2000.pdf
Lancieri, L., 2008. Relation between the Complexity of Individuals' Expression and Groups Dynamic in Online Discussion Forums. The Open Cybernetics and Systemics Journal (TOCSJ), Volume 2, 2008.
Morris, M.R. et al., 2010. What Do People Ask Their Social Networks, and Why? Proceedings of the 28th ACM International Conference on Human Factors in Computing Systems (CHI 2010).
Qiang, Y. et al., 2009. Sentiment classification of online reviews to travel destinations by supervised machine learning approaches. Expert Systems with Applications, Volume 36, Issue 3, Part 2, April 2009, pp. 6527-6535.
Valacich, J.S. et al., 1993. Computer-mediated idea generation: the effects of group size and group heterogeneity. Proceedings of the Twenty-Sixth Hawaii International Conference on System Sciences, Volume IV, 5-8 Jan. 1993, pp. 152-160.
Xia, H. et al., 2010. Sentiment Text Classification of Customers Reviews on the Web Based on SVM. Sixth International Conference on Natural Computation (ICNC 2010).



TOWARD INTEGRATING SOCIAL NETWORKING SERVICE AND JAPANESE MANGA IN STRATEGIC CONSUMER GENERATED DESIGN

Anak Agung Gede Dharma, Hiroyuki Kumamoto, Shogo Kochi, Natsuki Kudo, Wei Guowei and Kiyoshi Tomimatsu
Graduate School of Design, Kyushu University, 4-9-1 Shiobaru, Minami-ku, Fukuoka, Japan

ABSTRACT

The growth of Social Networking Services (SNS) has created new potential in marketing. At present, the tendency to post the latest messages, information, or simply emotional expressions to SNS is increasing, and SNS are known for their effectiveness in delivering these kinds of data. From the perspective of product/service development, these data are known as casual data and can be managed with a method called abductive thinking. Although some online services that attempt to apply abductive thinking have been proposed in the past, their purposes and effectiveness are still questionable: in those services, users were only asked to give their opinion or participate in the system without receiving sufficient feedback. Another important consideration in designing consumer generated design services is to understand consumer aspirations and wishes. In recent Japanese culture, SNS are gaining popularity with Japanese internet users, who have also become accustomed to a large variety of online content. As a result, we designed an online service that integrates SNS as the medium and proposes a unique communication method, namely Japanese manga. In our proposed online service, users can directly contribute to the product/service development process in an interesting and enjoyable way. We designed a system that allows users to post manga describing their original ideas. While contributing to product/service development, users also benefit from expressing their interests and receiving feedback from other users, which can be used to improve their skills.

KEYWORDS

Consumer generated design, social media, casual data, syntax analysis, Japanese cartoon/manga

1. INTRODUCTION

Social Networking Services (SNS) are widely known as the latest phenomenon in web communities. Nowadays, web users are able to communicate in a whole different way, one that was not possible before. In contrast with other online services, SNS enable their users to engage in two-way communication, utilize various media, receive feedback, organize social events, etc. In other words, SNS can be used for various purposes that involve multiple users, with tremendous effect, e.g. consumer generated design. Consumer generated design is a widely used method for designing new products or services. In this method, consumers and producers are involved in the cycle of product design and development, where inputs and feedback are shared. A good consumer generated design service requires elaborate system design. A typical difficulty occurs during implementation and is caused by a lack of understanding of consumers' aspirations. If this flaw is not identified, the final product or service design will not match inherent consumer needs. However, with the right online service design, SNS can be applied successfully to retrieve users' opinions and give feedback in return. In Japanese culture, Twitter is one of the most commonly used SNS (Sayaka et al. 2010, p.111). Along with the increase in communication media, the utilization of Twitter is growing rapidly, and there is a tendency to post the latest messages, information, or simply emotional expressions to Twitter. With regard to product development, these data are also known as casual data, and a method to deal with them that is
called abductive thinking has been proposed (Kolko 2010, p.17; Serota & Rockwell, p.43). Furthermore, from a strategic product development perspective, we consider this phenomenon a new opportunity.

2. RELATED WORKS

2.1 Cuusoo Seikatsu

One web service that utilizes the concept of consumer generated design is Cuusoo Seikatsu (http://www.cuusoo.com). Cuusoo Seikatsu provides a service where ideas can be shared, published, and realized in the form of manufactured products. Manufacturers can create an account on Cuusoo Seikatsu and receive a dedicated link that serves as a community site; there they can issue their own policy and customize the contents. At present, LEGO and Muji (a Japanese retail company) are utilizing this web service to gather ideas for designing their products (Cuusoo Seikatsu, 2011). To contribute on Cuusoo Seikatsu, users select a link that takes them to a manufacturer's community site. On that page, users can contribute their own ideas, in the form of drawn sketches or pictures, or rate existing ideas.

2.2 Shindanmaker

Shindanmaker, on the other hand, is a network service that provides a custom-made diagnosis tool for Twitter users (http://shindanmaker.com). At present, Shindanmaker is gaining popularity among young Japanese Twitter users. To use the service, users fill in a form with their registered Twitter account; the Shindanmaker engine analyzes the account and suggests a diagnosis result as output. In addition, Shindanmaker enables its users to create or customize their own diagnosis system with ease: users simply name the diagnosis title and fill the entries (up to five) with the desired results. Each entry can contain up to 255 variables, which are displayed randomly depending on the input (Twitter Shindanmaker, 2011).

3. SYSTEM DESIGN

This section describes the mechanisms of our proposed idea, the system architecture regarding database and networking, the features, and the benefits of the system. We designed an online service that enables two-way communication between product/service developers and consumers. Product/service developers can use this service to publish their ideas while receiving input from their consumers. Consumers, on the other hand, can directly contribute to the cycle of product development by expressing and sharing their ideas. In this paper, we propose using manga as the medium for expressing ideas. Compared to sketches or an actual representation of the product, manga has several advantages: (1) whereas a sketch merely provides a simple visualization, manga provides a storyline that describes the narrative of the idea, its possible applications, or its limitations; (2) since many young Japanese people draw manga as a hobby, this service can be a suitable medium for them to pursue that hobby while receiving feedback from the general public; (3) as online content, manga can be enjoyed by a wide range of web users. Furthermore, consumer generated design is a design method where a product is designed as a result of consumer voice and aspiration. In this research, we propose a system where the producer publishes a question regarding product development and users can openly contribute by suggesting opinions or sketches. Our proposed system consists of an idea collector part, a manga part, and a data analysis framework comprising syntax analysis and keyword mapping (see Figure 1).
The system focuses on two scenarios in which users can openly contribute: (1) users play the role of respondents (Twitter users); they are asked for their opinions regarding a design concept and receive visual images as feedback; (2) users actively contribute their ideas by posting manga sketches (as manga artists) designed according to a specified theme.



Figure 1. Architecture diagram of our proposed system integrating SNS and Japanese manga in strategic consumer generated design

3.1 System Architecture

Our system was developed in the HTML5 web environment. We used PHP and JavaScript as the web programming languages and MySQL as the database. The system is designed to analyze the user's input with syntax analysis, store the result in the database, and publish it to Twitter. In addition to the web programming environment, we used two different Application Programming Interfaces (APIs), i.e. the Twitter API and the Syntax Analysis API from Yahoo! Japan (Twitter Developers, 2011; Yahoo! Japan Developer Network, 2011). We used two separate Syntax Analysis APIs from Yahoo! Japan: Japanese morphological analysis and Japanese dependency parsing analysis. We also used the Twitter API on the login page; it requires a web-based OAuth process to authenticate the web application. Users log in with their Twitter ID to access our system. After logging in, the user is asked to input a sentence to be analyzed. The analysis can be classified into two parts, i.e. morphological analysis and syntax analysis. Morphological analysis is the process of breaking the sentence down into morphemes and assigning the suitable lexical class to each morpheme. The Yahoo! API distinguishes 13 lexical classes: adjective, quasi-adjective, interjection, adverb, adnominal adjective, conjunction, prefix, suffix, noun, verb, particle, auxiliary verb, and unique lexical classes. Syntax analysis is the process of breaking the sentence down into its parts of speech with an explanation of the form, function, and syntactical relationship of each part; Japanese dependency parsing analysis falls under syntax analysis. This analysis adds an additional parameter, i.e. dependency (syntactical relationship), which describes the list of dependent morphemes for a given morpheme. The final result of these analyses consists of the word, parse, and dependency.
We repeat these analyses for each new sentence, and if a duplicate word is detected, its frequency is increased. Word, parse, dependency, and frequency are stored as separate fields in the database (see Figure 2-a). An example of the result of morphological and syntax analysis is shown in Figure 2-b. In this example, the sentence is composed of six morphemes. Morphemes located within the same phrase can be defined as syntactically dependent. For example, the morpheme pairs ワイン and 飲ん, ワイン and を, and ワイン and だ are located within a verb phrase and can be defined as syntactically dependent. We referred to Fry (2007) in visualizing the result of syntax analysis.
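The storage step described above can be sketched as follows. This is a minimal illustration in Python: an in-memory SQLite table stands in for the paper's PHP/MySQL stack, and a stubbed `analyze` function stands in for the Yahoo! Japan analysis APIs (the morpheme tuples and dependency indices below are invented for illustration, not real API output).

```python
import sqlite3

# Stub standing in for the Yahoo! Japan morphological/dependency analysis APIs.
# Each tuple is (word, lexical class, index of the morpheme it depends on);
# the values are invented for illustration only.
def analyze(sentence):
    return [("ワイン", "noun", 2), ("を", "particle", 2),
            ("飲ん", "verb", 3), ("だ", "auxiliary verb", -1)]

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE morphemes (
    word TEXT, parse TEXT, dependency INTEGER, frequency INTEGER,
    PRIMARY KEY (word, parse, dependency))""")

def store(sentence):
    # A duplicate (word, parse, dependency) row increments the frequency
    # field, as described in the text; a new one is inserted with frequency 1.
    for word, parse, dep in analyze(sentence):
        cur = conn.execute(
            "UPDATE morphemes SET frequency = frequency + 1 "
            "WHERE word = ? AND parse = ? AND dependency = ?",
            (word, parse, dep))
        if cur.rowcount == 0:
            conn.execute("INSERT INTO morphemes VALUES (?, ?, ?, 1)",
                         (word, parse, dep))

store("ワインを飲んだ")
store("ワインを飲んだ")  # same sentence again: every frequency counter goes to 2
freq = conn.execute(
    "SELECT frequency FROM morphemes WHERE word = ?", ("ワイン",)).fetchone()[0]
print(freq)  # prints 2
```

The update-then-insert pattern keeps word, parse, dependency, and frequency as the separate database fields shown in Figure 2-a while counting duplicates across sentences.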



Figure 2. (a) Data analysis and storage in the database (b) An example of morphological and syntax analysis

3.2 System Features

As discussed in the previous section, we target two different types of users, i.e. respondents and contributors. To accommodate both, the system provides two parts: the idea collector part, designed for respondents, and the manga part, designed for contributors. The details of both parts are discussed in the next sub-sections.

3.2.1 Idea Collector Part

The first part of our proposed system is the idea collector part. The scenario proceeds in the following sequence: (1) the user opens the login page and enters their account name (see Figure 3-a); (2) a sketch illustrating a conceptual product is displayed at random to stimulate ideas, and the user is required to write their opinion in the idea entry form (see Figure 3-b); (3) the user is given the option to post the idea on Twitter, with a URL to the illustration and login page added automatically; (4) on the last page, a list of previously posted opinions is shown, and the user has the option to rate them (see Figure 3-c).

Figure 3. The Idea Collector Part: (a) Login page (b) Idea entry page (c) Rating Page & Manga Part

3.2.2 Manga Part

In the manga part, manga enthusiasts can publish their own original works while receiving feedback from other users. Manga contributors publish their manga in the following steps: (1) the contributor selects from a list of available themes and uploads their manga; in this step, the contributor is required to upload an
image file that can be displayed on an HTML page, such as JPEG or BMP (see Figure 4-a); (2) after uploading, contributors can view a grid of the manga, sorted by latest contributions or by rank (see Figure 4-b); (3) users can view each manga and rate it on the viewing page (see Figure 4-c).

Figure 4. The Manga Part (a) Upload pages (b) Grid view of mangas (c) Viewing page

4. USER EVALUATION AND FEEDBACK

We collected primary and secondary data for the evaluation of our system. The primary data were collected from a group of participants, students of an art and information department with sufficient knowledge of web and information design. For the secondary data, we held a presentation session with a team of professional product developers to review the feasibility and possible applications of our proposed system. After analyzing the feedback, we can summarize the positive and negative aspects of the system. The participants and developers described three positive aspects: (1) the idea is very interesting and can be extended further, e.g. by utilizing different media or forms of expression; in addition to manga, the system could also utilize animation or games after some minor modifications; (2) from the users' point of view, the idea of utilizing manga as the communication method for expressing ideas is interesting; users mentioned that they are able to submit their ideas in a casual manner while expressing their interests; (3) from the product/service designer's point of view, our proposed system was evaluated as a first step toward online services that facilitate user generated design. On the other hand, we identified two negative aspects: (1) additional features to distinctly separate the different types of users should be developed; for example, the system should provide another page displaying the list of user feedback on a contributed manga, so that manga contributors can reflect on their work and improve the quality of future contributions; (2) the prototype should be developed further for content other than manga.

5. CONCLUSION AND FUTURE WORKS

In this paper we have discussed a new method of utilizing casual data in consumer generated design and described how the system can be used as a collaborative design tool. Both product/service developers and consumers can benefit from our proposed system. However, the system is currently limited to analyzing Japanese sentences and is directed at Japanese users; in the future, we plan to develop it for international users. We have also described the versatility of our proposed system: we believe it can be applied in strategic marketing, in product development, or as a promotional tool that utilizes the voice of customers.



REFERENCES

Cuusoo Seikatsu, 2011. 'Nurturing Brand Communities, Constructing Service', Cuusoo Seikatsu: For Manufacturers (in Japanese), viewed 18 June 2011, <http://www.cuusoo.com/about/coindex.html>.
Fry, B., 2007. Visualizing Data: Exploring and Explaining Data with the Processing Environment. O'Reilly Media, California, USA.
Kolko, J., 2010. 'Abductive Thinking and Sensemaking: The Drivers of Design Synthesis', MIT Press Journals, vol. 26, no. 1, pp. 15-28.
Sayaka, A., Kato, N., Muraoka, Y. & Yamana, H., 2010. 'Cross-media Impact on Twitter in Japan', Proceedings of the International Workshop on Search and Mining User-generated Contents, Toronto, Canada, pp. 111-118.
Serota, L. & Rockwell, D., 2010. 'An Introduction to Casual Data, and How It's Changing Everything', Interactions, vol. 17, no. 2, pp. 43-47.
Twitter Developers, 2011. Twitter API Documentation, Twitter, viewed 18 June 2011.
Twitter Shindanmaker, 2011. About Twitter Shindanmaker (in Japanese), Twitter Shindanmaker, viewed 18 June 2011.
Yahoo! Japan Developer Network, 2011. Japanese Morpheme Analysis (in Japanese), Yahoo! Japan, viewed 18 June 2011.



INNOVATION MANAGEMENT IN ENTERPRISES: COLLABORATIVE TREND ANALYSIS USING WEB 2.0 TECHNOLOGIES

Iris Kaiser¹ and Prof. Dr. Michael Durst²
¹University of Erlangen, Lange Gasse 20, 90402 Nürnberg, Germany
²ITONICS GmbH Software & Consulting, Otto-Seeling-Promenade 2-4, 90762 Fürth, Germany

ABSTRACT

Through the early recognition of trends in the business environment and their specific processing within innovation management, companies can achieve long-term market success. A particular challenge is the systematic identification, gathering, structuring and evaluation of trends. Web 2.0 technologies, and especially wikis, which allow several people to maintain and use content simultaneously, are eminently suitable for an efficient process of continuous collection and analysis of relevant market trends. In this paper, trend management processes are introduced and it is demonstrated how trends can be collected, structured and communicated within the enterprise using a customized wiki. The trend assessment relies, among other things, on methods of crowdsourcing, resulting in an extensive evaluation basis. In addition, the presented approach includes a visualization of the trends and their assessment for decision support. A case study of the global polymer solutions supplier REHAU AG demonstrates the use of the methodology in practice.

KEYWORDS

Innovation management, Trend scouting, Collaboration, Web 2.0, Wiki, Radar

1. INTRODUCTION

In light of the increasing speed of technology and product innovation, companies can no longer merely react to changes; they have to act proactively in order to remain competitive. What is required is the ability to respond to weak signals and to proactively initiate changes and market adjustments (Bullinger and Schäfer 1997). The importance of effective innovation management is not disputed among experts. Innovations permanently secure market position and ensure a competitive advantage (Goll 2007). Innovation management thus makes a lasting contribution to promoting and safeguarding the viability of a company. Global competition, short product life cycles, increased technical risk and growing customer demands lead to new challenges in management. To meet these challenges, global enterprises need to internationalize their innovation management and thus their innovation processes. The aim is to achieve cross-border synergies by optimally using existing resources, which can be realized by combining existing knowledge across the enterprise (Gassmann and von Zedwitz 1996; Marquardt et al. 1996). Prerequisites for successful innovation are logical thinking and rigorous planning, implementation and monitoring of all activities associated with innovation (Burr and Stephan 2005). This systematic development and implementation of innovation in an enterprise is the core of innovation management. The innovation process involves the generation of ideas, the selection and acceptance of ideas and finally their implementation. Trend management belongs to the early stage of innovation management and is assigned to the generation of ideas in the innovation process (Herstatt and Verworn 2007).
At present, it is an increasingly important component of managing innovation, since innovations have often come onto the market very quickly and unnoticed, and companies have suffered losses through inadequate monitoring of these trends. The goal of trend management is the generation of new ideas and innovations that can become a competitive advantage in the product portfolio. Thus, in trend research, the exploration and identification of current change processes is of high relevance (Pfadenhauer 2004).



Specific tasks of trend management in enterprises are the identification, the structured gathering and finally the assessment of relevant trends. Based on this, appropriate measures for implementation or further monitoring are developed in the following phases of the innovation process. In addition to a lack of efficient methods for trend detection, a systematic and holistic approach to innovation management is often missing in companies. As part of the innovation process, trend management cannot be considered in isolation from it. The importance of a formal process is shown in various studies: according to the PDMA best practices study of 1997, companies that follow a structured innovation process are more successful (Griffin 1997). Methods of crowdsourcing can be applied in trend management to identify, evaluate and document trends. A common, cross-departmental and cross-site evaluation and assessment of trends allows quick prioritization due to the "wisdom of crowds". Web 2.0 provides software solutions that are suitable for establishing collaborative trend management. Thus, trends can be collected and structured together in a modified wiki. An evaluation component allows the common evaluation of trends, and the comment function enables the development of business ideas along trends. A complex evaluation matrix opens the possibility of a detailed assessment of trends by experts. In the following sections, a tool for the efficient support of each step in the trend management process is presented and its practical use is explained with an example.

2. TREND MANAGEMENT PROCESS

The trend management process is divided into five steps that can be supported using collaborative methods. Key elements are trend identification, assessment, analysis, reporting and finally trend monitoring (Fink and Siebe 2006). The following sections describe in more detail how these steps are supported by the system.

Figure 1. Process model of trend management (Fink and Siebe 2006)

Trend identification is the basis for the further steps of trend management. Here, an attempt is made to proactively look for trends and track their development over time, in order to take action as appropriate. The embodiment of the identification phase is, however, highly dependent on the industry and the objective pursued (Fink and Siebe 2006). For a long time, trend research in the company was solely the responsibility of management, but today whole innovation departments are dedicated to researching future issues. With the spread of the open innovation approach, employees, customers and external experts are increasingly involved in the trend identification process. Through crowdsourcing, internal and external resources are drawn in; thus, the impetus for a new product or procedure may reach the company from outside, e.g. from customers, suppliers or competitors (Trommsdorff 2001). The resulting "intelligence of the masses" leads to better quality results, because groups, owing to the diversity and independence of their members and to decentralized decision-making structures, perform better than individual experts (Kortzfleisch et al. 2008). This advantage can be used within trend management in the collaborative collection and evaluation of trends. After the identification of trends, their assessment based on several criteria follows. Collaborative trend management makes it possible for all employees and, where appropriate, external experts to evaluate a trend in an evaluation form, e.g. with the "thumbs up / thumbs down" method. Based on this, an algorithm calculates an average that determines the position of the trend in the trend view, indicating whether further processing should be more or less intensive. This very general assessment of the relevance of individual trends for a company creates a first ranking and allows a focus on the most promising trends.
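The "thumbs up / thumbs down" averaging just described can be sketched in a few lines of Python; the scoring function and the example trend names are our own illustration, not taken from the system itself:

```python
# Each thumbs up counts +1 and each thumbs down -1; the mean determines the
# trend's position in the trend view and hence a first relevance ranking.
def crowd_score(votes):
    """votes: list of +1 (thumbs up) / -1 (thumbs down) entries."""
    return sum(votes) / len(votes) if votes else 0.0

# Invented example data: votes collected from employees, keyed by trend name.
trends = {
    "bio-based polymers": [1, 1, 1, -1],    # score 0.5
    "smart packaging":    [1, -1, -1, -1],  # score -0.5
}
ranking = sorted(trends, key=lambda name: crowd_score(trends[name]),
                 reverse=True)
print(ranking)  # prints ['bio-based polymers', 'smart packaging']
```

The resulting ranking is what allows the company to focus subsequent expert analysis on the most promising trends.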
Building on the collaborative trend assessment, a detailed analysis of individual trends is carried out by a panel of experts. The assessment is based on leading indicators, relevant past values and the experts' own opinion (Dömer and Junker 2009). Criteria used for evaluation are cost-effectiveness for provider and customer, sustainability, potential for further development and the application range of the trend. Each aspect is assessed on a scale from 1 (very low) to 5 (very high). An algorithm calculates the average values and places them in a Kiviat diagram. The results are then discussed in a panel of experts. From the results of the trend analysis,
next steps are derived. Relevant trends are further monitored and their implementation pushed ahead within innovation management. For trend reporting, an expressive visualization is necessary. One way to structure trends and make their relevance transparent is the introduction of a visualization component in the form of a radar. Similar to a radar screen in air traffic control, objects near the center appear more relevant to the human eye than objects further from the center. The presented radar can clearly depict four dimensions: distance to the center, placement in a segment, and the size and color coding of an object.

Figure 2. Dimensions of the trend radar

In the present example, the distance category shows the expected entry period of a trend; the range of the time scale must be defined according to the respective research area. The categories of the radar are shown in the segments and correspond to the categories of strategic research areas in innovation management. The size of an object shows the potential of an individual trend, and its color reflects the fit of the trend for the specific company. The results of collaborative trend management are clearly shown on the trend radar, which thus serves as a tool for management decision support. The final phase of the trend process, monitoring, is used for the continuous and intensive observation of identified trends and their indicators, as well as the establishment of warning areas by internal and external observers (Fink and Siebe 2006). In an indicator cockpit, originally distributed information is displayed in condensed form as key figures. Based on the findings of trend monitoring, better operational and strategic decisions can be made in the company. Trend monitoring therefore represents a crucial part of "business intelligence" and enables the verification of trend management results by numbers.
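Mapping one trend onto the four radar dimensions described above can be sketched as follows. This is a hedged Python illustration: the segment names, the ten-year horizon and the colour thresholds are assumptions for the sake of the example, not values from the system or the REHAU case study.

```python
import math

SEGMENTS = ["materials", "processes", "markets", "customers"]  # assumed categories
MAX_HORIZON_YEARS = 10  # assumed outer edge of the radar's time scale

def radar_position(trend):
    # Placement in a segment: angle at the centre of that segment's slice.
    seg = SEGMENTS.index(trend["segment"])
    angle = (seg + 0.5) * 2 * math.pi / len(SEGMENTS)
    # Distance to centre: nearer-term trends sit closer to the middle.
    radius = min(trend["horizon_years"] / MAX_HORIZON_YEARS, 1.0)
    # Size of the object: the trend's 1-5 potential rating.
    size = trend["potential"]
    # Colour coding: the fit of the trend for the company (assumed thresholds).
    fit = trend["fit"]
    colour = "green" if fit >= 4 else ("yellow" if fit >= 2 else "red")
    return angle, radius, size, colour

angle, radius, size, colour = radar_position(
    {"segment": "markets", "horizon_years": 3, "potential": 4, "fit": 5})
print(radius, size, colour)  # prints 0.3 4 green
```

Each trend thus becomes a tuple that a front-end charting component could draw directly as a bubble on the radar screen.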

3. SYSTEM

The structured analysis of trends in the market environment is best supported by the use of appropriate IT systems. As trend analysis is a continuous collection and processing of information, and a great advantage results from several people being able to participate actively, system operation must be user-friendly and intuitive, so that no re-learning of the functionality is necessary (Müller and Dibbern 2006). Furthermore, simultaneous and location-independent working is useful (Kuppinger and Woywode 2000). Web 2.0 provides suitable software solutions for establishing collaborative trend management. Wikis enable a flexible form of information documentation and pose almost no technical barriers when creating or editing content (Komus and Wauch 2008; Szugat et al 2007). This simple participation can also promote the contribution of employees to the internal innovation management (Müller and Dibbern 2006). The article history makes visible who was involved in trend detection at what time; this creates transparency and provides opportunities for incentive concepts. Another advantage is that changes to articles are made directly in the operating system and are therefore available immediately. For documentation and backup, the wiki supports versioning and logging, so that overwritten content remains continuously available and can be restored. The essential feature of a wiki is the possibility of linking individual articles. Because of the links, a user can navigate the wiki quickly, and relationships are easy to identify and recognize (Mertins and Seidel 2009). In the trend management, an article in the wiki describes a trend. The wiki is extended to include attributes such as a search field and tagging. An evaluation function and a rich-text editor facilitate participation in trend gathering and assessment.
The reporting and monitoring component prepares the quantitative evaluation criteria for the visualization on the radar. A rights and role concept supports the distinction of the roles "employee", "expert" and "moderator" in the system.


ISBN: 978-972-8939-40-3 © 2011 IADIS

The presented solution for trend analysis is based on the open-source software MediaWiki. The radar is implemented in Flex. The data exchange between the trend wiki and the radar is handled via XML.
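The paper does not publish the exchange schema, but a minimal sketch of how trend records might be serialized for the radar component could look as follows. The element and field names here are our own assumptions for illustration, not the actual format used by the system:

```python
import xml.etree.ElementTree as ET

def trends_to_xml(trends):
    """Serialize trend records into an XML document for the radar component."""
    root = ET.Element("trends")
    for t in trends:
        node = ET.SubElement(root, "trend", id=str(t["id"]))
        ET.SubElement(node, "name").text = t["name"]
        ET.SubElement(node, "segment").text = t["segment"]            # radar segment
        ET.SubElement(node, "horizonYears").text = str(t["horizon"])  # distance to center
        ET.SubElement(node, "potential").text = str(t["potential"])   # object size
        ET.SubElement(node, "fit").text = t["fit"]                    # object color
    return ET.tostring(root, encoding="unicode")

xml_doc = trends_to_xml([
    {"id": 1, "name": "Smart metering", "segment": "Technology",
     "horizon": 5, "potential": 3.8, "fit": "complement"},
])
```

Any XML-capable client, such as the Flex radar, can then map each `trend` element onto the four visual dimensions described above.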

Figure 3. Architecture of the collaborative trend management

4. CASE STUDY: REHAU AG

REHAU AG is an international industrial company headquartered in Germany. It is a manufacturer of product and system solutions in the field of energy-efficient construction. The market for energy-efficient construction is currently growing fast and is very dynamic, with a very diverse company and product structure. The liberalization of the energy market, persistently high commodity prices, growing political influence, and increased environmental awareness among the population, which leads to a growing demand for sustainable solutions, entail a tremendous pace of innovation. Companies that recognize and take up the right trends can thereby achieve significant competitive advantages. Using the trend radar, various analyses can be carried out which provide the management with accurate information on the trend situation that can be used for decisions on innovation and marketing strategy. Beside the general indicators for the trend portfolio, the following figures can be shown: the number of gathered trends and the trend density in the different segments, the number of trends that users have evaluated as having high potential, and trends that strongly compete with or complement the company's own product portfolio. Apart from the quantitative analysis, additional qualitative information about individual trends can be retrieved at any time. If the user recognizes in the radar that a trend is particularly interesting, he can directly look up its detailed description and evaluation. With its reporting options, the trend wiki delivers findings that allow better operational and strategic decisions in terms of the innovation strategy, and it is thus part of business intelligence. In addition to content analysis, information about the usage of the solution within the company can be analyzed, for instance how many employees work on and use the system, or how up to date the content is.
REHAU AG uses the radar to analyze trends and visualize the trend situation in the market environment. The categories of the radar are shown in the segments. In the present case, a distinction is made between technology, product, service, standards and legislation, and social trends. The category of social trends, for example, deals with issues of sustainability and environmental protection, which play an important role in the market for energy-efficient construction, particularly on the client side. The distance to the center reflects the time horizon of the expected establishment of the trend in the market, that is, the point at which a trend is no longer a trend. Specifically, this means that the farther a point is positioned from the center of the radar, the farther in the future the realization of this trend lies. A maximum time horizon of 20 years is used, as a longer view into the future is not accurate. In the system, the time horizon is adjusted using a slider. The size of an object symbolizes the potential of a trend: a larger point means, in relation to the other points, a greater potential. The point value is calculated as the average of different individual values, which are considered separately. Each factor is evaluated on a scale from one (does not apply at all) to five (applies totally). The factors considered are economic efficiency for the provider, economic efficiency for the customer, sustainability, development opportunities, and breadth of application. Economic efficiency for the provider examines how monetarily profitable it is for the provider to realize the trend. This might be the case for a product when its production can be amortized within a short time or when it complements the previously offered products well and thus opens cross-selling opportunities. In contrast, economic efficiency for the customer means how useful the purchase of a new product or technology, the use of a service, or the alignment of behavior according to economic and environmental considerations can be. The third factor reviewed in the area of energy-efficient construction is the sustainability of a trend. Today, in new construction and in the renovation of existing buildings, customers attach value to installed systems that also bring long-term economic and environmental benefits, so attention is paid to sustainability in the selection of building components. Additionally, the development opportunities of a trend play a role in the potential assessment. If a technology or product is already mature, there may be few or no opportunities for further development. For the potential, it is important that further opportunities exist, as it can then be especially worthwhile to invest in a trend. The last factor is the breadth of application of a trend, showing how a trend can also be used in areas other than originally planned, which opens additional sales potential.

The color of an object displays the fit of the trend for the company on the radar. It examines the relation of the trend to the existing portfolio of products and services; possible values are substitute, neutral, and complement. Trends that are not relevant for the company or that are already covered are summarized under "neutral" because they do not affect the portfolio. After the creation of the posts and the classification and evaluation of the individual trends according to the categories described above, the user can see the results of the collaborative trend analysis in the system. The user enters the system via a start page, where he finds an alphabetical list of all trends and can directly access the related articles. Furthermore, the start page includes a keyword cloud that combines all terms selected for retrieval. With a click on any word, the user gets an overview of all trends indexed by this term.
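The potential and fit encoding described above can be sketched in a few lines of code. The factor and category names below are our own paraphrase of the paper's criteria, not identifiers from the actual REHAU system:

```python
# Five potential factors, each rated on a 1-5 scale by the users.
FACTORS = ["provider_efficiency", "customer_efficiency", "sustainability",
           "development_opportunities", "application_breadth"]

# Portfolio fit drives the color of the radar object (colors assumed here).
FIT_COLORS = {"substitute": "red", "neutral": "grey", "complement": "green"}

def radar_point(ratings, fit):
    """Map a trend's factor ratings and portfolio fit to radar attributes."""
    for name in FACTORS:
        if not 1 <= ratings[name] <= 5:
            raise ValueError(f"{name} must be rated on a 1-5 scale")
    # The potential (object size) is the average of the individual factor values.
    potential = sum(ratings[f] for f in FACTORS) / len(FACTORS)
    return {"potential": potential, "color": FIT_COLORS[fit]}

point = radar_point({"provider_efficiency": 4, "customer_efficiency": 5,
                     "sustainability": 3, "development_opportunities": 4,
                     "application_breadth": 2}, fit="complement")
# potential = (4 + 5 + 3 + 4 + 2) / 5 = 3.6
```

In the real system these ratings would come from many users and be averaged again across raters, which is what reduces the subjectivity of an individual assessment.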
A link takes the user to the radar.

Figure 4. View of the trend radar

In this view, trends can be filtered by various attributes and attribute combinations. Depending on the question at hand, for example, only the trends in a particular segment or those with high potential can be displayed. By clicking on a particular trend in the radar, the user reaches the related wiki article and the corresponding stored information. If authorized, he can also edit the article. The visualization of the trends in the radar is used within individual departments as well as for reporting to top management. The trend radar is thus an efficient tool for management decision support.

5. CONCLUSION

Almost every company performs some form of trend analysis, but it is often unstructured and poorly supported by IT. This leads to significant research effort when concrete information needs arise, and to constant information loss due to staff changes. With the presented solution of a trend wiki, information can easily be gathered internationally and collaboratively. Through crowdsourcing and the associated phenomenon of collective intelligence (Ebersbach et al, 2008), a large amount of information can be collected and the subjectivity of the trend assessment can be reduced, which enables high information quality. With the help of the radar component, every user can get a very fast overview of the trend situation in the market environment. A high level of user acceptance is the basis for the successful use of the system in the enterprise. The presented solution uses established tools (a wiki), extended by simple assessment components, an intuitive input interface, and a clear overall presentation (the trend radar). The hurdle for using the trend platform is therefore low, even for inexperienced computer users. The application outlined here is based on the market for energy-efficient construction; the procedure and the system concept, however, are largely generic and thus transferable to other markets.

ACKNOWLEDGEMENT

The trend wiki with its visualization component was implemented in cooperation with ITONICS GmbH. The subject matter and content were developed in workshops with REHAU AG. We thank the employees of both companies who accompanied the research project and enriched it with valuable suggestions.

REFERENCES

Bullinger, H.-J. and Schäfer, M., 1997. Entwicklungstrends und Herausforderungen im Informationszeitalter. In Bullinger, H.-J.; Broßmann, M.: Business Television. Beginn einer neuen Informationskultur in den Unternehmen. Schäfer-Poeschel, Stuttgart.
Burr, W. and Stephan, M., 2006. Dienstleistungsmanagement: Innovative Wertschöpfungskonzepte für Dienstleistungsunternehmen. Kohlhammer, Stuttgart.
Dömer, F. and Junker, J., 2009. Trends im Informationsmanagement. In IM – Information Management und Consulting, Vol. 24, No. 4, pp. 6-13.
Ebersbach, A. et al, 2008. Social Web. UVK Verlagsgesellschaft mbH, Konstanz.
Fink, A. and Siebe, A., 2006. Handbuch Zukunftsmanagement – Werkzeuge der strategischen Planung und Früherkennung. Campus Verlag, Frankfurt/Main.
Gassmann, O. and von Zedtwitz, M., 1996. Internationales Innovationsmanagement. Vahlen, München.
Goll, H.-W., 2007. Innovationsmanagement. http://www.goll.de/fp/archiv/Innovations-Management/index.php.
Griffin, A., 1997. PDMA Research on New Product Development Practices: Updating Trends and Benchmarking Best Practices. In Journal of Product Innovation Management, Vol. 14, pp. 429-458.
Herstatt, C. and Verworn, B., 2007. Management der frühen Innovationsphasen: Grundlagen – Methoden – Neue Ansätze. Gabler Verlag, Wiesbaden.
Komus, A. and Wauch, F., 2008. Wikimanagement – Was Unternehmen von Social Software und Web 2.0 lernen können. Oldenbourg Verlag, München.
Kortzfleisch, H. et al, 2008. Corporate Web 2.0 Applications. In Hass, B. et al (eds.): Web 2.0. Neue Perspektiven für Marketing und Medien. Springer-Verlag, Berlin, pp. 73-87.
Kuppinger, M. and Woywode, M., 2000. Vom Intranet zum Knowledge Management – Die Veränderung der Informationskultur in Organisationen. Hanser, München/Wien.
Marquardt, G. et al, 1996. Gestaltung internationaler Innovationsprozesse auf Basis von Kernkompetenzen. In Gassmann, O.; von Zedtwitz, M. (eds.): Internationales Innovationsmanagement. Vahlen, München, pp. 175-185.
Mertins, K. and Seidel, H., 2009. Wissensmanagement im Mittelstand. 1st edition, Springer-Verlag, Berlin/Heidelberg.
Müller, C. and Dibbern, P., 2006. Selbstorganisiertes Wissensmanagement in Unternehmen auf Basis der Wiki-Technologie – ein Anwendungsfall. In Hildebrand, K.; Hofmann, J. (eds.): Social Software. HMD – Praxis der Wirtschaftsinformatik, Heft 252. dpunkt.verlag, Heidelberg, pp. 45-54.
Pfadenhauer, M., 2010. Wie forschen Trendforscher? Zur Wissensproduktion in einer umstrittenen Branche. In Forum: Qualitative Social Research, www.qualitative-research.net/fqs-texte/2-04/2-04pfadenhauer-d.htm.
Szugat, M. et al, 2007. Social Software. schnell + kompakt. entwickler.press, Frankfurt/Main.
Trommsdorff, V., 2001. Innovationsmanagement. In Diller, H. (ed.): Vahlens großes Marketinglexikon. Vahlen, München, pp. 661-664.


METADATA FOR A REUSABLE BUSINESS VOCABULARY ELEMENT N. Ghatasheh, D. Storelli and A. Corallo e-Business Management Section of S. S. ISUFI University of Salento Lecce, Italy

ABSTRACT

The aim of this paper is to present the structure and the metadata needed to construct a business vocabulary element. The business vocabulary element can be considered the building block of a Domain Specific Language (DSL) dictionary. To create the desired reusable vocabulary element, a set of metadata is used to ensure collective synthesis, flexible categorization, and integration with other systems such as business-rules-based ones. One of the most important parts is finding those who can define a vocabulary element, by means of frequency density histogram processing. Creating a vocabulary element with basic metadata, together with hints about who can define it better than others, is a first step toward a vocabulary element that is well defined, clear, reusable, and maintainable.

KEYWORDS

Business Vocabulary Element, Metadata, SBVR Specification, Collective Synthesis, Domain Specific Language.

1. INTRODUCTION

The world today requires high-speed means of communication in which knowledge plays a great role. Clear communication requires a good understanding of the communication itself. In the world of business we face communities of practice, groups and teams that work together, internetworked enterprises, knowledge sharing, many geographically distributed suppliers, and so on. A common business vocabulary may solve communication problems and make communication faster and more efficient. The community itself can build these vocabularies and agree upon them. If the conceptualizations are done in a collaborative way, over time there will be a great asset that supports many users, and the relations between the concepts will also be mapped and agreed upon. The development of a common business vocabulary requires teamwork, mainly by the community that performs the related tasks, in order to raise the level of common understanding and minimize the mistakes resulting from misunderstanding. The community may include a level of diversity in terms of communities of practice that require cross-team communication. A simple example is the communication between the management and the development team within an organization; the latter has to perform the tasks specified by the management properly and perhaps urgently. A high level of understanding between the management and the development team is required to reduce time delays and re-work. If we suppose that they have agreed on a common vocabulary, they will communicate faster and more easily without the need for extra explanations, and the tasks will be clear. Chen (1994) pointed to the idea that individuals may use different ways to describe the same concept, and showed that this problem becomes apparent when online collaboration is needed. Based on his study, he suggested an algorithmic method to solve the problem.
Fitzpatrick et al (2006), Bao et al (2006), and others have tried to propose ways of constructing common vocabularies or concepts. Their attempts sought to standardize the communication between individuals at roughly the same level, for example developers, who might be able to find a common language. Others went to the business level and looked for ways to build a business vocabulary for a specific domain, as Busanelli et al (2006) did for the textile industry. They sought an approach for automatic processing of an e-business vocabulary and proposed a document-based framework within a collaborative environment. Their proposition aims to assist communication between business-to-business applications. Against this background and similar research, we can see that the studies focused on individuals at the same level or on applications, and they point to the need for a common vocabulary. Different disciplines such as linguistics, philosophy, modeling, ontologies and mathematics are used to present a standard for the Semantics of Business Vocabulary and Business Rules (SBVR) specification. Relying on this standard, we may be able to represent a business process in a controlled natural language. This language consists of classes or names called noun concepts and relations among them called fact types. Fact types may be used to construct business rules that express modalities such as necessity and possibility. A simple example is "each customer has at least one order"; customer here is a noun concept, and once defined it becomes part of the business vocabulary. Not only noun concepts are considered business vocabulary, but also the fact types (OMG 2008; Linehan 2008).

2. DEFINING A BUSINESS VOCABULARY ELEMENT

The vocabulary element, as we present it here, is, within some limits, a multi-view element that represents a concept or a simple word. For example, an element called "customer" will have an English dictionary definition and characteristics, and it will also be defined according to its use within the domain. One use of such an element is its integration with a business-rules-based system. As a result of our inspections and tests, the vocabulary element is defined as follows: it is constructed from a combination of the Semantics of Business Vocabulary and Business Rules (SBVR) definition, the linguistic part of speech, general word metadata, and a set of clustering variables.

[Diagram: the vocabulary element combines version trace data (creation date, finalization date, etc.), linguistic metadata (part of speech: noun, verb, etc.; gender; plural/singular), clustering variables (group, team, organization, frequency, etc.), and SBVR attributes (term, name, or fact type; definition, source, concept type, example, etc.).]

Figure 1. Vocabulary Element Structure

As vocabulary elements are regular words that could be in any language, they have the attributes of a word. Here we consider the language to be English, so the properties in this section are the part of speech (noun, verb, adjective, etc.), gender, pronunciation, and so on. Taking only this part of the structure results in a comprehensive definition of the vocabulary element as an English word. This is the starting point for going further in the process of defining the vocabulary as a domain-specific one. Relying on the SBVR specification of the Object Management Group (OMG) released in 2008, a set of attributes defines the vocabulary. This part of the structure makes the vocabulary element compatible with any SBVR-based platform, which supports future use and reusability. In addition to platform compatibility, there are the four disciplines on which SBVR is based: natural language, linguistics, fact-oriented modeling, and business consultancy. If all objects are created according to the SBVR specification, a standardization is clearly present, so vocabularies created by others can be used as a starting point for the definition process.


In addition to the previous sections, there is a set of version tracing data that includes the creation date of the object, the current version, the finalization date, who worked on it, etc. An important part is also dedicated to the clustering process. Clustering here means trying, as far as possible, to find the most suitable groups of people, the domain users, who are capable of defining the vocabulary; we can think of this as their level of understanding of the vocabulary to be defined. For example, a specific team within an organization could be the best group of people to define a concept regularly used among them. The attributes added to this section of the structure include the relevance of the term for the teams or groups within an organization, the frequency density according to usage, and relevance values entered manually by an expert.
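As a sketch, the element structure with its four metadata sections described above might be modeled as the following record type. The field names are our own illustration, not identifiers taken verbatim from the SBVR specification:

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class VocabularyElement:
    term: str
    # Linguistic metadata
    part_of_speech: str = "noun"        # noun, verb, adjective, ...
    gender: Optional[str] = None
    plural: bool = False
    # SBVR attributes
    concept_type: str = "noun concept"  # noun concept or fact type
    definition: str = ""
    source: str = ""
    example: str = ""
    # Version trace
    created: Optional[date] = None
    finalized: Optional[date] = None
    version: int = 1
    # Clustering variables: group/team name -> frequency density or relevance
    relevance_by_group: dict = field(default_factory=dict)

elem = VocabularyElement(term="customer",
                         definition="a party that places at least one order",
                         created=date(2011, 7, 22))
```

A structure like this keeps the linguistic, SBVR, versioning, and clustering views of one word together, which is what makes the element portable between platforms.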

3. PROCESSING OF VOCABULARY ELEMENTS

Before starting the clustering process, a set of important concepts, the seed concepts, has to be found. These seed concepts are the most important vocabulary to be defined, selected according to their frequency within the different means of communication of the teams or the organization. Here a set of tools is used to tag the vocabulary, determine its part of speech, calculate the frequency, and give indicators of importance. For clustering purposes, we defined a process that relates, as far as possible, each vocabulary element to the closest group of experts that can define it properly. The clustering process also shows the level of usage of the vocabulary element within a small set of users, across an organization, or among several organizations if applicable. The frequency density is defined as

    FD(X, G) = f(X, G) / N(G)

where X is the vocabulary element within the group G, f(X, G) is the frequency of X in G, and N(G) is the total number of words analyzed within the single group; that is, the frequency density is found by dividing the frequency of the vocabulary element X by the total number of words analyzed within a single group. By group we mean a division within an organization, a team, or in general any physical or virtual set of actors communicating with each other to perform a common task. An example of a group could be the software development team within a generic organization.
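The frequency density computation above can be illustrated in a few lines of Python. This is a minimal sketch with made-up token lists; the actual tagging and counting tools are not specified in the paper:

```python
from collections import Counter

def frequency_density(term, group_tokens):
    """FD(X, G): frequency of term X divided by the total number of
    words analyzed within a single group G."""
    counts = Counter(t.lower() for t in group_tokens)
    total = len(group_tokens)
    return counts[term.lower()] / total if total else 0.0

def histogram(term, groups):
    """Frequency density of one vocabulary element across several groups,
    as plotted per team and group in Figure 2."""
    return {name: frequency_density(term, tokens)
            for name, tokens in groups.items()}

hist = histogram("customer", {
    "Team A": "the customer sent the customer order".split(),
    "Team B": "build the release".split(),
})
# Team A: 2 occurrences out of 6 tokens; Team B: 0 out of 3
```

The group with the highest density for a term is a natural candidate for defining that term, before an expert manually reviews the assignment.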

An example histogram of the frequency density (F.D.) of a vocabulary element X per team and group:

              Group 1   Group 2   Group 3   Group n
    Team A    3.3       2.5       3.5       2.0
    Team B    2.4       1.4       4.8       1.5
    Team C    1.0       1.0       3.0       0.4

Figure 2. An Example of a Histogram for n Groups

Finally, an expert is required to manually check the vocabulary to be defined, modify the relevance values if they seem incorrect, fill in the missing or undefined clustering values, and finalize the basic vocabulary element before sending it to the collaborative and collective definition process.

4. CONCLUSION

In this paper we proposed a structure for vocabulary elements. The structure is based on the SBVR specification and various other metadata. The goal is to create a base of vocabulary elements that are standardized, reusable, well defined, and ready to be defined within a collaborative definition platform, and ultimately to obtain comprehensive and well-defined vocabularies. The process for defining a vocabulary is as follows: find the most important vocabulary; fill in the basic and essential metadata, such as part of speech and gender; relate the vocabulary to a cluster (category) in a semi-automatic way; and finally export the pre-processed vocabulary element. The pre-processed vocabulary is then ready for the next process, a collective effort by those who are most capable of defining it, which can be done using a collaborative platform. The structure of the pre-processed vocabulary elements facilitates integration with the collaborative platform, versioning, and the definition processes.

REFERENCES

Bao, J., Caragea, D. and Honavar, V., 2006. Towards Collaborative Environments for Ontology Construction and Sharing. CTS '06 Proceedings of the International Symposium on Collaborative Technologies and Systems. IEEE Computer Society, Washington, DC, USA.
Bao, J., Hu, Z., Caragea, D., Reecy, J. and Honavar, V., 2006. A Tool for Collaborative Construction of Large Biological Ontologies. DEXA '06 Proceedings of the 17th International Conference on Database and Expert Systems Applications. IEEE Computer Society, Washington, DC, USA.
Busanelli, M., Gessa, N., Sabbata, P. and Vitali, F., 2006. Extracting a Semantic View From an E-business Vocabulary. The 8th IEEE International Conference on E-Commerce Technology and The 3rd IEEE International Conference on Enterprise Computing, E-Commerce, and E-Services (CEC/EEE'06), p. 57.
Chen, H., 1994. Collaborative Systems: Solving the Vocabulary Problem. Conference Paper, Computational Intelligence for Financial Engineering. New York City, USA.
Chen, H. and Lynch, K.J., 1992. Automatic construction of networks of concepts characterizing document databases. IEEE Transactions on Systems, Man and Cybernetics, Vol. 22, No. 5, pp. 885-902.
Dolby, J., Fokoue, A., Kalyanpur, A., Schonberg, E. and Srinivas, K., 2009. Extracting Enterprise Vocabularies Using Linked Open Data. 8th International Semantic Web Conference, ISWC 2009, Chantilly, VA, USA.
Fitzpatrick, G., Marshall, P. and Phillips, A., 2006. CVS Integration with Notification and Chat: Lightweight Software Team Collaboration. Computer Supported Cooperative Work CSCW'06, November 4-8, 2006, Banff, Alberta, Canada. ACM.
Fox, S., Chionglo, F. and Fadel, G., 1993. A Common-sense Model of the Enterprise. Proceedings of the Industrial Engineering Research Conference.
Kozakov, L., Park, Y., Fin, T., Drissi, Y., Doganata, Y. and Cofino, T., 2004. Glossary extraction and utilization in the information search and delivery system for IBM Technical Support. IBM Systems Journal, Vol. 43, No. 3.
Lambrix, P. and Edberg, A., 2003. Evaluation of ontology merging tools in bioinformatics. Pacific Symposium on Biocomputing, pp. 589-600.
Linehan, M., 2007. Ontologies and Rules in Business Models. Eleventh International IEEE EDOC Conference Workshop (EDOCW'07).
Linehan, M., 2008. SBVR Use Cases. In Bassiliades, N., Governatori, G. and Paschke, A. (eds.): RuleML 2008, LNCS 5321, pp. 182-196. Springer-Verlag, Berlin Heidelberg.
OMG, 2008. Semantics of Business Vocabulary and Business Rules. Object Management Group.
OMG. Released Versions of SBVR. Retrieved from http://www.omg.org/spec/SBVR/
Zack, M.H., 2003. Rethinking the knowledge-based organization. Sloan Management Review, Vol. 44, pp. 67-71.
Zilli, A., Damiani, E., Ceravolo, P., Corallo, A. and Elia, G., 2008. Semantic Knowledge Management: An Ontology-Based Framework. Information Science Reference, Hershey, PA.


EXPLORING THE RELATIONSHIP BETWEEN IMPRESSION MANAGEMENT AND INTERPERSONAL ATTRACTION IN SOCIAL NETWORKING SITE

Hueiju Yu¹, Pei-Shan Wei², Hsi-Peng Lu² and Jen-Chuen Tzou¹

¹School of Continuing Education, Chinese Culture University, No.231, Sec. 2, Jianguo S. Rd., Da-an District, Taipei City 106, Taiwan (R.O.C.)
²Department of Information Management, National Taiwan University of Science and Technology, No.43, Sec. 4, Keelung Road, Da-an District, Taipei City 106, Taiwan (R.O.C.)

ABSTRACT

As numerous social networking sites (SNSs) have spread widely among internet users over recent years, they have drawn close attention from enterprises as well as researchers. In order to be desirable in social networking, people tend to create favorable online impressions with biographic data, pictures, interests and thoughts, online landscapes, and avatars. This study presents an experiment investigating how SNS users perform impression management through their online profiles. The results reveal that avatars that have elaborate outfits, live in well-decorated apartments, and are described as extraverted receive higher interpersonal attraction evaluations than introverted and plain ones. Several insights are also gained from open-ended questionnaires on how SNS users perceive e-images through avatars. Implications for future research and applications are also discussed.

KEYWORDS

Social networking sites, impression management, interpersonal attraction, avatar.

1. INTRODUCTION

Goffman (1959) revealed that an individual interacting with others in a public environment will strategically manage other people's impressions of himself or herself. This phenomenon has been researched and defined as "impression management". Impression management is a valuable strategy in social life to enhance individual confidence and self-satisfaction and, at the same time, to improve interpersonal relationships. Today, with the growing ubiquity of internet technology, the development of computer-mediated communication (CMC) has led to rapid changes in building and maintaining interpersonal relationships. Social networking sites (SNSs) have drawn a lot of attention from internet users in recent years. Individuals who use an SNS can not only maintain pre-existing social connections (Ellison et al., 2007), but also find others with similar interests for romantic or social purposes (McKenna, Green et al., 2002). People who use SNSs desire strong interpersonal attraction to assist in building ideal interpersonal relationships. Scholars have claimed that ''online communicators may exploit the capabilities of text-based, non-visual interaction to form levels of affinity that would be unexpected in parallel offline interactions'' (Walther et al., 2001). Yet so far there are relatively few studies characterizing how to make a desirable e-image that captivates others in SNSs.

2. RESEARCH METHODOLOGY

In SNSs, each user can be quite selective in their online impression management through the profile or avatar they construct. This study argues that, just as in real societies, people who look good and are described as extraverted are more likely to foster higher interaction willingness among others and further gain more friendship. The proposed research model for this study is presented in Figure 1.

 

Figure 1. Research model: e-impression design (physical attractiveness; extravert/introvert) influencing interpersonal attraction

We present an experiment in i-Partment, a popular SNS founded in Taiwan with approximately 1.8 million users in Taiwan and 20 million in China. In i-Partment, individuals can create a unique avatar, decorate their own apartment, and construct profiles representing themselves during a variety of online interactions. In order to identify commonly acknowledged physical attractiveness, we designed twenty avatars in i-Partment classified into four types by gender (male and female) and appearance (gorgeous and plain). A pre-test with 30 graduate students was conducted to select the most gorgeous or most plain avatar for each type. The students were also asked to select, from the lists provided by i-Partment, the characters and interests they thought best represented the concepts of "extravert" and "introvert". As a result of this pre-test, eight impression management roles were created as the stimuli for the further web-based survey (Figure 2, Figure 3).

Figure 2. Female avatars: (A) extravert with gorgeous decoration, (B) extravert with plain decoration, (C) introvert with gorgeous decoration, (D) introvert with plain decoration

Figure 3. Male avatars: (E) extravert with gorgeous decoration, (F) extravert with plain decoration, (G) introvert with gorgeous decoration, (H) introvert with plain decoration


IADIS International Conferences Web Based Communities 2011, Collaborative Technologies 2011 and Internet Applications and Research 2011

The questionnaire was developed based on theories of interpersonal attraction (McCroskey & McCain, 1974; Berscheid & Walster, 1974) and comprises social attraction and physical attraction. In addition, open-ended questions were used to understand the reasons for preferring gorgeous or plain avatars of either sex. A web-based survey was then conducted: each participant was asked to view all eight stimuli, which were manipulated through the independent variables, before evaluating the interpersonal attraction of each avatar. Moreover, in order to understand whether different viewpoints result in different interpretations of SNS users' impression management strategies and preferences, we used two opposite questionnaires to measure users' different roles in SNSs: audience and performer. The audience role covers people who receive stimuli from performers' self-presentations and whose desire to create a relationship with the performer is evoked to varying degrees. The performer role refers to how individuals present themselves, through the avatars and profiles they construct in SNSs, to obtain more interpersonal attraction. During a 3-week survey period, 335 responses were received, of which 332 were complete and usable; 164 represented audience viewpoints and 168 performer viewpoints.

3. RESULTS

Audience viewpoint results: This research used the audience questionnaire to analyze the degree of interpersonal attraction evoked when viewing different e-impressions. All results were processed with descriptive statistics, multivariate tests, and one-way ANOVA (analysis of variance), with α = 0.05. The analyses yielded the following results. First, avatars that have elaborate clothes, live in a well-decorated apartment, and are described as extraverted receive higher interpersonal attraction evaluations (A, E) than those with low outward attractiveness that are described as introverted (D, H). The results also revealed that avatars with high outward attractiveness described as introverted receive higher interpersonal attraction evaluations (C, G) than avatars with low outward attractiveness described as extraverted (B, F).

Performer viewpoint results: In order to obtain a more comprehensive view of how people build their e-image to gain higher interpersonal attraction, this study polled the opposite side as well and designed another questionnaire to measure the performer role. The results show that people who attempt to obtain more interpersonal attraction are more likely to present themselves through avatars that are dressed up, have an ornamented apartment, and are described as extraverted (A, E) than through avatars with plain apparel and poor embellishment that are described as introverted (D, H). The results also demonstrate that individuals seeking more interpersonal attraction are more likely to present themselves through avatars with elaborate clothes and a furnished apartment, described as introverted (C, G), than through avatars with low outward attractiveness described as extraverted (B, F).
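The one-way ANOVA used in the analysis can be reproduced with a short pure-Python sketch. The rating data below are hypothetical, invented only to illustrate the computation; the study's actual responses are not reproduced here.

```python
# Minimal one-way ANOVA: F = MS_between / MS_within.
# The two groups of attraction ratings are hypothetical illustrations,
# e.g. ratings for a "gorgeous extravert" avatar vs. a "plain introvert" one.

def one_way_anova(groups):
    k = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total observations
    grand = sum(sum(g) for g in groups) / n  # grand mean
    # between-group sum of squares
    ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    # within-group sum of squares
    ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    ms_between = ssb / (k - 1)
    ms_within = ssw / (n - k)
    return ms_between / ms_within            # F statistic

f = one_way_anova([[4, 5, 6], [1, 2, 3]])
print(round(f, 2))  # 13.5
```

The resulting F value is then compared against the critical value at the chosen significance level (α = 0.05 in the study).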
The open-ended responses show that most participants prefer to establish friendships with people who make their avatars gorgeous, because such avatars look fashionable, show good taste, and suggest owners who take their own lives seriously. Some participants indicated that "gorgeous and stylish avatars are attractive and imply that the owner pays attention to self-image and has ideas of their own." Other participants thought that "gorgeous avatars seem to have a colorful life and an outgoing personality, so there should be more topics to chat about!" However, a small number of participants considered people who make their avatars gorgeous to be materialistic and felt distant about befriending them. One participant felt that "a gorgeous avatar is too fancy and means spending too much time and money on this virtual platform." Accordingly, a few participants preferred to establish friendships with plain avatars because they look friendlier, show fewer material desires, and do not seem addicted to the Internet. One participant indicated: "a plain avatar is simple and natural and does not spend too much money in the virtual world. I prefer a simple lifestyle."

4. CONCLUSION

Prior studies pointed out that self-presentation is an important factor in motivating others' willingness to become your friend. The goal of this study is to better understand the impact of impression management in a virtual world. The results suggest that e-impression management has a significant impact in the SNS domain and plays a prominent role in enhancing interpersonal attraction. Both the audience and performer viewpoint results revealed that individuals in SNSs whose avatars wear elaborate clothes, live in a decorated apartment, and are described as extraverted receive higher interpersonal attraction evaluations than those with low outward attractiveness who are described as introverted. Consistent with previous research, physical attractiveness and an extraverted personality have important influences on willingness to interact. At the same time, avatars that were dressed up and lived in decorated apartments were sometimes counted as affluent but also as addicted to the Internet. Careful attention must therefore be paid: designing overly complex avatars may evoke negative attitudes from other SNS users.

This study contributes to both theory and practice. From the theoretical standpoint, our study provides a better understanding of the relationship between e-impression management and interpersonal attraction in SNSs. It also shows that physical attractiveness is a key factor in determining an SNS user's intention to establish friendships. Moreover, while the theory of impression management has usually found a place in the organizational literature, this research successfully applied it to a virtual environment. Our study also shows that traditional interpersonal relationship theories can be applied successfully to online communities. For managers and service providers of SNSs, the results demonstrate that the outward attractiveness and personality of avatars influence SNS users' interpersonal attraction. Service providers can thus consider how to provide multiform appearance elements that let users change their outward features. External appearance and personality can be manipulated to increase interpersonal attraction and develop ideal interpersonal networks. When the expected interpersonal relations are established, users' needs for using the SNS will be fulfilled, and they will tend to continue to support the site.

ACKNOWLEDGEMENT This study was supported by grants from the National Science Council of the Republic of China under Contract Number NSC 99-2410-H-034-038.

REFERENCES

Berscheid, E. and Walster, E., 1974. Physical attractiveness. Advances in Experimental Social Psychology, 7, pp. 157–215.
Ellison, N. B., Lampe, C. and Steinfield, C., 2007. A familiar Facebook: Profile elements as signals in an online social network. Proceedings of Human Factors in Computing Systems, San Jose, CA, USA.
Goffman, E., 1959. The Presentation of Self in Everyday Life. Doubleday, New York.
McCroskey, J. C. and McCain, T. A., 1974. The measurement of interpersonal attraction. Speech Monographs, 41, pp. 261–266.
McKenna, K. Y. A., Green, A. S. and Gleason, M. E. J., 2002. Relationship formation on the Internet: What's the big attraction? Journal of Social Issues, 58(1), pp. 9–31.
Walther, J. B., Slovacek, C. and Tidwell, L., 2001. Is a picture worth a thousand words? Photographic images in long-term and short-term computer-mediated communication. Communication Research, 28, pp. 105–134.


EVALUATION OF THE SERVICE COMPOSITION TECHNIQUES: A TOOL AND CRITERION

Abrehet Mohammed Omer and Alexander Schill
Chair for Computer Networks, TU Dresden
Dresden, Germany

ABSTRACT

There are several service composition techniques proposed by various researchers. However, there is no standard evaluation criterion that takes into account the different aspects of service composition problems. This paper presents: (1) evaluation and validation criteria for composition techniques, obtained by gathering and extending existing methods; when selecting the criteria, care was taken to understand web service composition techniques and problems; and (2) a tool that facilitates the evaluation of web service composition mechanisms. The tool generates a collection of synthetic web services as required, and has a graphical interface to control the number of services to be generated and to inspect intermediate results during composition.

1. INTRODUCTION

Service-Oriented Computing (SOC) is becoming an enabling technology for future computing models over the Internet. The dynamic nature of SOC allows the development of component service-based applications, which has motivated researchers to propose techniques for supporting service composition and integration. The subtasks of a composition technique can include discovering abstract services, creating a composition plan, binding concrete services, and making the result available for execution. The proposed techniques address these subtasks either as a whole or one at a time. For the evaluation of the different approaches, some kind of standard is apparently necessary. However, no standard evaluation criterion has been proposed for composition approaches, which has made the evaluation and comparison of existing service composition techniques difficult. Mostly, the validation and performance evaluation of composition approaches is done using hypothetical scenarios that involve a limited number of services. Such an evaluation method does not adequately characterize the various problems of service composition. To overcome this limitation, some researchers have used test environments that provide synthetic web services. This method also has its own limitations: it only allows the evaluation of composition techniques based on limited criteria, mostly scalability. This paper attempts to set evaluation and validation criteria for composition techniques by gathering and extending the existing methods. It also provides a tool that facilitates the evaluation of automatic web service composition mechanisms based on these criteria.

2. COMPOSITION TECHNIQUES EVALUATION CRITERIA

Composing services requires consideration of multiple composition problems, such as the capability to compose a high number of services and to form composite services with a combination of multiple control flows, i.e. concurrent, sequential, alternative, and loop. Thus, all proposed automatic and semi-automatic composition approaches should be evaluated against these composition problems. There are several service composition techniques proposed by various researchers, yet none of them considers multiple issues of the service composition problem while evaluating their approaches. This makes the comparison of different approaches difficult. One of the two aims of this paper is to provide basic criteria that serve this diversity. The selection of evaluation criteria requires an understanding of web service composition techniques and their associated problems. We set the following questions to guide the selection process: (1) Which properties of the composition problems are worth testing? (2) What are the major properties of composable web services? (3) What kinds of tests (validations) are relevant? Answering these questions led us to the following evaluation criteria:
1. Scalability: this criterion evaluates whether the composition approach is suitable for composing a large number of services. It measures the relation between the computation time of the composition approach and the number of services involved.
2. Multiple control flows: this criterion checks whether the proposed composition approach outputs a composite service workflow that contains various control flows.
3. Inter-domain service combination: this criterion evaluates whether the composition technique works for a set of composable services drawn from different domains.
4. Openness/closedness: this criterion checks whether the proposed composition is open to the participation of arbitrary services or only applicable to pre-defined services.
5. Correctness: this criterion is subject to the user requirement. If the composite service satisfies all goals of a user requirement, then the composition approach creates a correct composite service.
One or more of the above evaluation criteria have been used by other researchers for evaluating composition approaches. We, however, propose the simultaneous use of all criteria when evaluating composition approaches, and we included most of them in the developed test tool.

3. THE TOOL

Evaluating composition techniques against the aforementioned criteria requires service repositories or a real web service environment involving an adequate number of web services. However, it is difficult to find repositories or real environments that satisfy this requisite. This research therefore developed a Synthetic Composable Web Service GENerator (SCWSGen) tool. SCWSGen is capable of generating as large a number of synthetic web services as required; the generated services can be used to evaluate and validate composition approaches. SCWSGen first creates a parameter pool and then creates WSDL files automatically by randomly picking I/O parameters from the pool. Setting some important constraints on this random picking turned out to be necessary, because letting the I/O picking be absolutely random results in properties far from real web service characteristics, such as redundant services and densely interconnected services. The constraints are therefore set by taking into account empirical results from related work about the properties of individual and composite web services. In addition, aspects related to the composition evaluation criteria are taken into account. Two of the above evaluation criteria are explicitly included in the tool, i.e. scalability and multiple control flows, and two others (inter-domain service combination and correctness) are incorporated partially. Testing scalability requires a high number of composable services; SCWSGen generates as many synthetic web services as demanded, which makes it eligible to test scalability. The number of web services to be generated is given as a user-modifiable constraint. The main variables used in the tool are the maximum number of services to be generated (n) and the number of parameters in the parameter pool (m); the average total number of parameters grows proportionally with n.
The SCWSGen puts additional constraints during service generation to create scenario that comprise the multiple control flows. There are four possible control flows: alternative control flow, loop control flow, sequential control flow and concurrent control flow. Alternative control-flow occurs in a composition plan when there are services with same (equivalent) input and output parameters. Loop control flow could occur within a service it self which is (self loop) or it might involve more than one service. Self loop is related to the possibility of web services having at least one parameter that is both input and output parameter. A loop among multiple services occurs when output of a service is input of another service and this chain continues till the last service output which will again become input of the starting service. Specially realizing loop control flow requires cautiously restricted assignment of I/O parameters. Otherwise, it might end up in

218

IADIS International Conferences Web Based Communities 2011, Collaborative Technologies 2011 and Internet Applications and Research 2011

creating complicated loops that doesn’t occur in real case scenarios and that can not be described in composition plans or workflow. To avoid such complicated and unnecessary cycles we tried to investigate loops in existing composite web service scenarios. We have found that the possibility of having nested loop is less and also in reality there are no loops with more than 5 services. Taking this into account the following constraints are set while picking random I/O parameters: (1) one input parameter should not appear in more than two services. (2) a parameter should appear only twice as an input parameter. (3) Input and output parameters should be picked from the same pool (since the aim is to generate composable web services). This constraint enables the tool to evaluate composition approaches with the second evaluation criteria (i.e. including multiple control flow). Beside the two explicitly incorporated criteria a third criterion, i.e. inter-domain service combination, is partially included. This is possible because we randomly generate web services without consideration domain knowledge. To fully include this criterion implementation of ontology and semantic reasoning in the composition process should be done. This is not included in the current version of SCWSGen. SCWSGen is meant to test a composition technique that works in open environment. The other criterion, which is correctness, is complex to fully incorporating it in a tool. But indirectly some features can be cross checked. For example: by checking goals of user request against goals of component services in a composition, by checking expected outputs of user request against outputs of composite service. JAVA JDK is used for implementation and AXIS is utilized for automatic WSDL generation. AXIS is a code generator plug- in for eclipse that supports Java2WSDL and WSDL2Java tools. Figure 1 shows the detailed implementation architecture of the synthetic web service generator.

Figure 1. Synthetic web service generator architecture
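The constrained random I/O assignment described above can be sketched in a few lines of Python. The names and structure here are illustrative assumptions, not SCWSGen's actual Java implementation: services draw inputs and outputs from one shared parameter pool, and each parameter is used as an input by at most two services.

```python
import random
from collections import Counter

def generate_services(n, m, max_io=3, max_input_uses=2, seed=7):
    """Generate n synthetic services whose I/O parameters come from one
    shared pool of m parameters; each parameter serves as an input for at
    most max_input_uses services (sketch of the paper's constraints 1-3)."""
    rng = random.Random(seed)
    pool = [f"p{i}" for i in range(m)]
    input_uses = Counter()
    services = []
    for k in range(n):
        # inputs: only parameters still under the usage cap
        candidates = [p for p in pool if input_uses[p] < max_input_uses]
        n_in = min(rng.randint(1, max_io), len(candidates))
        inputs = rng.sample(candidates, n_in)
        input_uses.update(inputs)
        # outputs: drawn from the same pool, so services stay composable
        outputs = rng.sample(pool, rng.randint(1, max_io))
        services.append({"name": f"ws{k}", "in": set(inputs), "out": set(outputs)})
    return services

svc = generate_services(n=10, m=15)
print(len(svc))  # 10
```

In the real tool, each generated service description would then be turned into a WSDL file and deployed; the sketch stops at the in-memory service descriptions.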

4. SCWSGen SCENARIO GENERATION

SCWSGen is used to generate n service descriptions that form composite service scenarios. Each scenario thus contains n services with varying service dependency arrangements at each run. SCWSGen has a graphical user interface (GUI) that takes user input: in order to test a composition technique, a user initially provides the required number of services (n). A screenshot of the toolkit with its full set of components is shown in Figure 2. The basic GUI components are four buttons: web service generator, dependency generator, show dependency, and CA (which stands for composition approach). These buttons receive the users' input actions. The first two buttons allow a user to generate web services and to identify dependencies among these web services, respectively. When the web service generator button is clicked, all operations necessary to generate synthetic web services are performed and the synthetic web services are deployed on a web server. The last button (CA) directs the user to the specific composition technique to be tested. A researcher can extend the tool by implementing a specific composition technique on top of it; the extension is connected to the 'CA' button. The main requirements to install and use the tool are Java, Apache Tomcat, and Apache Axis. For a user who only wants independent services, the tool also stores the generated synthetic service descriptions in a local folder 'webservices' inside the installation folder.

Figure 2. GUI SCWSGen
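The dependency-generation step can be illustrated with a small self-contained sketch: a dependency edge A → B exists whenever an output parameter of A matches an input of B, and a loop control flow corresponds to a cycle in that graph. The service data below are made up for illustration and are not SCWSGen's code.

```python
def dependency_edges(services):
    """Edge a -> b whenever an output parameter of a is an input of b."""
    return {a["name"]: [b["name"] for b in services
                        if b is not a and a["out"] & b["in"]]
            for a in services}

def has_cycle(edges):
    """Depth-first search with three colors to detect a loop control flow."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in edges}

    def dfs(v):
        color[v] = GRAY
        for w in edges[v]:
            if color[w] == GRAY or (color[w] == WHITE and dfs(w)):
                return True
        color[v] = BLACK
        return False

    return any(color[v] == WHITE and dfs(v) for v in edges)

# Two hypothetical services whose outputs feed each other form a loop;
# a simple chain does not.
loop = [{"name": "ws0", "in": {"p0"}, "out": {"p1"}},
        {"name": "ws1", "in": {"p1"}, "out": {"p0"}}]
chain = [{"name": "ws0", "in": {"p0"}, "out": {"p1"}},
         {"name": "ws1", "in": {"p1"}, "out": {"p2"}}]
print(has_cycle(dependency_edges(loop)), has_cycle(dependency_edges(chain)))
# True False
```

A composition approach plugged in behind the 'CA' button would consume such a dependency structure when building its composition plan.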

5. RELATED WORKS

Conducting an experimental evaluation of service composition techniques requires an appropriate web service environment (tool) and evaluation criteria. However, it is difficult to find a real web service environment with a diverse and adequate number of composable web services that allows performance evaluation studies based on the criteria discussed earlier. There are a few existing prototypes that generate synthetic web services and provide a test-bed for service composition. For example, [2] proposes a test-bed that generates synthetic web services in the form of a graph. The focus of this test-bed is cross-domain chaining and the generation of service composition requests. It considers composition mechanisms that use backward or forward chaining, and it performs the service discovery and process model creation steps of a composition simultaneously; it is therefore not applicable to composition approaches that treat service discovery and process model generation as two different steps. [3] presents a test-bed called WSBen, created to facilitate service discovery and composition mechanisms. This test-bed is suitable mainly for approaches that combine the service discovery and process model generation steps during composition. Neither [2] nor [3] addresses the issue of multiple control flows in composite services, specifically loop control flows, and they consider only scalability as an evaluation criterion. Moreover, their graph models do not consider the existence of cyclic dependencies. [1] presents a tool that generates services automatically via a random graph generator: it randomly picks two parameters from different parameter clusters, makes an edge, inserts one node in between, and calls it a service. The graph model is then converted to an implementation file to generate a deployable service.
The different parameter clusters represent different domains and simulate inter-domain service composition. Their tool provides deployable services in the form of a very densely connected graph without cycles; it is thus most convenient for evaluating approaches that use graph-based algorithms. The existence of a test-bed that provides abstract service descriptions that are not interconnected is an important factor when evaluating composition techniques that perform the different subtasks of service composition separately, in steps. None of the existing test-beds meets this requirement. Moreover, separating the testing tool from the composition implementation, and making it generic, gives a researcher the flexibility to evaluate a composition approach against more criteria than scalability. This plays an important role in increasing the reusability of the test-bed (or tool). The implementation presented in this paper comprises the Synthetic Composable Web Service GENerator (SCWSGen) and a composition plan generator.


SCWSGen is responsible for generating composable web services, as demanded, for any service composition approach. The composition plan generator, which is an implementation of the approach proposed in [5], is included to demonstrate how SCWSGen can be used. It is essential to have benchmark evaluation criteria along with a tool that facilitates the evaluation of composition techniques. In particular, the tool should be capable of providing as many composable services as demanded, covering potentially diverse real-world scenarios, and the benchmark criteria should consider the necessary properties of problems arising in real composition scenarios.

6. DISCUSSION

The comparison between the tool presented in this paper (SCWSGen) and existing tools is given in Table 1. The comparison is based on five criteria: the criterion in column 1 is adopted from [1], and the other four are the evaluation criteria proposed in this paper. The table compares the applicability of the different tools for evaluation based on the proposed criteria.

Table 1. Comparison of SCWSGen with other existing tools

Approach                               | Service graph model | Scalability | Multiple control flow | Discrete service generation | Inter-domain composition
SCWSGen                                | No                  | Yes         | Yes                   | Yes                         | Partial
WSBen [3]                              | Yes                 | Yes         | No                    | No                          | No
Large-scale test-bed [2]               | Yes                 | Yes         | No                    | No                          | No
Automatic service generator tool [1]   | Yes                 | Yes         | No                    | No                          | Yes

There are two main contributions of this paper. First, it presents evaluation and validation criteria for composition techniques, obtained by gathering and extending existing methods. Second, it provides a tool for the constraint-based random generation of synthetic composable web services that can be used to evaluate and compare web service composition mechanisms. The synthetic web services generated by this system are constrained to simulate or characterize real-case scenarios; the constraints are based on a survey of papers that discuss the characteristics of individual and composite web services. The tool can be customized and used by other researchers by including their composition approach algorithms inside the code. It helps to evaluate the efficiency of the algorithms of a given composition technique in creating a composition plan, given candidate services for a composite request.

REFERENCES

[1] Cho, E. et al., 2009. Automatic Web Services Generation. Proceedings of the 42nd Hawaii International Conference on System Sciences, pp. 1-8.
[2] Constantinescu, I. et al., 2004. Large scale, type-compatible service composition. Proceedings of the IEEE International Conference on Web Services, pp. 506-513.
[3] Oh, S.-C. et al., 2006. WSBen: A web services discovery and composition benchmark. Proceedings of the IEEE International Conference on Web Services, Washington, DC, USA, pp. 239-248.
[4] Fluegge, M. et al., 2006. Challenges and techniques on the road to dynamically compose web services. Proceedings of the 6th International Conference on Web Engineering, New York, USA, pp. 40-47.
[5] Omer, A. M. and Schill, A., 2009. Web service composition using input/output dependency matrix. Proceedings of the 3rd Workshop on Agent-oriented Software Engineering Challenges for Ubiquitous and Pervasive Computing, pp. 21-26.


MODELING XML CONTENT EXPLAINED

Harrie Passier and Bastiaan Heeren
School of Computer Science, Open Universiteit Nederland
Valkenburgerweg 177, 6419AT Heerlen, The Netherlands

ABSTRACT

Over the years, a lot of course material has been developed to explain to students the fundamentals of XML and schema languages such as DTD and XML Schema. Typically, the syntax of these languages is discussed and examples are given; how to find a schema for some XML content is often not covered. As a result, students have difficulty starting to model a complex schema, many of their inferred XML schemas are too liberal, and some are even incorrect. In this paper we present a systematic approach to modeling XML content models based on rewriting regular expressions.

KEYWORDS

XML, DTD, schema, modeling, regular expression, rewriting

1. INTRODUCTION

Many of today's computer science courses introduce and explain their topics without mentioning the underlying formal methods (Van Merriënboer, 2007). As a result, it remains unclear how to construct a program, a data structure, et cetera. Instead, teaching methods and engineering approaches are used that rely mainly on inspiration and intuition, and this does not always work out well. As a consequence, students are often not sufficiently aware of what to do and why: they need and ask for more guidance in terms of ''how to do'' a particular task. In this paper, we discuss the use of formal methods in an introductory course on XML at bachelor level. We focus on XML schema languages, such as the Document Type Definition (DTD) language. More specifically, we present an approach based on regular expressions (REs) (Hopcroft, 1979) for deriving content models for XML elements in a systematic way. Although the similarities between schema languages and REs are well understood, books and teaching material do not use this to their advantage. Typically, a number of (small) examples are given, but without an explanation of how the resulting content model was found. We are not aware of books on XML that introduce REs to provide a deeper insight into content models, or a systematic way to model XML content. We present a systematic approach based on rewriting REs that helps students construct content models as part of a schema. By establishing a link between schema languages and REs, it becomes much easier to reason about content models and to manipulate them. Rewrite rules on REs pave the way for a stepwise derivation of a content model. We distinguish between precise content models (describing exactly the set of allowed sequences of XML elements, and nothing more) and correct content models (describing at least the sequences of XML elements we want to have).
For both cases we specify a strategy that makes precise how and where to apply the rewrite rules. The paper is structured as follows. Section 2 gives an introduction to DTDs and REs, and presents rules to rewrite these expressions. Section 3 then explains how to make a content model deterministic, a requirement of the DTD language. Sections 4 and 5 define strategies for precise and correct content models, respectively. The last section gives related work and draws conclusions. More background information and a description of a small-scale experiment can be found in a technical report (Passier, 2011).


R | S = S | R          (1a)
R | R = R              (1b)
εR = R                 (1c)
Rε = R                 (1d)

R? = ε | R             (2a)
R* = ε | RR*           (2b)
R+ = RR*               (2c)

RS | RT = R(S | T)     (3a)
RT | ST = (R | S)T     (3b)
R*R = RR*              (3c)
(RS)*R = R(SR)*        (3d)

Rⁿ ⇒ R*  (if n ≥ 0)    (4a)
Rⁿ ⇒ R+  (if n ≥ 1)    (4b)
R*S* ⇒ (R | S)*        (4c)

Figure 1. Rewrite rules on REs

2. DTDS AND REGULAR EXPRESSIONS

A content model of an XML element is in fact a regular expression (RE) (Bruggemann-Klein, 1998). Figure 1 presents a list of rewrite rules that operate on REs. The first set of rules (1a-1d) expresses some basic properties of the choice and sequence operators. The second set (2a-2c) shows that all occurrences of R? and R+ can be removed from an expression and that R* can be expanded one step. The third set (3a-3d) is for making expressions deterministic. Note that the rules in Figure 1 can be applied in both directions (i.e., also from right to left) because both sides are equal. When modeling XML content, one typically uses the cardinality operators to reduce the size of the model. For example, a | aa | aaa can be written as a+, which is far more concise. The price we pay for this reduction in size is a loss of precision: the latter expression now also accepts aaaa. The fourth set shows rules for the introduction of cardinality operators. These rules are directed, and extend the language that is generated.
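The precision trade-off of introducing a cardinality operator is easy to demonstrate with Python's re module; here standard regex syntax stands in for DTD content model notation.

```python
import re

# a | aa | aaa : precise, accepts exactly one, two or three a's
precise = re.compile(r"a|aa|aaa")
# a+ : concise (the direction of rules 4a/4b), but accepts longer runs too
concise = re.compile(r"a+")

for s in ["a", "aa", "aaa", "aaaa"]:
    print(s, bool(precise.fullmatch(s)), bool(concise.fullmatch(s)))
# aaaa is rejected by the precise model but accepted by a+
```

Running the loop shows the two models agreeing on one, two, and three a's and diverging on aaaa, exactly the loss of precision described above.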

3. REMOVING NON-DETERMINISM

XML is defined to be compatible with SGML, and as a consequence, the content models of DTDs have to be deterministic. A content model is deterministic if an XML processor can check a document against a DTD without looking forward in the document (i.e., inspecting only the current element). Generally, there are two situations in which non-determinism occurs (Watt, 2000):
1. A content model contains R | S and the sets of element names that can start a sequence in L(R) and L(S) are not disjoint, where L denotes the language of an expression. For example, ab | ac is not deterministic because the set of starters (known as the first set) is {a} for both alternatives.
2. A content model contains R?, R*, or R+, and the set of element names that can start a sequence in L(R) is not disjoint with the set of names that can follow in this particular context (the follow set). An example of such a non-deterministic expression is (ab)*ac.
We now present a strategy for the stepwise removal of non-determinism: rewrite problematic subexpressions until a deterministic expression is reached. We discuss the two situations in turn.
Situation 1. Given is a subexpression R | S with at least one element that is a starter of both R and S; let this element be a. The non-determinism can be removed in three steps.
a) Rewrite R and S until element a is the first of a sequence. This involves expanding cardinality operators (2a-2c), removing ε in sequences (1c), and distributing sequence over choice (3b). Rules 1b, 3c, and 3d (and variants for the other operators) can provide a shortcut.
b) If needed, rearrange the alternatives (1a) so that the sequences starting with a are adjacent.
c) Apply the factorization rule 3a. In some cases, an ε has to be introduced first (1d).
Example 1. The top-level alternatives make the following expression non-deterministic:

(a | b)a | a
= aa | ba | a       (3b)
= aa | a | ba       (1a)
= aa | aε | ba      (1d)
= a(a | ε) | ba     (3a)

Situation 2. This case is more complicated because some REs cannot be transformed into a deterministic form (Bruggemann-Klein, 1993, 1998). Examples of such expressions are (ab)*(ac)* and (ab)*a?. Note that deterministic REs should not be confused with deterministic finite-state automata (DFA) (Hopcroft, 1979), another well-known formalism. Every RE can be transformed into an equivalent DFA, and the other way around. An RE constructed from a DFA, however, is not automatically deterministic.


ISBN: 978-972-8939-40-3 © 2011 IADIS

Suppose we have some subexpression R?, R*, or R+, which has element a as a starter. Furthermore, a is also in the follow set of the subexpression at hand. We proceed by case analysis on the operator used.
a) For content model R?, we remove the operator by applying rule 2a, resulting in ε | R. Then, this alternative should be combined with its context, for instance by using distribution (rule 3b, from right to left). Eventually, we arrive at a situation 1 problem. This case also covers non-deterministic expressions of the form ε | R (instead of R?) that have a non-disjoint follow set.
b) At its best, removing non-determinism involving R* can be done with rules 3c and 3d. In some cases, expression R or its context needs some rewriting before these rules are applicable. If this does not work (e.g., because it is impossible), then two possibilities remain to deal with the situation:
- Make the expression less precise, and extend the language generated by the RE.
- Introduce an extra level in the XML tree, and circumvent the ambiguity altogether.
c) In case of R+, we use rule 2c. This reduces the problem to the case for R*.

Example 2. The following derivation illustrates the case for an optional part:

  (ab)?a
  = (ε | ab)a       (2a)
  = a | aba         (3b and 1c)
  = a(ε | ba)       (3a and 1d)

Besides the requirement that content models have to be deterministic, an ε should not be part of a composite content model according to the DTD language specification. This means that in the end, all ε's have to be removed, which is fairly simple (e.g., with rules 2a, 1c, and 1d). This gives us a normal form for content models.

Definition 1. A content model M is in XML normal form (XNF) if and only if M is deterministic, and no ε is present in M (except when M itself is ε).

Reaching XNF is not a goal in itself, but rather a final step after the strategies presented next (Section 4 and Section 5).

4. PRECISE CONTENT MODELS

We now turn our attention to deriving a content model from some instance document. A content model should not be made too liberal without careful thought: after all, schema languages are used in the first place to reject documents and to spot inconsistencies. We start by considering precise content models only: a precise content model contains exactly those sequences of child elements that we want to have.

Definition 2. Let M be a content model, and X a set of sequences of child elements. Then M is a precise model for X if and only if L(M) = X.

Obtaining a precise model for some XML content is rather easy. First, we write down all sequences of child elements for some particular element that appear in the instance document(s). These sequences are the alternatives of the starting model. For example, suppose we have an instance document in which element rec occurs four times, with the child sequences ab, ε (no children), abab, and ab. The starting model of element rec is then ab | ε | abab | ab. We call such a first approximation of a precise content model the starting form (SF). The next step is to rewrite the content model in SF into XNF without losing precision. A strategy for this step is discussed next.

The strategy described in Section 3 can turn any content model into XNF. If we start with a model in SF and look for a precise model, a much simpler strategy is sufficient. Cardinality operators are not present in the starting model, nor are they introduced during rewriting. The strategy for precise content models is:

Input: An XML content model in SF.
Output: A precise XML content model in XNF.
Step 1. Remove redundant choices of the form R | R by applying rule 1b. If the duplicate choices are not adjacent, then change the order (rule 1a).
Step 2. Remove situation 1 type of non-determinism (for subexpressions of the form R | S) by repeating the strategy for situation 1 (Section 3) until the model is deterministic.
Step 3. To reach XNF, we remove all occurrences of ε. For this, we apply rule 2a, from right to left.
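The construction of the starting form, including the duplicate removal of Step 1, can be mechanized directly. A small sketch (the function name and encoding are illustrative assumptions):

```python
def starting_form(child_sequences):
    """Build the starting form (SF): one alternative per observed sequence
    of child elements, duplicates removed (rule 1b), order preserved."""
    seen, alts = set(), []
    for seq in child_sequences:
        word = "".join(seq) or "ε"   # the empty child sequence
        if word not in seen:         # rule 1b: R | R = R
            seen.add(word)
            alts.append(word)
    return " | ".join(alts)

# The four rec occurrences from the example above yield:
print(starting_form([["a", "b"], [], ["a", "b", "a", "b"], ["a", "b"]]))
# → ab | ε | abab
```

The remaining steps (factorization and ε removal) are then carried out by hand with the rewrite rules, as shown in Example 3.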


IADIS International Conferences Web Based Communities 2011, Collaborative Technologies 2011 and Internet Applications and Research 2011

The strategy for precise content models returns canonical models (up to the order of choices). The order in which the rules are applied does not influence the result. Removing duplicates early (step 1) helps to shorten the derivations. Also note that the size of the final model is never larger than that of the original model.

Example 3. We continue with the starting form for element rec, and rewrite it into a precise model in XNF.

  ab | ε | abab | ab
  = ab | ab | ε | abab    (1a)
  = ab | ε | abab         (1b)
  = ε | ab | abab         (1a)
  = ε | abε | abab        (1d)
  = ε | ab(ε | ab)        (3a)
  = ε | ab(ab)?           (2a)
  = (ab(ab)?)?            (2a)

5. CORRECT CONTENT MODELS

A correct content model contains at least the sequences of child elements we want to have, and possibly more.

Definition 3. Let M be a content model, and X a set of sequences of child elements. Then M is a correct model for X if and only if L(M) ⊇ X.

For instance, (ab)* is a correct content model for {ε, ab, abab}, but not a precise model since ababab ∈ L((ab)*). Correct models are generally more concise than precise models; the trade-off is that they can be more liberal than needed. Smaller models show the structure more clearly. For example, (a | b)* is equal to (a*b*)*, but the first one is (arguably) simpler. Minimizing the size of an expression should not be the only goal though. A model that allows everything (e.g., (a1 | a2 | … | an)*, where a1 … an are all existing elements) is concise, but defeats the purpose of writing content models.

The challenge is to find the right balance between conciseness and precision. For this, expert knowledge about the domain being modeled is needed. For example, chapter+ is reasonable for a book record, whereas isbn* is questionable. Such decisions cannot be made automatically by a strategy.

We now present a strategy for correct (but not necessarily precise) content models. This strategy introduces cardinality operators during rewriting. As a rule of thumb, cardinality operators should be introduced early on, and before factorization, because the initial model in SF best exposes the replicated parts. The introduction of cardinality operators can lead to non-deterministic models, which we remove with the strategy described in Section 3.

Input: An XML content model in SF.
Output: A correct XML content model in XNF.
Step 1. Remove redundant choices of the form R | R (rule 1b). Change the order of alternatives if needed (rule 1a).
Step 2. Search for opportunities to introduce cardinality operators, and make sure that this is appropriate in the underlying domain. Find all choices that can be combined, and place these next to each other (rule 1a). If the ε alternative is not present, rewrite all choices to R+ (rule 4b); otherwise, use rule 4a. Afterwards, duplicate alternatives can be removed (rule 1b). Sometimes, parts have to be rewritten before the cardinality operators can be introduced.
Step 3. If no more cardinality operators have to be introduced, bring the expression into XNF by applying factorization and removing ε's. The details of this procedure are discussed in Section 3.

Example 4. Consider the model ab | abab | abc | ε, which is in SF. We identify three of the four alternatives as instances of (ab)*, i.e., zero or more occurrences of ab. Rewriting then proceeds as follows:

  ab | abab | abc | ε
  = ε | ab | abab | abc              (1a)
  ≤ (ab)* | (ab)* | (ab)* | abc      (4a)
  = (ab)* | abc                      (1b)
  = ε | ab(ab)* | abc                (2b)
  = ε | ab((ab)* | c)                (3a)
  = (ab((ab)* | c))?                 (2a)


The resulting model is in XNF. The step in which we give up precision and introduce (ab)* is made explicit in the derivation, and this is where domain knowledge is required.
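The correctness (but imprecision) of the derived model can be double-checked with a standard regex engine. A small Python sketch over the sequences behind Example 4:

```python
import re

wanted = {"", "ab", "abab", "abc"}       # the child sequences of Example 4

sf      = re.compile(r"ab|abab|abc|")    # starting form; the trailing | is ε
correct = re.compile(r"(ab((ab)*|c))?")  # the derived model in XNF

# The derived model is correct: it accepts every wanted sequence ...
assert all(correct.fullmatch(w) for w in wanted)

# ... but it is not precise: introducing (ab)* admits longer repetitions too.
assert correct.fullmatch("ababab") and not sf.fullmatch("ababab")
```

Such a check confirms the L(M) ⊇ X condition of Definition 3, but of course cannot decide whether the extra words are acceptable; that remains the domain expert's call.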

6. RELATED WORK AND CONCLUSIONS

Systematic approaches to problem solving play an important role in education. These approaches are often based on three components: knowledge about a domain, means to reason with that knowledge, and a strategy to guide that reasoning (Bundy 1983, Van Merriënboer 2007). Our approach is based on making the rewrite rules, and the procedure for using these rules, explicit.

In computer science education, the incorporation of formal methods is strongly advocated by scientific societies such as ACM/IEEE, and by many influential scientists (Meyer, 2009). Students employing formal methods during analysis and specification produce more correct, more concise, and less complex models (Sobel, 2002). In many curricula, however, formal methods are treated solely as a separate subject of study (Lamport 2003). Wing (2000) advises weaving the use of formal methods into existing courses, making it an additional problem-solving technique. We think that our approach is a good example of this advice.

There is an extensive literature about the algorithmic inference of XML content models, and about dealing with non-determinism (Bex, 2009). These algorithms often involve the construction of finite-state automata, which makes them more difficult to carry out by hand. We are not aware of other approaches that aim at manually deriving models at the level of an undergraduate course.

We have shown that rewrite rules and strategies for regular expressions help students in understanding XML content models, and guide them in the stepwise construction of such a model. The approach makes a sharp distinction between precise and correct models.

ACKNOWLEDGEMENT

The authors wish to thank Marko van Eekelen, Johan Jeuring, and Lex Bijlsma for their helpful comments on an earlier draft. We are grateful to the students who participated in the experiment.

REFERENCES

Baader, F. and Nipkow, T., 1999. Term Rewriting and All That. Cambridge University Press, Cambridge, UK.
Bex, G.J. et al., 2009. Simplifying XML schema: Effortless handling of nondeterministic regular expressions. In ACM SIGMOD '09, Providence, Rhode Island, United States, pages 731-744.
Bruggemann-Klein, A., 1993. Regular expressions into finite automata. Theoretical Computer Science, 120:87-98.
Bruggemann-Klein, A. and Wood, D., 1998. One-unambiguous regular languages. Information and Computation, 140:229-253.
Bundy, A., 1983. The Computer Modelling of Mathematical Reasoning. Academic Press, London, UK.
Hopcroft, J.E. and Ullman, J.D., 1979. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley.
Lamport, L., 2003. The future of computing: Logic or biology. Text of a talk given at Christian Albrechts University.
Meyer, B., 2009. Touch of Class: Learning to Program Well with Objects and Contracts. Springer.
Passier, H. and Heeren, B., 2011. Modeling XML Content Explained. Technical Report UU-CS-2011-019, Department of Information and Computing Sciences, Utrecht University.
Sobel, A.E.K. and Clarkson, M.R., 2002. Formal methods application: An empirical tale of software development. IEEE Transactions on Software Engineering, 28:308-320.
Van Merriënboer, J.G. and Kirschner, P., 2007. Ten Steps to Complex Learning. Routledge, New York, USA.
Watt, D.A. and Brown, D.F., 2000. Programming Language Processors in Java. Prentice Hall.
Wing, J.M., 2000. Weaving formal methods into the undergraduate computer science curriculum. In Proceedings of AMAST, Iowa City, USA, pages 2-9.


COOPERATIVE INFOTAINMENT SERVICES PLATFORM FOR AMBIENT ASSISTED LIVING

Markus Hager, Mais Hasan, Karsten Renhak, Maik Debes and Jochen Seitz
Department of Communications Networks, Faculty of Electrical Engineering and Information Technology
Ilmenau University of Technology, Germany

ABSTRACT

Due to an ageing society, the current demographic trend poses a big challenge for the development of new and better services for elderly people. Today, elderly people enjoy better health and fitness than elderly people did some years ago. Therefore, there is a need for healthcare, infotainment, e-learning and other services that can satisfy the desires of this group. However, a comparison of different learning platforms and of architectures currently developed or deployed in this field reveals the absence of a combination of ambient assisted living and e-learning services. Our proposed solution, called CrISP-AAL, helps to overcome this weakness. It uses the openAAL platform and consolidates services for ambient assisted living and e-learning. These services are optimized by exploiting context information obtained from the users. To gather context information, we utilize a network of sensors, which are disseminated throughout the environment or worn by users. The following paper describes two practical scenarios for using our CrISP-AAL system. It demonstrates that this system can support a dignified and self-determined life for elderly people in a familiar environment and ensures their integration into society.

KEYWORDS

Ambient assisted living; e-learning; infotainment; health care

1. INTRODUCTION

The term "Ambient Assisted Living" (AAL) describes methods, concepts, (electronic) systems, products and services which are tailored to unobtrusively support the daily life of (elderly) people. Most AAL systems benefit from additional information about the user and his or her environment, usually called context information. The combination and analysis of all available context information can thus be used to improve health or social services and even the user's daily life.

Smart home technologies are often utilized to provide AAL services. They make extensive use of sensors and actuators to realize home automation services. Based on sensor information and predefined rules, specific actions are performed, e.g. closing the windows automatically when the heating is turned on.

For a successful deployment and commercialization of an AAL system, the added value of the whole system and of each single service has to be taken into account: users will not accept such new assistance systems if no direct benefit is visible. This user-centric approach is followed in AAL projects such as Soprano (Sixsmith et al. 2009) and Weitblick (Lutherdt et al. 2009).

The next section gives an overview of ongoing AAL projects for infotainment. Section 3 presents our CrISP-AAL approach, comparing the investigated projects and introducing the sensors and the middleware it builds on. Section 4 highlights two application scenarios. Finally, section 5 summarizes the paper and gives an outlook on the next working steps.
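The sensor-rule-action pattern mentioned above (e.g. closing the windows automatically when the heating is turned on) can be sketched as a minimal rule evaluation loop. All names below are illustrative assumptions, not the API of any particular smart home platform:

```python
# Illustrative sketch only: context keys, rules and action names are our
# own assumptions, not part of any AAL platform discussed in the text.
context = {"heating_on": True, "window_open": True}

rules = [
    # (condition over the current context, action to trigger)
    (lambda c: c["heating_on"] and c["window_open"], "close_window"),
]

def evaluate(context, rules):
    """Fire every rule whose condition holds for the current context."""
    return [action for cond, action in rules if cond(context)]

print(evaluate(context, rules))   # → ['close_window']
```

Real AAL middleware generalizes exactly this idea: context information feeds predefined rules, and matching rules trigger services or actuators.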

2. RELATED WORK – AAL PROJECTS FOR INFOTAINMENT

Research on AAL-related topics has intensified in recent years. Getting a complete overview of all related projects is thus a cumbersome task which cannot be accomplished in a conference paper. Therefore, the scope of the projects considered must be limited – according to the requirements of the users of AAL


systems. Inquiries and interviews of these users have produced similar results. Within the Soprano project (Sixsmith et al. 2009), typical themes of user needs were defined that should be addressed by the services of an AAL system: social isolation, safety and security, keeping healthy and active, forgetfulness, community participation, care provision, and mobility inside and outside the home. It is our impression that many of these themes require a cooperative AAL infotainment platform. Therefore, we concentrate on infotainment AAL projects. Infotainment is "information-based media content or programming that also includes entertainment content in an effort to enhance popularity with audiences and consumers" (Demers 2005).

Since the 1990s, the issue of infotainment and lifelong learning has become increasingly important, especially for elderly people, who represent a growing percentage of today's society. These people want to learn about everything new in the world, particularly in relation to their work. On the other hand, seniors who no longer work need to satisfy their social and personal requirements. Older people also want to decide when, where, what, and why they learn; they prefer to learn freely without the need to follow any curriculum.

Which learning opportunities currently exist for older people? (Kimpeler et al. 2007) discovered in their study that only a few e-learning products and services exist for older learners. There are some Internet portals or platforms which support age-related interests like seniority, rights and habitation. However, specific learning or infotainment software for this target group is still the exception, and most of these programs are primarily designed to give older people basic media skills. The few institutional e-learning programs all have project or experimental character.

Current e-learning modules for seniors take their social and personal needs into account, like health knowledge, travel tips, literature, social connections and everything that relates to their hobbies. Here are some practical examples from European countries:
• Seniorweb (http://www.seniorweb.ch)
Seniorweb is a trilingual (French, German, Italian) interactive Internet platform for the generation 50plus in Switzerland. The platform has existed on the market for over 10 years and provides many service and edutainment offers to its users, such as blogs, chat, a club calendar and forums. Users can inform themselves about health, life, education, work, society, ambience and other categories which the platform supports.
• Senioren lernen online (http://www.senioren-lernen-online.de)
This is a project consisting of seniors who volunteer their time to help other seniors take advantage of lifelong learning using the Internet and various special synchronous and asynchronous platforms. The Internet portal offers courses, workshops, a regulars' table and also individual coaching. The content builds on Web 2.0 and the demand of the participants.
• SeniorLearning (http://www.seniorlearning.eu)
This senior-learning platform offers courses on how to use popular communication tools such as e-mail, chat and forums. It also presents some courses for book lovers on how to access online newspapers, e-books, digital libraries, etc., as well as how to purchase books online (e-shopping). Users must be registered by the administrator to participate in these courses.
• E-learning for Seniors (http://www.el-se.org)
This is an Internet portal that offers computer courses for seniors who want to learn more about using the Internet and the computer at home, e.g. using an e-mail client, doing online banking, organizing a journey via the Internet or using online phone programs. Participation in these courses is not free: seniors must pay to get a user name and password. They also get an e-learning CD with the course content so that they can repeat the course offline.

These examples show that the need for infotainment applications for senior citizens is obvious. Nevertheless, these approaches do not take into account context information that helps to judge the interests and the abilities of the user. Therefore, we propose a new approach called CrISP-AAL (Cooperative Infotainment Services Platform for Ambient Assisted Living).

3. OUR SOLUTION: CRISP-AAL

Before detailing our solution, several basic decisions have to be made. First, an analysis of different AAL projects shows that although there is much ongoing research, there is no generally accepted standard for


AAL services. Table 1 gives a brief overview of a wide selection of AAL-related research projects and commercial products. The table compares the presence of typical assistance system functions (columns) with the listed projects (rows). A "+" or a "-" sign indicates whether the feature is offered by the project or not. A "u" denotes that the authors of this paper have no explicit information about this specific fact. Each service could be offered by the assistance system itself or by a third party. In general, the assistance system should act as a broker for carefully selected services from known companies; this role could be defined as service brokering. Medical and social services could further be offered by a professional nursing service, supported through personal indoor and outdoor activity detection or a GPS-assisted emergency call device. The column miscellaneous and value added services represents any service which adds a subjective or objective value compared to common services. Such a service could for example be a medically trained call center which has access to (medical) information about the calling user or customer. Based on this information and the trained call center staff, further professional help can be arranged if necessary.

Table 1. Survey of AAL projects (rows: Sophia, Paul, Soprano, SLiM, homebutler, Alter leben, easyCare, MIDIS, Weitblick, CrISP-AAL; columns: service brokering, home automation, presence scanning, social network, medical and social services, video conference, miscellaneous and value added services, context-sensitive service offers, memorization service, entertainment and activities, e-learning).

Comparing the content of Table 1, it is obvious that e-learning as well as entertainment and activities are only served by the system described and proposed in this paper. According to the information presently available to the authors, e-learning related aspects or services are not covered by any other known AAL project.

3.1 Sensors in CrISP-AAL

There are several solutions and systems that show how to monitor the indoor environment or how to obtain the physiological parameters of a person. For example, (Zhengzhong et al. 2009) and (Yamagiwa et al. 2010) present solutions based on a wireless sensor network that collect various attributes of the indoor environment like temperature, humidity, intensity of illumination or the presence of a person. (van de Ven et al. 2009) describes a system for health sign sensing based on a waist-worn device supporting complete electrocardiography as well as monitoring of blood oxygen saturation, body temperature and respiratory rate.

In summary, these works demonstrate that it is possible to obtain sufficient information about the environment and the physiological constitution of a person. Further development will make it possible to integrate the necessary sensors into a simple wristlet in such a way that the user will not be impaired by them or even notice them. The modular structure of the openAAL platform presented in the next section facilitates integrating these different sensors into the whole system.


3.2 The openAAL Platform

The openAAL middleware (Wolf et al. 2009) represents a promising technical solution to set up and maintain AAL-related scenarios. This open source project is a result of the SOPRANO Integrated Project and is based on the OSGi framework. It can be separated into three functional components, see Figure 1.

Figure 1. The openAAL architecture (Wolf et al. 2010).

Data and information management is done by the Context Manager. An ontology-based semantic description is used to provide an abstract representation of all present information and the available hardware components. Therefore, an interpretation of low-level information from sensors (as described in the previous section) into abstract context information is required. After this semantic uplifting, abstract information, like "person X left the building", can be specified.

The Procedural Manager implements business logic through a BPEL-based workflow description language. Here, precise events with certain context conditions execute a course of actions defined by the workflow. Procedural templates can further be used to define generic services of certain knowledge domains.

The Composer finally executes the abstract workflow. Hence, a mapping of internal semantic services to external OSGi service bundles is done. (Wolf et al. 2009) explains all components of the middleware in detail.

To use or to extend this middleware, a context-aware description and an OSGi interface of sensors and actuators have to be available. The DIANE service description language (Klein 2005) is used to define the corresponding ontologies. To develop a service or workflow, only high-level abstract information has to be used; technical knowledge about the implementation is not necessary at this level. Care service providers, users or relatives are able to control predefined properties of the AAL system without any programming skills.
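The semantic uplifting performed by the Context Manager can be illustrated with a toy sketch. The sensor names and the inference rule below are our own assumptions and not part of the openAAL API:

```python
# Hypothetical sketch of "semantic uplifting": mapping ordered low-level
# sensor readings onto an abstract context statement such as
# "person left the building".  Event names are illustrative only.
def uplift(events):
    """Derive abstract context information from ordered low-level events."""
    abstract = []
    for i in range(len(events) - 1):
        # Toy rule: hallway motion immediately followed by the front door
        # closing is interpreted as the person leaving.
        if events[i] == ("motion", "hallway") and events[i + 1] == ("door", "front_closed"):
            abstract.append("person left the building")
    return abstract

low_level = [("motion", "living_room"), ("motion", "hallway"), ("door", "front_closed")]
print(uplift(low_level))   # → ['person left the building']
```

In openAAL this mapping is of course expressed declaratively via ontologies rather than hard-coded, but the direction of the transformation, from raw readings to abstract statements, is the same.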

4. SCENARIOS FOR CRISP-AAL

The CrISP-AAL approach can flexibly integrate different applications to assist elderly persons in their daily life. The main criteria these applications have to fulfill are as follows. First, these applications should take care of physical exercise for the users. Although it might be cumbersome, elderly people especially have to keep practicing their physical abilities; otherwise, they are in danger of losing their mobility and independence. However, these exercises need to be closely monitored so as not to overstrain the users. Second, not only physical but also mental exercises are important for the wellbeing of elderly people. They keep the users interested and train their receptiveness. For both kinds of exercises, the elderly people must always be in total control of the application. They must not feel urged to exercise; the system rather proposes new exercises and tries to motivate the user.


Furthermore, these proposals must consider the current physical or mental state of the user to achieve the best results, and the users should always be informed about their performance. Additionally, the application must support multi-user exercises, too. Thus, the problem of isolation that elderly people often experience can be addressed: the users can meet like-minded people without having to travel far. Finally, relatives and medical carers should have the option to access the training results to deduce information about the physical and mental state of the users – of course respecting the privacy of this information. From the medical point of view, new exercises can then be selected and integrated into the training program.

To motivate our framework, we give two exemplary applications that should be of interest to the target audience. We selected one application that deals with physical training, while the other concentrates on the mental abilities of the users. The first application is taking dancing lessons. Assuming that dancing is a kind of sport which is attractive for all ages, we propose a remote dancing class for ballroom dancing. The second application is a remote language course, either for refreshing the knowledge of a certain language or for starting a new one. Using a multimedia stream, one can listen to the pronunciation of words, get information about the grammar or learn special phrases. Obviously, this application is suited for single users as well as for groups of users.

4.1 CrISP-AAL Application "Dancing Lessons"

An application allowing the user to learn dancing could be one interesting option for such a system. But there are two main problems which have to be solved. On the one hand, it must be possible to show the user the correct way to dance. This could be done with the help of a virtual reality system or similar solutions, but then expensive hardware and good technical comprehension would be necessary, reducing the acceptance of this application. Therefore, it is advisable to use the already present TV screen or a computer screen to show the user the dancing lesson.

On the other hand, the more interesting question is how to realize the necessary feedback for the learning application. The main problem of all motion capturing solutions is that they need expensive hardware. Some systems detect motion based on the views of several cameras, or with only two cameras, but then the users have to wear tags on the knees or the arms so that the position of these body parts can be detected. However, the evolution of these hardware systems, for example the Kinect motion control system used by the Xbox game console, or the Asus Wavi Xtion system, shows that they will become cheaper and easier to use in the near future.

If these problems are solved, the application could be trained by a professional dancer to record the reference data for the feedback system. To make it as simple as possible to start learning a dance, different levels should be recorded, for example "simple basics", "beginner level", "intermediate level" and so on. To guide the user through the process of learning to dance, the progress of the user is stored in a database so that the system can analyze the latest results and provide suggestions for the current training session. Depending on the available data about the environment, such as appointments or other persons in the house, the application could suggest training a special dance together at an appropriate time and level, or suggest an interruption during practice if the pulse of the person is too high. The assembly of such different facts is the main advantage of our CrISP system.
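How stored progress and a live pulse reading could drive such session suggestions can be sketched as follows. The level names and the 120 bpm threshold are illustrative assumptions, not values from the paper:

```python
# Sketch under assumed thresholds: combining stored training progress with
# a live pulse reading, as described in the text.  Levels and the 120 bpm
# limit are hypothetical placeholders.
LEVELS = ["simple basics", "beginner level", "intermediate level"]

def suggest(last_completed_level, pulse_bpm):
    """Return a session suggestion based on progress and current pulse."""
    if pulse_bpm > 120:                              # overstrain protection
        return "suggest a break"
    nxt = min(last_completed_level + 1, len(LEVELS) - 1)
    return f"train: {LEVELS[nxt]}"

print(suggest(0, 95))    # → train: beginner level
print(suggest(1, 130))   # → suggest a break
```

In the full system, the pulse would come from the worn sensors of Section 3.1 and the progress from the training database, both mediated by the openAAL context model.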

4.2 CrISP-AAL Application "Foreign Language Course"

Learning a new foreign language, or improving skills in one, is still a popular area of adult education, and the usage of multimedia tools in this area is widely accepted. Most language learning courses address at least three essential subject areas: vocabulary, grammar and conversation. The challenges in realizing this application are usually not related to hardware issues: a technical language lesson system could be provided by an ordinary laptop computer with a screen, a keyboard, a speaker, a microphone and optionally a video camera. Thus, the quality of the application depends on software-related tasks. To adapt this application to the openAAL middleware, several pieces of low-level information have to be analyzed and combined into abstract high-level information. The low-level information could for example consist of the microphone input, speaker output, keyboard input, video camera input or the background noise. On the other hand, the e-learning application relies on abstract high-level information to interact with the users. Such information could be for


example the articulation level, the learning progress, or the user's motivation. This high-level information is computed from (multiple pieces of) low-level information or internally by the language lesson application. The final learning system should be capable of offering suitable language lessons according to the users' motivation and their learning history. During the language lesson, the application should even react and interact with the user to modify the level of difficulty or to offer a short break. Interaction with other users will also be covered by the learning application: here, a (guided) conversation about certain topics via video chat, like a Skype video call, can improve the learning process.

5. CONCLUSION

In this paper, the CrISP-AAL platform has been introduced, which is based on the openAAL framework and combines infotainment services and applications for seniors with context-sensitive processing. According to our investigations, this distinguishes CrISP-AAL from all other platforms and portals currently existing in the AAL sector. Our next steps will be to refine our implementation and to test its attractiveness with a group of volunteering seniors. Accordingly, we will enrich our portfolio of applications.

REFERENCES

Demers, David (2005): Dictionary of Mass Communication & Media Research. A Guide for Students, Scholars, and Professionals. Spokane, WA: Marquette Books.
Hachimura, Kozaburo; Kato, Hiromu; Tamura, Hideyuki (2004): A Prototype Dance Training Support System with Motion Capture and Mixed Reality Technologies. In: 13th International Workshop on Robot and Human Interactive Communication (RO-MAN 2004). Kurashiki, Okayama, Japan, 20-22 September. IEEE, pp. 217–222.
Kimpeler, Simone; Georgieff, Peter; Revermann, Christoph (2007): Zielgruppenorientiertes eLearning für Kinder und ältere Menschen. Sachstandsbericht zum Monitoring „eLearning“. Büro für Technikfolgen-Abschätzung beim Deutschen Bundestag. Berlin (Arbeitsbericht, 115).
Klein, Michael (2005): Handbuch zur DIANE Service Description. Version 2.0.1. Edited by Fakultät für Informatik, Universität Karlsruhe (TH) (Technischer Bericht, TR 2004-17, ISSN 1432-7864).
Lutherdt, Stefan; Stiller, Carsten; Lienert, Katrin; Spittel, Sabine; Roß, Fred; Ament, Christoph; Witte, Hartmut (2009): Design of an Assistance System for Elderly Based on Analyses of Needs and Acceptance. In Constantine Stephanidis (Ed.): Universal Access in Human Computer Interaction. 13th International Conference on Human-Computer Interaction (HCI International 2009). San Diego, CA, USA, 19-24 July, pp. 96–105.
Sixsmith, Andrew; Müller, Sonja; Lull, Felicitas; Klein, Michael; Bierhoff, Ilse; Delaney, Sarah; Savage, Robert (2009): SOPRANO - An Ambient Assisted Living System for Supporting Older People at Home. In Mounir Mokhtari, Ismail Khalil, Jérémy Bauchet, Daqing Zhang, Chris Nugent (Eds.): Ambient Assistive Health and Wellness Management in the Heart of the City. Berlin / Heidelberg: Springer (Lecture Notes in Computer Science, 5597), pp. 233–236.
van de Ven, Pepijn; Bourke, Alan; Tavares, Carlos; Feld, Robert; Nelson, John; Rocha, Artur; O'Laighin, Gearóid (2009): Integration of a Suite of Sensors in a Wireless Health Sensor Platform. In: 8th Annual IEEE Conference on Sensors. Christchurch, New Zealand, 25-28 October.
Wolf, Peter; Schmidt, Andreas; Klein, Michael (2009): Applying Semantic Technologies for Context-Aware AAL Services: What we can learn from SOPRANO. In Stefan Fischer, Erik Maehle, Rüdiger Reischuk (Eds.): Informatik 2009 - Im Focus das Leben. Beiträge der 39. Jahrestagung der Gesellschaft für Informatik e.V. (GI). Workshop on Applications of Semantic Technologies (AST 2009). Lübeck, Germany, 2 October. Gesellschaft für Informatik e.V. (GI) (Lecture Notes in Informatics, 154).
Yamagiwa, Motoi; Murakami, Makoto; Uehara, Minoru (2010): A Proposal of Indoor Life Environment Monitoring for Ecological Lifestyle. In Tomoya Enokido, Fatos Xhafa, Leonard Barolli, Makoto Takizawa, Minoru Uehara, Arjan Durresi (Eds.): 13th International Conference on Network-Based Information Systems (NBiS 2010). Takayama, Gifu, Japan, 14-16 September. Los Alamitos, CA, USA: IEEE Computer Society, pp. 457–462.
Zhengzhong, Wu; Zilin, Liu; Jun, Liu; Xiaowei, Huang (2009): Wireless Sensor Networks for Living Environment Monitoring. In Dat Tran, Shang-Ming Zhou (Eds.): WRI World Congress on Software Engineering (WCSE '09). Xiamen, China, 19-21 May. World Research Institutes, pp. 22–25.


Reflection Papers

IADIS International Conferences Web Based Communities 2011, Collaborative Technologies 2011 and Internet Applications and Research 2011

THE STYLES OF ONLINE WOM-SENDERS AND ONLINE WOM-RECEIVERS AMONG HOT SPRINGS TOURISTS

Kosuke C. Yamada, Masato Nakajima and Muneo Kitajima
Center for Service Research, National Institute of Advanced Industrial Science and Technology (AIST)
2-3-26, Aomi, Koto-ku, Tokyo, 135-0064 Japan

ABSTRACT

Japan has numerous hot-spring resorts (onsen), and visiting them is a popular form of leisure. The spread of broadband in Japan has increased consumer interactions about hot springs in online communities. It is therefore important to evaluate the impact of these interactions on the decision-making process of hot-spring visitors. In this study, we investigated the ecology of word-of-mouth (WOM) communication about hot springs in web-based communities. We conducted a qualitative interview-based study adopting the methodology of Cognitive Chrono-Ethnography (CCE: Kitajima et al., 2010). Eighteen adults (14 women, 4 men; mean age 33.4 years) with experience of using WOM in online communities about hot-spring trips participated in in-depth interviews. We found that the WOM-receivers showed different usage characteristics depending on their degree of dependence on WOM, and that the WOM-senders differed in their motivation for writing.

KEYWORDS

Word of mouth; customer-to-customer relationship; Cognitive Chrono-Ethnography (CCE); tourism; hot springs

1. INTRODUCTION

Traditionally, a trip taken to enjoy hot springs, the "onsen trip", is popular in Japan. There are more than 2,800 hot-spring resorts in Japan, and a total of 1.3 billion visits were made to them in 2008 (Ministry of the Environment, 2010). When a person plans an onsen trip, he or she visits the websites managed by hot-spring resorts and accommodations and obtains various kinds of information through online C2C-WOM, i.e., consumer-to-consumer interaction in the form of word of mouth (hereafter simply "WOM"), when deciding where to stay, what to visit, what to buy, and so on. Since WOM influences the behavior of tourists, both practitioners and researchers in tourism are interested in its ecology (Litvin et al., 2008). WOM exists in industries other than tourism as well; since its influence on consumers' decisions is enormous, WOM has become a hot research topic in various industry fields (Allsop et al., 2007). In addition, service providers have a great interest in WOM because it contains useful information for estimating and improving the services they are currently providing. However, there is no systematic way of using WOM because its ecology is not yet well understood. WOM is a collection of texts written by WOM-senders and read by WOM-receivers. The purpose of this study was to derive a typology of WOM-senders and one of WOM-receivers. The subject of the WOM we focused on was the onsen trip. We conducted an interview-based study adopting a methodology for qualitative study called Cognitive Chrono-Ethnography (CCE), developed by Kitajima et al. (2010), which has been successfully applied to understanding various kinds of customers.

2. METHOD

CCE consists of two stages: (1) selecting "elite monitors" and (2) conducting in-depth interviews with the elite monitors. In a CCE study, researchers need to construct an initial hypothesis about the typology of the customers in question; in this study, this corresponds to the initial hypotheses about WOM-senders and WOM-receivers. In a CCE study, monitors are recruited who represent the respective types defined by the


typology. They are called "elite monitors." A series of in-depth interviews with the elite monitors (typically two or three rounds) is conducted to understand their behavior in detail. The results of the in-depth interviews are used to construct models that represent WOM-senders and WOM-receivers, which are essentially refinements of the initial hypotheses (see Kitajima et al. (2010) for the methodology of CCE).

Table 1. The characteristics of elite monitors as senders.

ID | Age | Gender | Attitude | Content       | Frequency | Needs to convey | Intention to convey
S1 | 40s | Female | High     | General       | High      | Strong          | Influencing receivers' activity
S2 | 20s | Female | High     | Spa           | High      | Strong          | Contributing to receiver
S3 | 20s | Female | High     | Accommodation | Low       | Weak            | Just for information
S4 | 30s | Female | Middle   | General       | Low       | Weak            | Contributing to receiver
S5 | 40s | Female | Middle   | Spa           | Low       | Weak            | Just for information
S6 | 50s | Male   | Middle   | Accommodation | Low       | Weak            | Just for information

Note) The original table also marks, per monitor, the applicable contents of information (accommodation, diet, facilities, spa, staff, location, sightseeing, surrounding area, local); the check marks are not reproduced here.

Table 2. The characteristics of elite monitors as receivers.

ID  | Age | Gender | Attitude | Content       | Frequency | Acceptance of information | Influence of information
R1  | 30s | Female | High     | General       | High      | Strong                    | Strong
R2  | 30s | Female | High     | Partial       | High      | Strong                    | Strong
R3  | 20s | Female | High     | Spa           | High      | Strong                    | Moderate
R4  | 50s | Male   | High     | Accommodation | Middle    | Moderate                  | Moderate
R5  | 30s | Female | Middle   | General       | High      | Strong                    | Moderate
R6  | 30s | Female | Middle   | Partial       | High      | Strong                    | Moderate
R7  | 30s | Female | Middle   | Spa           | High      | Strong                    | Moderate
R8  | 30s | Female | Middle   | Accommodation | High      | Strong                    | Moderate
R9  | 40s | Female | Low      | General       | Low       | Weak                      | Weak
R10 | 20s | Male   | Low      | Partial       | Low       | Moderate                  | Moderate
R11 | 30s | Male   | Low      | Spa           | Middle    | Moderate                  | Weak
R12 | 20s | Female | Low      | Accommodation | Low       | Weak                      | Weak

Note) The original table also marks, per monitor, the applicable contents of information (accommodation, diet, facilities, spa, staff, location, sightseeing, surrounding area, local); the check marks are not reproduced here.

2.1 Elite Monitors

Through a Web survey, eighteen adults (14 females, 4 males; mean age 33.4 years) were selected as the elite monitors. All of them had experience of using WOM about onsen trips. Twelve of them were representative WOM-receivers, and six were representative WOM-senders (see Tables 1 and 2 for details). Based on preliminary research on WOM, we derived a number of parameters that should be useful for defining typologies of WOM-senders and WOM-receivers, and designed questionnaires accordingly. The following were the Web questionnaire items for screening WOM-senders and WOM-receivers, respectively.
The Web questionnaire items for WOM-senders: 1) frequency of writing WOM; 2) subjective estimate of how much the written information is needed by the receivers; 3) subjective estimate of the intention of writing WOM; and 4) subjective estimate of how much he/she wants to contribute to the receivers.
The Web questionnaire items for WOM-receivers: 1) frequency of using WOM; 2) subjective estimate of acceptance of information in WOM; 3) subjective estimate of the influence of WOM information on their decision making; and


4) subjective estimate of the degree of helpfulness of WOM.
The responses from the Web survey were analyzed by Hayashi's quantification method type III, and the factors corresponding to the large eigenvalues were fed into a cluster analysis. Eventually we categorized the respondents into 12 groups of WOM-receivers and 6 groups of WOM-senders. Each of the elite monitors belonged exclusively to one of the eighteen groups.
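As a rough illustration of this pipeline, the sketch below scores a binary response matrix with a correspondence-analysis style decomposition (Hayashi's quantification method type III is closely related to correspondence analysis) and clusters respondents on the leading dimensions. The data here are random stand-ins, not our survey responses, and the real analysis used the quantification method itself:

```python
import numpy as np

# Toy stand-in for the questionnaire data: 18 respondents x 8 binary items.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(18, 8)).astype(float)
X += 0.1  # small smoothing so no row/column is entirely zero in this toy example

# Correspondence-analysis style scoring of respondent profiles.
P = X / X.sum()
r, c = P.sum(axis=1), P.sum(axis=0)
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
U, d, Vt = np.linalg.svd(S, full_matrices=False)
scores = (U[:, :2] * d[:2]) / np.sqrt(r)[:, None]   # two leading dimensions

# Minimal k-means on the leading dimensions (k chosen arbitrarily here).
k = 3
centers = scores[:k].copy()
for _ in range(20):
    labels = np.argmin(((scores[:, None, :] - centers) ** 2).sum(axis=-1), axis=1)
    centers = np.array([scores[labels == j].mean(axis=0) if (labels == j).any()
                        else centers[j] for j in range(k)])
print(labels)
```

In the actual study the number of retained factors was chosen from the large eigenvalues, and the clustering yielded the 6 sender groups and 12 receiver groups described above.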

2.2 In-depth Interviews

We carried out in-depth interviews twice with each monitor. After the first interview, the monitors were asked to record their daily use of WOM in a diary memo. The diary memo was used in the second interview, carried out two or three weeks after the first; its purpose was to help the monitors remember their current and past experience of using WOM. The interviews were semi-structured, with several critical items to be clarified by the monitors' answers during the sessions.

3. RESULTS AND DISCUSSION

3.1 Results from the In-depth Interviews with the WOM-senders

As shown in Table 1, the typology of WOM-senders was based on the patterns of weighting of contents of information. S1 (high attitude, general content) sent general information covering diet, facilities, spa, and staff, and wrote WOM that the other senders had never written. S2 (high attitude, spa content) and S3 (high attitude, accommodation content) wrote about the same things as the other senders, but from a different viewpoint; in fact, they sent limited information about facilities or diet. Although S4 (middle attitude, general content) wrote general information, S4 was not as eager as S1 to post WOM. S5 (middle attitude, spa content) and S6 (middle attitude, accommodation content) sent non-specific information based on their travel experiences. The initial typology was based on contents. However, the results of the in-depth interviews suggested that it would be better to consider the senders' motivations; in other words, a revised typology would be based more on the subjective features of WOM than on its objective feature, i.e., contents. Actually formulating such a typology, however, would require more extensive research.

3.2 Results from the In-depth Interviews with the WOM-receivers

High attitude receivers: The monitors with high attitude (R1, R2, R3, and R4) were particular about the contents of information they wanted to get. For instance, R1 (high attitude, general contents) did not always receive the contents of information in general; individual experiences of hot-spring trips might be important to how WOM is received. If good service was provided by the staff at the accommodation where they stayed, they might focus on staff. The receivers with high attitude were more likely to connect their own experiences directly to how they received WOM.
Middle attitude receivers: The monitors with middle attitude (R5, R6, R7, and R8) received WOM in a variety of ways. They had a strong demand to find out what tourists who had visited the location under consideration really thought. Therefore, they were sensitive to the way WOM was written.
Low attitude receivers: The monitors with low attitude (R9, R10, R11, and R12) used WOM selectively. This is different from the high attitude WOM-receivers, who also used WOM selectively but based on their preferences. The low attitude WOM-receivers intentionally narrowed the range of WOM they would pay attention to from the moment they started collecting information. Their use of WOM was limited, and they never tried to change their attitude even when they knew more information was available that might be useful for their purposes.


As described, the typology based on the degree of attitude would be useful to understand WOM-receivers. The three types, i.e., the high attitude, the middle attitude, and the low attitude monitors showed qualitatively different usage patterns of WOM.

3.3 WOM-texts from the Viewpoints of WOM-senders and WOM-receivers

The interface between WOM-senders and WOM-receivers is the texts that WOM-senders write and WOM-receivers read. We call them WOM-texts. Although a WOM-text exists physically as a single text object, its meaning from the WOM-sender's point of view may differ from its meaning from the WOM-receiver's point of view. Our WOM-senders tended to write positive information about the places they visited and the services they received, which they considered useful for potential visitors. On the other hand, some of our WOM-receivers selectively used negative information when they examined candidate places.

4. CONCLUSION

A vast amount of WOM is continuously generated on the Web, easily accessible to researchers, service providers, WOM-receivers, and WOM-senders. This study showed clearly that WOM, i.e., a collection of texts, conveys more than mere textual information. WOM exists as objects shared by WOM-senders and WOM-receivers, but what WOM actually means differs depending on which side a person is on, i.e., whether he/she is a WOM-sender or a WOM-receiver. This paper showed that such a difference exists and provided hypothetical typologies for WOM-senders and WOM-receivers, which were partially confirmed but definitely require refinement before becoming true typologies. Once the typologies are established, we will be able to envision the entire horizon that extends from a collection of textual WOM. This will also be useful for service providers in tourism.

ACKNOWLEDGEMENT

This research was entrusted to us by the Ministry of Economy, Trade and Industry (METI), Japan.

REFERENCES

Allsop, D. T., Bassett, B. R., and Hoskins, J. A., 2007. Word-of-mouth research: principles and applications. Journal of Advertising Research, pp. 398-411.
Kitajima, M., 2010. Cognitive Chrono-Ethnography: A methodology for understanding diverse tourists' needs. Extended Abstract of ATLAS Annual Conference 2010, pp. 70-74.
Kitajima, M., Nakajima, M., and Toyota, M., 2010. Cognitive Chrono-Ethnography: A Method for Studying Behavioral Selections in Daily Activities. Proceedings of the Human Factors and Ergonomics Society 54th Annual Meeting, San Francisco, U.S.A., pp. 1732-1736.
Litvin, S. W., Goldsmith, R. E., and Pan, B., 2008. Electronic word-of-mouth in hospitality and tourism management. Tourism Management, Vol. 29, No. 3, pp. 458-468.
Ministry of the Environment, Government of Japan, 2010. Interannual change of status of utilization for hot spring. http://www.env.go.jp/nature/onsen/index.html


THE ENTANGLEMENT OF HUMAN AND TECHNOLOGICAL FACETS IN THE INVESTIGATION OF WEB-BASED COMMUNITIES

Laura Carletti and Tommaso Leo
Università Politecnica delle Marche
Via Brecce Bianche, 60131 Ancona, Italy

ABSTRACT

This paper draws upon a review of recent research aimed at exploring the relation between the socio-technological facets of communities and the novel social configurations based on the web. It is part of an ongoing research project designed to investigate the intersection between people and social media that enhances knowledge creation and circulation, as well as 'life-wide' learning. In the paper, a new perspective on the entanglement of the human (society) and the material (technology) is presented to provide an alternative view for analysing online groupings. Although the intertwining of society and technology is the inherent trait of online communities, the two have largely been observed as distinct aspects. This approach can probably be considered unsuitable when web-based social configurations are the field of inquiry. The aim of this paper is to highlight a different angle of investigation for the disciplines facing the complex intersection of human and technological elements. The reflections are at an initial stage, as is the research on the novel web-based formations as an object of study.

KEYWORDS

Online communities; society and technology; sociomateriality; digital formations.

1. INTRODUCTION

This paper is part of an ongoing research project designed to explore the web-based social formations adopting social media for knowledge creation and circulation, as well as for lifelong and 'life-wide' learning. The reflections result from a review of recent literature aimed at exploring the relation between social and technological facets in online communities and aggregations. A promising path of investigation is arising in the field of organisational studies, seeking to overcome the conceptualisation of 'society' and 'technology' as separate entities. Though the roots of this approach can be traced to Actor Network Theory (ANT), the emerging area of research seems to provide interesting perspectives for the analysis of communities and of the new social configurations based on the web. Social media are the tools which appear to facilitate self-organisation in web participation and assemblage, as well as to contribute to increasing "new geographies of association" (Latham & Sassen, 2005). The present work briefly illustrates the latest studies on the entanglement of social and technological aspects, with the aim of providing an alternative angle from which to further investigate novel digital formations. The examined literature can contribute to drawing a research path in several disciplines facing the complex relation between human and technological elements (notably in education and learning studies). The paper starts from a reflection on Wenger et al.'s (2009) and Gee's (in Barton & Tusting, 2005) recent works on learning communities and proceeds to the new approach labelled under the term "sociomateriality" (Orlikowski & Scott, 2008). The intertwining of society and technology is the inherent trait of online groupings; hence the sociomateriality lens possibly offers a profitable contribution to future research development.


2. THE COMPLEX INTERSECTION BETWEEN HUMAN AND TECHNOLOGICAL FACETS

2.1 Reflection One: Human and Technological as Discrete Aspects

Online communities have assumed many forms in the last decade. Notably, the model of the community of practice, proposed by Lave and Wenger (1991), has been a relevant object of research inquiry, as well as influential in several domains. In the authors' view, the community of practice has three characteristics: a shared "domain" of interest; a regularly interacting "community" engaged in joint activities and discussions; and a "practice" in the sense of a shared set of resources: experiences, stories, tools, ways of addressing recurring problems. While Lave and Wenger's paradigm was theorised and refined (Wenger et al., 2002) before the advent of the Web 2.0, Gee has recently introduced the concept of the "semiotic social space", shifting the focus from the "community" to the "space" (in Barton & Tusting, 2005; Gee, 2009). Gee argues that the notion of the community of practice, even if clearly defined by Wenger et al. (2002), has been used to cover a wide array of social forms in an attempt to label groupings (Gee, 2009). The idea of the community of practice is member-centred and implies a sense of belonging to a definite cluster of people. The semiotic social space, by contrast, is focused on an identified territory, real or virtual. Although Wenger et al.'s model still represents a valid framework for developing learning communities, the advent of the Web 2.0 has determined novel social formations, and semiotic social spaces provide an alternative representation of those configurations. The space is shaped by its content, "something for the space to be about", and whatever gives the space some content is defined by Gee as a "generator" (of signs). The set of signs generated can be considered from two different angles: internal and external. The internal perspective refers to the content design and organisation of the space; the external perspective refers to the interactional organisation of the space. In his work on game-based learning, Gee identifies the "affinity space" as an area where people interact around common interests, endeavours, goals, or practices, and where they can also intervene in the implementation of the content design of the space. The influence of external factors (interactions) on the internal level (content design and organisation) represents a leap forward towards distributed participatory design in learning environments. In Wenger et al.'s view (2002), technology represents a means, a platform for the community's exchanges; it is the backstage for the actors of the community. With the "affinity space", Gee highlights the relation between space/technology and human interactions. More recently, Wenger et al.'s work recognises the increasing intertwining of technology and community (Wenger et al., 2009). The authors define that relation as a "vortex of inventiveness", fed by the inputs of technology influencing the shape of communities and by the needs of communities refining technologies. The growing relevance of technology leads Wenger et al. to identify the emergence of technology stewardship, or rather of tech stewards, who take responsibility for the design and management of a community's technology framework. Although technology stewarding is commonly part of the traditional community leadership function, Wenger et al. underline that tech stewards and community leaders are distinct roles. The intersection between technology and community is not the core of Wenger et al.'s recent analysis; the focus is still on communities of practice, where the social learning component is central and the technology is underlying. The definition of "technology stewarding" highlights this standpoint: technology is at the service of the community of practice.
Both Wenger et al.'s and Gee's visions identify the mutual influence between technology design/development and social dynamics in learning environments, but they conceptualise the two as separate entities.

2.2 Reflection Two: Human and Technological as Intertwined Aspects

A different perspective is offered by the emerging research on the entanglement between technology and society, carried out in the field of organisational studies. Orlikowski and Scott posit the inherent inseparability of the technical and the social under the umbrella term of "sociomateriality" (Orlikowski & Scott, 2008). Starting from the observation of the paradox of the pervasive presence of technology in organisations and its loud absence in the organisational studies literature, Orlikowski and Scott argue that some of the reasons behind the gap are the following:


1. The growing complexity of organisations demands investigation from multiple angles (economic, political, sociological, etc.), and the technological is considered just one among them;
2. The increasingly complex technological systems can hardly be explored by researchers with an organisational studies background and no technological skills;
3. Many organisational researchers are more focused on cultural, human, and similar aspects than on material ones;
4. Technology is assumed to be part of the organisation's facilities, as the 'utilities' are (e.g. electricity, telephony, etc.).
Sociomateriality overcomes the conceptualisation of the social and the material as discrete entities and introduces a holistic approach focusing on the interpenetration of humans and technologies rather than on their distinction. The correlation of this view with Actor Network Theory (ANT) is evident: since the 1980s, ANT has paved the way to recognising the agency of both human and non-human/non-individual entities, with no hierarchical relation. Indeed, Orlikowski and Scott intend "sociomateriality" as an umbrella term including similar research streams developed in the past 15-20 years (including ANT). Although sociomateriality emerged in the field of organisational science and mainly explores the relation between 'real' organisations and technology, it can offer a novel approach to analysing the wide range of web groupings (communities of practice, learning networks, knowledge communities, groups of interest, etc.), due to the innate intertwining of technological and social facets in those online environments. The concept of "digital formations" represents a step forward in this direction (Latham & Sassen, 2005). Internet-based communication and information structures are the core of Latham and Sassen's inquiry. Those entities were not present in a given social context before and are in the early stages of formation; hence, for the authors, they represent a novel object of research.
Those digital configurations are characterised (as many social systems are) by three interweaving elements: organisation, interaction and space. The difference is that they are largely constituted on the web and their scaling dynamics can fluctuate heavily. The 'spatial' dimension, also stressed by Gee, seems to represent an emergent aspect of web analysis. The electronic space is described as the "materialization and visualization of the digital that depends on mix of screens, logics of sequencing, and graphic presentations of text and images" (Latham & Sassen, 2005). The process of the material entangling with the social facets is termed "sociodigitization" by Latham and Sassen. The authors argue that the conversion of analogue data into digital form is augmented by the social component: "sociodigitization differs from digitization because what is rendered in digital form is not only information and artefacts but also logics of social organisation, interaction, and space". There is no innovation in interrelating society and technology, as the authors recognise; the unprecedented shifts are determined by the democratisation of knowledge production and sharing, by the manipulative capacities engendered by digital technologies, and by the increasing mobility of knowledge (reusability and transferability of digital information among various contexts).

2.3 Further Reflections

The path of research arising in the field of organisational studies seems to offer a fruitful contribution to the investigation of online communities and of the new social configurations based on the web. Notably in the field of education and learning, the increasing pervasiveness of technology in the past 20 years and the rapid development of interactive tools in the past 10 years have led, on one side, to heavy investments in technology by education institutions and, on the other, to an overall rethinking of the contexts for teaching and learning. Nonetheless, the intersection between formal education and technology still appears to be a 'love-hate relationship'; indeed, "information technologies pose direct challenges to how schooling operationalises learning. These challenges illustrate deep incompatibilities between school and the new technologies" (Collins & Halverson, 2010). The rationale behind this phenomenon possibly accrues from the projection and application of the traditional educational approach onto the use of new technologies. In the management and development of e-education, a shift is needed from the Cartesian view of knowledge as a kind of substance transferable from the teacher to the students to a social learning perspective (Seely Brown & Adler, 2008). The 'promise' of expanded educational opportunities displayed by the web (especially the Web 2.0) could potentially lead to failure if the education system delays recognising the role and the assets of technologies. Conole (2010) argues that the gap between the potential and the actual use of technologies to support learning may be caused by a lack of understanding of the properties of the new


technologies. Whilst this gap is acknowledged in the education sector, self-organised online learning communities and groups appear to be at ease in the electronic space. Formal and informal partnerships enhancing self-directed learning and knowledge sharing are increasing exponentially and spontaneously on the web; co-evolution dynamics of the communities/groups and their technological components can also be observed. Web-based communities and groups exhibit diverse 'behaviours' in their interaction with the technologies supporting their existence. The difference between communities/groups first constituted on the web (solely virtual) and 'real' communities/groups that choose to move (part of) their activities and interactions online seems to reflect the distinction between digital natives and digital immigrants (Prensky, 2001). Thus a classification into digital native communities/groups and digital immigrant communities/groups could represent a contribution to the recent investigation of those digital formations. According to Latham and Sassen (2005), digital formations represent a novel field of inquiry and need to be further explored. The findings of that thread of research, specifically the study of digital native communities/groups, could then contribute to understanding what makes those communities/groups work, to guiding the development of immigrated communities/groups, and to facilitating the often critical intersection of technology and organisations. Among the latter, education institutions often exhibit an awkward approach, possibly amplified by the broad provision of online informal learning opportunities. The hypothesis of the inseparability of social and technological facets (sociomateriality, sociodigitization, ANT) proposes an innovative perspective for analysing web-based communities and groups, although it has been developed within organisational studies. According to the literature review, it paves the way to a similar research development in the field of education and learning.

3. CONCLUSION

This reflection paper is an attempt to present an emerging perspective and to encourage further investigation of the novel digital formations occurring in the electronic space. Web-based communities/groups are multiplying, largely thanks to the diffusion of social media. Although the intertwining of society and technology is an inherent trait of online aggregations, they have largely been observed as social groups merely supported by technology for their existence and development. Accordingly, Orlikowski and Scott (2008) argue that, in organisational studies, 'society' and 'technology' have mainly been investigated as discrete entities. This approach is probably not suitable when web-based social configurations are the object of inquiry. A better understanding of the online "new geographies of association" could be achieved through a holistic and interdisciplinary vision that assumes the entanglement of the social and the technological. That assumption demands cross-disciplinary research groups with deep knowledge of both areas, able to delve into the intersection of social dimensions and technological properties. Orlikowski and Scott's work on sociomateriality offers inspiring directions for carrying out similar studies in other disciplines. Notably, education and learning research could potentially benefit from the application of that perspective, starting from a review of the educational literature. Another question emerges from the analysis of recent studies: is the notion of 'community' appropriate to define the assemblages in social networks (such as LinkedIn, Twitter, Facebook, and MySpace)? Referring to Wikipedia, "Traditionally a 'community' has been defined as a group of interacting people living in a common location.
The word is often used to refer to a group that is organised around common values and is attributed with social cohesion within a shared geographical location, generally in social units larger than a household [...]. Since the advent of the Internet, the concept of community no longer has geographical limitations, as people can now virtually gather in an online community and share common interests regardless of physical location". Hence the concept of community does not appear to fit the large-scale assemblage of people in social networks, who are not necessarily united by common interests or by a sense of belonging and membership. "Digital formations" is the label provided by Latham and Sassen (2005) to identify the novel social forms occurring on the web (suggesting a sort of detachment from the 'traditional' concept of community). Furthermore, in the examined literature, the 'spatial' dimension of the web (rather than the member/social dimension) seems to emerge: "affinity space" (in Barton & Tusting, 2005; Gee, 2009), "digital habitats" (Wenger et al., 2009) and "electronic space" (Latham & Sassen, 2005) are recent definitions. The paper argues that an interdisciplinary review of the literature on web-based communities/groups could lead to a better understanding of their evolution and future perspectives.


IADIS International Conferences Web Based Communities 2011, Collaborative Technologies 2011 and Internet Applications and Research 2011

ACKNOWLEDGEMENT

This work was supported by the RCUK's Horizon Digital Economy Research Hub grant, EP/G065802/1.

REFERENCES

Barton, D., & Tusting, K. (2005). Beyond Communities of Practice: Language, Power and Social Context. Cambridge: Cambridge University Press.
Beetham, H., & Sharpe, R. (2007). Rethinking Pedagogy for a Digital Age: Designing and Delivering E-Learning. Oxon: Routledge.
Carletti, L., & Leo, T. (2010). Virtual Communities of Practice design in the framework of EU projects: a case-study. Proceedings of IADIS-CELDA 2010 International Conference, pp. 311-314. 15-17 October, Timisoara, Romania.
Collins, A., & Halverson, R. (2010). The second educational revolution: rethinking education in the age of technology. Journal of Computer Assisted Learning, 26, 18-27.
Conole, G. (2010). Facilitating new forms of discourse for learning and teaching: harnessing the power of Web 2.0 practices. Open Learning: The Journal of Open, Distance and e-Learning, 25(2), 141-151.
Cook, J. (2002). The role of dialogue in computer-based learning and observing learning: an evolutionary approach to theory. Journal of Interactive Media in Education.
Gee, J. P. (2009). Affinity spaces: from Age of Mythology to today's schools. Retrieved January 24, 2011, from http://www.jamespaulgee.com/node/5
Hayes, E., & Gee, J. P. (2009). Popular Culture as a Public Pedagogy. Retrieved January 24, 2011, from http://www.jamespaulgee.com/node/24
Latham, R., & Sassen, S. (2005). Digital Formations: IT and New Architectures in the Global Realm. Princeton: Princeton University Press.
Latour, B. (2005). Reassembling the Social. Oxford: Oxford University Press.
Lave, J., & Wenger, E. (1991). Situated Learning: Legitimate Peripheral Participation. Cambridge: Cambridge University Press.
Leonardi, P. M. (2011). When Flexible Routines Meet Flexible Technologies: Affordance, Constraint, and the Imbrication of Human and Material Agencies. MIS Quarterly, 35(1), 147-167.
Leone, S., Guazzaroni, G., Carletti, L., & Leo, T. (2010). The increasing need of validation of non-formal and informal learning. The case of the Community of Practice WEBM.ORG. Proceedings of IADIS-CELDA 2010 International Conference, pp. 111-119. 15-17 October, Timisoara, Romania.
Mott, J. (2010). Envisioning the Post-LMS Era: The Open Learning Network. EDUCAUSE Quarterly, 33(1).
Mutch, A. (2010). Technology, Organization and Structure: A Morphogenetic Approach. Organization Science, 21(2), 507-520.
Orlikowski, W. J., & Scott, S. V. (2008). Sociomateriality: Challenging the Separation of Technology, Work and Organization. The Academy of Management Annals, 433-474.
Prensky, M. (2001). Digital Natives, Digital Immigrants. On the Horizon, 9(5), 1-6.
Seely Brown, J., & Adler, R. P. (2008). Minds on Fire: Open Education, the Long Tail, and Learning 2.0. EDUCAUSE Review, 43(1), 16-32.
Tu, C.-H. (2004). Online Collaborative Learning Communities: Twenty-One Designs to Building an Online Collaborative Learning Community. Westport, Connecticut: Libraries Unlimited.
Volkoff, O., Strong, D. M., & Elmes, M. B. (2007). Technological Embeddedness and Organizational Change. Organization Science, 18(5), 832-848.
Wenger, E., McDermott, R., & Snyder, W. M. (2002). Cultivating Communities of Practice. Boston: Harvard Business School Press.
Wenger, E., White, N., & Smith, J. D. (2009). Digital Habitats: Stewarding Technologies for Communities. Portland: CPsquare.
Williams, R., Karousou, R., & Mackness, J. (2011). Emergent Learning and Learning Ecologies in Web 2.0. The International Review of Research in Open and Distance Learning, 12(3), 39-59.
Zammuto, R. F., Griffith, T. L., Majchrzak, A., Dougherty, D. J., & Faraj, S. (2007). Information Technology and the Changing Fabric of Organization. Organization Science, 18(5), 749-762.



ZOOTECHNICS E-SCIENCE - A MANAGEMENT TOOL RELATED TO LIVESTOCK RESEARCH

Adriano Rogério Bruno Tech, Aldo Ivan Céspedes Arce, Max Vicente, Gustavo de Sousa Silva, Ana Carolina de Sousa Silva and Ernane José Xavier Costa
ZAB-FZEA, University of São Paulo, Pirassununga, São Paulo, Brazil. CEP: 13635-900.

ABSTRACT

The aim of this study was to develop a system implementing a Zootechnics e-Science for sharing data among researchers, with an emphasis on poultry houses. The experiment was conducted on the premises of the Faculty of Animal Science and Food Engineering, with the entire structure prepared for the reception of the birds. In this structure, a data collection environment and sensors that store and distribute data via an e-Science were tested. We conclude that it is possible to monitor the environment and the animals and to provide data through the web, which researchers can access in order to interact with other registered researchers and discuss issues related to the experiments under study.

KEYWORDS

Wireless Network, Sensors, Management Environment, Collaboration Research.

1. INTRODUCTION

The concept of e-Science can be defined as global collaboration in key areas of science through sharing, generating research, and discussing results and ideas in a more specific and effective way (Lican, Zhaohui and Yunhe, 2003). Such a system consists of three basic tools - a data repository (Data Warehouse), a metadata editor and a search engine - besides the possibility of inserting a fourth tool, Data Mining, which allows the extraction of hidden patterns. These elements allow better management of the production chain in the management process, industry, trade, agribusiness management or research management at universities and research institutes. Moreover, e-Science has the basic premise of global collaboration between sciences, allowing the generation, analysis, sharing and discussion of insights and results obtained in experiments conducted in universities (Tech, 2008). The benefits of e-Science for the community are relevant, especially when developed with advanced computer network technologies, allowing greater ease and availability of information. In some cases the benefit is the ability to access and control remote resources such as instruments, computing resources, telemetry, visualization or data analysis; in other cases, it is the ability to collaborate with specific remote experts or researchers (SBC, 2006). One of the main objectives of e-Science is the availability of experimental data over the internet for all involved in research, teaching and outreach, from academic or private institutions and public research bodies, through multidisciplinary collaboration (Hey et al., 2005). Therefore, the objective of this work was to develop a Zootechnics e-Science for managing experiments in Animal Science research, by monitoring an aviary experiment.



2. MATERIAL AND METHODS

The experimental test was carried out at the Faculty of Animal Science and Food Engineering, University of São Paulo, Pirassununga/SP. It aimed to test the data acquisition and control system, with sensors and actuators distributed in the environment, in order to provide data for the management system. The manager software was developed on the Java platform with the NetBeans IDE (version 6.9.1), which offers a set of libraries that facilitate the development of new applications through object orientation (OOA and OOP); the MySQL database (version 6.5) was used for data storage. The monitoring infrastructure consists of four wireless IP cameras installed for monitoring the environment, plus temperature, humidity and ammonia sensors; the acquired data is sent to the system manager in order to control actuators such as nebulizers, ventilators and automated curtains attached to the system. Figure 1 shows the system, called "e-LAFAC" (Zootechnics e-Science), for monitoring the activities developed during the research or experiment; it comprises a monitoring module, a local management module and a report module, connected through the web.

Figure 1. Management System Experiments (Aviary) on-line - Use Case e-LAFAC. Source: Tech et al. (2010).
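The sensor-to-actuator control cycle described above can be sketched as follows. This is an illustrative sketch only: the threshold values, the `read_sensors()` stub and all function names are assumptions for the example, not taken from the actual e-LAFAC implementation.

```python
# Hypothetical sketch of the environmental control cycle: readings from the
# wireless sensors are checked against comfort thresholds, and out-of-range
# values are mapped to actuator commands. Thresholds are invented.
THRESHOLDS = {"temperature_c": 28.0, "humidity_pct": 80.0, "ammonia_ppm": 20.0}

def read_sensors():
    # Stand-in for the wireless temperature/humidity/ammonia sensor network.
    return {"temperature_c": 29.5, "humidity_pct": 75.0, "ammonia_ppm": 12.0}

def decide_actuators(reading, thresholds):
    """Map out-of-range readings to actuator commands (nebulizers,
    ventilators, automated curtains)."""
    commands = []
    if reading["temperature_c"] > thresholds["temperature_c"]:
        commands.append("nebulizer_on")
    if reading["humidity_pct"] > thresholds["humidity_pct"]:
        commands.append("ventilator_on")
    if reading["ammonia_ppm"] > thresholds["ammonia_ppm"]:
        commands.append("curtains_open")
    return commands

print(decide_actuators(read_sensors(), THRESHOLDS))  # ['nebulizer_on']
```

In the real system this decision logic would run inside the Java manager software; the point of the sketch is only the mapping from monitored variables to actuator commands.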

Researchers, students and persons authorized by the administrator can access the system via LAN or the web. Figure 2 shows the configuration process and the handling of the environmental monitoring conducted by the researchers.

Figure 2. Management System Case.



Figure 3 shows the above process as a UML use case diagram, covering user login and entry, wireless data acquisition, database storage, monitoring, report generation, data analysis, and access to management reports.

Figure 3. Use Case Monitoring Environment. Source: Tech (2008).

This module allows the managers to collect data and other telemetric parameters from the animals or the monitored environment, in order to generate a database for analysis and verification of patterns such as the temperature, humidity and gas profiles of the monitored environment. Once the data is stored in the database, the system allows the extraction of information through the manager software, which supports the analysis and abstraction of patterns of interest to the researcher; a controlled environment allows higher animal productivity, since the animals are in a place where all the major factors influencing production are controlled (Machado, 2000). This database makes it possible to distribute, integrate and develop high-performance solutions, based on the analyses and decisions that can be drawn from it (Berson & Smith, 1997). Figure 4 illustrates a controlled environment with sensors and actuators installed.

Figure 4. Illustration of the monitored environment.
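The storage-and-extraction step described above can be sketched in a few lines. The paper's system uses MySQL; here SQLite stands in so the example is self-contained, and the table and column names are assumptions for illustration, not the system's actual schema.

```python
import sqlite3

# Readings go into a relational store, and the manager extracts a summary
# pattern such as the average temperature profile per hour.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (hour INTEGER, temperature REAL, humidity REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [(9, 24.1, 61.0), (9, 24.7, 60.5), (10, 26.3, 58.2), (10, 26.9, 57.8)],
)

# Hourly temperature profile: the kind of pattern a researcher would inspect.
profile = conn.execute(
    "SELECT hour, ROUND(AVG(temperature), 1) FROM readings GROUP BY hour ORDER BY hour"
).fetchall()
print(profile)  # [(9, 24.4), (10, 26.6)]
```

The same aggregation query, pointed at the real MySQL warehouse, would produce the temperature profile mentioned in the text.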



3. RESULTS AND DISCUSSION

After testing, it was possible to observe the facilities of the management system, through which researchers have access to the controlled environment via the cameras installed on site, as well as to the data collected by the sensors. Figure 5 shows the system manager designed to manage activities throughout the experiments.

Figure 5. Snapshot of the online management system. Source: Tech (2008).

The screen in Figure 5 allows the registration of new events, as well as the monitoring and management of the registered system, thus facilitating the work of the team of researchers and students controlling the experiment. All data is stored in the database for future availability to all staff and guest researchers. The system also allows the recording of the experiment and data analysis: the researcher can analyze the animals' behavior or the environment in real time and control the system, since the program can act on the experiment through actuators (nebulizers, ventilators and automated curtains). Moreover, the system allowed the exchange of information among researchers. The stored data was shared in management reports, where each researcher or member made observations and reported difficulties in the document bank. This allowed an exchange of experiences without the need to be located in the same place and, most importantly, everyone had access to the video files and to the reports by the researchers, technicians and students involved. Figure 6 shows the screen for accessing the available files.

Figure 6. Page for access to PDF files and AVI available for consultation.

These data illustrate the functionality and performance of the system, showing its applicability in environment monitoring, whether open or closed, with large or small animals.



4. CONCLUSION

The results lead us to conclude that it is possible to monitor the environment and the animals via the web in real time, using computational techniques and advanced concepts in data management, such as computer simulation, combined with a system for exchanging information through a Zootechnics e-Science, where researchers can access and interact with other registered researchers to discuss issues related to the experiments under study.

ACKNOWLEDGEMENT

The Research Foundation of the State of São Paulo (FAPESP).

REFERENCES

Berson, A. and Smith, S. J., 1997. Data Warehouse, Data Mining, & OLAP. McGraw-Hill.
Hey, T. et al., 2005. Cyberinfrastructure for e-Science. Science, v. 308, pp. 817-821.
Lican, H.; Zhaohui, W.; Yunhe, P., 2003. Virtual and dynamic hierarchical architecture for e-Science grid. The International Journal of High Performance Computing Applications, v. 17, n. 3, pp. 329-347.
Machado, F. N. R., 2000. Projeto de DATA WAREHOUSE: Uma Visão Multidimensional. São Paulo: Érica.
SBC. Sociedade Brasileira de Computação, 2006. Ano VII, n. 24, Dezembro.
Tech, A. R. B., 2008. Development of a computer tool for monitoring and data collecting, based in concepts of e-Science and Data Warehouse for the application in Cattle Breeding. Thesis (Doctorate), Faculty of Zootechnics and Food Engineering, University of São Paulo, Pirassununga, São Paulo, Brazil. 129 p.
Tech, A. R. B.; Arce, A. I. C.; Silva, A. C. S.; Pereira, L. A. M.; Costa, E. J. X., 2010. Um modelo de gestão baseado em conceitos de e-science e data warehouse para aplicação no agronegócio da pecuária. Archivos de Zootecnia, v. 59, n. 226, pp. 161-168.



A BUSINESS INTELLIGENCE VIRTUAL COMPETENCY COMMUNITY OF PRACTICE PROPOSAL

Diana Târnăveanu and Mihaela I. Muntean
West University of Timişoara, Faculty of Economics and Business Administration, Department of Informatics and Economic Statistics, 16, H.J. Pestalozzi Str., 300115, Timişoara, ROMANIA

ABSTRACT

In an economic post-crisis context, the general tendency is to migrate to stronger alliances that offer support. One solution is virtual communities of practice (VCoPs). They are an important knowledge management tool, as they are based on common goals and interests shared over a long period of time, and are capable of developing social capital, creating new knowledge, exploiting existing tacit knowledge, stimulating innovation and disseminating the results. Only by optimizing performance can an organization survive and remain an important competitor in a changing market, constantly taking advantage of arising opportunities, taking risks and being flexible towards new, multiple demands. Business Intelligence (BI) represents the capability to look inside a business and the environment in which it operates, in order to ground the most productive and profitable decisions. It can provide great opportunities if used properly. We propose a model of a business intelligence virtual competency community of practice for our University, capable of sustaining practitioners in sharing resources: experiences, problems and solutions, tools and methodologies. Such a community facilitates the improvement of each participant's knowledge, contributes to the development of knowledge within the business intelligence domain, and supports the conduct of original research with the help of the participants. We believe that a virtual space of communication, wisely built and managed, will be of great help to teachers, students and individuals interested in BI tools.

KEYWORDS

Virtual communities, communities of practice, business intelligence, competency, collaboration

1. INTRODUCTION

Every company has to struggle with many challenges: profitability, a high rate of technological innovation, economic globalization, the demand for quality services and products, the desire to consolidate its position on the market, and many others. The need for support is stringent; a team of competence specialists could be the answer. But because of the many restrictions caused by time and place, a virtual meeting place for people who have the same problems and interests could be a solution. Multidimensional analysis offers real advantages for the decision maker, representing an intuitive approach to forecasting in the present context of the market economy. At the global level, the Business Intelligence market has an upward trend. International Data Corporation (IDC) estimated, in the second half of last year, that between 2009 and 2013 this tendency will continue, with an average growth of 7.2% per year, slightly diminished because of the crisis compared to 2008 and 2007. For the next period, Gartner's forecasts are similar to IDC's: for the 2009-2013 period growth persists, though reduced, at 8.1%. The market of BI platforms is among those with the biggest and quickest development, in spite of the crisis. Business Intelligence has been the no. 1 technology spend for the last three years running, according to research firm Gartner [10]. This paper proposes a model of a BI virtual competency community of practice for the West University of Timişoara, capable of sustaining practitioners in sharing resources: experiences, problems and solutions, tools and methodologies. This will facilitate the improvement of each participant's knowledge and will contribute to the development of knowledge within the business intelligence domain and to conducting original research with the help of the participants.



2. BI VIRTUAL COMPETENCY COMMUNITY OF PRACTICE PROPOSAL

2.1 Virtual Competency Communities of Practice

CoP is a concept first introduced in 1991 by Jean Lave and Etienne Wenger of the Institute for Research on Learning [7]. They described the community of practice as a set of relationships between persons, activities and the world, and relationships with other organizations based on common interests. In 1998, Wenger extended this concept, applying it to other contexts, including the organizational context. The model of learning inside a CoP is an apprenticeship model, but it is not restricted to this type of learning [6]. The process by which a newcomer learns from the rest of the group was central to their notion of CoP, so they termed this process LPP - Legitimate Peripheral Participation [7]. One of the barriers in implementing a CoP is related to sharing, especially between institutions, and regards legal issues such as data protection, intellectual property, copyright and confidentiality [1]. Among the causes that can make a CoP fail we enumerate the lack of a common, shared identity, the lack of consensual knowledge, the uncertainty factor (being based on volunteers, sometimes a problem can remain unsolved or untouched), geographical distance (differences in time zones), and lost opportunities for collaboration and for sharing informal knowledge. A CoP is visualized as a goal-seeking system whose survival is problematic [3]. A virtual community of practice is a network of individuals who share a domain of interest about which they communicate online [4]. VCoPs provide many benefits and create opportunities that are not found in traditional organizations. They help develop a holistic understanding and increased levels of flexibility and responsiveness. Many barriers caused by the "virtual" attribute can appear, one of the most important being trust.
Because of weak motivation or lack of self-esteem (people considering their information not substantial enough), individuals may prefer to stand by, not engaging in discussions, preferring to work autonomously or not at all; read-only participants are a real threat to the community [1]. Sustaining a VCoP is difficult, but can be done through monitoring, regulating, maintaining boundaries and responding to change. Identifying and managing competence may help in innovation, decision support and product quality improvement, and constitutes an important input to the creation of the company's organizational knowledge [11]. We found different definitions of competency, all of them suggesting it is an individual personal ability expressed in a set of skills that can help an organization gain competitive advantages. We are interested in competency because we want to: identify strengths and weaknesses in the organization, reduce the vulnerabilities represented by people leaving the organization and taking key competencies away, match the most adequate employees to execute activities in a project, and stimulate interactions and the exchange of knowledge in the organization.

2.2 Our Proposal – BI Virtual Competency Community of Practice

We propose a model of a business intelligence virtual competency community of practice for our University, capable of sustaining practitioners in sharing resources: experiences, problems and solutions, tools and methodologies. The members will benefit from taking new ideas/processes back to their units, sharing their BI initiatives/ideas with a broader community, disseminating business intelligence projects and assisting in building a repository of BI knowledge, standards and methodologies that could be used by all units in future initiatives. Such a community facilitates the improvement of the knowledge of each participant and contributes to the development of knowledge within the business intelligence domain and to conducting original research with the help of the participants. Only by optimizing performance can a company survive and remain an important competitor in a changing market, constantly taking advantage of arising opportunities, taking risks and being flexible towards new, multiple demands [10]. Subordinated to performance management, BI approaches help firms optimize business performance [8]. When the company manages to optimize the business processes that affect the key performance indicator (KPI) metrics, the company can become competitive. Linking Balanced Scorecard measures to strategy is vital; strategic measures, those that define the strategy designed for competitive excellence, will ground the collaborative decision processes among senior and mid-level



managers. Any Balanced Scorecard Management Program can be developed based on a business intelligence approach [9].
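The idea of grounding Balanced Scorecard measures in KPI data can be illustrated with a minimal sketch. The perspectives, weights, targets and the capped-attainment scoring rule below are all invented for the example; they are not the paper's model, only one plausible way to roll KPIs up into a scorecard figure.

```python
# Minimal sketch: each scorecard perspective carries a weight, a KPI target
# and an actual value; attainment is capped at 1.0 and the weighted sum
# gives a single overall score. All numbers are illustrative assumptions.
def kpi_score(actual, target):
    """Attainment of a KPI as a ratio capped at 1.0."""
    return min(actual / target, 1.0)

scorecard = {
    "financial": {"weight": 0.4, "actual": 1.8e6, "target": 2.0e6},
    "customer":  {"weight": 0.3, "actual": 0.87,  "target": 0.90},
    "processes": {"weight": 0.3, "actual": 120,   "target": 100},
}

overall = sum(p["weight"] * kpi_score(p["actual"], p["target"])
              for p in scorecard.values())
print(round(overall, 3))  # 0.95
```

In a BI platform the `actual` values would come from warehouse queries rather than literals, which is exactly the link between BI data and scorecard measures that the text describes.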

2.2.1 Our Objectives

We consider it important to form a community that develops business intelligence tools and solutions as a core business activity. Our goals are to expose users to business intelligence by pursuing BI initiatives in units that are ready and waiting; to make the success and benefits of these projects visible in order to increase business intelligence awareness and stimulate curiosity and demand; and to increase collaboration and strengthen the relationship between the University and teachers, students, alumni, employees or employers, researchers, specialists or simply individuals interested in business intelligence. Our focus is to create a tight network of people who are actively involved in BI, and with their help to create a business intelligence virtual competency community of practice. Our community will provide a forum for BI users to actively share their expertise, improve both the individuals' and the University's body of knowledge, and act as an advocate for business intelligence user needs. This model was built following the University of Michigan's proposal [2].

2.2.2 Organizational Structure

In order to achieve our goals, the targeted participants have been divided into multiple roles, depending on their time commitment and their competency level. When identifying an individual's competencies, we take into consideration the following: declared competencies (the competencies the person thinks he/she has), project competencies (based on the projects he/she has worked on - we assume that if a person worked on a project and executed an activity requiring some competence, then this person has that competence), and community competencies (collected from the communities in which the researcher participates or contributes) [11]. Establishing a central resource of business intelligence experts is most important. The first layer is that of "BI consultants". We consider that a group of 10-15 specialists will be enough. They should be business experts capable of committing to creating awareness, coordinating the activities of the second-layer members, training members, and working on BI website design and tool reviews, working around 4-8 hours per month. The second-layer group of members will be the "BI advisors", a group of 25-30 people, specialists and active members. They will meet twice per month for a project update meeting, analysis sharing and guest speakers. They are expected to actively share what they do in areas of BI, but they are not expected to drive awareness and coordinate activities. The members of these two layers should be the owners of issues that either have already been solved with business intelligence or need to be solved using BI. The third-layer group will allow ordinary members to participate in activities through BI email groups, blogs and the BI website. They will keep themselves up to date with business intelligence activities inside the community, share ideas and submit project proposals, but will not participate in regular meetings.
They are used as a pool from which to solicit volunteers for specific future initiatives or events. We think the size of this layer could be around 150 members. The community will have to be dynamic and organic: there is always an opportunity for a member to step upward on the pyramid and ascend to a higher level. In order to motivate people we considered: providing priority training and one-to-one consulting for community members, providing support/service in helping them document and transfer their skills and knowledge to others, and emphasizing the professional networking opportunities.
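The competency-identification idea above (declared, project-derived and community-derived competencies combined to place members in layers) can be sketched as a simple set operation. The member names, competency labels and the requirement set are invented for illustration; only the union-and-cover logic reflects the text.

```python
# A member's effective competencies are the union of declared, project and
# community competencies; candidates for a layer are members whose effective
# set covers the layer's required competencies. All data is hypothetical.
def effective_competencies(member):
    return member["declared"] | member["project"] | member["community"]

members = {
    "ana": {"declared": {"olap"}, "project": {"etl", "olap"}, "community": {"reporting"}},
    "dan": {"declared": {"reporting"}, "project": set(), "community": set()},
}

advisor_requirements = {"olap", "reporting"}  # assumed bar for the "BI advisors" layer

candidates = sorted(
    name for name, m in members.items()
    if advisor_requirements <= effective_competencies(m)
)
print(candidates)  # ['ana']
```

Such a check could support the stated goals of matching the most adequate people to project activities and spotting competencies the community would lose if a member left.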

2.2.3 Activities

As initial activities [2], the first layer should create the following sub-groups in order to refine and develop the strategies and methods for creating a successful community: a user awareness and training sub-group; a methodology, standards and definitions sub-group; and a data warehouse completeness sub-group to share experiences and plans. The user awareness and training group will propose questions that need to be answered with the existing information, such as "what is BI and why should I care", and will provide technical guidance to the business intelligence web and communication teams. The data warehouse completeness sub-group would be responsible for making sure the right data is available and users are familiar with it: identifying what data is available for decision making, advocating when data is missing, assessing solutions, and working to develop clear expectations and processes for collecting data needed for decision support that is not currently available or accessible.



3. CONCLUSIONS

Having started as a knowledge management tool, communities of practice are based on a common interest that may be different from the interest of a single organization. Communities of practice conduct and perpetuate themselves, the reward being intrinsic, not financial. The common goal is not narrow, the practice is dynamic, the hierarchy is informal, and the members are altruistic volunteers, not a group of paid employees. Modern organizations worldwide are slowly discovering that controlling knowledge is a major component of strategic growth and of creating a competitive organization. With competitiveness as their goal, they have tools like business intelligence to help them, through key performance indicators and instruments like Balanced Scorecards. BI has a tremendous impact on a business once installed: it produces the right information at the right time, which is a key element for the success of any business enterprise [9]. Virtual teams represent an organizational form that is changing the workplace and providing organizations with higher levels of flexibility and responsiveness [11]. A CoP is a vehicle for more effective virtual team working [6]. Information technology is providing the infrastructure necessary for the development of new organizational forms. We propose a model of a business intelligence virtual competency community of practice that will be implemented in our University, our common goal being to raise awareness of new business tools that enhance competitiveness. One of our main concerns is how to motivate people to support this initiative in the long run.

ACKNOWLEDGEMENT

This work was supported by ANCS-CNMP, project number PNII – 92-100/2008-2011.

REFERENCES

[1] Aisha Abuelmaati, Yacine Rezgui – Virtual Organizations in Practice: A European Perspective, Proceedings of the Fourteenth Americas Conference on Information Systems, Toronto, ON, Canada, August 14th-17th 2008, http://aisel.aisnet.org/amcis2008/142
[2] AIMS: A Business Intelligence Competency Community Proposal for the University of Michigan, June 2006, http://www.bi.umich.edu/community/download/bicc_proposal_aims.pdf
[3] Elizabeth J. Davidson, Albert S.M. Tay – Studying Teamwork in Global IT Support, Proceedings of the 36th Hawaii International Conference on System Sciences, 2003, pp. 1-10, http://www.hicss.hawaii.edu/HICSS36/HICSSpapers/CLGVC01.pdf
[4] Pat Gannon-Leary, Elsa Fontainha – Communities of Practice and virtual learning communities: benefits, barriers and success factors, eLearning Papers, no. 5, September 2007, ISSN 1887-1542, http://ssrn.com/abstract=1018066
[5] Chris Kimble, Paul Hildreth – Communities of Practice: Going One Step Too Far?, http://ssrn.com/abstract=634642
[6] Chris Kimble et al. – Effective Virtual Teams through Communities of Practice, Management Science – Theory, Method & Practice, Research Paper No. 2000/9, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=634645
[7] Jean Lave, Etienne Wenger – Situated Learning: Legitimate Peripheral Participation (Learning in Doing: Social, Cognitive and Computational Perspectives), Cambridge University Press, Cambridge, UK, 1991
[8] Zaman Mukhles – Business Intelligence: Its Ins and Outs, Technology Evaluation Centers, April 29th, 2009, http://www.technologyevaluation.com/research/articles/business-intelligence-its-ins-and-outs-19503/
[9] Mihaela I. Muntean, Liviu Gabriel Cabău – Business Performance Approach in a Business Performance Context, Proceedings of The 2nd Symposium on Business Informatics in Central and Eastern Europe, 2011
[10] Mihaela I. Muntean et al. – BI Approach for Business Performance, The 5th WSEAS Conference on Economy and Management Transformation, 2010, http://ssrn.com/abstract=1732190
[11] Sergio Rodrigues et al. – Competence Mining for Team Formation and Virtual Community Recommendation, http://www.sergiorodrigues.net/aulas/downloads/projetofinal/paper_SergioRodrigues_IJWBC.pdf
[12] *** – Virtual Teams and Virtual Communities, https://www1.comp.nus.edu.sg/research/researchareas/is-vt.htm


IADIS International Conferences Web Based Communities 2011, Collaborative Technologies 2011 and Internet Applications and Research 2011

DIGITIZATION OF THE GREEK NATIONAL THEATRE ARCHIVE

Nick Hatzigeorgiu and Nikos Sidiropoulos
Institute for Language and Speech Processing / R. C. “Athena”

ABSTRACT

We describe the digitization process for the archive of the National Theatre of Greece. Many of the problems we encountered in this project are common to any digitization of cultural archives. We discuss all the steps in the design and development of such a digital archive, from the planning stage to the completion of a database driven web application for general public use. We argue that having alternative forms of the metadata, such as XML, minimizes the startup delays and also increases the preservability of the data.

KEYWORDS

Digital Library, Web Application, Heritage Preservation.

1. INTRODUCTION

Digital archives are increasingly important for the preservation of cultural heritage. There are many advantages of a web-based digital archive over traditional libraries and museums: the collected material does not age, it can be viewed from every part of the world, and it can be studied without time limits and without fear of damaging it. Also, the material can be organized in a multitude of categories and virtual timelines, and this permits the emergence of relationships that otherwise would not be evident. Sometimes, digital archives permit us to interact with the material more directly than physical archives and explore the archived material in multiple ways, establishing virtual relations and connecting the archived objects with pictures, videos, sounds and text. Also, there is the possibility that communities with common interests can grow out of such a digital archive (Witten, et al. 2004), and the interaction among their members could contribute more to the goals of the particular institution than a traditional museum would. The design and implementation of a digital library is an engineering endeavor which has to solve a variety of practical problems, both in digitizing the material and in organizing the resulting digital collection. The creation of a digital archive often must confront large amounts of physical material, usually not in good condition. Another constraint is the cataloguing of this material. There are differences between a physical archive and a digital archive, so there has to be extensive planning before digitizing an existing archive. Depending on the particular archive and the material stored in it, a decision has to be made about the ontology that will cover the material of the archive.
Also, it is important to plan for the future preservation of the digital archive, especially the necessary organization process management and the technological continuity for the future growth and use of the collected data (Chen, 2005). In the following, we describe the creation of the digital archive for the National Theatre of Greece. The National Theatre of Greece is the premier theatre in Greece. It was established in 1930 and it has operated since then. Its main aim has been to disseminate Classical Greek Drama, to introduce the Greek people to the classics of Greek and international repertory as well as to all contemporary theatrical trends, to study and investigate new theatrical forms and experimental modes of expression, and to promote theatrical training through the creation of a Drama School.


ISBN: 978-972-8939-40-3 © 2011 IADIS

2. DIGITIZATION PROCESS

The physical archive contains materials dating back to the creation of the theatre. The archive includes theatre programs, photographs, audio, video, music scores, publications and newspaper clippings. The materials are of varying quality and condition. These items were organized in various folders. The archive was accessible mainly to theatre archivists and some people working in theatre studies, but it was not open to the general public. The goal of the project was to construct a digital archive that would be accessible to the general public. The end result of the project would be a structured collection of electronic files and a relational database. The first important step was to decide the organization of the whole digitization process. After some discussions with the archivists we decided that we would have some intermediate products. In particular, it was deemed that it would be better for all the people involved in the project that the metadata be first collected in Word files. This had a lot of advantages: a) it permitted the theatre people to start working without delay, b) it required no special software, c) it required no training, d) it made changes a straightforward process, and e) it provided us with an alternative storage medium which was easy to print and store. Another important decision was the ontology that was going to be used, which would also determine the design of the database schema. As in other cases of digital archive creation (Schering, et al. 2007), a record unit has to be selected in order to achieve a stable and distinguishable organization of data. It was decided that there would be a tree-like structure which would have as its main object the theatrical “play”. All the other information and objects (for example, people or organizations) should be connected directly or indirectly to a play.
Using the play as the main object, we constructed an XML tree-like schema that contained all the metadata that the theatre people wanted to preserve. Following this schema, we developed forms in Word documents that contained the same structure as a numbered multilevel list. On the other hand, normalization of the structure of this XML tree gave us the schema of the relational database that would store the metadata in a later stage of the project. Digitization of the material included the scanning of photographs, programs, music scores and publications, and the digitization of audio and video tape recordings. Because the main goal of the digital archive was the creation of a library for general use and not the creation of a repository for research in theatrical studies, we kept the general digitization requirements relatively low. Programs, publications and newspaper clippings were scanned with a resolution of 96 dpi and a color depth of 24 bits. The same digitization process was followed for photographic materials: photographs were also scanned at 96 dpi with a 24 bit color depth and saved in JPEG format. The varying age of the materials in the archive led to varying quality of the digitized files. For example, some of the original photographs were small black and white photographs from the beginning of the 20th century, while others were large modern color photographs. Audio and video material was digitized with the main purpose of having a streaming media file on a web server. Music material was digitized with higher requirements. Due to the nature of music score documents, a higher scanning resolution was used. In many cases, music scores were written by hand, with lots of errors and corrections by their composers, so a 150 dpi resolution was selected, using a 24 bit color depth and the JPEG format for file size compression.
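For a sense of scale, the uncompressed size of a scan follows directly from resolution and color depth alone. The sketch below uses illustrative page dimensions (an A4-sized page, not figures taken from the project) to compare the 96 dpi and 150 dpi settings:

```ruby
# Estimate the uncompressed size of a scanned page in megabytes.
# width_in/height_in: physical size in inches; dpi: scan resolution;
# depth_bits: color depth (24 bits = 3 bytes per pixel).
def scan_size_mb(width_in, height_in, dpi, depth_bits)
  pixels = (width_in * dpi) * (height_in * dpi)
  bytes  = pixels * depth_bits / 8.0
  (bytes / (1024.0 * 1024.0)).round(1)
end

# An A4 page is roughly 8.27 x 11.69 inches.
low  = scan_size_mb(8.27, 11.69, 96, 24)   # programs, photographs
high = scan_size_mb(8.27, 11.69, 150, 24)  # music scores
```

Moving from 96 to 150 dpi multiplies the raw pixel count by (150/96)^2, about 2.4, which is one reason to reserve the higher setting for the music scores and rely on JPEG compression to keep file sizes manageable.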

3. DATABASE DESIGN AND IMPLEMENTATION

The relational database for this project was designed to hold all the material of the digital archive and all the metadata of the material. The structure of the database was largely decided by the structure of the XML tree for the metadata, which in turn was organized as closely as possible to the existing physical material. This greatly helped the staff of the National Theatre to understand the entities that would later be imported into the database. Most of the information of this tree was normalized so as to form independent objects that were connected to each other using standard one-to-many relationships. In some cases, we also constructed indexed views to achieve higher speeds in complicated queries to the database. Following the organization of the physical archive, the database schema contains the main entities of figure 1 and also an additional number of tables that represent the relationships between these entities. Each record in the database is connected directly or indirectly to the “plays” table. At first, this might seem like a



restriction, but in this archive all the material was actually based on live theater performances and it would be meaningless to have any information that is not somehow connected to a play. The information for each play includes the title of the play, the company that produced the play, and the genre and type classifications for each play. Another important table was the “works” table, which contains the theatrical works. Each play consists of one or more theatrical works. The information for each work includes the work title, the original title, the original language, the year and the genre classifications. Each play was connected to a number of digitized collections that include: programs, publications, photographs, music scores, audio and video. Each of these collections has its own table that contains all the relevant metadata and the filename of the actual digital file. These files were stored in a structured directory that was accessible by the web application.
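To make the play-centric tree concrete, a record of this kind might look like the following. The element names (work, originalLanguage, the photographs collection) are invented for illustration and are not taken from the project's actual schema; Ruby's standard REXML library merely stands in for whatever XML tooling was used:

```ruby
require 'rexml/document'

# Build a minimal, hypothetical "play" record: the play is the root,
# and works and collections hang off it, mirroring the tree-like schema
# that was later normalized into one-to-many relational tables.
doc  = REXML::Document.new
play = doc.add_element('play', 'year' => '1932')
play.add_element('title').text = 'Oedipus Rex'

work = play.add_element('works').add_element('work')
work.add_element('workTitle').text = 'Oedipus Rex'
work.add_element('originalLanguage').text = 'Ancient Greek'

photos = play.add_element('photographs')
photos.add_element('photograph', 'file' => 'photo_0001.jpg')

out = String.new
doc.write(out)   # serialized record, one file per play
```

Each nested repeating element (works, photographs) corresponds to a separate table in the normalized schema, with a foreign key back to the plays table.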

Figure 1. Main entities of the database

During the first stage of the project, the theatre archivists collected most of the original metadata on-site in Word files. We developed scripts and macros that imported these files into the SQL database. At the same time, the software specifications and requirements were gradually crystallized and we developed a complete stand-alone software application that could view, add and edit database records. In the later stages of the project, the theatre archivists performed most of the work using this specialized software application. This process required some normalization in the later stages of the project, and to assist the archivists in this work we developed a semi-automatic merge for objects that might be the same but have slight differences (for example, misspelled or alternative names). The database management software application was able to either connect to the central database or work locally using XML files that followed the structure of the Word files used in the initial cataloguing of the metadata. The ability of the application to export and import XML data was useful because it permitted it to work without a live connection to the central database. Also, XML enables systematic tagging of the components of a play and ensures preservation of the material through easy migration to newer software requirements without the need for costly upgrades (Richards, 2002). The collection of the XML files of the material is also an alternative method of preservation of the metadata.
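The semi-automatic merge for near-duplicate names can be approximated with a simple edit-distance test; the threshold, the helper names, and the sample names below are our own illustrative choices, not the project's actual implementation:

```ruby
# Levenshtein edit distance between two strings (dynamic programming).
def edit_distance(a, b)
  rows = Array.new(a.length + 1) { |i| [i] + [0] * b.length }
  (0..b.length).each { |j| rows[0][j] = j }
  (1..a.length).each do |i|
    (1..b.length).each do |j|
      cost = a[i - 1] == b[j - 1] ? 0 : 1
      rows[i][j] = [rows[i - 1][j] + 1,        # deletion
                    rows[i][j - 1] + 1,        # insertion
                    rows[i - 1][j - 1] + cost  # substitution
                   ].min
    end
  end
  rows[a.length][b.length]
end

# Flag name pairs close enough to be the same person, so an archivist
# can confirm or reject each merge by hand ("semi-automatic").
def merge_candidates(names, threshold = 2)
  names.combination(2).select do |a, b|
    edit_distance(a.downcase, b.downcase) <= threshold
  end
end

pairs = merge_candidates(['Nikos Sidiropoulos', 'Nikos Sidiropulos', 'Maria Callas'])
```

Keeping a human in the loop matters here: a low threshold misses variants, while a high one would merge genuinely different people.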

4. WEB APPLICATION

The last stage of a digitization project is the development of a web application that can access the material stored in the database. We developed an ergonomic design which was easy to use but also rich in possibilities. A formal requirement of the web application was conformance with the W3C standards for XHTML and CSS. The application was developed in ASP.Net and it is a three-tier database-driven web application. The first tier is the presentation tier, which displays the information to web users. The second tier is the business logic unit, which performs the data processing. The third tier is the data tier, which includes the database and the media files. There are three main modes of navigation in the archive: a) using a timeline for the plays, b) using the name of a particular play, and c) searching for a title term in the database. This way we tried to offer a multilayered browsing experience to the user (Weber, et al. 2006), and the user can change between different layers



during navigation along the search path. After choosing a particular play, the user can view all the details related to the play and can also explore the various collections connected to the play, as well as the people and other related objects. All the metadata collected in this project are available to the general web user.
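The third navigation mode, searching for a title term, amounts to a substring match against play and work titles. A sketch with invented sample data (the production application queries the SQL database instead):

```ruby
# Each hash stands in for a row of the "plays" table joined with its works.
PLAYS = [
  { title: 'Oedipus Rex', year: 1932, works: ['Oedipus Rex'] },
  { title: 'Hamlet',      year: 1937, works: ['Hamlet'] },
  { title: 'Antigone',    year: 1940, works: ['Antigone'] }
].freeze

# Case-insensitive substring search over play and work titles.
def search_plays(plays, term)
  t = term.downcase
  plays.select do |p|
    p[:title].downcase.include?(t) ||
      p[:works].any? { |w| w.downcase.include?(t) }
  end
end

hits = search_plays(PLAYS, 'oedipus')
```

Searching work titles as well as play titles matters because a play can consist of several theatrical works whose titles differ from the production's title.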

5. CONCLUSION

We described the digitization of the archive of the National Theatre of Greece and the construction of the accompanying software. This project required the cooperation of a number of different specialists, including archivists, theatre studies specialists, administrators, technicians and computer science people. We found that frequent communication between the different specialists is very important, at least for the first few months of the project, so that a common language is acquired and shared between the people involved. Our experience with this and other software projects leads us to emphasize one very important point: the planning stage of such a project is often the most important part of the project. It is essential to arrive at a precise and comprehensive set of software requirements before the beginning of the construction of the software. It is also essential to leave plenty of room for modifications. In practice, the need for modifications will always arise, but the goal is to minimize the time spent modifying the software by spending time gathering the requirements. For the first few months of this project, the theatre archivists used forms in Word to collect the metadata for the digitized material. This was very beneficial to the project, since it enabled the people who knew the material to work with a software tool they were familiar with. The Word files also enabled us to keep an archive of the material in a more primitive and easy-to-use form, thus ensuring its preservability, which has been found to be an issue in other cases of digital archive development (Hong, et al. 2005). We also found that it was beneficial to the project to have the ability to store the metadata in three different types of files: Word documents, Unicode XML files and a relational database. As a result of this project, the archive of the Greek National Theatre is available on the internet today.
The web application of this project is available at http://www.nt-archive.gr. The first edition of the digital archive is in Greek only, but hopefully in a future project we will be able to provide translations into other languages as well. This project was well received by both specialists in the field of theatre studies and the general public, and it has received favorable mentions in the Greek press.

REFERENCES

Chen, S. S., 2005. Digital Preservation and Workflow Process. Lecture Notes in Computer Science, Vol. 3334/2005, pp. 395-413.
Hong, J. S. et al., 2005. Toward an integrated digital museum system – the Chi Nan experiences. International Journal on Digital Libraries, Vol. 5, No. 3, pp. 231-251.
Richards, J. D., 2002. Digital Preservation and Access. European Journal of Archaeology, Vol. 5, No. 3, pp. 343-366.
Schering, A. C. et al., 2007. Towards a digital archive for handwritten paper slips with ethnological contents. Proceedings of the 10th International Conference on Asian Digital Libraries, ICADL'07, Hanoi, Vietnam, pp. 61-64.
Weber, A. et al., 2006. Multi-Layered Browsing and Visualisation for Digital Libraries. Lecture Notes in Computer Science, Vol. 4172/2006, pp. 520-523.
Witten, I. H. et al., 2004. Digital libraries for creative communities. Digital Creativity, Vol. 15, No. 2, pp. 110-125.


Posters


PORTAL DEVELOPMENT APPROACHES. PROPOSAL FOR COLLABORATIVE COMMUNITIES

Mihaela I. Muntean
West University of Timisoara, Romania

ABSTRACT

Starting from some best practices used in portal development approaches, a proposal for the quick deployment of these systems is introduced. The proposed agile development framework establishes the life-cycle phases of product development, taking into account the desired functionalities. After analyzing the needs of a real user community, a portal deployment project was initiated.

KEYWORDS

Portal, prototype technique, model driven architecture, agile development

1. PORTAL DEVELOPMENT APPROACHES

To support virtual activities and business processes it is necessary to adopt the latest collaborative technologies and information systems that underpin the e-business phenomenon, wide open to potential clients and business partners. Collaborative technologies underlie a large variety of tools, systems and IT platforms supporting different projects of common interest, collaborative communities and enterprises [Muntean, 2010]. Portals have proved their efficiency as infrastructures for these environments, either as a unique portal proposal or one based on a distributed model. Taking into consideration the basic functional architecture defined in [Jansen, Bach, Osterle, 2000] for a collaborative knowledge portal, an agile approach for developing community portals will be proposed. The literature is poor in references concerning portal development approaches; best practices developed by leading organizations and portal designers aim "to better serve customers, to deliver business intelligence across the organization, to deploy effective knowledge management systems, to ensure the adoption by end users" [Sullivan, 2003]. Certainly, functionalities like content management, collaboration, personalization and supporting business processes must be implemented within a community/enterprise portal initiative. Portal development projects are based on general systems development methodologies [Watson, 1997; Turban, Aronson, 2001; Arlow, Neustadt, 2002; Lungu, Sabău, Velicanu, 2003; Zaharia, Roşca, 2002; Davidescu, 2003; Brândaş, 2007], but must take into account the specificity of these IT platforms [Brosche, 2002; Guruge, 2003; Sullivan, 2003; Collins, 2003]. Some approaches suggest that portal development can rely on component reuse, with standard sub-portals representing elements that could decisively contribute to optimizing the development life-cycles of these systems [Crolene, 2002].
Standard groupware sub-portals (e.g. the OsgCorp proposal, 2005 – www.osgcorb.com), business intelligence sub-portals (e.g. SpagoBI – BI Free Platform, 2006 – http://spagobi.objectweb.org), or other specific service-oriented sub-portals can be integrated into the unitary schema of a service-oriented community portal. Generally, a groupware/collaborative sub-portal contains elements such as news, weather and map information, as well as discussion groups, team-oriented to-do and task lists, and other collaborative tools. Reporting, OLAP analysis, data mining, dashboards and scorecards are the BI core components of a business intelligence sub-portal (or portal), all of these being grounded on multi-dimensional data/warehousing models. Rapid portal development can also be achieved by using the prototype technique for the solution deployment [Pienar, 2003]. More and more approaches are model driven [Kleppe, Warmer, Bast, 2003], the desired portal functionalities being implemented starting with the PIM model, followed by the PSM platform



and technology-specific model. According to Muntean (2009, 2010), the following similarity between an MDA approach and a prototype-based portal development project can be established.

Table 1. MDA & Prototype-based development

MDA framework | Prototype-based development | Outputs of the MDA phases
Analysis | Analysis | PIM portal model, developed in executable UML, which describes in a unitary, integrative approach the portal functionalities (personalization, supporting processes, collaboration, content/document management) and the communities of portal users.
Design | Prototype design | PSM portal model, which describes the portal architecture taking into account the necessary implementation technologies: the services architecture of the platform (CORBA, JAVA, .NET, XMI/XML, etc.), the IT platform components that support all portal-specific activities described in the PIM model (portlet-based integration schema, add-on extensions), and the portal prototype.
Writing program code (coding) | Portal prototype implementation | Source code for the portal. Components code library.
Testing | Prototype testing | Corrected source code.
Installation | Portal installation | Installation schema/model.

Nowadays, agile development is an umbrella term for a variety of best practices in software/system development, such as Extreme Programming (XP), Scrum, Agile (Rational) Unified Process Framework (Agile RUP), Crystal Methods, Feature Driven Development (FDD), Lean Development and the Dynamic Systems Development methodology. These methods have proven to be more effective in dealing with changing requirements during the development phases, which always seem to occur. The agile methods "emphasize teamwork, customer involvement and the frequent creation of small, working pieces of the total system" (http://www.answers.com/topic/agile-software-development). The Agile Alliance defines these methodologies as "a group of software development methodologies based on iterative development, where requirements and solutions evolve through collaboration between self-organizing cross-functional teams" (http://www.gatherspace.com/static/agile_software_development.html), each iteration being like a miniature project of its own. Taking into account all these preliminary considerations, we propose an agile development framework for developing portal solutions.

2. AGILE DEVELOPMENT FRAMEWORK PROPOSAL

The proposed agile development framework (fundamentals regarding agile development in [Pereiras, Tobias, Grzegorzek, Staab, 2007]) recommends the use of the prototype technique enriched with MDA-specific attributes and is based on the following phases (Figure 1):

CONCEPTION: at this level the elaboration of the PIM model is targeted, according to the requirements of the knowledge-based collaborative community; first, a feasibility study is made in order to justify the efficiency and efficacy of the project, also elaborating a business plan to demonstrate whether the project brings a measurable benefit or not; the modeling of the requirements will lead to outlining the functionalities of the portal and the user communities, all these being represented at the level of the PIM model;

DESIGN: targets the elaboration of the PSM model specific to the portal prototype, i.e. the finalization of the architecture of this model, taking into account all details regarding the IT infrastructure, which must sustain the unitary, integrating vision of the PIM model. The building of the PSM model will take into account the future implementation solution of the prototype, by relating the model to a certain IT platform and to certain maintaining technologies;

I.T.I (Implementing – Testing – Installation): has the goal of implementing the portal prototype according to the PSM model, followed by the testing of the prototype. Often, as a result of testing its functionality, the



prototype invalidation leads to the revision of the PSM model and aims at correcting some aspects related to the technology and the considered IT platform. Practically, the final version of the portal prototype is obtained by an iterative process involving the adjustment of the PSM model, its implementation and the testing of the prototype solution to verify the imposed requirements. The validation of the prototype leads to the portal installation and its transfer to the users that possess the knowledge of the collaborative community [Muntean, 2009, 2010]. The considered functionalities are sustained by both the PIM and the PSM model (Figure 2), with concrete implementations within the three-tier architecture of the portal.
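The revise-implement-test cycle of the I.T.I phase can be caricatured as a loop that keeps adjusting the PSM model until the prototype passes validation. Everything here is an illustrative sketch of that control flow (the stubbed validation, the iteration cap, the string placeholders), not a real build process:

```ruby
# Minimal sketch of the iterative prototype refinement described above:
# build a prototype from the current PSM model, validate it, and revise
# the PSM model on failure, up to a fixed iteration budget.
def refine_prototype(psm, max_iterations: 5)
  max_iterations.times do |i|
    prototype = "prototype built from #{psm}"
    return [prototype, i + 1] if yield(prototype)  # validation passed
    psm = "#{psm} (revised)"                       # revise the PSM model
  end
  raise 'requirements not met within the iteration budget'
end

# Stub validator: accept once the PSM model has been revised twice.
result, rounds = refine_prototype('PSM v1') { |p| p.count('(') >= 2 }
```

The point of the cap is that an agile iteration is "a miniature project of its own": if the prototype cannot be validated within the budget, the requirements themselves go back to the CONCEPTION phase.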

Figure 1. Agile development of portals

Figure 2. PSM model. Component diagram

3. A FEW ASPECTS REGARDING THE PROJECT

Our target is a small or medium-size user community, with collaboration between the users implemented through a variety of collaborative tools (Figure 2). They facilitate "on-demand collaboration anytime/anywhere", being also real knowledge management assistants. With respect to project management approaches [Bibu, Brândaş, 2000; Demeulemeester, Herroelen, 2002; Bodea, 2005; McCollum, Bănescu, 2005] and the introduced agile development framework, a real knowledge portal project was initiated [Muntean, 2011]. Referring to the prescriptive characteristics collected by the Agile Alliance, the proposed agile framework has led to the following results (Table 2). A correct, functionality-based approach to any portal development process has a decisive role in the outcome of the desired project.

Table 2. Prescriptive Characteristics

Characteristic | XP | Scrum | Crystal | FDD | Agile development framework proposal
Team size | 2-10 | 1-7 | variable | variable | variable
Iteration length | 2 weeks | 4 weeks | < 4 months | < 2 weeks | < 4 months
Distributed support | no | adaptable | yes | adaptable | adaptable
System criticality | adaptable | adaptable | all types | adaptable | adaptable

4. CONCLUSIONS

At the end of last year, Forrester Research Inc. surveyed the business environment, inquiring about companies' openness towards adopting new collaborative IT platforms: a trend was obvious, with more than 50% heading towards collaborative technologies. The adoption of portals within collaborative communities/environments seems to be imminent. After analyzing best practices used in portal deployment processes, an agile development framework was proposed. Further, a real portal development project was initiated, the proposed approach being the subject of a performance analysis. Agility is necessary when it comes to user satisfaction; the proposed agile development of portals managed to deliver a better final solution, faster and cheaper.

ACKNOWLEDGEMENT This work was supported by ANCS-CNMP, project number PNII – 92-100/2008-2011.

REFERENCES

1. Bibu, N., Brândaş, C. (2000), Managementul prin proiecte, Editura Mirton, Timişoara
2. Bodea, C.N. (2005), Managementul informatizat al proiectelor, suport curs master, ASE Bucureşti
3. Brosche, C. (2002), Designing the Corporate Portal – A Master's Thesis Carried Out At Volvo Information Technology, Gothenburg
4. Demeulemeester, E.L., Herroelen, W.S. (2002), Project Scheduling: A Research Handbook, Kluwer Academic Publishers, Dordrecht
5. Guruge, A. (2003), Corporate Portals. Empowered with XML and Web Services, Digital Press/Elsevier Sciences
6. Jansen, C.M., Bach, V., Osterle, V. (2000), Knowledge Portals: Using Internet to Enable Business Transformation, Proceedings of INET 2000
7. Kleppe, A., Warmer, J., Bast, W. (2003), MDA Explained: The Model Driven Architecture: Practice and Promise, Addison Wesley
8. Lungu, I., Sabău, Gh., Velicanu, M., Muntean, M. (2003), Sisteme informatice. Analiză, proiectare şi implementare, Editura Economică
9. Muntean, M. (2009), Collaborative environments. Considerations concerning some collaborative systems, Revista de Informatica Economica, vol. 13/no. 2
10. Muntean, M. (2010), About Portal-Based Collaborative Environments, Proceedings of the IADIS International Conference "Collaborative Technologies 2010", Freiburg
11. Pereiras, S.F., Tobias, W., Grzegorzek, M., Staab, St. (2007), Semantisches Web Portal, Universität Koblenz
12. Piennar, H. (2003), Design and Development of an Academic Portal, Libri 2003, vol. 53
13. Sullivan, D. (2003), Proven Portals. Best Practices for Planning, Designing and Developing Enterprise Portals, Addison-Wesley Press
14. Turban, E., Aronson, J.E. (2003), Decision Support Systems and Intelligent Systems, Prentice Hall, New Jersey
15. Zaharia, D., Roşca, I. (2003), Analiza obiectuală a sistemelor informatice, Editura DualTech, Bucureşti



CONTENT MANAGEMENT IN RUBY ON RAILS

Antonio Tapiador and Joaquín Salvachúa
Universidad Politécnica de Madrid
Avda. Complutense 30, Madrid, Spain

ABSTRACT

Web development is currently driven by model-view-controller (MVC) frameworks. How has content management adapted to this scenario? This paper reviews content management features in the Ruby on Rails framework and its most popular plug-ins. These features are distributed among the different layers of the MVC architecture.

KEYWORDS

Content management, model-view-controller, web development, Ruby on Rails

1. INTRODUCTION

How has content management adapted to the arrival of web development frameworks? Using frameworks for web development has become common practice. Model-view-controller (MVC) patterns facilitate development. They hide complexity, give structure and consistency and promote best practices. Their code is better tested. Finally, a framework becomes popular if it has something useful to offer (Johnson 2005). On the other hand, content management is the process behind matching what your organization has to what your audience wants (Boiko 2001). It comprises collecting, managing and publishing content to any outlet. Web content management is the result of delivering content to the web. Web content management became popular with the growth of web pages (McKeever 2003). But did it catch up with the emergence of web development frameworks? We have found hardly any related work in the literature filling the gap between MVC and content management. There is recent work that introduces the implementation of a web content management system using J2EE MVC technologies (Liduo et al. 2010). It presents a successful case using a 3-tier (MVC) architecture and collects the requirements for web content management. However, it does not explain how content management features are supported by an MVC framework. In this article, we explore this issue from our experience building web content management systems with Ruby on Rails, a popular web development framework designed to increase productivity. It implements the MVC architecture and relies on "convention over configuration" and "don't repeat yourself" (Bachle and Kirchberg 2007).

2. METHOD

We have wide experience developing Ruby on Rails applications, including GlobalPlaza (http://globalplaza.org/), a web content management system developed in the context of the Global project of the EU 7th Framework Programme. We have reviewed the implementation of content management features (McKeever 2003, Liduo 2010) in Ruby on Rails and its plug-ins. Table 1 shows each feature and where it is implemented: Rails or an external plug-in. We have measured their popularity relative to Rails (which is the most popular project). The most popular full-featured content management project built with Ruby on Rails (RadiantCMS) is also included. Popularity is measured using The Ruby Toolbox (http://ruby-toolbox.com/), a web site that collects projects from Github. Github (http://github.com/) is the web site where the Rails community lives. The score of



each project in The Ruby Toolbox is proportional to the number of developers watching it and to the number of its forks on Github.
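For illustration, a score along these lines could be computed as follows. This is a hypothetical sketch: The Ruby Toolbox does not publish its exact formula, so `watchers + forks` stands in for a score proportional to both quantities, and the numbers below are invented.

```ruby
# Hypothetical popularity score: proportional to Github watchers and forks.
def score(project)
  project[:watchers] + project[:forks]
end

# Popularity relative to Rails, the most popular project (normalised to 100).
def relative_popularity(project, rails)
  (100.0 * score(project) / score(rails)).round(1)
end
```

With these assumed figures, a plug-in with a tenth of Rails' watchers and forks would score 10.0 relative to Rails' 100.0.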

3. RESULTS

Table 1. Content management features (McKeever 2003, Liduo 2010) and their support by Ruby on Rails or an external plug-in

Content collection
  Standard tools for content creation: Rails, Formtastic, WillPaginate
  Multi-user support & authorship: Devise
  Separation of content from presentation: Rails
  Content syndication: Rails
  Content preview: Rails*
  Content versioning: VestalVersions
  Relevant content types: Paperclip
  Form support for catalogue type data: Rails
  Localization: Rails
  Shared database for content storage: Rails
  Thin client: Rails
  Real time access to CM functions: Rails

Workflow
  Flexible, multi-threaded: AASM
  Workflow monitoring and control features: Rails
  Workgroups: CanCan

Content delivery
  Static content: Rails
  Dynamic content: Rails
  Automatic link checking: Rails
  Data error checking: Rails
  Separate environments: Rails
  Content version rollback: VestalVersions
  Multi-channel support: Rails, Prawn
  Automatic site changes: Rails*
  Content personalization: Devise

Control and administration
  Role definition and user security: Devise, CanCan
  Taxonomy: ActsAsTaggableOn
  Audit trail: Rails*
  Reporting functions: Rails*, Devise

Rails* means that the feature needs implementation

Figure 1. Popularity of Ruby on Rails content management plug-ins

4. DISCUSSION

Almost all content management features are available in the Ruby on Rails development framework. Most of them are integrated in the core framework, while some are available as plug-ins.

Web authentication mechanisms have evolved considerably in recent years. The most popular authentication plug-in (Devise) provides methods such as:

User name and password: nowadays implemented by almost every web site providing authentication. Credentials can be provided as parameters of a POST request, as the result of filling in a web form, or through HTTP authentication headers such as basic or digest authentication (Franks 1999).

Access token: a token is generated and stored in the server, associated to the user. This token is passed as a parameter in the request URL, or stored in the user client as a cookie. This method is useful for



syndication feeds, or for remembering user authentication in the browser of a trusted computer.

OpenID (Recordon and Reed 2006): this user-centric framework emerged as a solution to the "multiple user names and passwords" problem. OpenID aims to leave authentication solely to the user's identity provider; the rest of the web sites delegate authentication to it.

OAuth: initially a protocol for secure API authorization, OAuth (Hammer-Lahav 2010) has become a popular authentication mechanism used by Facebook and Twitter, among others.

Devise also provides other features, such as confirming email addresses, recovering passwords, tracking user information (e.g. IP address), session timeouts and locking user accounts. Devise's authentication methods are configurable in the model, letting the content management developer decide which methods are appropriate for each case. It provides custom controllers and views for the authentication mechanisms, as well as helper methods so the developer can check whether the user is authenticated and what their identity is.

Authorization is transversal to the MVC architecture. Model: the state of the data in the persistence layer (e.g. role assignments and resource relationships) determines whether authorization is granted or denied. Controller: authorization mandates which actions can be performed in the business logic layer. View: the interface changes depending on authorization; for example, some links are displayed only if the user has the rights to perform the actions behind them. CanCan provides methods for controllers and views. The authorization policies are decoupled from the MVC architecture: they are described in a separate Ability class, which is instantiated for the user in every request.

Ruby on Rails follows a resource oriented architecture (ROA). The framework provides resource management support at the three levels of the MVC architecture. Resources are tightly related to the life cycle of content: collecting, managing and publishing (Boiko 2001).
At the model level, AASM provides state declarations for workflows, VestalVersions revisions and FriendlyID slug generation. At the controller level, InheritedResources provides the basic functionality for the life cycle of resources. Several plug-ins enhance the views: Formtastic powers form generation, WillPaginate index pagination and Prawn PDF views. Finally, there are plug-ins transversal to the MVC architecture, which provide functionality at several of its levels. Examples are Paperclip for file management, RailsAdmin for the resource management interface and ActsAsTaggableOn for taxonomies and folksonomies.
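As an illustration, the workflow and authorization patterns discussed above can be condensed into plain Ruby. The sketch below is not the real AASM or CanCan API; the class and method names (`Article`, `Ability#can?`) are our own simplified stand-ins for a state machine over the content life cycle and a decoupled Ability policy.

```ruby
# Minimal stand-in for an AASM-style workflow: an article moves through
# the content life cycle states draft -> reviewed -> published.
class Article
  TRANSITIONS = { draft: :reviewed, reviewed: :published }

  attr_reader :state

  def initialize
    @state = :draft
  end

  def advance!
    @state = TRANSITIONS.fetch(@state) { raise "final state: #{@state}" }
  end
end

# Minimal stand-in for a CanCan-style Ability class: authorization
# policies live here, decoupled from models, controllers and views.
class Ability
  def initialize(role)
    @role = role
  end

  def can?(action, article)
    case action
    when :read    then article.state == :published || @role == :editor
    when :publish then @role == :editor
    else false
    end
  end
end
```

A controller would then ask something like `Ability.new(role).can?(:publish, article)` before calling `article.advance!`, keeping the policy out of the business logic layer.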

5. CONCLUSION

Content management features have kept pace with the Ruby on Rails development framework. Most of them are integrated in the framework itself, while others are distributed as plug-ins. However, some are more popular than others, in some cases even more popular than the most popular full-featured content management solution. Content management features are transversal to the MVC architecture: they use some, or even all, of its layers.

REFERENCES

Bachle M. and Kirchberg P., 2007. Ruby on Rails. IEEE Software, Vol. 24, Iss. 6, pp. 105-108.
Boiko B., 2001. Understanding Content Management. Bulletin of the American Society for Information Science and Technology, Vol. 28, pp. 8-13.
Franks J. et al., 1999. HTTP Authentication: Basic and Digest Access Authentication. RFC 2617, IETF.
Hammer-Lahav E., 2010. The OAuth 1.0 Protocol. RFC 5849, IETF.
Johnson R., 2005. J2EE Development Frameworks. IEEE Computer, Vol. 38, Iss. 1, pp. 107-110.
Liduo H., Yan C. and Ming Y., 2010. Design and implementation of Web Content Management System by J2EE-based three-tier architecture: Applying in maritime and shipping business. The 2nd IEEE International Conference on Information Management and Engineering (ICIME), pp. 513-517.
McKeever S., 2003. Understanding Web content management systems: evolution, lifecycle and market. Industrial Management & Data Systems, Vol. 103, Iss. 9, pp. 686-692.
Recordon D. and Reed D., 2006. OpenID 2.0: a platform for user-centric identity management. Proceedings of the Second ACM Workshop on Digital Identity Management, ACM.


Workshop Papers


CCBS – A METHOD TO MAINTAIN MEMORABILITY, ACCURACY OF PASSWORD SUBMISSION AND THE EFFECTIVE PASSWORD SPACE IN CLICK-BASED VISUAL PASSWORDS

Haider Al-Khateeb and Carsten Maple
Institute for Research in Applicable Computing, University of Bedfordshire, Luton, LU1 3JU, United Kingdom

ABSTRACT

Text passwords are vulnerable to many security attacks, in part because of the insecure practices of end users who select weak passwords to ease the burden on their long-term memory. Visual password (VP) solutions were developed to maintain both the security and the usability of user authentication in collaborative systems. This paper focuses on the challenges facing click-based visual password systems and proposes a novel method in response to them. Hotspots, for instance, reveal a serious vulnerability: they occur because users are attracted to specific parts of an image and neglect other areas, so image analysis that identifies these high-probability areas can assist dictionary attacks. Another concern is that click-based systems do not guide users towards the exact click-point they are aiming to select: users might recall the correct spot or area but still fail to place their click within the tolerance distance around the original click-point, which results in more incorrect password submissions. Furthermore, the Passpoints study by Wiedenbeck et al. (2005) inspected the retention of their VP in comparison with text passwords over the long term. Despite being a cued-recall scheme, the success rate of VP submission was not superior to that of text passwords: it decreased from 85% (instant retention on the day of registration) to 55% after 2 weeks, identical to the text password result in the same experiment. The success rates after 6 weeks were also 55% for both VP and text passwords. This paper addresses these issues and presents a novel method (CCBS) as a usable solution supported by empirical evidence. A user study is conducted and the results are evaluated against a comparative study.

KEYWORDS

Authentication, visual passwords, click-based systems, hotspots, password space

1. INTRODUCTION

Due to the limitations of current technology, text passwords are relatively secure against guessing, dictionary and brute-force attacks when they are eight characters or longer, consist of a complex mix of characters (digits, letters and symbols) and are completely random; but that is hard to achieve in practice (Furnell, 2003) (Belgers, 1993). ASCII keyboards have 94 printable characters, hence in a traditional text-based password system, given the advised length of eight characters, there is a password space of 94^8 ≈ 6 x 10^15 words. However, in practice, attackers exploit likely patterns to reduce the number of possible string combinations and perform efficient dictionary attacks against the system. For example, if we assume that a group of users tends to use an English word as a password, the effective password space in this case is equal to the number of words in the English dictionary; this is impossible to count accurately, but the number approaches only three quarters of a million words as estimated by the Oxford Dictionary (AskOxford, 2009). As such, a text password, regardless of its length, has an effective password space much smaller than the theoretical space. Similarly, click-based passwords are vulnerable to dictionary attacks, as discussed and analysed in the following section. A click-based system is a VP authentication scheme in which the VP is a sequence of click-points on one image or more (Wiedenbeck et al., 2005b) (al-Khateeb et al., 2009). Users find the retention of a click-based VP easier if it falls within specific hotspots of an image. Some click-points are also easier to select based on their location. For instance, recalling then selecting a click-point visually



represented by the edge of a square is easier than selecting a click-point located inside the square in an empty space. As such, click-based systems require a method to support all their click-points with a memorable cue. The remainder of this paper is organised as follows: Section 2 provides background discussion of related research. Section 3 proposes CCBS. The experiment's methodology is described in Section 4. Section 5 presents the results of the experiment, and Section 6 discusses and concludes the results.
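The password-space arithmetic from the introduction can be checked directly. The figures below are the paper's own (94 printable characters, length eight, and the rough estimate of three quarters of a million English words); the script is merely our verification of them.

```ruby
# Theoretical space of an 8-character password over 94 printable characters.
theoretical = 94**8          # ~6 x 10^15 combinations

# Effective space if users pick a single English dictionary word,
# using the rough estimate of three quarters of a million words.
effective = 750_000

puts theoretical             # 6095689385410816
# The attacker's search space shrinks by roughly ten orders of magnitude.
puts theoretical / effective
```

This is why the paper distinguishes the theoretical from the effective password space: the exponent in 94^8 is irrelevant once user behaviour collapses the search to a dictionary.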

2. BACKGROUND AND RELATED WORK

The effective password space in click-based VPs: Click-based VP schemes have a large theoretical password space, which can be increased further by adding more grid squares (expanding the clickable area) or by adding more images to the user portfolio. However, the effective password space in click-based schemes is significantly smaller than the theoretical space. That is because, if people are not guided or interrupted, they are attracted to a limited number of predictable areas (hotspots) when looking at an image (Wolfe, 2000, Thorpe and Oorschot, 2007) (Erik, 2009) (al-Khateeb et al., 2010). Hence, hotspots can be used to perform an effective dictionary attack. (Thorpe and Oorschot, 2007) used data from a relatively small set of users to explore popular clusters and were able to correctly guess 36% of the passwords within 2^31 guesses. (Dirik et al., 2007) proposed a model to pre-identify hotspots in a given image; however, Section 2.1 of the same paper shows how even carefully selected images remain vulnerable. Another proposal (Chiasson et al., 2008) highlights a random area in the image being used: a user may not click outside this area, but they can press a shuffle button to randomly reposition the highlighted area. This might help to achieve a better distribution of clicks, but it cannot increase the usability of low-probability areas.

Accuracy of password submission: (al-Khateeb et al., 2010) shows that 70% of the incorrect clicks submitted by users were rejected for exceeding the tolerance by up to 4 pixels. (Wiedenbeck et al., 2005a) conducted a study examining the effect of tolerance and concluded that smaller tolerances (10x10 in their case) are harder to encode in users' memory, hence resulting in more incorrect password submissions. Moreover, when retaining the VPs after one week, the number of incorrect submissions with the smaller tolerance (10x10) was significantly higher than with the larger tolerance of 14x14.
This problem persists because password cues in click-based systems guide users towards areas but not towards specific click-points. Increasing the tolerance can eliminate this problem, but it reduces the overall effective password space of the system. In CCP (Chiasson et al., 2007), every click results in a unique path of images until the VP is submitted. This helps the user to reselect a click-point before password submission if the consequent image is not part of their portfolio. While this can partially solve the problem, it can be time consuming and exposes the system to shoulder-surfing attacks, as addressed by its authors.

VP retention in click-based systems: Cued-recall authentication schemes such as click-based systems provide cues to trigger users' memory while entering their password. Each cue should aid long-term memory (LTM) in retaining a particular task successfully. However, a laboratory study by (Wiedenbeck et al., 2005b) showed that the number of participants who failed to submit valid click-based passwords during the experiment was almost identical to that of users who were asked to retain text passwords. Success rates for both types of passwords decreased from 85% (instant retention on the day of registration) to 55% after 1 week from registration/first retention (R1), and the same percentage of 55% was achieved after 4 weeks from the second password retention (R2). This implies that the visual cues failed to significantly help users' memory recall the passwords.
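To make the tolerance discussion concrete, the acceptance test used by click-based schemes can be sketched in a few lines of Ruby. This is our own illustrative code, assuming a square tolerance region around the original point, as in the grid-based schemes discussed above.

```ruby
# A click is accepted when it falls within the tolerance distance of the
# original click-point on both axes (a square region around the original).
TOLERANCE = 4  # pixels, i.e. a 9x9 square centred on the original point

def accepted?(click, original)
  (click[:x] - original[:x]).abs <= TOLERANCE &&
    (click[:y] - original[:y]).abs <= TOLERANCE
end
```

For an original point at (100, 60), a click at (104, 57) is accepted while (105, 60) is rejected by a single pixel; near misses of exactly this kind account for the 70% of rejected clicks reported above.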

3. CUED CLICK-BASED SYSTEM (CCBS)

We propose the Cued Click-Based System (CCBS) as a method to overcome some of the main limitations of click-based systems discussed in the previous section. In CCBS, two types of cues are implemented to trigger the user's memory, graphical and textual, to retain and submit the correct click-points. Each image is transparently divided into click-cells representing the available symbols to form a VP. The visual cues to recall these click-cells (as in other click-based systems) consist of all or part of the figures and features of the image existing in the area of that particular cell. In addition, each click-cell is accompanied by a unique textual cue, as illustrated in Figure 1. The textual cue appears when the relevant cell is hovered over with the mouse.

270

IADIS International Conferences Web Based Communities 2011, Collaborative Technologies 2011 and Internet Applications and Research 2011

Figure 1. Visual and textual cues.

These cues are formed of short but informative sentences. It is essential to avoid confusing users by locating similar textual cues next to each other. For instance, if one cue mentions London being the capital of the UK, the click-cells next to it must contain different information that does not include keywords like London, capital or the UK. CCBS has been developed with the following assumptions. First, in response to the 'effective password space' problem: a uniform distribution of click-points (no hotspots) will be achieved via system-generated passwords while maintaining memorability. In this user study, we provide empirical evidence that system-generated passwords in CCBS do not cause memorability problems in comparison with the comparative user studies (i.e. CCP and Passpoints). Second, in response to the 'accuracy of password submission' problem: users can accurately select the intended click-points, hence the number of incorrect password submissions will be significantly reduced. Third, in response to the 'VP retention' problem: VP retention in CCBS will be significantly higher than in the comparative schemes, Passpoints and alphanumeric (text) passwords, in the long term.
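System-generated passwords sidestep hotspots because every click-cell is equally likely to be chosen. A minimal sketch of such a generator follows; this is our own illustration, not CCBS's actual implementation, and the cell count passed in is an arbitrary example.

```ruby
require 'securerandom'

# Draw each of the 5 click-cells uniformly at random from the whole cell
# space, so no cell (and hence no hotspot) is more likely than any other.
def generate_vp(total_cells, length = 5)
  Array.new(length) { SecureRandom.random_number(total_cells) }
end
```

A call such as `generate_vp(1700)` (an illustrative cell count) would return an ordered sequence of 5 cell identifiers for the user to memorise, with every cell in the space equally probable.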

4. EXPERIMENT METHODOLOGY

Experimental design. The experiment continued for 6 weeks and consisted of 3 sessions. Session 1 was undertaken during week 1, in which participants were introduced to the system and asked to create a new user account using a VP that was randomly assigned to them. This was followed by a learning task where the VP was requested multiple times. Participants were then asked to complete a questionnaire. Finally, they retained their VP for the first time. Sessions 2 and 3 took place during weeks 2 and 6 to retain the VPs again. Session 3 was followed by another questionnaire about their experience with the system.

Materials. The system was implemented based on the HybridPass prototype (al-Khateeb et al., 2009), but the text password interfaces were excluded to match the comparative study, Passpoints (Wiedenbeck et al., 2005b). The clickable area displays six different pictures and, in addition to the visual representation of click-points, textual cues are used. The same six images are used to create portfolios and random VPs for all users. The size of the clickable area was 230x100 pixels and the tolerance around the original click was set to 4 pixels, which represents each click-point with a 9x9 grid square. Hence, instead of returning the coordinates, the system calculates an identifier of the grid square containing the click-point. As such, the VP space is 1.4 x 10^16. A single computer was used in this experiment, with a screen resolution of 1280 x 800. The experiment included a questionnaire in which the perception of end users towards the system was measured.

Procedure. The experiment was completed individually. Participants were first introduced to the experiment with a 5-minute presentation. In session one, the registration form included input fields to capture the user ID and full name, and an input method to capture the VP.
However, VPs were not entered based on the participant's preference but rather randomly assigned to them. A unique VP formed of 5 click-points was shown to each participant during registration to adopt and use. They were asked to memorise these click-points and their order so as to select them again in the future. The registration form was validated using JavaScript, so the 'Submit' button can only be clicked once the ID and full name fields are filled in and exactly 5 click-points are selected. The following step is password confirmation: participants are asked to re-enter their VP one more time. If the two passwords do not match, users are asked to repeat their registration. The learning task consists of multiple password submissions until the participant succeeds in submitting the correct password 10 times. The correct password is shown after each incorrect submission. Then, to distract the participants from the system, they are asked to complete a questionnaire, followed by a login trial after around 30 minutes (R1). In session 2, users are asked to retain their passwords for the second time (R2). If the password is wrong they can try again; after five attempts users can see their correct password to refresh their memory. Finally, in session 3, password retention (R3) is followed by a questionnaire.
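The grid-square encoding described under Materials can be sketched as follows. This is our own illustrative code: the row-by-row cell numbering and per-image layout are assumptions, since the paper does not specify them.

```ruby
CELL = 9  # 9x9-pixel grid squares: a 4-pixel tolerance around the original

# Assumed layout: cells numbered row by row across a 230-pixel-wide image.
COLS = (230.0 / CELL).ceil

# Any click inside the same 9x9 square yields the same identifier, so the
# stored VP is a sequence of cell identifiers rather than raw coordinates.
def grid_id(x, y)
  (y / CELL) * COLS + (x / CELL)
end
```

Clicks at (10, 10) and (13, 13) fall inside the same square and therefore produce the same identifier, which is how the system tolerates small inaccuracies without storing exact coordinates.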



Participants. The comparative study had 20 participants taking part in the graphical password scheme; this study included the same number. Participants were computer science and business students who use computers on a regular basis. Most of them were Masters or PhD students. The mean age was 26.65 years (SD = 2.79) and the range was between 23 and 34 years. There were 11 females and 9 males in the sample.

5. ANALYSIS OF RESULTS

The results are evaluated against the comparative study, Passpoints (Wiedenbeck et al., 2005b).

5.1 Registration Phase

Table 1 compares the time (over all attempts) and the number of attempts required to register a new user account with the Passpoints scheme and the alphanumeric password from the comparative study. Password confirmation time is reported for CCBS only, because the comparative study did not include confirmation. Student's t-test was used to analyse and compare the results. The total number of attempts to register a new account was lower in CCBS. The difference was significant compared to the text password: t(38) = 10.22, p