Talking About Mobile Communication Systems

Talking About Mobile Communication Systems: Verbal Comments in the Web as a Source for Acceptance Research in Large-scale Technologies

Bianka Trevisan, RWTH Aachen University

Eva-Maria Jakobs, RWTH Aachen University

Abstract

In the research project HUMIC (RWTH Aachen University) we strike out in a new direction in acceptance research. In order to identify previously undetected acceptance factors in user-generated content, traditionally used methods are complemented by innovative methods of computational linguistics. Verbal comments from social media tools such as weblogs are analyzed with text mining methods with the aim of gaining access to user judgements. This methodology offers the possibility to learn how users perceive and conceptualize large-scale technologies, using the example of mobile communication systems.

Keywords: acceptance research, text mining, weblog, evaluation, user-generated content

Introduction

Nowadays, most new services (for example, self-services) are based on technologies such as the Internet and mobile communication systems. For most people in Western countries, using the Internet and mobile phones is a regular part of everyday life. Despite the high usage rates of mobile phones and other types of small-screen mobile devices, mobile communication systems are evaluated ambivalently. Most users like to use their mobile phone but reject the technical system behind it (masts, radiation and so forth) [1]. In the engineering sciences, there is a growing interest in users' views on technology and a growing willingness to take these into account when designing technology. This approach is currently missing in the area of complex technologies (for example, nuclear power, nanotechnology, gene technology). At RWTH Aachen University, the interdisciplinary research group "HUMIC" is working on solutions to fill this gap. In this project, the evaluation of acceptance is seen as an integral element of the development and extension of complex technical systems, using the example of mobile communication systems. It is funded by the Excellence Initiative of the German Federal and State Governments. The project's objective is to complement the development and operation of complex technologies such as mobile communication systems through the systematic integration of acceptability evaluations. In this project engineers and social scientists collaborate. The task of the social scientists is to develop holistic models of acceptance and appropriate methods for acceptance evaluation. The engineers focus on the mathematical modelling of acceptance as a relevant input for the technology design.

In the following, the project is described briefly. A key challenge for the modelling of acceptability is to find access to the view of user groups on technology as a whole (mobile communication system) as well as on individual technical components (mobile communication system vs. mast). Our project's contribution focuses on the reconstruction of the evaluation-relevant properties of technology. It combines linguistic theories and text-technological methods. The theoretical approach is based on the linguistic theory of evaluation and uses topic-related web discourses as its database. Text mining will be adapted for data collection and acquisition.

978-1-4244-8143-9/10/$26.00 ©2010 IEEE

HUMIC: Project's objective

The aim of the project is the mapping and integration of acceptance in the life cycle of large-scale technologies (for example, mobile communication systems). The life cycle extends over long periods of time and requires huge investments. It includes processes such as conception, planning, construction and configuration of complex technical systems. In the project, we want to examine where (process steps) and how (methods) acceptance issues can be and have to be considered in the system life cycle (see Figure 1). The overarching approach will be developed for mobile communication systems; in principle, it will be transferable to other complex technical systems in the future.

In this project, methods and approaches of various disciplines are combined. In cooperation with the engineering sciences, the humanities develop an integrated evaluation model for complex technical systems as well as a multi-methodological approach for the measurement of acceptance. To this end, we combine traditional methods of acceptance research with methods used in other fields such as computational linguistics and market research. Up to now, the use of text mining methods in acceptance research has been relatively uncommon, as research overviews show [2, 3]. In the interdisciplinary project HUMIC, we use different methods (see Figure 3). Text mining results are used as input for other methods.

Figure 1. Entanglement of disciplines and the life cycle of a complex technical system

The basic assumption of the project's approach is that acceptance can be modelled. With respect to complex technologies, a substantial challenge of modelling is the handling of many different factors and their interrelations: the technological system as a unit as well as an ensemble of components (see Figure 2), a huge user diversity and related perspectives on technology, a broad range of relevant societal and cultural parameters, and so on. The evaluation of complex technical systems is variable and changes over time; it depends on perceived or ascribed characteristics and their evaluation by individuals and social systems.

Figure 3. Multi-methodological approach and data flow

The basic idea behind this design is to compensate for the weaknesses of single methods by means of data triangulation and cross-validation. Depending on the method (for example, questionnaire vs. re-analysis) and the type of method (qualitative vs. quantitative), different problems arise. One of the major shortcomings of traditional methods such as questionnaires and focus groups is interview effects. For example, in non-anonymous survey situations respondents might not be willing to express their honest opinion and tend to give socially desirable answers (social desirability response set) [4]. Ascertaining user opinions and evaluations from unobserved, apparently "natural" settings may solve many of these problems. Text mining methods, in contrast, allow investigating users' thoughts about technologies by analysing topic-related "natural" discourses on the WWW. We expect that the analysis of extracted data will elicit new and unknown issues that have not been obtained by common methods of acceptance research.

Figure 2. Examples of relevant factors for user evaluation of complex technical systems (for example, mobile communication systems)

Reconstruction of user perspectives on technology

Modelling technology acceptance requires a deeper understanding of humans' perception and evaluation of technological products. In order to integrate people's views into the modelling process, we need to find access to their opinions about mobile communication systems. We argue that we will have the best access to value assessments by analyzing the way people evaluate objects in verbal acts. From a linguistic perspective, Sandig conceptualized the process of evaluation as an act in which a subject (for example, a young man) evaluates an object (for example, a mobile phone) with a certain purpose (for example, to buy it) at a certain moment (for example, in the year 2010) [5]. The person evaluates the object as a whole or in parts (for example, object features like the camera function). Depending on the person and the aim of the evaluation (for example, to buy something), some properties of the object are regarded as important and others are not (for example, the design) [5]. The important properties are measured by using certain measurements (for example, 3 to 8 megapixels) and by explicating the assessment (for example, with adjectives like little or much). The knowledge about evaluation issues, their comparison basis and measurements is part of the knowledge of social communities [6]. Measurements are time-, culture- and group-specific and can vary individually, depending on personal values, experiences, and attitudes [5].

A basic assumption of our approach is that verbal judgments are related to typical and specific word fields, recurring patterns and formulations, such as complex vocabulary units (for example, hazardous radiation), idioms (for example, cutting edge), and comparisons (for example, better than, less). They differ depending on language (for example, German, written vs. spoken language), text type (for example, weblogs), object (for example, large-scale technologies) and other factors [7]. The identification and analysis of recurring verbal terms and patterns allows the reconstruction of how users perceive technology in relation to evaluation issues and evaluation criteria.

Social media tools as gateway to user evaluations

Social media tools offer a very good research base for opinion mining and sentiment analysis. Blogs, forums, and similar tools serve as "social market places" where people meet to talk about topics they are interested in [8]. People post content with the intention of documenting their life, providing commentary and opinions, expressing deeply felt emotions, articulating ideas through writing, and forming and maintaining community forums [9]. Some weblogs are focused on a particular subject such as mobile communication systems. They offer authors the opportunity to express topic-related individual opinions, evaluations and experiences; readers can react to these with comments. The culture-related willingness of people to articulate their thoughts in public offers researchers an attractive and promising database for acceptance research [10]. It allows eliciting authentic, 'pure' data that reflect the attitudes of users. The advantage of social media tools such as weblogs is that everybody who wants to express themselves can do so anonymously (with a nickname), to any extent (blog post or comment) and in any way (positive, negative, neutral). Thus we argue that weblogs are a suitable database for the collection of user ratings.

Working hypothesis

Our work is based on the following hypotheses:

1. Verbal statements provide access to mental models, relevance setting and evaluations.
2. Internet discourses allow access to beliefs, attitudes and mental models of individuals and social groups.

Therefore, our methodological design focuses on the analysis of Internet discourses. In terms of acceptance, the analysis concentrates on the following questions:

• Which components of large-scale technologies (for example, mobile communication systems) are topics of interest in ongoing Internet discussions?
• Which properties of mobile communication systems or their components are discussed?
• On which rating scales are they measured?

Methodological approach: text mining

In our multi-methodological approach we integrate a method for the analysis of large text corpora called text mining (see Figure 3). In general, the term text mining refers to the machine-supported analysis of unstructured information (texts) with several "[…] techniques from information retrieval, information extraction as well as natural language processing (NLP) […]" [11]. The goal of the analysis is the automated identification of key concepts, trends and repeatedly occurring topics and topic relations in large text collections. The method provides access to personal opinions and evaluations of large-scale technologies by analyzing written verbal comments in social media tools, especially weblogs. The method consists of two parts:

1. Data collection: The choice of the database is guided by four criteria: language (a weblog written in German), theme (clear thematic proximity to the topic), extent (representative sample) and closeness (non-changing database).
2. Data analysis: The data analysis is accomplished in three steps: information extraction, information processing and intelligent analysis (see Figure 4).

Figure 4. Steps of data analysis

Information extraction

The concept of information extraction is defined as the search for word occurrences and recurring relations between words in texts. The extracts are nouns, including proper nouns. The extracted items are words (list entries). They refer to evaluation objects as well as to object components and properties. In general, information extraction can be done either manually or automatically. In manual extraction, all nouns occurring in the text are highlighted, extracted and listed. In automatic extraction, by contrast, this is done with the help of software tools (for example, PASW Modeler 13). Furthermore, texts can be analyzed either exploratively, that is, all nouns are extracted from the texts and listed, or parameter-based, that is, only those words are extracted that have been defined beforehand. The extracted words are recorded in libraries. The libraries contain further information on expressions, the grammatical structure of a language, and words that should not be extracted (for example, and, or). They also include records of typical expressions such as verbal units or abbreviations. Many software tools contain electronic libraries, which can be extended manually in the extraction process.

In the study, the data were extracted manually and exploratively. The decision for exploratory data analysis is based on the intention to let the data 'speak' in order to explore which topics are discussed and emphasized by the weblog users. The extracted nouns have to be organized in a domain-specific library. Each word and its occurrence frequency were recorded in a list (list entries). The list includes nouns and proper names (for example, of cities, countries, persons). A major challenge of the method is to extract and describe single words as well as complex idioms adequately. Distinguishing nouns that occur as single words from nouns that are embedded in a phrase (idiom) is problematic. If only a certain word (noun) of an idiom is extracted, not only the meaning of the word changes, but also the item ranking. The use of an exploratory sample provides the advantage of acquiring a domain-specific vocabulary (for example, related to mobile communication systems), which is required especially at the beginning of a text mining project. The collected vocabulary can be used as a starting point for subsequent parameter-based extractions according to defined words (for example, mobile phone, transmitter mast, antenna). Moreover, counting how often users refer to topics in weblogs can show which topics they are interested in.

Information processing

In the information processing step, the list of nouns has to be adjusted and structured. The list entries must be normalized, that is, the inflected word forms are reduced to a common grammatical form (normalization). The extracted words can be normalized either by lemmatization (dictionary-based method) or stemming (rule-based method) [12]. In lemmatization, forms are traced back to a lexicographic basic form (lemma). The declined noun is reduced to the nominative singular [13]. Inflections and umlauts are removed. Stemming describes the reduction of a word to its stem. Inflectional and derivational morphemes are deleted. In our study, nouns have been lemmatized. The decision to lemmatize has two reasons. First, stemming is only recommended for languages with few inflections, such as English, where the stem is retained in plural forms and word formation [14]. The German language has too many inflections and compounds for nouns to be stemmed correctly. Secondly, the chosen software (PASW Modeler 13) does not support stemming.

The normalized list entries have to be categorized. For this purpose, semantically related words are organised in word fields (for example, radiation). Word fields are described or categorized by semantic categories such as SYSTEM, PART OF, OBJECT, MOBILE PHONE. The categories are organized in a hierarchical manner. Sometimes the assignment of a word to a certain word field is difficult because of phenomena like polysemy (for example, bank, march) and homography (for example, mobile, live). When the list entries were categorized, the assignments to categories were checked with the representatives of the various disciplines in the project. The representatives categorized from their own discipline-specific perspective, but their categorization is not transferable to blog content. The specific perspectives do not represent those of bloggers who post in weblogs and discuss topics in forums. Thus, the categorization should take the perspective of bloggers. Nevertheless, the expert categorization allows a subsequent comparison of a technical and a non-technical terminology categorization and shows the differing conceptualizations of mobile communication systems depending on the respective perspective.
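The extraction, normalization and categorization steps described above can be illustrated with a minimal Python sketch. It is not the procedure used in the study (extraction was manual, and lemmatization relied on PASW Modeler 13); the stop list, the lemma dictionary, the word-field assignment and all function names are hypothetical and only serve to show the data flow from noun tokens to counted list entries.

```python
from collections import Counter

# Hypothetical libraries; a real study would rely on a full lexicon.
STOP_WORDS = {"und", "oder"}            # words that should not be extracted
LEMMAS = {                              # inflected form -> lemma (dictionary-based)
    "strahlungen": "strahlung",
    "handys": "handy",
    "sendemasten": "sendemast",
}
WORD_FIELDS = {                         # lemma -> word field (assumed assignment)
    "strahlung": "radiation",
    "sendemast": "transmitter mast",
    "handy": "mobile phone",
}


def normalize(token):
    """Lower-case a token and map it to its lemma by dictionary lookup."""
    key = token.lower()
    return LEMMAS.get(key, key)


def build_list_entries(noun_tokens):
    """Count how often each lemma occurs, skipping stop words."""
    counts = Counter()
    for token in noun_tokens:
        lemma = normalize(token)
        if lemma in STOP_WORDS:
            continue
        counts[lemma] += 1
    return counts


def count_by_word_field(entries):
    """Sum list-entry frequencies per word field; unknown lemmas stay visible."""
    fields = Counter()
    for lemma, n in entries.items():
        fields[WORD_FIELDS.get(lemma, "UNCATEGORIZED")] += n
    return fields


# Assumed input: nouns already identified in a blog post (manually or by a tagger).
tokens = ["Strahlung", "Strahlungen", "Handys", "Sendemasten", "und", "Handy", "Antenne"]
entries = build_list_entries(tokens)
fields = count_by_word_field(entries)
print(entries.most_common())
print(fields.most_common())
```

Keeping uncategorized lemmas in a separate bucket makes it easy to see which extracted words still need a word field, which matters when the vocabulary is collected exploratively.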

Intelligent analysis

The intelligent analysis consists of two steps: ranking and the identification of user assessments.

Ranking: When the information processing is completed, the results (list entries, word fields and categories) are analysed by counting occurrences of list entries. Frequency analysis is used to build rankings. Rankings allow statements about which topics are discussed more or less often in the samples. The total number of occurrences demonstrates which item was discussed most frequently by bloggers. Highly ranked entries dominate the bloggers' talk (for example, mobile communication systems). They have the strongest emphasis in the corpus. The procedure of relative weighting allows comparing list entries per category (category-related ranking), list entries of a certain word field (word field ranking) and upper categories (upper category ranking). The comparison of upper category rankings shows which domains (for example, science, society) and topics of a domain (for example, research, health, law, social aspects of the use of mobile communication systems) are weighted most strongly.

Assessment: The identification of user assessments and opinions consists of three steps:

1. Identification of co-occurrences and their analysis by frequency and similarity: The similarity between parts of co-occurrences and the strength of their connection are calculated by a similarity coefficient (SC):

$$SC = \frac{C_{ij}^{2}}{C_i \cdot C_j}$$

The number of documents with co-occurrences ($C_{ij}$) is compared with the occurrence score of list entries per document ($C_i$, $C_j$). The strength is described by values between 0 (words never appear together) and 1 (words always appear together). Co-occurrences are defined as regular when they often occur together in a pre-defined text window. A text window can be limited syntactically (a sentence, a section or a chapter) or numerically (a defined number of words). In the study the text window is defined by text section.

2. Identification and analysis of lexical context elements: For each occurrence the local lexical context is identified and analysed (left and right, for example, hazardous radiation, radiation relevant). The aim is to elicit evaluative linguistic expressions (for example, adjectives like good, hazardous). Evaluative linguistic expressions can be used for the reconstruction of evaluation-relevant object properties and rating scales.

3. Interpretation: The results will be correlated and compared with the aim of being able to make statements about their meaning and content.

Pre-test

In the following, the setting and results of a pre-test are reported. It aims to test and validate the methodology. In the pre-test two samples were used. The first test sample is the weblog www.elektrosmogblog.de. In his blog posts the author deals with the effects of electromagnetic radiation on living organisms. He states that he launched the weblog out of personal interest in the topic and because he wants to compare notes with other stakeholders and interested parties. He describes his attitude to technology and technology impacts as sceptical, but not hostile. The sample contains 63 blog posts and 28 user comments. All texts are dated between May and June 2008 (accessed 25/08/2009). The texts were saved and tagged (year of publication, sample). The weblog content is organised in eight defined categories. Each blog post is assigned to one of these categories. Some blog posts are heavily discussed, others are not. The distribution of comments across blog posts corresponds to what is frequently encountered in weblogs. It shows that some issues interest readers more than others.

The results obtained in the first test were checked against a second sample. The data were collected randomly from the Heise Mobile weblog (www.heise.de/mobil/, accessed on 11/02/2010). Heise is a news site providing information about technology, on which several weblogs and forums are hosted. The authors are professional editors; readers often comment on their contributions. The sample consists of 11 blog posts and 1418 comments from the year 2008. The samples were subsequently analysed. Text words were extracted from the samples, processed and ranked. In the pre-test, user evaluations have not been extracted.
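The co-occurrence step of the assessment can be sketched in Python as follows. The sketch assumes that each text window (here: a text section) has already been reduced to the set of list entries it contains, and it reads C_i as the number of windows containing entry i and C_ij as the number of windows containing both entries; this reading, the function names and the example sections are assumptions made for illustration, not details taken from the study.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sections):
    """Count windows per entry (C_i) and windows per entry pair (C_ij).

    `sections` is an iterable of collections of list entries, one per text window.
    """
    single = Counter()
    pair = Counter()
    for entries in sections:
        unique = set(entries)               # count each entry once per window
        single.update(unique)
        for a, b in combinations(sorted(unique), 2):
            pair[(a, b)] += 1               # sorted order gives one key per pair
    return single, pair


def similarity_coefficient(c_ij, c_i, c_j):
    """SC = C_ij^2 / (C_i * C_j); 0 means never together, 1 means always together."""
    if c_i == 0 or c_j == 0:
        return 0.0
    return (c_ij ** 2) / (c_i * c_j)


# Hypothetical text sections, each reduced to the list entries it mentions.
sections = [
    {"radiation", "transmitter"},
    {"radiation", "transmitter"},
    {"radiation"},
    {"household"},
]
single, pair = cooccurrence_counts(sections)
key = ("radiation", "transmitter")
sc = similarity_coefficient(pair[key], single["radiation"], single["transmitter"])
print(key, pair[key], round(sc, 2))
```

Under this reading C_ij can never exceed C_i or C_j, so the coefficient stays between 0 and 1, in line with the value range given in the text. A pair that occurs in only two windows can still reach SC = 1, which is why the co-occurrence count should always be reported next to the score.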

Results and discussion

The pre-test results show that the method permits clear statements on topic preferences. Statements can be made about which components and properties of mobile communication systems are discussed. The comparison of the results gained from user-generated content with topic statements of an acceptance study based on a CATI survey (Computer Aided Telephone Interviews) [15] shows that different aspects of mobile communication systems are discussed. In the long-term survey (2003-2006) by Belz it was found that the topic of transmitter masts is perceived as particularly threatening by the population. Transmitter masts were most frequently mentioned as a concern and source of impairment in all survey periods (except 2004). Respondents (n=10200) reported that they are afraid of the consequences of electromagnetic radiation, such as headaches and sleep problems.

In the following, the results of the pre-test are reported: first, the results of the ranking of elektrosmogblog.de (sample one), and secondly, the findings of the ranking of the Heise Mobile weblog (sample two). The category ranking of the sample elektrosmogblog.de shows that the most frequently discussed topic is the category MOBILE COMMUNICATION SYSTEMS (n=2629) and the second-ranked topic is the category HEALTH SECTOR (n=1422). The category MOBILE COMMUNICATION SYSTEMS includes topics dealing with the signs and effects of mobile communication systems as well as with components of cell systems and related devices. The most discussed feature is the phenomenon of electromagnetism. The word field electromagnetism (n=1013) has the most list entries. In this word field the highly ranked list entries are radiation (n=291), field (n=229) and environmental stress (n=137).

The results obtained with the first sample were compared with the second sample. In the second sample the topic of mobile communication systems is discussed from another perspective. The users of the weblog are more interested in how mobile communication systems and their components influence people's private lives. In the analysis, two main issues were identified that were not mentioned in earlier acceptance literature. First, there were several articles that dealt with the expansion of mobile phone areas into the airspace. In their comments users discuss in particular the use of mobile devices such as mobile phones and laptops on airplanes and networking in the air. The word field ranking shows that the top seven ranked word fields are related topics. The most mentioned word field is caller (n=15). The second-ranked word field is behaviour (n=13); a main aspect is the behaviour of airlines with respect to the use of mobile phones in airplanes. The most often mentioned list entries were airspace over Europe (n=221), notebooks (n=76), and mobile phone area (n=75) (see Table 1).
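The figures reported for the word field electromagnetism in sample one can be put in proportion with a short Python sketch. It assumes that relative weighting means the share of a list entry in the total of its word field; the paper does not spell out the formula, so this definition and the function name are assumptions.

```python
def relative_weights(entry_counts, total):
    """Rank entries by frequency and attach their share of the given total."""
    ranked = sorted(entry_counts.items(), key=lambda item: item[1], reverse=True)
    return [(name, n, n / total) for name, n in ranked]


# Counts as reported for sample one (elektrosmogblog.de).
electromagnetism_total = 1013
top_entries = {"field": 229, "radiation": 291, "environmental stress": 137}

ranking = relative_weights(top_entries, electromagnetism_total)
for name, n, share in ranking:
    print(f"{name}: n={n}, share={share:.1%}")
```

Expressing entries as shares rather than raw counts makes word fields of different sizes comparable, which is the purpose of the category-related and word field rankings.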

Table 1. Top ten ranked list entries (Heise sample 2008).

| List entry | n |
|---|---|
| airspace over Europe | 221 |
| notebooks | 76 |
| mobile phone area | 75 |
| transmitter | 26 |
| panic | 24 |
| household | 17 |
| mobile label | 16 |
| personal data | 15 |
| police | 15 |
| radio installation | 13 |

The second frequently discussed topic deals with the surveillance of private mobile phone data. The top-ranked word fields are data (n=6) and reader (n=6), the latter covering scanner and data reader. In the ranking, the list entry transmitter (n=26) is ranked before household (n=17), personal data (n=15), radio installation (n=13) and electromagnetic radiation (n=9). In the case of the second sample we have complemented our methods with the analysis of co-occurrences. The analysis shows that the issues mentioned above are highly related topics. Table 2 gives a summary of the top-scored co-occurrences (n) of the survey.

Table 2. Range of co-occurrences (Heise sample 2008).

| Text word 1 | Text word 2 | n | SC |
|---|---|---|---|
| airspace over Europe | mobile phone area | 74 | 1 |
| airspace over Europe | notebooks | 74 | 0.97 |
| transmitter | household | 11 | 0.31 |
| radio installation | electromagnetic radiation | 9 | 0.75 |
| household | radio installation | 9 | 0.42 |
| household | electromagnetic radiation | 8 | 0.44 |
| radio labels | personal data | 5 | 0.83 |
| electromagnetic induction | sharks | 4 | 1 |
| panic | nuclear power plants | 4 | 0.5 |
| high ionizing | gamma rays | 2 | 1 |
| internet telephony | cell phone blatherer | 2 | 1 |
| internet telephony | behaviour of lufthansa | 2 | 1 |
| internet telephony | caller | 2 | 1 |
| umts cell phones | harmfulness of radiation | 2 | 1 |
| wireless data | motion profile of cell phone owner | 2 | 1 |
| trailing scent in the mobile phone network | motion profile of owners | 2 | 1 |
| trailing scent in the mobile phone network | wireless data | 2 | 1 |
| shopping cart with rfids | record | 2 | 1 |
| electromagnetic induction | atom | 2 | 0.5 |
| nuclear power plants | atom | 2 | 0.5 |

The relation of airspace over Europe and mobile phone area has a similarity score of 1 with a total of 74 co-occurrences. The relation of transmitter and household was often extracted in the corpus, with eleven co-occurrences, but has a low similarity score (SC=0.31). More significant were the relations between wireless data and motion profile of cell phone owner (SC=1) as well as wireless data and trailing scent in the mobile phone network (SC=1) (for further co-occurrences see Table 2). The data do not only show which topics are often discussed in the weblogs, but also provide clues to the evaluation of relevant topics. Thus, the term electromagnetic induction was used in several cases with atom (SC=0.5). The relation of atom and nuclear power plant as well as that of nuclear power plant and panic also showed a similarity score of 0.5. The linked topics suggest that users compare their fear of mobile communication systems with fears related to nuclear power plants; in this context electromagnetic radiation gets a negative connotation. This is also reflected in the strong association of high ionizing and gamma rays (SC=1), but the strength of the relation must be investigated further, as only two co-occurrences were found.

The analysis allows conclusions about the impact of mobile communication systems on private life. Bloggers' concerns are related to their personal security, for example in air traffic, as well as to the impact of mobile communication systems on their privacy. People are afraid of being subject to surveillance with the help of mobile phone data (from cell phone or laptop) and of losing their privacy by using mobile devices. By analysing larger samples, it can be examined whether the topics discussed on the Internet differ from the themes mentioned in interview situations. If they differ, text mining proves to be a valuable complement to established methods of acceptance research.

Re-design

The pre-test showed two problems in the methodological approach:

1. Noun vs. idiom extraction: In the information extraction, nouns as single words and nouns as part of idioms have to be distinguished and separately extracted.
2. Selective categorization: The prepared categories are not clear-cut; single words can be assigned to several categories. For the ranking, the items must be clearly specified.

The problems will be solved by two modifications in the methodology. First, a library with language-typical and text-type-specific idioms has to be built up. Idiomatic expressions need to be abstracted based on the literature (for example, German idiom dictionaries). Furthermore, recurring user formulations (blogger-specific vocabulary and idioms) will be extracted and integrated in the library. Hence, idioms can be directly identified in the analysis and extracted as related units. Secondly, a domain-specific library will be built up. Only identified occurrences will be categorized in the analysis. The allocation of parts of co-occurrences (for example, words) to categories depends on characteristics of the co-occurrence partner.¹ The domain-specific semantic meaning of an occurrence is thus defined through its relation. The described re-design offers several advantages:

• A broad categorization of all extracted data is not necessary. Only text words that often appear together across multiple corpora are domain relevant. They have to be individually assigned to categories.
• If a word appears in a relation that was never captured before and is never captured afterwards, it is negligible in the case of a long-term study. A larger number of samples will allow systematically emerging items to be identified.
• We adopt the perspective of bloggers and blogger communities through the bottom-up categorization.

Outlook

Future analysis and evaluation of large corpora will increasingly address the questions of which rating scales evaluations of subjects are based on and which relationships between topics recur. Our methodological approach will be re-designed for further surveys according to these proposals.

References

[1] Renn, O. Technikakzeptanz: Lehren und Rückschlüsse der Akzeptanzforschung für die Bewältigung des technischen Wandels. Technikfolgenabschätzung - Theorie und Praxis (TaTuP). 14: 29-38, 2005.

[2] Quiring, O. Methodische Aspekte der Akzeptanzforschung bei interaktiven Medientechnologien. Münchner Beiträge zur Kommunikationswissenschaft. 6: 1-29, 2006.

[3] Tran, T.A. and T. Daim. A taxonomic review of methods and tools applied in technology assessment. Technological Forecasting & Social Change. 75: 1396-1405, 2008.

[4] Schnell, R., R.B. Hill, and E. Esser, Methoden der empirischen Sozialforschung, R. Oldenbourg Verlag, München, Wien, 1999.

[5] Sandig, B. Formen des Bewertens. Anabasis. Festschrift für Krystyna Pisarkowa, Lexis, Kraków, 279-287, 2003.

[6] Jakobs, E.-M. Bewertungsperspektiven auf Websites. Befunde aus den Bereichen: Arbeit, Lernen und Freizeit. 6: 71-86, 2005.

[7] Sandig, B. Formeln des Bewertens. EUROPHRAS 90. Akten der internationalen Tagung zur germanistischen Phraseologieforschung, Almqvist & Wiksell, Uppsala, 227-252, 1991.

[8] Pang, B., L. Lee, Opinion Mining and Sentiment Analysis, Now Publishers, Boston, 2008.

[9] Nardi, B. A., D.J. Schiano, M. Gumbrecht, and L. Swartz. Why we Blog. Communications of the ACM. 12: 41-46, 2004.

[10] Smith, A., Joshi, A., Liu, Z., Bannon, L., Gulliksen, J. and C. Li. Institutionalizing HCI in Asia. INTERACT. 4663: 85-99, 2007.

[11] Hotho, A., A. Nürnberger, and G. Paaß. A Brief Survey of Text Mining. Zeitschrift für Computerlinguistik und Sprachtechnologie. 12: 19-62, 2005.

[12] Arampatzis, A.T., T. Tsoris, D.H.A. Koster, and Th.P. van der Weide, Phrase-based Information Retrieval, Computing Science Institute, Nijmegen, 1998.

[13] Engelberg, S., and L. Lemnitzer, Lexikographie und Wörterbuchbenutzung, Stauffenburg, Tübingen, 2001.

[14] Nohr, H., Grundlagen der automatischen Indexierung: Ein Lehrbuch, Logos Verlag, Berlin, 2003.

[15] Belz, J., Ermittlungen der Befürchtungen und Ängste der breiten Öffentlichkeit hinsichtlich möglicher Gefahren der hochfrequenten elektromagnetischen Felder des Mobilfunks - jährliche Umfragen, Bundesamt für Strahlenschutz, Bonn, 2007.

About the Authors

Bianka Trevisan, MA, is a research assistant in the research area of textlinguistics and technical communication. Her research focuses mainly on the analysis and evaluation of text corpora (data, text and web mining), the evaluation of verbal content in social media tools, and the use of semantic principles for information access and information processing. In her PhD work she deals, from a linguistic perspective, with methods for the collection of user evaluations.

Eva-Maria Jakobs obtained her PhD degree in linguistics from the University of Greifswald. Since 1999 she has been full professor in textlinguistics and technical communication at RWTH Aachen University, Germany. In 2005 she became a member of the German Academy of Engineering (acatech). Eva-Maria Jakobs leads the programme in technical communication and is director of the Institute for Industrial Communication and Business Media. Her main research fields are technical and business communication, textlinguistics, writing at work, age and technology, usability testing, and electronic media.

¹ We would like to thank Dr. Marc Kupietz and Rainer Perkuhn from IDS Mannheim who supported us in finding solutions.
