
AN EXPLORATIVE, OPERATIONAL METHOD SUPPORTING USABILITY EVALUATION OF TECHNOLOGY CHANGES IN WORK CONTEXTS

Lena Pareto & Ulrika Lundh Snis
Laboratory of Interaction Technology, University West
SE-461 86 Trollhättan, Sweden
[email protected], [email protected]

ABSTRACT

In this paper we elaborate on the process of designing usability evaluations of technology that is to be integrated into specific real work settings, and where technology is the main target for change. Our research question relates to both what to evaluate and how to evaluate, and the ambition is to support evaluation designers on a practical level. We propose an explorative, operational method for usability evaluation design, which makes the design process explicit and design decisions more tractable. The evaluation design method is illustrated by an empirical example of how to design a comparative usability evaluation study in a health care setting. The method supports the designer on a conceptual level by describing the different levels in the evaluation design; on an operational level by defining transitions between levels; and on the analytical level by providing a template for a usability matrix, which allows for an explorative approach to constructing the conceptual content as well as a structured way to define and refine usability measures. However, the method needs to be further tested in other settings.

KEY WORDS
Usability, Evaluation design, Explorative framework

1 Introduction

Organizations spend billions of dollars introducing new technology into their businesses based on weak decision information, which may create many unanticipated problems. On a general level, the importance of usability evaluations is apparent in order to make more well-founded decisions and to better anticipate the consequences of technological change. This has long been a central issue for practitioners and researchers within the human-computer interaction (HCI) and computer-supported cooperative work (CSCW) fields, who aim to build knowledge about evaluation approaches in general, and in work-contextual settings in particular [1,2,3,4,5,6]. However, frameworks and models for usability evaluation seldom address the practical aspects of conducting a particular usability evaluation. For instance, [7] presents several examples where seemingly innovative and reliable systems have failed when introduced into specific and critical organizational contexts.

According to McGrath [8], researchers try to maximize generalizability, precision, and realism. When designing an evaluation study for a given practical situation, however, generalizability is not the primary concern. In [5] there is a particular call for more work on evaluation frameworks that address the integrated problem of knowing what to evaluate and how to conduct evaluation studies. They classify the research literature into three classes of evaluation frameworks: (i) Method-oriented frameworks describe the types of experiments and methodologies available for a general understanding, but provide little guidance for choosing among different types of methods. (ii) Conceptual frameworks describe group factors such as group characteristics, group processes and group outcomes, relevant to CSCW applications in particular. These provide conceptual understanding, but give little guidance in evaluation design. (iii) Concept-oriented frameworks describe how specific methods can be used to measure concepts like communication effectiveness, awareness, or trust, which supports knowing how to evaluate but not which usability issues to focus on or how to understand them in a particular work context. Our research question relates to both what to evaluate and how to evaluate. The goal of this research is to develop an explorative, operational method for usability evaluation design, with the main purpose of supporting the evaluation design process where technology is the main target for change (rather than work practices, work processes or organizational change). Our ambition is to support evaluation designers on a practical level in the process of designing their evaluation studies in an explorative, yet structured and analytical, manner such that high-level evaluation objectives are traceable all the way down to data-collection elements. The evaluation design method is illustrated by an empirical example of how to design a comparative usability evaluation study in a health care setting. The purpose of the study is to compare existing technology with proposed new technology to identify potential gains as well as potential problems with the proposed new technology.

2 An Explorative Method for Usability Evaluation Design

We have focused on work-contextual situations, since specific evaluation studies are often performed in such contexts. We take our starting point in the incitement for change in an organization together with organizational goals, since evaluations are often initiated because of a need for change. We focus on the process of designing usability evaluation studies, and the purpose of the method is to support the designers in their activities. The method consists of three parts: (i) a model of evaluation levels for structuring the activities, (ii) a usability matrix for exploring the usability aspects, and (iii) a process for designing the evaluation study.

2.1 A conceptual model of evaluation levels

Our approach is a top-down, structured refinement of the evaluation objective down to the specific measures in terms of data collection methods and data to be collected, but where the transition activities from one level to another are of an explorative nature. We are much inspired by the evaluation framework presented in [6] regarding the different levels in an evaluation design, and have therefore augmented their picture of the five levels in figure 1 below. The levels are, from top to bottom: system goals, evaluation objectives, conceptual metrics, conceptual measures and implementation-specific measures. We have also included, to the right in figure 1, related terms that we use in our description.

Figure 1. Model of evaluation levels (modified from [6])

Change incitements and organizational goals: The first thing for an evaluation designer to do is to identify the purpose of change and the target of evaluation. The purpose of the change indicates in what direction the organization wants to move, and this is important for the evaluation as a whole. The target of evaluation refers to what the organization chooses as an agent for change to accomplish the desired goal. In this work we focus on technological changes, i.e., introducing new applications, devices or platforms, rather than organizational or work process change. However, there is nothing fundamental about this choice, and we believe the method can be adapted to focus on other aspects in the context.

Evaluation objective and usability judgements: The evaluation objective is the concrete formulation of the purpose of the evaluation study to be performed, and can often be deduced from the change incitement and the target of evaluation. However, it is important to clarify the objective of the evaluation so that all involved parties are aware of it. Evaluation objectives often involve ungraspable concepts, such as the potential of something or the impact on something, and need to be transformed into statements that are easier to judge. We refer to these as usability judgements.

Usability issues (metrics): The usability issues are more concrete metrics of usability that ought to be decomposed from the usability judgements, which reflect the evaluation objective. There is seldom an obvious decomposition of the usability judgements, so this activity requires a great deal of consideration. The goal is that the decomposition covers all important aspects of the usability judgements and that the issues are mutually exclusive.

Usability measures: The usability issues are transformed into measures, which are entities that can be measured, first in principle, and then instantiated with a concrete data collection method and concrete sets of data to be collected. Both the usability issues and the usability measures are documented in the usability matrix.

2.2 A usability matrix

The usability matrix has two purposes: to support the explorative process of deciding which usability aspects to focus on in the evaluation, and to document the structural refinement process of successively transforming usability issues into conceptual and concrete measures. As figure 2 shows, two axes span the matrix: the work-contextual components and the usability aspects.

Figure 2. The usability matrix. One axis lists the work-contextual components (the entities we may observe); the other lists the usability aspects (the usability judgements we want to make).
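As a rough, hypothetical sketch (not part of the method itself), the matrix can be represented as a simple data structure in which each cell, indexed by a (component, aspect) pair, holds the usability issues identified for that unit of analysis; the component and aspect names below are only illustrative:

from collections import defaultdict

# Illustrative names only; a real study defines its own components and aspects.
components = ["environment", "work situation/process",
              "technology-supported task", "users and other interest groups"]
aspects = ["mobility", "ergonomics", "use efficiency", "user satisfaction"]

# Each cell (component, aspect) is a unit of analysis holding usability issues.
matrix = defaultdict(list)

def add_issue(component, aspect, issue):
    """Record a usability issue in the cell spanned by a component and an aspect."""
    assert component in components and aspect in aspects
    matrix[(component, aspect)].append(issue)

# Example from the text: readability may depend on the room's lighting conditions.
add_issue("environment", "use efficiency",
          "readability of the screen under varying lighting conditions")

# Empty cells show which combinations have not yet been considered, which
# supports the explorative construction of the matrix.
uncovered = [(c, a) for c in components for a in aspects if not matrix[(c, a)]]
print(len(uncovered), "cells still to consider")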

The work-contextual components are the observable entities; they denote components in the working environment that affect the usability in some way. The usability aspects are the decomposition of usability issues that constitutes the overall usability judgements we want to make in the evaluation. The process of constructing the matrix is useful for exploring the relevant work-contextual components for the particular situation, as well as for determining which set of usability aspects is relevant and reasonable to use in the evaluation. The elements in the matrix are the units of analysis and function as sources of inspiration, as they relate particular usability aspects to particular contextual components. For instance, if we decide that readability is an important usability issue, we may relate it to the physical environment and determine that the room's lighting conditions are an important factor that may affect readability. A usability matrix is presented in our illustrating example later on.

2.2.1 Work-Contextual Components

The work-contextual components, the observable factors in the context, are divided into the following categories: Environment, Work situation/process, Task technology supports, and Users and other interest groups. In each category, we list areas that may be of interest and that are considered for analysis. To what extent these areas are of major concern depends on the particular case. We find it useful to divide the areas of interest into properties that are static (constant over some time) or dynamic (situation-dependent) in nature. Static properties can be seen as preconditions for the usability analysis, whereas some aspects of usability may depend on dynamic properties. For instance, the physical room in which an activity is performed may be seen as a static property, whereas the light in the room is dynamic, may vary between use occasions, and may affect the readability of a screen.

Environment: In the environment category, aspects concerning integration into the environment are considered, and the areas of interest are physical, technical and social aspects of the environment. The physical surroundings may be static, or varying if the activity is performed in many different places. Technical aspects of the environment include other technical equipment, artifacts and systems that the technology should be integrated with, or that affect the task in focus or the users. The social environment includes workplace practices and cultures.

Work situation and process: Areas of interest concerning the overall work situation include overall workload (type of work, number and co-occurrence of activities and tasks), overall mental and cognitive load (for instance stress, shortage of time, critical situations, responsibilities), organizational issues, labor division, and more, depending on the particular case. A complete work situation is complex by nature, and the task or activity in focus is often just a small part of it. Therefore, we choose to separate the two as distinct units of analysis.

Technology-supported task: Areas to consider when adapting technology to the task include, for instance, the task's relative importance compared to other co-existing tasks, the frequency of use, the effect and frequency of disruptions and interruptions, and structural or cognitive aspects of the task itself.

Users and other interest groups: Integrating technology includes adapting it to users and other interest groups. Areas of interest are users' expectations, their experience and knowledge, and physical and mental circumstances such as pain, stress, left-handedness or impaired sight. Pain and stress, for instance, may be constant over a period, i.e., static, or particular to an occasion, i.e., dynamic. The success of technology integration depends highly on the users' attitude towards technology and on how the integration is implemented, and these are therefore also areas of interest for the analysis.

2.2.2 Usability Aspects

Which usability aspects to consider differs from one evaluation to another. The concept is complex, and there is no single categorization that is accepted and used by everyone in the field [9]. It is a task for the designer to determine a relevant set of usability aspects, and there are different approaches to this task:
1. Choose a standard set of usability categories, for instance one of those found in [9, 4, 11, 12]. This is usually recommended.
2. Construct your own set by combining standard categories. A categorization ought to cover all relevant aspects with mutually exclusive categories.
3. In comparative evaluation studies, the set of usability categories can be constructed from the comparison to be made, i.e., by deducing relevant aspects that reflect the matters that differ in the comparison. An example of such a construction can be found in the illustrating example below.
Which particular categorization is chosen may be a matter of taste, but it is important to ensure that it is as complete as possible and that it stimulates one's imagination during the exploration of usability issues.

2.3 A process for designing the evaluation study

The process we propose is a structured way of designing an evaluation study with our suggested method. It allows for a combination of an explorative and an analytical approach, which we think is essential for any design activity. The process proceeds as follows:
1) Identify the target of evaluation (the aspect of technology that is to be evaluated).
2) Formulate the evaluation objective and transform it into usability judgements.
3) Construct the usability matrix:

   a) Identify relevant factors in the work-contextual components (suitable methods are in-context observations and interviews).
   b) Select a set of relevant usability aspects to consider.
   c) Fill the units of analysis with usability issues of interest. Here the matrix is useful as a source of inspiration and as a check that all combinations of work-contextual components and usability categories are considered as potential generators of relevant usability issues.
   d) Determine which usability issues are of main concern for the evaluation (metrics: the conceptual aspects we want to measure).
   e) Consider plausible dependencies between the work-contextual component factors and the usability issues (this is important input when deciding which data should be collected and analyzed in the evaluation).
   f) Refine the usability issues into measures.
   g) Choose the data to be collected and the methods to collect them. Here, the matrix can be used for visualizing the dependencies between measurements and factors in the contextual components, which can guide the analysis of the data.
4) Refine the matrix until it is consistent and complete.
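To indicate how the resulting refinement chain could be recorded so that every collected data item remains traceable to the evaluation objective, the following minimal Python sketch gives one hypothetical representation; the class and field names, and the example entries loosely based on the health care case below, are our own illustration rather than part of the method:

from dataclasses import dataclass, field
from typing import List

@dataclass
class Measure:
    conceptual: str           # the conceptual measure: what to measure, in principle
    data_collection: str      # implementation-specific: how the data is collected
    data: str                 # implementation-specific: the concrete data to collect

@dataclass
class UsabilityIssue:
    component: str            # work-contextual component (matrix row)
    aspect: str               # usability aspect (matrix column)
    description: str          # the usability issue (conceptual metric)
    measures: List[Measure] = field(default_factory=list)

@dataclass
class UsabilityJudgement:
    statement: str
    issues: List[UsabilityIssue] = field(default_factory=list)

@dataclass
class EvaluationDesign:
    objective: str
    judgements: List[UsabilityJudgement] = field(default_factory=list)

# Hypothetical instantiation, loosely based on the health care example below.
design = EvaluationDesign(
    objective="potential of mobile vs. stationary documentation technology",
    judgements=[UsabilityJudgement(
        statement="what is gained by the added mobility",
        issues=[UsabilityIssue(
            component="task technology supports", aspect="mobility",
            description="extent to which the local mobility of the device is utilized",
            measures=[Measure(
                conceptual="variation in placement in the room during documentation",
                data_collection="reaction-study questionnaire after each task",
                data="self-reported placement of the device")])])])

# Traceability: every concrete data item can be traced back to the objective.
for judgement in design.judgements:
    for issue in judgement.issues:
        for measure in issue.measures:
            print(design.objective, "<-", judgement.statement,
                  "<-", issue.description, "<-", measure.data)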

3 Application of the Proposed Method

The example we are going to illustrate refers to an evaluation study of introducing mobile technology in a health care setting. Related studies are found in, for instance, [13, 14, 15, 16, 17, 18, 19, 20]. The study was conducted at the central surgery department of a hospital in West Sweden. The empirical results of the evaluation study are reported in [21]. The purpose of the study is, in particular, to evaluate the usability aspects that can be derived from the mobile technology alone, and thereby judge the potential of replacing stationary technology with a mobile counterpart. Therefore, other factors such as the software used, the users involved, and the environment are kept as constant and similar as possible, so that the technical device is the only varying factor in the comparison. The organizational goal is to use the results of the study for two purposes: 1) as input to the organizational decision on whether mobile devices should replace the existing stationary technology, and 2) as a starting point for requirement gathering regarding adaptation of the software interfaces for the mobile device.

The task in focus for the study is patient record documentation, which is an important part of a hospital's operational activities due to patient safety, communication between personnel, and planning and organization, and which is regulated by law. Yet, the care of patients is always the primary task, so documentation has to be performed whenever possible, between care-taking activities. Today, most of the patient recording at the department is computerized, but at stationary workstations: there are stationary computers in all surgery rooms and at other strategic places. However, surgery patients are not stationary; they need to be moved around. The aim is to achieve a complete workplace physically close to the patient at all times, where all activities can be performed. This calls for a mobile device for documentation activities, and thus this comparative study, evaluating a mobile Tablet PC as a documentation device, was performed.

According to our proposed explorative method, the following were defined during the evaluation design:

Change incitements and organizational goals: The change incitements are to improve the quality of the care-taking work as well as to better support the nurses in their documentation work. The organizational goal is to achieve a more patient-oriented workplace, which includes being able to perform documentation work while monitoring the patient; therefore the mobile technology is of interest.

Evaluation objective: To evaluate the potential of a new, mobile technology as an alternative to stationary technology for achieving a more patient-oriented workplace, i.e., to perform a comparative study.

Usability judgements: We have interpreted (transformed) this evaluation objective into judging what is gained by the added mobility, and what is perhaps lost in other usability aspects that are affected by the change of technology. These two judgements can then be used in the organization's overall decision on whether the gains are valued highly enough and whether the losses are tolerable or can be reduced to a tolerable level.

3.1 Constructing the Usability Matrix

To reveal interesting factors among the work-contextual components, we conducted in-context observations and interviews to get a better grasp of the physical environment, the work process and the equipment used. Here are some examples of factors that we identified as relevant for our study:

Environment: Static factors are that the surgery rooms have limited space, much equipment and sterile areas, whereas dynamic factors include lighting conditions and the surgical equipment in the room.

Work situation: Static factors include frequent switching between tasks (the secondary documentation task and the primary care-taking tasks), a heavy workload, and the fact that stressful situations are common. Dynamic factors include complications during surgery and disturbances due to problems with technical equipment.

Technology-supported task: Among the static factors we found that inserting values into the medical record was a common activity, as was selecting choices from predetermined lists. Dynamic factors included the length of the surgery, the distribution of documentation activities (when documentation work was performed), the physical position during documentation work (sitting, standing holding the device, standing with the device lying on a surface), as well as the frequency of interruptions (how often the documentation work is interrupted to attend to other tasks).

Users: Static factors included professional experience and experience with computerized documentation work, the users' normal physical and mental condition (left- or right-handedness, tiredness, pain), as well as attitude towards computerized documentation. Dynamic factors included symptoms of fatigue or pain due to computer use and unusual physical or mental circumstances.

In order to select a relevant set of usability aspects for the matrix, we started by identifying which parameters differed between the two technologies that were the target of the evaluation. Further, we identified which parameters varied, and which of those were possible and interesting to measure. From there, we identified four usability categories of interest for this particular comparative study, derived from the aspects that differed between the two technologies:
1. One device is mobile, whereas the other is stationary, so mobility is naturally of major concern.
2. A stationary device is used in a given place, whereas a mobile device is used in many different places, which affects the ergonomics of use.
3. The two devices have different input devices and different displays, so the use efficiency is affected even though the same application is used.
4. The mobile device is a new tool with other interaction techniques, which affects the users. Moreover, there are new possibilities with mobility, so user satisfaction is an aspect we are interested in.
See figure 3 below for the entire usability matrix.

Figure 3. The usability matrix derived in our example. The rows are the work-contextual components (Environment, Work situation/process, Task technology supports, Users and other interest groups) and the columns are the usability aspects (Mobility, Ergonomics, Use efficiency, User satisfaction). Cell entries include: variations in placement in the room; places in the room where the mobile device is placed when not used; required walking distance for task transfers; frequency of task transfers; variation of body positions; variation in ways of holding the device; distribution of the task over time; frequency of perceived interruptions; flexibility of work position over time; fatigue level and pain due to use; input efficiency, readability, error rate and productivity; accessibility of information for others; satisfaction with patient monitoring capabilities; satisfaction with the technology; level of collaboration; and level of quality of care.

Usability issues – some metrics of interest:

Mobility: The usability issues we wanted to measure were to what extent the local mobility of the mobile device was utilized, and to what extent the mobility affected the capability of monitoring the patient while performing the documentation work. These issues were refined into measuring variations in placement in the room and variation of body positions during documentation work.

Use efficiency: Regarding efficiency we wanted to measure, for instance, readability, perceived productivity of inserting values, error rate and sense of control.

Ergonomics: Usability issues include flexibility of work position, level of fatigue, pain due to computer use, and variation in ways of holding the device, as well as variations of body position during use.

User satisfaction: Issues we considered important include patient closeness and monitoring capabilities, satisfaction with handling and interacting with the device, as well as the level of collaboration with colleagues.

Choosing methods to collect data

The evaluation study was organized into three different phases: 1) a pre-study with observations, interviews and a questionnaire, 2) a main study conducted as a reaction study, and 3) post-study interviews, see figure 4.

Figure 4. Data collection methods chosen in the example. The figure maps the chosen methods (pre-study questionnaire, pre-study observation, pre-study interview, reaction study questionnaire, and post-study) onto the parts of the matrix they cover: the static and situation-dependent aspects of the work-contextual components (environment, work situation/process, task technology supports, users and other interest groups) and the usability aspects (mobility, ergonomics, efficiency, user satisfaction).

1) The pre-study: In order to sufficiently interpret the work practice, i.e., the user interaction and the work situation, we conducted introductory non-participatory observations to obtain basic domain knowledge about the work situation and potential problems with the task, and to get an overall understanding of the users' situation. Following the observations, interviews with primary users and with organizational and technical personnel were conducted to collect further information regarding the user group, technical and work-related issues, as well as organizational goals. Furthermore, a pre-study questionnaire was designed containing aspects that were not expected to vary from time to time. The pre-study questionnaire was filled in by the participants once, before the main study.

2) The main study: The main data-collection method used was a reaction study, which is a particular kind of inquiry method. It is performed repeatedly over a period of time, and answered directly after a performed task, as a reaction to the performance. The questions were formulated so that a reactive answer was possible and deep consideration was not required. The purpose of such a study is to collect data that is close to the users' spontaneous reactions (see for instance [22]). By doing so, we avoided a bias towards enhanced user performance due to the users' awareness of being studied, and thus we can say that our test subjects did not pay more attention to the tool than they normally do. Using questionnaires that the users filled in implies that we measured perceived usability rather than actual usability, since we measured what they say they did instead of what they actually did.
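Purely as an illustration of how such repeated reaction-study answers can be organized for the comparison (the empirical results themselves are reported in [21]; the field names, rating scale and records below are hypothetical), responses can be grouped per device and per usability measure:

from collections import defaultdict
from statistics import mean

# Each record represents one questionnaire answer given directly after a task.
responses = [
    {"device": "tablet",     "measure": "perceived productivity", "rating": 4},
    {"device": "tablet",     "measure": "perceived productivity", "rating": 3},
    {"device": "stationary", "measure": "perceived productivity", "rating": 4},
    {"device": "tablet",     "measure": "readability",            "rating": 2},
    {"device": "stationary", "measure": "readability",            "rating": 5},
]

# Group the ratings per (device, measure) so that the two technologies can be
# compared on each usability measure derived from the matrix.
grouped = defaultdict(list)
for response in responses:
    grouped[(response["device"], response["measure"])].append(response["rating"])

for (device, measure), ratings in sorted(grouped.items()):
    print(f"{device:10s} {measure:25s} mean={mean(ratings):.1f} n={len(ratings)}")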


3) The post-study: To collect deeper information concerning user satisfaction, as well as to gain a better understanding of the reasons behind the findings in the reaction study, a post-study with qualitative interviews with the participants was also conducted. In these interviews we had the opportunity to obtain deeper reflections from, and discussions with, the users.

4 Concluding Remarks and Further Work

Our study has confirmed that the various aspects to determine in an evaluation design are highly complex and interdependent. However, our method supports the evaluation designer in the following ways:
1. On a conceptual level, by describing the different levels in the evaluation design (similar to [6]).
2. On the operational level, by defining ways to perform the transition activities, i.e., how to get from one level to the next in the evaluation design, and how to construct the usability matrix.
3. On the analytical level, by providing a template for a usability matrix, which allows an explorative approach to constructing the conceptual content and a structured way to define and refine usability measures.
Another benefit of our method is that it makes the design process explicit about what is actually measured in the evaluation and how this is related to the overall evaluation objective. Many choices are possible, but by making them explicit, good traceability of the evaluation design process can be achieved. Future work includes testing the proposed method in other settings.


References
1. Whiteside, J., Bennett, J. & Holtzblatt, K. (1988), Usability engineering: Our experience and evolution, in M. Helander (Ed.), Handbook of Human-Computer Interaction, First Edition, 791-817, Amsterdam: Elsevier.
2. Beyer, H. & Holtzblatt, K. (1998), Contextual Design: Defining Customer-Centered Systems, San Francisco: Morgan Kaufmann.
3. Davis, F.D. (1989), Perceived usefulness, perceived ease of use, and user acceptance of information technology, MIS Quarterly 13(3), 319-339.
4. Nielsen, J. (1993), Usability Engineering, Academic Press, London.
5. Neale, D.C., Carroll, J.M. & Rosson, M.B. (2004), Evaluating Computer-Supported Cooperative Work: Models and Frameworks, in CSCW'04, November, Chicago, Illinois, USA.
6. Scholtz, J. & Potts Steves, M. (2004), A Framework for Real-World Software System Evaluations, in CSCW'04, November, Chicago, Illinois, USA.
7. Heath, C. & Luff, P. (2000), Technology in Action, Cambridge University Press, London.
8. McGrath, J.E. (1994), Methodology matters: Doing research in the behavioural and social sciences, in Readings in Human-Computer Interaction: Toward the Year 2000, 152-169, Baecker, Grudin, Buxton & Greenberg (eds), San Francisco, CA, USA: Morgan Kaufmann.
9. van Welie, M., van der Veer, G.C. & Eliëns, A. (1999), Breaking down Usability, in Interact'99, August, Edinburgh, Scotland, UK.
10. Preece, J., Rogers, Y. & Sharp, H. (2002), Interaction Design: Beyond Human-Computer Interaction, Wiley.
11. Shneiderman, B. (1998), Designing the User Interface, Addison-Wesley, USA.
12. Dix, A., Abowd, G., Beale, R. & Finlay, J. (1998), Human-Computer Interaction, Prentice Hall Europe.
13. Ancona, M., Dodero, G., Gianuzzi, V., Minuto, F. & Guida, M. (2000), Mobile computing in a hospital: the WARD-INHAND project, in SAC'00, March 19-21, Como, Italy.
14. Brown, B., Green, N. & Harper, R. (2002), Wireless World: Social and Interactional Aspects of the Mobile Age, Springer-Verlag, London.
15. Brown, D.S. & Motte, S. (1998), Device Design Methodology for Trauma Applications, in CHI'98, Los Angeles, CA, USA.
16. Gosbee, J.W. (1998), Applying CHI in Health Care: Domain Issues, Resources, and Requirements, tutorial at CHI'98, Computer Human Interaction Conference, April 1998.
17. Kakihara, M. & Sørensen, C. (2002), Mobility: An Extended Perspective, in Proceedings of the 35th Hawaii International Conference on System Sciences, Big Island, Hawaii, IEEE Press.
18. Levin, S., Clough, P. & Sanderson, M. (2003), Assessing the Effectiveness of Pen-Based Input Queries, in ACM SIGIR'03, July 28-August 1, Toronto, Canada.
19. Muñoz, M.A. et al. (2003), Context-Aware Mobile Communication in Hospitals, IEEE.
20. Wu, J.H., Wang, S.-C. & Lin, L.-M. (2005), What Drives Mobile Health Care? An Empirical Evaluation of Technology Acceptance, in Proceedings of HICSS 38, January, Hawaii, USA.
21. Pareto, L. & Lundh Snis, U. (2004), Work-Integrated Mobile Technology: Towards a Patient-Oriented Workplace in Health Care Settings, in Proceedings of IRIS 27, August, Falkenberg, Sweden.
22. Hassenzahl, M. & Sandweg, N. (2004), From Mental Effort to Perceived Usability: Transforming Experiences into Summary Assessments, in CHI 2004, April, Vienna, Austria.