Using an Empirical Study to Evaluate the Feasibility of a new Usability Inspection Technique for Paper Based Prototypes of Web Applications

Luis Rivero and Tayana Conte
Grupo de Usabilidade e Engenharia de Software - USES
Instituto de Computação, Universidade Federal do Amazonas (UFAM)
Manaus, AM - Brazil
{luisrivero,tayana}@icomp.ufam.edu.br

Abstract - Usability is one of the most important factors that determine the quality of Web applications. Usability Inspection Methods (UIMs) are gaining popularity as an effective alternative for addressing Web usability issues. However, most of these UIMs still evaluate artifacts in the last stages of the development process and, therefore, the cost of correcting the encountered usability problems is high. This paper presents the Web Design Usability Evaluation (Web DUE) technique, which aims to allow the evaluation of low-fidelity prototypes (or mockups) during the design of the application. The Web DUE's main original feature is that it guides the inspection process through Web page zones, which are "pieces" of Web pages. For each Web page zone, we crafted a set of usability verification items that can aid inspectors in finding usability problems. We have also carried out an empirical study to verify the feasibility of the technique. The analysis of the quantitative and qualitative data provides information on the Web DUE's degree of effectiveness and efficiency, and offers insight on what can be modified to enhance its results.

I. INTRODUCTION

In recent years Web applications have become increasingly important [13]. According to Triacca et al. [20], Web applications need to be usable because they are of growing complexity, address several targets, deal with complex content and have different communication goals. Usability also affects the acceptability of Web applications [10]. According to Mendes et al. [11], if a Web application possesses poor usability, it will be quickly replaced by a more usable one as soon as its existence becomes known to the target audience. Usability is therefore moving up the list of strategic factors to be dealt with, especially in software development [7].

In the last decade, the software development industry has invested in the development of a variety of Usability Inspection Methods (UIMs) to address Web usability issues [10]. UIMs are procedures in which expert evaluators inspect or examine usability-related aspects of a user interface [17]. In [5], Fernandez et al. identified which UIMs have been proposed in the last decade to evaluate Web applications. Despite the increasing number of emerging UIMs, there is still room for improvement. In [15], we identified that emerging UIMs should be able to: (a) find problems in early stages of the development process in order to reduce the cost of correcting the encountered problems; and (b) consider specific usability aspects of Web applications in order to aid in the identification of usability problems.

This paper presents the Web Design Usability Evaluation (Web DUE) technique. The Web DUE technique aims to reduce the cost of correcting usability problems by evaluating Web artifacts in early stages of the development process. Therefore, we chose to focus on low-fidelity prototypes (or mockups), which are images of how the software would look and can be evaluated before any source code is written. The Web DUE technique guides inspectors through the evaluation of mockups by dividing Web pages into Web page zones [6]. We crafted the technique by selecting usability criteria from the UIMs within the studies in [15], and relating these criteria to Web page zones.

We carried out an empirical study to verify the feasibility of the Web DUE technique. This study measured the technique's effectiveness and efficiency indicators to analyze whether the Web DUE is feasible compared to its predecessor, the Web Design Perspective (WDP) based Usability Evaluation technique [2]. By measuring these indicators we expect to corroborate that the Web DUE allows the identification of more usability problems in paper based prototypes of Web applications in reasonable time. Moreover, we applied questionnaires to the subjects of the empirical study to obtain qualitative data. We used the obtained data to understand some of the difficulties subjects had experienced when using the technique and to identify improvement opportunities.

This paper is organized as follows. Section II presents the background knowledge on Usability Inspection Methods for Web applications. In Section III we present the Web DUE technique proposal. Section IV shows how we carried out the feasibility study of the Web DUE technique. Section V presents the analysis and discussion of the results obtained from the feasibility study, while Section VI presents its validity threats. Finally, Section VII presents our conclusions and future work.

II. RELATED WORK ON UIMS FOR WEB APPLICATIONS

This section presents the basic knowledge needed to understand usability inspection methods and briefly summarizes the current state of UIMs for Web applications.

A. Usability Inspection Methods

Usability Evaluation Methods (UEMs) are procedures composed of a set of well-defined activities that aim to collect data about the software's degree of usability [5]. Usability Inspection Methods (UIMs) are a subset of UEMs that make use of experienced inspectors to review the usability aspects of software artifacts. In UIMs, the inspectors base their evaluation on guidelines that check the system's level of achievement of usability attributes. The obtained results can be used to predict whether there will be a usability problem. The main advantage of UIMs is that they can lower the cost of finding usability problems, since they do not need any special equipment or laboratory [17].

According to Matera et al. [10], the main Usability Inspection Methods are the Heuristic Evaluation (HE) [12] and the Cognitive Walkthrough (CW) [14]. The HE assists inspectors in finding usability problems through the use of a set of rules which seek to describe common properties of usable interfaces. During the HE's inspection process, inspectors examine Graphical User Interfaces (GUIs) looking for problems. If a usability problem is found, they report it and associate it with the violated heuristics. The CW allows inspectors to analyze whether a user can make sense of interaction steps as they proceed in a pre-defined task. In order to identify usability problems the inspectors ask questions to determine if: (a) the action is sufficiently evident; (b) the action will be connected with what the user is trying to do; and (c) the user will understand the system's response.

There are other UIMs that can be used to improve the system's usability and therefore its quality. Guidelines inspections, for instance, are inspection methods that are similar to the Heuristic Evaluation: experienced inspectors check if the application meets the guidelines' set of usability rules [17]. Another technique, proposed by Zhang et al. [22], is the Perspective-based Usability Inspection, in which every session of the inspection focuses on a sub-set of questions according to the usability perspective.

B. Usability Inspection Methods for Web Applications

Since the last decade, research has focused on the development of new UIMs specifically crafted for the evaluation of Web applications [5]. In [15], we performed a systematic mapping extension based on the results in [5] to draw conclusions about the state of the art of UIMs for Web applications. In this subsection we briefly summarize the results in [15] regarding: (a) the knowledge basis of UIMs for Web applications; (b) the artifacts evaluated by UIMs for Web applications; (c) the type of evaluated Web applications; and (d) the degree of the inspector's intervention in the evaluation process.

Regarding the knowledge basis of UIMs for Web applications, we found that some of the emerging UIMs adapt generic techniques, like the Heuristic Evaluation and the Cognitive Walkthrough, in order to evaluate Web applications. Moreover, other UIMs for Web applications make use of perspectives to help focus on each usability attribute when carrying out the inspection. Furthermore, some of the techniques create their own specific set of rules for the Web domain.

The results in [15] also showed that UIMs for Web applications carry out inspections using models, HTML code and prototypes/applications. UIMs for Web applications mostly focus on evaluating functional prototypes and finished applications. Consequently, as the inspections are carried out in the last stages of the development process, the cost of correcting usability problems is high. HTML code analysis is also used in the last stages of the development process, but it uses automated tools with embedded usability rules so that it can be done without inspectors. Model analysis verifies if the application's models meet interaction rules within the Web domain. The advantage of model analysis is that it can be used in earlier stages of the development process.

Regarding the type of the evaluated Web applications, the results in [15] showed that there have been proposals for the evaluation of specific types of Web applications such as: (a) medical, (b) e-commerce, and (c) Web 2.0. UIMs for specific types of Web applications provide more feedback than those evaluating generic Web applications. There is a shortage of UIMs for generic Web applications that are able to provide feedback on both the identification and the solution of usability problems.

The degree of automation of UIMs for Web applications is related to the artifact they evaluate. HTML code analysis, which considers inconsistencies in color and patterns, is fully automated and can be carried out without the intervention of an inspector. However, HTML code analysis does not consider all the usability criteria that can be assessed by UIMs performed by inspectors. The inspectors' expertise can aid in finding problems involving judgment and human interaction.

III. WEB DESIGN USABILITY EVALUATION

The Web Design Usability Evaluation (Web DUE) technique is an inspection method based on checklists. The Web DUE was proposed aiming to meet the actual needs of the software development industry. Consequently, it mainly focuses on identifying usability problems in early stages of the development process. Moreover, the Web DUE also considers usability problems that are specific to Web applications (see Table II for some examples). In the next subsections we describe the Web DUE proposal and illustrate its inspection process through an example.

A. The Web DUE Proposal

One of the most important features of the Web DUE technique is its ability to find usability problems in early stages of the development process. Therefore, the Web DUE can be used to evaluate the usability of mockups. This feature is important as finding usability problems in earlier stages of the development process can lower the cost of correcting them [15].

The main innovation of the Web DUE technique is its ability to guide inspectors through the evaluation process by using specific "pieces" that compose the Web pages of Web applications. These "pieces" are called Web page zones [6] and they contain specific components within Web pages. Table I shows the list of the Web page zones used by the Web DUE technique, the suitability of including such zones in a Web page, and a brief description of their contents. Readers must note that we added the Help zone to the list in [6] to evaluate whether the system provides help to its users. A more thorough description of each Web page zone can be found in [16].

TABLE I
CONTENTS AND SUITABILITY OF WEB PAGE ZONES BASED ON [6].

Zone             | Suitability                  | Contents
Navigation       | Mandatory                    | Navigation links.
System's State   | Mandatory                    | Information about the application's state and how we got there.
Information      | Depends on the functionality | Information from the application's data base.
Services         | Depends on the functionality | Access to functionalities related to the information zone.
User Information | Depends on the functionality | Information about the logged user.
Direct Access    | Depends on the functionality | Links to common Web functionalities (Login, Logout, Home).
Data Entry       | Depends on the functionality | Provides a form to input data to execute certain operations.
Institution      | Optional                     | Information about the institution that is responsible for the Web application.
Custom           | Optional                     | Domain-independent content.
Help             | Mandatory                    | Information about how to use the application and how it works.

The main advantage of the use of Web page zones is that they can aid inspectors in evaluating only the elements that are present in the evaluated mockup. In other words, inspectors will only have to verify the usability attributes of those Web page zones that compose the evaluated mockup. Another expected benefit of the use of Web page zones is that they can provide inspectors with a guideline to identify the elements within the application. This can help inspectors focus on determined aspects and parts of the Web mockup while executing the inspection.

We based the Web DUE technique on the Web Design Perspective (WDP) Based Usability Evaluation technique [2]. The WDP is an inspection technique that evolves the Heuristic Evaluation [12] in order to evaluate Web applications. The WDP relates the usability of Web applications to three main perspectives: concept, navigation and presentation. Using these perspectives and the heuristics in [12], it creates HxP pairs by combining heuristics and perspectives when there is a relationship between them. For each HxP pair, the WDP provides a set of hints in order to aid inspectors in finding usability problems when evaluating Web applications.

The Web DUE technique provides inspectors with rich descriptions on how to identify usability problems. We extracted the hints from the WDP [2] inspection technique and related them to each of the Web page zones. After identifying which hints each Web page zone should provide, we compiled all hints and turned them into usability verification items. These verification items check the usable properties of each Web page zone and are grouped in checklists; there is a checklist for each Web page zone. Inspectors use these checklists to verify if the evaluated low-fidelity prototype meets usability principles. Furthermore, in order to aid inspectors in finding problems using the verification items, we paired each of them with examples and clarifications. Inspectors feeling lost on what they are evaluating can use the provided examples to get more insight on what to evaluate and how. Table II shows part of the verification items and their examples/clarifications for the Data Entry Web page zone.

TABLE II
DATA ENTRY ZONE: DESCRIPTION AND PART OF THE USABILITY VERIFICATION ITEMS AND RELATED EXAMPLES/EXPLANATIONS.

Data Entry Zone Description: This zone is responsible for providing the user with a form to input data in order to execute certain operations. Then, a submit-like button links the input data with the associated functionality.

Usability Verification Items | Examples / Explanations
It is easy to see (find, locate) the zone to input data in the system. | The user can easily see where to input data in the system.
It is easy to understand that this zone is for data input. | The input fields (Combo Box, Line Edit, Dial, Horizontal Slider, others) must be easily recognized. The user must know that they are used to input data.
The interface indicates which data must be mandatorily filled. | For example, the system indicates mandatory input data with a "*" or a "mandatory" next to the field.
The interface indicates the correct format for a determined data entrance. | The data fields must contain/provide hints on how to fill them. For example, a "Date" entry field could have the following hint: "dd/mm/yy".
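To make the relationship between zones, checklists and verification items concrete, the sketch below shows one possible way to represent them in code and to select only the checklists that apply to a given mockup, as the technique prescribes. This data structure is our own illustration and is not part of the Web DUE definition; the item texts are abbreviated from Tables II and III of this paper.

```python
# Illustrative sketch (not part of the Web DUE definition): each Web page zone
# has its own checklist of usability verification items; an inspector only
# answers the checklists of the zones identified in the evaluated mockup.

CHECKLISTS = {
    "Data Entry": [
        "It is easy to see (find, locate) the zone to input data in the system.",
        "It is easy to understand that this zone is for data input.",
        "The interface indicates which data must be mandatorily filled.",
        "The interface indicates the correct format for a determined data entrance.",
    ],
    "Navigation": [
        "It is easy to see (find, locate) the navigation zone.",
        "It is easy to understand the words and symbols used in the system.",
    ],
    "System's State": [
        "The System's State is naturally and logically presented to the user.",
    ],
    # ... one checklist per zone of Table I
}

def items_to_verify(zones_in_mockup):
    """Return the verification items to check for a mockup, given the Web
    page zones identified in step (1) of the inspection process."""
    return {zone: CHECKLISTS[zone] for zone in zones_in_mockup if zone in CHECKLISTS}

# Example: a mockup that contains three of the zones listed in Table I.
for zone, items in items_to_verify(["System's State", "Data Entry", "Navigation"]).items():
    print(zone)
    for item in items:
        print("  -", item)
```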

B. Web DUE Inspection Process

The steps of the simplified inspection process of the Web DUE technique are shown in Fig. 1. These steps are: (1) divide the prototype into zones, (2) verify the checklists, and (3) identify problems. In order to illustrate the Web DUE's inspection process, we have used it to evaluate the usability of a low-fidelity prototype (mockup). This mockup was crafted based on a page of the Journal and Event Management System (JEMS, available at https://submissoes.sbc.org.br/). This page is used in the JEMS system to edit user data. As this page is long and takes up about twice the height of a real screen, we have divided it into two parts so that it can be better visualized. In the next paragraphs we describe how we applied the inspection process steps to perform a simple inspection of the prototyped Web page from the JEMS Web application. Readers must note that this example shows only part of the inspection of the JEMS portal, since we are only evaluating one of its pages.

The first step in the identification of usability problems is to divide the paper based prototype into Web page zones. In Fig. 1 we show the identified zones within the mockup: the System's State zone, the Data Entry zone and the Navigation zone. The System's State zone indicates the actual state of the user while using the application. Moreover, the Data Entry zone shows a form that the user can use to edit the data of his/her account. Finally, the Navigation zone shows the navigation links within the application. Although these links do not seem to guide the user through the application, but rather offer access to common functionalities, we have decided to consider them a navigation zone since there is no other way of navigating through the prototyped Web page.

After identifying the Web page zones, inspectors must proceed with the evaluation of the usability verification items. In other words, they must check if the mockup meets all the items described within each of the checklists per Web page zone. Readers must note that inspectors will only verify the checklists of the zones that are present in the evaluated mockup. Table III shows an example of the usability verification items that were not met by the prototyped Web page. Each of the verification items within Table III is associated with its Web page zone. In order to identify usability problems, inspectors must point out which components within each Web page zone of the application did not meet the usability verification items. If we look at Fig. 1 and Table III simultaneously, we can relate the nonconformity of the usability verification items in Table III with the augmented elements A, B, C, D and E in Fig. 1. We will address each of the encountered usability problems as follows.

Fig. 1. Example of the Inspection Process of the Web DUE technique.

Regarding the System's State zone, the usability verification item 01 indicated that, despite showing the actual state of the system, the prototype does not show it logically (see Fig. 1 element A). In other words, the prototype does not show how the user reached that state.

Regarding the Data Entry zone, we identified two nonconformities: 02 and 03. The usability verification item 02 indicates that the mockup does not request data in a logical way. Asking for a country's state (even if this input data must be filled in only for members from the US) before asking for the user's country does not follow a logical order (see Fig. 1 element B). Furthermore, item 02 also suggests that the application leads the users to commit errors: if the application forces the user to input data in a disorganized way, users could also fill them in incorrectly. Moreover, the verification item 03 suggests that the input fields are not standardized. This usability problem can be observed in the length of the input fields, which constantly varies (see Fig. 1 element C).

During the evaluation of the Navigation zone we encountered three nonconformities: 04, 05 and 06. The verification item 04 indicates an error: users are not able to see the navigation zone until they scroll to the end of the Web page (see Fig. 1 elements D and E). Furthermore, item 05 indicates that the symbols used within the navigation zone are difficult to understand. A user would find it difficult to realize that the "globe" symbol leads to the JEMS portal (see Fig. 1 element D). Moreover, the verification item 06 indicates a problem in the use of icons. The "house" symbol is generally used for the home page, but in this case it is directing the user to any of the conferences' submission pages (see Fig. 1 element E).

TABLE III
USABILITY VERIFICATION ITEMS THAT WERE NOT MET BY THE PROTOTYPED WEB PAGE OF THE JEMS SYSTEM.

N° | Web Page Zone  | Usability Verification Items
01 | System's State | The System's State is naturally and logically presented to the user.
02 | Data Entry     | The data to be input (whatever the means of input) by the user are requested in a natural and logical order.
03 | Data Entry     | The input fields are standardized regarding layouts, formats and controls.
04 | Navigation     | It is easy to see (find, locate) the navigation zone.
05 | Navigation     | It is easy to understand the words and symbols used in the system.
06 | Navigation     | The terms (words or symbols) make sense considering the domain of the application and follow conventions of the real world.

We have shown the simplified inspection process of the Web DUE technique by evaluating a low-fidelity prototype of a Web page. As mentioned before, readers must note that in order to evaluate the entire Web application, all Web pages within the Web application must be evaluated. In the following sections we describe the feasibility study of the Web DUE technique and its results.

IV. FEASIBILITY STUDY OF THE WEB DUE TECHNIQUE

According to Shull et al. [18], the first study that must be carried out to evaluate a new technology is a feasibility study. We have designed and executed a feasibility study to answer the following research question: "Is the Web DUE technique feasible regarding the number of detected defects and is time well spent?" To address this question, we have compared the Web DUE technique with the WDP technique [2]. Although the WDP is not a technique for the evaluation of paper based prototypes, it is the Web DUE's predecessor, since the Web DUE is based on the WDP. Therefore, we believe it is reasonable to verify whether the Web DUE presents better results than the WDP when evaluating mockups.

In the following subsections we provide all the information needed by readers to assess the quality of the empirical study and the applicability of its results.

A. Goal

In Table IV, we use the GQM paradigm [1] to present the goal of this feasibility study. We have characterized the viability of the Web DUE technique by analyzing its effectiveness and efficiency indicators. For these indicators, we have used the same definitions used in [2] and [4]: effectiveness is the ratio between the number of detected problems and the total of existing problems; efficiency is the ratio between the number of detected problems and the time spent in finding them.

TABLE IV
GOAL OF THE FEASIBILITY STUDY PRESENTED USING THE GQM PARADIGM.

Analyze: the Web Design Usability Evaluation technique
For the purpose of: characterizing
With respect to: its viability in terms of the effectiveness and efficiency indicators compared to the WDP technique
From the point of view of: Software Engineering Researchers
In the context of: a usability evaluation of the mockups of a Web based system by usability inspectors
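For clarity, the two indicators can be written as simple ratios. The notation below is ours and is introduced only to make the definitions above explicit; it does not appear in [2] or [4].

```latex
\[
\text{Effectiveness} = \frac{D_{\text{detected}}}{D_{\text{total}}},
\qquad
\text{Efficiency} = \frac{D_{\text{detected}}}{T}
\]
```

Here, D_detected is the number of real usability defects reported by an inspector (or group), D_total is the total number of known defects in the inspected artifacts, and T is the time spent on the inspection.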

B. Hypotheses

Using the indicators defined above, we planned and conducted the study to test the following hypotheses (null and alternative, respectively):

H01: There is no difference in terms of effectiveness in using the Web DUE technique to find usability problems compared to the WDP technique.
HA1: The Web DUE technique presents a difference in the effectiveness indicator when compared to the WDP technique.
H02: There is no difference in terms of efficiency when using the Web DUE technique and the WDP technique.

HA2: The Web DUE technique presents different results regarding the efficiency indicator when compared to the WDP technique.

C. Subjects

We carried out the feasibility study in April 2012 with students of the Computer Science course at the Federal University of Amazonas. A total of eight undergraduate and postgraduate students agreed to participate. All subjects signed a consent form and filled out a characterization form. The characterization form addressed the participants' expertise concerning: (a) usability knowledge, (b) usability evaluation, and (c) application design. All subjects answered objective questions regarding their degree of knowledge and professional experience. The characterization data were analyzed and each subject was classified as having Low, Medium or High experience according to the information provided in his/her characterization form. In order to reduce the bias of having the more experienced inspectors using one or the other technique, we equally distributed the subjects into two teams. One team used the Web DUE technique and the other one the WDP technique. Table V shows both teams and the expertise of their members. Readers must take note that we asked subjects about their overall expertise with Web applications; however, we did not categorize subjects according to their experience with Web applications, as all subjects possessed high experience.

There were other participants in this empirical study: (a) the moderator of the inspection, (b) the technique lecturers, and (c) the discrimination team. The moderator was responsible for planning the study and collecting the data during the empirical study. The technique lecturers provided the training on each technique. Finally, the discrimination team was responsible for deciding which of the encountered defects were real usability problems.

D. Materials

Each team evaluated the usability of a set of paper based prototypes. However, each team used only one of the techniques, either the Web DUE technique or the WDP technique. For each technique they received a checklist to report the encountered usability problems. The mockups were prepared based on a Coupon Website. Coupon Websites offer social coupons, which are quickly emerging as a marketing tool for businesses and an attractive shopping tool for consumers [9]. These coupons often provide the user with one-time savings opportunities. We decided to perform the inspection over this type of Website since its popularity is rising [9]. To aid inspectors in the navigation among the prototyped Web pages, we provided a navigation map indicating which page should be shown after pressing each button. Fig. 2 shows some of the prepared mockups (1 and 2) and the navigation map (3) of one of the prototyped Web pages. Readers must note that these mockups were printed so that the inspectors could have an idea of the real size of the Web pages. Furthermore, inspectors could use these mockups to write down notes and point out usability problems.
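Conceptually, the navigation map simply records which prototyped page is shown after a given button is "pressed". The sketch below shows one possible way to represent such a map; the page and button names are hypothetical and are not taken from the study materials.

```python
# Hypothetical sketch of a navigation map for paper based prototypes:
# for each prototyped page, it records which page should be shown after
# "pressing" a given button during the simulated interaction.
NAVIGATION_MAP = {
    "home": {"view_deal": "deal_details", "login": "login_page"},
    "deal_details": {"buy_coupon": "checkout", "back": "home"},
    "checkout": {"confirm": "purchase_confirmation", "cancel": "deal_details"},
}

def next_page(current_page, button):
    """Return the mockup to present after the inspector 'presses' a button;
    stay on the current page if the button is not mapped."""
    return NAVIGATION_MAP.get(current_page, {}).get(button, current_page)

print(next_page("home", "view_deal"))  # -> deal_details
```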

Fig. 2. Paper based low-fidelity prototypes of a Web application and the navigation map for one of its Web pages.

E. Procedure

All subjects had lectures on: (a) usability, (b) examples of typical usability problems, and (c) how to apply an inspection method such as the Heuristic Evaluation. Before the inspection, each team was trained in the technique it was assigned. The context of the inspection comprised four tasks that inspectors should perform over the paper based prototypes with the assigned technique. Each subject had three days to complete the inspection and prepare a defect report. These reports contained discrepancies, which are issues reported by the inspector that could be real defects or false-positives. After finishing the inspection, each subject sent back his/her report and a follow-up questionnaire with comments regarding the Web DUE or the WDP technique.

F. Data Collection

All inspection reports were delivered on time and none of them were discarded. One of the authors, who was responsible for conducting the feasibility study, acted as the inspection's moderator. The moderator checked all discrepancies within the defect reports for incorrect or missing information and also gathered the discrepancies. During the collection activity, the moderator highlighted duplicated discrepancies. Finally, he generated a new discrepancies report which contained all discrepancies found, without showing the duplicated ones. Readers must note that both the collection activity and the discrepancies report were also verified by another researcher to avoid bias.

After the collection activity there was a discrimination meeting. The goal of the discrimination meeting was to classify the discrepancies within the discrepancies report as real defects or false-positives. This meeting was attended by the moderator and two other researchers who were not involved with the study. These researchers possessed good usability knowledge and prior experience in usability evaluation. For each reported discrepancy, the other researchers verified whether it was a usability error by evaluating the paper based prototypes. Readers must take note that the moderator, who was involved in the study, did not classify any of the discrepancies in order to reduce biased opinions. Considering all discrepancies, there were a total of 79 real defects. This number was used in the calculation of the effectiveness indicator. In the next section we present the results of the feasibility study and their analysis.

V. ANALYSIS OF THE RESULTS OF THE FEASIBILITY STUDY

In this feasibility study we have analyzed both quantitative and qualitative data. We obtained the quantitative data from the discrepancies list resulting from the discrimination meeting. We obtained the qualitative data from the answers to the questionnaires. In the following subsections we present both the quantitative and the qualitative analyses.

A. Quantitative Analysis

After the discrimination activity, we counted the number of discrepancies, false-positives and defects per inspector. We also considered the time used by each inspector to carry out the inspection process. Table V presents both the results per inspector and the overall results per technique. As mentioned in the previous section, the total number of known usability defects is 79.

TABLE V
QUANTITATIVE RESULTS PER INSPECTOR AND TECHNIQUE.

We have performed a statistical analysis using the SPSS v19.0.0 tool [19]. In this analysis, we used α = 0.10 due to the small sample used within this study [3]. In order to compare the effectiveness and efficiency of both samples, we used the Mann-Whitney non-parametric test and boxplots to facilitate visualization. The boxplot comparing the effectiveness indicator is shown in Fig. 3. When analyzing the graph, we can see that the mean of the Web DUE group is slightly higher than the mean of the WDP group. However, the comparison using the Mann-Whitney test showed that there was no significant difference between the groups (p = 0.885). These results suggest that the Web DUE technique and the WDP technique provided similar effectiveness when used to inspect paper based Web pages of a Coupon Website. These results support the null hypothesis H01 and reject HA1.

Fig. 3. Boxplot comparing the effectiveness indicator.

The boxplot comparing the efficiency indicator is shown in Fig. 4. When analyzing the graph, we can see that the mean of the Web DUE group is higher than the mean of the WDP group. Therefore, the group that used the Web DUE technique was more efficient than the group that used the WDP technique. In other words, the Web DUE group was able to find more defects in less time. Nevertheless, the comparison using the Mann-Whitney test showed that there was no significant difference between the groups (p = 0.149). These results suggest that: (a) the Web DUE technique and the WDP technique provided different efficiency when used to inspect paper based Web pages of a Coupon Website; and (b) this difference is not significant from a statistical point of view. These results support the null hypothesis H02 and reject the alternative HA2.
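The comparison described above can be reproduced with any standard statistics package. The snippet below only illustrates the procedure (a two-sided Mann-Whitney test at α = 0.10 on per-inspector effectiveness scores); the study itself used SPSS [19], and the sample values shown here are made-up placeholders, not the data from Table V.

```python
# Illustrative only: Mann-Whitney comparison of per-inspector effectiveness
# scores between the two groups. The values below are made-up placeholders,
# NOT the study's data (which were analyzed with SPSS [19]).
from scipy.stats import mannwhitneyu

web_due_effectiveness = [0.20, 0.28, 0.25, 0.30]  # hypothetical, one value per inspector
wdp_effectiveness = [0.18, 0.22, 0.24, 0.26]      # hypothetical, one value per inspector

stat, p_value = mannwhitneyu(web_due_effectiveness, wdp_effectiveness,
                             alternative="two-sided")

alpha = 0.10  # significance level adopted due to the small sample [3]
print(f"U = {stat}, p = {p_value:.3f}")
print("Significant difference" if p_value < alpha else "No significant difference")
```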

Fig. 4. Boxplot comparing the efficiency indicator.

B. Qualitative Analysis

The data analysis began with the examination of the answers within the follow-up questionnaires of the subjects who used the Web DUE technique. Readers must note that the subjects using the WDP technique also answered a follow-up questionnaire; however, its questions addressed the suitability of using the WDP in the inspection of paper based prototypes. We will not address the follow-up questionnaire of the WDP due to lack of space. Fig. 5 shows the main questions of the follow-up questionnaire for the subjects using the Web DUE technique. The questions aimed to obtain information about the subjects' overall opinion of the main components of the Web DUE technique: (a) Web page zones, (b) verification items, and (c) examples/explanations for the verification items. We also collected data regarding the adequacy and perceived ease of use of the technique. In this subsection, we discuss the results of the qualitative analysis of the subjects' opinions towards the Web DUE technique and its components.

Regarding the subjects' opinion towards the use of Web page zones, all inspectors agreed that it was easy to understand and identify all Web page zones within paper based Web applications. This is illustrated in the following quotes.

"Yes, they were really clarifying; I had no trouble understanding them." - Inspector 1.

"It was easy to understand all Web page zones." - Inspector 2.

Inspector 4 added that maybe it would be easier to carry out the inspection if the Web DUE allowed an easier way to verify repeated Web page zones within a Web page. In other words, if the inspector saw the same Web page zone more than once within a Web page, he would have to check all verification items for each repeated zone. This could have caused an increase in the time spent during the inspection.

"… In case of a data entry zone, a Web page could contain more than one data entry zone with different purposes. The technique could allow the verification of all these repeated zones in a particular way." - Inspector 4.

Fig. 5. Follow-up Questionnaire of the Web DUE technique.

Regarding the Web DUE's verification items, Inspectors 1, 3 and 4 agreed that they were helpful when combined with the Web page zones (see quote from Inspector 4). However, Inspector 2 pointed out that some verification items could be used to evaluate more than one Web page zone (see quote from Inspector 2). This could mean that some verification items can be used to evaluate the same components, which confuses the inspectors, or that some verification items are too broad to allow the identification of specific features of the evaluated Web page zone. Furthermore, Inspector 3 stated that some items did not help him decide whether there was a usability problem or not (see quote from Inspector 3). This means that some verification items may not provide a clear description of which nonconformities must be found, or that they are too subjective for the inspector to decide.

"…, the usability verification items contributed to finding more discrepancies when analyzing a determined part of the page." - Inspector 4.

"Data entry zones, for example, possess verification items that could be adequate for the evaluation of the help zone. This ambiguity made it difficult to find terms that defined the problems encountered during the inspection." - Inspector 2.

"… However, some verification items judge features which are not totally wrong or right." - Inspector 3.

Regarding the subjects' opinion towards the examples and explanations provided along with the verification items, all inspectors agreed they were easy to understand. They stated that the examples and explanations also made it easier to understand the usability verification items.

Regarding the technique's adequacy, we found that one inspector (Inspector 3) found the technique inadequate. Nevertheless, this inspector was not referring to the technique itself, but to the mapping process of the Web pages. All inspectors agreed that the technique could aid in finding usability problems in Web page prototypes. However, the fact that inspectors had to manually simulate the interaction between the user and the system made the inspection process very tiring. The following quotes illustrate the inspectors' opinions towards the difficulty in simulating interaction.

"Applying the technique in mockups is really confusing when navigating through the pages. Mainly in bigger systems, with a higher number of pages. Furthermore, there is no interaction with the system."

"It was too difficult to simulate using Web pages. Linking Web pages is complex even with the use of the navigation map." - Inspector 3.

Regarding the technique's perceived ease of use, we found that Inspector 4 found the technique very difficult. He argued that the Web DUE technique was too detailed and very repetitive (see quote from Inspector 4). The other inspectors did not state that the technique was difficult, but rather that it was difficult to simulate the interaction, as pointed out before.

"The technique did not seem easy at all. It was too detailed and repetitive during most of the evaluation. On the other hand, it includes a very complete view of the contents of the pages." - Inspector 4.

Inspectors also stated advantages of and suggestions for the evaluation of paper based prototypes. According to Inspector 4, mockups allow inspectors to point out the usability problems directly (see quote from Inspector 4). Regarding suggestions for the evaluation of paper based prototypes, Inspector 3 indicated that colored mockups could make them more realistic (see quote from Inspector 3). Furthermore, Inspector 2 said that the organization of the paper based mockups is important for simulating the interaction between the system and the user (see quote from Inspector 2).

"I found the technique appropriate because in the mockups we can scribble and then we can relate what we drew to the checklist options." - Inspector 4.

"For a more artistic/visual view of the site, it could be interesting to provide colored mockups." - Inspector 3.

"The use of a graph explaining the interaction among pages, or the organization of the pages according to the activities that are being executed, can help in understanding the interaction." - Inspector 2.

The overall results show that the technique can be applied in the evaluation of paper based low-fidelity prototypes of Web applications. However, as the qualitative data indicate, there is still room for improvement. In the following section, we present the threats to validity of the feasibility study of the Web DUE technique.

VI. THREATS TO VALIDITY

Every experimental study possesses threats that can affect the validity of its results. The following subsections present the threats to validity considered in this empirical study. These validity threats have been grouped into four main categories: internal, external, conclusion and construct.

A. Internal Validity

According to Wohlin et al. [21], internal validity analyzes whether, in fact, the treatment causes the results. In this study we considered four main threats to internal validity: (a) training effects, (b) the subjects' usability knowledge, (c) expertise classification, and (d) time measurement. Regarding the training effect, there could be a risk if the quality of the training of the WDP technique had been inferior to the training of the Web DUE technique. However, we controlled this risk by using the same examples of encountered usability problems in both trainings. Furthermore, in order to mitigate the threat of the subjects' knowledge, we divided them into balanced groups according to their experience. This measure prevented the subjects' experience from affecting the overall results of the techniques. Another problem could have been the classification of the subjects. In Table V we can see that the subjects were classified according to three criteria. In order to reduce this threat, the degree of each criterion was assessed through an objective questionnaire. Finally, regarding time measurement, we asked the subjects to be very precise in their measurements. However, we cannot guarantee that these measurements were carefully obtained.

B. External Validity

External validity is concerned with the generalization of the results [21]. We discuss four issues regarding external validity: (a) whether students are good substitutes for professional inspectors, (b) whether academic environments represent day-to-day experience in industry, (c) whether a Coupon Website is representative of Web applications in general, and (d) whether the WDP was a suitable technique for comparison. As for the first issue, the use of students as inspectors, we can argue that since we were looking for inspectors with the same degree of usability knowledge and we balanced both teams, students could be used as subjects, since none of them had previous experience with any of the techniques. Even though we used an academic environment to carry out the feasibility study, we based the mockups and the interaction between the system and the user on a real Web application, which can help resemble a real industry environment. Regarding the use of a Coupon Website, we cannot guarantee that it is not a threat, since there are many Web application categories [8]. Finally, using the WDP technique for comparison purposes is a validity threat. However, we can argue that since the Web DUE technique is based on the WDP technique, it is reasonable to compare whether the Web DUE presents better results than the WDP when evaluating mockups.

C. Conclusion Validity

According to Wohlin et al. [21], conclusion validity is concerned with the relationship between the treatment and the results. In this study, the biggest problem is the statistical power. Since the number of participants is low, the data extracted from this study can only be considered indicators and not conclusive.

D. Construct Validity

Construct validity is concerned with the relationship between theory and observation [21]. The criteria used to measure the feasibility of the technique could be considered a threat if not properly chosen. As effectiveness and efficiency are two common criteria used for investigating the productivity of new techniques, this threat cannot be considered a risk to the validity of the results [4]. Furthermore, we have measured these criteria using the same approach used in [2].

VII. CONCLUSIONS AND FUTURE WORK

This paper has described the Web Design Usability Evaluation (Web DUE) technique for the inspection of paper based low-fidelity prototypes of Web applications. The main innovation of the Web DUE technique is its ability to guide inspectors through the inspection process by means of Web page zones. Furthermore, the technique can aid in the identification of usability problems in early stages of the development process, and therefore reduce the cost of correcting them.

We have carried out a feasibility study to answer the following research question: "Is the Web DUE technique feasible regarding the number of detected defects and is time well spent?" In order to answer this question we have measured the results of the Web DUE technique in terms of effectiveness and efficiency, and we compared them to the results of its predecessor, the WDP technique. Both the quantitative and the qualitative results of this study provided us with important feedback to improve the Web DUE technique.

The quantitative analysis using the Mann-Whitney test showed that the effectiveness indicator is similar for both techniques. However, the Web DUE technique achieved a better average result, with a 25.32% effectiveness rate compared to the 22.15% effectiveness rate of the WDP. Furthermore, the analysis using the Mann-Whitney test indicated that there was no statistically significant difference between the efficiency of the Web DUE technique and the WDP technique. Nevertheless, the Web DUE technique found an average of 10.89 defects per hour, while the WDP technique found an average of 6.58 defects per hour. Readers must consider that, due to the small sample used, these results are not conclusive but rather indicators of the feasibility of the Web DUE technique.

The qualitative analysis showed that, overall, the inspectors who used the Web DUE technique found it easy to use and adequate for the evaluation of paper based Web application prototypes. They stated that the technique's ability to guide the inspection through Web page zones was helpful. Furthermore, the inspectors said that the usability verification items, combined with the examples/explanations provided by the Web DUE, contributed to finding more usability problems.

There is still room for improvement in the evaluation of paper based Web application prototypes. Inspectors stated that the main problem with this type of evaluation is that the process of simulating the human-computer interaction is really tiring. Therefore, as future work we intend to create a tool that automatically presents the verification items of the Web DUE technique and simulates the interaction among the paper based prototyped Web pages. Furthermore, some inspectors stated that colored mockups could aid in providing a more artistic view of the paper based Web pages. We will follow this suggestion and include colored mockups in the evaluation process. We also found that, according to the inspectors, the main advantage of using mockups is that they allow the inspector to directly point out the encountered usability problems and make notes. Consequently, we will maintain this feature in the proposed tool, allowing inspectors to place comments and pointers in the evaluated mockups.

Regarding the evolution of the Web DUE technique, future work involves: (a) verifying which usability verification items can be combined within the Web page zones to reduce effort during the inspection; (b) analyzing and suggesting an alternative inspection process in order to reduce repetitive instructions; and (c) analyzing which usability verification items or examples/explanations are unclear or ambiguous, in order to reduce the inspector's effort and confusion.

We intend to perform new empirical studies to validate the obtained results and to better understand the technique. As the WDP technique was not crafted for the evaluation of paper based prototypes, we intend to carry out a new feasibility study in which we will compare the Web DUE technique with a technique specifically crafted for the evaluation of paper based prototypes. Furthermore, we intend to carry out an observational study aiming to elicit the process used by the usability inspectors when applying the technique, and to answer the question "Do the steps of the process make sense?" We hope that our findings will be useful in the promotion and improvement of the current practice and research of Web usability. We also hope that the proposed technique aids in the evaluation of Web applications in early stages of the development process, improving their quality at a lower cost.

ACKNOWLEDGMENT

We would like to thank all of the students who participated in the execution of the empirical study. We would also like to thank Priscila Fernandes and Natasha Costa for their aid in the discrimination meeting.

REFERENCES

[1] Basili, V., Rombach, H.: The TAME project: towards improvement-oriented software environments. IEEE Transactions on Software Engineering, Volume 14, Issue 6, 1988.
[2] Conte, T., Massollar, J., Mendes, E., Travassos, G.: Web usability inspection technique based on design perspectives. IET Software, Volume 3, Issue 2, 2009.
[3] Dyba, T., Kampenes, V., Sjoberg, D.: A systematic review of statistical power in software engineering experiments. Information and Software Technology, Volume 48, Issue 8, 2006.
[4] Fernandez, A., Abrahao, S., Insfran, E.: Towards to the validation of a usability evaluation method for model-driven Web development. Proc. IV International Symposium on Empirical Software Engineering and Measurement, USA, 2010, pp. 54-57.
[5] Fernandez, A., Insfran, E., Abrahao, S.: Usability evaluation methods for the Web: A systematic mapping study. Information and Software Technology, Volume 53, Issue 8, 2011.
[6] Fons, J., Pelechano, V., Pastor, O., Valderas, P., Torres, V.: Applying the OOWS model-driven approach for developing Web applications: The Internet Movie Database case study. In: Rossi, G., Schwabe, D., Olsina, L., Pastor, O.: Web Engineering: Modeling and Implementing Web Applications, Springer, 2008.
[7] Juristo, N., Moreno, A., Sanchez-Segura, M.: Analyzing the impact of usability on software design. Journal of Systems and Software, Volume 80, Issue 9, 2007.
[8] Kappel, G., Proll, B., Reich, S., Retschitzegger, W.: An Introduction to Web Engineering. In: Kappel, G., Proll, B., Reich, S., Retschitzegger, W.: Web Engineering: The Discipline of Systematic Development of Web Applications, John Wiley & Sons, 2006.
[9] Kumar, V.: Social Coupons as a Marketing Strategy: A Multifaceted Perspective. Journal of the Academy of Marketing Science, Volume 40, Issue 1, 2012.
[10] Matera, M., Rizzo, F., Carughi, G. T.: Web Usability: Principles and Evaluation Methods. In: Mendes, E., Mosley, N.: Web Engineering, Springer, 2006.
[11] Mendes, E., Mosley, N., Counsell, S.: The Need for Web Engineering: An Introduction. In: Mendes, E., Mosley, N.: Web Engineering, Springer, 2006.
[12] Nielsen, J.: Finding usability problems through heuristic evaluation. Proc. CHI'92, UK, 1992, pp. 373-380.
[13] Oztekin, A., Nikov, A., Zaim, S.: UWIS: An assessment methodology for usability of Web-based information systems. Journal of Systems and Software, Volume 82, Issue 12, 2009.
[14] Polson, P., Lewis, C., Rieman, J., Wharton, C.: Cognitive walkthroughs: a method for theory-based evaluation of user interfaces. International Journal of Man-Machine Studies, Volume 36, Issue 5, 1992.
[15] Rivero, L., Conte, T.: Characterizing Usability Inspection Methods through the Analysis of a Systematic Mapping Study Extension. Proc. IX Experimental Software Engineering Latin Workshop, Argentina, 2012.
[16] Rivero, L., Conte, T.: Web DUE technique: Usability Verification Items per Web Page Zone. Technical Report RT-USES-2012-001, 2012. Available at: http://www.dcc.ufam.edu.br/uses/index.php/publicacoes/cat_view/69-relatorios-tecnicos.
[17] Rocha, H., Baranauskas, M.: Design and Evaluation of Human Computer Interfaces (Book), Nied, 2003. (in Portuguese)
[18] Shull, F., Carver, J., Travassos, G. H.: An empirical methodology for introducing software processes. ACM SIGSOFT Software Engineering Notes, Volume 26, Issue 5, 2001.
[19] SPSS (2010). SPSS version 19.0.0 for Windows. Chicago, SPSS Inc.
[20] Triacca, L., Inversini, A., Bolchini, D.: Evaluating Web usability with MiLE+. Proc. 7th IEEE International Symposium on Web Site Evolution, Hungary, 2005, pp. 22-29.
[21] Wohlin, C., Runeson, P., Host, M., Ohlsson, M., Regnell, B., Wesslen, A.: Experimentation in software engineering: an introduction (Book), Kluwer Academic Publishers, 2000.
[22] Zhang, Z., Basili, V., Shneiderman, B.: Perspective-based Usability Inspection: An Empirical Validation of Efficacy. Empirical Software Engineering, Volume 4, Issue 1, 1999.