Near-Field Distance Perception in Real and Virtual Environments Using Both Verbal and Action Responses

PHILLIP E. NAPIERALSKI, BLISS M. ALTENHOFF, JEFFREY W. BERTRAND, LINDSAY O. LONG, SABARISH V. BABU, CHRISTOPHER C. PAGANO, JUSTIN KERN, and TIMOTHY A. DAVIS, Clemson University

Few experiments have been performed to investigate near-field egocentric distance estimation in an Immersive Virtual Environment (IVE) as compared to the Real World (RW). This article investigates near-field distance estimation in IVE and RW conditions using physical reach and verbal report measures, with an apparatus similar to that used by Bingham and Pagano [1998]. Analysis of our experiment shows distance compression in participants' perceptual judgments to targets in both the IVE and RW conditions. This is consistent with previous research in both action space in an IVE and reach space with Augmented Reality (AR). Analysis of verbal responses revealed that participants underestimated significantly less in the virtual world than in the RW. We also found that verbal reports and reaches provided different results in both IVE and RW environments.

Categories and Subject Descriptors: I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism—Virtual reality; I.4.8 [Image Processing and Computer Vision]: Scene Analysis—Depth cues; H.5.1 [Information Interfaces and Presentation]: Multimedia Information Systems—Artificial, augmented, and virtual realities; H.1.2 [Information Systems]: User/Machine Systems—Human factors

General Terms: Human Factors

Additional Key Words and Phrases: Depth perception, distance estimation, virtual reality, immersive virtual environments, human factors and usability

ACM Reference Format: Napieralski, P. E., Altenhoff, B. M., Bertrand, J. W., Long, L. O., Babu, S. V., Pagano, C. C., Kern, J., and Davis, T. A. 2011. Near-field distance perception in real and virtual environments using both verbal and action responses. ACM Trans. Appl. Percept. 8, 3, Article 18 (August 2011), 19 pages. DOI = 10.1145/2010325.2010328 http://doi.acm.org/10.1145/2010325.2010328

1. INTRODUCTION

[This research was supported in part by NSF Research Experience for Undergraduates (REU) Site Grant CNS-0850695. Authors' addresses: P. E. Napieralski (corresponding author), B. M. Altenhoff, J. W. Bertrand, L. O. Long, S. V. Babu, C. C. Pagano, J. Kern, and T. A. Davis, School of Computing and Department of Psychology, Clemson University, Clemson, SC; email: [email protected]. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or [email protected]. © 2011 ACM 1544-3558/2011/08-ART18 $10.00 DOI 10.1145/2010325.2010328 http://doi.acm.org/10.1145/2010325.2010328 ACM Transactions on Applied Perception, Vol. 8, No. 3, Article 18, Publication date: August 2011.]

Nearly fifty years ago, Ivan Sutherland created the first stereoscopic Head-Mounted Display (HMD) and presented a vision for the future of Virtual Reality (VR) systems [Sutherland 1965]. Today, some of Sutherland's goals are finally realized in many state-of-the-art VR systems for entertainment, education, and the study of human behavior [Brooks 1999]. Complex VR systems in which users interact with virtual entities in personal space include applications such as VR therapy [Hodges et al. 2001], interpersonal communication trainers [Johnson et al. 2006], and urban combat simulators [Hill et al. 2003].

Near-field VR has many exciting applications in simulation and telepresence. Training for laparoscopic surgery, for instance, can be performed with Augmented Reality (AR) and VR environments [Peters et al. 2008]. Studies have shown that such virtual training improves operating room performance more effectively than traditional training methods [Seymour 2008]. Telepresence, the area of VR that uses technology to give a user the impression of being somewhere other than his or her current location, remains an important area of research and applications [Hine et al. 1994; Goza et al. 2004]. In one such application, operators using an HMD and a joystick can control an undersea robot [Hine et al. 1994]. Using these robotic controls, operators remotely pick up and investigate near-field undersea objects from the robot's perspective.

In order to assess the performance of users in systems where they operate in simulated near-space conditions, a thorough understanding of perceptual influences such as near-field distance estimation is crucial. In this article we attempt to bridge the gap in understanding the differences in near-field distance estimation in real and virtual environments using physical reach and verbal response measures.

1.1 Background

Distance estimation can be categorized into three distinct regions: personal space, or near field, is the distance from 0 m to slightly beyond arm's reach; action space extends to 30 m; and vista space is greater than 30 m [Cutting and Vishton 1995].
Recent research on egocentric distance estimation in Immersive Virtual Environments (IVEs) has focused on action space, measured using blind-walking, imagined timed-walking, bean bag throwing, and triangulated walking techniques [Grechkin et al. 2010; Ziemer et al. 2009; Klein et al. 2009; Richardson and Waller 2007; Interrante et al. 2006; Messing and Durgin 2005; Sahm et al. 2005; Thompson et al. 2004; Loomis and Knapp 2003]. Current research in action space has shown that people can accurately estimate distances up to 20 m in the Real World (RW) but grossly underestimate distances to targets in the virtual world [Witmer and Kline 1998; Loomis and Knapp 2003].

Grechkin et al. [2010] compared distance estimation in various presentation conditions, including a RW view with a see-through HMD, a virtual world with an HMD, AR with an HMD, a photo viewed with a Large-Screen Immersive Display (LSID), and an IVE viewed with an LSID. These conditions were tested using both imagined timed-walking and blind-walking in two different experiments. The LSID conditions were excluded from the blind-walking experiment due to space constraints. Distances were underestimated in all VR, LSID, and AR conditions. Interestingly, both the photograph and IVE LSID conditions produced similar underestimation, which implies that the quality of the graphics has no effect with imagined timed-walking in these nonstereoscopic setups. Also interesting is that the see-through HMD condition had no effect in the RW with imagined timed-walking, but it showed some underestimation with blind-walking. This underestimation could partially, but not fully, be caused by the weight and forces from the HMD during the actual walking [Willemsen et al. 2009]. Further, presentation order was also shown to be a significant factor in action space distance estimation [Ziemer et al. 2009]. Ziemer et al. [2009] had participants perform imagined timed-walking in a RW setting followed by an IVE setting, and vice versa.
Conditions were performed 3–4 minutes apart. Participants underestimated the distances to targets in the RW condition after performing distance estimation in the IVE condition, showing a carryover effect from the IVE to the RW. Furthermore, they noted that this underestimation of distance judgments in the RW, after exposure to an IVE, was still less than the underestimation typically observed in participants who received only the IVE condition.


Kunz et al. [2009] showed there was no significant difference in blind-walking distance judgments between a low- and a high-quality VR environment, but there was a significant difference in verbal reports of distances. The low-quality scenario contained simple textures and simple geometry, while the high-quality scenario contained photorealistic textures and realistic geometry. When participants were asked to give a verbal report of distances for each trial, the distances reported in the low-quality setting were much shorter than those in the high-quality virtual environment. The authors cite three potential reasons for this, including the "two visual systems" model, which hypothesizes that different neurological streams are responsible for verbal reports and actions [Milner and Goodale 1995, 2008].

Research done in personal space for AR has shown a similar underestimation result using a slider apparatus for measuring distance judgments [Singh et al. 2010; Ellis and Menges 1998]. The slider apparatus is able to test both visually closed-loop reaching, where the subject reaches with sight of an LED pointer, and open-loop reaching, where the subject reaches without sight of the pointer. In all cases, the findings show an underestimation, with more pronounced underestimation in the open-loop task [Singh et al. 2010].

Other research in personal space has tested monocular and binocular vision under reduced-cue conditions, with a target often appearing as a small point of light or luminous disk [Foley 1997; Bingham and Pagano 1998]. A general finding is that with monocular viewing, perceived distances tend to underestimate actual distances, and verbal responses provide less accurate and more variable responses than manual pointing or reaching [Foley 1977; Pagano and Bingham 1998]. Depending on the number and type of cues eliminated, overestimations in near space have also been found [Foley 1985].
With feedback, however, binocular reaches become accurate while monocular reaches remain inaccurate [Bingham and Pagano 1998]. It seems likely that reaches in near space can be accurate in all viewing conditions (RW, IVE, AR, etc.) so long as ample perceptual information is available and feedback is used to calibrate away any initial errors. It is unclear whether verbal responses can provide a reliable measure of egocentric distance perception.

1.2 Related Work

Distance estimation in an IVE has been widely studied in action space. The most common techniques used when measuring distance estimation with an HMD are blind-walking, throwing, and triangulated walking. Blind-walking involves allowing a user to first view a target and then walk to the target with eyes closed [Messing and Durgin 2005; Loomis and Knapp 2003]. Throwing allows the subject to view the target as in blind-walking, but the subject instead throws an object, again with eyes closed, towards the viewed target [Sahm et al. 2005]. A final common technique, triangulated walking, has the subject walk in a direction different from the target, stop after some distance, and then walk to the perceived target [Richardson and Waller 2007; Thompson et al. 2004]. Messing and Durgin [2005] show a 23% compression of the actual distance when blind walking to targets in an IVE. Sahm et al. [2005] show a 30% compression of distance in the IVE versus the RW. Finally, Richardson and Waller [2007] show a 54% compression of the actual distance in the virtual environment with the triangulated walking method.

To the best of our knowledge, this article is one of the first to investigate physical arm reaches in egocentric reach space in an IVE. The closest previous research is that by Bingham and Pagano [1998] and Singh et al. [2010]. In Bingham and Pagano [1998], participants used a monocular HMD to view a luminous disk that was floating in black space in the RW.
Under these conditions depth perception was only possible when the head was moved. Participants were asked to look at a target located from 50% to 90% of their maximum arm reach, close their eyes, and then make a physical reach with a stylus to where they perceived the target to be. A restricted Field Of View (FOV) resulted in compressions of depth for monocular RW viewing. With a monocular restricted FOV the subjects showed improvement over trials when provided with feedback, and thus the compression in depth due to the restricted FOV was removed with calibration. In contrast, the compression due to monocular viewing alone (without restricted FOV) was not removed by calibration with feedback.

In Singh et al. [2010], reach space distance estimation in AR was tested using a modified slider apparatus from Ellis and Menges [1998]. Participants were shown a virtual spinning diamond target and were asked to align a Light-Emitting Diode (LED) on a slider to the virtual target in the closed-loop task, or to reach to the perceived distance using a slider under the table in the open-loop task. Participants' vision of the virtual object was occluded before giving their estimations in half the trials. In general, both open-loop and closed-loop estimations resulted in underestimations of the actual distances. Based on previous research on distance estimation in action space and reach space, it seems likely that an investigation of reach space in an IVE that employed the same setup as Pagano and Bingham [1998] would result in underestimations of distance. Ellis and Menges [1997] showed that convergence may have some effect on this depth perception in a virtual environment.

In this article, we have conducted an experiment to examine a subject's distance estimation in an IVE using an apparatus similar to that used by Bingham and Pagano [1998; Pagano and Bingham 1998]. Our contribution includes examining egocentric distance estimation in personal space using two viewing conditions, RW and an IVE, and two response methods, verbal reports and reaches. Research investigating different types of responses has typically employed a blocked methodology in which responses made in one mode during one set of trials are compared with the responses made in another mode during a different set of trials [Philbeck and Loomis 1997].
This comparison is even made between subjects [Kunz et al. 2009; Mon-Williams and Tresilian 1999; Wang 2004]. A problem with these methodologies is that differences observed between the two responses may be due to the fact that they were made at different times, under different conditions, and/or by different subjects. A much stronger test of differences between the two responses is to test them within-trial, where the subject makes a single observation of a target distance and then, from this single observation, makes both a verbal judgment and a manual reach. In this way the two responses are tested simultaneously. To date only a small handful of studies have used such a within-trial methodology to compare verbal reports and action responses made at the same time [Pagano and Bingham 1998; Pagano et al. 2001; Pagano and Isenhower 2008]. Following this past work, we chose a within-trial methodology employing both verbal reports and manual reaches to compare egocentric distance perception in the RW and an IVE.

The remainder of the article is structured as follows. Section 2 discusses the experiment design, apparatus, and procedure. Section 3 describes the results of the experiment. Section 4 provides an analysis of the results. Finally, Section 5 concludes.

2. EXPERIMENT

The primary aim of this study was to compare physical and verbal distance judgments in near-field, or reaching, space between the RW and an IVE. We specifically asked the following research questions:

(1) Are near-field distances perceived differently in real and virtual environments?
(2) What are the differences between physical reaches and verbal responses to near-field distances in both real and virtual environments?

Participants made verbal reports concurrently with reaches in either the RW or the IVE condition. In our study these two types of responses were made immediately after viewing the target. Visually perceived egocentric distances were measured in each trial using both verbal reports and reaches after vision was occluded. In the following subsections the experiment design, apparatus, participants, and procedure will be detailed.


Fig. 1. The image shows our near-field distance estimation apparatus. The target, participant’s head, and stylus are tracked in order to record actual and perceived distances of physical reach in both the IVE and RW conditions.

2.1 Participants

The experiment included 14 volunteers from a population of Clemson University students. The participants ranged in age from 19 to 24; the mean age was 20.5. Three were female and 11 were male. Each participant was tested over two sessions at least two days apart to eliminate carryover effects. Half of the participants performed the IVE condition first, while the other half performed the RW condition first. While the majority of participants did not experience difficulty learning the experiment, we did not use data for three participants who did not follow the instructions.

2.2 Experiment Design

The experiment used a within-subjects 2 (condition) × 2 (measure) factorial design, where participants were initially assigned to one of two conditions (RW or IVE). After completing trials in one condition, participants returned two days later to complete the other condition. The experiment was conducted in this manner in order to eliminate any carryover or learning effects across the two conditions. In both the RW and IVE conditions, participants were presented with five random permutations of six target distances corresponding to .50, .58, .67, .75, .82, and .90 of the participant's maximum reach, for a total of 30 trial distances, as in Pagano and Bingham [1998], Pagano et al. [2001], and Pagano and Isenhower [2008]. The primary dependent variable was distance as a percentage of maximum arm reach for both verbal reports and physical reaches. The verbal reports were made on a scale from 0 to 100, where 0 represented the subject's shoulder and 100 their maximum arm reach. Verbal reports in intrinsic body-scaled units should be more natural than the use of extrinsic scales such as inches or centimeters, with an extrinsic scale likely requiring an unconscious transformation from an intrinsic one [Bingham and Stassen 1994; Warren 1995].

2.3 Apparatus and Materials

2.3.1 General Setup. Figure 1 depicts the apparatus used.
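The trial schedule from the design in Section 2.2 (five random permutations of six fixed fractions of each participant's maximum reach) can be sketched as follows. This is an illustrative reconstruction, not the study's code; the function name and centimeter scaling are our own.

```python
import random

def generate_trial_distances(max_reach_cm, n_blocks=5):
    """Build the 30-trial schedule: n_blocks random permutations of six
    target distances expressed as fractions of maximum arm reach."""
    fractions = [0.50, 0.58, 0.67, 0.75, 0.82, 0.90]
    trials = []
    for _ in range(n_blocks):
        block = fractions[:]          # copy so each block is shuffled fresh
        random.shuffle(block)         # random permutation within the block
        trials.extend(block)
    # Scale to the individual participant's maximum reach, in cm.
    return [f * max_reach_cm for f in trials]
```

Because each block is a permutation of the same six fractions, every distance appears exactly five times over the 30 trials, while the presentation order within each block is unpredictable.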
Participants were seated in a wooden chair, and their shoulders were strapped to the back of the chair so as to allow freedom of movement of the head and arm while restricting motions of the shoulder. Participants reached with a wooden stylus that was 26.5 cm long and 0.9 cm in diameter, weighing 65 g. The participants held the stylus in their right hand so that it extended approximately 3 cm in front of and 12 cm behind their closed fist. Each trial began with the back end of the stylus inserted in a 0.5 cm groove on top of the launch platform. The launch platform was located next to the hip of the participant, oriented parallel to the optical axis.

The target consisted of a 0.5 cm deep, 8.0 cm × 1.2 cm vertical groove extending from the center to the base of an 8.0 cm wide × 16 cm tall white rectangular target (Figure 2). The edges of the target were covered with 0.5 cm thick black tape so that the participant could distinguish the target from the stucco background of the wall. The target was positioned in front of the participant along the optical axis, approximately midway between the participant's midline and right shoulder (Figure 1). Therefore, the target was positioned such that the distance from the shoulder to the target would be as close as possible to the distance from the eyes to the target. The egocentric near-field distance to the target was adjusted by the experimenter using mounts attached to a 200 cm optical rail that extended parallel to the participant's optical axis. The target was attached to the optical rail via an adjustable hinged stand. The target, stand, and stylus were made of wood. The aluminum optical rail was mounted on a wooden base.

Fig. 2. The image on the left shows a screenshot of the virtual target, and the image on the right shows the real target as perceived by participants in the IVE and RW conditions, respectively.

2.3.2 Visual Aspects. In the IVE condition, participants wore a Virtual Research VR 1280 HMD weighing 880 g. The HMD contained two LCOS displays, each with a resolution of 1280 × 1024 pixels, for viewing a stereoscopic virtual environment. The field of view of the HMD was determined to be 48 degrees horizontal and 36 degrees vertical. The field of view was determined by rendering a carefully registered virtual model of a physical object and asking users to repeatedly report the relative size of the virtual object against its physical counterpart through a forced-choice method.
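The FOV measurement above rests on simple viewing geometry: a registered object of known extent and distance subtends a visual angle of 2·atan(extent / (2·distance)). A minimal sketch of that relation (the function name and units are ours, not from the study):

```python
import math

def angular_size_degrees(extent_cm, distance_cm):
    """Visual angle subtended by an object of the given extent centered
    at the given viewing distance: theta = 2 * atan(extent / (2 * d))."""
    return math.degrees(2.0 * math.atan2(extent_cm / 2.0, distance_cm))
```

For example, an object that just fills the display horizontally when matched at distance d implies a horizontal FOV of angular_size_degrees(w, d) for the matched width w.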
In the RW condition, participants donned a 352 g field-of-view occluder that restricted the field of view of the participants to visually match that of the HMD (Figure 3). At the beginning of the experiment a calibration step was performed for each participant to ensure that his or her eyes were centered on the HMD display screens. A calibration pattern consisting of a concentric series of colored rectangular rings was displayed on the screen. Participants were asked to adjust the HMD so that the device was snug and they were able to see equal amounts of the same color at the top and bottom of the screen in both eyes. Next, they adjusted the HMD's eyepieces to center each eye horizontally on its screen by adjusting the inter-pupillary distance (IPD) knobs on the HMD.

Fig. 3. The image shows a participant wearing the Field Of View (FOV) occluder with an electromagnetic sensor to track the user's head position in the RW condition. The FOV occluder was designed to match the FOV of the HMD.

Researchers have theorized that the quality of the virtual environment could potentially be an important factor in perceiving distances [Loomis and Knapp 2003; Kline et al. 2009]. Our goal was to provide a photorealistic representation of the physical environment in which the real-world perception tests were performed. According to Ferwerda's classification scheme of visual realism [Ferwerda 2003], physical realism would be preferable; however, it requires that the image provide the same visual stimulation as the real-world scene, including accurate replication of spectral irradiance, which is not currently possible given the limitations of interactive rendering on a head-mounted display. At the other end of the visual scale is functional realism, which only requires that the image provide the same visual information as the scene. As such, this level of detail can be achieved with a variety of techniques, including nonphotorealistic and abstract rendering methods [Haller 2004; Phillips et al. 2009]. Between these two varieties of visual realism lies photorealism, in which the image produces the same visual response as the real-world scene.

In order to keep the visual experiment scene as consistent as possible between the IVE and RW conditions, we strove to model and render the virtual setting to be similar to the physical setting. An accurate virtual replica of the experiment apparatus and surrounding environment was modeled using Blender. The virtual replica included the target, stand, chair, room, tracking system, and stylus; a virtual body representing the participant was also modeled. The gender-neutral model of a virtual body seated in the participant's chair was meant to provide the participant with an egocentric representation of self whenever the participant glanced down at himself or herself (see Figure 4).

Fig. 4. The left image shows a screenshot of the training environment from the participant's first-person perspective with the HMD. The right image shows a screenshot of the avatar as seen from the participant's perspective.


We attempted to achieve this level of realism by not only matching the size and placement of objects located in the real-world environment exactly, but by matching the textures and lighting as well. The accuracy of the scale and size of the virtual objects in the IVE experiment setup was ensured by careful hand measurements of each of the physical objects in the RW experiment setup. Many of the textures of the synthetic world are simply photographs of the real-world objects. Great care was taken to match the objects exactly, especially those that were involved in the experiment, such as the virtual target, as shown in Figure 2. We also employed state-of-the-art rendering techniques, such as radiosity and render-to-texture, to match the visual quality of the virtual environment and apparatus to the physical experiment setting as closely as possible. These efforts were largely undertaken to prevent any adverse effects on perception in the virtual world, which can occur in nonphotorealistic virtual environments [Phillips et al. 2009].

The computational environment that hosted the distance estimation system consisted of a Dell Precision workstation with a quad-core processor and dual NVIDIA Quadro FX 5600 SLI graphics cards. The distance estimation system, which rendered the IVE condition in HMD stereo, ran the tracking system, and measured as well as recorded the perceived physical reaches in tracker coordinates in both conditions, was developed in OpenGL and the Simple Virtual Environment toolkit (SVE) [Kessler et al. 2000]. The distance estimation experiment system ran at an application frame rate of 45 Hz.

2.3.3 Tracking and Measurement of Physical Reach. A 6-degree-of-freedom Polhemus Liberty electromagnetic tracking system was used to track the position and orientation of the participant's head, the stylus, and the target in both the IVE and RW conditions.
Prior to conducting the experiment, the Polhemus tracking system was calibrated to minimize any interference due to metallic objects in the physical environment through the creation of a distortion map, using a calibration apparatus and proprietary software from the manufacturer of the tracking system. This calibration step ensured that the sensor position reported by the tracking system was accurate to 0.1 cm and the sensor orientation was accurate to 0.15 degrees. The participant's physical reach was measured in centimeters (cm) from the position of the target face to the origin of the optical rail, as reported by the tracking system, in both conditions. Raw position and orientation values of the tracked sensors, as well as the measured perceived and actual distances for each trial, were logged in a text file by the experiment system in both the IVE and RW conditions for each participant. These data were later used to analyze the results of the experiment. To ensure proper registration of the virtual target and stylus with their real counterparts, we carefully aligned the virtual objects' coordinate system with that of the tracking sensors' coordinate system. We also determined the relationship between the coordinate system of the tracking sensor on the participant's head (on top of the HMD) and the coordinate system of the HMD's display screen (the computer graphics view plane), to ensure proper registration of the virtual environment to the physical environment as perceived by the participant.

2.3.4 Sensor Data Analysis. Rather than analyzing speed profiles from the raw data, the sensor position and orientation for the HMD, the stylus, and the target face at the end of each reach were logged by the experimenter via a key press. A key was pressed when the participant's arm was fully extended at the end of each trial, indicating the physical reach response of the estimated distance to the target, before the arm was brought back to the loading dock.
It has been suggested that the initial gross motion phase of the participant's hand more closely denotes the visually perceived distance than the end of the secondary hand motion phase of fine adjustment [Bingham and Pagano 1998]. After conducting the experiment, it was therefore important to empirically evaluate how well the data logged by the experimenter with a key press aligned with the end of the gross movement phase of the physical reach made by participants. A data filtering operation similar to that in Bingham and Pagano [1998], Pagano et al. [2001], and Pagano and Isenhower [2008] was performed, using the raw data to analyze the kinematic profiles of the sensor movements. Raw position data from the sensors were filtered using forward and backward passes of a second-order Butterworth filter with a resulting cutoff at 5 Hz. The speed between adjacent raw data entries was then computed. A graph of the speed calculations showed two gross movement phases by each participant: swinging the hand to the target, and returning the hand back to the loading dock. The position of the stylus with speed closest to zero after the first gross movement phase was taken to be the end of the participant's distance estimate. This analysis was done on 90 trials from three subjects, and we compared the position of the stylus at the end of the first gross motion phase, as determined by the preceding analysis, to the position of the stylus logged via a key press by the experimenter. A linear regression on all the data showed a strong correlation (r² = .974) between the key press sensor data and the raw data, indicating that the experimenters' key press logs were a reliable method of recording the position of the hand denoting the distance judged by the participants via physical reach at the end of the gross motion phase.

2.4 Procedure

As mentioned previously, each participant was involved in two separate sessions, one in the IVE and one in the RW, at least two days apart. At the beginning of the first session, each participant completed a standard consent form and a brief demographic survey. A Stereo Fly SO-001 stereo vision test was used to assess the participant's stereo acuity. Participants were asked to describe a fly that could be perceived as raised above the plane of the image with passive stereo glasses, and then were asked to catch the wings of the fly.
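The kinematic filtering and endpoint detection described in Section 2.3.4 can be sketched as follows. This is a minimal reconstruction using SciPy, not the study's code; the function name, the assumed sampling rate, and the near-zero speed threshold are our own choices (scipy.signal.filtfilt performs the forward and backward Butterworth passes described above).

```python
import numpy as np
from scipy.signal import butter, filtfilt

def reach_endpoint(positions, fs=45.0, cutoff=5.0):
    """Index of the end of the first gross movement phase of a reach.

    positions: (N, 3) array of raw stylus sensor positions (cm), sampled
    at fs Hz. Returns the sample index where the stylus speed first drops
    near zero after the initial swing toward the target.
    """
    # Zero-phase low-pass filtering: filtfilt runs the second-order
    # Butterworth filter forward and backward (5 Hz resulting cutoff).
    b, a = butter(2, cutoff / (fs / 2.0))
    smoothed = filtfilt(b, a, positions, axis=0)

    # Speed between adjacent samples (cm/s).
    speed = np.linalg.norm(np.diff(smoothed, axis=0), axis=1) * fs

    # The first gross phase is the swing toward the target; the distance
    # estimate is where speed first falls near zero after that swing.
    peak = int(np.argmax(speed))
    rest = np.flatnonzero(speed[peak:] < 0.05 * speed[peak])
    return peak + int(rest[0]) if rest.size else len(speed)
```

On a profile with a fast outbound swing, a pause at full extension, and a slower return, the detected index lands in the pause, before the return swing, which matches the rule of taking the near-zero speed sample after the first gross movement phase.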
All participants passed the stereo acuity test and were able to perceive the fly as hovering above the stereo test image plane. All participants were right-handed and had normal vision, or vision corrected to at least 20/32. After passing the necessary vision tests, the participant was seated and loosely strapped into a chair to restrict movement of the trunk and shoulders while allowing movement of the arm. In both conditions, the participant's maximum arm reach was measured by instructing the participant to place the stylus in the groove of the target. The target was then moved forward or backward until the subject's arm was fully extended and the stylus was perpendicular to the floor. This maximum arm reach distance was used to generate the trial distances at which the apparatus would be placed during the experiment. The participant was also instructed on how to make verbal reports.

In the IVE condition, the participant's inter-pupillary distance was measured with a ruler and passed as a parameter to the experiment program to specify the inter-pupillary distance in the graphical simulation. The HMD was then placed on the participant and calibrated using an image of concentric rectangles. The participant was instructed to adjust the HMD and to rotate two knobs in the front to focus the image until the rectangles in both displays were aligned. Once the participant was satisfied and the HMD was fastened to the head, an IVE training environment was presented to help the subject adjust to using the device and the head-coupled motion. The training environment was a near-perfect replica of the real-world testing environment, except that the testing apparatus could not be seen. Additionally, the training environment included a few objects not present in the RW, such as a television and a poster. The participant was asked to move his or her head around in order to view the objects in this environment for a minute.
The participant was then asked simple questions to ensure he or she had properly adjusted to the head motions and the viewing conditions of the IVE (e.g., What is on the television? What time is on the clock?). See Figure 4 for a screenshot of this training environment. After this training phase, one of the experimenters pressed a keyboard key to initiate the testing environment.
ACM Transactions on Applied Perception, Vol. 8, No. 3, Article 18, Publication date: August 2011.
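The endpoint-detection analysis described in Section 2.3 can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name, the sampling rate, and the synthetic data are hypothetical, and SciPy is assumed; only the filter design (second-order Butterworth, forward and backward passes, 5 Hz cutoff) and the speed-closest-to-zero rule come from the text.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def reach_endpoint(positions, fs):
    """Sketch of the endpoint analysis described in the text.

    positions : (n, 3) array of raw stylus positions (cm)
    fs        : tracker sampling rate (Hz)

    Low-pass filter the raw data, compute speed between adjacent
    samples, and take the sample where speed is closest to zero
    after the first gross movement phase (the outward swing).
    """
    b, a = butter(2, 5.0 / (fs / 2.0))          # 2nd-order Butterworth, 5 Hz cutoff
    smooth = filtfilt(b, a, positions, axis=0)  # forward + backward pass (zero phase)
    speed = np.linalg.norm(np.diff(smooth, axis=0), axis=1) * fs
    # Simplification: assume the outward swing is the fastest phase.
    peak = np.argmax(speed)
    end = peak + np.argmin(speed[peak:])        # speed closest to zero afterward
    return smooth[end + 1]

# Synthetic check: reach out to 30 cm, pause at the target, return slowly.
fs = 100.0
x = np.concatenate([np.linspace(0.0, 30.0, 40),   # outward swing (fast)
                    np.full(30, 30.0),            # pause at the target
                    np.linspace(30.0, 0.0, 60)])  # slower return to the dock
positions = np.stack([x, np.zeros_like(x), np.zeros_like(x)], axis=1)
endpoint = reach_endpoint(positions, fs)          # lands near x = 30
```

The zero-phase `filtfilt` pass matters here: a one-directional filter would shift the speed minimum in time and bias the recovered endpoint.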

Table I. Slopes and Intercepts (in arm length units) for Simple Regressions Predicting Indicated Target Distance from Actual Target Distance in Each Viewing Condition (IVE or RW) and Response Condition

                  Verbal Reports                        Reaches
              IVE              RW                IVE              RW
Subject   Slope  Intercept  Slope  Intercept  Slope  Intercept  Slope  Intercept
1         1.42   −63.7      .92    −44.5      .60    11.9       1.11   −27.1
2         1.35   −17.7      1.18   −36.8      1.04   −.5        .93    −9.0
3         .68    3.3        .87    −9.2       .88    17.9       .77    −3.8
4         1.63   −58.0      .90    −33.2      .78    12.1       .66    13.8
5         .95    −20.6      1.00   −14.8      .72    −7.2       .68    6.6
7         1.18   −45.6      1.31   −53.4      .85    −3.3       .43    17.2
9         1.44   −49.8      1.24   −40.5      1.00   −19.2      .73    −2.0
10        .77    −19.2      .71    −12.8      .78    .5         .63    27.5
11        1.17   −35.0      1.06   −13.2      .73    −9.8       1.06   −18.1
12        1.57   57.4       1.01   −28.5      .86    6.5        .70    10.1
14        1.55   −48.5      1.11   −38.1      1.01   −9.1       .74    4.0
Overall   1.25   −27.0      1.03   −29.6      0.84   0.0        0.77   1.7

The testing environment consisted of a photorealistic virtual representation of the real environment surrounding the participant. Instructions on how to make reaches and verbal reports were repeated to the participant for each trial. Each subject was given at least one practice trial before collection of the experiment data began. For each trial, with the participant's eyes closed or the HMD display turned off, the target distance was adjusted. The participant then viewed the target until he or she felt comfortable with the target distance, and notified us by saying "ready." At this point in the RW condition, the subject closed his or her eyes. In the IVE condition, the HMD video was turned off via a key press and the target was immediately swung out of the way to prevent any haptic feedback. The participant first made a verbal report, immediately followed by a physical reach using the stylus, much like in Pagano and Bingham [1998], Pagano et al. [2001], and Pagano and Isenhower [2008]. The experimenter at the keyboard then pressed a key to record all of the sensor data from the tracking system pertaining to the positions of the stylus (hand), target face, and head to a log file. To reduce aural cues about the target position during adjustment of the target on the optical rail for the next trial, white noise was played in the participant's headphones; this sound also cued the participants to return the hand to the stylus loading dock in preparation for the next trial. The next trial number was then read and the target distance adjusted, with the participant's eyes closed and the HMD display turned off. After 30 trials, some participants were asked to repeat particular trials if, for instance, they made the verbal report and reach in the wrong order.

3. RESULTS

The slopes and intercepts of the functions predicting indicated target distance from actual target distance for the individual subjects in each condition are presented in Table I. Multiple regression techniques were used to determine whether the slopes and intercepts differed between the two viewing conditions and between the two response measures. Multiple regressions are preferable to ANOVAs because they allow us to predict a continuous dependent variable (indicated target distance) from both a continuous independent variable (actual target distance) and a categorical variable (condition), along with the interaction of these two; with ANOVAs we are restricted to treating all of the independent variables as categorical. Also, the slopes and intercepts given by regression techniques are more useful than other

Fig. 5. Physical reaches (top) and verbal estimates (bottom) as a function of the actual target distances for IVE and RW viewing. Note that there is a significant effect of condition for reaches, though it is very small.

descriptive statistics, such as condition means and signed error, because they describe the function that takes one from the actual target distances to the perceived target distances.

3.1 Comparing RW and IVE

3.1.1 Reaches. Overall, the slopes for the reaches were .77 and .84 for the RW and IVE sessions, respectively; the intercepts were 1.74 and −0.03 (in arm length units). Figure 5 (top) depicts the relation between actual target distance and the distances reported via reaches for the two sessions; each point in Figure 5 represents the judgments made by an individual subject to a given target distance. A multiple regression confirmed that the reaches made in the RW session were different from the reaches made in the IVE session. To test for differences between the slopes and intercepts of the two viewing conditions, this multiple regression used the actual target distances and viewing conditions (coded orthogonally) to predict the reach distances. The multiple regression was first performed with an actual target distance × condition interaction term, yielding r² = .419 (n = 655), with a partial F of 456.98 for actual target distance (p < .0001). The partial Fs for both viewing condition and the interaction term were less than 1, with a partial F of

.0005 for condition (p = .994) and .53 for the interaction (p = .466), respectively, with the partial F for viewing condition increasing to 12.62 (p < .0001) after the removal of the interaction term. Put simply, the partial F for actual target distance assesses the degree to which the actual target distances predict the variation in the responses after variation due to the other terms (viewing condition and the interaction) has already been accounted for; it thus tests for a main effect of actual target distance. The partial F for viewing condition assesses the degree to which the intercepts for the two sessions differ from each other, and thus tests for a main effect of viewing condition. The partial F for the interaction term assesses the degree to which the slopes for the two conditions differ from each other. Thus, the multiple regression revealed a statistically significant main effect of actual target distance, as well as a main effect of viewing condition (reaches made in the IVE versus reaches made in the RW), but did not reveal an interaction. Therefore, the slopes of the functions predicting reached distance from actual distance did not differ for the two viewing conditions, while their intercepts did. Overall, the reaches were slightly farther in the RW than in the IVE, but this difference was very small, only 1.8 cm on average. A simple regression predicting the reaches from actual target distance resulted in r² = .407 (n = 655), indicating that the difference between viewing in the RW or the IVE accounted for only 1.2% of the variance in the reaches. In sum, the reaches were very similar in the IVE and the RW.

3.1.2 Verbal Reports. The slopes of the functions predicting indicated target distance from actual target distance for the verbal judgments were 1.03 and 1.25 for the RW and IVE viewing conditions, respectively (see Figure 5, bottom). The intercepts were −29.5 and −27.0 (in arm length units), respectively.
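The regression logic described above regresses the responses on actual distance, an orthogonally coded condition term, and their interaction; the condition coefficient then captures an intercept difference between conditions, and the interaction coefficient a slope difference. A minimal sketch with synthetic data follows (all names and numbers here are hypothetical illustrations of the approach, not the authors' analysis):

```python
import numpy as np

def condition_regression(distance, response, condition):
    """Fit response = b0 + b1*distance + b2*condition + b3*(distance*condition).

    With condition coded orthogonally (-1 for one condition, +1 for the
    other), b2 tests for an intercept difference between conditions and
    b3 (the distance x condition interaction) tests for a slope
    difference.  Returns the coefficients and the model R^2.
    """
    X = np.column_stack([np.ones_like(distance), distance,
                         condition, distance * condition])
    beta, *_ = np.linalg.lstsq(X, response, rcond=None)
    resid = response - X @ beta
    r2 = 1.0 - resid.var() / response.var()
    return beta, r2

# Synthetic data: the +1 condition has a slightly steeper slope and
# higher intercept than the -1 condition.
rng = np.random.default_rng(0)
distance = rng.uniform(0.5, 1.0, 400)        # actual distances (arm units)
condition = np.repeat([-1.0, 1.0], 200)      # orthogonal coding
response = ((0.8 + 0.05 * condition) * distance
            + 0.02 * condition
            + rng.normal(0.0, 0.05, 400))    # additive response noise
beta, r2 = condition_regression(distance, response, condition)
```

Dropping the interaction column and refitting corresponds to the "after the removal of the interaction term" step in the text, which sharpens the test for an intercept (main) effect when no slope difference is present.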
A multiple regression predicting the verbal judgments from actual target distance and session was first performed with an actual target distance × condition interaction term, yielding r² = .507 (n = 655), with partial Fs of 636 for actual target distance (p < .0001), 1.97 for viewing condition (p = .16), and 6.44 for the interaction term (p = .011), with the partial F for viewing condition increasing to 29.15 (p < .0001) after the removal of the interaction term. This multiple regression confirmed that, unlike the reaches, the verbal judgments changed in both slope and intercept as a function of the viewing conditions. Overall, as the actual distances increased, the verbal reports increased at a much higher rate in the virtual world than in the RW, with slopes of 1.25 and 1.03, respectively. A simple regression predicting the verbal reports from actual target distance resulted in r² = .480 (n = 655), indicating that the difference between viewing in the RW or the IVE accounted for 2.7% of the variance in the verbal reports. In sum, the verbal reports were different in the IVE compared to the RW.

3.2 Comparing Reaches and Verbal Reports

3.2.1 RW Viewing. Next we compare the verbal reports to the reaches made within each of the two viewing conditions (see top of Figure 6). In the RW, the slopes of the functions predicting indicated target distance from actual target distance were 1.03 and .77 for the verbal reports and the reaches, respectively; the intercepts were −29.5 and 1.7 (in arm length units). A multiple regression predicting the judgments from actual target distance and response mode (verbal or reach) was first performed with an actual target distance × response mode interaction term, yielding r² = .570 (n = 655), with partial Fs of 524.73 for actual target distance (p < .0001), 34.05 for response mode (p < .0001), and 5.05 for the interaction term (p = .025).
This multiple regression confirmed that in the RW the verbal judgments were very different from the reaches made within the same trial, which were thus directed at the same target distance. Overall, as the actual distances increased, the verbal reports increased at a higher rate than the reaches, and this was accompanied by a large intercept difference. This is a very large effect. A simple regression predicting indicated target distance from actual target distance resulted in r² = .347 (n = 655), indicating that the difference between

Fig. 6. Interaction between actual target distance and verbal/reach estimates for (top) RW and (bottom) IVE. Both panels plot the estimate (% of arm's reach) against the actual distance (% of arm's reach) for the verbal and reach responses.

the reaches and the verbal reports accounted for 22.1% of the variance in the responses. In sum, in the RW the verbal reports and the reaches were different.

3.2.2 IVE Viewing. We also compare the verbal reports to the simultaneous reaches made within the IVE (see bottom of Figure 6). The slopes of the functions predicting indicated target distance from actual target distance were 1.25 and 0.84 for the verbal reports and the reaches, respectively; the intercepts were −27.0 and −0.03 (in arm length units). A multiple regression predicting the judgments from actual target distance and response mode (verbal or reach) was performed with an actual target distance × response mode interaction term, yielding r² = .503 (n = 655), with partial Fs of 536.11 for actual target distance (p < .0001), 50.23 for response mode (p < .0001), and 27.2 for the interaction term (p < .0001). This multiple regression confirmed that in the IVE the verbal judgments and reaches were different from each other despite being performed within the same trial. Overall, as the actual distances increased, the verbal reports increased at a higher rate than the reaches, and this was accompanied by a large intercept difference. As with the RW, this is a large effect. A simple regression predicting indicated target distance from actual target distance resulted in r² = .407 (n = 655), indicating that the difference between the reaches and the verbal reports accounted for 9.6% of the variance in the responses. In sum, in both the IVE and the RW the verbal reports were very different from the reaches. Also, the differences between the verbal reports and the reaches made within each viewing condition were much greater than the differences between the reaches made in the IVE and the RW. The verbal reports,

however, were affected by the viewing condition to a greater extent than the reaches. Thus the effect of response mode was much greater than the effect of viewing condition, with the reaches remaining more consistent between the viewing conditions than the verbal reports.

4. DISCUSSION

We investigated egocentric distance perception for targets in personal space presented in the RW and in an IVE. Within each of these viewing conditions we tested both manual reaches and verbal reports as modes for participants to indicate perceived distance; within each experimental trial the subjects viewed the target and then responded to the single distance with both a verbal report and a manual reach. In past research, verbal estimates have been found to be less accurate and more variable than reaches. Manual responses may be subject to variability associated with motor outputs and the perception of limb extension, and such variability should be absent in verbal reports. Nonetheless, verbal reports tend to be at least twice as variable as reaching or pointing, and thus verbal reports are less reliable [Foley 1977; Pagano and Bingham 1998; Pagano and Isenhower 2008; Pagano et al. 2001; Gogel and Tietz 1979]. Verbal reports are also less stable, with systematic errors changing dramatically between experimental conditions, between experimental sessions in which conditions are held constant, and between subjects within a single condition. For example, in experiments by Pagano et al. [2001] the slopes of the functions predicting verbal judgments from actual target distances increased when a 6-second delay was imposed between the target presentation and the responses, and increased further with a delay of 12 seconds; no differences were observed in the concurrent reaches, which remained stable. Mon-Williams and Tresilian [1999] found that perceived size alters verbal reports but not pointing behavior. Pagano and Bingham [1998] found that reaches became more accurate after feedback, decreasing in both systematic and variable error. Verbal judgments changed as well, but they did not become more accurate and remained twice as variable as the reaches. The feedback seems to have anchored the verbal judgments relative to the nearest target distance.
Unlike the reaches, the verbal judgments did not appear to be based on the absolute distance of the target, only on the distance relative to the closest targets experienced. Pagano and Isenhower [2008] investigated this further by manipulating the subjects' expectations of the possible target distances. The subjects were instructed that the targets would be between .25 and .90 of their maximum arm reach in one condition and between .50 and 1.00 in another, while the targets were actually between .50 and .90 in both conditions. The verbal judgments were altered by the instructions, matching the expected range, while the reaches were unaffected and remained accurate. Thus, while reaches are indicative of absolute metric distances, with each reach being based on the actual target distance for a given trial, verbal reports reflect only relative perception and are influenced by the expected range of targets or the range of distances experienced during an experiment. As a result, it has been suggested that verbal responses are inappropriate for investigating absolute or egocentric perception [Pagano and Bingham 1998; Pagano and Isenhower 2008]. Nonetheless, verbal measures remain a major methodological tool for the investigation of distance and size perception. Consistent with previous research [Pagano and Bingham 1998; Pagano et al. 2001; Pagano and Isenhower 2008], we found that verbal reports of egocentric distances differ substantially from concurrent reaches. Importantly, we found that this holds true for both the RW and an IVE. For both viewing conditions the slopes of the functions predicting reported distance from actual distance were much greater for the verbal reports than for the reaches, while the intercepts were much lower for the verbal reports than for the reaches (see Figure 5). Thus for the near targets the verbal reports were much lower than the reaches, while for the far targets the verbal reports tended to be greater than the reaches.
Overall, the difference in response modality accounted for a large proportion of the variance in the participants'

responses, 9.6% in the IVE and 22.1% in the RW. In general, the reaches tended to be more accurate and more consistent. It remains an open question whether the visual system produces a single internally represented perceived depth that is used to generate separate output functions for different response modes [Brunswik 1956; Foley 1977, 1985; Gogel 1993; Philbeck and Loomis 1997], or whether anatomically distinct visual systems underlie "cognitive" versus "motor" responses [Bridgeman et al. 1981; Milner and Goodale 1995, 2008]. Past research has obtained different results when egocentric distance judgments were made through verbal reports compared to manual pointing or reaching [Foley 1977, 1985; Gogel 1968; Pagano and Isenhower 2008]. These differences, however, are not enough to demonstrate the presence of two distinct neurological streams, because a unitary perceived depth will be subjected to different output functions that differentially scale the various responses [Foley 1977; Gogel 1993; Philbeck and Loomis 1997], and these output functions are likely to be calibrated separately [Pagano and Bingham 1998; Rieser et al. 1995]. However, if a single internally represented perceived distance were used to make all responses, then random (i.e., variable) errors in the two response measures should be correlated. If, for example, random error caused a single internally represented perceived distance to overestimate the actual distance on a given trial, then one would expect both the verbal response and the reach to be greater for that trial, because both are generated from that single perceived distance. Such errors have been found to be uncorrelated, suggesting the presence of distinct perceptual processes for separate response modes [Pagano et al. 2001; Pagano and Bingham 1998]. As in the present experiments, the verbal reports were made in arm length units, with 100 corresponding to maximum reach, yet they still remained distinct from the reaches.
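The logic of this correlated-errors test can be illustrated with a short simulation. Everything below is synthetic and hypothetical, not the authors' data or analysis: it merely shows that a single shared internal percept leaves the trial-by-trial residuals of two response measures correlated, while independent perceptual noise leaves them uncorrelated.

```python
import numpy as np

def residual_correlation(actual, resp_a, resp_b):
    """Correlation between the variable errors of two response measures:
    regress each response on actual target distance, then correlate the
    trial-by-trial residuals."""
    X = np.column_stack([np.ones_like(actual), actual])
    def residuals(y):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return y - X @ beta
    return np.corrcoef(residuals(resp_a), residuals(resp_b))[0, 1]

rng = np.random.default_rng(1)
actual = rng.uniform(0.5, 1.0, 500)            # target distances (arm units)

# Case 1: one shared internal percept drives both responses.
percept = actual + rng.normal(0.0, 0.05, actual.size)   # shared random error
verbal = 1.2 * percept - 0.3 + rng.normal(0.0, 0.02, actual.size)
reach = 0.8 * percept + rng.normal(0.0, 0.02, actual.size)
r_shared = residual_correlation(actual, verbal, reach)  # clearly positive

# Case 2: independent perceptual processes for each response mode.
verbal_i = 1.2 * actual - 0.3 + rng.normal(0.0, 0.05, actual.size)
reach_i = 0.8 * actual + rng.normal(0.0, 0.05, actual.size)
r_indep = residual_correlation(actual, verbal_i, reach_i)  # near zero
```

The uncorrelated pattern reported by Pagano and Bingham [1998] corresponds to the second case, even though the two responses pass through different output scalings (slopes and intercepts) in both cases.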
With this in mind it is important to note that perceivers attune to different sources of information depending on what types of responses they intend to make [Pagano et al. 2001; Withagen and Michaels 2005]. That is, perception-for-cognition may be a different perceptual process than perception-for-action, relying on different types of optical information drawn from the environment and relying on a separate calibration process. Thus the perceptual system must generate separate perceptions to support the varying responses, whether or not it does so through anatomically distinct neurological streams. An artificial environment must support each of the likely responses that will be executed within it, and it must do so by supporting each of the “perceptions” underlying the responses and each of the calibration processes. The results of the present experiment also indicate that egocentric distance perception differed between the RW and the IVE. This difference, however, was more pronounced in the verbal reports than in the manual reaches, underscoring the instability of verbal reports. For the reaches, the participants tended to underestimate the target distances, with this underestimation increasing as target distance increased. The slopes of the functions predicting the reaches from the actual target distances were .84 and .77 for IVE and RW, respectively. For the verbal reports the difference between the two viewing conditions was much more pronounced, with the slopes of the functions being 1.25 and 1.03 for IVE and RW, respectively. While the verbal reports underestimated all of the target distances in RW and in the IVE, these slopes were such that the underestimation became progressively smaller as the targets became farther away. In sum, compared to RW, viewing the IVE had a small effect on manual reaches to egocentric distances in personal space while IVE viewing had a larger effect on concurrent verbal judgments. 
Also, the differences between RW and IVE viewing were much smaller than the differences between the two response modes made within each of the viewing conditions. Compared to related work investigating egocentric depth perception in action space, our work is one of the first studies investigating egocentric distance estimation in IVEs in the near field. Most work conducted in action space utilizes blind walking and imagined timed-walking estimates. A novel aspect of our study is the use of both verbal and physical responses in an IVE. As mentioned earlier, we found that in personal space, participants in the IVE estimated distances significantly differently as compared to

participants in the RW. Although the difference for reach judgments is small, our results contrast with the significantly larger distance compression perceived by participants in IVEs as compared to the RW found in action spaces by other researchers [Grechkin et al. 2010; Ziemer et al. 2009; Klein et al. 2009; Richardson and Waller 2007; Interrante et al. 2006]. We believe one possible reason for this difference between our result in personal space and research findings in action space could be the dominance of stereoscopic visual cues in depth perception in the near field. The difference could further be explained by the speed of execution of our task versus blind-walking and imagined timed-walking, which take time to accomplish. It could be argued that, because of this delay, such measures do not necessarily capture a visual perceptual response, but rather a cognitive response that continually calibrates during the action. One might then ask why our results do not match those of Sahm et al. [2005], where beanbag throwing, an immediate and direct task, was utilized. Throwing, however, could possibly be influenced by a cognitive response as well: while blind-walking potentially involves recalibrating one's perceptual judgment during the action, calibration in throwing could occur from simply imagining the trajectory of the beanbag being thrown through the air [Sahm et al. 2005]. This would also explain the similar compression found between blind-walking and beanbag throwing [Sahm et al. 2005].

5. CONCLUSION AND FUTURE WORK

In this research, we compared near-space physical reaches and verbal responses of participants to targets in IVE and RW conditions, using an apparatus similar to that used in our previous work [Bingham and Pagano 1998; Pagano and Bingham 1998; Pagano et al. 2001; Pagano and Isenhower 2008]. Participants' physical reaches and verbal responses in both the RW and IVE conditions differed significantly from the actual distances. We found distance underestimation in both verbal and physical reach responses to targets in both the IVE and RW conditions, with the underestimation in reaches increasing with target distance. Furthermore, we found that verbal reports of distance judgments were less accurate, as evidenced in the regression analysis by a steep slope in the IVE condition and a low intercept in both conditions, even though the verbal reports were made simultaneously with physical reaches. The impact of our current and ongoing work for VR application developers and consumers could be substantial. For instance, designers of complex systems such as surgery simulators could potentially enhance user performance by automatically accounting for the observed systematic underestimation in near-space distance perception. A limitation of our study is that we have only investigated verbal reports and physical reach responses to near-field distance judgments in IVEs that employ HMDs; it is unclear what effect other display methods, such as large-screen immersive displays (LSIDs), have on perceptual judgments of near-field distances via physical and verbal responses. Future work will examine the effect of feedback on distance estimates. In the experiments by Bingham and Pagano [1998] and Pagano and Bingham [1998], the participants reached to place a handheld stylus into the target and thus received feedback about the accuracy of their reaches.
It was found that this feedback resulted in calibration that improved the accuracy of the reaches, while having a more limited effect on the accuracy of the verbal reports; the calibration, for example, reduced error that was due to a restricted field of view (FOV). We will test whether calibration has a similar effect in the IVE and RW environments employed in the present experiment. We will also systematically vary the size of the FOV in both viewing conditions to determine whether the perturbation caused by a restricted FOV has the same effect on distance estimates in personal space in both the IVE and the RW. We may also investigate the importance of stereoscopic vision and motion parallax in near-field distance estimation.

We found that the differences between RW and IVE viewing were much smaller than the differences between reaches and verbal reports made within each of the viewing conditions. This finding is surprising, given that the two responses were executed together within each trial, while the two viewing conditions were run as separate blocks of trials on separate days. The verbal judgments were also made in arm length units, which should have fostered their similarity with the reaches. It seems that differences within each individual can be much greater than differences between environments. A great deal of research is devoted to understanding the effect that different viewing environments have on perception and action; in reality, differences between response modes may have a larger effect on "perception" than differences between environments. Perception-for-cognition is different from perception-for-action, with the perceptual systems likely attuning to different information when generating separate responses. What this means for designers of VR, IVE, AR, and other artificial environments is that the purpose for which perception will take place within a given environment must be understood in order to ensure that the correct information is being rendered, the environments must be tested using the appropriate response measure, and users may need to be trained to both attune and calibrate to the appropriate information. The present experiment demonstrates that the results obtained by testing will vary dramatically depending on the response measure employed. Future work should focus on the processes of attunement and calibration in artificial environments, while recognizing that these are two separate processes [Withagen and Michaels 2005].

ACKNOWLEDGMENTS

We wish to thank J. Edward Swan for his insights and discussions regarding the experiment design, and the participants of our experiment for their time. We also acknowledge the contributions of Julia Nelson-Abbott and Adina-Raluca Stoica for assistance with 3D modeling and simulation.

REFERENCES

BROOKS, F. 1999. What's real about virtual reality? IEEE Comput. Graph. Appl. 16–27.
BINGHAM, G. AND PAGANO, C. 1998. The necessity of a perception–action approach to definite distance perception: Monocular distance perception to guide reaching. J. Exper. Psychol. Hum. Percept. Perform. 24, 1, 145–167.
BINGHAM, G. AND STASSEN, M. 1994. Monocular egocentric distance information generated by head movement. Ecol. Psychol. 6, 3, 219–238.
BRIDGEMAN, B., KIRCH, M., AND SPERLING, A. 1981. Segregation of cognitive and motor aspects of visual function using induced motion. Attent. Percept. Psychophys. 29, 4, 336–342.
BRUNSWIK, E. 1956. Perception and the Representative Design of Psychological Experiments, 2nd Ed. University of California Press, Berkeley.
CUTTING, J. AND VISHTON, P. 1995. Perceiving layout and knowing distances: The integration, relative potency, and contextual use of different information about depth. In Perception of Space and Motion 5, 69–117.
ELLIS, S. AND MENGES, B. 1997. Judgments of the distance to nearby virtual objects: Interaction of viewing conditions and accommodative demand. Presence: Teleoper. Virt. Environ. 6, 4, 452.
ELLIS, S. AND MENGES, B. 1998. Localization of virtual objects in the near visual field. Hum. Factors 40, 3, 415–416.
FERWERDA, J. 2003. Three varieties of realism in computer graphics. In Proceedings of the SPIE Human Vision and Electronic Imaging Conference. 290–297.
FOLEY, J. M. 1977. Effect of distance information and range on two indices of visually perceived distance. Perception 6, 4, 449–460.
FOLEY, J. M. 1985. Binocular distance perception: Egocentric distance tasks. J. Exper. Psych. Hum. Percept. Perform. 11, 2, 133–149.
GOGEL, W. C. 1968. The measurement of perceived size and distance. Contrib. Sensory Physiol. 3, 125–148.
GOGEL, W. C. 1993. The analysis of perceived space. Adv. Psych. 99, 113–182.
GOGEL, W. C. AND TIETZ, J. D. 1979. A comparison of oculomotor and motion parallax cues of egocentric distance. Vis. Res. 19, 10, 1161–1170.
GOZA, S., AMBROSE, R., DIFTLER, M., AND SPAIN, I. 2004. Telepresence control of the NASA/DARPA robonaut on a mobility platform. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 623–629.

GRECHKIN, T., NGUYEN, T., PLUMERT, J., CREMER, J., AND KEARNEY, J. 2010. How does presentation method and measurement protocol affect distance estimation in real and virtual environments? ACM Trans. Appl. Percept. 7, 4, 1–18. Haller, M. 2004. Photorealism or/and non-photorealism in augmented reality. In Proceedings of the ACM SIGGRAPH International Conference on Virtual Reality Continuum and its Applications in Industry (VRCAI’04). 189–196. HILL, R. W., GRATCH, J., MARSELLA, S., RICKEL, J., SWARTOUT, W., AND TRAUM, D. 2003. Virtual humans in mission rehearsal exercises. Kunstliche ¨ Intelligenz 17, 3, 5–10. HINE, B. P., STOKER, C., SIMS, M., RASMUSSEN, D., HONTALAS, P., FONG, T., STEELE, J., BARCH, D., ANDERSEN, D., MILES, E., AND NYGREN, E. 1994. The application of telepresence and virtual reality to subsea exploration, In Proceedings of the 2nd Workshop on: Mobile Robots for Subsea Environments, International Advanced Robotics Program (IARP). 117–126. HODGES, L., ANDERSON, P., BURDEA, G., HOFFMAN, H., AND ROTHBAUM, B. 2001. Treating psychological and physical disorders with VR. IEEE Comput. Graph. Appl. 21, 6, 25–33. INTERRANTE, V., RIES, B., AND ANDERSON, L. 2006. Distance perception in immersive virtual environments, revisited. In Proceedings of IEEE Virtual Reality Conference. 3–10. JOHNSEN, K., DICKERSON, R., RAIJ, R., HARRISON, A., LOK, B., STEVENS, A., AND LIND, D. 2006. Evolving an immersive medical communication skills trainer. Presence: Teleoper. Virtual Environ. 15, 1, 3–10. KESSLER, G. D., BOWMAN, D., AND HODGES, L. 2000. The simple virtual environment library, an extensible framework for building VE applications, Presence: Teleoper. Virtual Environ. 9, 2, 187–208. KLEIN, E., SWAN, J., SCHMIDT, G., LIVINGSTON, M., AND STAADT, O. 2009. Measurement protocols for medium-field distance perception in large-screen immersive displays. In Proceedings of the IEEE Virtual Reality Conference. 107–113. 
KUNZ, B., WOUTERS, L., SMITH, D., THOMPSON, W., AND CREEM-REGEHR, S. 2009. Revisiting the effect of quality of graphics on distance judgments in virtual environments: A comparison of verbal reports and blind walking. Attent. Percept. Psychophys. 71, 6, 1284–1293.
LOOMIS, J. AND KNAPP, J. 2003. Visual perception of egocentric distance in real and virtual environments. In Virtual and Adaptive Environments: Applications, Implications, and Human Performance Issues, L. J. Hettinger and J. W. Haas Eds., Lawrence Erlbaum Associates, Mahwah, NJ, 21–46.
MESSING, R. AND DURGIN, F. 2005. Distance perception and the visual horizon in head-mounted displays. ACM Trans. Appl. Percept. 2, 3, 234–250.
MILNER, A. D. AND GOODALE, M. A. 1995. The Visual Brain in Action. Oxford University Press.
MILNER, A. D. AND GOODALE, M. A. 2008. Two visual systems reviewed. Neuropsychologia 46, 3, 774–785.
MON-WILLIAMS, M. AND TRESILIAN, J. R. 1999. The size–distance paradox is a cognitive phenomenon. Exper. Brain Res. 126, 4, 578–582.
PAGANO, C. AND BINGHAM, G. 1998. Comparing measures of monocular distance perception: Verbal and reaching errors are not correlated. J. Exper. Psych. Hum. Percept. Perform. 24, 4, 1037.
PAGANO, C. AND ISENHOWER, R. 2008. Expectation affects verbal judgments but not reaches to visually perceived egocentric distances. Psychon. Bull. Rev. 15, 2, 437.
PAGANO, C., GRUTZMACHER, R., AND JENKINS, J. 2001. Comparing verbal and reaching responses to visually perceived egocentric distances. Ecol. Psychol. 13, 3, 197–226.
PETERS, T. M., LINTE, C. A., MOORE, J., BAINBRIDGE, D., JONES, D. L., AND GUIRAUDON, G. M. 2008. Towards a medical virtual reality environment for minimally invasive cardiac surgery. Medical Imag. Augment. Reality 5128, 1–11.
PHILBECK, J. W. AND LOOMIS, J. M. 1997. Comparison of two indicators of perceived egocentric distance under full-cue and reduced-cue conditions. J. Exper. Psych. Hum. Percept. Perform. 23, 1, 72–85.
PHILLIPS, L., RIES, B., INTERRANTE, V., KAEDING, M., AND ANDERSON, L. 2009. Distance perception in NPR immersive virtual environments, revisited. In Proceedings of the 6th Symposium on Applied Perception in Graphics and Visualization (APGV'09). 11–14.
RIESER, J. J., PICK, H. L., ASHMEAD, D. H., AND GARING, A. E. 1995. Calibration of human locomotion and models of perceptual-motor organization. J. Exper. Psych. Hum. Percept. Perform. 21, 3, 480–497.
SAHM, C., CREEM-REGEHR, S., THOMPSON, W., AND WILLEMSEN, P. 2005. Throwing versus walking as indicators of distance perception in similar real and virtual environments. ACM Trans. Appl. Percept. 2, 1, 35–45.
SEYMOUR, N. 2008. VR to OR: A review of the evidence that virtual reality simulation improves operating room performance. World J. Surgery 32, 2, 182–188.
SINGH, G., SWAN, J., JONES, A., AND ELLIS, S. 2010. Depth judgment measures and occluding surfaces in near-field augmented reality. In Proceedings of the 7th Symposium on Applied Perception in Graphics and Visualization (APGV'10). ACM, 149–156.
SUTHERLAND, I. 1965. The ultimate display. In Proceedings of the IFIP Congress. Vol. 2, 506–508.

THOMPSON, W., WILLEMSEN, P., GOOCH, A., CREEM-REGEHR, S., LOOMIS, J., AND BEALL, A. 2004. Does the quality of the computer graphics matter when judging distances in visually immersive environments? Presence: Teleoper. Virtual Environ. 13, 5, 560–571.
WANG, R. F. 2004. Action, verbal response and spatial reasoning. Cognition 94, 2, 185–192.
WARREN, W. H. 1995. Constructing an econiche. In Global Perspectives on the Ecology of Human-Machine Systems, J. Flach, P. Hancock, J. Caird, and K. Vicente Eds., Lawrence Erlbaum Associates, Mahwah, NJ, 210–237.
WILLEMSEN, P., COLTON, M., CREEM-REGEHR, S., AND THOMPSON, W. 2009. The effects of head-mounted display mechanical properties and field of view on distance judgments in virtual environments. ACM Trans. Appl. Percept. 6, 2, 1–14.
WITHAGEN, R. AND MICHAELS, C. F. 2005. The role of feedback information for calibration and attunement in perceiving length by dynamic touch. J. Exper. Psych. Hum. Percept. Perform. 31, 6, 1379–1390.
WITMER, B. AND KLINE, P. 1998. Judging perceived and traversed distance in virtual environments. Presence: Teleoper. Virtual Environ. 7, 2, 144–167.

Received April 2011; accepted July 2011
