
EUROGRAPHICS '99 / P. Brunet and R. Scopigno (Guest Editors)

Volume 18 (1999), Number 3

The Hybrid World of Virtual Environments

Shamus Smith and David Duke
HCI Group, Department of Computer Science, University of York, Heslington, YO10 5DD York, United Kingdom
{shamus, duke}@cs.york.ac.uk

Mieke Massink
CNR - Istituto CNUCE, Via S. Maria 36, I56126 Pisa, Italy
[email protected]

Abstract

Much of the work concerned with virtual environments has addressed the development of new rendering technologies or interaction techniques. As the technology matures and becomes widely available, there is, however, a need to better understand how this technology can be accommodated in software engineering practice. In particular, virtual environments have the potential to support interaction techniques that are substantially more complex than those found, for example, in graphical user interfaces. The work reported in this paper represents a first step towards bridging the gap between the requirements for what a VR-based system should support, and what the technology available to the implementors of the system can provide. We argue that virtual environments are fundamentally hybrid systems, and show how techniques for modelling hybrid systems can be used to understand virtual interaction and potentially support the task of implementation.

Keywords: Virtual environments, hybrid systems, interaction techniques, VE design, HyNet.

1. Introduction

New technologies [1, 12, 21] have been a major factor in the development of virtual environments (VEs). This has led to many innovative systems and novel interaction techniques. These are typically developed as research prototypes or bespoke systems, rather than following a more structured software development model or utilising a standardised toolkit. This situation is changing, as the technology matures and improvements in computer performance make it feasible to use a generic VR toolkit, at least for problems of modest geometric and interactive complexity.

© The Eurographics Association and Blackwell Publishers 1999.

Published by Blackwell Publishers, 108 Cowley Road, Oxford OX4 1JF, UK and 350 Main Street, Malden, MA 02148, USA.

In many cases, new technology is coupled with high level descriptions of environments and interaction techniques to produce iterative, prototype driven developments. This ad hoc prototyping may provide examples of the application of technology but does not necessarily result in usable end products. Jacob [8] notes that this ad hoc approach can also make these applications' interfaces difficult to develop, share and reuse. There is a need to better understand how software engineering practice can be adapted to address the challenges presented by virtual environment development.

In particular, virtual environments have the potential to support interaction techniques that are substantially more complex than those found, for example, in graphical user interfaces. Of course, the use of stereoscopic 3D, visually extensive environments and the issues of user embodiment all contribute to the complexity of VE interaction. However, here we are concerned mainly with the flow of interaction between user and application. The advent of direct manipulation gave rise to techniques such as state-transition systems to model dialogue. In this paper, we argue that VR calls for a more powerful model of interaction based on hybrid models.

S. P. Smith, D. Duke and M. Massink / The Hybrid World of Virtual Environments

Although the development of a software life cycle model for VE applications is outside the scope of the current work, one of the aims of the INQUISITIVE project is to investigate the process of VE development and how this can be supported. The INQUISITIVE project is a three year research effort between groups at the University of York and the Rutherford Appleton Laboratory (RAL). Its aim is to understand how to design and implement better user interfaces for VR systems and to develop methods and principles that can be used at an early stage in the design life cycle and system evaluation. The concern is not so much with the physical devices such as headsets and data gloves that have come to characterise interaction within VR, but rather with addressing the highly interactive and dynamic nature of the interaction that is found within virtual environments. One aim of the project is to prototype a toolkit for defining interaction within virtual environments based on a set of guidelines for building VR interfaces.

Virtual environments are made up of many different components, each of which requires definition if an accurate description of the environment is to be produced. Typically, descriptions of virtual environments are informally or incompletely defined. VE descriptions are either entirely text based natural language descriptions or augmented with some abstract high-level diagrams. However, there are low level descriptions of some components of VE systems, e.g. geometric definitions of landscapes and avatar modelling. Although informal descriptions may be appropriate for general descriptions of environments, reimplementation of systems would require more detailed specifications. Also, if any automated analysis is required, either for design or evaluation, then more formal descriptions would be desirable.

The very nature of virtual environments makes them difficult to describe and model. Typically, they are a collection of static and dynamic components [8, 23], are extensively visually based and have a nonlinear process and control flow. Also, there are the considerations of the separation of what is system based and what is user based, and of how the timing of operations in the environment is to be handled.

Continuous interaction techniques, and in particular the cognitive aspects of their use, are studied in more detail in the European TMR project TACIT. Continuous interaction techniques are techniques in which the interaction, or part thereof, evolves smoothly in time, as for example in audio and video based techniques. Research on a proper graphical formalism that can assist interface designers during many phases of the design process is one of the topics of TACIT. Such a formalism should ideally be suitable to express system oriented aspects as well as user oriented and cognitive aspects of the interface. The formalisms and specifications discussed in this paper show the use of two notations developed within the area of hybrid systems theory.

In Section 2 we will begin by considering interaction in virtual environments. Next, in Sections 3 and 4, we will investigate the importance of modelling interaction and how this can be done using a hybrid systems approach. The first of two example interaction techniques will then be introduced along with a description of a VE modelling technique based on a hybrid systems approach. Issues involved in the definition of user and system based models will be considered and an approach to a more detailed VE description will be outlined.

2. Interaction in Virtual Environments

Research into virtual reality and virtual environments has been predominantly led by the development of new technologies. Although many of these technologies have matured, there are still many issues about virtual environments that remain unanswered. One of these issues is how the discrete, event-based elements and the continuous visual components of interaction can be expressed.

Virtual environments provide the user with an interaction environment which is fundamentally different from traditional computer systems. In VEs, the user is an active participant within the system. Traditional computer systems reinforce the man/machine barrier where the application is within the machine and the users can only interact with it from the outside. VEs provide a new level of interactivity as the separation of the user and the system becomes less clear. In immersive systems, the user is surrounded by the environment and can interact directly with components within the environment. This type of interaction requires not only new technologies, in the form of novel input devices (for example spaceballs, 3D mice, data gloves and vision tracking), but also descriptions of interaction techniques (for example Head-Butt Zoom [16], Go-Go interaction [19], Head Crusher Select [18] and Two Handed Flying [16]) which are to be mapped onto these devices.

A wide range of interaction techniques has been implemented in real systems or developed as research models. They range through desktop based systems [6], prop-based environments [5, 7], semi-immersive [13] and fully immersive virtual environments [3]. Many of the developments in interaction reported in the HCI literature have been technology driven. There is a vast amount of material in the literature on how particular hardware configurations have been developed and the applications which have been built to take advantage of a particular technology. However, many of these systems are tested by ad hoc user studies and are not necessarily based on any formal theory. This is undesirable as any usability analysis can only be based on the test cases that were presented to subjects.

3. Why Model Interaction?

There are many factors that make precise interaction difficult in the virtual world. Mine [14] notes that many virtual worlds lack haptic feedback (something we take for granted in the real world). While haptic rendering is becoming practical for some tasks, it is still far from practical for general tasks in virtual environments, such as navigation. Mine also observes that current alphanumeric input techniques (what we use for precise interaction in the computer world) for virtual worlds are ineffective. He suggests that we must learn how to interact with information and controls distributed about the user instead of focused on a terminal in front of him/her. If natural forms of interaction can be identified, and possibly extended in the VE, then more usable VE interfaces can be constructed [14].

One important distinction that has been noted is that of the differing requirements of the users and designers of virtual environments. Modelling interaction is important from the perspective of both the users and the designers of VEs. The users require interaction techniques which allow them to complete interaction tasks in a particular application, and designers wish to build systems that make the required interaction possible. In the context of research tools, designers and users are closely bound. Typically, system designers are either the main users or closely associated with or working in the same domain as the target users. However, as VR technologies reach a larger audience, new ways of capturing user requirements explicitly for designers will be needed.

Several problems have been identified when trying to describe interaction techniques. Firstly, the informal description of interaction techniques means that there are no obvious ways to evaluate whether two interaction techniques are the same. This can lead to every new design "re-inventing" the same basic interaction techniques. Secondly, if an informal description of an interaction technique was given to several different designers, how similar would the resulting systems be? Typically, informal descriptions leave room for ambiguity and require extensive customisation before they are in an implementable form. If there is no access to the original design team, then different implementors may make their own arbitrary design decisions. This would be highly undesirable if consistency and standardisation are required between applications. Also, vague descriptions in natural language do not lend themselves to rigorous analysis, and the comparison of different techniques may be impossible. With no way to judge and compare techniques, how is the designer of a VE to make an informed decision on what interaction techniques are appropriate for any given task or application? If the descriptions are left informal, there is no guarantee that the final implemented system will be close to the original design.

At an initial stage of design it would be desirable to have a useful way of "sketching" the flow of an interaction at a high level of abstraction, for requirements and specification. This would provide a basis for pre-implementation evaluation of the environment and could be developed into a more detailed model for mapping onto an implementation model.

4. Why use hybrid systems?

The description of virtual environments is a nontrivial task. VEs are dynamic environments and, due to their continuous and highly visual nature, defining salient and useful aspects of them is extremely difficult. By continuous, we mean the user view of the virtual environment. The system's behaviour may be able to be broken down into discrete modes of interaction but the user is engaged with a continuous view of the environment.

From a designer's or implementor's point of view, there are several aspects of VEs which require explicit definition: namely, the virtual environment, the user interface, the interaction processes, the physical interaction devices and the user's cognitive model. These elements of VEs make up a complex model of discrete and continuous processes. Also, each individual element can be thought of as a mixed model in its own right. Thus, traditional discrete modelling techniques are not appropriate for VEs. Trying to force the models into pure continuous descriptions is also undesirable. What is proposed here is that a hybrid of different modelling components be used to mirror the hybrid nature of VEs.

The modelling and simulation of hybrid systems, systems consisting of a mixture of discrete and continuous components, is a research area of growing interest. This is due to the fact that most systems in real world applications are neither purely discrete nor purely continuous, and often both parts influence each other [24]. Typically, hybrid systems are interactive systems of continuous devices and digital control programs [9]. The current work follows present trends in the hybrid systems literature [4, 24, 26].

We have developed a semiformal notation to aid in the description of virtual environments [23]. This notation has been developed using a hybrid systems approach. The consideration of virtual environments as hybrid systems seems to be a natural step towards a more detailed VE specification [23]. However, the current work does not intend to define and champion yet another notation for modelling, but to select basic features common to notations which seem particularly relevant to virtual environment description. This provides a structure for describing interaction.

Research into this unknown territory requires careful consideration. We feel that beginning research at an abstract modelling level is an appropriate starting point as it allows us to initially consider VE modelling at an informal level of rigour. In the context of the current project, we wish to be able to describe what is required, but at a level that is independent of the VR toolkit. This will hopefully lead to the development of portable modelling techniques which can then be specified at higher levels of rigour if required.

For the remainder of this paper, we will investigate the use of a hybrid systems approach to two example interaction techniques. The first technique is from an immersive VE, while the second technique is based in a desktop VR and is described both in a graphical flow notation and in HyNet, a recently developed hybrid extension of High-level Petri Nets [11, 22].


5. From flying hands to flying cameras

Navigation, object selection and object manipulation are three fundamental tasks within virtual environments. Within the VE literature there are many novel interaction techniques that have been developed to support these standard VE features. Many of these techniques have been based around new technology in an attempt to capture the imagination of the users and potential users. The use of head mounted displays (HMDs) and glove based input devices has become synonymous with many descriptions of VEs.

Flying is a navigation technique which has been popular in many virtual environment implementations because it avoids the need to take surface terrain into account. Typically, this technique is based around a flight vector calculated by the angle between the user's hand and head positions. However, there are several disadvantages to this technique. Like other gesture based interaction techniques it can cause arm fatigue with continuous use. User disorientation can also come through misunderstanding the relationship between hand orientation and flying direction [14].

5.1. Flying hands

Two Handed Flying (THF) [16] is a specialized type of flying which exploits proprioception, the person's sense of the position and orientation of their body and limbs. Direction of flight is defined by the vector between the user's two hands and the flight speed is specified by the distance between the user's hands. Flight is stopped by moving the hands into a deadzone, a minimum hand separation.

Two Handed Flying was developed by Mine, Brooks and Sequin [16] to exploit user proprioception in virtual environment interaction at UNC-Chapel Hill using the Chapel Hill Immersive Modeling Program (CHIMP). CHIMP is a virtual environment application for the preliminary phases of architectural design. It includes both one and two-handed interaction techniques, minimizes unnecessary user interaction and takes advantage of the head-tracking and immersion provided by the virtual environment system [15]. Two Handed Flying is one of several techniques that were used to show how intuitive gestures could be augmented to take advantage of proprioception. The following is Mine, Brooks and Sequin's description of Two Handed Flying.

"We have found two-handed flying an effective technique for controlled locomotion. The direction of flight is specified by the vector between the user's two hands, and the speed is proportional to the user's hand separation. A dead zone (some minimum hand separation, e.g. 0.1 metres) enables users to stop their current motion quickly by bring their hands together (a quick and easy gesture). Two-handed flying exploits proprioception for judging flying direction and speed" [16].

Although this description provides a high level view of the interaction technique, it does not provide the level of detail which is required to develop an implementable specification or a model that is amenable to rigorous usability analysis.

A graphical notation has been proposed by Smith and Duke [23] which provides a small set of powerful operators which can be used to define a concise and expressive representation of interaction in VEs. This notation is based on the event/process structure that is used in Petri Nets [17]. However, the Petri Net notation has been extended to provide for the definition of discrete and continuous components. This is similar to the work carried out in the area of hybrid systems [4, 24, 25, 26]. This is important for VE descriptions as one of the features of VEs is that they are comprised of discrete and continuous components [8, 23]. The user's position/embodiment in the environment, the updating of the viewpoint and input from VE input devices are all examples of continuous flows of information which are required to be modelled in a VE specification. These flows can be triggered by discrete events and control arcs. Hence, there is a clear need for the description of the continuous and discrete components.


[Figure 1 diagram. Recoverable labels: hand positions, enable, start, d min, update position, speed, exit, disable; numbered markers 1 to 6.]

Figure 1: Two Handed Flying hybrid model

Figure 1 shows a representation of the Two Handed Flying technique in the graphical flow notation. A more detailed description of this notation has been discussed elsewhere [23] but will be illustrated here within this example. In this model there are three external plugs. These are the continuous flow from the hand positions and the boolean control arcs from the technique's enable and disable. A control arc signals a control dependency in the model.

Initially, the user triggers the interaction by some unspecified enable mechanism (1) which is part of the application or the environment in which THF is used. This enables the start transition. This transition also has an inhibitor arc so that the interaction cannot be restarted while the user is currently flying. The start transition passes a token to the not flying state. The user will remain in this state until their hands are moved outside the THF deadzone. This condition is detected by a sensor on the hand position flow (2). The sensor spans the flow and acts as a function from the flow content to a boolean. Once the user's hands are moved outside the THF deadzone, the active token is passed to the flying state (3). In this state a flow control is activated. The flow control acts as a "valve" on the continuous loop for transforming the user's current position and speed.

The continuous loop in this example is comprised of three components: the flow control (3), a transformer (4) and a store. A transformer applies a transformation to a flow to yield a modified content. In Figure 1 the update position, speed transformer takes the current values from the continuous flow and updates them with the current value on the hand positions flow (4). This is then passed to the store. A store is a source and repository for information that is consumed or produced by a continuous flow. If the user's hands are moved back into the THF deadzone, a sensor on the user's hand positions would trigger a transition (5) back to the stationary position. Finally, while in either state, if the user wishes to exit the technique, a disable control arc can be triggered (6).

The diagram highlights the modes/states of the interaction and the events that cause the transitions between modes. Also, there is a clear separation of the discrete (the control processes in the middle of the diagram) and the continuous (the continuous loop on the right of the diagram) processes. One of the aims of the current project is to produce models of interaction techniques which can be reused in alternative VE descriptions as part of a VE development toolkit. In the next section, an interaction technique for a PC based desktop VE for a virtual television camera is defined and compared to the immersive Two Handed Flying description.
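To make the separation of discrete control and continuous update concrete, the walkthrough of Figure 1 can be approximated in a few lines of Python. This is only an illustrative sketch, not the notation of [23] or the CHIMP implementation: the class name, the mode names, the fixed-step update and the constant `SPEED_GAIN` are assumptions introduced here, as is the choice of the left-to-right hand vector as the flight direction. Only the 0.1 metre dead zone comes from the quoted description.

```python
import math
from dataclasses import dataclass

DEAD_ZONE = 0.1   # metres; the example value in the quoted description
SPEED_GAIN = 2.0  # hypothetical constant: (m/s) per metre of hand separation


@dataclass
class TwoHandedFlying:
    position: tuple = (0.0, 0.0, 0.0)  # plays the role of the "store"
    mode: str = "disabled"             # disabled | not_flying | flying

    def enable(self) -> None:
        # Start transition; ignored while already active (inhibitor arc).
        if self.mode == "disabled":
            self.mode = "not_flying"

    def disable(self) -> None:
        # The exit can be triggered from either active state.
        self.mode = "disabled"

    def step(self, left, right, dt: float) -> None:
        """Advance by dt seconds given one sample of both hand positions."""
        if self.mode == "disabled":
            return
        # Assumed direction convention: from the left hand to the right hand.
        offset = tuple(r - l for l, r in zip(left, right))
        separation = math.sqrt(sum(c * c for c in offset))
        # Sensor: a function from the continuous flow content to a boolean.
        outside_dead_zone = separation > DEAD_ZONE
        self.mode = "flying" if outside_dead_zone else "not_flying"
        if self.mode == "flying":
            # Transformer: velocity = gain * offset, so the direction follows
            # the hand-to-hand vector and the speed grows with separation.
            self.position = tuple(
                p + SPEED_GAIN * c * dt for p, c in zip(self.position, offset)
            )
```

In this sketch the `mode` field carries the token of the discrete net, while `step` plays the part of the valve, transformer and store of the continuous loop; the position only changes while the mode is `flying`.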

5.2. Flying cameras

The simulation of a television studio is an environment which has the potential to take advantage of many of the features of VR. PC based desktop VR systems are currently commercially available [2] and provide a varied degree of interaction with their environment. Along with interaction with the props in the environment there needs to be some way of navigating the current view of the environment. In this domain, this would involve the implementation of a virtual camera. The user should be able to position the camera to experiment with different views and explore alternative camera angles.

The use of an interaction technique like Two Handed Flying to implement the virtual camera may at first seem infeasible on a desktop platform as this particular technique was developed with proprioception as a fundamental property. Desktop VR systems are typically limited to an external user embodiment and a mouse and keyboard for user input. The mouse can be mapped onto an onscreen cursor to provide one "virtual hand" position. To provide the second required hand position, a virtual object is required. This must be enabled and disabled. This toggling is typically done using mouse button combinations in PC based desktop VR. The following is a high level description of a virtual flying camera for a desktop VR simulation of a television studio [2].

The user initiates the flying camera mode by pressing the middle mouse button. When the mode is activated, a square (1cm by 1cm) appears at the current mouse pointer position. While the pointer remains within the square, the view remains stationary. Once the pointer is moved outside the square, the user's direction of movement and speed are directly proportional to the angle and distance, respectively, between the current pointer and the center of the square. The flying mode is deactivated by a second press on the middle mouse button. Alternative movement is obtained by use of the other mouse buttons. For example, with no buttons down, up/down/left/right with the mouse is mapped to forward/back/pan left/pan right, while with the left button down it is mapped to up/down/crab left/crab right.

Figure 2 shows this description as a hybrid model. There are two external plugs in Figure 2, the mouse and the square events. They are repeated on the diagram for clarity. The mouse flow provides the current position of the mouse and any button events. The square events flow provides any events which are associated with the flying square when it is onscreen. In this example, the continuous flow loop for updating the user's position is on the top of the diagram while the mode control within the technique is on the lower portion of Figure 2.

[Figure 2 diagram. Recoverable labels: mouse, square events, update position, speed, orientation, position, speed, orientation, start, middle button down, stationary, moving, pointer in square, pointer outside square, exit.]

Figure 2: Flying camera hybrid model

5.3. Discussion

The two presented examples are from two distinct domains and are implemented on completely different platforms, but are they that different? By modelling them in the new notation it is possible to identify several common features that are shared between them. By focusing on the separation of the discrete and continuous components of both diagrams it is possible to see that each has a continuous updating of the user's position which is managed by a discrete control structure. Each of these control structures is dependent firstly on an initial enablement and secondly on a sensor to switch the mode of the interaction between the distinct states. Either the user is moving or the user is stationary. When the user is moving, this is the cue for updating the user's continuous position in the environment. This updating is done by a transformer based on the current position in space and the continuous flow of information from the user's current position (either the user's hand positions in THF or the mouse pointer to square position in the flying camera example).

Using a more descriptive way to represent VE definitions allows questions to be asked about the design, and possible deficiencies in the design may be identified and fixed at an early stage. One of the main benefits is that the requirements and resources that are needed from and to the components of the VE are mapped out. For example, interaction triggers and mode enablement/disablement can be clearly defined on the diagram. The triggers for events (transitions) can be described for the discrete components of the diagram while the transformer element allows the nature of continuous flows to be specified.

Although the hybrid approach we have taken provides a clearer view of the interaction, it is only a step towards a description which can be mapped directly onto an implementation. Our approach provides a user specification for an interaction. What is also required is a system model of the same processes. For example, in the flying camera example, the user has two main modes. Either they are moving the camera (flying) or they are stationary. When they are moving, the view is being transformed by both the current position of the mouse pointer and the status of the mouse buttons. In Figure 2 this is represented by the update position, speed and orientation transformer. However, in an actual system, the status of the mouse buttons represents discrete mode changes within the moving mode. Although these modes are transparent to the user, the system description will require this specification.

The hybrid model can be seen as a user specification of the interaction. It describes what the user is doing in the interaction and how their input affects the interaction. This is a design sketch or storyboard type model which provides a clear view of the user's expectations of the interaction. What is also needed is a system based approach which can be developed at a more detailed level and mapped directly onto an implementable model.
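The gap between the user view (stationary or moving) and the system view (button-dependent sub-modes inside moving) can be illustrated with a small Python sketch of the flying camera. It is a hedged illustration only: the class and method names, the coordinate convention (pointer positions in centimetres, y increasing upwards) and the use of raw pointer offsets as camera rates are assumptions made here, and the axis mapping covers only the two button states given in the informal description.

```python
from dataclasses import dataclass

HALF_SIDE = 0.5  # cm; half the side of the 1cm by 1cm square

# System-level sub-modes of "moving": which camera axes the horizontal and
# vertical pointer offsets drive, keyed by whether the left button is held.
AXIS_MAP = {
    False: ("pan", "forward"),  # no buttons: pan left/right, forward/back
    True: ("crab", "up"),       # left button: crab left/right, up/down
}


@dataclass
class FlyingCamera:
    mode: str = "off"           # off | stationary | moving
    center: tuple = (0.0, 0.0)  # where the square was dropped

    def middle_button(self, pointer) -> None:
        # First press starts the technique at the pointer; second press exits.
        if self.mode == "off":
            self.mode, self.center = "stationary", pointer
        else:
            self.mode = "off"

    def update(self, pointer, left_down: bool = False) -> dict:
        """Return the camera rates implied by one pointer sample."""
        if self.mode == "off":
            return {}
        dx = pointer[0] - self.center[0]
        dy = pointer[1] - self.center[1]
        # Sensor: is the pointer still inside the square?
        inside = abs(dx) <= HALF_SIDE and abs(dy) <= HALF_SIDE
        self.mode = "stationary" if inside else "moving"
        if inside:
            return {}
        horizontal, vertical = AXIS_MAP[left_down]
        # Transformer input: rates proportional to the offset from the center.
        return {horizontal: dx, vertical: dy}
```

From the user's side only `mode` is visible, which matches the two states of Figure 2; the `AXIS_MAP` lookup is the extra discrete structure that a system model would have to make explicit.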

6. From design sketch to implementable model

The clear need for formalisms that can provide a more detailed, system oriented description has been argued in [10]. That paper reports on the difficulties that design teams have in documenting and communicating their ideas during meetings in the early phases of design. The use of formalisms and models should not be seen as an alternative to approaches such as rapid prototyping for the development of interfaces, but rather as a complementary activity. This activity allows for the early discovery of particular problems in the interface and helps in selecting the most promising design options, which can subsequently be used to develop a prototype.

6.1. Hybrid High-level Petri Nets

In this section we illustrate the use of HyNet, extended High-level Petri Nets [11, 22], as a modelling language. HyNet combines three promising concepts for the description of hybrid interfaces: a graphical notation to define discrete and continuous parallel behaviour, the availability of object oriented concepts, and the high-level hierarchical description that allows the specification of more complex systems. In order to accommodate the description of processes whose behaviour evolves continuously in time, the formalism provides differential algebraic equations. Sets of differential algebraic equations are commonly used in fields like physics to describe continuous change. The underlying concept of time is that of discrete time, i.e. time evolves in discrete small time units. Object-oriented concepts such as inheritance, polymorphism and dynamic binding provide means for a more compact and clear structuring of the specification of a complex system. A detailed description of Hybrid High-Level Petri Nets can be found in [24, 25].

Two small examples, showing most of HyNet's features, are given in Figure 3. As in standard Petri Nets [17], specifications consist of places and transitions connected by arcs. Besides standard arcs, HyNet has enabling and inhibitor arcs that respectively enable and inhibit a transition when a token resides on the place connected to the arc. The number and type of tokens that can reside on a place at any moment is defined in an inscription label. Tokens in HyNet can be simple tokens, but also complex objects with object oriented features much like those found in the C++ language, and are defined separately from the net definition. Also, transitions are labeled by inscriptions which define their characteristics. For continuous transitions these are the activation condition and the set of differential equations. Usually these are written within the box denoting the transition. Discrete transitions have more complicated inscriptions, defining also the possible delay, firing time and firing capacity. For continuous transitions these have default values where the delay and firing time are zero and the capacity is infinite.
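The three arc kinds and the transition inscriptions just listed can be pictured with a small data structure. The following Python sketch is not HyNet and does not reproduce its tool or syntax; the class names, field names and the `enabled` check are hypothetical, and treating an enabling arc as a test that does not consume the token is an assumption. The defaults mirror the values stated above for continuous transitions (zero delay, zero firing time, infinite capacity).

```python
import math
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Place:
    tokens: int = 0


@dataclass
class Transition:
    inputs: list = field(default_factory=list)      # standard arcs
    enablers: list = field(default_factory=list)    # enabling arcs
    inhibitors: list = field(default_factory=list)  # inhibitor arcs
    activation: Callable[[], bool] = lambda: True   # activation condition
    delay: float = 0.0                              # inscription defaults as
    firing_time: float = 0.0                        # stated for continuous
    capacity: float = math.inf                      # transitions

    def enabled(self) -> bool:
        # Standard and enabling arcs need a token on their place; an
        # inhibitor arc blocks the transition while its place holds one.
        return (
            all(p.tokens > 0 for p in self.inputs)
            and all(p.tokens > 0 for p in self.enablers)
            and all(p.tokens == 0 for p in self.inhibitors)
            and self.activation()
        )
```

A discrete transition would override `delay`, `firing_time` and `capacity` in its inscription; the sketch deliberately leaves out timing and token movement.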
Figure 3 shows an example of a discrete (dt) and a continuous (ct) transition. The discrete transition is enabled when all incident places have a token (except the one connected by an inhibitor arc), the activation condition (AC) is fulfilled and the firing capacity is not exceeded. At that point the delay time starts and the transition is fired when the delay has passed without change of the enabling conditions. The duration of firing of the transition is given by the firing time (FT). When dt fires, the variable y gets the Boolean value corresponding to the evaluation of the expression x.at(2) > x.at(1). The continuous transition fires when there are tokens on the incident places that are not connected to an inhibitor arc and the activation condition is fulfilled (y 4). It continuously, i.e.

[Figure residue. Recoverable inscription fragments: 3 && z > 0; FA: y = x.at(2) > x.at(1); DT: 2; FT: z. Recoverable labels: position calculation, Cursor, Square, view point control, p2, p3, [Token, 1], [Int, 1].]

a) y