Chapter 4. General Principles of Human–Computer Interaction

This chapter serves as an introduction to general principles underlying the design of human–computer interfaces. Standard design and evaluation methods and guidelines are covered and linked to 3D UI design.

4.1 Introduction

In the previous chapter, we focused on the low-level perceptual, cognitive, and ergonomic capabilities of humans. In this chapter, we discuss how humans use those capabilities to interact with computers. We offer a broad overview of the HCI field, ranging from low-level human processor models to the high-level principles of activity theory, and include design guidelines and UX engineering processes for HCI practitioners.

While most of this book discusses specific issues related to the design and evaluation of 3D UIs, we recognize that it is critical to have foundational knowledge in HCI before building the 3D UI–specific knowledge on top of that foundation. Since readers may come to the topic of 3D UIs from a wide variety of backgrounds, not everyone will be familiar with general principles and processes used in HCI. Thus, we offer this chapter for readers needing an introduction to these foundational topics before diving into the particulars of 3D interaction. In this chapter, we offer an overview of the HCI field while highlighting topics of particular relevance to 3D UIs and providing pointers to additional readings that will strengthen this foundation.

4.1.1 What Is HCI?

Human–Computer Interaction (HCI) is a field that seeks to understand the relationship between human users and digital technological artifacts and to design new, effective ways for humans to use computer technologies for all sorts of purposes. Putting the “H” at the beginning of HCI is no coincidence; HCI focuses primarily on the human users and their needs. Thus, it is rooted in an understanding of human perception (sensing), cognition (thinking), and acting in the world coming from psychology, anatomy, biomechanics, and related human-centric fields (many of these topics were discussed in the previous chapter).

The word “computer” in HCI is typically construed very broadly. This word should conjure images not only of desktop and laptop computers but also of phones, watches, microwaves, automobiles, and video games—indeed, anything that contains digital computational elements. While HCI has played a vital role in designing and understanding users’ interaction with traditional computers, one could argue that most interaction today, and certainly the more interesting research in HCI, takes place off the desktop, including interaction with 3D UIs.

Another nuance of the definition of HCI is that while the term focuses on the interaction between people and computers, it also includes interaction among multiple people (e.g., couples, colleagues, workgroups, online communities) that is mediated by computer technology. So designing effective HCI is not just about how to get the computer to do what you want it to do but also about how to enable effective human–human communication and collaboration.

So what is effective HCI? The answer can depend to a large extent on the purpose of the interaction. HCI encompasses any application of computer technologies, with many different purposes. While serious work is an important computer application category, with its associated measure of task performance, HCI is also concerned with how to make video games more fun and challenging, online chats more engaging and deep, calendar notifications less annoying and more timely, and interactive art more thought-provoking.

Whatever the purpose, HCI is not simply about understanding our current interaction with computers. It is just as much, if not more, about designing new forms of interaction. This might take the form of developing a new input device or display, designing a set of gestures to be used in an application, or repurposing existing technology in imaginative new ways. Thus, HCI can be approached both from a scientific perspective (understanding the world of human–computer interactions), and from engineering or art perspectives (creating and designing new human–computer interactions).

Because of this emphasis on design, many practitioners of HCI have traditionally focused on user interfaces (UIs); that is, on designing the parts of a computer system that users see and interact with. More recently, UI design has come to be seen as part of user experience (UX) design, which encompasses not only the specifics of the interface but also the social context of the system, the ecology of tools and technologies in which the system lives, and everything that the user brings to the table (e.g., training, cultural background, emotional state) when using the system. UX also puts particular emphasis on a broad definition of effectiveness, including the emotional impact of using a system.

Interaction techniques are particular elements of a UI. They are the methods by which a user accomplishes a task or part of a task, and they are primarily concerned with the form of the input given by the user (although they also involve display elements that give feedback to the user) and the mapping between that input and its result in the system. In most parts of the HCI world, interaction techniques are no longer of primary importance for designers or researchers, since the techniques that work well for interaction via a mouse, keyboard, or joystick are well understood and standardized. In certain segments of HCI, however (including 3D UIs, touch-based gestural interfaces, and brain–computer interfaces), the input devices being used are so different, widely varied, and nonstandard that the design of interaction techniques becomes a very interesting topic. That’s why a large chunk of this book (Part III, “3D Interaction Techniques”) is devoted to the design of 3D interaction techniques, although you won’t see this topic addressed very deeply in typical HCI books.

One fact that every UX designer comes to terms with early on is that design in HCI is a series of tradeoffs. In other words, questions of design are not typically questions that have an unambiguous answer. There is not usually an optimal design (but some designs are demonstrably better than others). In making a design choice (and implicitly rejecting other choices), there are both benefits and disadvantages, and the benefits of one choice must be traded off against those of other choices. As a first principle of any sort of UX design (and of 3D UI design in particular), then, we encourage the reader to recognize these tradeoffs and then analyze them to determine the best course of action.

4.1.2 Chapter Roadmap

In this chapter, we give a summary and high-level overview of some of the key topics in HCI that are most relevant to 3D UIs. Again, this chapter and the previous one are meant for those readers without a background in human factors and HCI.

Since HCI is concerned with both understanding how humans interact with computers and designing such interactions, we organize the chapter around these ideas. Section 4.2 describes some of what we know from years of research that has sought to understand and explain human–computer interaction. This knowledge has been gathered and synthesized into some overarching theories of HCI that are highlighted in this section.

Sections 4.3 and 4.4 address the design side of HCI. In section 4.3, we explain how the knowledge and understanding of HCI, as seen in the theories, has been turned into actionable guidance for designers. We discuss high-level principles and guidelines for designing systems that provide effective user experiences. Since our knowledge is incomplete and even the best set of guidelines is insufficient to guarantee a good user experience, section 4.4 covers the key stages and elements of the UX engineering process.

4.2 Understanding the User Experience

HCI emerged in the early 1980s as a specialty area in computer science that embraced cognitive science and human factors engineering (Carroll 2012). Since then, HCI has attracted researchers from many other disciplines. As a result, HCI has evolved to incorporate many different types of user experiences and many different approaches for better understanding those experiences. In this section, we discuss some of the most prominent UX models and theories that HCI researchers have developed.

We begin in section 4.2.1 by discussing the low-level human processor models, which consider humans as computing entities with processors and memories that handle the atomic actions of interacting with computers. We then discuss user action models in section 4.2.2. Instead of focusing on atomic actions, these models focus on higher-level interactions, such as forming goals for interaction and perceiving whether those goals have been achieved within the system. In section 4.2.3, we describe conceptual models and affordances. In particular, we discuss how different types of affordances help to communicate the designer’s conceptual model of a system to the user. In section 4.2.4, we expand our focus beyond just the user and the computer to discuss activity theory and how the real-life settings that HCI occurs in can affect the user experience. Finally, in section 4.2.5, we discuss the notion of embodied interaction and how it incorporates less-traditional user experiences, such as tangible computing, social computing, and ubiquitous computing.

4.2.1 Human Processor Models

Modeling humans as complex computer systems was one of the earliest approaches to understanding human–computer interaction. As first described by Card et al. (1983), the Model Human Processor regards humans, their capabilities, and their memories as analogous to computers, processing units, and storage disks, as seen in Figure 4.1. The model consists of three processors. First, a perceptual processor handles the sensory stimuli perceived by the user and stores the translated information in working memory. A cognitive processor then leverages information from both working memory and long-term memory to make decisions. Finally, a motor processor carries out the decisions through physical actions. We provided a similar representation in Chapter 3 (Figure 3.1), where we looked at these processes from a low-level perspective focusing on human capabilities. In this chapter, we consider the same processes but look at how they affect the human–computer interaction loop and how they can be formally represented through interaction models.

Figure 4.1 The Model Human Processor. (Image adapted from Card et al. 1983)

The Model Human Processor can be used to account for various perceptual, cognitive, and motor processes. Each processor has an estimated cycle time and associated memories with estimated capacities and decay times. For example, Card et al. (1983) estimated that the perceptual processor had a minimum cycle time of 50 milliseconds, a maximum capacity for visual images equal to 17 letters, and an average visual decay half-life of 200 milliseconds. Using such estimates, they were able to calculate the limits of human capabilities, such as the maximum time interval for an object seen in successive images to be perceived as moving, the amount of time required to determine whether two strings match, and the number of words per minute a user can type. These examples demonstrate the Model Human Processor's ability to model a wide range of perceptual, cognitive, and motor processes at a low level.
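To see how such estimates translate into concrete predictions, consider the following minimal Python sketch. It encodes nominal processor cycle times in the spirit of Card et al. (1983) and derives two classic predictions; the specific values used here (100 ms perceptual, 70 ms cognitive, and 70 ms motor cycles) are illustrative assumptions rather than definitive figures.

# A minimal sketch of Model Human Processor predictions. The cycle times are
# nominal, illustrative values in the spirit of Card et al. (1983).

PERCEPTUAL_CYCLE_MS = 100  # tau_P: time to process one sensory "frame"
COGNITIVE_CYCLE_MS = 70    # tau_C: time for one recognize-act decision
MOTOR_CYCLE_MS = 70        # tau_M: time to issue one motor command

def simple_reaction_time_ms():
    """Press a button as soon as a light appears:
    one perceptual + one cognitive + one motor cycle."""
    return PERCEPTUAL_CYCLE_MS + COGNITIVE_CYCLE_MS + MOTOR_CYCLE_MS

def max_frame_interval_for_apparent_motion_ms():
    """For successive images to be perceived as continuous motion, frames
    must arrive within roughly one perceptual processor cycle."""
    return PERCEPTUAL_CYCLE_MS

print(simple_reaction_time_ms())                    # ~240 ms
print(max_frame_interval_for_apparent_motion_ms())  # ~100 ms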

Keystroke-Level Model

Some human processor models sacrifice such robustness to be easier to use. For example, the Keystroke-Level Model (KLM) was one of the first human processor models to become popular. Card et al. (1980) developed it to predict the time that it takes an expert user to perform a given task without errors while using an interactive system. The KLM consists of a system response operator (during which the user perceives the system's output), a cognitive operator, and four motor operators. The R operator represents the time that the user must wait for the system to respond to a motor operator. The M operator represents the time that the user must spend to mentally prepare for one or more of the motor operators. The four motor operators focus on handling the input devices, pressing keys or buttons, pointing at a target, and drawing a line. In particular, the H operator represents the time required to home the hands on the keyboard or to another device, such as a mouse. The K operator represents the time that it takes for a user to perform the motor skills required to press keys and buttons. The P operator represents the time that it takes to point to a target on a display and is normally calculated according to Fitts's Law. Finally, the D operator represents the time needed to draw a set of straight-line segments, which is applicable to hierarchical menus and can be estimated with the steering law (Accot and Zhai 1999). See Chapter 3, “Human Factors Fundamentals,” section 3.2.4, for details on both Fitts's Law and the steering law.

In contrast to the Model Human Processor, the KLM offers the advantage of being easy to use without requiring in-depth knowledge of human factors or psychology. Task times can be predicted using previously established estimates for the operators (Card et al. 1980). This allows designs to be evaluated without recruiting subjects as representative users, and even without building a prototype. However, there are several limitations of the KLM. First, it only measures time, which is only one aspect of user performance. It also only considers expert users and error-free task executions. The mental operator also simplifies all of the user’s cognitive actions (and some perceptual actions) to a single value, which cannot represent the depth and diversity of those actions.
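As a concrete illustration of this kind of prediction, the following minimal Python sketch sums operator estimates for a short task. The operator times are the commonly cited estimates from Card et al. (1980) and should be treated as rough defaults rather than authoritative values; K varies widely with typing skill, and R is entirely system-dependent.

# A minimal sketch of a KLM task-time prediction. Operator estimates are the
# commonly cited values from Card et al. (1980) and are rough defaults only.

OPERATOR_TIMES_S = {
    "K": 0.20,  # keystroke or button press (skilled typist)
    "P": 1.10,  # point to a target on a display
    "H": 0.40,  # home the hands on the keyboard or another device
    "M": 1.35,  # mental preparation
    # "R" (system response) and "D" (drawing) are task- and system-dependent.
}

def predict_task_time_s(operator_sequence):
    """Sum operator times for an error-free, expert execution of a task."""
    return sum(OPERATOR_TIMES_S[op] for op in operator_sequence)

# Example: home the hand on the mouse, mentally prepare, point at a menu
# item, and click it.
print(round(predict_task_time_s(["H", "M", "P", "K"]), 2))  # 3.05 seconds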

GOMS

A more complex version of the KLM is the Goals, Operators, Methods, and Selection rules (GOMS) model, also created by Card et al. (1983). Goals represent what the user wants to accomplish with the system and are often represented as symbolic structures that define the desired state. Operators are the elementary perceptual, cognitive, and motor actions required to accomplish each goal, similar to those of the KLM. Methods are specific sequences of operators that can be used to accomplish a particular goal. Selection rules indicate which method the user should use when multiple methods are available for accomplishing the same goal. Figure 4.2 provides a visual representation of the GOMS model.

Figure 4.2 A visual representation of the GOMS model. (Image courtesy of Ryan P. McMahan)
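To make this structure concrete, the following minimal sketch writes down a hypothetical GOMS description for a simple goal. The goal, methods, operator sequences, and selection rule below are illustrative assumptions for the sake of the example, not an established model from the literature.

# A minimal, hypothetical GOMS description for the goal of deleting a file.
# Operators reuse the KLM symbols (M = mental, P = point, K = keypress/click).

GOAL = "delete-file"

METHODS = {
    # point at the file, press to start dragging, point at the trash, release
    "drag-to-trash":     ["M", "P", "K", "P", "K"],
    # point at the file, click it, then press the delete key
    "keyboard-shortcut": ["M", "P", "K", "K"],
}

def selection_rule(context):
    """A simple example rule: prefer the keyboard when hands are already there."""
    if context.get("hands_on_keyboard"):
        return "keyboard-shortcut"
    return "drag-to-trash"

method = selection_rule({"hands_on_keyboard": True})
print(method, METHODS[method])  # keyboard-shortcut ['M', 'P', 'K', 'K']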

Touch-Level Model

Despite being among the earliest HCI models, human processor models are still being studied today. Recently, Rice and Lartigue (2014) defined the Touch-Level Model (TLM), which extends the KLM to touchscreen and mobile devices. The TLM retains the original response time (R), mental act (M), homing (H), and keystroke (K) operators. It also includes several new operators. The distraction (X) operator is a multiplier applied to other operators to account for the time lost to distractions that naturally occur in the real world. Common touch gestures are represented by the tap (T), swipe (S), drag (D), pinch (P), zoom (Z), and rotate (O) operators. Specialized gestures requiring one or more fingers are represented by the gesture (G) operator. The tilt (L) operator represents the time required to physically tilt or rotate the device. Finally, the initial act (I) operator models the actions required for preparing the system for use, such as unlocking the device.

Human Processor Models for 3D UIs

A human processor model has yet to be designed specifically for 3D UIs, but researchers have used the concepts of human processor models to design and analyze various 3D interactions. McMahan and Bowman (2007) established several perceptual-motor operators for selecting various 3D objects and system control components in one of their user studies. They then used those operators to estimate the cognitive load of various system control task sequences (e.g., action-first or object-first tasks). Chun and Höllerer (2013) have also used a human processor model to informally compare a flicking gesture to a similar swiping gesture for touchscreen interfaces. Hence, human processor models can be used to design and analyze 3D UIs.

4.2.2 User Action Models

While human processor models represent the discrete atomic events that occur when the user performs certain operations, user action models portray the high-level interactions between the human and the system. These models consist of four distinct aspects: (i) the user’s goal, (ii) the user’s execution of physical actions, (iii) the outcomes of those actions within the system, and (iv) the user’s evaluation of those outcomes. For example, consider a user with the goal of moving a virtual object. The user decides upon an action plan, reaches out with a tracked handheld device to virtually touch the object, and presses a button to execute a grab (see Chapter 7, “Selection and Manipulation,” section 7.4.1 for details on the virtual hand technique). The system responds by attaching the object so that it now follows the movements of the tracked device. The user sees this outcome and comprehends that the goal of moving the object is in progress.

Seven Stages of Action

Norman’s seven stages of action was the first well-known user action model for human–computer interaction. In order to understand what makes interfaces difficult to use, Norman set out to define what happens when someone interacts with a system. His idea was to study the structure of actions. In doing so, he identified four considerations: “the goal, what is done to the world, the world itself, and the check of the world” (Norman 2013). Within these considerations, Norman defined seven stages of action.

The first stage of action is the formation of the user’s goal (e.g., the idea to move the virtual object). The second stage involves the formation of the user’s intention, which involves identifying a specific action that is expected to accomplish the goal (e.g., grabbing the object). The third stage yields the specification of an action plan, or sequence of actions, that will accomplish the intention (e.g., reach out with handheld device, virtually touch the object, and press the grab button). In the fourth stage, the user executes the action plan by performing the necessary physical actions. In response to the physical actions, the system determines the outcomes of the given input (e.g., attach the virtual object to the device’s position). In the fifth stage, the user perceives the stimuli generated by the system (e.g., displayed visuals of the object following the handheld device). The sixth stage involves the cognitive processing of the perceived sensory stimuli to interpret the current system state (e.g., the object is attached to the virtual hand). In the final stage, the user evaluates the outcome by comparing it to the intention and recognizes that the goal is in progress (e.g., the object is being moved). See Figure 4.3 for a visual representation of the seven stages of action.

Figure 4.3 Norman’s seven stages of action and the Gulfs of Execution and Evaluation. (Image courtesy of Ryan P. McMahan)

Gulfs of Execution and Evaluation

Along with the seven stages of action, Norman identified two major phases of interactions between the user and the system. The first phase involves what the user does to affect the system. The second phase involves how the system informs the user of its state. Norman described both of these phases as “gulfs” to represent the conceptual distance between the user’s mental model of the system and the truth about the system.

The Gulf of Execution represents the gap between the user's idea of what to do and the actual cognitive and motor actions that must be performed for the user's goal to be realized in the world. A user can achieve a goal by forming intentions, devising an action plan, and physically executing those actions required by the system to accomplish the goal. However, if the actions required by the system do not match the user's intentions, the user's goal will not be achieved. Consider the example goal of moving a virtual object. If the user expects a particular button to be pressed in order to grab the virtual object but the system requires a different button to be pressed, the system will not begin moving the object with the tracked device. Hence, the user has not accomplished the goal.

The Gulf of Execution essentially represents the potential failures of aligning the user’s mental model and expectations to the actual workings of the system and its requirements. Given enough time and experience with a particular system, users will develop procedural knowledge—the action sequences and routines required to achieve their goals and intentions (Cohen and Bacdayan 1994). However, when first interacting with a new system, the user must rely on semantic knowledge—general knowledge of objects, word meanings, facts, and relationships (Patterson et al. 2007). If the user’s semantic knowledge does not adequately match the system’s interface, the user will be forced to close the Gulf of Execution through trial and error. The designer, therefore, should present a user interface that makes it clear to the user (e.g., through affordances, labels, metaphors, or, if all else fails, instructions) how the system works and how goals can be achieved.

On the other hand, the Gulf of Evaluation represents the gap between the actual state of the system and what the user perceives that state to be through perception and cognition. In order to evaluate whether a goal has been accomplished, the user must perceive available stimuli, interpret those stimuli as outcomes, and assess whether those outcomes match the expected outcomes of the action plan. If the stimuli provided by the system cannot be perceived or are not easily interpreted, it can be difficult for the user to evaluate whether the intentions and goal have been accomplished. Reconsider our virtual hand example and assume that the system waits until the object is released to display its new position, instead of moving the object continuously with the tracked device. It is not possible for the user to confirm that the object is moving until releasing the object, at which point the object is no longer moving. Hence, it would be nearly impossible for the user to move the object to a target position. The designer, therefore, must carefully consider how to present sensory feedback to the user that makes the actual state of the system clear, providing the information the user needs to evaluate progress toward goals.

User Action Framework

Another user action model is the User Action Framework (UAF), which is a structured knowledge base of usability concepts and issues that was built upon Norman’s seven stages of action model (Andre et al. 2001). At the core of the UAF is the Interaction Cycle, which is a representation of user interaction sequences (Hartson 2003). The Interaction Cycle essentially partitions Norman’s seven stages into five phases: (i) planning, (ii) translation, (iii) physical actions, (iv) outcomes, and (v) assessment (Hartson and Pyla 2012). Planning involves the formation of the user’s goal and the intention. Translation is the process of devising an action plan. Physical actions cover the motor aspects of executing the action plan. The outcomes (system responses) are entirely internal to the system and include the feedback generated by the system. Finally, the assessment phase covers perceiving the state of the system, interpreting that state, and evaluating the outcomes.

While very similar to the seven stages of action, Hartson and colleagues developed the Interaction Cycle to emphasize the importance of translation during human–computer interactions, in comparison to assessment (Hartson and Pyla 2012). This can be seen in Figure 4.4. Additionally, the Interaction Cycle provides a hierarchical structure for organizing design concepts. The UAF uses this structure along with several established design guidelines to provide a knowledge base of usability concepts and issues. See The UX Book by Hartson and Pyla (2012) for more details about this knowledge base.

Figure 4.4 Hartson’s Interaction Cycle emphasizes the importance of translation. (Image adapted from Hartson and Pyla 2012)

User-System Loop

One user action model that has been developed specifically for 3D UIs is the User-System Loop of McMahan et al. (2015), which is a system-centric adaptation of Norman's seven stages of action. In the User-System Loop, every interaction begins with user actions that are sensed by (or physically manipulate) input devices; transfer functions then interpret this input as meaningful system effects. Those effects alter the data and models underlying the simulation's objects, physics, and artificial intelligence. Rendering software then captures aspects of the simulation's updated state and sends commands to output devices to create sensory stimuli for the user to perceive. This essentially models the flow of information between a user and a system (see Figure 4.5).

Figure 4.5 The User-System Loop models the flow of information between a user and a computer system from a system-centric point of view. See Chapter 11, section 11.6.3 for a discussion of how different types of fidelity impact the User-System Loop. (Image courtesy of Ryan P. McMahan)

Consider our virtual hand example. The user's actions include reaching out to virtually touch the object and pressing a button. The tracked handheld device acts as the input device. Transfer functions process the position of the device to determine if it is collocated with the virtual object and the button state to determine if a grab is being executed. The simulation updates its state by modeling the movement of the virtual object after the movements of the handheld device. Rendering software generates a 2D image of the object's updated position and sends it to the system's visual display, which generates light for the user's eyes to perceive. As seen from this example, the User-System Loop model includes more focus on system actions and less focus on user actions than the seven stages of action or the UAF. However, like the user-centric models, the stages modeled by the User-System Loop can be useful to consider when designing and analyzing user experiences. In particular, the choice of input devices and the design of transfer functions are critical to the usability of many 3D interaction techniques.
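The following minimal Python sketch illustrates the transfer-function stage of this loop for the virtual hand example, mapping raw device input (a tracked position and a button state) to a meaningful system effect (grabbing and moving an object). The data structures, names, and grab radius are illustrative assumptions, not the implementation described by McMahan et al. (2015).

from dataclasses import dataclass

@dataclass
class SceneObject:
    position: tuple      # (x, y, z) in world coordinates
    grabbed: bool = False

GRAB_RADIUS = 0.05  # meters; the device must be this close to "touch" the object

def distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

def virtual_hand_transfer_function(device_pos, grab_button_down, obj):
    """Interpret one frame of raw input as system effects on the simulation."""
    if grab_button_down and distance(device_pos, obj.position) < GRAB_RADIUS:
        obj.grabbed = True         # grab begins when touching with button down
    if not grab_button_down:
        obj.grabbed = False        # releasing the button releases the object
    if obj.grabbed:
        obj.position = device_pos  # the object follows the tracked device
    return obj                     # updated simulation state, ready to render

# One iteration of the loop: the device touches the object with the button held.
obj = SceneObject(position=(0.0, 1.0, 0.5))
obj = virtual_hand_transfer_function((0.01, 1.0, 0.5), True, obj)
print(obj.grabbed, obj.position)  # True (0.01, 1.0, 0.5)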

4.2.3 Conceptual Models and Affordances

While user action models portray the interactions that occur between a user and the system, conceptual models can be used to understand how the user’s expectations match the system’s (or designer’s) expectations. A conceptual model is an understanding of the information and tasks that constitute a system (Norman 2013). Because these models are based on an individual’s understanding of a system, and not what actually constitutes the system, conceptual models can be incomplete and informal representations of the system (Rosson and Carroll 2001).

Designer’s Model versus User’s Model

There are two important conceptual models. The first is the designer's model, which represents how the designer understands the system from a development perspective. This model should be a relatively complete and systematic representation of the implemented system. While the designer's model will include a mental representation of the system, technical representations such as task analyses, flow diagrams, and usage models also form part of the designer's model of a system (Rosson and Carroll 2001). The second important conceptual model is the user's model, which represents how the user understands the system based on experiences with it. Unlike the designer's model, the user's model is formed through ad hoc interactions with the system. This can result in an incomplete and simplistic understanding of what constitutes the system and how it works.

Ideally, for the system to have high levels of usability, the user’s model should match the designer’s model (Norman 2013). During the development process, designers spend a great deal of time expanding and refining their conceptual models of the system. However, users only have their own experiences and interactions with the system to form their models of it. As the designers do not talk directly with the users in most cases, the system is responsible for communicating the designer’s model to the users. If the system and UI do a poor job at making the designer’s model clear and consistent, the users will end up with incorrect conceptual models.

For example, consider a detailed virtual environment in which only a couple of objects can be manipulated. There’s a good chance that the user’s first attempts to interact with the environment may include noninteractive objects. Based on these failed interactions, the user may quickly form the conceptual model that all of the objects are noninteractive. Hence, it is the responsibility of the designer to ensure that the system sufficiently conveys to the user that the two objects can be manipulated.

Affordances

To increase the likelihood of the user’s model matching the designer’s, HCI researchers have studied affordances of systems. In basic terms, an affordance is something that helps the user do something (Hartson and Pyla 2012). The term “affordance” originated from perceptual psychology, where Gibson (1977) introduced the term to refer to a property of an animal’s environment that offers or provides the animal with something. Norman (1988) later popularized the concept within the HCI community to refer to the physical and perceived properties of an object that determine how it could possibly be used. Consider, for example, a door push bar, which is often found on the interior side of exterior building doors. Physically, the push bar affords no other action than to push the door open, as pulling on the bar is difficult without a place to grasp it. This affordance (i.e., the door must be pushed and not pulled) is easily perceived and understood by anyone who has prior experience with door push bars. Affordances of a system can help users better understand how it works, which improves their conceptual model of the system.

While the concept of an affordance is simple, there has been some confusion over what the term exactly refers to. This is primarily due to several researchers defining affordances in different ways (Norman 1988; Gaver 1991; McGrenere and Ho 2000). We believe Hartson (2003) addresses the confusion in a logical way with his four categories of affordances—cognitive, physical, functional, and sensory.

A cognitive affordance is a system property that helps the user with cognitive actions, such as deciding upon an action plan. For example, the user is likely to understand that removing objects from the virtual environment is possible with the provision of an eraser in a virtual tool belt (see Chapter 9, “System Control,” section 9.8.1 for details on the “tool belt” technique). A physical affordance is a property of the system that helps the user perform the necessary physical actions. In the tool belt example, if the eraser tool is near the extent of the user’s reach, it may be difficult for the user to select the eraser without making a conscious effort to reach for it. However, if the eraser and tool belt are located closer, the user should be able to select the eraser with little thought or effort. A functional affordance is a system property that helps the user to accomplish a high-level task. The eraser tool described affords the functionality of removing objects from the virtual environment. Finally, a sensory affordance is a property of the system that helps the user with sensory actions, such as recognizing that a goal has been accomplished. For example, it may be difficult for a user to see whether the eraser tool is touching the object to be erased because of a lack of haptic feedback. If the system highlights objects whenever they are touched by the eraser, the user’s perception of “touching” will be improved.

Note how cognitive, physical, functional, and sensory affordances correspond to the four primary aspects of user action models—the user’s goal, the user’s execution of physical actions, the outcomes of those actions within the system, and the user’s evaluation of those outcomes. With traditional UIs, designers are primarily concerned with cognitive and sensory affordances. These types of affordances relate directly to interaction design and information design (see section 4.4.3), respectively. However, for 3D UIs, physical affordances can sometimes be more important than cognitive and sensory affordances. For example, if users cannot physically perform the gesture required to travel within a virtual environment, it is irrelevant that they know what the intended travel gesture is or perceive that they are not traveling due to the gesture not working as expected. Finally, all systems are expected to provide some sort of functional affordances. Why else would people use them? Hence, it is important for 3D UI designers to consider all four types of affordances when creating systems.

4.2.4 Activity Theory

All of the previous models for understanding the user experience have focused primarily on the user and the system. However, humans do not interact with systems in a vacuum. Instead, these interactions occur in the real world, which includes other humans, systems, objects, and activities. Because of this, HCI researchers have looked at the bigger picture—how do humans interact with computers within real-world contexts? This is sometimes called the ecological perspective of UX (Hartson and Pyla 2012). This has led HCI researchers to adopt and develop new theories as tools for analyzing and designing user experiences for specific real-life contexts. One such theory is activity theory.

Activity theory is a framework for helping researchers to orient themselves in complex real-life contexts (Kaptelinin and Nardi 2012). Unlike prior UX models, it did not originate from within the HCI community. Instead, it was adopted from a movement within Russian psychology that originated in the 1920s. The main thrust of this movement was to overcome the conceptual divide between the human mind and aspects of society and culture (Kaptelinin and Nardi 2012). Several researchers contributed fundamental ideas to the movement, including Rubinshtein (1946) and Vygotsky (1980). However, Leont'ev (1978) is considered the chief architect of activity theory (Nardi 1996).

In activity theory, an activity is a relationship between a subject and an object, which is commonly represented as “S < – > O” (Kaptelinin and Nardi 2012). A subject is a person or a group engaged in an activity while an object represents a motive, something that can meet a need of the subject (Nardi 1996). For example, a user (the subject) can use a visual display (the object) to view a virtual environment (the activity). However, it is important to note that objects do not have to be physical things (Kaptelinin and Nardi 2012). If an artist needs a virtual 3D model for communicating a new car design, the virtual model itself serves as the object for the communication activity.

Principles of Activity Theory

Activity theory includes several basic principles (Kaptelinin and Nardi 2012). Object-orientedness is the principle that all activities are directed toward their objects and are differentiated from one another by their respective objects (Kaptelinin and Nardi 2006). For example, when users 3D-print a thing, they are 3D-printing some object. Additionally, the activity of 3D-printing a dinosaur model is different from the activity of 3D-printing a sphere.

Another principle of activity theory is that activities are hierarchical structures. Activities are composed of actions, and actions are in turn composed of operations (Kaptelinin and Nardi 2012). Actions are the steps necessary to accomplish the activity. Each step corresponds to a goal that the subject must consciously work toward. On the other hand, operations are routine processes that bring about the conditions necessary to complete a specific action. For example, in the activity of reviewing a virtual architectural design, an architect will consciously decide to perform the actions of reviewing every space, room by room. However, during this process, the architect is likely not paying attention to the navigation operation, unless the travel technique is poorly designed.

Internalization and externalization are two other basic principles of activity theory (Kaptelinin and Nardi 2012). Internalization is the concept that external operations can become internal. For example, when first learning to use a touchpad interface, many users may need to watch their fingers move on the touchpad. However, over time, most users can interact with these interfaces seamlessly, without watching their fingers. On the other hand, externalization is the transformation of internal operations into external ones. For example, the artist must externalize her design for a new car into a 3D model before she can share it with others.

The principle of mediation is the concept of a third entity that changes the original relationship between a subject and an object (Kuutti 1996). For example, modern GPS systems mediate the activity of humans navigating to their desired locations. A principle closely related to mediation is development. Development is the realization that activities will transform over time (Kaptelinin and Nardi 2006). While early humans had to observe the sun and stars in attempts to reach target destinations, the activity of navigation has changed over time due to mediating tools, such as maps, compasses, and GPS.

Activity System Model

Engeström (1987) proposed the Activity System Model as an extension of activity theory that accounts for activities carried out by collective subjects. The Activity System Model redefines the notion of an activity to include community as a third element, to represent the three-way interaction between subjects, their communities, and objects. Additionally, Engeström (1987) noted that a special type of entity mediates each of the three distinct interactions. Instruments mediate the subject-object interaction. Rules mediate the subject-community interaction. And, the division of labor mediates the community-object interaction. Finally, the outcome of the activity system represents the transformation of the object produced by the interactions and the mediating aspects, as seen in Figure 4.6.

Figure 4.6 Engeström’s Activity System Model. (Image adapted from Engeström 1987)

See Activity Theory in HCI by Kaptelinin and Nardi (2012) for an in-depth discussion of activity theory and the Activity System Model.

Activity Theory and 3D UIs

In addition to traditional interfaces, activity theory has also been used to understand user experiences in 3D UIs. Roussou et al. (2008) explored using activity theory as a tool for analyzing user interactions within virtual environments. The researchers categorized aspects of the interactions based on their effect within the activity system. For example, they categorized the virtual environment as the tool that mediated the user’s ability to accomplish her goal. Additionally, they categorized changes in the user’s conceptual model as changes to the rules that mediate the subject and community. As the researchers suggest, there are many other opportunities for using activity theory in the design and evaluation of 3D UIs.

4.2.5 Embodied Interaction

Another concept closely related to activity theory is Embodied Interaction, which is defined as “interaction with computer systems that occupy our world, a world of physical and social reality, and that exploit this fact in how they interact with us” (Dourish 2001). In other words, Embodied Interaction exploits our familiarity with the real world, including experiences with physical artifacts and social conversations. Dourish (2001) identifies these familiar real-world aspects as embodied phenomena, which by his descriptions are things that exist and are embedded in “real time and real space.” He further explains that Embodied Interaction is “the creation, manipulation, and sharing of meaning” through engaged interaction with these embodied phenomena.

In his book Where the Action Is: The Foundations of Embodied Interaction, Dourish (2001) primarily discusses the concept of Embodied Interaction in the contexts of tangible computing and social computing. We describe each of these HCI fields below and how Embodied Interaction relates to each.

Tangible Computing

Tangible computing is an HCI field that focuses on users interacting with digital information through the physical environment (Ishii 2008). Specifically, tangible user interfaces (TUIs) use physical objects to seamlessly and simultaneously integrate physical representations of digital information and physical mechanisms for interacting with them (Ullmer and Ishii 2001). A key aspect to TUIs is that computation is distributed across a variety of physical objects that are aware of their location and their proximity to other objects (Dourish 2001). This allows for concurrent access to and manipulation of these spatially aware computational devices (Kim and Maher 2008).

By their nature, TUIs provide Embodied Interaction (Dourish 2001). The physical objects used in tangible computing are great examples of embodied phenomena. They are embedded into the physical world and exist in real time and real space. Additionally, these physical objects provide the ability to take advantage of highly developed skills for physically interacting with real-world objects (Dourish 2001). Furthermore, they allow users to create, manipulate, and share meaning through physical interactions, as each physical object represents digital information.

TUIs are highly relevant to the 3D UI community. From an Embodied Interaction perspective, the tangible components are viewed as physical representations of digital information. However, from an AR perspective, the tangible components can be considered enhanced or augmented by the digital information. Hence, TUIs are essentially a subset of AR interfaces (Azuma et al. 2001).

Social Computing

Social computing describes any type of system that serves as a medium or focus for a social relation (Schuler 1994). As media, computer systems can facilitate social conventions and contexts (Wang et al. 2007). Consider, for example, how email has largely replaced conventional paper-based mail. Alternatively, as the focus, a computer system can create new opportunities and types of social interactions (Dourish 2001). For example, massively multiplayer online role-playing games (MMORPGs) have enabled new opportunities for online gamers to socialize in a way not possible in the real world.

Social computing is closely related to Computer-Supported Cooperative Work (CSCW), which addresses how “collaborative activities and their coordination can be supported by means of computer systems” (Carstensen and Schmidt 1999). One of the primary concerns of CSCW is in what context a system is used. To help categorize contexts for CSCW systems, Johansen (1989) introduced the CSCW Matrix, which can be seen in Figure 4.7. This matrix categorizes CSCW contexts based on time and location. With regard to time, individuals can collaborate synchronously at the same time or asynchronously at different times. For location, social interactions can occur collocated in the same place or remotely in different places.

Figure 4.7 The CSCW Matrix categorizes the contexts of social computing according to time and location. (Image adapted from Johansen 1989)

By their nature, social conventions and conversations are embodied phenomena that create, manipulate, and share meaning. Hence, social computing and CSCW applications that facilitate such social interactions provide Embodied Interaction by definition (Dourish 2001). Additionally, CSCW applications that serve as the focus for social relations can provide new opportunities for creating, manipulating, and sharing meaning. Dourish (2001) terms this development of opportunities as “appropriation,” which is essentially the emergence and evolution of new practices within real-life settings, both physical and social.

Social computing and CSCW are highly relevant concepts in the 3D UI field. One of the earliest goals within the field was to create systems that enabled telepresence, the sharing of task space and person space in collaborative work (Buxton 1992). Since then, researchers have developed a number of systems that allow users to interact in the same virtual 3D space while present in different physical 3D spaces. Researchers have even developed group-to-group telepresence, in which two groups of remote users can meet within a virtual environment and engage in 3D social interactions (Beck et al. 2013). In addition to telepresence, researchers have investigated using 3D UIs as collaborative visualization systems, with multiple systems having been developed (Pang and Wittenbrink 1997; Lascara et al. 1999; Sawant et al. 2000).

4.2.6 Evolving Understanding

In the previous sections, we covered a number of models and theories that attempt to understand and describe the user experience. However, these are just a small sample of the contributions that have been made to the HCI and UX fields. Yet, even if we were able to take the plethora of UX models and theories into account as a whole, we would still not fully understand every user experience. This is mainly due to the fact that HCI is constantly changing. As it evolves and new UIs emerge, it is up to HCI and UX researchers to continue attempting to better understand the user experience.

4.3 Design Principles and Guidelines

Through understanding the user experience, HCI researchers have extracted and elucidated numerous principles and guidelines for designing UIs. In this section, we present some common ones that are repeatedly discussed. We organize these principles and guidelines according to the four distinct aspects of the user action models: (i) goal, (ii) execution, (iii) outcome, and (iv) evaluation.

4.3.1 Goal-Oriented Design Rules

It is important that UX designers create UIs that facilitate the formation of the user’s goals. If a UI is overly complex, it may take the user extra time to form a goal. If the UI must be complex due to the system’s wide range of functions, the corresponding components should be organized in a sensible manner. Finally, if the purpose of a UI component is not clear, the user may waste time discerning what it does. Hence, in this section, we discuss the design principles of simplicity, structure, and visibility.

Simplicity

Simplicity is the design principle that the UI should be kept “as simple as possible and task focused” (Johnson 2014). Based on the human processor models of HCI, we know that a complex UI will require more perceptual and cognitive processing time in order to form a goal than a simpler UI with fewer components. The Hick-Hyman Law (Hick 1952; Hyman 1953) formally states that decision time increases logarithmically with the number of choices. Hence, it is important that the design of a UI remains simple, with as few components as possible to achieve tasks within the system.
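In its commonly used form, the Hick-Hyman Law is written as T = a + b log2(n + 1), where n is the number of equally likely choices and a and b are empirically fitted constants. The following minimal sketch shows how the prediction grows with the number of choices; the constant values are illustrative assumptions, not canonical figures.

from math import log2

def decision_time_s(n_choices, a=0.2, b=0.15):
    """Hick-Hyman prediction for n equally probable alternatives
    (a and b are illustrative constants)."""
    return a + b * log2(n_choices + 1)

for n in (1, 2, 4, 8, 16):
    # doubling the number of choices adds roughly a constant amount of time
    print(n, round(decision_time_s(n), 2))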

There are a number of guidelines that UX designers can follow to keep their UIs simple. Nielsen (1994) explains that UIs should not contain information or components that are irrelevant or rarely needed. An example use of this guideline is region-based information filtering, which reduces information clutter in AR applications (Julier et al. 2000). Another guideline to keep the UI simple is to avoid extraneous or redundant information (Constantine and Lockwood 1999). For example, Tatzgern et al. (2013) used compact visualization filters to reduce redundant information in AR applications, like the one seen in Figure 4.8. Finally, if a UI is still complex after removing irrelevant and redundant components, a third design guideline is to provide the user with the capability to customize a default subset of controls (Galitz 2007). For example, smartphones allow users to rearrange the icons representing apps so that their most frequently used apps are visible on the first screen.
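As a simple illustration of this kind of filtering, the following sketch keeps only the annotations anchored within the user's current region of interest. It is a toy example in the spirit of region-based filtering; the data model, distance test, and radius are illustrative assumptions and not the actual technique of Julier et al. (2000).

# A toy region-based information filter: show only annotations whose anchors
# lie within a given radius of the user.

def filter_annotations(annotations, user_pos, region_radius=10.0):
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5
    return [a for a in annotations if dist(a["anchor"], user_pos) <= region_radius]

annotations = [
    {"label": "Exit",      "anchor": (2.0, 0.0, 1.0)},
    {"label": "Cafeteria", "anchor": (40.0, 0.0, -3.0)},
]
print(filter_annotations(annotations, user_pos=(0.0, 0.0, 0.0)))  # keeps only "Exit"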

Figure 4.8 The top image shows a truck annotated with numerous labels in a small font that is difficult to read. The bottom image of the truck uses compact visualization filters to remove redundant information, support larger font sizes, and keep the UI simple. (Image courtesy of Ryan P. McMahan)

Structure

Structure is the design principle that the UI should be organized “in a meaningful and useful way” (Stone et al. 2005). In particular, the structure of the UI should match the user’s conceptual model of the task (Rosson and Carroll 2001). The structure should facilitate the identification of specific actions that are expected to accomplish the user’s goal. To accomplish this, the UI’s components should be structured to “appear in a natural and logical order” (Nielsen 1994).

UX designers can rely on some guidelines when structuring their UIs. First, complex tasks should be broken down into simpler subtasks (Stone et al. 2005). A relevant 3D UI example of this guideline is the HOMER technique developed by Bowman and Hodges (1997). HOMER distinctly splits the complex task of manipulating a virtual object into two subtasks—selecting the object and positioning the object in 3D space. Many desktop 3D UIs go even further by breaking down the act of positioning the object into separate controls for each DOF. See Chapter 7, sections 7.9.1 and 7.7.3, for more details on HOMER and desktop-based manipulations, respectively.

Another structure-oriented design guideline is that every sequence of actions should be organized into a group or technique with a beginning, middle, and end (Shneiderman and Plaisant 2010). The SQUAD technique developed by Kopper et al. (2011) is an example of such a technique. It starts with casting a sphere to select a small group of objects and then uses a quad menu to refine the selection through a series of menu selections until the desired target is confirmed (see Chapter 7, section 7.10.3). Lastly, a similar guideline is to group any related or comparable functions. For example, in their rapMenu system, Ni et al. (2008) organized sets of related content into hierarchical pie menus (see Figure 4.9).

Figure 4.9 Similar controls or content can be grouped in hierarchical menus, such as this radial menu, to provide structure to the UI. (Image courtesy of Ryan P. McMahan)

Visibility

Stone et al. (2005) define visibility as the design principle that “it should be obvious what a control is used for.” In order for the user to form an intention and corresponding action plan, the user must understand what functions and options are currently available. From these, the user will decide upon a single feature or series of commands to accomplish her current goal.

The first design guideline for ensuring visibility is that UX designers must make certain that their controls are perceivable. This is also known as discoverability. In web-based interfaces, designers should ensure that important UI components do not appear “below the fold” (i.e., below the part of the page that is initially visible) (Nielsen 1999). For 3D UIs, designers have even more placement issues to be concerned with. First, UI components may initially be outside of the user's field of view. Second, controls may be occluded by parts of the virtual environment or other UI components. In general, head-referenced and body-referenced placements are recommended for 3D UI components because they provide a strong spatial reference frame. See Chapter 9, section 9.5.2, for more discussion on placing 3D UI components.
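As a simple illustration of body-referenced placement, the following sketch recomputes a UI panel's world position each frame from the user's torso position and yaw, plus a fixed body-local offset, so the panel stays in the same spot relative to the body. The coordinate conventions, function names, and offset values are illustrative assumptions.

from math import cos, sin

def body_referenced_position(body_pos, body_yaw_rad, offset=(0.25, -0.4, 0.35)):
    """Transform a body-local offset (right, up, forward) into world space
    by rotating its horizontal components by the body's yaw."""
    right, up, forward = offset
    world_x = body_pos[0] + right * cos(body_yaw_rad) + forward * sin(body_yaw_rad)
    world_y = body_pos[1] + up
    world_z = body_pos[2] - right * sin(body_yaw_rad) + forward * cos(body_yaw_rad)
    return (world_x, world_y, world_z)

# Each frame: place the panel slightly to the right, below, and in front of the torso.
print(body_referenced_position(body_pos=(1.0, 1.2, 0.0), body_yaw_rad=0.0))
# roughly (1.25, 0.8, 0.35)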

A second visibility guideline is to employ visual icons and symbols to represent UI features in a familiar and recognizable manner (Shneiderman and Plaisant 2010). By doing so, UX designers can leverage users’ perceptual processors instead of relying solely on cognitively demanding labels and descriptions. Many 3D UIs can leverage icons and symbols found in traditional desktop UIs. For instance, Moore et al. (2015) provided a recycle bin for removing notes and chords from their virtual musical interface (see Figure 4.10). However, as with traditional UIs, text can be more comprehensible than nonintuitive icons or indirect symbolism (Shneiderman and Plaisant 2010). Hence, 3D UI designers should also be familiar with using labels to make it obvious what their controls are used for.

Figure 4.10 The recycle bin in the Wedge musical interface (Moore et al. 2015) is a visible method for removing notes and chords from the composition environment. (Image courtesy of Ryan P. McMahan)

4.3.2 Execution-Oriented Design Rules

In addition to supporting the formation of the user’s goals, UX designers must be concerned with how the user specifies and executes an action plan. If the user does not know how to use a UI component, he will have difficulty specifying an action plan, even if he knows what the purpose of the UI component is. Additionally, it is futile to specify an action plan if the user cannot physically execute the plan. Finally, UX designers should help users avoid committing errors when executing action plans. Correspondingly, we discuss the design principles of affordance, ergonomics, and error prevention in this section.

Affordance

Affordance is the design principle that “it should be obvious how a control is used” (Stone et al. 2005). Note that a user may understand what the purpose of a UI component is (i.e., it has visibility), but the user may not understand how to use the control (i.e., it lacks affordance). For example, a user may understand that a ray-casting technique is provided for making selections, but the user may not know which button is required to confirm a selection. The clarity of how to use a control will depend on its cognitive, physical, functional, and sensory affordances, which are discussed in section 4.2.3.

While creating affordances may not be intuitive for all UX designers, there are design guidelines that will generally help in that process. One guideline is to leverage the user’s familiarity with other UIs, as an interface that has already been mastered is the easiest to learn (Constantine and Lockwood 1999). This can be challenging for 3D UI designers, as many users have yet to experience, let alone master, another 3D UI. However, some aspects of traditional UIs can be leveraged within 3D UIs, such as the case of adapting 2D menus for system control (see Chapter 9, section 9.5.1, for more details).

A second design guideline is to provide direct manipulation, which is Shneiderman’s concept of creating a visual representation of the world of action that users can directly manipulate and interact with (Shneiderman 1998). Originally, direct manipulation was used to refer to graphical UIs, such as the desktop interface, which provided visual representations for direct interactions, as opposed to command-line interfaces. However, it is easy to see that many 3D UIs, especially VR ones, inherently rely on the concept of direct manipulation. Hence, 3D UI designers should be cognizant of the degree of direct manipulation that their interfaces provide. In particular, the realism or fidelity of the system should, when possible, be kept extremely high (see Figure 4.11 for an example and Chapter 11, “Evaluation of 3D User Interfaces,” section 11.6.3, for a discussion of fidelity).

A final design guideline for affordance is consistency. Uniformity in the appearance, placement, and behavior of components within the user interface will make a system easy to learn and remember (Stone et al. 2005). Similar situations should require consistent sequences of actions (Shneiderman and Plaisant 2010). Essentially, consistency builds familiarity within a system. For 3D UI designers, one concern is that many input devices have multiple buttons. As much as possible, the same button should be used for the same type of action (e.g., trigger button for selection, thumb button for navigation).

Figure 4.11 Providing realistic, high-fidelity interactions, such as the virtual hand technique, is one design guideline for affordance. (Image courtesy of Ryan P. McMahan)

Ergonomics

Ergonomics is “the science of fitting the job to the worker and the product to the user” (Pheasant and Haslegrave 2005). As a UX design principle, ergonomics is concerned with the physical execution of action plans (see Chapter 3, “Human Factors Fundamentals,” section 3.5 for a discussion of the anatomical and physiological foundations of physical ergonomics). Can the user physically execute the motor actions required for the plan without injury or fatigue? If not, the UI is a failure, regardless of whether other design guidelines were closely adhered to.

There are four design guidelines for fitting the UI to the user: (i) clearance, (ii) reach, (iii) posture, and (iv) strength (Pheasant and Haslegrave 2005; Tannen 2009). As a guideline, clearance ensures there is adequate room between objects to move around without inadvertently colliding with something. For traditional UIs, a common clearance issue is the “fat finger” problem, in which users struggle to press a target button due to their fingers also touching nearby buttons. A similar issue for some 3D UIs is attempting to select a small object but struggling to do so because of colliding with other nearby small objects. However, selection technique enhancements can be used to alleviate these issues (see Chapter 7, sections 7.4.3 and 7.5.3).

The reach design guideline is focused on ensuring that the user has the ability to touch and operate controls (Pheasant and Haslegrave 2005). A common reach issue is that users struggle to operate certain smartphones with one hand due to touchscreens larger than their hands. Reach issues also exist in 3D UIs, particularly with body-referenced controls. If a control is placed at the extent of an average user’s arm length, a shorter user will struggle to reach it.

Another design guideline is to ensure that the user’s body posture does not deviate from a natural and comfortable position (Tannen 2009). Postural problems are often the result of other clearance and reach issues (Pheasant and Haslegrave 2005). However, the form factor, or shape, of an input device can also directly cause postural problems, such as contorting the wrist to hold a controller (Tannen 2009). Because 3D UIs usually require a broader range of body movements than traditional UIs, UX designers should carefully consider a comfortable posture as a design requirement. Both sitting and standing postures can be comfortable, although standing might lead to fatigue during long sessions, and sitting might be uncomfortable if users have to physically turn to look in all directions.

Strength is the focus of the final ergonomics design guideline and pertains to the amount of force required to operate a control (Pheasant and Haslegrave 2005). There are two issues relating to strength. First, there is the minimum amount of force required to use a control, which is especially important for weaker or smaller users. Second, there is the maximum amount of force that a control can handle. Some heavy-handed users may exert more force than an interface component can withstand, which may result in a broken system or device. For example, many force-feedback devices have a maximum force resistance (see Chapter 5, “3D User Interface Output Hardware,” section 5.4.1). If a user exerts more force than this, these devices are prone to break.
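The second issue can also be partially addressed in software. The following is a minimal sketch in Python with hypothetical names and an invented safety margin (it is not the API of any real haptic device): the system monitors the force the user is applying and stops resisting before the device’s rated limit is exceeded.

    # Minimal sketch: protect a force-feedback device from heavy-handed users by
    # backing off the rendered resistance before the rated limit is reached.
    # The function name, the 10% safety margin, and the force values are assumptions.
    def safe_resistance(user_force_n: float, rated_limit_n: float, margin: float = 0.9) -> float:
        """Return the resistance to render, yielding once the user nears the device limit."""
        if user_force_n >= rated_limit_n * margin:
            return 0.0                  # yield: stop resisting so the hardware is not damaged
        return user_force_n             # otherwise resist with an equal and opposite force

    for sensed in (5.0, 12.0, 18.5):    # forces measured by the device, in newtons
        print(sensed, "->", safe_resistance(sensed, rated_limit_n=20.0))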

Error Prevention

One of the most important design principles is the prevention of user errors (Shneiderman and Plaisant 2010). User errors occur for a number of reasons (see section 3.4.3 for a definition of human error). Due to poor visibility, a user may mistake a paste icon and its functionality for a copy feature. Due to poor affordance, users might misuse an eraser feature and delete entire images. Alternatively, due to poor ergonomics, the user may routinely press the wrong UI component by accident. Errors are time-consuming and frustrating for users. More importantly, some errors are difficult or even impossible to reverse. Hence, it is important for UX designers to prevent users from committing errors during execution, when possible.

There are design principles that help to prevent user errors. The first of these is to permit only valid actions (Shneiderman and Plaisant 2010). In systems with many features, it is unlikely that all of the options are valid given the system’s current state. Hence, invalid options should be disabled and shown as disabled to the user (Hartson and Pyla 2012). For example, in most applications, the copy menu item is disabled and grayed out unless something is currently selected and can be copied. In 3D UIs, constraints are often used to enforce correct actions. For example, if a large virtual object should only be slid about the floor and not picked up, the 3D UI designer can apply a vertical constraint to the object to keep it on the floor. See Chapter 10, “Strategies in Designing and Developing 3D User Interfaces,” section 10.2.2 for more information about constraints.
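To make the floor constraint concrete, the manipulation code can simply project each proposed object position back onto the permitted subspace before applying it. The following is a minimal sketch in Python with hypothetical names, not the API of any particular 3D engine.

    # Minimal sketch of a vertical constraint: the object may slide on the floor plane
    # (x and z change freely) but its height stays fixed, so it cannot be picked up.
    def constrain_to_floor(proposed_position, floor_height=0.0):
        x, y, z = proposed_position
        return (x, floor_height, z)     # discard any vertical displacement

    # During manipulation, the constraint filters each frame's proposed position.
    grabbed_at = (2.0, 0.0, -1.5)
    hand_offset = (0.4, 1.1, 0.2)       # the user raised the hand 1.1 m
    proposed = tuple(a + b for a, b in zip(grabbed_at, hand_offset))
    print(constrain_to_floor(proposed)) # -> (2.4, 0.0, -1.3): the object stays on the floor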

A second design guideline for preventing errors is to confirm irreversible actions (Shneiderman and Plaisant 2010). UX designers should create most actions to be reversible or easy to recover from (see section 4.3.4 for more discussion on error recovery). However, some actions are difficult or impossible to reverse, such as saving the current state of a file over its previous state. In these cases, UX designers should require users to confirm irreversible actions before executing those actions. This is often achieved through confirmation dialog boxes. However, in some 3D UIs, voice commands (see Chapter 9, section 9.6) can be used to confirm actions without interfering with the user’s current physical interactions.
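A minimal sketch of this guideline is shown below; the names are hypothetical, and confirm stands in for whatever confirmation mechanism the UI provides (a dialog box, a voice command, and so on).

    # Minimal sketch: require explicit confirmation before an irreversible overwrite.
    import os

    def confirm(prompt: str) -> bool:
        # Placeholder for a real confirmation dialog or voice command.
        return input(f"{prompt} (yes/no): ").strip().lower() == "yes"

    def save_over_previous_state(path: str, data: bytes) -> bool:
        if os.path.exists(path) and not confirm(f"Overwrite the existing file '{path}'?"):
            return False                # the user declined; the previous state is preserved
        with open(path, "wb") as f:
            f.write(data)
        return True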

Another error prevention guideline is to offer common outcomes to users based on their current actions. For example, Google’s autocomplete feature will suggest search terms based on what the user has typed thus far (Ward et al. 2012). Similarly, when making a repeating event, most calendar applications will offer options to repeat the event every day, week, month, or year, in addition to the capability to specify particular days. The SWIFTER speech widget developed by Pick et al. (2016) is an example of a 3D UI offering common outcomes to prevent errors. With SWIFTER, the user can correct misrecognized speech by selecting the correct word from a provided list of likely alternatives.

4.3.3 Outcome-Oriented Design Rules

The primary reason that the User Action Framework (see section 4.2.2) includes outcomes is that system outcomes have a huge impact on the user experience. Computers are supposed to perform work for humans, not the other way around. If the user initiates a fairly common high-level action, the computer should process that action with as little input as possible from the user. However, the computer must not take complete control of the action and prohibit user input on the outcome. In this section, we discuss the design principles of automation and control.

Automation

The design principle of automation is to have the computer and UI automatically execute low-level actions that commonly occur as part of a high-level task (Shneiderman and Plaisant 2010). For example, cascading style sheets (CSS) allow the user to change the font size of all headers across an entire website with a single edit, instead of finding and modifying each individual header instance. This type of automation reduces user work and decreases the likelihood that the user will make a mistake in editing each individual header.

There are three guidelines that UX designers should consider for automation. First, a UI should be designed to avoid tedious input from the user (Hartson and Pyla 2012). Many 2D UIs, particularly Internet browsers and websites, will automatically fill in common information, such as the user’s name and street address. This helps avoid typing errors, which could result in less-than-desirable outcomes, such as paying for a pizza to be delivered to a neighbor. For 3D UIs, which already present many challenges with regard to symbolic input, voice recognition can be used to simplify user input and avoid tedious interactions.

Second, UX designers should design interfaces to complete common sequences of actions (Shneiderman and Plaisant 2010). For example, many applications provide a simple installation feature with a default set of options. This allows the user to initiate the installation process with a few clicks, instead of requiring a sequence of inputs to specify the options. Target-based travel techniques, which automate travel to a target location without requiring the user to specify a path (see Chapter 8, “Travel,” section 8.6.1), are examples of automating a sequence of actions in 3D UIs.
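The automation in a target-based travel technique amounts to moving the viewpoint a small step toward the chosen target each frame until it arrives. The following is a minimal, engine-agnostic sketch with hypothetical names and parameters.

    # Minimal sketch of target-based travel: once the user picks a target, the system
    # automates the viewpoint motion frame by frame; no path input is required.
    def step_toward(position, target, speed, dt):
        """Move 'position' toward 'target' by at most speed*dt and report arrival."""
        delta = [t - p for p, t in zip(position, target)]
        dist = sum(d * d for d in delta) ** 0.5
        if dist <= speed * dt:
            return list(target), True
        scale = speed * dt / dist
        return [p + d * scale for p, d in zip(position, delta)], False

    viewpoint, arrived = [0.0, 1.7, 0.0], False
    target = [10.0, 1.7, -4.0]
    while not arrived:                  # normally driven by the render loop
        viewpoint, arrived = step_toward(viewpoint, target, speed=3.0, dt=1 / 60)
    print(viewpoint)                    # -> the viewpoint has reached the target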

Third, UX designers should provide interaction methods for completing a series of similar actions in parallel. For example, to delete a sentence in most word processors, the user does not have to delete each character individually by repeatedly pressing the backspace key. Instead, the user can use the mouse cursor to select the entire sentence and then delete it with a single press of the backspace key. Another example would be providing a technique that allows the user to select multiple objects at once (see Chapter 7, section 7.10.2, for some 3D UI examples of multiple-object selection).
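The sketch below illustrates the idea of acting on a selection set in parallel; the class and its methods are hypothetical rather than drawn from any particular toolkit.

    # Minimal sketch: a selection set lets one command act on many objects at once,
    # instead of forcing the user to repeat the action per object.
    class SelectionSet:
        def __init__(self):
            self.objects = []

        def add(self, obj):
            self.objects.append(obj)

        def apply(self, operation):
            for obj in self.objects:
                operation(obj)

    selection = SelectionSet()
    selection.add({"name": "chair"})
    selection.add({"name": "table"})
    selection.apply(lambda obj: obj.update(deleted=True))   # one command, many objects
    print(selection.objects)            # both objects are now marked as deleted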

Control

Another design principle that is closely related to automation is control. Control is the principle of ensuring that the computer and UI respond to the actions of the user (Shneiderman and Plaisant 2010). Users should feel that they are in charge. Hence, the UI should not make decisions that contradict the actions of the users. If UX designers are not careful with automation, they may alienate users by giving too much control to the system. Also, if an interface is not correctly implemented, users may experience a loss of control due to missing or incorrect functionality that fails to produce their desired outcomes (Hartson and Pyla 2012).

One design guideline for providing control to the users is to avoid too much automation (Hartson and Pyla 2012). This guideline may seem to contradict the earlier automation guidelines, but it directly addresses the balance of automation. UX designers must be careful not to create UIs with too little automation or too much automation. Additionally, UX designers should provide mechanisms for easily disengaging automation (Shneiderman and Plaisant 2010). For example, most autocorrect techniques will automatically replace misspelled words with commonly intended ones. However, these techniques often include a simple confirm feature that allows the user to override the automation and accept a non-dictionary word. Similarly, in 3D UIs, some target-based travel techniques allow users to disengage travel toward the target location in order to look around their current positions.
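A minimal sketch of disengaging automation is shown below (hypothetical names): the automated travel loop checks for user input every frame and yields control immediately when it appears.

    # Minimal sketch: automation that the user can disengage at any time.
    # step_fn advances the automated travel by one frame and returns True on arrival;
    # user_wants_control stands in for any cancel input (a button press, joystick motion).
    def automated_travel(step_fn, user_wants_control):
        while True:
            if user_wants_control():
                return "control returned to user"   # the automation yields immediately
            if step_fn():
                return "arrived at target"

    steps = iter([False, False, True])
    print(automated_travel(lambda: next(steps), lambda: False))   # -> arrived at target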

Another control design guideline is to provide features that facilitate both novices and experts (Shneiderman and Plaisant 2010). For instance, novices will be more dependent upon the affordances of a UI, such as a menu item for copying, while experts will benefit from more efficient features, such as a keyboard shortcut for copying. A 3D UI example of facilitating both novices and experts is marking points along a path to control travel, as shown in Figure 4.12 (see Chapter 8, section 8.6.2). Novices can use this technique to specify a single point at their desired destination, and then the technique automates travel to that point. However, experts can use the same technique to specify multiple points and control the path of travel.

Figure 4.12 The travel technique of marking points along a path is an example of providing control to both novices and experts. Novices can use the technique to specify a single destination (left). Alternatively, experts can specify multiple points to control the path of travel (right). (Image courtesy of Ryan P. McMahan)

A control design guideline that may seem quite obvious is to avoid missing or incorrect functionality (Hartson and Pyla 2012). Such issues usually arise due to UI or backend software bugs. However, these issues can also be caused by incomplete or overly complex designs that do not account for the range and potential sequences of user input. For example, consider a 3D UI in which a virtual hand technique can be used to manipulate and move objects within a virtual environment. If the UX designer does not consider the possibility that the user may attempt to move normally stationary objects, the user may be surprised when he accidentally moves an architectural wall while using the virtual hand technique near it.
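One simple safeguard against this particular incorrect behavior is to record explicitly which objects a manipulation technique may move. The sketch below uses hypothetical names and data.

    # Minimal sketch: explicitly marking which objects the virtual hand may move,
    # so that structural geometry (walls, floors) cannot be grabbed by accident.
    scene = [
        {"name": "coffee mug", "movable": True},
        {"name": "architectural wall", "movable": False},
    ]

    def try_grab(obj):
        if not obj["movable"]:
            return None                 # ignore the grab (or give subtle feedback instead)
        return obj                      # attach the object to the virtual hand

    for obj in scene:
        print(obj["name"], "->", "grabbed" if try_grab(obj) else "stays put")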

4.3.4 Evaluation-Oriented Design Rules

UX designers must also be concerned with how the user perceives, interprets, and evaluates the system’s outcomes. If the desired outcome is not obvious, the user may think that he incorrectly used a UI component or that the system is not functioning properly. If an outcome is obvious but incorrect, the system should provide the user with the ability to quickly recover from the error. In this section, we discuss the design principles of feedback and recovery.

Feedback

Feedback is the design principle that “it should be obvious when a control has been used” (Stone et al. 2005). There should be system feedback for every user action (Shneiderman and Plaisant 2010). Feedback for frequent or minor actions can be modest to avoid interrupting the user’s workflow. However, feedback for infrequent or major actions should be prominent, in order to get the user’s attention.

One guideline for providing feedback is to respond immediately to every user action to avoid a perceived loss of control. Hardware limitations and network communications are usually the cause of delayed responses; in these cases, there is little that UX designers can do to respond immediately with the desired outcome (Hartson and Pyla 2012). However, UX designers can create interfaces to notify users that the outcome is delayed by using warning messages and progress bars. Such notifications can also be useful in 3D UIs, as in the cases of loading a new environment or handling a global effect that takes time to process.
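A minimal sketch of this notification pattern is shown below; the names are hypothetical, and print stands in for whatever progress display the UI provides.

    # Minimal sketch: acknowledge a slow operation immediately and keep reporting progress,
    # rather than leaving the user to wonder whether the command was registered.
    import time

    def load_environment(chunks=5, on_progress=print):
        on_progress("Loading new environment ...")      # immediate acknowledgment
        for i in range(chunks):
            time.sleep(0.1)                             # stand-in for real loading work
            on_progress(f"  {int((i + 1) / chunks * 100)}% loaded")
        on_progress("Done.")

    load_environment()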

Another design guideline is to provide informative feedback (Shneiderman and Plaisant 2010). This may sound intuitive, but it can be challenging to deliver the appropriate amount of information back to the user. Too little information may be unhelpful to the user and quickly disregarded, such as displaying an error code in place of a useful error message. However, too much information may be overwhelming and also quickly disregarded. For instance, consider the “blue screen of death” that would occasionally plague users on older versions of the Microsoft Windows operating system. The screen presented a “wall” of text: a notification that there was an issue, a short description of the issue, what the user should do immediately, what to do if that immediate action did not work, and a technical error message for debugging the problem.

The concept of informative feedback also extends beyond textual information. For example, consider the ray-casting technique (Chapter 7, section 7.5.1). The results of some early research studies differed on the usability of the technique due to varying feedback regarding the direction of the ray. Poupyrev, Weghorst et al. (1998) used a short line segment attached to the user’s hand to represent the direction of the ray and found that users struggled to use their implementation of the technique when additional feedback was not provided. On the other hand, Bowman and Hodges (1997) used a dynamic line segment that would extend out to touch the virtual object being pointed to and found that users were able to easily perform selections with it (see Figure 4.13).

Figure 4.13 The ray-casting technique on the left provides better informative feedback than the ray-casting technique on the right. (Image courtesy of Ryan P. McMahan)
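The dynamic line-segment feedback used by Bowman and Hodges (1997) can be approximated with a simple rule: draw the ray from the hand out to the nearest intersected object, or to some maximum length if nothing is hit. The sketch below is illustrative only; in a real system the hit distances would come from the scene’s ray-casting query.

    # Minimal sketch of informative ray feedback: the rendered ray segment extends from
    # the hand to the nearest intersected object, so the user sees exactly what is hit.
    def ray_feedback_length(hit_distances, max_length=100.0):
        """Return how far to draw the ray: to the nearest hit, or max_length if nothing is hit."""
        hits = [d for d in hit_distances if d > 0]
        return min(hits) if hits else max_length

    print(ray_feedback_length([7.2, 3.5]))   # -> 3.5: the ray visibly ends at the nearest object
    print(ray_feedback_length([]))           # -> 100.0: no hit, so draw a long ray into the scene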

Error Recovery

The design principle of error recovery is that the system should help the user recognize, diagnose, and recover from errors (Nielsen 1994). In section 4.3.2, we discussed prevention, which helps prevent the user from committing errors. However, recovery is focused on helping the user after an error has already been committed. This also applies to outcomes that were originally desired by the user but then evaluated as not satisfying the user’s goal.

We’ll discuss two design guidelines for providing recovery. The first of these is to provide easy-to-reverse actions (Shneiderman and Plaisant 2010). When possible, actions should be able to reverse their own outcomes, and this should be easy to perform. If this is the case, the user will be able to reuse the same knowledge of the control that caused the error to reverse the error. For example, if the user mistakenly moves an image to the wrong position in a photo-editing application, the user can simply move the image back to its original position or to a new desired position. Many 3D interaction techniques with one-to-one correspondences between input and output afford easy-to-reverse actions. However, nonisomorphic mappings can result in difficult-to-reverse outcomes. See Chapter 7, section 7.3.1 for more details on isomorphism.

When actions cannot be used to reverse their own outcomes, UX designers should consider the guideline of providing additional controls that can reverse the outcomes. The undo and redo features provided in many word processors and photo-editing applications are great examples of this guideline. These additional controls allow users to easily reverse the latest outcomes, whether the unit of reversibility is a single action, a data-entry task, or a complete group of actions (Shneiderman and Plaisant 2010). A 3D UI example of providing additional controls for reversing outcomes is the ZoomBack technique (see Chapter 8, section 8.6.1). With the ZoomBack travel technique, the user can move to a new location with one control and then return to the previous position with a second control.
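Reversible actions are commonly implemented with a command-style history in which each action stores enough information to undo its own outcome. The following is a minimal sketch with hypothetical names, not the mechanism of any particular application.

    # Minimal sketch of reversible actions via an undo/redo history. Each action records
    # the old and new positions so it can reverse its own outcome.
    class MoveAction:
        def __init__(self, obj, old_pos, new_pos):
            self.obj, self.old_pos, self.new_pos = obj, old_pos, new_pos

        def do(self):
            self.obj["position"] = self.new_pos

        def undo(self):
            self.obj["position"] = self.old_pos

    class History:
        def __init__(self):
            self.done, self.undone = [], []

        def perform(self, action):
            action.do()
            self.done.append(action)
            self.undone.clear()          # a new action invalidates the redo branch

        def undo(self):
            if self.done:
                action = self.done.pop()
                action.undo()
                self.undone.append(action)

        def redo(self):
            if self.undone:
                action = self.undone.pop()
                action.do()
                self.done.append(action)

    obj = {"position": (0, 0, 0)}
    history = History()
    history.perform(MoveAction(obj, old_pos=(0, 0, 0), new_pos=(1, 0, 2)))
    history.undo()
    print(obj["position"])   # -> (0, 0, 0): the move was reversed
    history.redo()
    print(obj["position"])   # -> (1, 0, 2): the move was reapplied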

4.3.5 General Design Rules

Some design principles are important at every stage of user action. We consider these design rules general, as they usually apply to both the Gulf of Execution and the Gulf of Evaluation. If a UI is not designed for a diverse set of users, some intended users might be excluded from using the system due to disabilities or other limiting conditions. Even if a UI is physically accessible, users may struggle to use a system if they do not understand the system’s vocabulary. Finally, systems that require a great deal of memory recall to use can be imposing and frustrating for users. Below, we discuss the design principles of accessibility, vocabulary, and recognition.

Accessibility

Accessibility is the design principle that an interface is usable by all intended users, despite disabilities or environmental conditions (Johnson 2014). This principle echoes Shneiderman and Plaisant’s (2010) concept of providing “universal usability” by recognizing the needs of diverse users and facilitating the transformation of content. This includes facilitating users with disabilities, such as vision-impaired, hearing-impaired, and mobility-impaired users. The United States Access Board provides several guidelines for users with disabilities, including keyboard and mouse alternatives, color settings, font settings, contrast settings, alternative text for images, webpage frames, links, and plug-ins (http://www.access-board.gov/508.htm).

Accessibility has been and continues to be a major problem for many 3D UIs. Users that require glasses for vision correction are excluded from using many HMDs, as the visual displays provide little room for glasses between the eyes and display optics. 3D UI designers often fail to provide transformations of tones into visual signals for hearing-impaired users. Mobility-impaired users are perhaps the most excluded group, as the inherent nature of 3D UIs requires more physical movements than traditional interfaces.

However, strides have been made to improve the accessibility of 3D UIs. Some HMDs, such as the Samsung Gear VR, provide a focus adjustment for nearsighted users. Bone-conduction headphones similarly help users with eardrum damage to hear sounds. Researchers have even found evidence that a crosshair always rendered in the same position on the user’s display will significantly improve stability for users with balance impairments (Shahnewaz et al. 2016). See Figure 4.14 for an example. UX designers will need to continue exploring solutions for users with disabilities and other limiting conditions to ensure the accessibility of 3D UIs.

Figure 4.14 A static reference frame has been shown to significantly improve accessibility for users with balance impairments. (Image adapted from Shahnewaz et al. 2016)
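The key implementation idea behind such a static reference frame is to draw it relative to the display (or the user’s head) rather than the world, so it stays in the same screen position no matter how the viewpoint or the virtual scene moves. The following is a minimal, engine-agnostic sketch with hypothetical callbacks.

    # Minimal sketch: scene content is drawn in world space and therefore moves with the
    # viewpoint, while the reference crosshair is drawn in screen space and never moves.
    def draw_frame(world_objects, head_pose, draw_in_world, draw_on_screen):
        for obj in world_objects:
            draw_in_world(obj, head_pose)           # scene content moves with the world
        draw_on_screen("crosshair", x=0.5, y=0.5)   # reference frame pinned to screen center

    draw_frame(
        world_objects=["building", "terrain"],
        head_pose={"yaw": 35.0},
        draw_in_world=lambda obj, pose: print(f"draw {obj} with head yaw {pose['yaw']}"),
        draw_on_screen=lambda name, x, y: print(f"draw {name} at screen ({x}, {y})"),
    )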

Vocabulary

Another general design principle is to use the vocabulary of the intended users (Johnson 2014). UX designers are sometimes guilty of using their own jargon and terminology instead of those used by their users. Asking about and documenting terminology during the contextual inquiry process (see section 4.4.2) can help remedy this issue. UX designers should also inquire about vocabulary differences among all their stakeholders. For example, if a system is to be used by accountants and non-accountants, the designers should not use the term “general ledger” to refer to the collection of all accounts. Instead, they should consider using “all accounts” or similar terminology for the feature.

Vocabulary is not usually an issue when designing 3D UIs because visual representations and virtual objects are normally preferred to text. However, vocabulary is important for some 3D UIs, especially those designed to facilitate learning or training. For example, Greunke and Sadagic (2016) had to observe the vocabulary of landing signal officers in order for their voice recognition system to support the commands issued by the officers during VR-based training (see Chapter 9, section 9.6 for more details on using voice commands). In a similar situation, Eubanks et al. (2016) used the same terminology as the Association of Surgical Technologists to ensure that their intended users would understand their VR system for training on personal protective equipment protocols (see Figure 4.15).

Figure 4.15 Vocabulary can be very important for some 3D UIs, particularly those focused on learning and training. (Image courtesy of Ryan P. McMahan)

Recognition

The design principle of recognition is that a UI should provide the knowledge required for operating the UI in the UI instead of requiring users to recall it from memory (Nielsen 1994). Recognition is essentially the activation of a memory given the perception of sensory stimuli similar to those present when the memory was encoded. On the other hand, recall is the activation of a memory in the absence of similar stimuli. Recalling information is much more difficult than recognizing information (Johnson 2014). This is why users should not be required to recall information from one screen to another, and instead, UIs should be designed to minimize the memory load on the user (Shneiderman and Plaisant 2010). Norman (2013) characterized this distinction as “knowledge in the world” (recognition) as opposed to “knowledge in the head” (recall).

Let’s review three design guidelines for facilitating recognition over recall. First, information likely required by the user should be placed in the context of its use. For example, most databases use associative tables to represent many-to-many relationships by referencing the primary keys of other data tables (e.g., “customer with ID 4 purchased product with ID 8”). However, when displaying these relationships, the UI should fetch the relevant information of each association to avoid requiring the user to recall what every ID refers to (e.g., “John Doe purchased coffee filters”). This guideline is also applicable to 3D UIs. For example, maps and signs can be used to avoid requiring users to recall spatial knowledge during wayfinding tasks (see Chapter 8, section 8.9.2 for more details).
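Returning to the database example, the lookup can happen just before display, so the interface presents recognizable names rather than internal keys. The sketch below uses made-up data and hypothetical names.

    # Minimal sketch: resolve internal IDs to human-readable names before display,
    # so the user recognizes the information instead of recalling what each ID means.
    customers = {4: "John Doe"}
    products = {8: "coffee filters"}
    purchases = [(4, 8)]   # associative table: (customer_id, product_id)

    for customer_id, product_id in purchases:
        print(f"{customers[customer_id]} purchased {products[product_id]}")
        # displays "John Doe purchased coffee filters" rather than "customer 4, product 8"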

A second design guideline for recognition is to let users know what their options are. Graphical UIs and their menu systems have demonstrated that it is easier to see and choose functions than to recall and type commands, as command-line interfaces require (Johnson 2014). This guideline is highly relevant for voice commands (Chapter 9, section 9.6) and gestural commands (Chapter 9, section 9.7). Both of these interfaces are by default invisible to the user, unless an overview of the available functions is displayed. Hence, users are by default required to recall verbal commands and specific gestures instead of recognizing their options. To address this, researchers have demonstrated that “feedforward” mechanisms, which provide information about available commands prior to their execution, should be employed (Bau and Mackay 2008).

A third design guideline for promoting recognition over recall is to use pictures and visual representations when possible (Johnson 2014). This directly relates to the principle of visibility and using icons and symbols to represent controls in a recognizable manner (see section 4.3.1). However, it also applies to easily recognizing feedback. For example, most operating systems have distinct symbols for error, warning, and information messages. 3D UI designers can leverage real-world objects in a similar way. For example, Nilsson et al. (2014) used a stop sign to help users recognize that they were unintentionally moving within the physical tracking space while using a walking-in-place technique.

4.3.6 Other Design Rules

While the design principles and guidelines above cover a broad range of issues that designers should be concerned with, they are by no means complete or exhaustive. There are numerous design guidelines that researchers have proposed and validated through user studies. This chapter simply provides an organized overview of the most commonly cited design rules. For those readers interested in learning about other design rules, we have recommended some additional readings at the end of this chapter.

4.4 Engineering the User Experience

Despite being armed with the best set of design principles and guidelines, even experienced UX designers cannot guarantee an optimal user experience for every one of their designs. This is largely because many systems have their own unique purpose, and even those with similar purposes may be intended for use in contrasting real-life settings. As a result, general design principles and guidelines, and even years of UX experience, may be insufficient to foresee the nuances of a system’s use and potential pitfalls of a particular UX design. Instead, delivering a system that enables an excellent user experience requires an explicit engineering process (Macleod et al. 1997). In this section, we discuss engineering the user experience and will speak of UX engineering, as opposed to UX design.

The key to UX engineering is the process itself. A process acts as a guiding structure that keeps the UX team members on track with checklists that ensure complex details are not overlooked. In UX engineering, the term lifecycle is used to refer to this structured process, which often consists of a series of stages and corresponding activities (Hartson and Pyla 2012). A number of lifecycles have been proposed and investigated for usability and UX engineering (e.g., Hartson and Hix 1989; Mayhew 1999; Helms et al. 2006; and Kreitzberg 2008). However, in this section, we discuss the stages and activities involved in the Wheel lifecycle originally proposed by Helms et al. (2006).

The Wheel lifecycle employs four primary stages for UX engineering: (i) Analyze, (ii) Design, (iii) Implement, and (iv) Evaluate. The Analyze stage is about understanding the user’s current activities and needs. The Design stage involves designing interaction concepts. The Implement stage involves realizing the design concepts in the form of prototypes or fully functioning systems. Finally, the Evaluate stage serves to verify and refine the interaction design.

In addition to the four stages, a key aspect of the Wheel lifecycle is that it is an iterative process. After the Evaluate stage is completed, the UX engineering process can continue by iterating back through the four stages, starting with the Analyze stage. Additionally, each individual stage can be immediately repeated if its outcomes were unsatisfactory, as seen in Figure 4.16. In our experiences with employing the Wheel lifecycle for UX engineering, we have found these iterative opportunities to be extremely important.

Figure 4.16 An illustration of the Wheel lifecycle process and its opportunities for iteration. (Image adapted from Hartson and Pyla 2012)

4.4.1 System Goals and Concepts

Before the UX engineering process, there must be a reason for developing a UI. Perhaps users consistently complain about a poorly designed system that wastes their time and frustrates them. Or maybe the upper management of a large company has decided to create a new product to better compete in the market. Or possibly a moment of inspiration has spurred an individual to pursue a revolutionary idea that will change the world. Regardless of its origins, there is always a reason behind starting the UX engineering process.

Improving Usability

One of the most common goals of UX engineering is to improve usability. The term usability refers to the qualities of a system that impact its effectiveness, efficiency, and satisfaction (Hartson and Pyla 2012). The effectiveness of a system depends on the completeness and accuracy with which users can achieve their intended goals. For example, if a user intends to create a 3D building model, the 3D UI would be ineffective if it could not save the model to a common 3D file format (incomplete outcome) or if it was not expressive enough to allow a user to translate the building design in his head to the system (inaccurate outcome). The efficiency of a system depends on the resources expended in order to accomplish a goal. If the user must spend a great amount of time and effort maneuvering around a virtual object in order to carefully manipulate a 3D widget, the 3D UI would be extremely inefficient. Finally, satisfaction with a system depends on whether users and other people affected by the system consider it comfortable and acceptable to use.

While the effectiveness of a system can be clearly determined based on outcomes, its efficiency and satisfaction qualities are dependent upon several other factors (Shneiderman and Plaisant 2010; McMahan et al. 2014). Learnability is an important aspect of a system’s usability that refers to how quickly a novice user can comprehend the state of the system and determine what to do in order to complete a task. Retainability is similar to learnability but concerns how well users maintain their knowledge of the system after not using it for a period of time. Ease of use concerns the simplicity of a system from the user’s point of view and how much mental effort the user must exert each time the system is used. Speed of performance is how long it takes the user to carry out a task. Rate of errors concerns how many and what types of errors the user makes while attempting to complete a task. User comfort often depends on the ergonomics of a system (see Chapter 3, section 3.5) and the amount of physical exertion required of the user. These are some of the common factors that affect a system’s usability, but this is certainly not a complete list.

Striving for Usefulness

Another common goal of UX engineering is to create a useful system. Hartson and Pyla (2012) describe usefulness as the capacity of a system to allow users to accomplish the goals of work or play. It is important to distinguish usefulness from usability. A system can be highly usable (e.g., easy to learn, easy to use), but if its function serves little purpose in a user’s life, it is not useful to that user. Therefore, a useful system is one that provides needed functionality and serves a purpose. Lund (2001) has identified several qualities of useful systems, including aiding in productivity, providing control over daily activities, making it easier to get things done, and saving the user time when tackling a task.

Emotionally Impacting the User

More recently, UX design has focused not only on usability and usefulness but also on the emotional aspects of the user experience. Emotional impact refers to the affective aspects of a user experience that influence the emotions of the user (Hartson and Pyla 2012). As Shih and Liu (2007) point out, users are no longer satisfied with usability and usefulness; they want “emotional satisfaction” from systems. Emotional impact helps to explain why some products are extremely popular despite not differing much from other products in terms of usability and usefulness. Characteristics of a product such as novelty, elegance, sleekness, beauty, or coolness, rather than its functionality, can influence users’ reactions.

There are several ways that a system can emotionally impact a user. Kim and Moon (1998) have identified seven dimensions to describe emotional impacts—attractiveness, awkwardness, elegance, simplicity, sophistication, symmetry, and trustworthiness. Hartson and Pyla (2012) also include pleasure, fun, joy, aesthetics, desirability, novelty, originality, coolness, engagement, appeal, self-expression, self-identity, pride of ownership, and a feeling of contributing to the world. Norman (2004) presents a model of emotional impact including three types of emotional processing—behavioral, visceral, and reflective. Behavioral processing concerns pleasure and a sense of effectiveness (stemming from a system’s usability and usefulness). Visceral processing involves the “gut feelings” and emotions that are evoked by a system’s appearance, attractiveness, and aesthetics. Finally, reflective processing involves the user’s self-image and identity while using a system.

As 3D UIs become more commonplace in consumer products, designers will need to pay careful attention to emotional impact, not just usability and usefulness. Emotional impact may be especially relevant to 3D UIs because of the powerful and visceral experiences that can be provided by technologies such as VR. We recommend reading Kruijff et al. (2016) for an overview of the challenges and methodologies for emotionally impacting users in 3D UIs.

The System Concept

Regardless of the goal behind engineering a user experience, every UX engineering process should begin with a system concept. A system concept is a concise summary of the goals of an envisioned system or product (Hartson and Pyla 2012). It is essentially a mission statement for the system that includes the usability, usefulness, and emotional impacts that the engineering process should strive to achieve. It represents the user experience that the system will provide.

4.4.2 Requirements Analysis

Once the system concept and goals of a project have been identified, the first step of the UX engineering process is to analyze the needs of the intended users of the system. This is referred to as requirements analysis (Rosson and Carroll 2001). It involves analyzing the users’ work domain, work activities, and needs in order to understand the current situation before embarking on a new design. Note that by the term “work,” we are referring not only to tasks performed in a job, but also to any activities that help fulfill the purpose of the envisioned system, including things like learning, play, and creative activities. By better understanding the intended users, UX designers can better identify the requirements that the new system should address to support the users’ work (or play). In the sections below, we will discuss the subactivities of contextual inquiry, contextual analysis, and requirements extraction.

Contextual Inquiry

Contextual inquiry is the process of planning for and conducting interviews and observations of the work tasks being performed in the field, in order to gather detailed descriptions of the routines and procedures that occur within the work domain (Shneiderman and Plaisant 2010; Hartson and Pyla 2012). The purpose of contextual inquiry is to gain an understanding of how work activities are completed using the current system (computerized, physical, or process-based) within the target work context. There are many details and steps to conducting a proper contextual inquiry (Hartson and Pyla 2012), but the following is a high-level list of the main activities:

Learn about the organization and work domain before the visit

Prepare an initial set of goals and questions before the visit

Make arrangements to observe and interview key individuals

Establish a rapport with the users during the visit

Take notes and recordings (video and audio) during observations and interviews

Collect work artifacts (e.g., copies of paper forms and photos of physical items)

Contextual Analysis

After completing the contextual inquiry, the next step of UX engineering is to analyze the collected data. Contextual analysis is the process of systematically organizing, identifying, interpreting, modeling, and communicating the data collected from a contextual inquiry (Hartson and Pyla 2012). The purpose of contextual analysis is to understand the work context that the new system is being engineered for. This involves three primary aspects of work—the people themselves, their current work activities, and the environment that they work in (Rosson and Carroll 2001).

One aspect of contextual analysis is to identify and model the stakeholders involved. A stakeholder is a person that is either involved in or impacted by the current work practices (Rosson and Carroll 2001). Most stakeholders have a work role, which is a collection of responsibilities that accomplish a job or work assignment (Beyer and Holtzblatt 1998). Additionally, since stakeholders do not work in a vacuum, the social context of the workplace should be accounted for, such as how individuals and groups are organized into larger structures, and how people depend on each other to accomplish their jobs (Rosson and Carroll 2001). User models, such as organizational charts and stakeholder diagrams, can be used to represent stakeholders, their work roles, and their work relationships (Hartson and Pyla 2012).

Another aspect of contextual analysis is to identify, interpret, and model the current work activities. A popular approach to accomplish this is a hierarchical task analysis (Rosson and Carroll 2001), in which individual tasks and subtasks are identified and organized into a hierarchy to capture the details and relationships of the work activities (Diaper and Johnson 1989). Such task analyses are also known as task models (Hartson and Pyla 2012).

Finally, a contextual analysis should also capture and represent the work environment through environment models. There are two common approaches to modeling the environment—artifact models and physical models. An artifact model represents how stakeholders use physical or electronic elements during work activities (Hartson and Pyla 2012). Artifact models are particularly important for engineering systems that are meant to replace physical elements with electronic ones, such as the electronic receipts that many restaurants and taxis use now. A physical model is basically an artifact model that depicts the artifacts, stakeholders, and activities in a physical setting (Hartson and Pyla 2012). Physical models show the placement and movements of people and objects within the physical setting. Such models are important for designing systems that are meant to change the work environment. For example, some restaurants use electronic kiosks for placing orders to reduce wait lines.

Though the purpose of contextual analysis is to understand the user, task, and environment, it also requires systematic steps to capture those models and communicate them. Otherwise, incomplete or inaccurate models may be derived, which can misguide the engineering process. As with contextual inquiry, there are many details and steps involved in conducting a proper contextual analysis, but we present only the high-level activities here.

The first step is to review the data collected from the contextual inquiry and to synthesize work activity notes. A work activity note is a simple and concise statement about a single concept, topic, or issue observed during the contextual inquiry (Hartson and Pyla 2012). These notes should capture the key points and issues discussed by the engineering team while reviewing and drawing conclusions from the contextual inquiry data (Shneiderman and Plaisant 2010).

The next step is to organize the work activity notes into an affinity diagram. An affinity diagram is a hierarchical representation of the work activity notes (Shneiderman and Plaisant 2010). It serves to sort and organize work activity notes by their similarities and common themes to highlight important concepts, patterns, and issues (Hartson and Pyla 2012). Once organized, an affinity diagram can be used to inform the creation of user, task, and environment models, like the ones discussed above.

Finally, the affinity diagram and models can be used to construct problem scenarios and claims. A problem scenario is a story about one or more personas carrying out an activity in the current work practice (Rosson and Carroll 2001). A persona is a hypothetical person with a specific work role and personality (Cooper 2004). The purpose of writing problem scenarios is to reveal aspects of general stakeholders, their work activities, and their work environment that should be considered during design. Because of this, problem scenarios are often accompanied by a list of claims. A claim is an aspect of a scenario that has important effects on the personas involved and is expressed as a tradeoff of its hypothesized positive and negative effects (Rosson and Carroll 2001). Claims are especially important for extracting requirements.

Requirements Extraction

Contextual analysis is not enough to proceed to the design process. Design requirements are the bridge between analysis and interaction design (Hartson and Pyla 2012). Hence, the UX team must extract requirements from the outcomes of the contextual analysis.

A requirement is a statement of something the system must provide in order to fulfill some need of the users. Shneiderman and Plaisant (2010) identify three types of requirements: (i) functional, (ii) performance, and (iii) interface. A functional requirement states what the system is required to do, such as “the system shall permit the user to save the current state of the virtual environment.” On the other hand, a performance requirement states how the system should do what it is supposed to do (e.g., “the system shall provide three file slots for saving the current state”). Finally, an interface requirement states what characteristics are required of the UI. An example is “the system shall display a progress bar while saving the current state.”

The various types of requirements should be extracted from the models, diagrams, scenarios, and claims created during the contextual analysis. This process involves identifying functions that the system must provide, acceptable or desired qualities of those functions, and how those functions will be represented in the interface. For each identified requirement, the UX team should create a requirement statement.

A requirement statement is a structured description of what shall be required of the system and includes categories for each requirement. Requirement statements can also include a rationale that justifies the requirement and other notes that clarify or document any specifics regarding the requirement. The UX team can create a requirements document by organizing all of their requirement statements by the major and minor categories.

Finally, it is important to note that requirements and the requirements document can and should be updated as UX engineering progresses. While most requirements will be identified during requirements analysis, new requirements will likely be identified during the design, prototyping, or evaluation phases. Every new requirement should be formalized and added to the requirements document.

4.4.3 The Design Process

After requirements have been extracted and documented, the design process can begin. It involves using design tools, perspectives, and approaches to explore and identify good design ideas. These ideas are captured and communicated to the rest of the UX team and the stakeholders via design representations. Finally, design production tools are used to refine the design and its complete specification.

Design Tools

Three important tools for exploring design options are ideation, sketching, and critiquing. Ideation is the process of quickly brainstorming ideas for designs in a creative and exploratory manner, and sketching is “the rapid creation of free-hand drawings expressing preliminary design ideas” (Hartson and Pyla 2012). Both ideation and sketching should be used in a collaborative group process to generate potential designs. Critiquing is then used to review and judge those designs, in order to filter and avoid wasting time on poor ideas. However, it is important that ideation and critiquing occur at separate times to avoid stifling creativity and suppressing potentially great ideas during ideation (Hartson and Pyla 2012).

Design Perspectives

There are also three design perspectives or mindsets that are commonly used to guide ideation, sketching, and critiquing (Hartson and Pyla 2012). The first is the interaction perspective, which is concerned with how users interact with the system. Usability is the primary focus of the interaction design perspective. As such, the user action models (see section 4.2.2) provide a foundation for the perspective. The second is the ecological perspective, which is concerned with how the system is used within its surrounding environment. It primarily focuses on the usefulness of the system and is informed by activity theory (see section 4.2.4). Finally, the third is the emotional perspective, which is concerned with the emotional impact of the system and what users value about it. The emotional design perspective focuses on the three types of emotional processing (see section 4.4.1).

Design Approaches

There are a number of approaches that UX designers can take when designing a new system or UI. Activity design is the approach of focusing on the system’s functionality and the activities that it will support (Rosson and Carroll 2001). An ecological perspective is often kept during activity design, as the UX designers generate ideas to address the functional and performance requirements identified in the requirements document. Outcome-oriented design rules (section 4.3.3) should be taken into consideration during activity design.

While activity design generally addresses the big picture, information design and interaction design are both concerned with the interface requirements. Information design is the design approach of focusing on the representation and arrangement of the UI (Rosson and Carroll 2001). It is directly concerned with the Gulf of Evaluation and how users evaluate whether their goals have been accomplished. As such, goal-oriented and evaluation-oriented design rules (sections 4.3.1 and 4.3.4) are carefully adhered to during information design. On the other hand, interaction design is the approach of focusing on the mechanisms for accessing and controlling the UI (Rosson and Carroll 2001). It is concerned with the Gulf of Execution and how users plan and execute their action plans. As such, UX designers should consider execution-oriented design rules (section 4.3.2) during interaction design. Both information design and interaction design are normally conducted with an interaction perspective in mind.

Another design approach is participatory design, which is “the direct involvement of people in the collaborative design of the things and technologies they use” (Shneiderman and Plaisant 2010). The fundamental concept of participatory design is that users should be involved in the designs that they will be using and should have equal inputs into the design process (Hartson and Pyla 2012). However, extensive user involvement can be costly and can lengthen the design process (Shneiderman and Plaisant 2010). Additionally, users are unlikely to be familiar with design perspectives and guidelines, which may force UX designers to compromise on subpar ideas generated by the users (Ives and Olson 1984). Nevertheless, UX teams interested in participatory design should consider various approaches, such as the PICTIVE approach developed by Muller (1991). In PICTIVE, users sketch and create low-fidelity prototypes that are then recorded during scenario walkthroughs for presentation to other stakeholders. The UX designers then use these videos and their associated feedback to inform the design process.

Design Representations

Throughout the design process, it is important for UX designers to document their conceptual models of the new system. Perhaps more importantly, they need to be able to represent and communicate their conceptual models to each other, the users, and the stakeholders. Design representations can fulfill these needs.

There are various ways to represent and communicate a design. One of the simplest ways is through a metaphor—a known concept used to communicate and explain a new concept by analogy (Rosson and Carroll 2001). Perhaps the most well-known UI metaphor is the computer desktop. As personal computers were becoming popular tools for businesses, UX designers used the desktop metaphor to help users understand how to interact with the computers. The computer desktop provided a space to place and organize virtual files and folders. Users could throw away the files and folders by moving them to the virtual trashcan. Additionally, some systems provided email functionality through a virtual inbox positioned on the desktop. As seen with the desktop example, a single metaphor can lead to other related metaphors as the design process unfolds. Together, they can greatly help users understand how to use a system.

Other design representations include the design scenario and the storyboard. A design scenario is a story about one or more personas carrying out an activity with the new system (Rosson and Carroll 2001). The purpose of a design scenario is to represent and communicate how users will interact with a system. Design scenarios can be written from an interaction, ecological, or emotional perspective. Some scenarios may include all three perspectives for a holistic view of the new system. Closely related to the design scenario is the storyboard. A storyboard is a sequence of drawings illustrating how the envisioned system will be used (Hartson and Pyla 2012). While a design scenario is a written representation of a system’s design, a storyboard visually depicts a scenario of users interacting with the system.

A physical mockup—a 3D tangible prototype or model of a device or product—is another type of design representation (Hartson and Pyla 2012). Physical mockups are essentially physical and embodied sketches that can be touched and held. They allow UX designers to act out design scenarios. Additionally, they help in identifying potential issues with the ergonomics of a device or system. Physical mockups have traditionally been created with paper, cardboard, and tape, but with the advent of 3D printers, plastic mockups can also be easily created. Physical mockups are especially relevant to the design of special-purpose input devices for 3D UIs.

4.4.4 Prototyping the Design

The implementation stage of the Wheel lifecycle is about bringing the design to life. It is the realization of the interaction design. During the UX engineering process, implementation often takes the form of a prototype, an early representation of the design built to model, evaluate, and iterate on the design of a product (Hartson and Pyla 2012). Below, we discuss various characteristics and aspects of prototypes, including their benefits and drawbacks, breadth and depth, fidelity, and level of interactivity.

Benefits and Drawbacks

There are several benefits to using prototypes. They provide a concrete representation of a design that can be communicated to others. They allow for “test drives” and evaluations of designs. For the stakeholders, prototypes provide project visibility and help in transitioning from an old system to the new one. However, there are also some drawbacks to prototypes. Some stakeholders may associate the limited functionality of many prototypes with a poor design, though the design has yet to be fully implemented. On the other hand, some stakeholders may assume “magic” (simulated) functionality is real and expect the final product to deliver more than what is feasible. For the UX team, it is important that the purpose and functionality of each prototype is clearly communicated to the stakeholders before it is presented.

Breadth and Depth

The purpose of each prototype should be identified before it is created, as a prototype can serve many different purposes. The breadth of a prototype concerns how many features are implemented, while its depth represents how much functionality the features provide (Hartson and Pyla 2012). Different combinations of breadth and depth yield different types of prototypes. For example, a horizontal prototype is very broad in features but provides little depth of functionality. As such, horizontal prototypes are great for evaluating how users will navigate a design. On the other hand, a vertical prototype contains as much depth of functionality as possible for one feature. Vertical prototypes are beneficial for exploring the design of a particular feature in detail.

Another type of prototype is the T prototype, which realizes much of the design at a shallow level (the horizontal top of the T) but covers one or a few features in depth (the vertical part of the T) (Hartson and Pyla 2012). T prototypes essentially combine the advantages of both horizontal and vertical prototypes by allowing users to navigate the system and explore some features in detail. Finally, a local prototype is a prototype limited in breadth and depth that is focused on a particular isolated feature of the design (Hartson and Pyla 2012). Local prototypes are normally used to evaluate design alternatives for specific portions of the UI.

Prototype Fidelity

The fidelity of a prototype refers to how completely and closely a prototype represents the intended design (Hartson and Pyla 2012). A low-fidelity prototype provides impressions of the intended design with little to no functionality. A paper prototype made with paper, pencil, and tape is an example of a low-fidelity prototype that can be rapidly implemented to explore design decisions. A medium-fidelity prototype provides the look and feel of the intended design with rudimentary functionality. There are numerous software applications that can facilitate the development of medium-fidelity prototypes for desktop, web, and mobile systems, including OmniGraffle, Balsamiq Mockups, PowerPoint, and basic HTML. However, tools for 3D UI prototyping are much less mature. Finally, a high-fidelity prototype is a prototype that closely resembles the final product. The aesthetics of a high-fidelity prototype should be nearly identical to the final product’s look and feel. Additionally, a high-fidelity prototype should have most, if not all, features fleshed out with full functionality. Due to their functional demands, high-fidelity prototypes are often programmed in the same language as the final product. In general, the fidelity of a prototype will increase with each of its iterations.

Prototype Interactivity

The interactivity of a prototype is the degree to which interactions are realized (Hartson and Pyla 2012). There are four common levels of interactivity that prototypes are developed for. At the lowest level, animated prototypes visualize predetermined interactions and therefore offer no interactivity to the user. A video depicting how a user would interact with an interface design is an example of an animated prototype. This form of prototyping is useful for communicating the vision of a 3D UI without investing in full-scale development and is often used in early marketing, such as crowdfunding campaigns. Scripted prototypes offer more interactivity than animated prototypes, but they require that the user follow a scripted sequence of interactions. Vertical and T prototypes are most often used for scripted prototypes, as the scripted interactions are used to explore the depth of a particular feature. Fully programmed prototypes, which implement all interactive features and backend functionality, offer the highest level of interactivity. However, these prototypes are costly to develop.

An alternative to pursuing a fully programmed prototype is to create a Wizard of Oz prototype. Wizard of Oz prototypes can provide deceptively high levels of interactivity with little functionality actually implemented. These prototypes rely on a hidden UX team member who observes the user’s actions and then causes the interface to respond appropriately (like the wizard behind the curtain in the famous film). For 3D UIs, this prototyping method can be quite useful because the actual implementation of many 3D interaction techniques and UI metaphors can be very complex. For example, a 3D UI designer may not want to go to the trouble of implementing a speech or gesture recognition interface if it is just one of the options being considered. Instead, a Wizard of Oz prototype can allow an evaluator to mimic the actions that the system would take when a user issues a gesture or voice command (Hayes et al. 2013).
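To make the idea concrete, the following is a minimal sketch, written by us rather than drawn from Hartson and Pyla (2012) or Hayes et al. (2013), of how such a setup might be wired in Python for a voice-command prototype. The participant believes the system recognizes speech; in reality, a hidden wizard types the command that was spoken, and the prototype reacts to it exactly as it would to a recognizer’s output. All names here (wizard_console, prototype_loop, COMMANDS) are hypothetical.

# Wizard of Oz sketch: a hidden operator stands in for a speech recognizer.
import queue
import threading

# Voice commands the prototype pretends to recognize.
COMMANDS = {"select", "delete", "undo", "teleport"}

# Shared event queue between the wizard and the participant-facing prototype.
events: "queue.Queue[str]" = queue.Queue()

def wizard_console() -> None:
    # Hidden operator's console: the wizard watches the participant and
    # injects the command the system supposedly "recognized".
    while True:
        cmd = input("wizard> ").strip().lower()
        if cmd == "quit":
            events.put("quit")
            break
        if cmd in COMMANDS:
            events.put(cmd)
        else:
            print(f"unknown command: {cmd!r}")

def prototype_loop() -> None:
    # Participant-facing prototype: it responds to "recognized" commands,
    # unaware of whether they came from a recognizer or a wizard.
    while True:
        cmd = events.get()
        if cmd == "quit":
            break
        print(f"[system] executing voice command: {cmd}")

if __name__ == "__main__":
    ui = threading.Thread(target=prototype_loop, daemon=True)
    ui.start()
    wizard_console()
    ui.join()

In practice, the wizard’s console would run on a second machine or otherwise be hidden from the participant, and the command handler would drive the actual 3D scene rather than print to a console; the point is only that the interactive behavior can be evaluated before any recognition technology is built.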

4.4.5 Evaluating Prototypes

Once a prototype has been created, it is time to evaluate it to understand how well the underlying design meets its goals with respect to usability, usefulness, and emotional impact. There are a number of methods that can be used to evaluate prototypes. In Chapter 11, “Evaluation of 3D User Interfaces,” we discuss several methods for evaluating 3D UIs. However, here we discuss some qualities of evaluation methods that the reader should be familiar with before reading Chapter 11.

Formative versus Summative Evaluations

As discussed above, the Wheel UX lifecycle is an iterative process. During this process, formative evaluations will inform the process and the design of the system. These evaluations help identify bad design choices and usability issues before the design is finalized. In other words, formative evaluations help to “form” the design. On the other hand, summative evaluations usually occur during the final iteration of the UX engineering process and focus on collecting data to assess the quality of the design. Summative evaluations help to “sum up” the design. Both types of evaluations are important to the UX engineering process.

Rapid versus Rigorous Evaluations

A major consideration in choosing an evaluation method is whether it is rapid or rigorous (Hartson and Pyla 2012). Rapid evaluations are fast and usually inexpensive to conduct. Examples of rapid evaluation methods include cognitive walkthroughs and heuristic evaluations (see Chapter 11, section 11.2, for details on these two methods). On the other hand, rigorous evaluations are formal, systematic methods that attempt to maximize the information gained from an assessment while minimizing the risk of errors or inaccuracies. However, rigorous evaluations tend to be relatively expensive in terms of both time and resources. Hence, UX evaluators must choose between rapid evaluations, which are less expensive, and rigorous evaluations, which are more informative. Usually, rapid evaluations are used during the early iterations of the UX lifecycle, while rigorous evaluations may be conducted during the later iterations. As a result, many rapid evaluation methods tend to be formative, while most rigorous evaluation methods tend to serve summative purposes. However, these relationships are not exclusive: it is possible to have a rigorous formative evaluation method or a rapid summative evaluation method.

Analytic versus Empirical Evaluations

Another consideration when choosing an evaluation method is whether it is analytic or empirical. Analytic evaluations focus on analyzing the inherent attributes of a design, as opposed to observing the system in use (Hartson and Pyla 2012). They are usually conducted by UX experts and guided by heuristics or guidelines (see Chapter 11, section 11.2, for more details). Empirical evaluations, on the other hand, are based on data collected while observing participants using the system. By their nature, analytic evaluations tend to be rapid evaluation methods, while empirical evaluations tend to be more rigorous. However, there are rapid empirical evaluation methods, such as the Rapid Iterative Testing and Evaluation (RITE) method presented by Medlock et al. (2005).

4.5 Conclusion

In this chapter, we have reviewed the foundations and theories of HCI, from which many design rules have been derived. We have also discussed those design principles and guidelines in detail, including 3D UI examples of their application. Finally, we covered the UX engineering process. With this background, we now turn to the main topics of the book: the design and evaluation of user experience in 3D UIs.

Recommended Reading

We have relied heavily on several well-known HCI books in compiling this overview of the field, and we recommend these books to the reader who wants to learn more. In particular, we recommend three books as excellent resources:

Hartson, R., and P. Pyla (2012). The UX Book: Process and Guidelines for Ensuring a Quality User Experience. Waltham, MA: Morgan Kaufmann Publishers.

Shneiderman, B., and C. Plaisant (2005). Designing the User Interface: Strategies for Effective Human–Computer Interaction, 4th edition. Addison-Wesley.

Rosson, M., and J. Carroll (2001). Usability Engineering: Scenario-Based Development of Human–Computer Interaction. Morgan Kaufmann Publishers.
