Chapter 12. Specifying Modalities: States, Flows, Systems, and Prototypes

Introduction: A Prototype Is a Custom Measuring Tool

In the Charles and Ray Eames film Fiberglass Chairs: Something of How They Get the Way They Are, the first scene shows a sheet of connected dowels attached to an adjustable wooden block jig (see Figure 12-1). A person sits down on the prototype, and the blocks are adjusted to fit the shape of their back. Informed by the pair's wartime work creating molded plywood splints for the military, their furniture reflected an extended body of work in both prototyping and designing for the human form.

Figure 12-1. The jig for the Eames fiberglass chairs had adjustable blocks throughout the form to measure and shape the seat, curve, and back of the chair.

In a workshop, a jig is a custom-made tool used to help measure, guide, and speed up work with existing tools. In product design, especially when molding techniques are used, the jigs and final prototypes are sometimes destroyed. Other times, a jig becomes a tool in the manufacturing process itself.

In the Eameses’ design process, the jig for their fiberglass chairs served multiple purposes. In earlier stages of development, it served as a kind of custom ruler. They had many different people sit in it, adjusted the shape of the chair to fit each person’s posture and shape, and documented the results. The process allowed them to test hypotheses, such as “Is it more comfortable to have direct support from the chair or to have more space?” at specific contact points between the chair and the sitter. In later stages of the development process, they could use it as a working model, to find out what single shape worked best across the widest range of sitting postures and tooshie types. They could then trace that shape off to begin the next round of prototypes and specifications. The jig wasn’t just an early-stage design, but a custom-designed measuring tool in and of itself. It allowed the Eameses to take multiple measurements, answer data-focused questions, and quickly validate or reject design hypotheses.

In the same way, multimodal design deliverables serve multiple purposes throughout product development: tools of research and inquiry, hypothesis proposals, guidelines, and of course production specifications and assets. It’s helpful to think of all of these deliverables as prototypes: living, working models that are used flexibly to measure, shape, test, and refine solutions. Good ideas, good insights, good programming, and good products are crafted. Design deliverables are just as often stepping stones to figuring out which specifications really matter as they are records of the final numbers in pixels or inches. Don’t just create prototypes. Use them. Wear them out. Then make more.

Practice Makes Perfect

The body of the Apple MacBook series has a much-touted, unique manufacturing process. A custom extrusion of aluminum called a blank is cut and machined so that the componentry can be placed inside a solid unibody form made of fewer parts, offering higher strength and durability. The design team doesn’t just design the products; they design the manufacturing process. The same principle holds true for both hardware and software. Amazon Web Services is expressly designed to be used by other Amazon teams and by third parties; its core software and services are flexible and modular. This approach allowed Amazon Web Services to become a huge part of the overall company’s business model: whatever Amazon needed was probably something that another company was going to need too. Minimalism and modularity are not just an aesthetic or a system rationale: they are values embodied within the design and working practices of their respective teams. In the same way a custom jig is a tool, so is the design practice itself.

The term “best practices” is something of a misnomer, because it implies that there is one superior way for everyone to do something. Figuring out the right practice, especially for new product categories, products, and interaction capabilities, is a design exercise in and of itself. Multimodal design introduces a new set of human factors, integrates a blend of technologies in new ways, and brings new usage considerations into the solution space. A good design practice enables a team to share a clear goal or vision, provides rigor for thoughtful details and cohesive systems, and facilitates clear feedback and communication. This only becomes more important with new kinds of design solutions. Much of today’s design practice was developed around screen-based experiences and products. As the types of products that design teams create become more diversified, so will the right kinds of design practice.

The Media of Multimodal Products: Information and Interactions, Understandings and Behaviors

Architects create solutions fabricated with building materials like wood, brick, glass, or steel. Fashion designers realize their solutions with fabric and finishings. Electronics industrial designers work with aluminum, titanium, sapphire glass, and plastics. All of these designers also design the behaviors of their materials: hinged doors swing, fabric drapes and folds, seat cushions give, and earbuds are inserted into the outer ear canal and mostly stay there. But how do you create with bits, and how do they behave? Some of them exist for a few fractions of a second, only to be overwritten again and again. These unruly 1s and 0s zipping through silicon chips, copper cables, optical drives, and electromagnetic waves are invisible to the naked eye, are processed faster than we can comprehend, and most user experiences require millions, billions, or even trillions of them. Much of what is powerful about interaction design lies beyond the range of what people are able to physically experience or comprehend.

Dr. Hiroshi Ishii, who leads the Tangible Media Group at MIT, has made it his work to make these bits “tangible,” focusing on how information and computation require representation. His group explores how to reintegrate physical and digital experiences. One of the key themes of his work is to better couple the sensory properties of physical media with the dynamic properties of computing. In his framework for tangible user interfaces, information and computation are represented to users across a wide range of physical media and properties.1 He believes that the common ground between atoms and bits is not how we give bits shape, but how bits can shape human behavior.

Human behavior is made up of sensing, understanding, deciding, and acting. Important questions that product designers must answer every day include: How do bits help people perceive information? How do they affect the way we think and make choices? How do bits help us do what we want to do? Most importantly, designers must ask themselves: are we using these bits to make people’s lives better?

The Product Development Process for Multimodal Products

Once the foundational and exploratory facets of product development are identified, there are many kinds of deliverables that can be used. Because multimodal products span multiple design media, including visual, audio, and haptic elements, they are more complex: each mode needs to be designed individually. Physical designs call for 3D models and production specifications. Wireframes are used for screen-based experiences. Dialog scripts are used for speech. Haptic deliverables can look something like a choreography score. Designing multimodal experiences is much more like scoring a symphony (see Figure 12-2). Like instruments in an orchestra, there are individual scores for each mode of the experience, and a conductor’s score that coordinates them all. Unlike a conductor’s score, however, a multimodal design has multiple-choice or choose-your-own-adventure sections, and sometimes self-playing instruments that improvise.

Figure 12-2. Like an orchestral score, multimodal design syncs different technologies into a unified experience.

Another set of techniques that informs multimodal design is borrowed from theatrical, film, and animation processes. Because filmmaking is multimodal by nature, many of its techniques translate well to multimodal design. The cameras and microphones are constantly moving between different layers of visual information, sound and dialog streams, and story arcs. The way they move prioritizes specific sensory elements, which is comparable to sensory focus within a multimodal user experience. Art critics have spent the last century debating whether film is the closest existing medium to human experience. Wherever they land on that topic, thinking like a director or a conductor is a good mindset for creating multimodal design: both coordinate multiple sensory elements over time into a cohesive experience.

In this book, mapping and modeling are treated separately from prototypes and specifications. This is to allow a closer look at the new kinds of human factors, design considerations, and design elements that are becoming a part of interaction design. In practice, these aspects of product development are much more fluid and ideally highly integrated and iterative. On small teams, they may be done by the same person. For new products and new product categories, a hypothesis-driven approach is used when there is little precedent for the product typology. Depending on the scale of the product and the team, many deliverables can be folded together into one deliverable or broken apart and shared to ensure alignment at different levels of the product architecture. Fair warning: if you skipped ahead to this chapter to get right to the good stuff, then you might be disappointed. There are new types of design factors (Chapters 2 through 5) that are included in specifications, as well as design elements (Chapter 8). Without them, it might be tough to understand this chapter.

Creating prototypes and specifications has roughly four phases that are also fluid and iterative: exploratory, generative, foundational, and elaborative (see Figure 12-3). Exploratory design identifies important design inspiration, information, and concepts that are central to creating a design solution; generative design explores possible solutions within those constraints. Often shared with user research and engineering, these help inform the starting hypotheses of what a design solution might be. Some of the deliverables designers might create during this phase are mood boards, style frames, concept or user point-of-view videos, material investigations, and formal studies. These provide inspiration and a reference library of materials, design techniques, and aesthetic qualities. During the foundational and elaborative phases of design, core user behaviors are established, and product and manufacturing specifications are developed. These may include design metaphors, interaction models, input/output maps, wireframes, sound design, color/material/finish (CMF), and various functional and nonfunctional prototypes developed with engineering and manufacturing teams. There is a broad range of design deliverables and activities that can be used for multimodal products. Like jigs, a design team may end up needing a few custom ones.

Figure 12-3. Each phase of the design process provides different pieces of the solution. The number of iterations in each phase can depend on the maturity of the product.

Defining Design Requirements

Start by expressing a clear set of user goals. There are many ways to do this, but a fun, quick way is to use a first-person, fill-in-the-blanks style template (see Figure 12-4). Be concise.

Figure 12-4. Fill-in-the-blank user goals may not be comprehensive, but they are a good way to break the ice and get teams thinking about users and their needs.

User Goals, Scenarios and Storyboards, and Use Cases

User scenarios describe the situations in which a user might experience specific needs or try to achieve certain goals, including details of the environment, the general state of the user, and other considerations. For multimodal design, these kinds of deliverables are extended to include physical context and sensory focus (see Figure 12-5). For repetitive use cases and those that have safety requirements, it is important to explore suboptimal conditions for user experiences, to identify the types of errors or hazards that may occur as part of the experience. It is also important to explore circumstances that may require a substitution mode or a range of modes, to accommodate personal preferences.

Some topics to explore are pain points, hero moments, transitions, and the social context of the interaction. These can be documented as a storyboard, photo narrative, or video. The primary goal is to draw out use cases and to ensure that the priorities of the experience are identified. Once the important events are established, the experience can be blocked, a technique also borrowed from theatrical and film production. Blocking ensures that each modal interaction is complete and properly integrated with the other modalities. The sequence of frames is also reviewed to ensure that each step in the experience provides the information and functionality necessary for the next step. At this stage, the pacing of the experience can be established.

Pseudocode and Swimlane Logic Flows

Once all of the user experience steps have been identified (and the pathway of those steps has been established), some of the high-level technical considerations are reviewed. Pseudocode and swimlane logic flows are used to design the technical elements of the experience.

Pseudocode comes from programming, where plain language is used to describe the way a program is supposed to work. For designers, it’s useful to be able to describe the conditions or triggers that invoke a certain type of interaction, and then the multiple pathways that can unfold during the experience. It can be used to describe conditional or complex use cases. Many speech design tools are variations of pseudocode. It’s helpful for open-ended and nonlinear experiences, where it’s possible to jump around inside an experience.
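
As a rough sketch of the kind of flow pseudocode can capture, the following Python reads close to plain language: a hypothetical "set a timer" voice interaction, with a trigger condition and two pathways. The intent, prompts, and screen_present flag are illustrative assumptions, not taken from any particular speech platform or product in this book.

    import re

    def handle_timer_request(utterance, screen_present):
        """Return the ordered multimodal responses for one user turn."""
        responses = []
        match = re.search(r"(\d+)\s*minute", utterance)
        if match is None:
            # Open-ended pathway: the duration is missing, so re-prompt by voice.
            responses.append(("speech", "For how long?"))
            return responses
        minutes = int(match.group(1))
        if screen_present:
            # Auxiliary visual feedback when a screen is within the user's focus.
            responses.append(("visual", f"{minutes}:00 countdown"))
        responses.append(("speech", f"Timer set for {minutes} minutes."))
        return responses

    # Example turns:
    print(handle_timer_request("set a timer", screen_present=False))
    print(handle_timer_request("set a timer for 10 minutes", screen_present=True))

Even this small a sketch makes the branching visible: which condition triggers the interaction, where the experience re-prompts, and where a second mode joins in.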

Swimlane logic flows are useful when multiple modes are being used simultaneously or there are multiple transitions or substitutions between modes (see Figure 12-6). These can be very helpful to create smooth transitions or entry and exit points within a multimodal experience. They are also useful in highly responsive experiences with high levels of automation, external triggers, or turn taking.

Figure 12-5. The user scenario for a snowboarding lesson is more structured, to enhance learning, and more open-ended, to allow flexibility across riding styles and left- or right-side dominance.

Figure 12-6. Both pseudocode and swimlane logic flows can help create flows for multimodal experiences.
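
A swimlane flow can also be roughed out as data before it is drawn. As a minimal sketch, assuming one lane per mode with time-ordered steps (the lane names, steps, and timings below are hypothetical, loosely echoing the snowboarding lesson scenario), the lanes can then be interleaved into a single conductor's view:

    swimlanes = {
        "speech": [(0.0, "wake word detected"), (1.5, "confirm: 'Starting your lesson'")],
        "visual": [(1.5, "show lesson overview"), (4.0, "highlight first exercise")],
        "haptic": [(4.0, "pulse on the leading-side wristband")],
    }

    def merged_timeline(lanes):
        """Interleave every lane into a single, time-ordered view."""
        events = [(t, mode, step) for mode, steps in lanes.items() for t, step in steps]
        return sorted(events)

    for t, mode, step in merged_timeline(swimlanes):
        print(f"{t:>4.1f}s  {mode:<7} {step}")

Reading down a single lane checks each mode on its own; reading across the merged timeline checks the transitions and handoffs between modes.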

Specifying Multimodalities

Most interface modes already have a fairly well-established set of specifications. Multimodal specifications are really about how to create the connective tissue between them, or how to indicate focal, auxiliary, peripheral, or substitution modalities. This book isn’t meant to go deep into any one mode, but rather to look at how designers can design across them. There are many great references for the less common modes, many of which are also O’Reilly books. See the Additional Reading section for more information about designing specific modes.

To specify multimodal interactions, it’s important to understand why multiple modes are being used together and how. There are four main types of multimodal interactions: synchronous, asynchronous, parallel, and integrated. They sound a bit similar, so it’s important to keep them straight. Each has a different experience objective.

Synchronous and Asynchronous Modes

When an experience integrates multiple modes, the multimodal elements can come together in different ways. For example, when your phone rings, the screen, ringtone, and vibration occur in tandem as a single experience. The phone gets your attention, the screen shows the name of the caller, and the vibration may be a preference, or it may be a substitution when your phone is in silent mode. In these types of multimodal experiences, the design elements are synchronous, occurring together to support each other. In other experiences, the multimodal elements are used asynchronously. For example, on Amazon Echo devices, Alexa’s light ring is used to indicate waiting, and to mark the beginning, continuation, and end of speech. The visual and speech cues complement and counterpoint each other but don’t happen simultaneously.
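
In code terms, the difference can be sketched roughly like this: synchronous cues are dispatched together for a single event, while asynchronous cues are sequenced so that one hands off to the next. The event and cue names here are hypothetical, not drawn from any shipping product.

    SYNCHRONOUS_RING = {            # incoming call: the cues fire together
        "visual": "show caller card",
        "audio": "play ringtone",
        "haptic": "vibrate",
    }

    ASYNCHRONOUS_QUERY = [          # voice assistant turn: the cues take turns
        ("visual", "light ring: listening"),
        ("speech", "spoken answer"),
        ("visual", "light ring: done"),
    ]

The synchronous case is a bundle with no inherent order; the asynchronous case is an ordered sequence, and the order itself carries meaning.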

Parallel and Integrated Modes

Parallel and integrated modes serve a different purpose. An example of a parallel mode is navigating the Xbox using the controller, Kinect, or Cortana: button controls, gestures, and voice commands can each do the same thing. At any point in time, a user can easily switch from one mode to another. Parallel modes allow a single user to choose their preferred mode, or allow for substitution if it is needed. Integrated modes are when multiple modes are used simultaneously, allowing different user subgroups to experience the same thing at the same time. For example, a crosswalk uses both visual and sound cues to indicate when it is time to cross.

In these kinds of experiences, how the different modes map to each other becomes important. In one example, text can be entered through either dictation or typing. In another, a feature may be accessed through a keyboard shortcut or a gesture. Input/output maps can be used to ensure consistency between different modes that provide the same functionality. Parallel experiences are common for core interactions and navigation shared across different kinds of devices.

Input/Output Map

An input/output map shows all of the core multimodal interactions in one place. It is used to ensure that the modes map correctly to key features or interactions. It also ensures—across parallel or integrated modes—that there is a clear rationale and consistency within and across modes.

It’s important to note all interactive points of input and output, and their pertinent attributes and expectations. This can be a very useful starting point for developing prototypes and imagining the forms they might take, as the design works to accommodate the inputs and outputs of all modalities. This kind of document is especially useful if a number of different devices share a set of features and interactions across them but have variations in modes (see Figure 12-7).
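
One lightweight way to rough out an input/output map before it becomes a formal spec is as a table of features against the modes each device supports, which can then be checked for consistency. The device names, features, modes, and the parity check below are hypothetical, for illustration only.

    io_map = {
        # feature: {device: {"input": [...], "output": [...]}}
        "play/pause": {
            "phone":   {"input": ["touch", "voice"],   "output": ["visual", "audio"]},
            "earbuds": {"input": ["tap", "voice"],     "output": ["audio"]},
            "speaker": {"input": ["voice", "button"],  "output": ["audio", "light"]},
        },
        "next track": {
            "phone":   {"input": ["touch", "voice"],   "output": ["visual", "audio"]},
            "earbuds": {"input": ["double-tap"],       "output": ["audio"]},
            "speaker": {"input": ["voice"],            "output": ["audio", "light"]},
        },
    }

    def missing_voice_parity(io_map):
        """Flag features a device cannot reach by voice while its siblings can."""
        gaps = []
        for feature, devices in io_map.items():
            voiced = {d for d, modes in devices.items() if "voice" in modes["input"]}
            if voiced and voiced != set(devices):
                gaps.append((feature, sorted(set(devices) - voiced)))
        return gaps

    print(missing_voice_parity(io_map))   # [('next track', ['earbuds'])]

A check like this makes gaps visible early, such as a feature that is reachable by voice on one device but not on its siblings, so the team can decide whether the gap is intentional or an oversight.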

Figure 12-7. Input/output maps help keep interactions consistent but not identical across related devices and experiences.

Summary

Creating prototypes and specifications has four phases: exploratory, generative, foundational, and elaborative. Because of the interdependent nature of multimodal products, building, using, and refining prototypes is essential. Real-world product use is an integral stage of design, and using prototypes effectively can both reveal and address design considerations as they emerge. In addition to standard technical specifications, deliverables like storyboards and flows can help communicate purpose and context among the team so that everyone understands the experience goal and can apply their own expertise when needed.

1 Cara McGoogan, “The MIT Media Lab is waging war on pixels,” Wired, October 2015.
