11

Solving Problems and Making Decisions

Like any goal-directed activity, thinking can be done well or badly. Thinking that is done well is thinking of the sort that achieves its goals.

J. Baron
2008

INTRODUCTION

Complicated problem-solving and decision-making processes are engaged for all sorts of human activities. You must make decisions about things as simple as what clothes to put on in the morning and as complex as how to raise your children. Your decisions can have long-lasting consequences. The CEO of a company could decide to expand based on an overestimate of the company’s financial strength, which might result in bankruptcy. This in turn would result in the loss of many jobs and have devastating consequences for the local economy. Similarly, a government’s decision to enter into war will result in loss of life, economic hardship, and aftereffects of varying types that carry far into the future. Scientists have tried to understand how human reasoning and decision making take place so that poor decisions can be prevented.

Consider the operator of a human–machine system. To operate the system effectively, he must comprehend system information and decide on appropriate actions. There are two ways that the operator can control the system (Bennett, Flach, Edman, Holt, & Lee, 2015). The first mode of operation will be used when the system is operating in a familiar and predictable way. Under these circumstances, the operator can control the system with very little effort, relying on well-practiced responses to the system’s behavior (skill-based performance; see Chapter 3). The operator will face difficulty when system information indicates that an unusual condition has developed, requiring that the operator change to the second mode of operation. In this mode, the operator will need to make decisions based on his reasoning about the system state. This reasoning may involve recall of information from semantic or episodic memory (rule-based performance) or formulating a novel solution by integrating several different sources of information (knowledge-based performance).

For example, most of a pilot’s efforts in flying an airplane involve monitoring the instruments in the cockpit. This does not require a great deal of mental effort. Only when the instruments indicate that a problem has occurred must the pilot engage in any effortful problem solving or decision making. When the pilot determines that an emergency has occurred, she must integrate information from the many visual and auditory displays in the cockpit, diagnose the nature of the emergency, and decide on the actions that she should take in response. Yet, as we have discussed in previous chapters, the pilot’s capacity for processing such information is limited, and she may make a bad decision even though she is well-trained and well-intentioned.

This chapter examines how people reason about and choose between different actions. There are two ways to describe how people make decisions: normative and descriptive. A normative model specifies those choices that a rational person should make under ideal circumstances. However, as Johnson-Laird (1983, p. 133) observed, “Human reasoners often fail to be rational. Their limited working memories constrain their performance. They lack guidelines for systematic searches for counter-examples; they lack secure principles for deriving conclusions; they lack a logic.” In other words, our decisions often deviate from those prescribed by normative models, primarily because of our limited capacity for processing information. Descriptive models of reasoning and decision making try to explain how people actually think. By understanding how and why people deviate from normative rationality, the human factors specialist can present information and design support systems that will help an operator devise optimal problem solutions.

PROBLEM SOLVING

In most problem-solving tasks, a person confronts a problem that has a clear goal. In the laboratory, problem solving is studied by presenting people with multistep tasks that take minutes or hours to perform. These tasks usually require a person to perform many different actions to attain a goal. One famous problem of this type is the Tower of Hanoi (see Figure 11.1), which is widely used to assess people’s executive control functions (see Chapter 10; Welsh & Huizinga, 2005). The goal is to move all of the discs from peg A to peg C, under the restrictions that only one disc can be moved at a time and a larger disc cannot be put on top of a smaller disc. Problem solving in the Tower of Hanoi and similar tasks is studied by recording the problem solver’s moves as well as his or her accuracy and time to solution.
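The structure of the task can be made concrete with a short sketch. The following Python function is a minimal illustration of the move structure, not a model of human performance; the peg labels follow Figure 11.1.

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Return one optimal sequence of (disc, from_peg, to_peg) moves
    for transferring n discs from source to target."""
    if n == 0:
        return []
    # Move the n-1 smaller discs out of the way, move the largest disc,
    # then move the smaller discs on top of it.
    return (hanoi(n - 1, source, spare, target)
            + [(n, source, target)]
            + hanoi(n - 1, spare, target, source))

# The classic three-disc problem takes 2**3 - 1 = 7 moves.
print(hanoi(3))
```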

FIGURE 11.1 The Tower of Hanoi problem.

Another way to study problem solving is to obtain verbal reports, sometimes called protocols, from the problem solver that describe the steps he took to solve the problem. Verbal protocols are especially useful for tasks in which intermediate steps to the solution are made mentally and are therefore not observable. Verbal protocol analysis has been used in applied settings, such as in the development of expert systems (see Chapter 12), as well as to understand problem-solving processes (Noyes, 2006). Protocols are assumed to reflect the information and hypotheses being attended in working memory, although in reality they are only reports of the thoughts that are occurring at the time (Ericsson & Simon, 1993).

Protocols are usually generated while the task is being performed, rather than after it is completed, because people may forget and fabricate information if the reports are obtained retrospectively (Russo, Johnson, & Stephens, 1989). When protocols are collected systematically, they can provide valuable information about the cognitive processes engaged for a particular task (Hughes & Parkes, 2003). However, if a person generates the protocol while performing the task, she may alter how she performs the task. This could occur because of competition between the resources required to generate the protocol and to solve the problem (Biehal & Chakravarti, 1989; Russo et al., 1989). We must also remember that the information supplied by a protocol is a function of the instructions that the problem solver has been given and the questions the investigator has asked (Hughes & Parkes, 2003). These will determine what information is reported, and how much, and poor instructions or bad questions will lead to useless protocols.

THE PROBLEM SPACE HYPOTHESIS

One way to think about problem solving is to imagine how objects are manipulated within an imaginary mental space. This space is constructed by the problem solver from his understanding of the problem, including those relevant facts and relationships that he thinks are important for the task. All problem solving takes place within this space. Objects are manipulated within the space according to the problem solver’s knowledge of allowable actions defined by the rules of the problem. Finally, the problem solver has available a number of rules or strategies that can coordinate the overall problem-solving process.

Newell and Simon (1972) proposed a framework for problem solving in which goals are achieved by movement through the problem space. Within this framework, different problem spaces are mental representations of different task environments. These problem spaces can be characterized by a set of states (positions in the problem space) and a set of operators that produce allowable changes between states (movement through the space). A problem is specified by its starting state and its desired ending, or goal, state. For the Tower of Hanoi problem, the starting state is the initial tower on peg A, the goal state is a tower on peg C, and the operators are the allowable movements of discs between the three pegs.

Two aspects of Newell and Simon’s (1972) portrayal of problem solving are important: how the problem is represented and how the problem space is searched. First, because the problem space is only a mental representation of the task environment, it may differ from the task environment in certain important respects. With respect to product and system design, the editor of a special issue of a journal devoted to computational approaches for early stages of design noted, “The decisions we take in early design are often the most influential, imposing key constraints on our view of the problem space and thereby shaping later downstream decisions” (Nakakoji, 2005, p. 381). Similarly, advocates of support systems for developing groupware (computer software designed to facilitate interactions among group or team members) emphasize the need for developers and users to have a common understanding of the problem space (Lukosch & Schummer, 2006).

Second, searching the problem space requires consideration and evaluation of allowable moves between states. Some moves that are allowable within the task environment may not be included in the problem space, which means that they will not be considered. Furthermore, the limited capacity of working memory constrains the number of moves that can be considered simultaneously. This means that for complex problems, only a small part of the problem space can be held in memory and searched at any one time. Because only a limited number of moves can be examined, finding a problem solution quickly and effectively will require using strategies that direct search toward likely solution paths.
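A minimal sketch of this framework is shown below, assuming that states can be held in memory (here, hashed) and that a function enumerates the allowable operators for each state; the function and variable names are ours, not part of Newell and Simon's model.

```python
from collections import deque

def search_problem_space(start, is_goal, operators):
    """Breadth-first search: states are positions in the problem space,
    and operators generate the allowable moves between states."""
    frontier = deque([(start, [])])        # (state, path of moves so far)
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if is_goal(state):
            return path                    # a solution path through the space
        for move, next_state in operators(state):
            if next_state not in visited:
                visited.add(next_state)
                frontier.append((next_state, path + [move]))
    # If an essential operator is missing from the representation,
    # no amount of search will reach the goal.
    return None
```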

Consider the nine-dot problem in Figure 11.2. The goal is to connect the dots in the figure by drawing four straight lines without lifting the pencil from the paper. Many people find the nine-dot problem difficult. The reason for this is that the solution requires the lines to go outside of the boundaries of the square formed by the dots (see Figure 11.3). This allowable move is usually not included in the problem space, even though the problem description does not exclude such moves, possibly because Gestalt perceptual principles organize the dots into an object with boundaries and prior knowledge places inappropriate constraints on the representation (Kershaw & Ohlsson, 2004). When the move that allows the person to solve this problem is not included in the problem space, no amount of searching will allow the person to find it. As this example illustrates, an incomplete or inaccurate problem representation is a common source of problem-solving difficulty. Hence, one way to improve problem solving is for the person to spend more time constructing a mental representation before seeking a solution (Rubinstein, 1986), and another is to provide hints that lead the person to change her representation of the problem space (Öllinger, Jones, & Knoblich, 2014).

FIGURE 11.2 The nine-dot problem.

FIGURE 11.3 The solution to the nine-dot problem.

Even when all allowable moves are contained in the problem space, a person needs a strategy to find a solution path through the problem space. A strategy will be most important when people are solving problems in unfamiliar domains, where their ability to find a solution path is limited. Perhaps the weakest strategy, trial and error, consists of unsystematic or random selections of moves between states to attain the goal. Two stronger, more systematic strategies are forward chaining (working forward) and backward chaining (working backward). Forward chaining begins from the initial state. All possible actions are evaluated, the best one is selected and performed, and feedback tells the problem solver whether the action was a good one or a bad one. This process is repeated until a solution is achieved. Backward chaining begins from the goal state and attempts to construct a solution path to the initial state.

A third general strategy is called operator subgoaling. A person solving a problem selects a move (operator) without consideration of whether or not it is appropriate for the current state. If the move is inappropriate, the problem solver forms a subgoal in which he attempts to determine how to change the current state so that the desired move becomes appropriate.

These three strategies all incorporate heuristics to narrow the search for possible moves. You can think of heuristics as rules of thumb that increase the probability of finding a correct solution. Heuristics allow the problem solver to choose among several possible actions at any point in the problem space. For example, one heuristic is called hill climbing. In hill climbing, the problem solver evaluates whether or not the goal will be closer after making each possible move. The problem solver selects the move that brings him “higher” or closer to the goal state (the top of the hill). Because only the direction of each local move is considered, this heuristic works like climbing to the top of a hill while blindfolded. The problem solver may be left “stranded on a small knoll”; that is, every possible move may lead downhill although the goal state has not yet been reached. Consequently, the best solution may not be found.
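The sketch below illustrates the hill-climbing heuristic under the assumption that the problem solver has some evaluation function scoring how close a state is to the goal; the toy "landscape" at the end is invented to show how the heuristic can strand the solver on a local peak.

```python
def hill_climb(state, neighbors, score):
    """Repeatedly take the single move that looks best locally.
    `score` estimates closeness to the goal; higher is better."""
    while True:
        options = neighbors(state)
        if not options:
            return state
        best = max(options, key=score)
        if score(best) <= score(state):
            # Every move leads "downhill": stranded on a local maximum
            # even though the goal state has not been reached.
            return state
        state = best

# Tiny invented landscape: the true peak is at position 5, but a local
# peak at position 3 stops the climber.
landscape = {0: 1, 1: 2, 2: 5, 3: 6, 4: 4, 5: 9}
print(hill_climb(0,
                 neighbors=lambda s: [n for n in (s - 1, s + 1) if n in landscape],
                 score=lambda s: landscape[s]))   # stops at 3, not at 5
```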

Chronicle, MacGregor, and Ormerod (2004) proposed that a hill-climbing heuristic is one factor that underlies the difficulty that people have in solving the nine-dot problem. They argue that people evaluate potential moves against a criterion of satisfactory progress, which in the nine-dot problem is “that each line must cancel a number of dots given by the ratio of the number of dots remaining to lines available” (p. 15). Selecting moves that meet this criterion drives the problem solvers away from moves that lie on the correct solution path.

Means-end analysis is a heuristic that is similar to hill climbing in its focus on reducing the distance between the current location in the problem space and the goal state. The difference between means-end analysis and hill climbing is that in problem spaces appropriate for means-end analysis, the move needed to reach the goal can be seen, allowing an appropriate move to be selected to reduce the distance. Note that the heuristic described for the nine-dot problem above was called hill climbing rather than means-end analysis, because the criterion against which the problem solver was evaluating progress is inferred from the problem statement (dots must be cancelled) and not a known goal state.

Means-end analysis is a heuristic based on identifying the difference between the current state and the goal state and trying to reduce it. However, sometimes a solution path will require increasing the distance from the goal. Under means-end analysis, these kinds of actions are particularly difficult. For example, Atwood and Polson (1976) had people solve versions of water jug problems (in which water from a filled large jug must be distributed equally between it and another medium-sized jug, using a small jug). Their problem solvers had considerable difficulty with the problems for which finding a solution required them to move away from the known goal state of equal amounts of water in the two largest jugs.

The problem space hypothesis is particularly useful as a framework for artificial intelligence. This framework is embodied in the idea of a production system, which includes a data base, production rules that operate on the data base, and a control system that determines which rules to apply (Davis, 2001; Nilsson, 1998). One benefit of modeling human problem solving using production systems is that we can describe human performance with the same terminology used to describe machine performance. Consequently, insight into how human problem solving occurs can be used to advance artificial intelligence, and vice versa. This interaction lays the foundation for the design of cognitively engineered computer programs to assist human problem solving, called expert systems, which will be discussed in the next chapter.
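A toy production system in the spirit of this description is sketched below, with a data base of facts, a list of production rules, and a simple control regime that fires the first applicable rule; the rules themselves are invented for illustration and are not drawn from any real expert system.

```python
def run_production_system(facts, rules, max_cycles=10):
    """facts: set of strings (the data base).
    rules: list of (condition_set, fact_to_add) productions.
    Control: fire the first rule whose conditions are all present and
    whose action adds something new; stop when no rule applies."""
    for _ in range(max_cycles):
        for conditions, action in rules:
            if conditions <= facts and action not in facts:
                facts.add(action)          # apply the production
                break
        else:
            break                          # quiescence: no rule fired
    return facts

# Illustrative rules only:
rules = [({"red light on"}, "engine overheating"),
         ({"engine overheating"}, "shut system down")]
print(run_production_system({"red light on"}, rules))
# facts now include "engine overheating" and "shut system down"
```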

ANALOGY

Analogy is another powerful heuristic in problem solving (Chan, Paletz, & Schunn, 2012; VanLehn, 1998). It involves a comparison between a novel problem and a similar, familiar problem for which the steps to a solution are known. An appropriate analogy can provide a structured representation for the novel problem, give an idea about the operations that will probably lead to a solution, and suggest potential mistakes. People tend to use analogies when the source and target problems have similar surface features (Bassok, 2003; Holland, Holyoak, Nisbett, & Thagard, 1986). A problem solver may attempt erroneously to apply an analogy when the surface characteristics are similar, even though the problems are structurally quite different and require different paths to solution. Conversely, analogical reasoning may not be used appropriately if the source and target problems have only structural similarity. Thus, the effective use of an analogy to solve a problem requires that the problem solver recognize structural similarity between the novel problem and the familiar analogous problem, and then apply the analogy correctly.

In general, people are adept at using analogies to solve problems, but they often fail to retrieve useful analogies from memory. Gick and Holyoak (1980, 1983) investigated the use of analogies in solving the “radiation problem” originated by Duncker (1945). The problem is stated as follows:

Suppose you are a doctor faced with a patient who has a malignant tumor in his stomach. It is impossible to operate on the patient, but unless the tumor is destroyed the patient will die. There is a kind of ray that at a sufficiently high intensity can destroy the tumor. If the rays reach the tumor all at once at a sufficiently high intensity, the tumor will be destroyed. Unfortunately, at this intensity the healthy tissue that the rays pass through on the way to the tumor will also be destroyed. At lower intensities the rays are harmless to healthy tissue, but they will not affect the tumor either. What type of procedure might be used to destroy the tumor with the rays, and at the same time avoid destroying the healthy tissue? (Gick & Holyoak, 1983, p. 3)

In this problem, the actions that the problem solver might take are not well-defined. To arrive at a solution, she might use an analogy to transform the problem into a representation with clear actions. Before presenting the problem to be solved, Gick and Holyoak (1980) told the people in their study a military story in which a general divided his army to converge on a single location from different directions. Dividing the army is analogous to the solution of splitting the ray into several lower‑intensity rays that converge on the tumor (see Figure 11.4). Approximately 75% of the people tested following the military story generated the analogous solution to the ray problem, but only if they were told to use the analogy between the story and the problem. When they were not told that there was a relationship between the story and the problem, only 10% of them solved the problem. In short, the people had difficulty recognizing the analogy between the story and the problem, but they could easily apply the analogy when they were told that it was there.

FIGURE 11.4 The visual analogs of the radiation problem used by (a) Gick and Holyoak (1983) and (b, c) Beveridge and Parkins (1987).

When people read two different stories using the same convergence solution, they were more likely to solve the problem with the analogy than when they read only the military story. Gick and Holyoak (1983) attributed the ability to use the analogy after two stories to the problem solvers’ acquisition of an abstract convergence schema. Specifically, when two stories with similar structure are presented, the problem solver generates an abstract schema of that structure. The problem is then interpreted within the context of that schema, and the analogy is immediately available. Requiring a person to generate an analogous problem after solving one seems to have a similar beneficial effect on the solution of another, related problem (Nikata & Shimada, 2005).

Given that the solution to the radiation problem has a spatial representation, we might predict that providing problem solvers with a visual aid should help them generate the problem solution. Gick and Holyoak (1983) used the diagram in Figure 11.4a for the radiation problem and found that it did not improve performance. However, Beveridge and Parkins (1987) noted that this diagram does not capture one of the essential features of the solution, which is that several relatively weak beams have a summative effect at the point of their intersection. When they showed people the diagram in Figure 11.4b or colored strips of plastic arranged to intersect as in Figure 11.4c, problem solvers’ performance improved. Thus, to be useful, the visual aid must appropriately represent the important features of the task. Holyoak and Koh (1987) reached a similar conclusion about verbal analogies like the military story after they showed that problem solvers used them spontaneously more often when there were more salient structural and surface features shared by the two problems.

These findings suggest that to ensure the appropriate use of a problem-solving procedure, an operator should be trained using many different scenarios in which the procedure could be used. Visual aids can be designed that explicitly depict the features important for solving a problem or that direct attention to critical features (Grant & Spivey, 2003). By exploiting the variables that increase the probability that an analogy is recognized as relevant, the human factors specialist can increase the likelihood that previously learned solutions will be applied to novel problems.

Although people have difficulty retrieving structurally similar, but superficially dissimilar, analogous problems to which they have previously been exposed, they seem to be much better at using structural similarities to generate analogies. Dunbar and Blanchette (2001) noted that scientists tended to use structural analogies when engaged in tasks like generating hypotheses. The results from their experiments suggest that even nonscientists can and do use superficially dissimilar sources for analogy when they are free to generate possible analogies.

LOGIC AND REASONING

Recall again the concept of a problem space. We have presented problem solving as the discovery of ways to move through that space. Another way of thinking about problem solving is to consider how people use logic or reason to create new mental representations from old ones. Reasoning, which can be defined as the process of drawing conclusions (Leighton, 2004a), is a necessary part of all forms of cognition, including problem solving and decision making.

We can distinguish three types of reasoning: deductive, inductive, and abductive (Holyoak & Morrison, 2012). Deduction is reasoning in which a conclusion follows necessarily from general premises (assumptions) about the problem. Induction is reasoning in which a conclusion is drawn from particular conditions or facts relevant to a problem. Abduction is reasoning in which a novel hypothesis is generated to best explain a pattern of observations. People find all types of reasoning difficult, and they make systematic errors that can lead to incorrect conclusions. In the next two sections, we will discuss deductive and inductive reasoning, which have been studied extensively, and the ways that people can make mistakes when faced with different kinds of problems. We will then provide a briefer description of abduction.

DEDUCTION

Deduction depends on formal rules of logic. Formal logic involves arguments in the form of a list of premises and a conclusion. Consider the following statements:

1. Nobody in the class wanted to do the optional homework assignment.

2. Paul was a student in the class.

3. Therefore, Paul didn't want to do the optional homework assignment.

Statements 1 and 2 are premises, or assumptions, and statement 3 is a conclusion that is deduced from the premises. Together, these statements form a kind of “argument” called a syllogism. A syllogism is valid if the conclusion logically follows from the premises, as in this example, and invalid if it does not.

To the extent that any problem can be formulated as a syllogism, formal rules of logic can be applied to arrive at valid conclusions. We could characterize human reasoning for everyday problems as “optimal” if it really worked this way. However, it probably does not. Research on how people do reason deductively and the extent to which they use formal logic while reasoning has been performed using syllogisms (Evans, 2002; Rips, 2002). In particular, syllogisms are used to explore conditional and categorical reasoning.

Conditional Reasoning

To understand conditional reasoning, consider the statement, “If the system was shut down, then there was a system failure.” This statement allows one to draw a conclusion (system failure) when given a condition of the system (being shut down). Deductive reasoning with conditional statements of this form is called conditional reasoning. More formally, we can write such statements as conditional syllogisms, which are of the form:

1. If the system was shut down, then there was a system failure.

2. The system was shut down.

3. Therefore, the system failed.

There are two rules of logic that allow us to come to a conclusion when given a syllogism of this form: affirmation (also called modus ponens) and denial (also called modus tollens). The syllogism just provided illustrates the rule of affirmation. Affirmation states that if A implies B (e.g., system shut down implies system failure) and A is true (system shut down), then B (system failed) must be true.

Now consider a different syllogism based on the same major premise:

1. If the system was shut down, then there was a system failure.

2. The system did not fail.

3. Therefore, the system was not shut down.

The rule of denial states that given the same major premise that A implies B and also that B is false (system did not fail), then A must also be false (the system was not shut down).

When we try to determine how people reason deductively, we present them with syllogisms and ask them to judge whether or not the conclusion of the syllogism is valid. People find some kinds of syllogisms easier than others. In particular, when a syllogism correctly makes use of the affirmation rule, people can easily distinguish between valid and invalid conclusions (Rips & Marcus, 1977). People begin to have problems, however, when the information provided by the premises is insufficient to draw a valid conclusion, or when drawing a valid conclusion requires use of the denial rule.

Consider, for example, the premise, "If the red light appears, then the engine is overheating." Four syllogisms using this premise are shown in Table 11.1. The two valid syllogisms are shown in the top row of the table, using affirmation and denial. Whereas people have no problem judging the affirmation syllogism to be valid, they are less accurate at classifying the denial syllogism as valid. The table also shows two invalid syllogisms in the bottom row. For these syllogisms, the information provided by the second premise does not allow a person to draw any valid conclusion. The conclusions shown represent two common logical fallacies called the "affirmation of the consequent" and the "denial of the antecedent." In the original premise, the antecedent is the appearance of the red light, and the consequent is that the engine is overheating. To understand why these syllogisms are fallacious, it is important to recognize that nothing in the major premise (statement 1) rules out the possibility that the engine could overheat without the light turning on. Both of the invalid conclusions are based on the unwarranted assumption that the conditional statement "if" implies "if and only if," that is, that the light always appears if the engine is overheating and never when it is not.

TABLE 11.1

Examples of Valid and Invalid Conditional Syllogisms

Valid: Affirmation

1. If the red light appears, the engine is overheating.

2. The red light appeared.

3. The engine is overheating.

Valid: Denial

1. If the red light appears, the engine is overheating.

2. The engine is not overheating.

3. The red light did not appear.

Invalid: Affirmation of the consequent

1. If the red light appears, the engine is overheating.

2. The engine is overheating.

3. The red light appeared.

Invalid: Denial of the antecedent

1. If the red light appears, the engine is overheating.

2. The red light did not appear.

3. The engine is not overheating.
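One way to see why the invalid conclusions in Table 11.1 do not follow is to enumerate every assignment of truth values consistent with the premises and check whether the conclusion holds in all of them. The short sketch below does this for affirmation and for affirmation of the consequent; the encoding is ours, offered only as an illustration of the logic.

```python
from itertools import product

def follows(premises, conclusion):
    """A conclusion is valid only if it is true in every "world"
    (truth assignment) in which all of the premises are true."""
    worlds = [dict(zip(("light", "overheat"), values))
              for values in product([True, False], repeat=2)]
    consistent = [w for w in worlds if all(p(w) for p in premises)]
    return all(conclusion(w) for w in consistent)

implies = lambda w: (not w["light"]) or w["overheat"]   # if light, then overheating

# Valid: affirmation (modus ponens)
print(follows([implies, lambda w: w["light"]], lambda w: w["overheat"]))   # True
# Invalid: affirmation of the consequent
print(follows([implies, lambda w: w["overheat"]], lambda w: w["light"]))   # False
```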

A famous experiment that explored how people engage affirmation and denial rules in reasoning was performed by Wason (1969). He showed people four cards, two with letters showing and two with digits showing, as shown in Figure 11.5a. He gave his subjects the following conditional statement:

FIGURE 11.5 The four-card problem.

If a card has a vowel on one side, then it has an even number on the other side.

The subject’s task was to decide which cards would need to be turned over to determine whether the statement was true or false.

Most people turned over the E, demonstrating good use of the affirmation rule. But many people also turned over the 4, showing affirmation of the consequent. This is an error, because there is nothing in the statement that says that consonants could not also have an even number on the other side. According to the denial rule, the other card that must be turned over is the 7, since it must have a consonant on the other side. Very few people correctly turned over the 7.
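The same validity logic identifies which cards matter: only a card whose hidden side could falsify "if vowel, then even" needs to be turned over. A brief sketch, with the card faces encoded as strings of our own choosing:

```python
cards = ["E", "K", "4", "7"]   # visible faces in the four-card problem

def could_falsify(face):
    """A card can falsify the rule only if its visible face is a vowel
    (the hidden number might be odd) or an odd number (the hidden
    letter might be a vowel)."""
    if face.isalpha():
        return face in "AEIOU"     # vowel showing: check the other side
    return int(face) % 2 == 1      # odd number showing: check the other side

print([c for c in cards if could_falsify(c)])   # ['E', '7']
```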

It seems as though the problem solver’s difficulty in applying the denial rule arises from an insufficient search of the problem space. Evans (1998) suggested that people are biased to select cards that match the conditions in the statement (i.e., the vowel and even number), regardless of whether or not they are relevant to the problem. Surprisingly, even students who had taken a course in formal logic did no better at this task than students who had not (Cheng, Holyoak, Nisbett, & Oliver, 1986), suggesting that this bias is a fundamental characteristic of human reasoning.

While people find the four-card problem difficult, they do very well with a completely equivalent problem as long as it is framed within a familiar context (Griggs & Cox, 1982; Johnson-Laird, Legrenzi, & Legrenzi, 1972). Consider the conditional statement "If a person is drinking beer, then the person must be over 20 years of age." When asked to help a police officer determine whether a bar is in compliance with the minimum drinking age, students correctly indicated that the officer should check the IDs of the people drinking beer and check what the people under 20 years of age were drinking.

If people were able to apply logical rules like affirmation and denial to conditional statements, we would not expect to see any difference in reasoning performance between the four-card problem and the minimum drinking age problem. The fact that people do better when problems are presented with familiar contexts suggests that people do not routinely use logical rules (Evans, 1989). Reasoning seems to be context-specific. For the drinking age problem, reasoning is accurate because people are good at using “permission schemas” (Cheng & Holyoak, 1985) to figure out what they are and are not allowed to do.

In the four-card problem, the tendency for people to turn over cards that match the conditions in the antecedent and the consequent can be viewed as a bias to look for confirming rather than disconfirming evidence. This bias affects reasoning performance in many other situations involved in verifying truth or falsity, such as medical diagnoses, troubleshooting, and fault diagnoses. Even highly trained scientists, when trying to confirm or disconfirm hypotheses through experimentation, may fall victim to their own confirmation biases.

One reason why confirmation bias is so strong is that people want to be able to retain their ideas of what is true and reject the ideas they wish to be false. One way to eliminate confirmation bias is to present premises that are personally distasteful to the problem solver, so that he will be motivated to reject them. Dawson, Gilovich, and Regan (2002) gave people a test that classified each of them as having high or low emotional reactivity. Then they were told that people like themselves (high or low) tended to experience earlier death. This was not a belief that these people wanted to verify; they were highly motivated to disconfirm it. They were then shown four cards very similar to those used in the Wason four-card task, except that the cards were labeled with high and low emotional reactivity on one side and early and late death on the other. The labels that were exposed were high and low emotional reactivity and early and late death. The people were asked to test the early death hypothesis by turning over two cards. The two cards that correctly test the hypothesis are the (confirming) card indicating the person's reactivity level and the (disconfirming) card indicating late death. The people told that they were at risk of an early death turned over the correct cards approximately five times more frequently than people who were told that they were not at risk. Dawson et al. found similar results by asking people to verify personally distasteful racial stereotypes.

Even when people intend to look for disconfirming evidence, this becomes hard to do when the task becomes more complex (Silverman, 1992). In such situations, reasoning can be improved through the use of computer-aided displays. There are several reasons why this may be true. First, the display can continuously remind the reasoner that disconfirming evidence is more important than confirming evidence, and second, the display can reduce some of the cognitive workload imposed by a complex task.

Rouse (1979) noted that maintenance trainees had difficulty diagnosing a fault within a network of interconnected units. This diagnosis required locating operational units in the network and tracing through their connections to potentially faulty units. However, the trainees tended to look for failures and to ignore information about which nodes had not failed. One condition in the experiment used a computer-aided display to help trainees keep track of tested nodes that had not failed. The trainees’ fault diagnosis was better when the display was used than when it was not. Furthermore, people trained with the display performed better even after training was over and the display was no longer available.

Nickerson (2015, p. 1), in his book devoted to conditional reasoning, provides a clear statement that summarizes its importance:

Conditional reasoning is reasoning about events or circumstances that are contingent on other events or circumstances. It is a type of reasoning in which we all engage constantly, and without the ability to do so, human beings would be very different creatures, and greatly impoverished cognitively.

Categorical Reasoning

Categorical syllogisms are different from conditional syllogisms in that they include quantifiers like some, all, no, and some-not. For example, a valid categorical syllogism is:

1. All pilots are human.

2. All humans drink water.

3. Therefore, all pilots drink water.

As with conditional syllogisms, judgments about the validity of a conclusion in a categorical syllogism can be influenced by the context of the syllogism, misinterpreting the premises, and confirmation bias.

Consider the syllogism:

1. Some pilots are men.

2. Some men drink beer.

3. Therefore, some pilots drink beer.

The conclusion that some pilots drink beer does not follow from the premises, although many people will judge it to be valid. To appreciate why this conclusion is invalid, consider the following very similar syllogism:

1. Some pilots are men.

2. Some men are older than 100 years.

3. Therefore, some pilots are older than 100 years.

There are no pilots older than 100 years. What has gone wrong here? If a logical rule is to be applied to a set of premises, the resulting conclusion should be valid regardless of the content of those premises. In both of these syllogisms, exactly the same logical rule was applied to reach a conclusion. Whereas it seems reasonable to conclude that some pilots drink beer, it does not seem at all reasonable to conclude that some pilots are older than 100 years. This means that the first conclusion, no matter how reasonable (or true), is not a valid deduction. The error made with syllogisms of this type is to assume that the subset of men who are pilots and the subset of men who drink beer (or who are older than 100 years) overlap, but none of the premises forces this to be true.

These types of errors have been explained in part by the atmosphere hypothesis (Woodworth & Sells, 1935). According to this hypothesis, the quantifiers in the premises set an “atmosphere,” and people tend to accept conclusions consistent with that atmosphere (Leighton, 2004b). In the two syllogisms above, the presence of the quantifier some in the premises creates a bias to accept the conclusion as valid because it also uses the quantifier some.

Many errors on categorical syllogisms also may be a consequence of an inappropriate mental representation of one or more of the premises. For example, the premise some men drink beer could be incorrectly converted by a person to mean all men drink beer. The accuracy of syllogistic reasoning is also affected by how the premises are presented, and in particular, the ordering of the nouns in the premises. For example, the premises in the syllogism above present the nouns in the form pilots-men (premise 1) and men-beer (premise 2). When the premises are presented like this, people will be more likely to produce a conclusion of the form pilots-beer than of the form beer-pilots, regardless of whether it is valid (Morley, Evans, & Handley, 2004). We can refer to this form as “A-B, B-C,” where A, B, and C refer to the nouns in the premises. For premises with the form B-A, C-B, people tend to judge conclusions of the form C-A as valid. The following syllogism illustrates an example of this form:

1. Some men are pilots.

2. Some beer drinkers are men.

3. Therefore, some beer drinkers are pilots.

Again, this is an invalid conclusion. One reason why people may endorse this conclusion is that they may change the order in which the premises are encoded, so that the second premise is represented first and the first premise is represented second (Johnson-Laird, 1983). Reordering the premises in this way allows a person to think about beer drinkers, men, and pilots as subsets of each other, and sets up an “easier” A-B, B-C representation.

Johnson-Laird (1983) proposes that reasoning occurs through the construction of a mental model of the relations described in the syllogism. For example, given the premises:

All the pilots are men.

All the men drink beer.

a mental tableau would be constructed. The first premise designates every pilot as a man but allows for some men who are not pilots. The tableau for this would be of the following type:

pilot = man

pilot = man

pilot = man

(man)

(man)

The parentheses indicate that men who are not pilots may or may not exist. The tableau can be expanded to accommodate the second premise for which all men drink beer but some beer drinkers may not be men. It leads to the following model:

pilot = man = beer drinker

pilot = man = beer drinker

pilot = man = beer drinker

(man) = (beer drinker)

(man) = (beer drinker)

(beer drinker)

When asked whether a conclusion such as All the pilots are beer drinkers is valid, the mental model is consulted to determine whether the conclusion is true. In this case, it is true.
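Johnson-Laird's tableau can be mimicked by listing hypothetical individuals and checking a candidate conclusion against every individual in the model. The sketch below captures only this one idea, not the full theory; the particular individuals are our own choice of one model consistent with the premises.

```python
# One mental model consistent with "All the pilots are men" and
# "All the men drink beer"; each dict is one imagined individual.
model = [
    {"pilot": True,  "man": True,  "beer": True},
    {"pilot": True,  "man": True,  "beer": True},
    {"pilot": False, "man": True,  "beer": True},   # optional: a man who is not a pilot
    {"pilot": False, "man": False, "beer": True},   # optional: a beer drinker only
]

def holds(model, conclusion):
    """A conclusion is accepted if it is true of every individual in the
    model (and, in the full theory, in every alternative model as well)."""
    return all(conclusion(person) for person in model)

# "All the pilots are beer drinkers"
print(holds(model, lambda p: (not p["pilot"]) or p["beer"]))   # True
```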

Two factors can affect the difficulty of a syllogism, according to Johnson-Laird (1983). The first factor is the number of different mental models that are consistent with the premises. When trying to decide whether a conclusion is valid, a person must construct and consider all such models. This imposes a heavy load on working memory resources. The second factor is the order in which premises are presented and the ordering of nouns within the premises, as discussed above. The orderings dictate the ease with which the two premises can be related to form an integrated mental model. Again, it seems as though reasoning does not occur through the use of formal logical rules, but by cognitive processes that are subject to biases and working memory limitations.

INDUCTION AND CONCEPTS

Induction differs from deduction in that an inductive conclusion is not necessarily true if the premises are true, as is the case with valid deductions. Inductive reasoning is accomplished by drawing a general conclusion from particular conditions. We do this every day without using any formal rules of logic. For example, a student may arrive at the inductive conclusion that all midterm exams are held in the middle week of the semester because all of hers have been held at this time. Although this conclusion may be generally true, the student may take a class next semester for which the midterm exam is given at some other time. Inductive reasoning involves processes like categorization, reasoning about rules and events, and problem solving (Holyoak & Nisbett, 1988).

Our understanding of how the world works grows by using induction (Holland et al., 1986). Induction modifies how we think about procedures, or ways to do things, and our conceptual understanding of the world, or how objects and concepts are related to each other. Concepts and procedures can be represented by interrelated clusters of rules (schemas). Rules and rule clusters operate as mental models that can simulate the effects of possible actions on different objects.

A concept is an abstraction of the rules and relationships that govern the behavior of certain objects. How concepts are learned from examples and used is a fundamental component of inductive reasoning. Concepts have at least two functions (Smith, 1989): minimizing the storage of information and providing analogies to past experience. Concepts minimize the amount of information stored in memory, because a general rule and the objects to which it applies can be represented more economically than specific relationships between all objects in a particular category. For example, the rule “has wings and flies” can be applied easily to most objects in the category “bird,” whereas it seems wasteful to remember separately that “robins have wings and fly,” “sparrows have wings and fly,” “canaries have wings and fly,” and so on.

Past experiences represented as concepts can be used as analogies for problem solving. Recall the student who has induced that all midterms occur in the middle week of the semester. If she habitually skips her classes, she may use this induction to attend classes during that middle week to avoid missing her midterms.

Induction cannot occur between just any conceptual categories using just any rules. We can conceive of induction taking place through the activation of conceptual categories and rules appropriate to those categories. Activated concepts are formulated into a mental model similar to the “problem space” we discussed earlier in this chapter. Induction will be limited by the information that a person is able to incorporate into his problem space and retain in working memory. A particular problem-solving context will activate only a limited number of categories of conceptual knowledge, and not all of the information that may be necessary for valid induction may be incorporated into the mental model. If the wrong categories are activated, any conclusions made within the context may not be accurate. Similarly, if important information is left out of the mental model, any inductive reasoning based on that mental representation will not be able to use that information, and again, the conclusions may not be accurate.

Mental models may be used to simulate possible outcomes of actions (Gentner & Stevens, 1983). That is, given a model that incorporates a particular conceptual category, induction may proceed by “running” the model through different possible configurations and “observing” the outcome. These models and a person’s ability to use them in this way will depend on that person’s experiences interacting with a system or other relevant experiences. As with any inductive reasoning, these simulations can result in an accurate conclusion, but nothing is guaranteed. The accuracy of a conclusion will depend on the accuracy of the mental model. The accuracy of the mental model is one factor that allows experts to reason better than novices in a particular domain (see Chapter 12).

McCloskey (1983) demonstrated how an incorrect mental model can lead to incorrect inferences. He examined naive theories of motion acquired from everyday interactions with the world. He asked people to solve problems of the type shown in Figure 11.6. For the spiral tube problem, he told them to imagine a metal ball put into the spiral tube at the end marked by the arrow. For the ball and string problem, he told them to imagine the ball being swung at high speed above them. People then drew the path of the ball when exiting the tube in the first case and when the string broke in the second case. The correct path in each case is a straight line, but many people responded that the balls would continue in curved paths. This led McCloskey to propose that people used a "naive impetus theory" to induce the path of the balls: the movement of an object sets up an impetus for it to continue in the same path. This theory, when incorporated into a mental model, yields simulations that produce incorrect inductions.

FIGURE 11.6 The spiral tube (a) and ball-and-string (b) problems with correct (solid lines) and incorrect (dashed lines) solutions.

One important issue in the development of conceptual categories is how particular objects are classified as belonging to particular categories. One idea is that an object is classified as belonging to a particular category if and only if it contains the features that define the category (Smith & Medin, 1981). For example, “robin” may be classified as a “bird” because it has wings and flies. This idea, while important for the categorization of objects, does not explain how concepts are developed. Defining features do not exist for many categories. For example, there is no single feature that is shared by all instances of the concept games. Moreover, typicality effects, in which classification judgments can be made faster and more accurately if an object (robin) is typical of the category (bird) than if it is atypical (penguin; see Chapter 10), show that not all instances of a category are equal members of that category.

The effect of typicality can result in fallacious reasoning (Tversky & Kahneman, 1983). For example, one experiment showed people personality profiles of different individuals, for example, "Linda," who was "deeply concerned with issues of discrimination and social justice and also participated in antinuclear demonstrations." They were then asked to judge how typical an example Linda was of the category "bank teller" or "feminist bank teller," or how probable it was that Linda was a bank teller or a feminist bank teller. Because feminist bank tellers are a subset of the larger category "bank teller," people should estimate the probability that Linda is a bank teller as higher than the probability that she is a feminist bank teller. However, Linda was usually judged as more typical of a feminist bank teller, and people estimated the probability that she was a feminist bank teller as higher than the probability that she was just a bank teller.

This error is a conjunction error (Kahneman & Tversky, 1972; Shafir, Smith, & Osherson, 1990; Tversky & Kahneman, 1983), which arises from a representativeness heuristic. The representativeness heuristic is a rule of thumb that assigns objects to categories based on how typical they seem to be of those categories (see later discussion of decision-making heuristics in this chapter).
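The conjunction error violates a basic probability rule: a conjunction can never be more probable than either of its conjuncts. A short check makes the point; the numbers are made up purely for illustration.

```python
# Whatever the true values, P(A and B) can never exceed P(A).
p_bank_teller = 0.05                      # illustrative value only
p_feminist_given_teller = 0.30            # illustrative value only
p_feminist_bank_teller = p_bank_teller * p_feminist_given_teller

assert p_feminist_bank_teller <= p_bank_teller   # always holds
print(p_bank_teller, p_feminist_bank_teller)     # 0.05 0.015
```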

There are other ways that category membership can be determined. These include assessing how similar an object is to a category "prototype" (the ideal or most typical category member) and using other information to convert an inductive problem into one of deduction (Osherson, Smith, & Shafir, 1986). For example, when people are given the volume of an object and asked to decide whether it is a tennis ball or a teapot, they will classify objects that are closer in volume to a tennis ball than to the average teapot as tennis balls (Rips, 1989). Apparently, the knowledge that tennis balls are of a fixed size is incorporated into the categorical judgment, changing the problem into one of deduction.

ABDUCTION AND HYPOTHESES

A third form of reasoning, introduced by Peirce (1940), is called abduction or retroduction. Abductive reasoning involves three interrelated elements (Holcomb, 1998; Proctor & Capaldi, 2006): explaining patterns of data; entertaining multiple hypotheses; and inference to the best explanation. With regard to explaining patterns of data, a person using abduction examines phenomena, observes patterns, and then develops a hypothesis that explains them. This form of reasoning is not deduction, because the hypothesis is not derived from the phenomena, nor is it an induction, because it is a generalization not about the properties shared by the phenomena but about their cause.

The latter two elements derive from the idea that people don’t think about a single hypothesis in isolation. Thus, when reasoning abductively, people evaluate any given hypothesis relative to other hypotheses, with the goal of arriving at the best explanation. As we mentioned in Chapter 2, this form of reasoning is used widely in science (Haig, 2014). However, it also is used widely in other circumstances. For example, in medical diagnosis (Patel, Arocha, & Zhang, 2005) and judicial decision making (Ciampolini & Torroni, 2004), people will generate and consider alternative hypotheses, and their diagnosis or decision will be in favor of the hypothesis that provides the best explanation of the evidence. Likewise, when diagnosing a fault with a complex system, such as a chemical plant, operators will typically apply abductive reasoning in generating and evaluating different hypotheses (Lozinskii, 2000).

DECISION MAKING

The things that a person decides to do affect both the person (the decision maker) and the people around him. Also, the conditions under which the decision is made can influence what a person chooses to do. Decisions can be made under conditions of certainty, in which the consequences for each choice are known for sure, or under conditions of uncertainty, in which the consequences for each choice may be unknown. Gambling is an example of decision making under uncertainty. Most real-life decisions are made under uncertainty. If you decide to exceed the speed limit by a significant amount, one of several things could possibly happen: you might arrive at your destination early and save a lot of time, you could be stopped and cited for speeding and arrive late, or you could cause a serious traffic accident and never arrive at all. An example of an applied decision-making problem is the choice to include curtain air bags as standard equipment on a line of automobiles, given the estimated cost, prevailing market conditions, effectiveness of the air bags, and so on.

How do people choose what to do when they do not know what the consequences of their actions will be? The most interesting questions about how people make decisions are asked within this context. Decisions based on both certain and uncertain conditions are faced regularly by operators of human–machine systems as well as by human factors specialists. Therefore, it is important to understand the ways that decisions are made and the factors that influence them. There are two ways that we can talk about how people make decisions (Lehto, Nah, & Yi, 2012). Normative theories explain what people should do to make the best decisions possible. But people do not often make the best decisions, so descriptive theories explain how people really make decisions, including how people overcome cognitive limitations and how they are biased by decision-making contexts.

NORMATIVE THEORY

Normative theories of decision making concern how we should choose between possible actions under ideal conditions. Normative theories rely on the notion of utility, or how much particular choice outcomes are worth to the decision maker. Utility is a measure of the extent to which a particular outcome achieves the decision maker’s goal. The decision maker should choose the action that provides the greatest total utility. If outcomes are uncertain, both the probabilities of the various outcomes and their utilities must be figured into the decision-making process.

How people incorporate utility into their decision-making processes has been studied using gambles. Table 11.2 shows two gambles. People in decision-making experiments are often given some amount of money to play with, and they can choose which gamble they would prefer to play. Of the choices in Table 11.2, which gamble should a decision maker choose? Is gamble A better than gamble B, or vice versa? Expected-utility theory provides an answer to these questions. For monetary gambles, assume that the utility of a dollar amount is equal to its value. The expected utility E(u) of each gamble can be found by multiplying the probability of each possible outcome by its utility and summing these products. That is,

E(u) = \sum_{i=1}^{n} p(i)u(i),

where p(i) is the probability of the ith outcome, and u(i) is the value of the ith outcome.

We can view an expected utility as the average amount that a decision maker would win for a particular gamble. In choosing between two gambles, the rational choice would be the one with the highest expected utility.

We can compute the expected utilities for the gambles in Table 11.2. For gamble A, the expected utility is

.10($10) - .90($1) = $1.00 - $.90 = $.10,

and for gamble B the expected utility is

.90($1) - .10($10) = $.90 - $1.00 = -$.10.

Thus, gamble A has the highest expected utility and should be preferred to gamble B.
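The same computation can be written out directly. The sketch below uses the outcome values implied by the arithmetic above (a 10% chance of winning $10 against a 90% chance of losing $1 for gamble A, and the reverse for gamble B); these values are read off the worked example rather than quoted from Table 11.2.

```python
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one gamble."""
    return sum(p * u for p, u in outcomes)

gamble_a = [(0.10, 10.0), (0.90, -1.0)]   # 10% win $10, 90% lose $1
gamble_b = [(0.90, 1.0), (0.10, -10.0)]   # 90% win $1, 10% lose $10

print(round(expected_utility(gamble_a), 2))   #  0.1
print(round(expected_utility(gamble_b), 2))   # -0.1
```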

A rational decision maker makes choices in an attempt to achieve some goal. We have defined utility as the extent to which an outcome achieves this goal. Therefore, expected-utility theory provides a yardstick for rational action, because rational decisions must be consistent with those that yield the greatest utility.

Expected-utility theory forms the basis of a discipline called behavioral economics, which studies how people make economic choices. One reason why expected-utility theory has been so influential is the simple fact that rational choices must be based on numbers. This means that only a few fundamental rules of behavior, called axioms (Wright, 1984), can be used to deduce very complex decision-making behavior. One fundamental axiom is called transitivity. This means that if you prefer choice A to choice B, and you also prefer choice B to choice C, then when you are presented with options A and C, you should prefer A. Another is that of dominance: if, for all possible states of the world, choice A produces at least as desirable an outcome as choice B, then you should prefer choice A. Most importantly, preferences for different options should not be influenced by the way they are described or the context in which they are presented; only the expected utility should matter. As we shall see, these axioms do not always hold for real-life decisions, which is why psychologists have developed descriptive theories to explain human behavior.
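What the transitivity axiom demands of a set of pairwise preferences can be stated as a short check; the preference data below are invented for illustration.

```python
from itertools import permutations

def violates_transitivity(prefers):
    """prefers: set of (x, y) pairs meaning 'x is chosen over y'.
    Returns a triple where A is chosen over B and B over C,
    yet A is not chosen over C; returns None if no such triple exists."""
    options = {x for pair in prefers for x in pair}
    for a, b, c in permutations(options, 3):
        if (a, b) in prefers and (b, c) in prefers and (c, a) in prefers:
            return (a, b, c)
    return None

# An intransitive cycle like the health-club example (labels are illustrative).
print(violates_transitivity({("A", "B"), ("B", "C"), ("C", "A")}))
# prints one intransitive triple, e.g. ('A', 'B', 'C')
```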

DESCRIPTIVE THEORY

Do people perform optimally in the way proposed by expected-utility theory? The answer is no. People consistently violate the axioms of expected-utility theory and demonstrate what could be interpreted as irrational choice behavior. In this section, we will talk about the ways that people violate these axioms, the reasons for violating these axioms, and then, in the next section, ways to improve decision-making performance.

Transitivity and Framing

Consider the axiom of transitivity described above. If A is chosen over B, and B over C, A should be chosen over C. Yet violations of transitivity occur (Tversky, 1969), in part because small differences between alternatives are ignored in some situations but not in others. Consider the three health clubs in Table 11.3 (Kivetz & Simonson, 2000). Information about price, variety of exercise machines, and travel time is given if it is available. You can see that while you may prefer Health Club A to B on the basis of price, and B to C on the basis of travel time, you may nonetheless prefer Health Club C to A on the basis of variety. This violation of transitivity results from a comparison between different features of each alternative and does not necessarily represent irrational behavior.

Another important violation of expected utility axioms results from framing (Levin et al., 2015). Choice behavior will change when the context of the decision changes, even when that context does not alter the expected utilities of the choices (Tversky & Kahneman, 1981). People can be manipulated to make different choices by restating identical problems to emphasize gains or losses. As one example, consider the following problem:

Imagine that the U.S. is preparing for the outbreak of an unusual and virulent disease, which is expected to kill 600 people. Two alternative programs to combat the disease have been proposed. Assume that the consequences of the programs are as follows.

One description of the two programs, emphasizing lives saved, might look like this:

If Program A is adopted, 200 people will be saved. If Program B is adopted there is a 1/3 probability that 600 people will be saved, and 2/3 probability that no people will be saved. Which of the two programs would you favor?

Another description, emphasizing lives lost, might look like this:

If Program C is adopted, 400 people will die. If Program D is adopted, there is 1/3 probability that nobody will die, and 2/3 probability that 600 people will die. Which of the two programs would you favor?

Notice that the two descriptions are formally identical. For instance, in the first description 200 people are saved with the first program, which is the same as 400 people dying, as written in the second description. People who saw the first description usually chose Program A over Program B, whereas most people who saw the second version chose Program D over Program C. The first description provides a positive frame for “saving lives,” whereas the second provides a negative frame for “lives that will be lost.” This demonstrates that a person’s decision may be affected greatly by the way in which important information is presented, primarily by influencing how people pay attention to various attributes of the choice.

Another axiom of expected-utility theory that is closely related to framing has to do with the stability of preference. If A is preferred to B in one situation, then A should be preferred to B in all other situations. However, it is easy to get people to reverse their preferences for A and B in different contexts. Lichtenstein and Slovic (1971) found that when choosing between a bet with a high probability of winning a modest amount of money and one with a low probability of winning a large amount of money, most people chose the high-probability bet. They were then asked to state a selling price for each gamble, the amount for which they would turn the gamble over to a buyer and allow that buyer to play it. In this case, most people gave a higher selling price for the low-probability bet than for the high-probability bet.

This is a preference reversal, because the selling price indicates that the unchosen alternative has a higher value or utility than the chosen alternative. Tversky, Sattath, and Slovic (1988) concluded that such reversals occur because when the person must choose which gamble to play, he focuses his attention on probability, whereas when the person sets the selling price, he focuses his attention on the dollar amount. Again, the context in which the choice is framed has an effect on a person’s preference, and a person will attend to different features of the choice in different contexts.

Bounded Rationality

Violations of transitivity and framing effects (among other findings) led Simon (1957) to introduce the concept of bounded rationality. This concept embodies the notion that a decision maker bases his or her decisions on a simplified model of the world. The decision maker

behaves rationally with respect to this (simplified) model, and such behavior is not even approximately optimal with respect to the real world. To predict his behavior, we must understand the way in which this simplified model is constructed, and its construction will certainly be related to his psychological properties as a perceiving, thinking, and learning animal. (Simon, 1957, p. 198)

Bounded rationality recognizes that human decision makers have limitations on the amount of information that can be processed at any one time. For a difficult decision, it will not be possible to consider every feature of all of the alternatives. For example, when you go to choose a mobile phone plan, you cannot compare all plans on all possible features that differentiate them (Friesen & Earl, 2015). For a decision like this, you might think about only those features that you care about most (e.g., connection fees, price for on-network calls, data limits), and if you can come to a decision based only on these features, you will do so. This decision-making strategy is called satisficing (Simon, 1957). Whereas satisficing may not lead to the best decision every time, it will lead to pretty good decisions most of the time.

We defined heuristics earlier in this chapter as rules of thumb that allow people to reason in very complex situations. While heuristics will not always produce correct or optimal decisions, they help people bypass their cognitive and attentional limitations (Katsikopoulos & Gigerenzer, 2013). Satisficing takes place, then, through the use of heuristics.

Elimination by Aspects

One example of a satisficing heuristic applied to complex decisions is called elimination by aspects (Tversky, 1972). When people use this heuristic, they reduce the number of features they evaluate across their choices by focusing only on those that are personally most important. Beginning with the feature that you think is most important, you might evaluate all your choices and retain only those that seem attractive on the basis of this single feature. For you, price may be the most important aspect of a new car. You decide to eliminate all cars that cost more than $15,000. Size may be next important; so from the cars of $15,000 or less, you eliminate any compact cars. This elimination procedure continues through all of the personally important features until only a few alternatives remain that can be compared in more detail. Like many satisficing heuristics, although this procedure reduces the processing load, it can also lead to the elimination of the optimal choice.
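
A minimal sketch of elimination by aspects for the car example follows; the car listings and cutoffs are hypothetical. Each pass keeps only the alternatives that are acceptable on the next most important aspect.

```python
# Hypothetical car listings: (name, price in dollars, size class)
cars = [
    ("Hatchback X", 13500, "compact"),
    ("Sedan Y",     14900, "midsize"),
    ("Sedan Z",     16200, "midsize"),
    ("Wagon W",     14200, "fullsize"),
]

# Aspects in order of personal importance, each with an acceptability test.
aspects = [
    ("price", lambda car: car[1] <= 15000),      # most important: $15,000 or less
    ("size",  lambda car: car[2] != "compact"),  # next: eliminate compact cars
]

remaining = cars
for name, acceptable in aspects:
    remaining = [car for car in remaining if acceptable(car)]
    print(f"after {name}: {[car[0] for car in remaining]}")

# after price: ['Hatchback X', 'Sedan Y', 'Wagon W']
# after size: ['Sedan Y', 'Wagon W']
```

Note that the $16,200 sedan is eliminated on the very first pass even if it would have been the best overall choice, illustrating how the heuristic can discard the optimal alternative.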

Decision makers will often base their choice on a single dominant feature among alternatives and can be unwilling to consider other important attributes. In a study of proposals for coastline development in California, Gardiner and Edwards (1975) found that people could be grouped according to whether development or environmental concerns were most important to them. While members in the development group attended only to the development dimension across different development alternatives, members in the environmental group attended only to the environmental dimension. However, when people were forced to rate each proposal on both development and environmental dimensions, they gave some weight to both dimensions. This demonstrates that people can fairly evaluate alternatives on the basis of features that are not particularly salient to them if they are forced to do so.

The tendency to base decisions on only salient dimensions is even greater under stress. Stress increases the level of arousal, and, as discussed in Chapter 9, at high levels of arousal a person’s attentional focus becomes narrowed and less controlled. Wright (1976) found evidence for both of these effects in the decisions of people who rated how likely they would be to purchase each of several automobiles. In one condition, Wright increased task-relevant time stress by reducing the time available for making the decision, and in another, he played an excerpt from a radio talk show as a distraction. Both manipulations caused the decision makers to focus more on the negative characteristics of the cars than they did in the baseline condition. This and other studies suggest that the tendency to narrow attention during decision making under stress can be minimized by eliminating unnecessary stressors and by structuring the decision process in such a way that the decision maker is forced to consider all the features important to making a good choice.

Availability

Another useful heuristic, which is used to estimate the probabilities or frequencies of events, is called availability (Kahneman, Slovic, & Tversky, 1982). Availability is the ease with which events can be retrieved from memory. More easily remembered events are judged as more likely than less memorable events. For example, if a person is asked to judge whether the letter R is more likely to occur in the first or third position of words in the English language, she will usually pick the first position as most likely. In reality, R occurs twice as often in the third position. Tversky and Kahneman (1973) argue that this happens because it is easier to retrieve words from memory on the basis of the first letter. Availability also biases people to overestimate the probability of dying from accidents relative to routine illnesses (Lichtenstein, Slovic, Fischhoff, Layman, & Combs, 1978). Violent accidents such as plane crashes are much more available than most illnesses because they receive more media coverage, so their incidence tends to be overestimated.

Representativeness

The representativeness heuristic mentioned earlier in the chapter uses degree of resemblance between different events as an indication of how likely those events are to occur. More representative outcomes will be judged as more likely to occur than less representative ones. The following example from Kahneman and Tversky (1972) illustrates this point:

All families of six children in a city were surveyed. In 72 families the exact order of births of boys (B) and girls (G) was GBGBBG.

What is your estimate of the number of families surveyed in which the exact order of births was BGBBBB?

Because there is a 50% chance of giving birth to a boy or a girl, the sequence BGBBBB has exactly the same probability of occurring as the sequence GBGBBG. Despite the fact that these two sequences are equally probable, the BGBBBB sequence is often judged to be less likely than the GBGBBG sequence. We can explain this mistake by noting that the sequence with five boys and one girl is less representative of the proportion of boys and girls in the population.
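
Assuming independent births with a probability of 1/2 for each sex, any specific six-birth sequence has the same probability:

P(GBGBBG) = P(BGBBBB) = (1/2)^6 = 1/64 ≈ .016.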

Representativeness is closely related to the gambler’s fallacy, which is the belief that a continuing run of one of two or more possible events is increasingly likely to be followed by an occurrence of the other event. For example, suppose the births in the BGBBBB sequence above were presented sequentially to a person who made a probability judgment after each birth that the next birth would be a girl. The predicted probability that the subsequent birth would be a girl tends to become larger through the run of four boys, even though the probability is always 50%. The gambler’s fallacy occurs because people fail to treat the occurrence of random events in a sequence as independent; that is, that having a boy does not change the future probability of having a girl.

Probability Estimation

People are very bad at making accurate probability estimates. For example, the gambler’s fallacy is the failure to perceive independent events as independent. Shortly, we will see that people are also very bad at considering base rate information. Representativeness and anchoring (discussed below) are heuristics that permit people to make probability estimates for complex events. In particular, they allow people to make probability estimates for complex events composed of several simple events (such as the sequences of births presented above) without having to perform difficult mathematical calculations. As for all situations of satisficing, when a heuristic is used for probability judgments, such judgments will show systematic inaccuracies.

These inaccuracies can be demonstrated in real-life judgment situations (Fleming, 1970). Fleming asked people in his experiment to imagine themselves in a combat situation. He asked them to estimate the overall probability of an enemy attack on each of three ships, given the altitude, bearing, and type (e.g., size and armament) of an enemy plane. The person’s goal was to protect the ship that was most likely to be attacked. Although each aspect of the plane was independent and of equal importance, people tended to add the different probabilities together rather than to multiply them, as was appropriate (see Chapter 4). Because of these mistakes, people underestimated the probability of very likely targets and overestimated the probability of unlikely targets. Decision makers apparently experience considerable difficulty in aggregating probabilities from multiple sources, which suggests that such estimates should be automated when possible.

When base rates or prior probabilities of events are known, the information from the current events must be integrated with the base rate information. In the previous example, if the prior probabilities of each of the three ships being attacked were not equal, then this information would need to be integrated with the altitude, bearing, and type information. Yet, in such situations, people do not typically consider base rates.

A famous example of a base rate problem is presented as an evaluation of the reliability of an eyewitness’s testimony (Tversky & Kahneman, 1980). A witness, let’s call him Mr. Foster, sees an accident late one night between a car and a taxi. In this part of town, 90% of the taxis are blue and 10% are green. Mr. Foster sees the taxi speed off without stopping. Because it was dark, Mr. Foster could not tell for sure whether the taxi was green or blue. He thinks it was green. To establish how well Mr. Foster can discriminate between blue and green taxis at night, the police showed him 50 green taxis and 50 blue taxis in a random order, all in similar lighting. Mr. Foster correctly identified the color of 80% of the green taxis and 80% of the blue taxis. Given Mr. Foster’s identification performance, how likely is it that he correctly identified the color of the taxi involved in the accident?

Most people estimate Mr. Foster’s testimony to have around an 80% probability of being accurate. However, this estimate “neglects” the base rate information provided early in the problem: only 10% of the taxis in that part of town were green to begin with. The true probability that Mr. Foster saw a green taxi is only about 31% when this information is considered.
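
The 31% figure comes from applying Bayes’ rule to combine the base rate with Mr. Foster’s identification accuracy:

P(green | says “green”) = (.80 × .10) / [(.80 × .10) + (.20 × .90)] = .08/.26 ≈ .31.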

We can demonstrate that people rely on the representativeness heuristic to solve some problems of this type. For example, Kahneman and Tversky (1973) gave people descriptions of several individuals supposedly drawn at random from a pool of 100 engineers and lawyers. One group was told that the pool contained 70 engineers and 30 lawyers, whereas the other group was told the reverse. These prior probabilities did not affect the judgments; the judgments were based only on how representative of an engineer or a lawyer a person seemed to be.

Decision makers can adjust their probability estimates when they are instructed to pay attention to base rate information, but their modified estimates are not adjusted enough. So, in the case of Mr. Foster, if a juror were instructed to consider the fact that only 10% of the taxis are green, he might modify his estimate of Mr. Foster’s accuracy from 80% down to 50%, but probably not all the way down to 31%. This tendency to be conservative in adjusting probability estimates can be linked to the use of an anchoring heuristic (Tversky & Kahneman, 1974). The evidence that Mr. Foster is 80% correct in judging blue from green taxis forms the basis of a preliminary judgment, or anchor. The base rate information is evaluated with respect to that anchor. The anchor exerts a disproportionate effect on the final judgment.

The importance of anchors was demonstrated by Lichtenstein et al. (1978), who had people estimate the frequencies of death in the United States for 40 causes. They were given an initial anchor of either “50,000 people die annually from motor vehicle accidents” or “1,000 deaths each year are caused by electrocution,” and then they estimated the frequencies of death due to other causes. The frequency estimates for other causes were considerably higher with the anchor “50,000 deaths” than with the anchor “1,000 deaths.”

In summary, when performing complex reasoning tasks, people often use heuristics that reduce their mental workload. These heuristics will produce accurate judgments in many cases, particularly when the reasoner knows something about the domain in question. The benefit of heuristics is that they render complex tasks workable by drawing on previous knowledge. The cost is that these heuristics are the source of many mistakes made by operators and decision makers.

IMPROVING DECISIONS

Individuals in an organization and operators of human–machine systems are often faced with complex decisions, sometimes under very stressful conditions. We have just discussed how people are forced to make less than optimal decisions because of their limited capacity for attending to and working with information. For this reason, one area in human factors has been concerned with the improvement of decision making through design. There are three ways in which we can improve the quality of decisions: designing education and training programs, improving the design of task environments, and developing decision aids (Evans, 1989).

TRAINING AND TASK ENVIRONMENT

We said earlier that people with formal training in logic make the same types of reasoning errors as people without such training. For example, Cheng et al.’s (1986) experiments found that people performed no better on Wason’s four-card problem after a semester course in logic than before the course. The implication is that training aimed at improving reasoning and decision making in general will do little to improve reasoning and decision making for specific tasks. Rather, training should focus on improving performance in specific task environments, because most reasoning is based on context-specific knowledge.

One exception to this general rule involves probability estimation. Fong, Krantz, and Nisbett (1986) showed that people could be taught to estimate probabilities more accurately with training. Their task required their subjects to use a statistical rule called the law of large numbers. This law states that the more data we collect, the more accurate our statistical estimates of population characteristics will be. Some of Fong et al.’s subjects received brief training sessions on the law of large numbers, and then were given 18 test problems of the following type (Fong et al., 1986, p. 284):

An auditor for the Internal Revenue Service wants to study the nature of arithmetic errors on income tax returns. She selects 4000 Social Security numbers by using random digits generated by an “Electronic Mastermind” calculator. And for each selected social security number she checks the 1978 Federal Income Tax return thoroughly for arithmetic errors. She finds errors on a large percentage of the tax returns, often 2 to 6 errors on a single return. Tabulating the effect of each error separately, she finds that there are virtually the same numbers of errors in favor of the taxpayer as in favor of the government. Her boss objects vigorously to her assertions, saying that it is fairly obvious that people will notice and correct errors in favor of the government, but will “overlook” errors in their own favor. Even if her figures are correct, he says, looking at a lot more returns will bear out his point.

The auditor’s reasoning was based on the fact that she used random sampling, which should be unbiased, and a relatively large sample of income tax forms. Her boss’s contrary stand is that the sample is not large enough to yield accurate estimates. The people who received training in the law of large numbers were much more likely to use statistical reasoning and to use it appropriately in their answers. In the above problem, for example, those who received training would be more likely to mention that the auditor’s findings were based on random sampling of a large number of tax returns.
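
A brief simulation illustrates why the auditor’s large random sample deserves trust. The numbers below are invented for illustration (they are not Fong et al.’s materials): we assume errors favor the taxpayer half the time and estimate that proportion from random samples of increasing size.

```python
import random

random.seed(1)
TRUE_PROPORTION = 0.5  # assumed true rate of errors favoring the taxpayer

def estimate(sample_size):
    """Estimate the proportion from a random sample of returns with errors."""
    favors_taxpayer = sum(random.random() < TRUE_PROPORTION
                          for _ in range(sample_size))
    return favors_taxpayer / sample_size

for n in (10, 100, 1000, 10000):
    print(n, round(estimate(n), 3))

# Estimates from small samples scatter widely around .5, while estimates from
# large samples land close to it -- the law of large numbers, and the reason
# the auditor's 4000-return sample supports her conclusion.
```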

An important aspect of any training program or task environment is how information is presented to trainees or decision makers. We will talk more about training in general in the next chapter. For now, we wish to emphasize that if information is presented unclearly, too generally, or abstractly, people will be unable to perceive the relevance of the information to the task that they wish to perform, and they will be unable to apply their experience to solve novel problems.

We have already seen one important example of the effect that different ways of presenting information can have on decisions: the framing effect. Presenting a problem in one way may lead to a different decision than presenting it another way. For example, people have difficulty reasoning about negative information and do much better if the information is framed in such a way that important attributes are encoded positively rather than negatively in the mental representation (e.g., Griggs & Newstead, 1982).

Whereas framing can be used to draw a decision maker’s attention to one or more features of a problem, many errors of inference and bias can be attributed to information being presented in such a way that it increases the decision makers’ information-processing load (Evans, 1989). Unfortunately, this is very easy to do simply by presenting information in a complicated or unclear way. As an example, people who study consumer behavior are very concerned about how pricing information is presented for products on grocery store shelves. You are probably familiar with the little tags displaying unit price that appear under all of the products on a shelf in a U.S. grocery. These tags are supposed to provide the consumer with a unit price that allows him to make easy price comparisons across similar products. However, the units are often different on each tag, tags are not always aligned with the product that they identify, and often, tags across several meters of shelf space may need to be searched, memorized, and compared to determine the best price. Russo (1977) performed a simple experiment that compared the shelf-tag system with a simple list of unit prices for all products posted near the products. When the list was used, consumers reported that comparisons across brands were easier, and consumers purchased the less expensive brand more often.

DECISION AIDS

Decision-making performance can be improved by providing decision makers with aids that relieve some of the memory and information-processing demands of the task. There are many kinds of such aids, ranging from the very simple (like notes written on index cards) to the very complex (computer-based decision-support systems that use artificial intelligence). A decision aid may not even be an object. It could simply be a rule that one follows within a familiar, but uncertain, situation. For example, physicians often use something called the Alvarado score to diagnose acute appendicitis. Different symptoms, such as pain in the lower right abdomen, are assigned point values, and when the number of accumulated points becomes high enough, the physician will remove the patient’s appendix.
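
The logic of such a point-based aid can be sketched in a few lines of Python. The symptom labels, point values, and cutoff below are illustrative placeholders, not the actual Alvarado score.

```python
# Illustrative point values for a simple additive decision aid
# (placeholders, not the actual Alvarado score).
POINTS = {
    "right_lower_quadrant_tenderness": 2,
    "elevated_white_cell_count": 2,
    "migration_of_pain": 1,
    "nausea_or_vomiting": 1,
    "low_grade_fever": 1,
}
THRESHOLD = 5  # illustrative cutoff for recommending a surgical consult

def score(findings):
    """Sum the points for the findings that are present."""
    return sum(POINTS[f] for f in findings)

patient = ["right_lower_quadrant_tenderness", "migration_of_pain",
           "nausea_or_vomiting", "low_grade_fever"]
total = score(patient)
print(total, "-> consult surgeon" if total >= THRESHOLD else "-> observe")
# 5 -> consult surgeon
```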

Often the role of a decision aid is to force the decision maker to conform to the choices prescribed by normative theories. The Alvarado scale for appendicitis forces a physician to consider all relevant symptoms and weights them according to their diagnosticity, so it works a lot like an expected utility measure. One approach to complex decision making is decision analysis, a set of techniques for structuring complex problems and decomposing them into simpler components (Lehto et al., 2012). A decision analysis can be viewed as a decision aid in and of itself, or it can be performed for the purpose of better understanding a decision or constructing a new decision aid.

Structuring a problem for decision analysis usually involves the construction of a decision tree specifying all possible decisions and their associated outcomes. The probability and utility of each outcome are estimated. Then, the expected utility is computed for each possible decision and used to recommend an optimal choice (von Winterfeldt & Edwards, 1986). Decision analysis has been applied with success to problems like suicide prevention, landslide risks, and weather prediction (Edwards, 1998). This success is due in large part to adequate structuring of the complex problem.
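
The computational core of a decision analysis can be illustrated with a tiny decision tree: each candidate decision leads to outcomes with estimated probabilities and utilities, and the recommended choice is the one with the highest expected utility. The options and numbers below are invented for illustration.

```python
# A tiny decision tree: each option maps to (probability, utility) outcome pairs.
# All options and numbers are invented for illustration.
decision_tree = {
    "reinforce levee": [(0.95, -2_000_000), (0.05, -10_000_000)],
    "do nothing":      [(0.80,          0), (0.20, -40_000_000)],
}

def expected_utility(outcomes):
    return sum(p * u for p, u in outcomes)

for option, outcomes in decision_tree.items():
    print(option, expected_utility(outcomes))

best = max(decision_tree, key=lambda option: expected_utility(decision_tree[option]))
print("recommended:", best)
# reinforce levee -2400000.0
# do nothing -8000000.0
# recommended: reinforce levee
```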

Decision analysis must be used with care. Because probabilities and utilities are estimated by the decision analyst, biases can still arise when these quantities are inaccurately assessed. Furthermore, even during a decision analysis, it is possible that certain critical features of the decision-making problem will be overlooked. One of the more spectacular failures of decision analysis involved the decision to place the gas tank of the Ford Pinto (sold between 1971 and 1980) behind the rear axle (von Winterfeldt & Edwards, 1986). When the Pinto was hit from behind, there was a chance that the gas tank would rupture and explode. A decision analysis was performed in which the cost of relocating the tank in front of the axle ($11 per vehicle) was compared with the expected dollar value of lives saved ($200,000 per “soul”) by tank relocation. The total cost of tank relocation was computed to be greater than the utilities associated with saving lives and avoiding injuries, so the gas tank was left where it was. Not considered in this analysis were the cost of punitive damages awarded in liability suits and the cost of the negative publicity resulting from publication of the analysis. The reputation of the Pinto never recovered, and it was discontinued in 1980.

One computer-based decision-analysis system is MAUD (Multi-Attribute Utility Decomposition; Humphreys & McFadden, 1980). MAUD contains no domain-specific knowledge but elicits information from the decision maker about the problem and the different alternatives available for solving the problem. Based on this input, it structures problems and recommends decisions using normative decision theory. Because of the way that MAUD asks questions of the decision maker, decision-maker bias is reduced.

In many disciplines, computer-based decision-support systems have been developed to aid complex decision-making processes (Marakas, 2003). The availability of mobile phones and tablet computers has allowed much more widespread use of decision-support systems than in the past (Gao, 2013). A decision-support system is used to guide operators through the decision-making process. It has three major components: a user interface, a control structure, and a fact base. The interface solicits input from the user and presents information to the user that is relevant to the problem. Users may retrieve and filter data, request computer simulations or projections, and obtain recommended courses of action (Keen & Scott-Morton, 1978).

The control structure of a decision-support system consists of a data base management system and a model management system (Liebowitz, 1990). The data base management system is a set of programs for creating data files organized according to the needs of the user. The model management system is used to model the decision situation by drawing on information from the data base. Finally, the fact base of the decision-support system includes not only the data base but also the models that can be applied to the data.

A good decision-support system has a number of characteristics, and human factors engineering can make a positive contribution to its usability. Most important from the users’ perspective is that it satisfy their needs. As Little, Manzanares, and Watson (2015, p. 273) note, “A decision support system that is mismatched with user needs benefits no one and can lead to poor decision making that results in unnecessary human and economic costs.” In addition to usefulness, usability is a critical factor, which mainly involves the design of the interface. This interface should allow effective dialogue between the user and the computer. The design should consider how information is presented and elicited, providing flexibility in how data are analyzed and displayed. As we will see in later chapters, there is usually a tradeoff between flexibility and usability. Human factors engineers can help determine the level of flexibility appropriate for any particular application. It is important to recognize that decision-support systems do not replace decision makers but only provide them with important information in such a way that decision-making performance is improved. Even when a decision-support system is known to be effective, people’s attitudes may prevent its widespread use, as we discuss in Box 11.1.

BOX 11.1 DIAGNOSTIC SUPPORT SYSTEMS

Computer-based Diagnostic Support Systems (DSSs) are used in a variety of situations to aid medical diagnoses, for example of appendicitis and heart failure, and some DSSs are designed for very general use. Such aids can be extraordinarily effective: some studies have shown that when physicians use a DSS, their diagnostic accuracy improves dramatically. Unfortunately, physicians demonstrate marked reluctance to use them.

One study examined physicians’ ability to diagnose acute cardiac ischemia (ACI), which includes obstruction of the heart’s arteries leading to chest pain, as well as full-blown “heart attacks,” in which the obstruction is complete and the heart muscle is dying. Such diagnoses are extraordinarily expensive because of the procedures employed to protect the patient’s life, but the cost of overlooking possible ACI is also very high, because the risk of the patient’s death is very high. Because of this high risk of death, physicians tend to err on the side of caution by diagnosing ACI even when it is not present. That is, they make lots of false alarms.

There is a very accurate DSS that can be used to assist in the diagnosis of ACI, which takes into account the patient’s actual risk of having ACI. For instance, a young, healthy woman who doesn’t smoke but complains of chest pains is unlikely to be suffering from a heart attack, whereas an older, overweight man who smokes is far more likely to be suffering from a heart attack. When physicians used this DSS, their false-alarm rate dropped from 71% to 0%. However, when given the opportunity to use the DSS later, only 2.8% of physicians chose to do so, citing little perceived usefulness of the aid as the reason (Corey & Merenstein, 1987).

There are several reasons why physicians may be reluctant to use a DSS, but “usefulness” is surely not one of them. A more likely explanation is a physician’s concern with how qualified he or she is perceived to be, both by patients and by colleagues. Even when told that such aids reduce errors, patients perceive physicians who use computer-aided DSSs to be less thorough, clever, and thoughtful than physicians who do not (Cruikshank, 1985).

Arkes, Shaffer, and Medow (2007) showed that this general finding persists even today, when, we might assume, patients are more accustomed to the presence of computer technology in medicine. Patients in their experiments read several scenarios in which physicians used either a computer-based DSS or no DSS at all. They rated physicians who used a computer-based DSS as less thorough, less professional, and having less diagnostic ability than physicians who used no aid at all. Furthermore, they also rated themselves as being less satisfied by the care they would receive from these physicians. These evaluations were mitigated somewhat, however, when they were also told that the DSS had been designed by the prestigious Mayo Clinic. Shaffer, Probst, Merkle, Arkes, and Medow (2013) found that seeking advice from another physician did not result in low ratings, suggesting that consultation of a nonhuman device specifically is the source of the negative evaluations.

These findings are troublesome for both patients and physicians. A physician trying to be as accurate as possible will have his or her best attempts at accuracy perceived negatively by his or her patients. There is also some evidence that these negative perceptions may extend to the physician’s colleagues. Such negative perceptions may lead to increased patient dissatisfaction, distrust, and, in the worst-case scenario, an increase in accusations of malpractice.

In sum, DSSs are a key component of modern medical practice. The system designer must be aware of the problems in their use. Some diagnostic systems, such as EEGs, contain within them DSSs that augment the system output with diagnostic guidelines. The challenge to the designer is to present this information in such a way that the physician is willing to use it. Patient acceptance is a more difficult problem, but one that would be easier to solve if physicians were more positive about the use of DSSs in their practice.

An alternative to support systems based on decision theory is case-based aiding (Lenz et al., 1998). Case-based aiding uses information about specific scenarios to support the decision maker. These computer-based systems try to provide the decision maker with appropriate analogies that are applicable to an ongoing problem. Kolodner (1991) argues that such an approach should be beneficial in many circumstances, because people reason through problems by using prior knowledge. A case-based support system, which stores and retrieves appropriate analogies, aids in decision making because people find it natural to reason using analogies but have difficulty retrieving them from memory.

As an example of how a case-based support system might be used, consider an architect who has been given the following problem:

Design a geriatric hospital: The site is a 4-acre wooded sloping square; the hospital will serve 150 inpatients and 50 outpatients daily; offices for 40 doctors are needed. Both long-term and short-term facilities are needed. It should be more like a home than an institution, and it should allow easy visitation by family members. (Kolodner, 1991, p. 58)

The architect could highlight the key words in the problem specification, and the support system would retrieve cases of geriatric hospitals that are similar to the design criteria. The architect then could evaluate the successes and failures of one or more of those cases and adapt similar designs for his purpose.
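
A minimal sketch of the retrieval step might rank stored cases by how many of the highlighted keywords they share with the new problem; the case library and keywords below are invented for illustration, and a real system would use a richer similarity measure.

```python
# A toy case library indexed by descriptive keywords (invented for illustration).
case_library = {
    "Lakeside Geriatric Center": {"geriatric", "hospital", "sloping site",
                                  "wooded", "long-term care", "home-like"},
    "Downtown Acute Clinic":     {"outpatient", "clinic", "urban", "high-rise"},
    "Riverview Nursing Home":    {"geriatric", "long-term care", "home-like",
                                  "family visitation"},
}

def retrieve(query_keywords, library, top_n=2):
    """Rank stored cases by the number of keywords they share with the query."""
    scored = [(len(query_keywords & keywords), name)
              for name, keywords in library.items()]
    scored.sort(reverse=True)
    return [name for overlap, name in scored[:top_n] if overlap > 0]

query = {"geriatric", "hospital", "sloping site", "home-like", "family visitation"}
print(retrieve(query, case_library))
# ['Lakeside Geriatric Center', 'Riverview Nursing Home']
```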

A final example of decision support is a recommendation system (Stohr & Viswanathan, 1999). A recommendation system provides information about the relative advantages of alternative actions or products. Recommendation systems are used by online retailers to suggest books or recorded music that you might want to purchase, based on your previous purchasing patterns and those of other people. Web-based agents may make recommendations about various aspects of websites. An example is Privacy Bird®, which was developed in the first decade of the 21st century and is still available to download. It is a user agent that alerts users as to whether a website’s privacy policy, posted in machine-readable form, is consistent with the user’s preferences (Cranor, Guduru, & Arjula, 2006). A happy green bird indicates that the site’s policy matches the user’s preferences, an angry red bird indicates that it does not, and a yellow bird specifies that the site does not have a machine-readable privacy policy.

Recommendation systems are designed to provide users with information to assist in their decisions. In the case of Privacy Bird, this decision is whether to provide personal information to different companies. Most of you have experienced phishing attacks, in which you receive an apparently legitimate e-mail message directing you to a fraudulent website with the intent of getting you to enter personal information such as a credit card number. Recommendations to leave a suspected phishing site can be incorporated into the warnings, much as in Privacy Bird. However, for such warnings to be effective, various usability issues, such as the criteria for when to display the warning, what information should be displayed to users, how that information should be displayed, and how to encourage selection of the safe action, need to be taken into account when the warning/recommendation system is designed (Yang et al., 2017).

SUMMARY

Human problem solving, reasoning, and decision making are notoriously fallible. Even in very simple or straightforward cases, human performance deviates systematically from that defined as correct or optimal. However, these deviations do not imply that people are irrational. Rather, they reflect characteristics of the human information-processing system. Decision-making performance is constrained by a person’s limited ability to attend to multiple sources of information, to retain information in working memory, and to retrieve information from long-term memory. Consequently, people use heuristics to solve problems and make decisions. These heuristics have the benefit of rendering complex situations manageable, but at the expense of increasing the likelihood of errors. In virtually all situations, a person’s performance will depend on the accuracy of his or her mental representation or model of the problem. To the extent that this representation is inappropriate, errors will occur.

Human factors engineering has focused on training programs, methods for presenting information, and the design of decision-support systems to improve reasoning and decision-making performance. We have discussed how training in statistics and domain-specific problem solving can, to a limited extent, improve performance. However, it is easy to present information in ways that mislead or bias the decision maker. Computer-based decision-support systems circumvent many of these problems. Recommendation systems provide users with suggestions about actions or products, usually during their interactions with the World Wide Web. However, many decision-support systems are intended for use by experts in a particular field and cannot be used to help untrained individuals perform like experts. The knowledge possessed by an expert in a domain differs substantially from that of a novice. These differences and how expertise is incorporated into expert systems are topics of the next chapter.

RECOMMENDED READINGS

Baron, J. (2008). Thinking and Deciding (4th ed.). New York: Cambridge University Press.

Davidson, J. E., & Sternberg, R. J. (Eds.) (2003). The Psychology of Problem Solving. Cambridge, UK: Cambridge University Press.

Gigerenzer, G., Hertwig, R., & Pachur, T. (Eds.) (2015). Heuristics: The Foundations of Adaptive Behavior. New York: Oxford University Press.

Hastie, R., & Dawes, R. M. (2010). Rational Choice in an Uncertain World: The Psychology of Judgment and Decision Making (2nd ed.). Thousand Oaks, CA: Sage.

Holyoak, K. J., & Morrison, R. G. (Eds.) (2012). The Oxford Handbook of Thinking and Reasoning. New York: Oxford University Press.

Kahneman, D., Slovic, P., & Tversky, A. (Eds.) (1982). Judgment under Uncertainty: Heuristics and Biases. New York: Cambridge University Press.

Kahneman, D., & Tversky, A. (Eds.) (2000). Choices, Values, and Frames. New York: Cambridge University Press.

Keren, G., & Wu, G. (Eds.) (2015). The Wiley Blackwell Handbook of Judgment and Decision Making (2 vols.). Chichester, UK: John Wiley.

Leighton, J. P., & Sternberg, R. J. (Eds.) (2004). The Nature of Reasoning. Cambridge, UK: Cambridge University Press.
