CHAPTER  6

Sequence and Anticipation

Many critical analyses in this book depend on sequences of stimuli and responses that arise from experience with an S* and must anticipate the S* for conditioning to be effective. These sequences are the basis of the analyses of preconditioning in chapter 4 and of goal gradient effects in chapter 5. The existence of these sequences of stimuli, responses, and response-produced stimuli rests on actual observation rather than on speculative theory. This chapter describes two case histories in which conjectures about anticipatory links in a sequence led from academic debates about top-down contingency to bottom-up observations of actual behavior.

LATENT EXTINCTION

According to response reinforcement theory, rats learn to turn right or left at the choice point of a T-maze because they receive reward in the goal box at the end of the arm after they make the arbitrarily correct turn. The Sa is the stimulus situation at the choice point and the Ra is the turn, right or left. We know the response is arbitrary because the experimenter randomly assigns half the rats to one group and the other half to a second group; the first group gets rewarded for turning right and the second group gets rewarded for turning left. According to cognitive expectancy theory, animals instead learn to expect food in one goal box or the other, and they can learn this without making any turns at all.

There is the same contrast in extinction. According to response reinforcement theory, failure to find food in the goal box extinguishes the chain of responses that leads to the goal box. According to cognitive expectancy theory, animals discover that the goal box contains food during acquisition and that it is empty during extinction; they stop making the turn because the expectancy of food extinguishes. One way to test these contrasting views would be to train the animals with food reward in the usual way and then extinguish the expectancy without extinguishing the turning response.

To make this test, Deese (1951) trained 20 rats to go to the left side of a T-maze and 20 other rats to go to the right side for food reward. After six days of training at four trials per day, the 40 rats were going to the rewarded side on about 90% of the trials. On the seventh day, Deese extinguished all of the animals in the usual manner. They ran in the maze as before, but without food in the goal box. Extinction reduced the preference for the formerly rewarded side to 50%, which is the same as zero preference.

Half of the animals received special experience just before the start of extinction. Deese placed each of these rats in the empty goal box four times, for one minute each time, with one minute between exposures. The results of Deese’s (1951) experiment and of a similar experiment by Seward and Levy (1949) were clear. The animals that spent time in the empty goal box before ever running to it themselves extinguished the correct turn much more rapidly than animals without this preextinction experience. In fact, the experimental animals that received this nonresponse extinction extinguished almost immediately, while the control groups took many trials to extinguish. This looks like a clear-cut case of learning by perceiving without responding. Or does it?

Segments of a T-Maze

Deese’s typical version of cognitive expectancy views a run through a T-maze as one of two acts: a right turn or a left turn, with food at the end of the rewarded alternative. In those terms, there cannot be a response in nonresponse extinction because a rat that the experimenter places in the goal box cannot perform the act of making the formerly correct turn at the choice point. Popular as this view may be among traditional theorists, it is an unrealistic view of behavior that distorts and obscures vital details. More realistic and more informative pictures emerge from more detailed descriptions.

This section first analyzes the sequence of stimulus-response units in an act such as running to the goal box of a T-maze. Next the analysis zooms in on the sequence of stimulus-response units between entering a goal box and seizing a reward and between pressing a lever and seizing a reward.

Chapter 5 began with Hull’s analysis of a maze into a series of segments as shown in Figs. 5.1 and 5.2 and the experiments on the goal gradient that confirmed his analysis. Figure 6.1 recasts Hull’s S-R diagrams of Figs. 5.1 and 5.2 into the S-R chain diagrams of Figs. 4.3 through 4.6. All of these diagrams express the fact that every response changes the stimulus situation in some way. As a rat runs through a maze, each movement changes the stimulus situation. Each section of the maze is a little different: There are different spots on the floor, different imperfections in the walls, different lighting from outside the maze, different gradients of warmth and sound from the room outside the maze, and so on. Running from one section to another ends the stimulation from one section and exposes the animal to the stimulation in the next section.

The advantage of the system shown in Fig. 6.1 over Hull’s in Figs. 5.1 and 5.2, is that it emphasizes parallels: (a) parallels between maze learning and classical conditioning described in Figs. 4.3 through 4.6, and (b) parallels between maze learning and other sequential patterns of response such as lever-pressing and key-pecking described in detail in later chapters of this book.

In Fig. 6.1, S1, S2, S3, … Sn represent the stimuli in each section of the maze, while R1, R2, R3, … Rn represent the response of running in each corresponding section of the maze. The break between S3—R3 and Sn—Rn (indicated by dots …) means that the diagram can stand for any arbitrary number of maze units. Also, R1, R2, R3, … Rn could represent running straight on in a straight-alley runway, or they could represent turning right or left at successive choice points in a multiple-unit maze, and S1, S2, S3, … Sn could represent the stimulus situation at each corresponding choice point. As in Hull’s analysis, SG represents the stimulus situation at the entrance to the goal box and RG represents the response of entering the goal box and getting the reward. For simplicity at this point in the analysis, everything that happens after the rat enters the goal box and finds the food is represented in Fig. 6.1 as S*.

FIG. 6.1. Chain of S-R units in a runway or a T-maze. Copyright © 1997 by R. Allen Gardner.
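
To make the diagram concrete, here is a minimal sketch of the chain in Fig. 6.1 as a data structure. The representation is purely illustrative, not part of any formal theory, and all names are hypothetical.

```python
from collections import namedtuple

# Illustrative S-R chain for Fig. 6.1. Each response ends one stimulus
# situation and exposes the animal to the next; the chain ends in S*.
Link = namedtuple("Link", ["stimulus", "response"])

chain = [
    Link("S1", "R1"),  # run through the first maze section
    Link("S2", "R2"),  # run through the second maze section
    Link("S3", "R3"),  # ... any number of sections can follow
    Link("SG", "RG"),  # goal-box entrance: enter and seize the reward
]
S_STAR = "food"        # everything after finding the food, collapsed into S*

for link in chain:
    print(f"{link.stimulus} evokes {link.response}")
print(f"chain terminates in {S_STAR}")
```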

Figure 6.2 is a diagram representing a segmental analysis of a T-maze. The diagram arbitrarily divides each arm into three segments: R1, R2, and R3 for the right arm and L1, L2, and L3 for the left arm. Both arms have a compartment at the end. The compartment at the end of the right arm contains the food dish and it is labeled G for goal box. The choice point where the rat must enter one arm or the other is labeled C. At that point, the rat either responds to the stimuli in R1 by entering R1 or responds to the stimuli in L1 by entering L1. There are two chains like the one in Fig. 6.1, one starting with a right turn and the other with a left turn. In S-R-S* theories, only one of the chains is rewarded because the rat only finds food in the goal box at the end of the right arm. Consequently, only the response of turning right to enter the first segment of the right arm in response to the choice point stimuli is part of the rewarded chain.
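
The branching structure of Fig. 6.2 can be sketched the same way; again, the representation is only illustrative. Two chains of segments diverge at the choice point C, and only the chain through the right arm terminates in S*:

```python
# Illustrative layout of the segmental T-maze in Fig. 6.2.
t_maze = {
    "right": ["R1", "R2", "R3", "G"],     # G is the goal box with the food dish
    "left":  ["L1", "L2", "L3", "empty"], # empty end compartment
}
reward = {"G": "food"}  # S* occurs only in the goal box

def run_arm(turn):
    """Run one arm of the maze and report what the rat finds at the end."""
    final_segment = t_maze[turn][-1]
    return reward.get(final_segment)  # "food" (S*) or None

print(run_arm("right"))  # food -> only this chain is reinforced
print(run_arm("left"))   # None
```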

Segments of a Goal Box

Latent extinction depends on what happens when the rats are placed directly in the goal box without running there on their own. In the cognitive expectancy view, they just sit in the goal box either eating the food and forming expectancies of reward or experiencing the empty goal box and forming expectancies of nonreward. Experimenters who have enough ethological interest in living animals to observe their experimental subjects in the goal box describe sequences of stimulus-response units. The first stimulus is the entrance to the goal box and the first response is entering. Once in the goal box, rats see the food dish and respond by approaching the food dish. At the food dish, they find food and respond by eating. Then, in S-R-S* theories, the stimulus effects of eating provide the S* that reinforces the whole chain that began when the rat left the starting box and includes the right turn at the choice point. This is still a very simplified version of what actually happens, but it is sufficient for the present analysis. Figure 6.3 is a diagram of this simplified version of the behavior chain in a goal box.

FIG. 6.2. Segmental analysis of S-R units in a T-maze. Copyright © 1997 by R. Allen Gardner.

According to the goal gradient principle, the first thing a rat learns is to find food in the food dish at the back of the goal box. This is the strongest response. Next the rat learns to go straight to the food dish after entering the goal box; next to enter the goal box, and so on. Placed in the goal box with an empty dish, rats in a latent extinction experiment get immediate and concentrated extinction of the last response, approaching the food dish. Extinction of this anchor point in the chain of responses is the most important factor in extinction. This view of latent extinction has specific experimental implications.
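
One way to make the anchor-point argument concrete is a toy calculation. This is not the author’s formal model; it simply assumes that the performance of the whole chain depends on support flowing back from its final links, so that weakening the anchor weakens everything before it:

```python
# Toy model: per the goal gradient, links closer to S* are stronger.
strengths = {"enter_goal_box": 0.8, "approach_dish": 0.9, "eat": 1.0}

def chain_strength(strengths):
    # Treat the chain as only as strong as the product of its links,
    # so weakening any one link weakens the whole chain.
    total = 1.0
    for s in strengths.values():
        total *= s
    return total

print(round(chain_strength(strengths), 2))  # 0.72 with the chain intact

# Latent extinction: the empty dish extinguishes approaching the dish.
strengths["approach_dish"] = 0.1
print(round(chain_strength(strengths), 2))  # 0.08; the whole chain collapses
```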

Several experiments tested the implications of this bottom-up view of latent extinction (see Coate, 1956, and Kimble, 1961, for detailed reviews). In one type of experiment (Moltz, 1955; Moltz & Maddi, 1956), two groups of rats spent time in the goal box without finding food. One group found the usual food dish in the goal box, but it was empty. The second group found an empty goal box containing neither food nor food dish. The extinction test, running the maze from start to goal without food reward, was carried out in the usual fashion; both groups found the usual food dish, but it was empty. The group that had found an empty food dish in the goal box during preextinction extinguished the turning response at the choice point much more rapidly. Moltz reasoned that the response of approaching the food dish is what extinguishes in so-called “nonresponse” extinction. In this way, confinement in the goal box with an empty food dish extinguishes the anchor point and weakens the whole chain. If the food dish is absent, a rat cannot extinguish the response of approaching the food dish, which preserves the vital anchor and increases resistance to extinction.

FIG. 6.3. Chain of S-R units in a goal box. Copyright © 1997 by R. Allen Gardner.

Coate (1956) devised an ethologically more explicit demonstration of the response in nonresponse extinction by constructing a modified Skinner box. In Coate’s experiment, the apparatus delivered pellets into a food dish, as usual. Instead of placing the food dish inside the box, however, Coate placed the food dish outside the box. Rats could reach the food by poking their heads through a hole in the wall under the lever. There was a curtain over the hole that prevented the rats from seeing the food before they poked their heads through the curtain. The food magazine, an apparatus that delivered pellets one at a time into the food dish, sounded a distinct click each time it operated.

Coate first conditioned the animals to poke their heads through the hole when they heard the click of the magazine. They could poke their heads through the hole at any time, but they only found food after hearing the click, and soon poked only after a click. Next, Coate trained the rats to press the lever, which operated the food magazine, on a VI 36 schedule of food reward. During training, the rats pressed the lever at the usual high rate generated by VI 36 schedules; they would stop pressing the lever when they heard the click, then poke their heads through the hole, and then eat the pellets they found in the food dish. After six days of this training, Coate divided the rats into two groups matched on their rates of lever-pressing during acquisition. That is, performance during acquisition was nearly equal for the two groups.

For the next three days, all of the animals received preextinction in the original apparatus with the lever removed to prevent any preextinction of lever-pressing itself. During preextinction, the experimental group heard the click of the food magazine at roughly the same rate that they had heard it during training, but the magazine never delivered any pellets. The control group spent the same amount of time in the apparatus, but the food magazine never operated and never made a single click during their preextinction. At first, when they heard the clicks, the experimental group poked their heads through the hole, but they soon extinguished that response. Without the stimulus of the click, the control group rarely poked their heads into the hole. Coate then extinguished both groups with the lever present and with the food magazine operating on the VI 36 schedule, but the magazine was empty so that it only made clicks and never delivered pellets. The design of Coate’s experiment appears in Table 6.1. The results were clear. The experimental group that received preextinction of the clicks extinguished very rapidly compared to the control group.

TABLE 6.1

Design of Coate’s (1956) Preextinction Experiment

Group          Training (6 days)          Preextinction (3 days)           Extinction
Experimental   Lever; VI 36; click+food   No lever; clicks, no pellets     Lever; VI 36 clicks, no pellets
Control        Lever; VI 36; click+food   No lever; no clicks, no pellets  Lever; VI 36 clicks, no pellets

Coate’s (1956) result, together with the results of Moltz (1955) and Moltz and Maddi (1956), who induced latent extinction in a T-maze like Deese’s as well as in a straight-alley runway, clearly implicates the response to the food dish as the significant factor in latent extinction. The extinction was latent only when experimenters refused to look at the actual behavior of the living animals. When experimenters like Coate observed and recorded what the animals were doing during latent extinction, they saw manifest rather than latent extinction. Responses to particular stimuli extinguished before their eyes. Coate, Moltz, and Moltz and Maddi all interpreted their results in terms of secondary reward. They reasoned that the food dish in the mazes and the click of the food magazine became secondary or conditioned rewards by association with the food. According to reinforcement theory, it is this secondary reward effect that extinguished in nonresponse extinction. Chapter 7 takes up the question of secondary reward and discriminative stimuli in great detail.

Studies of latent extinction stimulated experiments that separated out parts of the chain of responses leading up to reward. Although bottom-up systems had assumed the existence of such chains for a long time, the evidence appeared when the latent learning controversy challenged these assumptions.

DRIVE DISCRIMINATION

Obviously, animals must be able to tell whether they are hungry or thirsty; otherwise, they might eat when they were thirsty and drink when they were hungry, which would be clearly maladaptive. That said, what is the role of learning? Can animals learn to go to a particular place to eat when hungry and to a different place to drink when thirsty?

To answer this question, Kendler (1946) deprived rats of both food and water at the same time and gave them equal experience running in both arms of a T-maze. At the end of one arm there was food to eat and at the end of the other there was water to drink. In this phase of the experiment the animals were both hungry and thirsty, but they found food in the goal box on the food side and water in the goal box on the water side. On two test days, Kendler deprived the animals of only one incentive, either food or water on alternate days, and gave them a single test trial on each test day. On thirsty days, 98% of the animals ran to the water side of the maze; on hungry days, 73% ran to the food side. Clearly, the animals learned which side had food and which side had water and responded appropriately on test days.

Contingencies were the same on both sides during training because the animals were both hungry and thirsty and received either food or water on every trial. They received reward for going right just as often as they received reward for going left. Contingent reinforcement theory seems to say that they should have gone equally to both sides on the test days. Does Kendler’s experiment prove instead that the rats had expectancies or images of food and water in their minds to guide them in their choices? Did cognitive images dance in their heads like visions of sugar plums on the night before Christmas?

The trouble with expectancies is that they are confined to the minds of learners, unobservable by definition and knowable only by their end effects. Kendler attributed his results, instead, to anticipatory responses and response-produced stimuli that were observable in principle, although they had not yet been observed when he wrote.

During the training phase of Kendler’s experiment, the rats always ate in the goal box at the end of one arm of the T-maze and drank in the goal box at the end of the other arm. Kendler reasoned that prefeeding responses, like salivation, became conditioned to stimuli in the arm on the food side and predrinking responses, like licking, became conditioned to stimuli in the arm on the water side. Anticipatory eating and drinking responses, like all responses, must themselves produce stimuli, which become part of the stimulus complex in each arm. These are the response-produced stimuli discussed in chapter 4 and earlier in this chapter and illustrated in Figs. 4.3 through 4.6 and Figs. 6.1 and 6.3. When the animals ran in the food arm, they experienced the stimulus complex of the food arm of the maze plus stimuli produced by prefeeding. When they ran in the water arm, they experienced the stimulus complex of the water arm plus stimuli produced by predrinking.

During the test phase of Kendler’s experiment, hunger made prefeeding responses more likely on hungry days and thirst made predrinking responses more likely on thirsty days. As a result, when the animals found themselves at the choice point on hungry days, prefeeding responses evoked stimuli for running in the food arm, while on thirsty days predrinking responses evoked stimuli for running in the water arm. Notice that this is a bottom-up, feed-forward principle. Hunger evokes prefeeding and thirst evokes predrinking.
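
Kendler’s feed-forward account lends itself to a small sketch. The names and the matching rule here are hypothetical; the point is only that the drive adds response-produced stimuli to the stimulus complex at the choice point:

```python
# During training, with both drives present, each arm's stimulus complex
# included the stimuli produced by anticipatory eating or drinking there.
conditioned = {
    "food_arm":  {"arm_cues", "prefeeding_stimuli"},
    "water_arm": {"arm_cues", "predrinking_stimuli"},
}

def anticipatory_stimuli(drive):
    # Feed-forward step: hunger evokes prefeeding, thirst evokes predrinking.
    return {"hungry":  {"prefeeding_stimuli"},
            "thirsty": {"predrinking_stimuli"}}[drive]

def choose(drive):
    # The rat runs in the arm whose training stimulus complex best matches
    # the stimuli present at the choice point now.
    present = {"arm_cues"} | anticipatory_stimuli(drive)
    return max(conditioned, key=lambda arm: len(conditioned[arm] & present))

print(choose("hungry"))   # food_arm
print(choose("thirsty"))  # water_arm
```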

The alternative, as usual, is cognitive expectancy. Animals ran to the food side when they were hungry because they had learned to expect food there. They ran to the water side when they were thirsty because they had learned to expect water there. This is the same as saying that the animals learned to go to the food side because they learned to go to the food side, and likewise, that they learned to go to the water side because they learned to go to the water side. As usual, cognitive expectancy only restates the results of the experiment after the results are in. At the time, the trouble with Kendler’s line of reasoning was that anticipatory eating and drinking responses in the arms of a T-maze existed only as conjectures in Hull’s theory. They had never been observed in any experiment. Unlike the expectancies of Tolman’s theory, however, anticipatory eating and drinking are in principle observable.

Deaux and Patten (1964) used the device in Fig. 6.4 to observe predrinking responses directly as rats ran freely through a straight-alley runway.

At first, as Fig. 6.5 shows, the rats licked the tube at roughly the same low rate in all segments of the alley. In later trials, however, they licked more and more as they ran: the closer they came to the goal box, the more they licked, just as Hull and Kendler had predicted many years earlier. The curves in Fig. 6.5 directly confirm Hull’s and Kendler’s conjecture about anticipatory responses conditioned to stimuli that regularly appear before an S*. Deaux and Patten’s findings confirm by direct observation the chains of stimuli and responses discussed in chapter 4 and earlier in this chapter and illustrated in Figs. 4.3 through 4.6 and Figs. 6.1 and 6.3. Cognitive expectancies remain unobserved.

Other modern, ethologically rich observations of behavior in the Skinner box also confirm the existence of fractional anticipatory responses. Using high-speed motion picture photography and videotape of pigeons in a Skinner box, Jenkins and Moore (1973) recorded the precise form of each key-pecking response. Sometimes the pigeons in this experiment were deprived of food and rewarded with grain, and sometimes they were deprived of water and rewarded with water. Jenkins and Moore found that, when pecking the key for grain, pigeons held their beaks in a shape and pecked with a force related to the responses made in eating grain. At the same time they found that key-pecking for water partially resembled drinking. This is exactly what should happen if prefeeding becomes conditioned to a key in the Skinner box that regularly precedes eating, and predrinking becomes conditioned to a key that regularly precedes drinking.

FIG. 6.4. Rats in Deaux and Patten (1964) wore a harness that held a drinking tube in their mouths. Each lick at the end of the tube activated an electronic device called a lickometer, which recorded all licks, but delivered water only when the rat was in the goal box. Copyright © 1964 by Psychonomic Society, Inc. Adapted by permission.

FIG. 6.5. Anticipatory licking in a runway. From Deaux and Patten (1964). Copyright © 1964 by Psychonomic Society, Inc. Adapted by permission.

In a series of experiments described in detail in chapter 8, Timberlake (1983) delivered a small steel ball into a Skinner box just before delivering a pellet of food or a drop of water. After repetition of the sequence ball-food or ball-water, his rats handled the ball when it preceded food and mouthed it when it preceded water, with responses that partially resembled their responses to the food or the water when these arrived.

J. H. Hull (1977) made videotape records of rats pressing levers; sometimes Hull deprived the rats of food and rewarded them with pellets of food and sometimes he deprived them of water and rewarded them with drops of water. The tapes showed frontal views of the rats pressing the levers, but the animals were out of view when they lowered their heads to eat or drink their rewards. After watching live rats eat and drink, six students viewed the tapes and judged when the experimental rats were pressing for food and when for water. The inexperienced students were correct in 96% of their judgments. Clearly, stimuli that regularly precede S* in instrumental conditioning evoke distinctive fractional anticipatory responses, and animals incorporate these anticipatory responses into instrumental behavior such as running in mazes, pecking keys, and pressing levers (see also J. H. Hull, Bartlett, & Hill, 1981).

CONDITIONED REJECTION

Fractional anticipatory responses also play a role in avoiding poisons. Poisonous plants and animals taste bad, usually bitter; this is a good defense against predators. Usually, poisonous plants and animals also look conspicuous. Fiery reds are common and so are showy patterns like the distinctive orange and black wings of the Monarch butterfly. Monarch butterflies eat milkweed plants, which contain a cardiac poison. Birds will die of cardiac arrest if they eat too much Monarch butterfly. Fortunately, the poison tastes so bad that they usually spit out the first mouthful and vomit up anything they may have swallowed. Just the taste of the wings as a bird first catches a Monarch is so bad that birds usually quit right there and release the victim.

This is a better defense than killing a predator with poison. First, many Monarchs escape with minor injury that way. More important, the last thing that a bird sees as it spits out whatever got into its mouth is the orange and black pattern of Monarch wings. The next time that bird sees the same distinctive stimulus, it spits before it bites, or perhaps it hesitates just long enough for a Monarch to escape (Brower, 1984, 1988). In this way, one taste of one Monarch defends many other Monarchs from that bird. Death would accomplish the same thing, of course, but a live bird feeds on other species of bugs that often compete with Monarchs for food. That bird also competes with other birds that prey on Monarchs. Thus, a live predator that has been conditioned to reject Monarchs is better than a dead predator, for the overall survival of Monarchs.

Conditioned rejection is also good for the overall survival of the predator. If poisonous food has a distinctive taste, an animal can spit it out immediately without serious consequences. If poisonous food also looks or smells distinctively different from other foods, an animal can learn to reject it before tasting it again. Viceroy butterflies are a different species from Monarchs, but they have very similar orange and black patterns on their wings. Brower also demonstrated that birds that had tasted a single Monarch in the laboratory not only rejected all other Monarchs but also rejected Viceroys.

Sooner is better than later for conditioned rejection. Extended trials of repeated experience to build up statistical contingencies of reinforcement or expectancy would cost animals more discomfort and more danger of serious consequences than immediate conditioning in a single trial.

In an extensive series of laboratory studies of conditioned taste aversion, experimenters have made rats sick with poisons, usually lithium chloride, and experimentally paired poisoning with an innocuous novel taste such as water sweetened with saccharine. After recovering from the illness, rats get a choice between two water bottles, one with the experimental flavor and the other with plain water or with a control flavor. Conditioned rats drink dramatically less of the experimental flavor. Only one trial is sufficient for this form of conditioned rejection. In agreement with the evidence on single trial conditioning in chapter 4, temporal relations between the CS and the UCS are much more variable than expected from traditional Pavlovian principles (Domjan, 1980).
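
A toy delta-rule calculation, with frankly invented learning rates, illustrates the contrast between one-trial taste-aversion conditioning and the slow accumulation of strength expected from traditional incremental principles:

```python
# Toy contrast: a single taste-illness pairing can carry associative
# strength close to asymptote; an ordinary CS needs many trials.
def condition(rate, trials, asymptote=1.0):
    strength = 0.0
    for _ in range(trials):
        strength += rate * (asymptote - strength)  # simple delta rule
    return strength

print(round(condition(rate=0.95, trials=1), 2))   # taste aversion: 0.95
print(round(condition(rate=0.10, trials=1), 2))   # ordinary CS: 0.10
print(round(condition(rate=0.10, trials=20), 2))  # after many trials: 0.88
```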

Most early studies of conditioned taste aversion only measured the amount that the rats drank from the experimental and control drinking bottles. Modern studies take advantage of inexpensive videotape recording to observe what the animals actually do when they reject fluids (Meachum & Bernstein, 1990; Parker, 1988; Parker, Hills, & Jensen, 1984; Zalaquette & Parker, 1989). Usually, experimenters administer poison directly by intraperitoneal injection. The resulting illness evokes characteristic behavior in rats, such as rubbing their bellies on the ground or stretching their limbs. These are unconditioned responses to this sort of poisoning. Later, the experimental flavors evoke fractions of these unconditioned responses including responses that are appropriate to a poisonous taste in the mouth, such as rubbing their chins against the floor or gaping their mouths open; these are conditioned responses to otherwise innocuous flavors. It is easy to see how these conditioned responses reduce drinking so dramatically. Modern ethological observations certainly vindicate Kendler’s and Hull’s early conjecture that choices depend on anticipatory responses to food and water that appear before actual eating or drinking.

Before the current movement to study conditioned rejection through direct observation, Rescorla (1987) proposed a return to traditional cognitive speculation. Where Deese (1951) rewarded rats in the goal box for turning right or left in a T-maze, Colwill and Rescorla (1985a, 1985b, 1986) rewarded rats for pressing levers or for pulling chains in a modified Skinner box. They rewarded one response, either lever-pressing or chain-pulling, with saccharine-flavored food pellets and rewarded the other response with ordinary food pellets. Then, instead of nonresponse extinction as in Deese (1951), they paired the saccharine flavor with lithium poisoning. In an extinction test conducted after poisoning, the rats markedly reduced either lever-pressing or chain-pulling, depending on which had formerly earned saccharine-flavored pellets. Just as Deese attributed nonresponse extinction in a T-maze to loss of cognitive expectancy in the goal box, Rescorla (1987) attributed selective extinction after poisoning to a cognitive factor called “devaluation.”

Rescorla’s (1987) analysis depends entirely on Skinner’s (1938) distinction between elicited and emitted responses. There Skinner (1938, p. 20) maintained that in classical conditioning, which he renamed Type S or respondent conditioning, there is always a stimulus, UCS, that initially elicits a particular response, UCR. In instrumental conditioning, on the other hand, responses such as pressing a lever seemed to Skinner to appear spontaneously—to be emitted without an initiating stimulus. He renamed this Type R or operant conditioning. Skinner argued throughout his long life that, if operant responses are emitted, then the proper formula for contingent reinforcement is R-S* rather than S-R-S*. Rescorla’s version of this principle is: “According to this third view … the basic structure of instrumental learning is response-reinforcer [that is, R-S*] in nature” (1987, p. 120).

In a long life of writing on this subject, Skinner offered only one argument to support his R-S* formula for instrumental conditioning. This was his personal inability to identify or even to imagine a stimulus for lever-pressing or key-pecking. For decades his followers subscribed to this view without producing any other evidence. In scientific circles this is called the argument from “lack of imagination.” In this case, there cannot be a stimulus for the response in instrumental conditioning because B. F. Skinner and his followers cannot imagine such a thing. As most readers should know, the history of modern science is largely a history of discoveries that were once beyond the imagination of innumerable learned, even distinguished, scholars.

In the case of the R-S* formula, plain facts must replace Skinner’s imagination. Rats do not press the air in the Skinner box; they press the lever. Pigeons do not peck the air in the Skinner box either; they peck the key. Taken literally, the R-S* formula implies that pressing and pecking randomly increase with reinforcement and only lead to lever-pressing and key-pecking accidentally when these stimulus-independent responses happen to engage the levers or keys of an operant conditioning chamber.

In the case of the Colwill and Rescorla experiments, the R-S* formula implies that rats emit lever-pressing and chain-pulling into the air and these responses engage the lever or the chain, adventitiously. Clearly, rats respond to levers by pressing and to chains by pulling. This is impossible unless levers and chains have stimulus differences. Colwill and Rescorla (1986) attempted to answer this objection with a Skinner box that contained neither a chain nor a lever, but only a single rod suspended from the ceiling. For half of the rats in this experiment, pushing the rod to the right earned saccharine-flavored rewards and pushing it to the left earned unflavored rewards. For the other half the left–right contingency was reversed. After the experimenters paired saccharine with lithium chloride, the rats markedly reduced either the right or the left movement of the rod, whichever had earned saccharine-flavored rewards. Colwill and Rescorla argued that there was only one stimulus, the single rod, that could precede right and left pushes; hence their results implicate an R-S* formula rather than an S-R-S* formula.

Like their lever-pressing versus chain-pulling experiments, the single-rod experiment of Colwill and Rescorla (1986) demands that the rats emit right and left pushing responses at random and only engage the rod by chance. After the alleged cognitive devaluation of saccharine by pairing with lithium chloride, the rats emit less of one response than the other. This blinkered view may appeal to experimenters who refuse to look at the rats in the box while they theorize about increases and decreases in a single index of response. Worse still, this restrictive view cannot cope with sequential behavior. Applying the R-S* formula to experiments with single-unit T-mazes like Kendler’s (1946) or Deese’s (1951), we have to say that there is only one stimulus, the single choice point, and rats make correct choices by emitting either right or left turns at that single choice point. But what can we say of mazes with sequences of multiple choice points, as in Tolman and Honzik (1930), shown in Fig. 5.6? Rats cannot respond correctly in a maze with a sequence of choice points if they only learn one response for one type of reward as in the R-S* formula. The only way that they can learn to go to the right at certain choice points and to the left at certain others is by responding to stimulus differences that distinguish one choice point from another. The R-S* formula is trapped in the Skinner box and cannot cope with the vast amount of animal behavior that involves sequences of appropriately different responses to different stimuli.
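
The representational point can be stated concretely with a sketch using hypothetical choice points. An R-S* account stores response strengths with no stimulus index, so it must give the same answer at every choice point; an S-R-S* account puts responses under stimulus control, which is what a maze with a sequence of different choice points requires:

```python
# Under R-S*, response strength carries no stimulus index, so every
# choice point gets the same answer.
r_s_star = {"turn_right": 0.7, "turn_left": 0.3}
print(max(r_s_star, key=r_s_star.get))  # turn_right, at every choice point

# Under S-R-S*, different choice points can evoke different correct turns.
s_r_s_star = {
    "choice_point_1": "turn_right",
    "choice_point_2": "turn_left",
    "choice_point_3": "turn_right",
}
for sa, ra in s_r_s_star.items():
    print(sa, "->", ra)
```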

In the case of conditioned taste aversion, conditioned rats continue to drink. They even continue to drink from drinking bottles that are identical to the drinking bottles that they reject. They only reject drinking bottles that taste of the experimentally flavored liquid. Birds that have tasted Monarch butterflies continue to attack and to eat butterflies. They only reject butterflies that look like Monarchs. Perhaps the most characteristic feature of conditioned rejection is that it is stimulus specific.

Rescorla’s (1987) analysis of conditioned taste aversion is a return to Skinner’s (1938) behaviorism and Deese’s (1951) cognitivism. This is an honorable tradition that continues to attract supporters. This book would rather point readers toward modern ethological and ecological studies of taste aversion. With inexpensive videotape recording, experimenters can look into the conditioning chamber and observe what the animals are doing. They can supplement a traditional single index like lever-pressing or key-pecking with rich descriptions of ongoing behavior. They can discover new patterns of behavior and open up the field to new interpretations of earlier research. For example, by watching the animals she injected with different poisons, Parker (1993, 1996) found different patterns of response to injection, to the place paired with injection, and to the taste paired with injection. She also found different patterns of response that depended on whether the drug was a nauseating emetic like lithium chloride or a psychoactive drug like lysergic acid diethylamide (LSD). Parker also showed how ethological patterns of response have significant implications for human problems such as anorexia and drug addiction.

SUMMARY

Modern ethological observations reveal that fractional anticipatory responses play a vital role in common forms of sequential behavior. Unlike the hypothetical mechanisms of reinforcement and cognition, fractional anticipatory responses appear in direct observations of actual animals. Detailed observations of reward and extinction in T-mazes, runways, and Skinner boxes as well as conditioned rejection of poisons show the same pattern of conditioned anticipatory responses.

These direct ethological observations contradict the traditional view that classical and instrumental conditioning represent distinctly different forms of conditioning. Clearly, instrumental conditioning is impossible without regularly repeating sequences of stimuli and responses followed by an S*. A distinctive arbitrary stimulus, Sa, always precedes the S* at the end of a chain of instrumental responses and this pairing of Sa and S* plays a vital role in instrumental conditioning (Bolles, 1988; Plonsky, Driscoll, Warren, & Rosellini, 1984). This vital pairing also satisfies the defining operations of classical conditioning. Instrumental conditioning is impossible without the defining experimental operations of classical conditioning.

Apart from procedural differences, is there any value to the traditional distinction between classical and instrumental conditioning? The next chapters pursue this question as it arises in the study of several basic phenomena of conditioning and learning.
