7

Learning to Research about Learning

Herbert A. Simon

Carnegie Mellon University

 

The purpose of this chapter is not to introduce wholly new themes into the discussion that is pursued throughout this volume, but to highlight some general implications from the work discussed in the other chapters about directions for learning research, and how our understanding of cognition can define these directions.

LEARNING AS THE LINK

The connection between cognition and instruction lies, of course, in learning; and for at least two very good reasons, learning has been a central topic in psychology for the whole century of that discipline's history.

First, learning is one of the most important activities in which human beings engage, occupying a very large fraction of their lives and absorbing a substantial fraction of the national income (including, of course, the child labor that is invested in education instead of immediately productive occupations). But learning is also important, and perhaps most important, because it is central to the whole psychological enterprise. The survival of living creatures is tied tightly to their adaptability, and one of the two most powerful mechanisms for adapting to an environment is learning; genetic change is the other.

As a consequence of the success of the human species in evolving its learning capabilities, almost all human behavior, from an early age, is learned behavior, and it changes whenever what is learned and stored in memory changes. I am not ignoring or denying the role that nature plays as the foundation for nurture, but simply observing that if the task of science is to find invariant characteristics in its objects of study, many, if not most, of the human invariants do not lie in what is learned, which can exhibit the greatest diversity, for we can learn almost anything, sense or nonsense. The invariants lie mainly in the learning mechanisms themselves, along with the basic mechanisms for sensing, perceiving, and manipulating symbols in the nervous system (which we call thinking) and emoting.

This is why learning theory has played such a central role, perhaps the most important, in psychological research and theory. Of course, learning is also not a strict invariant in an adaptive organism, for we can move up to the first metalevel and learn to learn. I do not think that anyone doubts today that there are more efficient and less efficient ways of learning—for example, learning with understanding versus rote learning. Nor is there any doubt that better ways of learning can be learned, like any other forms of behavior, and become habitual.

That brings us to the second metalevel: learning how to teach people how to learn, that is to say, learning to research about learning, which is the title of my talk and the theme of this conference. Research on learning, as we are all aware, is a complicated affair because learning is a complicated process, or, more accurately, a congeries of complex processes. The task of research on learning is to understand all of these processes, and to understand how we can facilitate them.

THE PROCESSES OF LEARNING

The structure of this volume reflects these complexities.

  1. Much learning takes place in childhood, a period, and in many respects the most crucial one, during which the child is not only learning but also developing and maturing in an amalgam of biological and psychological changes that continue into adolescence. However much Piaget may have been right or wrong about the specific facts of developmental stages, he taught us that we must study the biological maturation of the infant and child as an essential part of our study of learning. So the relation between development and learning becomes our first theme.
  2. Learning is often carried out in a structured environment of facilitators (at least, intended facilitators) called teachers and tutors, and nowadays, computer tutors and a whole host of other communication devices that have progressed well beyond the blackboard. So the role of teachers and their strategies must be another of our themes, but never in isolation from our study of the learner.
  3. Learning is social. Learning takes place not only in schools or other formal settings, but also in every encounter the learner has with the natural and social environments. It begins at birth, so that the child has already acquired substantial skills of locomotion, recognition and manipulation of objects, and oral language long before beginning formal education. This early learning most likely has a greater aggregate importance, even in a school-saturated society like our own, than the subsequent formal learning, for it shapes the learning processes themselves, and the frameworks within which new knowledge is organized.

    If we observe any peasant society, we see an enormous amount of learning taking place without the intercession of schools at all, written language being perhaps the most nearly schoolbound of the basic human skills. We should not make the mistake of identifying learning exclusively with schools. In the future, the association of the two may become closer, or as many computer enthusiasts think, much more distant. We should also not make the mistake, as is sometimes made in the more extreme constructivist circles, of supposing that learning cannot occur, and even occur efficiently, in schools.

    What do we mean by the social context of learning? In speaking of learning as a social process, “social” means more than “having people around.” In a modern society, perhaps the most important social influences on learning are the book and other forms of written communication that do not involve having people around at all—at least at the moment of learning. Putting learning in books does not desocialize it.

    Whether from books or people, at least 90% of what we have in our heads (I haven't measured it, that's just a guess) is acquired by social processes, including watching others, listening to them, and reading their writings. Hence, there is no such thing as individual psychology. If we except the rare feral person, all psychology is social. Among the most social people are the bookworms, who go off to hobnob with the scribblers, and who know the world, not through direct contact but mainly by way of its interpretation in books. Others know it by listening to and watching other people. They learn about different worlds, I'm sure, and it's not clear which is more like the real world, but all are learning, and learning socially.

    Nor do the scribblers whom the bookworms read or the storytellers we all listen to get most of their information from the external world. The storyteller is a transmitter of tales told by previous storytellers, the preserver of the collective memory. And the traveler is the transmitter of information largely obtained by listening to the inhabitants of other lands. Of course the traveler also looks about (mostly at other people and their artifacts), but is deeply dependent on the local informants to interpret what she or he sees.

  4. Learning uses tools. Human tool-building propensities are applicable to learning just as they are applicable to the other aspects of human activity. I have already mentioned three of the most important tools of traditional formal learning systems: written language (including books), teachers, and the blackboard. Today, there are many more, as I have mentioned.

    This volume includes many examples of the use of overhead projectors and videotape as learning tools for ourselves, and could have included a productive discussion of the use and misuse of such tools; for example, on the conditions for their compatibility with the limits of human serial, one-at-a-time attention, for the division of labor and coordination between eye and ear, and for the use of the learner's Mind's Eye as a powerful visual display (Tabachneck-Schijf, Leonardo, & Simon, 1997). In learning to research on learning, we must both research the characteristics of these tools, and use these tools in our research, as we are already beginning to do.

    With the tools we have today, especially those that can themselves participate in communication, we can aspire to the same kinds of enhancement of learning efficiency that we have achieved historically with tools and machines in a host of other productive activities. Notice that I said “aspire.” Today, we are not even close to approaching the potential.

    We need also to be alert, using foresight as well as hindsight, to the many side effects, desirable and undesirable, that new tools invariably bring with them, some of them of societywide and worldwide impact. Today, we are on the brink of the “industrialization” (is that the right word? or “mechanization,” or “automation”?) of education. Whatever we call it, the “information revolution” is going to have effects over the course of the next century or two—some good, some bad—as sweeping as those of the industrial revolution. These potential effects need to be part of our research agenda, as they were not in the agendas of those who introduced the automobile or the airplane—or the World Wide Web.

  5. These are the four main topics around which this volume is organized. However, at least one important topic is absent: the natural environment, which, although I have just been downplaying its importance relative to the social environment, gives very many welcome and unwelcome lessons to all of us (touchings, you might say, of many pleasantly warm and many unpleasantly hot stoves). One of the striking differences between a rural and an urban society is that the former provides all sorts of ways of learning from the natural environment of the farm that are not easily available to city dwellers. Most city kids have very little experience with tools of any kind, unless you consider computers and TV knobs to be tools. Perhaps, but I am not sure of it, this is compensated by broader and more varied social experience.

We especially need to attend to what it is we must learn about the interactions between the natural and the social environment in a world where we humans are so numerous and so powerful that we can produce, quite rapidly, enormous changes for better, and more frequently for worse, in our natural environment. The consequent problems surely require social learning, which calls on the resources of all of our sciences, natural, behavioral, and social, for the required knowledge. In any case, the natural environment also provides important resources and needs for learning, but it is not represented here, and we will need another volume to deal with it.

THE FIRST LAW OF LEARNING

Now we come to the substance of the matter: how we can learn to research on learning, and—not to put the matter wholly into the future—how we have already been learning to do such research. In reading the chapters of my colleagues, I am struck by the fact that most of the cognitive ideas they are using in their educational research and practice are quite general principles that have been around for a long time and that are not couched in any very precise theoretical terms. They might even be mistaken for common sense.

In the chapters on learning tools, we see examples of much more specific and precise use of cognitive theory in education. However, we should not dismiss our heritage of learning about learning just because much of it sounds like common sense. That is the fate of all good theory as it becomes thoroughly understood (like the law of falling bodies, which was transformed in about a century from utter nonsense to obvious common sense).

Therefore, before offering a glimpse of some of the more contemporary trends in cognitive research, I'd like to make explicit the theoretical core of the common sense that I believe my colleagues have been using as the basis for the educational procedures they describe.

The long-established first principle, the foundation stone of the entire enterprise, is that learning takes place inside the learner and only inside the learner. Learning requires changes in the brain of the learner. We may not understand fully or even approximately what these changes are, especially at the neural level, but we must have some accurate and principled way of characterizing them at least at the symbolic level. If not, we know almost nothing about learning, and can do little to enhance it. The contents of textbooks, the lectures or tutorial activities of teachers, the humming of computer tutors, the murmurings of classroom student discussion and study groups—all of these are wholly irrelevant except insofar as they affect what knowledge and skills are stored inside learners' heads (and sometimes in their fingers and toes).

I must, as a passing remark, observe that during the 1930s, Carnegie Tech (as CMU was then called) was a national leader in introducing into engineering education the so-called “Carnegie Plan,” which was squarely based on this principle—that learning depends on what the learner does, and only on what the learner does—and on a focus on problem-solving skills. Neither of the pioneers who accomplished this, Provost Elliott Dunlap Smith and President Robert Doherty, was by professional training a psychologist, but, wherever they learned this lesson, they were superb learners.

The implication of a learner-centered learning theory (it sounds obvious when you put it that way) is that research on learning must aim at understanding the intracranial learning processes and mechanisms. That is where the action is. In the past 100 years, and especially the past 50 years, psychology has come a very long way along this path. It has identified and characterized some major components of the human cognitive system shown in Fig. 7.1: long-term (LTM) and short-term memory (STM), and subdivisions of these, for example, the iconic memory and the “Mind's Eye” as principal components of visual short-term memory; the echo box and auditory STM (that of the “magic number seven”); long-term semantic memory, both episodic and topical; knowledge in the form of associative list structures, and processes in the form of productions and discrimination nets, together with learned domain-specific retrieval structures in LTM that in effect enlarge STM capacity in those domains (Richman, Staszewski, & Simon, 1995).

I am sure you can add to this list, and not everyone's list would be identical with this one (which already defines an important goal for learning research), but a fact-based argument can be made for every box and arrow in the diagram, and there is considerable consensus about the general shape of this picture of memory. Its components are not simply the names of conjectural entities. For each of them, we have substantial data that permit us to estimate its parameters, often within a factor of two or less: capacity, access time, storage time, forgetting rates, and so on. Very important: Once estimated, these are no longer free parameters when we fit the models to data.

Each memory component accounts for a range of learning and memory phenomena that do not make sense without it. The general brain locations of more and more of them are being identified with the aid of data from brain-damaged patients and from EEG, PET, and MRI scans. fMRI, especially, is beginning to provide us with data about the sequence in which the memories function in the performance of particular tasks (Postle & D'Esposito, in press).

Only to the degree that we understand these processes and mechanisms can we expect to influence them in ways that will enhance learning. This does not mean that we need to understand them all at the neural level—that may still be a long way off. For purposes of applying psychology to learning, what is critical is to understand how the system functions at the level of symbols and their processes. The chapter of John Anderson in this volume makes that very specific and clear (chap. 8).

In addition to the progress that has been made in identifying and characterizing the functional components of the mind, our research has begun to tell us a great deal about the learning processes themselves and how to improve them. Carrying out an educational program consistently with the

image

FIG. 7.1. Hypothesized memory structures. The left side of the figure represents the principal short-term memories—auditory, visual, and motor—connecting with sensory and motor organs. The right side represents the principal species of long-term memory: (upper left), the network of tests that discriminates stimuli; (lower left), the actions that are triggered by the presence of (internal or external) stimuli; (right) a variety of semantic memory structures, including memory for episodes, for topics (or concepts), and specialized retrieval structures acquired by experts permitting rapid storage of information in their area of expertise. The arrows indicate some of the principal direct connections between components.

basic law of learning requires an understanding of (a) what the learner does during an active learning period, and (b) how the environment can enhance ability and willingness to do it.

As for (a), we shall see that good learners mainly store new information and new processes in memory, organized in such a way as to be evocable and evoked when appropriate for the performance of some task. As for (b), the environment, including teachers, can facilitate the storage process, first, by attracting the learners' attention to relevant information and processes, organized in ways that make them easy to assimilate in the most useful form, and, second, by motivating them to acquire it in a way that will make it readily available for use.

Often, an effective environment will induce learners to spend much time performing tasks that give them practice in accessing the information and processes that they are acquiring, and applying them to specific situations. If you call this “drill and practice,” I will not protest.

A corollary to the basic law of learning is that designing an effective learning environment requires a deep understanding of what knowledge and processes (skills) are needed for performing each class of tasks and how they have to be stored and organized in memory in order to be effective. Thorough task analysis is one prerequisite for the environmental design.

A second prerequisite to designing an effective learning environment is understanding the conditions under which learners are likely to respond positively to it and to carry out the activities that have been planned for them. Without appropriate motivation, learners will allot insufficient time to the task. And even with appropriate motivation, but without some understanding of their own learning processes, they are likely to use ineffectively the time they do spend in learning (Simon, 1994).

Motivation is not well represented in the models that have been constructed of learning processes; rather, it has been assumed (as it is in the design of many laboratory experiments on learning). However, where motivation fits in has become relatively clear. Specifically, the attention-focusing mechanisms determine what motives and drives will, from time to time, gain control of the cognitive processes and set their current goals. Attention, then, is the key mechanism linking cognition with motives and drives. These latter include stimuli that attract or threaten, internal drives (e.g., hunger), and memories that have become tinged with affect and that, when evoked by appropriate stimuli, will direct attention to their concerns.

All of these statements are obvious, I think, and will be accepted without debate. The only thing not obvious is why practice so often deviates from them—why, for example, little effort is spent in formal education helping students to understand what learning processes are effective and to practice these processes, whatever content they may be learning. How much attention do we devote, for example, to examining the impact that our testing procedures have on our students' learning practices? Do the structure and frequency of our tests bias them, for example, toward rote learning or toward learning with understanding?

Rote Learning and Understanding

We would all agree that whatever is learned should be learned with understanding and not by rote. Understanding is sometimes regarded as a rather mysterious state of mind. But we cannot learn about learning without understanding “understanding,” which is one of the favorite mystery words of both pop and academic psychology. Understanding understanding, and using it meaningfully, turns out not to be a hard problem after all. We define understanding in the same way as we define intelligence or skill, by specifying the observable conditions under which we would say that a person exhibits understanding (Simon, 1975).

In broad terms, a person understands some information to the extent that he or she can use it in performing the tasks for which it is relevant. There is no great difficulty in measuring understanding, thus defined. As we can measure it, we can also carry out careful research to discover when our students are learning by rote and when they are acquiring understanding.

Obviously, understanding is a matter of degree. Do I understand the number “3”? Given the task of adding 3 and 4, I may be able to answer, “7”; but given the task of multiplying 3 by 19, I may fail miserably. That means that I understand how to add 3 to another number, but not (at least not in all cases) how to multiply it by another number. Nor does it mean that I understand the sentence, “Three is a cardinal number.” There is no mystery in that, or in any situation where we understand something in some respects but not in others. In sum, rote learning is learning that allows us only to repeat back what we have learned, and learning brings understanding to the extent that it allows us to use what we have learned in order to perform tasks. When other mystery terms like “intuition,” “insight,” “creativity,” are treated in this same operational way, their mystery, fortunately for psychology, goes away too.

Tools for Banishing Mysteries

One of the great advances that cognitive psychology has made in the past 50 years has been to find powerful new ways to banish mysteries of the kinds I have just described. One half of the task of removing the mystery from a psychological concept is accomplished in the way I have just described: by specifying the behaviors to which the concept applies. Our ability to do this is enhanced whenever we find a new technique for observing behaviors. The humble tape recorder, and the verbal protocol that it records, the eye movement camera, and the videotape are among the tools that have been brought to a practical level of application since World War II. Techniques for moving the observation further inside, like EEG, PET scans, and fMRI, are now just beginning to deliver important new information. These will be improved further, and there will continue to be others, operating from the information processing level down to the level of neurons.

The other half of the task of removing the mystery is developing languages that allow us to describe clearly and precisely the complexities of the cognitive system and the processes whereby it accomplishes its learning and problem solving. Newton demonstrated that the invention of the calculus was the necessary (and very nearly sufficient) condition for developing a viable theory of motion and applying it to explain phenomena of the heavens that had been observed for thousands of years. When physicists could describe what they saw in systems of differential equations—Newton's equations of classical mechanics, Maxwell's equations of electromagnetics, Einstein's equations of special and general relativity, Schrödinger's wave equations of quantum mechanics—they possessed the language necessary for reasoning about physical phenomena.

In exactly the same way, computer programming systems have now provided us with the languages that are necessary for reasoning accurately about physical symbol systems, including the human brain and mind. There are only two differences between the two formalisms. First, there is the unimportant difference that computer programs are systems of difference equations (with a finite time cycle), whereas the calculus employs differential equations (with the cycle going to an infinitesimal limit).

The second difference, an important one, is that the calculus requires that the phenomena it represents be expressed in terms of numbers, real or complex; whereas computer programs can admit and encode any kinds of symbolic patterns: numerical, verbal, graphical, pictorial, auditory—just as the brain can. The Physical Symbol System hypothesis (Newell & Simon, 1976) asserts that this is exactly the capability a system needs in order to exhibit intelligence. The empirical evidence for that hypothesis is overwhelming today (Anderson, Corbett, Koedinger, & Pelletier, 1995; Feigenbaum & Feldman, 1963; Langley, Simon, Bradshaw, & Zytkow, 1987; Newell & Simon, 1972; Newell, 1990; Simon, 1996).

The remainder of this chapter illustrates how computer models of learning can help us to think clearly about learning processes and what we can do to facilitate learning. I first use the venerable EPAM program (Feigenbaum, 1960) to illustrate how we can describe what the expert has learned, and how that learning is organized in memory. Then I use a program, ZBIE (Siklóssy, 1972), to explain the processes that children use to learn natural language, and that can be used in natural language instruction. Finally, I show how a production system language, OPS5, can be used as the first step in constructing effective materials (textbooks and problem books or computer tutors) for school learning tasks (Zhu, Lee, Simon, & Zhu, 1996; Zhu & Simon, 1987). In his chapter, John Anderson uses the Act-R program in a similar way to construct a powerful computer tutor (chap. 8).

My claim is not that these programs, in their present form, are complete or wholly correct theories of the phenomena they address, but that they are powerful tools for interpreting the data we have about these matters, for suggesting what new data we need, for testing our explanatory hypotheses sharply, and for designing tasks for learners. They, and a number of others like them, have clearly demonstrated their efficacy, and they need to be added to the armamentarium of tools that we use in improving instruction and learning.

EPAM. EPAM is a theory of human perception and memory, expressed as a computer program, which has been used extensively to explain a wide range of experimental results in human verbal learning (paired associate and serial anticipation learning; Feigenbaum & Simon, 1984), in the structure and content of expert memories (Richman et al., 1995), and in categorization (Gobet, Richman, Staszewski, & Simon, 1997). Figure 7.1 depicts the general structure of the EPAM memory, except that in EPAM, the motor component has never been implemented, and the early stages of perception (prior to encoding the stimulus as a vector of features) are implemented very incompletely.

EPAM models the memory structures and processes at a symbolic, not a neural level. It is important to note that “symbolic” means “in terms of patterns.” The patterns can represent not only linguistic structures (visual or auditory), but also diagrams, pictures, percepts derived from the tactile and olfactory senses, and so on. The theory is neither language-bound nor rule-bound.

EPAM has two main capabilities. First, when presented with stimuli, and feedback of correctness of responses, it can learn to discriminate among stimuli, dividing them into classes to an arbitrary degree of refinement, and can then associate with each class a body of semantic information about the things that belong to that class. It does this by “growing” a discrimination net, with a test at each node to sort stimuli on the basis of some feature, and by storing the semantic information about discriminated stimuli at the leaf nodes. Second, when the net has been learned, EPAM can use it to recognize familiar stimuli and retrieve the information stored about them.
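The growth of such a net can be sketched in a few lines of code. The following is a toy illustration, not Feigenbaum's EPAM; the class names, the feature encoding, and the example stimuli are all invented for this sketch. A stimulus is a small set of feature–value pairs, a new test node is grown whenever two distinct stimuli would otherwise land on the same leaf, and semantic information is stored at the leaves.

```python
# Toy discrimination net in the spirit of EPAM (illustrative only).
# A stimulus is a dict of features; semantic info is stored at leaves.

class Leaf:
    def __init__(self, stimulus, info):
        self.stimulus = stimulus   # stored image of the stimulus
        self.info = info           # semantic information associated with it

class Node:
    def __init__(self, feature, branches):
        self.feature = feature     # the feature this node tests
        self.branches = branches   # feature value -> Node or Leaf

def sort(net, stimulus):
    """Recognition: follow the tests down to a leaf (None if unfamiliar)."""
    while isinstance(net, Node):
        net = net.branches.get(stimulus.get(net.feature))
    return net

def learn(net, stimulus, info):
    """Discrimination learning: grow a new test when two stimuli collide."""
    if net is None:
        return Leaf(stimulus, info)
    if isinstance(net, Leaf):
        if net.stimulus == stimulus:
            net.info = info        # familiar stimulus: update its information
            return net
        # grow a test on some feature that tells the two stimuli apart
        feature = next(f for f in stimulus
                       if stimulus[f] != net.stimulus.get(f))
        return Node(feature, {net.stimulus.get(feature): net,
                              stimulus[feature]: Leaf(stimulus, info)})
    branch = stimulus.get(net.feature)
    net.branches[branch] = learn(net.branches.get(branch), stimulus, info)
    return net

net = None
net = learn(net, {"legs": 4, "sound": "bark"}, "dog: chases cats")
net = learn(net, {"legs": 4, "sound": "meow"}, "cat: chased by dogs")
recognized = sort(net, {"legs": 4, "sound": "meow"})
# recognized.info is now "cat: chased by dogs"
```

Because the two stimuli agree on "legs", the net grows its test on "sound"; each further colliding stimulus adds another branch, which is the sense in which an expert's net comes to hold hundreds of thousands of discriminations.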

EPAM, without alteration of its processes or parameters, has been shown to give an excellent account of the memory capabilities of experts in a number of fields, notably mnemonic experts and chess players, and it generalizes to the innumerable other kinds of human expertise. For example, in the case of chess, it has been shown, by means of an EPAM-like model, that world-class expertise requires the acquisition of a discrimination net with 300,000 or more branches, each of which provides information about a different class of chess positions (“templates”) or features of chess positions (“chunks”; Gobet & Simon, 1998). This number is not surprising, in that it is of the same order of magnitude as the natural language vocabulary of a college graduate, who has certainly put in his or her 10 years and more in acquiring language.

The chess information is usually acquired by humans by studying over a long period of years tens of thousands of positions from chess games, and the merits of various moves in these positions. In terms of our definitions, this material is learned with understanding, and not by rote, because recognizing a familiar template or chunk enables the expert to access information about the goodness or badness of the position and the kinds of moves that are most favorable for continuing the game.

There is every reason to suppose, and a considerable body of evidence, that the expertise, for example, of a physician is organized in a very similar way: a discrimination net distinguishing different symptoms, with information about diagnosis, prognosis, and treatment associated at the leaf nodes (Pople, 1982). Thus, EPAM provides a quite general model of expertise and of what it takes to become an expert.

The expert, according to this model, operates on a “recognize, then act” cycle. Features in the observed situation are recognized to provide access to the associated knowledge relevant for choosing a response. Of course, there is more to it than that. The associated knowledge does not always, or even usually, provide a complete course of action for the next step in the task at hand, but may be input into a search procedure that uses means-ends analysis to find a likely action. So, a unified model of expert performance would have to include a general problem solver capable of heuristic search (a GPS, Newell & Simon, 1972, or a Soar, Newell, 1990) as well as EPAM. There are strong and fundamental similarities between that architecture and the architecture of Act-R, and, I believe, only modest differences.

The interest of EPAM for this conference is in telling us something about the kinds of knowledge that are needed for high-level performance of a task, and how that knowledge needs to be organized in memory. From that, we can draw conclusions about the content and method of presentation of a curriculum of study. For example, we draw from EPAM the conclusion that if we want students to be competent in arithmetic, we ought to provide them with learning materials and activities that will enable them to master the sums table and the times table, but also materials and activities that will enable them to understand when sums and products are appropriate things to compute, and to evoke the tables at those times.

Minstrell's description (chap. 4) of the Diagnoser system for high school physics instruction illustrates how to help students learn both what the appropriate physics concepts are and when to use them in a particular situation, relating the student's knowledge to the response required in that situation.

From models like EPAM and Diagnoser, we obtain an understanding of the relation between drill and practice and skill in application. We can express the relation in the programming languages called “production systems”; that is, languages each of whose instructions is an “if, then” statement. The if consists of a set of conditions, and the then, of a set of actions. Whenever all the conditions of a production are satisfied, the action is taken. We can see that the actions only tell the student what can be done; he or she needs to understand the conditions in order to know when to do it.
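The condition–action relation can be made concrete with a miniature forward-chaining interpreter. This is a toy sketch, not OPS5 (the fact encoding and the two productions are invented for illustration): working memory is a set of facts, and a production fires when all of its conditions are present.

```python
# Toy production system: each production is (conditions, actions), both
# sets of facts. Fire any production whose conditions all hold, until
# working memory stops changing. (OPS5 itself has a much richer
# condition language and conflict-resolution strategy.)

def run(productions, working_memory):
    changed = True
    while changed:
        changed = False
        for conditions, actions in productions:
            if (conditions.issubset(working_memory)
                    and not actions.issubset(working_memory)):
                working_memory |= actions   # take the actions
                changed = True
    return working_memory

# The student must learn the conditions (WHEN to add or multiply),
# not just the actions (HOW to compute a sum or a product).
productions = [
    ({"goal:total-of-parts", "have:3", "have:4"},
     {"do:add", "answer:7"}),
    ({"goal:repeated-quantity", "have:3", "times:19"},
     {"do:multiply", "answer:57"}),
]

memory = run(productions, {"goal:total-of-parts", "have:3", "have:4"})
# memory now contains "do:add" and "answer:7"; the multiply rule never fired
```

A student who has stored only the actions can compute sums on command; only one who has also stored the conditions knows, unprompted, that this situation calls for a sum.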

ZBIE. The program ZBIE (Siklóssy, 1972; its author never disclosed the meaning of the acronym), was constructed by Laurent Siklóssy, inspired by a series of “language by pictures” books developed by the philosopher I. A. Richards as a preferred method of teaching languages. Richards' idea was that second languages should be learned in the same way as first languages are learned; not by translating the words and sentences from and to one's native language, but by learning how to translate visual, auditory, and other sensory experiences directly to and from the new language.

To implement this idea, ZBIE is presented with pairs of stimuli, the first member of the pair being a simple scene (represented internally by a diagram in the form of list structures), the second member, a sentence in the language to be learned. If the sequence begins with very simple scenes and sentences that describe them, ZBIE can begin to match the words in the sentences with objects and relations among objects in the scene. It can, for example, match “dog” with one object in a scene, “cat” with another, and “chases” with the relation in the scene of the dog chasing the cat. Increasingly, after a succession of such pairs, ZBIE can name the objects and relations in scenes correctly, and form sentences that describe the scene.
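
The matching process can be illustrated with a cross-situational learning sketch. This is a deliberate simplification of ZBIE, which works over structured list representations; here a scene is just a flat set of labels, and each word's candidate referents are intersected across successive scene-sentence pairs.

```python
# Sketch of cross-situational word-referent learning, in the spirit of ZBIE.
# Simplification: scenes are flat sets of labels, not ZBIE's list structures.

def learn_lexicon(pairs):
    """Intersect each word's candidate referents across scene-sentence pairs."""
    lexicon = {}
    for scene, sentence in pairs:
        for word in sentence.split():
            if word in lexicon:
                lexicon[word] &= scene      # keep only referents seen every time
            else:
                lexicon[word] = set(scene)  # first exposure: anything is possible
    return lexicon

pairs = [
    ({"dog", "cat", "chases"}, "the dog chases the cat"),
    ({"dog", "ball", "sees"},  "the dog sees the ball"),
    ({"cat", "ball", "sees"},  "the cat sees the ball"),
]

lexicon = learn_lexicon(pairs)
print(lexicon["dog"])  # narrowed to a single referent after two exposures
print(lexicon["cat"])
```

After only three pairs, “dog” and “cat” are pinned to unique referents, while a word such as “sees” remains ambiguous until further pairs discriminate it; this is the gradual narrowing the paragraph above describes.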

ZBIE gradually builds up in memory both a lexicon and relational structures that have very much the properties of a grammar. (For example, examination of its memory structures reveals nounlike and verblike classes, and so on.) In fact, it can be said that ZBIE constitutes a realistic model of what, from a psychological standpoint, a grammar really is; not a body of complete and consistent rules, but a set of structures and procedures in memory that govern language understanding and usage.

Notice that ZBIE gets to the very heart of the idea of understanding, for it relates language to the “real world,” by creating tight links between syntax and semantics. Its semantics are not limited to dictionary meanings (based on other words) of the vocabulary it learns. It learns to understand the intensions of the sentences it reads: the external situations that the sentences denote.

Moreover, ZBIE acquires these connections between words and situations in a way that closely resembles the way in which a child learns its native language—by hearing the language in the presence of the objects and events being mentioned. ZBIE thereby casts a great deal of light on what is involved in language learning, especially at its earliest stages.

At later stages, things become more complicated because, for example, the meanings of new words can be acquired not only from scenes whose features they denote but also from linguistic expressions that relate them to words already known. Thus, you don't need to see unicorns or pictures of them to learn what a unicorn is, or even how you would recognize one if you did see one: “Horse with a single horn on its forehead” would give the idea.

Only very recently has the linguistic community begun to build further on the achievements of ZBIE. Currently, a new effort is under way to extend the ZBIE program with the means afforded by the much larger computer memories and faster processors that are now available. Today, such a program can aspire to attain an adult vocabulary and grammatical skills.

Again, I am less interested in praising the virtues of ZBIE, which was a first effort to address a very complex (and important) problem, than I am in showing how computer models, whether they turn out to be wholly correct or to need major extension, can guide research toward understanding thinking and learning phenomena, and in the interim, can provide an approach to designing learning experiences for students. Operative computer models give clear specifications of what Greeno (1976) called “cognitive objects.” In such running programs as the algebra tutor described by Anderson and Gluck (chap. 8) or in Lovett's (chap. 11) analysis of the cognitive objectives of a statistics course, they provide powerful guidelines for the instructional process.

Learning From Examples: Production Systems. For my last example, I describe a procedure for task analysis that has proved very effective for building curricula in high school algebra and geometry. In this work, a computer model was important in demonstrating the effectiveness of the learning process embodied in the curricula, but a running model of the entire program was not actually needed to develop the curricular materials.

In the late 1970s, David Neves (Neves, 1978) wrote a program, in the production-system language OPS5, that was capable of learning to solve linear equations of the sort one encounters in elementary algebra. I earlier defined a production system as a computer language each of whose instructions takes the form: “If condition C is satisfied, then take action A.” It also contains priority rules so that if, at any given time, the conditions of more than one of its instructions (productions) are satisfied, it will execute a particular one of them. When Neves' program was presented with a worked-out example of solving an equation, it would examine the example to discover what condition motivated each step in the solution, and what action was taken in response to that condition. The following example will make the process clear.

7x - 12 = 4x + 9
7x = 4x + 21
3x = 21
x = 7

We notice that the second line was obtained by adding 12 to both sides of the first line, thereby getting rid of the numerical term on the left side. Similarly, the third line was obtained from the second by subtracting 4x from both sides, getting rid of the term in x on the right side. Finally, the fourth line was obtained from the third by dividing both sides by 3, putting the expression in the desired form “x equals a number.” At each step, a difference was noted between the form of the current expression and the form of the desired expression; a number on the left, or a term in x on the right, or an x with a coefficient different from unity. An action was then taken that removed that difference. The solution path can be summed up in three productions:

If there is a number on the left,

then subtract the number from both sides.

If there is a term in x on the right,

then subtract the term in x from both sides.

If there is a term ax on the left, and a ≠ 1,

then divide both sides by a.

The information for the if clauses of these productions is obtained by noticing which difference between given expression and desired expression is removed at that step. The information for the then clauses is obtained by noticing what action removed the difference. The actions (operating on both sides of the equation) are those that the students have previously learned to be legitimate (preserving the value of the variable) operations on equations.
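
The three productions can be run directly. In the following sketch the representation is my own, not Neves': an equation of the form ax + b = cx + d is stored as the tuple (a, b, c, d), each production tests for one difference between the current and the desired form, and its action removes that difference.

```python
# The three productions above, run as a sketch. The representation is a
# hypothetical simplification: ax + b = cx + d is the tuple (a, b, c, d).
# Assumes the equation has a unique solution.

def solve(eq):
    a, b, c, d = eq
    while True:
        if b != 0:              # a number on the left
            b, d = 0, d - b     # subtract it from both sides
        elif c != 0:            # a term in x on the right
            a, c = a - c, 0     # subtract it from both sides
        elif a != 1:            # coefficient a != 1 on the left
            a, d = 1, d / a     # divide both sides by a
        else:
            return d            # equation now reads x = d

# An equation consistent with the worked steps described above:
# 7x - 12 = 4x + 9  ->  7x = 4x + 21  ->  3x = 21  ->  x = 7
print(solve((7, -12, 4, 9)))
```

Note that “subtract the number from both sides” covers the additive case too: subtracting −12 is exactly the adding of 12 that removes the numerical term on the left.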

Those of you who have been concerned with algebra instruction will observe that textbooks and lesson plans explicitly give time and attention to teaching the legitimate manipulations (actions) of equations; they largely ignore the clues (conditions) that determine what actions would be effective if applied in any particular circumstances. That is, they teach what is permissible, but not what is appropriate—law but not strategy. The same is true of other instructional material, but it is particularly evident in most algebra textbooks.

By examining the kinds of problems we wish students to be able to understand and solve, we can now discover sets of productions that are effective in solving these sorts of problems, then construct sets of worked-out examples for the students to use in acquiring their own production systems (both legitimate actions and conditions for their application), and additional problems to test the correctness of what they have learned. Notice that the method does not assume or imply that the students will learn verbal rules equivalent to the productions; just that they will store procedures equivalent to these productions in memory and so will be capable of executing them when they are evoked in appropriate situations (i.e., in situations where the conditions of the productions are satisfied).
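
The acquisition step itself, noticing which difference a worked step removed and which action removed it, can also be sketched. As before, the (a, b, c, d) representation of ax + b = cx + d is a hypothetical simplification of what Neves' program actually did.

```python
# Sketch of inducing a production from one worked-example step, in the
# spirit of Neves' program. Representation (a, b, c, d) for ax + b = cx + d
# is a hypothetical simplification.

def induce_production(before, after):
    """Infer which difference the step removed (condition) and how (action)."""
    a0, b0, c0, d0 = before
    a1, b1, c1, d1 = after
    if b0 != 0 and b1 == 0:
        return ("number on left", f"subtract {b0} from both sides")
    if c0 != 0 and c1 == 0:
        return ("x-term on right", f"subtract {c0}x from both sides")
    if a0 != 1 and a1 == 1:
        return ("coefficient != 1", f"divide both sides by {a0}")
    return None  # step does not match any known difference

# Step: 7x - 12 = 4x + 9   ->   7x = 4x + 21
print(induce_production((7, -12, 4, 9), (7, 0, 4, 21)))
```

Each worked step thus yields both halves of a production: the condition comes from the difference that disappeared, the action from the operation that made it disappear.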

Adaptive production systems are now widely used as models of processes for learning procedures. Curriculum design methods utilizing learning from examples techniques based on adaptive production systems have been employed in the People's Republic of China, without any use of computers, to construct a complete 3-year curriculum in middle school algebra and geometry. The curriculum has been used with great success (good posttest results at the end of the course and a year later, and significant time savings) since the middle 1980s, for the instruction of students in various parts of China, now about 20,000 students annually. The classroom materials are paper and pencil workbooks. The teachers do almost no lecturing, devoting almost all their time to tutoring students who are having problems. Classes typically have 40 to 50 students. The entire curriculum was created by six people, including experienced teachers and an educational psychology researcher at the Psychology Institute of the Chinese Academy of Sciences.

John Anderson and his colleagues tell a similar success story about their computer tutor programs, which are now being used extensively in American schools (Anderson et al., 1995). The basic approach to task analysis and curriculum construction is essentially the same as the one I have just described, but the final products are tutorial courses in which the students interact directly with a computer.

Clearly, these methods are not restricted in application to mathematics, although they are probably applied most easily in the sciences, where there is already a tradition of conducting a large part of the learning through problem-solving exercises. Nor, as the successful program in China shows, need they always be carried to the stage of computer implementation. This volume provides several other examples of such formal task analyses that approach the precision of computational models: Lovett's application to college-level statistics (chap. 11), Kalchman, Moss, and Case's to the teaching of number sense (chap. 1), and Carver's to the design and administration of an entire preschool program (chap. 12). Extending such task analysis to subjects like history and English literature requires much initial thought about defining the skills that the learner is to attain and the kinds of examples that can be developed to help students learn those skills.

Just the exercise of going through this planning step to define the learning goals and experiences that could be useful in reaching them would be of substantial benefit to learning in these subjects, over and above its contribution to the design of effective curricular materials. Notice that the procedure I have been describing is significantly different from the notion of “learning by doing” or “learning from problem solving,” for the worked-out examples provide a trail of crumbs through the forest, eliminating a great deal of the tiresome and excessively difficult search students are subjected to if they are thrown into problems without an appropriate level of guidance. The detail in the sequence of exemplary steps can be regulated to adjust the difficulty of the learning task as students' skills advance.

CONCLUSION: THE TOOLS OF MODERN THEORY

The first of the three examples I have discussed (EPAM) shows how cognitive modeling has provided us with a broad and deep understanding of the cognitive bases for human expertise, in particular, the roles of perception and memory-based pattern recognition in expert performance. The second example (ZBIE), through its capabilities for acquiring both the syntax and semantics of natural languages, instructs us about processes through which these skills can be learned and the structure of curricula for employing these processes. The third example (adaptive production systems) demonstrates an effective set of procedures, learning from examples, that has led to a practical method for building curricula in algebra and geometry.

Of course, this is not the only useful purpose that modeling serves. Computer models are a powerful and natural language of theory for psychology in general, and cognition in particular. Perhaps their most important use is identical with that of theories in every field of science: to give the researchers themselves a language in which to think and communicate, one that provides a clear, precise, and powerful representation of the situations and problems with which they are dealing. In psychology, we have not in the past commonly had that kind of clarity and precision. It is now available, and we should not delay in taking utmost advantage of it. Our problems are difficult enough without handicapping ourselves by using ordinary language as our only medium of representation, communication, understanding, and explanation.

The Social Dimension of Learning

In my remarks, I have not given equal time to the four major themes of this volume. I have had a good deal to say about development and learning, both in general and in my discussion of ZBIE as a model of first language learning. I have used both EPAM and adaptive production systems to examine some aspects of teaching strategies. I have discussed the computer as a learning tool, at least in two of its uses (understanding the learning process and designing curricula and specific learning experiences within them). But I have had little to say about the social context of learning except insofar as I described the expert's memory as the product of social experience.

My relative neglect of social context does not reflect my assessment of its importance, especially in relation to issues of motivation, and of training in social skills, broadly construed (i.e., the skills of living immersed in a social world). These matters receive extensive discussion in other chapters, so I limit myself to just a few comments on broad questions that I believe need more attention than they have received elsewhere.

First, I must express my disappointment at the controversial spirit in which the discussion of alternative approaches to instruction is often carried on; disappointed, but not surprised, because it is not just a debate among scientists and scholars, or practitioners, about the facts of the matter, but also a much broader public and political debate about the schools, often leading (in both the scientific and public communities) to expressions of extreme “constructivism” on the one hand, and to extreme advocacy of “a return to the fundamentals” on the other. We must expect such debates to generate heat as well as light, but usually much more of the former than the latter. The discussion in this book has a much more objective tone.

Second, there has been considerable conflation of two quite separate issues: the contribution, on the one hand, of educational settings to the social development of the child, and the contribution, on the other hand, of the social structure and processes of the educational setting to children's learning. The first issue is concerned with how far the school is and should be concerned with the child's skills in interacting with other children and people generally; the second is concerned with whether, and what kinds of, group activities among the children contribute to the substantive learning goals (Okada & Simon, 1997). Should algebra be taught so as to improve children's social skills, or should children's social skills be used to teach algebra through group activities? Or both, or neither?

On both issues, the educational world is in considerable need of more solid facts, and while waiting for them, is full of strong and very divergent opinions. On these topics, I have nothing to add beyond what I have already said, in collaboration with several colleagues, in other places (Anderson, Reder, & Simon, 1996, 1997, 1998; Vera & Simon, 1994).

GENERAL CONCLUSION

There is today a science of learning, however incomplete and imperfect. This science has a great deal to say about how learning environments have to be designed for learning to proceed in effective ways. To make use of this know-how, we have to create strong bonds between the research community and the teaching community that do not exist today. And we have to realize that the research community has as much to learn from the teaching community as the latter from the former.

Closer bonds—as illustrated by Klahr, Chen, and Toth (chap. 3)—will give a tremendous impulse to our basic research on learning, and will greatly accelerate our progress toward a deep understanding of the human mind; while, as Carver's chapter shows (chap. 12), fundamental principles from cognition can be used to encourage multiple levels and goals in an ongoing school. Looking directly at the phenomena of the real world is the way that all science begins, and the schools are a principal place where the phenomena of learning are found in a form that is easily observed.

REFERENCES

Anderson, J. R., Corbett, A. T., Koedinger, K. R., & Pelletier, R. (1995). Cognitive tutors: Lessons learned. The Journal of the Learning Sciences, 4, 167–207.

Anderson, J. R., Reder, L. M., & Simon, H. A. (1996). Situated learning and education. Educational Researcher, 25 (4), 5–11.

Anderson, J. R., Reder, L. M., & Simon, H. A. (1997). Rejoinder: Situative versus cognitive perspectives: Form versus substance. Educational Researcher, 26 (1), 18–21.

Anderson, J. R., Reder, L. M., & Simon, H. A. (1998). Radical constructivism and cognitive psychology. In D. Ravitch (Ed.), Brookings papers on education policy (pp. 227–278). Washington, DC: Brookings Institution.

Feigenbaum, E. A. (1960). The simulation of verbal learning behavior. In E. A. Feigenbaum & J. Feldman (Eds.), Computers and thought (pp. 297–309). New York: McGraw-Hill.

Feigenbaum, E. A., & Feldman, J. (1963). Computers and thought. New York: McGraw-Hill.

Feigenbaum, E., & Simon, H. A. (1984). EPAM-like models of recognition and learning. Cognitive Science, 8, 305–336.

Gobet, F., Richman, H., Staszewski, J., & Simon, H. A. (1997). Goals, representations and strategies in a concept attainment task: The EPAM model. In D. L. Medin (Ed.), The psychology of learning and motivation (Vol. 37, pp. 265–290). San Diego, CA: Academic Press.

Gobet, F., & Simon, H. A. (1998). Expert chess memory: Revisiting the chunking hypothesis. Memory, 6, 225–255.

Greeno, J. G. (1976). Indefinite goals in well-structured problems. Psychological Review, 83, 479–491.

Langley, P., Simon, H. A., Bradshaw, G. L., & Zytkow, J. M. (1987). Scientific discovery: Computational explorations of the creative processes. Cambridge, MA: MIT Press.

Neves, D. M. (1978). A computer program that learns algebraic procedures by examining examples and working problems in a textbook. Proceedings of the Second Conference of Computational Studies of Intelligence (pp. 191–195). Toronto: Canadian Society for Computational Studies of Intelligence.

Newell, A. (1990). Unified theories of cognition. Cambridge, MA: Harvard University Press.

Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.

Newell, A., & Simon, H. A. (1976). Computer science as empirical inquiry: Symbols and search. Communications of the Association for Computing Machinery, 19 (3), 113–126.

Okada, T., & Simon, H. A. (1997). Collaborative discovery in a scientific domain. Cognitive Science, 21 (2), 109–146.

Pople, H. E., Jr. (1982). Heuristic methods for imposing structure on ill-structured problems: The structure of medical diagnostics. In P. Szolovits (Ed.), Artificial intelligence in medicine (pp. 119–190). Boulder, CO: Westview Press.

Postle, B. R., & D'Esposito, M. (in press). “What”—then—“where” in visual working memory: An event-related fMRI study. Journal of Cognitive Neuroscience.

Richman, H. B., Staszewski, J. J., & Simon, H. A. (1995). Simulation of expert memory using EPAM IV. Psychological Review, 102 (2), 305–330.

Siklóssy, L. (1972). Natural language learning by computer. In H. A. Simon & L. Siklóssy (Eds.), Representation and meaning (pp. 288–328). Englewood Cliffs, NJ: Prentice-Hall.

Simon, H. A. (1975). Learning with understanding. The ERIC Science, Mathematics and Environmental Education Clearinghouse. Columbus, OH.

Simon, H. A. (1994). Bottleneck of attention: Connecting thought with motivation. In W. D. Spaulding (Ed.), Integrative views of motivation, cognition, and emotion (pp. 1–21). Lincoln, NE: University of Nebraska Press.

Simon, H. A. (1996). The sciences of the artificial (3rd ed.). Cambridge, MA: MIT Press.

Tabachneck-Schijf, H. J. M., Leonardo, A. M., & Simon, H. A. (1997). CaMeRa: A computational model of multiple representations. Cognitive Science, 21 (3), 305–350.

Vera, A. H., & Simon, H. A. (1994). Reply to Touretzky and Pomerleau: Reconstructing physical symbol systems. Cognitive Science, 18 (2), 355–360.

Zhu, X., Lee, Y., Simon, H. A., & Zhu, D. (1996). Cue recognition and cue elaboration in learning from examples. Proceedings of the National Academy of Sciences, 93, 1346–1351.

Zhu, X., & Simon, H. A. (1987). Learning mathematics from examples and by doing. Cognition and Instruction, 4 (3), 137–166.
