Cognitive Science

Cognitive Science, the study of human thinking, problem solving, learning and decision making, is today a rapidly progressing area of research in the social and behavioral sciences which makes use of the computer to understand human thinking. Cognitive science is unique and important in linking basic to applied research across several disciplines. Many of the major advances in artificial intelligence, education and computer software technology have been strongly influenced by basic research in this new science. For example, cognitive science has changed our understanding of how doctors make medical diagnoses and how students learn from worked-out examples. Cognitive science has also contributed to the building of expert systems that automatically make medical diagnoses or construct appropriate exercises for students.


Today I would like to tell you a little about some of the developments that are currently going on in cognitive science, and point to some of the directions in which they might affect our own lives as human beings individually and collectively. This is obviously much too large a domain to be covered, for I want to leave most of the time for your questions, so I will focus on some specific "for instances."

At Yorktown Heights at the IBM Research Headquarters a couple of days ago I saw an interesting computer program, which illustrates nicely, I believe, the kind of things that I am going to talk about today. It was a computer program which would locate spelling errors in written text. That's old hat now; computers have been doing that for three or four years. It would then correct the spelling, which is even more useful. In addition, it would find most grammatical errors (split infinitives, lack of subject and verb agreement, etc.) and give advice about correcting those. This computer program gives a glimpse into the nature of the mind, or at least into the nature of language, which we have always thought of as one of the most essential products of the human mind. This little program, and it is a program of modest size, provides just a slight indication of the sorts of insights we are getting, in cognitive science, into the way the human mind operates when it is using its capabilities to process language.

Most scientific research programs start with exciting problems and the hope that they will lead to answers that might be of some use, but all of you know that it's really the excitement that drives scientists most of the time. There are some scientific problems that are big problems and other problems that are little. Everybody has his own list of big problems. My own list of really big scientific problems goes something like this:
1) The nature of matter. That is what the high energy particle physicists are all about. They would really like to know how things behave at the foundations, but the foundations are always a level below where they have been able to dig so far. That's the way science is.
2) The origins of the universe. That is what the space program is all about. What is out there? How did it begin? How did it develop?
3) The nature of life. That covers the whole range of the biological sciences. How is it that matter can organize itself to exhibit the sort of behavior that living creatures are capable of?
4) The nature of mind. How is it that a chamber made out of bone with various kinds of biological structures inside can do the things that we call thinking, problem solving, speaking and so on?


Now, answers in any one of those areas would certainly lead to some possibilities for doing something about everyday practical problems of human beings, not just the problems of understanding ourselves. Deeper understanding of the nature of matter and the origin of the universe underlies the whole practice of engineering and our whole ability to use physical laws to accomplish our human purposes. Likewise, understanding the nature of life leads us to the kinds of knowledge of the human body that provide the basis for advances in medicine.

To what does the understanding of the nature of the mind lead? Presumably it can lead to all kinds of advances in our ability to use the central human resource, the resource that makes us truly human and able to do the things that other creatures in this world cannot do. If we ask what good it can do to understand the mind better, we need to look in the direction of improving management, the processes of learning, the decision making and problem solving that take place in organizations, and perhaps even the decision making and problem solving that take place in Congress. A very large part of the human resources in our society are occupied in learning. We spend the first third or half of our lives learning, and some of us don't quit even then. Learning is a major consumer of this human resource, the brain.

Now, to try to answer scientifically the big exciting questions about matter and the universe, life or mind, science needs techniques. The best problems in the world (e.g., building a gravity shield or a perpetual motion engine) are not good scientific problems unless we have some techniques that tell us what we can do tomorrow, what is the first step toward solving them. A good scientific problem is an important problem combined with some glimmer of an idea about how it can be approached and tackled. So if you are going to study the nature of matter, you are going to need accelerators. If you are going to study the origin of the universe, you are going to need telescopes and space vehicles. If you are going to study biology and the nature of life, you are going to need something like recombinant DNA techniques. Of course, people did biology before they had those, but they answered a different set of questions then. When you are on the frontier, you need powerful techniques.

What about the study of the mind? Beginning about thirty years ago, we began to get a new and powerful tool of research for the study of the mind. It was the now familiar digital computer. Its use in this research rests on a hypothesis that you can accept or not, a hypothesis for which there is a good deal more evidence now than there was thirty years ago. The hypothesis is that the reason the human mind or brain is able to do the things we call "thinking" is because it can perform some very simple operations with symbols. By symbols I simply mean patterns—patterns you can put on the blackboard with chalk, or any kinds of patterns. Biologists will some day tell us how these operations are done inside the brain; we don't know very well yet.

But human thinking is done by simple processes like reading symbols (patterns) as our eyes do, producing symbols as our tongues do and writing symbols as our hands do. We store symbols inside our heads so we can act later, not only storing individual symbols but storing relations among them—big structures of symbols that can represent meanings. We compare symbols and decide when they are alike or when they are different. We can branch if they are different, and do something else if they are alike.

Now all of you who have your own personal computers at home know that these are just the things that computers do. They read symbols. They output symbols on the screen or on the printer. They store symbols in memory. They can relate different symbols and build organized structures in memory (on all of those floppy disks that you have). They can compare symbols. Every computer language has a branch operation, and can compare two symbols and then do one thing if they are different and another if they are alike.
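The handful of operations just listed (store symbols, relate them, compare them, and branch on the result) can be sketched in a few lines of Python. This is only an illustration; the names `memory`, `store`, `same` and `respond` are hypothetical and come from no particular system.

```python
# Illustrative only: every name here is hypothetical.
memory = {}  # stored symbols and the relations among them

def store(symbol, relation, other):
    """Store a relation between two symbols, e.g. ("cat", "is-a", "animal")."""
    memory.setdefault(symbol, {})[relation] = other

def same(a, b):
    """Compare two symbols."""
    return a == b

def respond(seen, expected):
    """Branch: do one thing if the symbols are alike, another if they differ."""
    if same(seen, expected):
        return "alike"
    return "different"

store("cat", "is-a", "animal")
print(memory["cat"]["is-a"])   # read back a stored relation
print(respond("cat", "cat"))   # takes the "alike" branch
print(respond("cat", "dog"))   # takes the "different" branch
```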

The hypothesis that people have been pursuing in cognitive science research in the last thirty years is that the reason human beings can think is because they are able to carry out with neurons these simple kinds of processes that computers do with tubes or chips. Notice that this isn't quite a biological hypothesis. I am not saying anything about how neurons actually work; I wish I knew. There is still a very large gap between understanding the biology of the human brain and understanding symbolic processes at a level where we can show how they do complicated things like solving problems or producing natural language. It is a little like the physicist's problem. Some physicists study quarks, and others, as well as chemists and biologists, study macromolecules. Now everybody knows that macromolecules are made up of quarks, but it doesn't help you explain very much how the body works to know about quarks. There are too many levels in between, so you have to build up layers of theory. You build up biochemical theories, chemical theories and physical nuclear theories. Finally several layers down are quarks.

Thus, we are not talking about the biology of the brain. We are talking about the information processes that are supported by that biology. In some happy day in the future, as our research in neuroscience progresses, we will be able to close the gap between those two. We will see not only how a human being thinks in terms of basic memory and input and output processes, we will also eventually learn how that thinking takes place in terms of neurons. That is not what I am going to be able to explain today. I don't know, and as far as I know, nobody does.

Now, I have made a pretty strong claim, a claim that the reason human beings can think is because they can input patterns, output patterns, store patterns and compare patterns. It makes the thinking brain sound like a pretty simple device. How would you test a hypothesis like that? The way in which we have tried to test it is by seeing how closely we can simulate or mimic the processes of human thinking on a digital computer, which demonstrably just uses these simple processes. We know a computer can't do anything else besides the four or five things that I mentioned, and the name of this game is to see whether you can write computer programs that not only do things like use language and solve problems, but do them in a demonstrably human way. I don't want to go into the details of how we match the computer's behavior against human behavior, but there are all sorts of laboratory techniques for doing that now. Essentially we trace second by second what the human being is doing when solving a problem, and we compare that with the trace of the computer. I would rather talk about some of the results of using this technique and some of the things that we know today as a result of such research.

So let me go to my first example. In any domain of human skill there are experts and novices. In most domains, most of us are novices. Maybe there are one or two areas in which we have achieved expert status. What are the differences between experts and novices? How can we characterize the nature of expert skills?

We know a lot today about expertness in playing chess. Playing chess is probably not the most important activity; grand master chess players don't get paid at the same level as football players or baseball players. Nevertheless it is a nice activity to study for purposes of this kind of research. Clearly playing chess well requires a certain amount of thinking—or at least the people who do it frown a lot while they are playing, which may be an indication. So we use chess in this research in the same way that geneticists use fruit flies. I don't think geneticists are terribly fond of fruit flies, but the fruit fly is a good model organism in which the genetic mechanisms can be studied. Similarly, chess is a fine task in which one can study the nature of expertness.

It has commonly been thought that to be a great chess player you have to have unusual visual imagination. Some simple experiments can test that. Take a chessboard and arrange the pieces on it as in a game between two good players, say maybe twenty-five pieces, after a few of them have been knocked off. Let a chess player look at that board for five to ten seconds, not very long. Then take it away and say, "Reconstruct the board for me. Put those twenty-five pieces back on the board the way they were." The grand master will do that with 95% accuracy—maybe make one mistake occasionally. "Oh," you think, "he must have marvelous visual imagery to remember all of the pieces on the board." You or I (unless there are some grand masters in the audience whom I don't recognize) would be lucky to get six pieces back on the board correctly. Now six turns out to be a very important number. It is very important because in many psychological tasks it turns out that human beings can only hold on to six or seven things at once. I will have to define a little more carefully what a "thing" is. Things are basically familiar patterns—six words or the seven digits in a telephone number. We are lucky that telephone numbers are only seven digits in length, so we can look one up in the telephone book and rush to the telephone and usually retain it.

Now we just make a small change in the experiment: we take the same pieces and put them back on the board at random. We show them to a grand master and to an ordinary player for five or ten seconds, take them away and ask that they be reproduced. What happens? The grand master can put six back on the board correctly, and the ordinary player can put six back on the board correctly. With random boards, there is no difference. The grand master's prowess with positions from real games has nothing to do with ability to do visual imagery. It has a great deal to do with the fact that grand masters have spent an awful lot of time looking at chessboards, and there are lots of familiar friends on any chessboard from well played games—little configurations of pieces that have been seen again and again. I won't test your chess knowledge here, but any chess master who sees a board on which there is a certain pattern called an open file will notice that open file and will even have some ideas about what to do about it. That is all stored in memory.
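One way to picture the result of the two experiments is a toy model in which memory holds about six "things," and a familiar configuration counts as a single thing. The sketch below is a simplified illustration, not one of the actual research programs; the board encoding, the capacity of six, and the made-up pieces and patterns are all assumptions.

```python
SPAN = 6  # assumed short-term memory capacity, counted in chunks

def recall(board, known_patterns, span=SPAN):
    """Return the (piece, square) pairs recalled after a brief look.

    board: set of (piece, square) pairs.
    known_patterns: list of frozensets of (piece, square) pairs that the
    player recognizes as single familiar chunks.
    """
    remaining = set(board)
    chunks = []
    # Each familiar configuration found on the board costs one chunk.
    for pattern in sorted(known_patterns, key=len, reverse=True):
        if pattern <= remaining:
            chunks.append(set(pattern))
            remaining -= pattern
    # Anything unrecognized must be held piece by piece.
    chunks.extend({item} for item in sorted(remaining))
    recalled = set()
    for chunk in chunks[:span]:
        recalled |= chunk
    return recalled

# Hypothetical data: 24 pieces arranged as six familiar groups of four.
squares = [f + r for f in "abcdefgh" for r in "12345678"]
items = [("piece%02d" % i, squares[i]) for i in range(24)]
game_board = set(items)
expert_patterns = [frozenset(items[i:i + 4]) for i in range(0, 24, 4)]
# The same pieces moved to other squares, so no familiar group survives.
random_board = {("piece%02d" % i, squares[63 - i]) for i in range(24)}

print(len(recall(game_board, expert_patterns)))    # expert, game position
print(len(recall(random_board, expert_patterns)))  # expert, random position
print(len(recall(game_board, [])))                 # novice, game position
```

In this toy setup the expert's advantage comes entirely from the stored patterns: on the random board nothing familiar is present, and expert and novice are both limited to the same number of single pieces.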

These ideas about the nature of expertise are developed further by trying to write computer programs that can behave in an expert manner. The minute you try to write those computer programs you have to ask, "How much information must an expert have?" and even more important, "How does it have to be organized in the brain in order to be of any use?" Today we have some computer programs which, when shown chessboards with patterns or positions on them from reasonable games, will in fact be able to quickly store away information about those familiar patterns and be able to reproduce the board.

From this kind of research we get notions about how many patterns make an expert. For chess the answer is something upwards of 50,000. I don't know if that seems like a large number to you, but it turns out that we are all expert in using our native language—well, fairly expert. All or most of the people in this room who have had college educations or the equivalent have a natural language vocabulary of 50,000 words or more. If you don't believe that, get an unabridged dictionary (the bigger the dictionary the better because you will get a more favorable result), pick ten pages at random, count all of the words on those pages whose meanings you know in context, multiply by the number of pages in the dictionary, and divide by ten because you took ten pages as a sample. You will then have a measure of your vocabulary, and it will be upwards of 50,000 words. We have also made the same measurements in other fields of expertness and get roughly that order of magnitude.
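The dictionary test is a simple sampling estimate, and the arithmetic can be written out as below. The function name and all of the page counts are invented for illustration; they are not measurements from any real dictionary.

```python
def estimate_vocabulary(known_per_page, total_pages):
    """Scale the known words counted on a sample of pages up to the whole book."""
    sample_size = len(known_per_page)
    return sum(known_per_page) * total_pages / sample_size

# Hypothetical counts of known words on ten randomly chosen pages.
sample = [24, 31, 18, 27, 22, 30, 25, 19, 28, 26]

# Hypothetical dictionary length of 2,000 pages.
print(estimate_vocabulary(sample, total_pages=2000))  # 50000.0
```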

Today there are computer systems that can operate at an expert level in certain professional fields. In our city of Pittsburgh, we have a program over at the University of Pittsburgh called Caduceus, produced jointly by Jack Myers, who is a fine internist, and Harry Pople, who is a fine computer scientist. That program diagnoses illnesses in the area of internal medicine, and it diagnoses illness at the level of a good clinician, although it probably isn't as good as Jack Myers. How does it do it? It does it because that program has well over 10,000 pieces of information and patterns that it can recognize—symptoms that a patient presents to the doctor—that immediately cue an hypothesis. It still doesn't have 50,000, but then Dr. Myers knows a lot of other things besides those diagnostic tricks.

Of course, you don't build an expert system just by having these 50,000 chunks. There has to be a little capability for reasoning about the chunks. Today we know something about how you combine expert knowledge with a little bit, not too much, of reasoning power. You shouldn't romanticize about how much reasoning power the human mind has. There is a small industry burgeoning now of people who write software expert systems for a wide variety of domains. They are not trying to imitate human beings; they are trying to imitate human performance. A lot of the ability to do this today comes out of the research that has been done back and forth between psychology and the part of computer science that we call artificial intelligence. This research has tried to build computer programs that try to understand human performance.

Let me give you one other example which has to do with learning. I said earlier that a large part of the human resources of our society are spent in the activity of learning or helping other people to learn, which is called teaching. In addition to all of the school budgets, perhaps 5 to 10 per cent of the GNP, we really ought to charge all of the time of the kids. If they weren't in school, they would be out tormenting their parents or doing something interesting.

In those terms, a very large part of our society's resources are devoted to the processes of teaching and learning. What is the theory on which this practice is built? I know what the theory is in the university. The educational theory on which we build our whole educational process is that if you get a bunch of people together in a room and spray them with enough words, some of the words will turn out to be infectious. We have two spraying instruments: the human voice and the book, but we have very little theory as to the processes that transfer the textbook or lecture from the source to the student or the form in which it has to be transferred so that it is of any use to the student. In education we have had notions like rote learning vs. learning with understanding. Those are useful words, but what do they mean? What is the process of rote learning vs. learning with understanding? What is it people can do when they learn by rote vs. what they can do when they learn with understanding? The techniques that I am talking about in cognitive science today—computer simulation techniques—are giving us insights into that.

Let me give you an example. Just about everybody in this room at one time or another was subjected to algebra. Perhaps subjected is the wrong word; apply it or not as you see fit. We had to take a course in algebra, and in that course one of the central skills we acquired was the skill of solving a simple linear equation with an unknown: 3X + 4 = X + 10. What was it you learned? You didn't learn something that you memorized; you learned a skill. You learned, when an equation was presented to you, to notice certain things about that equation and act on those things.

It begins to sound a little like the chess player, doesn't it? He notices patterns on the chessboard and that immediately reminds him of what he does when there is an open file or whatever. Well, when you looked at that equation 3X + 4 = X + 10, you also noticed something. You didn't say it in words to yourself, but it was something like this: "I am trying to solve this equation, and the solution of the equation looks like an X followed by an = sign followed by a number—not any number; it has to be the number that will fit the equation if I substitute it back in. I know all sorts of rules that I am allowed to apply which will take the equation that I am given and turn it into a different equation that has the same solution. I can add the same number to both sides, subtract the same number from both sides, multiply both sides by the same number and divide both sides by the same number (if I don't divide by zero)." Those were the rules you all learned. You can do those things but you have to do them on both sides! Then the teacher would magically solve this equation by doing things like that, and you would think: "Well, I see every step, but what made the teacher think of doing just that? Why was that subtracted and that subtracted and that multiplied and that divided?"

The secret is something like this. (You all learned the secret although you may not have been aware of it.) If you start out with something that has a number on the left hand side where you don't want it, then you better get rid of it since you are trying to get something with an X and an = and a number. So when you have 3X + 4 = X + 10, you subtract the 4. Then you have a new equation which is 3X = X + 6. But you don't want an X on the right hand side of the equation so you subtract it, and now you have 2X = 6. Now you don't want a 2X, but instead just a plain X on the left hand side of the equation, so you divide by 2.

I can write a computer program which has similar instructions. The instructions say that if there is a number on the left hand side of an equation, subtract it from both sides. If there is an X on the right side, subtract X from both sides. If there is some number in front of the X on the left side, divide by it. Then you need a fourth instruction: if you arrive at something which looks like X = a number, quit and check.
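A minimal sketch of those four instructions is given below. It is an illustration under stated assumptions rather than the program used in the research: the tuple encoding of an equation, the names `PRODUCTIONS` and `solve`, and the rule that the first matching instruction fires are all choices made here for the example.

```python
from fractions import Fraction

# An equation a*X + b = c*X + d is held as the tuple (a, b, c, d).
# Each production is a (description, condition, action) triple.
PRODUCTIONS = [
    ("subtract the number on the left from both sides",
     lambda a, b, c, d: b != 0,
     lambda a, b, c, d: (a, 0, c, d - b)),
    ("subtract the X term on the right from both sides",
     lambda a, b, c, d: c != 0,
     lambda a, b, c, d: (a - c, b, 0, d)),
    ("divide both sides by the number in front of X",
     lambda a, b, c, d: a not in (0, 1),
     lambda a, b, c, d: (1, b / a, c / a, d / a)),
]

def solve(a, b, c, d):
    """Apply the first matching production until the form X = number appears."""
    start = tuple(Fraction(v) for v in (a, b, c, d))
    eq = start
    trace = []
    # Fourth instruction: when the equation looks like X = number, quit and check.
    while not (eq[0] == 1 and eq[1] == 0 and eq[2] == 0):
        for name, condition, action in PRODUCTIONS:
            if condition(*eq):
                eq = tuple(Fraction(v) for v in action(*eq))
                trace.append((name, eq))
                break
        else:
            raise ValueError("no production applies; no unique solution")
    x = eq[3]
    a0, b0, c0, d0 = start
    assert a0 * x + b0 == c0 * x + d0  # substitute back into the original
    return x, trace

x, trace = solve(3, 4, 1, 10)  # 3X + 4 = X + 10
for name, (a, b, c, d) in trace:
    print(f"{name}: {a}X + {b} = {c}X + {d}")
print("X =", x)
```

For 3X + 4 = X + 10 the trace passes through 3X = X + 6 and 2X = 6 before stopping at X = 3, the same steps described in the text.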

Today we think that is the way in which the skill of solving algebraic equations is stored in the human mind. It is stored in the form of things we call productions—each one of which is a little cue or an ability to recognize a cue whenever it appears, hitched to an ability to remember what to do whenever that cue is seen. Very complicated reasoning skills can be built out of simple structures of that sort and can even be carried one step further. We have today a computer program that can examine a worked-out example and figure out for itself what the rules are that would get it there. It can then store these new rules in its own internal parts and have the capability of doing algebra. So here we have a theory of how human learning takes place—a theory implemented in a computer program.

If computers can learn in this way, why don't we see whether it is possible to teach things to kids this way? Why don't we see if we can develop learning materials that allow a computer to develop the skill of solving equations and then try out the same materials on kids and see what happens? I had a chance to do this for a similar skill in China a couple of years ago. I would rather have done it in this country, but I happened to be in China and there were more Chinese kids around than there were American kids for subjects. Unfortunately, all of the schools in China that year had already learned to solve linear algebraic equations, so we had to take a slightly later lesson in the year's curriculum, which was factoring quadratics. Using this theory and the computer model, we built teaching materials. We took them out to four or five different schools in Peking and gave them to the kids. In twenty minutes they were factoring quadratics.

I don't want to draw too strong a conclusion from that. There is a lot of research that has to follow that up: Did it really work? If so, why? Are there other methods that might work better and so on? But we have begun to understand how the knowledge we are gaining of how human beings think, work and learn can be turned into solid theory to improve and transform our educational practices. I am not promising any great things for tomorrow, but I am also not talking about the year 2010. I am talking about research that can start right today and is starting. Without developing the examples, I will just mention two other areas where this kind of research is going on today. Then I would like to stop and see what questions you might have.

I have already mentioned mental imagery. There is a lot of interest today in education in science and mathematics, and I think we have some special problems in those areas. It is quite unclear how many people can learn scientific and mathematical subjects—and up to what level of advancement and sophistication. A very important question in our society is, "Who can be taught and by what kinds of methods?" It is generally believed that mental imagery plays a very large role in science. Give people problems in physics, and the first thing they want to do is to draw pictures or diagrams of some kind. It turns out that if their hands are tied behind their backs, they have difficulty solving the problems. So there is research going on today, very interesting research, that says, "How does the picture help?" The question being asked is: What individual differences are there between the people who use pictures and diagrams readily and those who don't? Parallel with this, there is research on the computer which says: How can we model that difference? What would a program that uses diagrams look like that is different from a program that doesn't use diagrams? What extra leverage does it get from using them? What is different about the program? What is different about the way it processes information that makes a diagram so powerful to it? There are now some very good beginning ideas on that topic.

Finally, you all know the principle that a computer can only do what it is programmed to do. Now that is sometimes misunderstood. Some people understand that to mean that a program only does what you think you programmed it to do. That is demonstrably false, as you all know. Computer programmers spend most of their time debugging programs which don't do what they think they programmed them to do. But more than that, it is often interpreted to mean that a computer can only do things you already know how to do, where you tell the computer step by step exactly how to do it. That is a wrong interpretation of the principle. Of course, a computer has to have a program that tells it step by step what to do. But step by step might mean to instruct it to search for answers to some kind of problem and to use some rules of thumb in the conduct of the search. That is the way the medical diagnosis system works, as well as the other expert systems that I described. You don't have to predict an event in advance or program the exact steps it is going to take. What you have to do is to tell the computer what kinds of principles might guide its search.

Therefore, it is not unreasonable to ask if you could program a computer to be creative. Well, what is creative? There are a lot of kinds of creativity. Could you program a computer to do interesting drawings? The answer is yes. There are a number of people who have done it. Perhaps the most remarkable is Harold Cohen at the University of California/San Diego, a well-known painter who got hooked on computers. He does not use the computer as a paint brush but instead writes programs that make the computer able to create interesting works of art. By interesting I mean that people who are sophisticated about art are interested in them. There is also some computer music—not of the most remarkable quality but again of "interesting" quality.

There is one area where we can get even sharper criteria of whether the computer is doing something interesting or not, and that is in science. One can tell whether a piece of science is good science or bad science—whether a discovery is new and important or not. So several research groups, including one in our shop, have been engaged for a number of years in writing computer programs that are capable of making discoveries in science. A particular slant we took in our project was to say that if the computer program is so smart, then it ought to be able to do the things that Kepler did, or it ought to be able to do the things that Ohm did—or you name your hero. So we have been engaged in giving the program, the main version of which we call BACON in honor of Sir Francis Bacon, the same kinds of data and starting point that Kepler had when he began his search for what we now call "Kepler's Third Law" or that Ohm had when he first decided to find out something about the laws that govern electrical currents. We give it those data, and we ask the computer to find the law. The answer is, it does. As a matter of fact, it finds it so quickly that there is a little embarrassment here. It took Kepler ten years to develop Kepler's Third Law. Now, it is true, he had distractions. His mother was being tried for witchcraft during part of that period, and other things were happening to him. But, nevertheless, he was somehow or another messing around with this problem for a number of years. BACON does it in two minutes.
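To give a feel for what "finding the law from the data" can mean, here is a very small sketch. It is not BACON's own method, which is not described here; it is a plain search over small whole-number powers for a combination of distance and period that stays nearly constant from planet to planet. The planetary figures are modern approximate values (distance in astronomical units, period in years), not Kepler's own data, and the names `find_law` and `relative_spread`, the power limit and the 1% tolerance are assumptions made for the example.

```python
from itertools import product

# Modern approximate values, NOT Kepler's data: (distance in AU, period in years).
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus":   (0.723, 0.615),
    "Earth":   (1.000, 1.000),
    "Mars":    (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn":  (9.537, 29.457),
}

def relative_spread(values):
    """How far the values range, as a fraction of their mean."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean

def find_law(data, max_power=3, tolerance=0.01):
    """Return the simplest (p, q) for which D**p / P**q is nearly constant."""
    candidates = sorted(product(range(1, max_power + 1), repeat=2), key=sum)
    for p, q in candidates:
        values = [d ** p / t ** q for d, t in data.values()]
        if relative_spread(values) < tolerance:
            return p, q
    return None

print(find_law(PLANETS))  # (3, 2): D cubed over P squared is constant
```

The pair (3, 2) is Kepler's Third Law: the cube of a planet's distance from the sun is proportional to the square of its period.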

We are doing a book on BACON now. One of the exercises I was engaged in just a couple of days ago was to ask what accounted for all of the rest of that time, not in Kepler's case but in general. Why should it be so fast? I won't bore you with the whole story, but I think we can account for a good deal of the time and explain why it takes 10^5 times as long for a human being to arrive at these laws as it does for the computer. Part of it has to do with the basic speed of the device itself. If you just look at the basic speeds of the components, there is no doubt at all that computers are fantastically faster than human neurons. Neurons involve very complicated electro-chemical processes, and one hundredth of a second is a very short time for any event in neural material. Well, today in computers you couldn't sell a product if it took a hundredth of a second to do anything. We want computers to act in microseconds (millionths of a second) or better, not just in hundredths of a second. So the computer hardware is clearly faster, but there are many other things that also account for the vast difference in speed between the human and the computer.

Why don't I stop at this point by just summing up the general point of view I have been trying to express. First of all, the human mind really is our biggest economic resource. Human muscles are nice to have around, but they are not so important anymore since we have had steam engines and other sorts of power. The human mind is the resource that human beings apply to the world of work; it is therefore required for productivity increases in the post-industrial world. Some people, however, think that the United States doesn't need any productivity increase because we already have everything. People in Congress know better than that because they know we can't have all of the guns we would like to have and all of the butter and all of the health services and all of the other things, so there must be something still lacking in the productivity that we would like to have.

The prospects of increasing human productivity lie, to an important extent, in improving the productivity of the human mind. That can come about in a mixture of two ways. One is by augmenting it—using the computer to augment human thinking, just as we use the steam engine as an augmentation of human muscle. Secondly, and I think equally important, we can use the computer as a research tool to understand the cognitive processes of the human mind. In this way, we will enhance our own abilities to think and solve problems.

I think we have reached an understanding that the central problems the world faces today are not technological, or at least they do not lend themselves to technological fixes. To paraphrase the old farmer, "we already have more technological fixes than we have used yet." Why don't we use them? Our problems really are human problems. We ourselves are the problem, and the solution is going to come about by better understanding of ourselves, and as a part of that, understanding how the human mind works.