
JUDEA PEARL'S LECTURE, "THE ART AND SCIENCE OF CAUSE AND EFFECT"

GIVEN THURSDAY, OCTOBER 29, 1996 AS PART OF THE

UCLA 81ST FACULTY RESEARCH LECTURE SERIES.

BELOW IS A TRANSCRIPT OF THE LECTURE:

STILL TO COME - SLIDES FROM THE TALK

4515 Boelter Hall

Los Angeles, CA 90095-1596

e-mail:*judea@cs.ucla.edu*

SLIDE 1: THE ART AND SCIENCE OF CAUSE AND EFFECT

**************************************************************

OPENING STATEMENT

Thank you Chancellor Young, colleagues, and members of the Senate Selection Committee for inviting me to deliver the eighty-first lecture in the UCLA Faculty Research Lectureship Program. It is a great honor to be deemed worthy of this podium, and to be given the opportunity to share my research with such a diverse and distinguished audience.

The topic of this lecture is causality -- namely, our awareness of what causes what in the world and why it matters. Though it is basic to human thought, causality is a notion shrouded in mystery, controversy, and caution, because scientists and philosophers have had difficulties defining when one event TRULY CAUSES another. We all understand that the rooster's crow does not cause the sun to rise, but even this simple fact cannot easily be translated into a mathematical equation.

Today, I would like to share with you a set of ideas which I have found very useful in studying phenomena of this kind. These ideas have led to practical tools that I hope you will find useful on your next encounter with a cause and effect.

And it is hard to imagine anyone here who is NOT dealing with cause and effect. Whether you are evaluating the impact of bilingual education programs or running an experiment on how mice distinguish food from danger or speculating about why Julius Caesar crossed the Rubicon or diagnosing a patient or predicting who will win the 1996 presidential election, you are dealing with a tangled web of cause-effect considerations. The story that I am about to tell is aimed at helping researchers deal with the complexities of such considerations, and to clarify their meaning.

SLIDE 2: OUTLINE

**************************************************************

SLIDE 3: SHOWING ADAM, EVE AND THE SNAKE (DÜRER)

**************************************************************

The thing to notice about this story is that God did not ask for explanation, only for the facts: it was Adam who felt the need to explain. The message is clear: causal explanation is a man-made concept. Another interesting point about the story: explanations are used exclusively for passing responsibilities. Indeed, for thousands of years explanations had no other function. Therefore, only Gods, people, and animals could cause things to happen, not objects, events, or physical processes.

SLIDE 4: SHOWING THE FLIGHT OF LOT

**************************************************************

SLIDE 5: EGYPTIAN PICTURE

**************************************************************

SLIDE 6: PICTURE OF JONAH IN THE BOAT

**************************************************************

In summary, the agents of causal forces in the ancient world were either deities, who cause things to happen for a purpose, or human beings and animals, who possess free will, for which they are punished and rewarded.

SLIDE 7: WATER SCREW

**************************************************************

SLIDE 8: OF ARCHIMEDES MOVING EARTH

**************************************************************

SLIDE 9: OF PULLEYS

**************************************************************

Not surprisingly, these new agents of causation TOOK ON some of the characteristics of their predecessors -- Gods and humans. Natural objects became not only carriers of credit and blame, but also carriers of force, will, and even purpose. Aristotle regarded explanation in terms of a PURPOSE to be the only complete and satisfactory explanation for why a thing is what it is. He even called it a "FINAL CAUSE", namely, the final aim of scientific inquiry.

From that point on, causality served a dual role: CAUSES were the targets of credit and blame on one hand, and the carriers of physical flow of control on the other.

SLIDE 10: SHOWING WATER-MILL

**************************************************************

SLIDE 11: RECODES TITLE PAGE

**************************************************************

SLIDE 12: GALILEO

**************************************************************

SLIDE 13: GALILEO'S PRISON SCENE

**************************************************************

SLIDE 14: TITLE PAGE OF DISCORSI

**************************************************************

SLIDE 15: THE GALILEAN REVOLUTION

**************************************************************

SLIDE 16: INCLINED PLANE EXPERIMENT

**************************************************************

SLIDE 17: GALILEAN EQUATION *d = t^2*

**************************************************************

SLIDE 18: GALILEAN BEAM

**************************************************************

Let us concentrate now on Galileo's first maxim, "description first, explanation second," because that idea was taken very seriously by scientists and changed the character of science from speculative to empirical.

**************************************************************

SLIDE 20: HOOKE'S LAW

**************************************************************

SLIDE 21: HUME PORTRAIT

**************************************************************

SLIDE 22: TITLE PAGE OF HUME - "A TREATISE OF HUMAN NATURE"

**************************************************************

SLIDE 23: PAGE 156 FROM "A TREATISE OF HUMAN NATURE"

**************************************************************

"Thus we remember to have seen that species of object we call *FLAME*, and to have felt that species of sensation we call *HEAT*. We likewise call to mind their constant conjunction in all past instances. Without any farther ceremony, we call the one *CAUSE* and the other *EFFECT*, and infer the existence of the one from that of the other."

Thus, causal connections, according to Hume, are a product of observations. Causation is a learnable habit of the mind, almost as fictional as optical illusions and as transitory as Pavlov's conditioning. It is hard to believe that Hume was not aware of the difficulties inherent in his proposed recipe. He knew quite well that the rooster's crow STANDS in constant conjunction to the sunrise, yet it does not CAUSE the sun to rise. He knew that the barometer reading STANDS in constant conjunction to the rain, but does not CAUSE the rain. Today these difficulties fall under the rubric of SPURIOUS CORRELATIONS, namely "correlations that do not imply causation".

Now, taking Hume's dictum that all knowledge comes from experience, that experience is encoded in the mind as correlation, and our observation that correlation does not imply causation, we are led into our first riddle of causation: How do people EVER acquire knowledge of CAUSATION?

SLIDE 24: THE FIRST RIDDLE OF CAUSATION

**************************************************************

SLIDE 25: THE SECOND RIDDLE OF CAUSATION

**************************************************************

Continuing our example, what difference does it make if I told you that the rooster DOES cause the sun to rise? This may sound trivial. The obvious answer is that knowing what causes what makes a big difference in how we act. If the rooster's crow causes the sun to rise, we could make the night shorter by waking up our rooster earlier and making him crow -- say, by telling him the latest rooster joke.

But this riddle is NOT as trivial as it seems. If causal information has an empirical meaning beyond regularity of succession, then that information should show up in the laws of physics. But it does not! The philosopher Bertrand Russell made this argument in 1913:

SLIDE 26: PURGING CAUSALITY FROM PHYSICS?

**************************************************************

Another philosopher, Patrick Suppes, on the other hand, arguing for the importance of causality, noted that: "There is scarcely an issue of *PHYSICAL REVIEW* that does not contain at least one article using either `cause' or `causality' in its title."

What we conclude from this exchange is that physicists talk, write, and think one way and formulate physics in another. Such bilingual activity would be forgiven if causality were used merely as a convenient communication device -- a shorthand for expressing complex patterns of physical relationships that would otherwise take many equations to write. After all, science is full of abbreviations: we use "multiply *x* by 5" instead of "add *x* to itself 5 times"; we say "density" instead of "the ratio of weight to volume". Why pick on causality?

"Because causality is different," Lord Russell would argue. "It could not possibly be an abbreviation, because the laws of physics are all symmetrical, going both ways, while causal relations are uni-directional, going from cause to effect." Take, for instance, Newton's law *f = ma*. The rules of algebra permit us to write this law in a wide variety of syntactic forms, all meaning the same thing -- that if we know any two of the three quantities, the third is determined. Yet in ordinary discourse we say that force causes acceleration -- not that acceleration causes force -- and we feel very strongly about this distinction. Likewise, we say that the ratio *f/a* helps us DETERMINE the mass, not that it CAUSES the mass. Such distinctions are not supported by the equations of physics, and this leads us to ask whether the whole causal vocabulary is purely metaphysical, "surviving, like the monarchy, only because it is erroneously supposed to do no harm," as Russell put it.

Fortunately, very few physicists paid attention to Russell's enigma. They continued to write equations in the office and talk cause-effect in the CAFETERIA, with astonishing success: they smashed the atom, invented the transistor and the laser. The same is true for engineering. But in another arena the tension could not go unnoticed, because in that arena the demand for distinguishing causal from other relationships was very explicit. This arena is statistics.

The story begins with the discovery of correlation, about one hundred years ago.

SLIDE 27: PORTRAIT OF GALTON

**************************************************************

SLIDE 28: TITLE PAGE "NATURAL INHERITANCE"

**************************************************************

SLIDE 29: PAGE SHOWING PLOT

**************************************************************

SLIDE 30: PORTRAIT OF PEARSON IN 1890

**************************************************************

SLIDE 31: PEARSON 1934

**************************************************************

Now, Pearson has been described as a person "with the kind of drive and determination that took Hannibal over the Alps and Marco Polo to China." When Pearson felt like a buccaneer, you can be sure he got his bounty.

SLIDE 32: CONTINGENCY TABLE

**************************************************************

SLIDE 33: PEARSON 1934

**************************************************************

SLIDE 34: FISHER

**************************************************************

And that is roughly where things stand today... If we count the number of doctoral theses, research papers, or textbook pages written on causation, we get the impression that Pearson still rules statistics. The "Encyclopedia of Statistical Science" devotes 12 pages to correlation but only 2 pages to causation, and spends one of those pages demonstrating that "correlation does not imply causation."

Let us hear what modern statisticians say about causality.

SLIDE 35: MODERN STATISTICS AND CAUSALITY

**************************************************************

This position of caution and avoidance has paralyzed many fields that look to statistics for guidance, especially economics and social science. A leading social scientist stated in 1987: "It would be very healthy if more researchers abandon thinking of and using terms such as cause and effect."

Can this state of affairs be the work of just one person? Even a buccaneer like Pearson? I doubt it.

But how else can we explain why statistics, the field that has given the world such powerful concepts as hypothesis testing and the design of experiments, would give up so early on causation?

One obvious explanation is, of course, that causation is much harder to measure than correlation. Correlations can be estimated directly in a single uncontrolled study, while causal conclusions require controlled experiments.

But this is too simplistic; statisticians are not easily deterred by difficulties, and children manage to learn cause-effect relations WITHOUT running controlled experiments. The answer, I believe, lies deeper, and it has to do with the official language of statistics, namely the language of probability. This may come as a surprise to some of you, but the word "CAUSE" is not in the vocabulary of probability theory; we cannot express in the language of probabilities the sentence "MUD DOES NOT CAUSE RAIN" -- all we can say is that the two are mutually correlated, or dependent -- meaning if we find one, we can expect the other. Naturally, if we lack a language to express a certain concept explicitly, we can't expect to develop scientific activity around that concept. Scientific development requires that knowledge be transferred reliably from one study to another and, as Galileo showed 350 years ago, such transference requires the precision and computational benefits of a formal language.

I will soon come back to discuss the importance of language and notation, but first, I wish to conclude this historical survey with a tale from another field in which causation has had its share of difficulty. This time it is computer science -- the science of symbols -- a field that is relatively new, yet it has placed a tremendous emphasis on language and notation and, therefore, may offer a useful perspective on the problem.

When researchers began to encode causal relationships using computers, the two riddles of causation were awakened with renewed vigor.

SLIDE 36: ROBOT IN LAB

**************************************************************

SLIDE 37: ROBOT WITH MENTOR

**************************************************************

SLIDE 38: OLD RIDDLES IN NEW DRESS

**************************************************************

SLIDE 39: PROGRAMMER'S NIGHTMARE

**************************************************************

SLIDE 40: OUTLINE - PART 2

**************************************************************

Let us start with an area that uses causation extensively and never had any trouble with it: Engineering.

SLIDE 41: CIRCUIT DIAGRAM

**************************************************************

The designer of this circuit did not anticipate or even consider such weird interventions, yet, miraculously, we can predict their consequences. How? Where does this representational power come from?

It comes from what early economists called AUTONOMY, namely, the gates in this diagram represent independent mechanisms -- it is easy to change one without changing the other. The diagram takes advantage of this independence and describes the normal functioning of the circuit USING PRECISELY THOSE BUILDING BLOCKS THAT WILL REMAIN UNALTERED UNDER INTERVENTION.
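To make the autonomy idea concrete, here is a minimal sketch in Python. The two-gate circuit and the gate names (`g1`, `g2`) are hypothetical, not the circuit on the slide; the point is only that each gate is a separate, replaceable mechanism, so an intervention swaps out exactly one building block and leaves the rest untouched:

```python
def and_gate(a, b):
    return a & b

def or_gate(a, b):
    return a | b

def circuit(a, b, c, gates=None):
    """Output of a hypothetical two-gate circuit: OR(AND(a, b), c).

    `gates` lets us surgically replace one mechanism -- an intervention
    the designer never anticipated -- without touching the others.
    """
    g = {"g1": and_gate, "g2": or_gate}
    if gates:
        g.update(gates)
    return g["g2"](g["g1"](a, b), c)

print(circuit(1, 1, 0))  # normal operation: the AND gate fires, output is 1
# A "weird" intervention: hold g1's output at 0 regardless of its inputs.
print(circuit(1, 1, 0, gates={"g1": lambda a, b: 0}))  # output is now 0
```

Because the diagram's building blocks are independent, predicting the stuck-gate behavior required changing one dictionary entry, not re-deriving the whole circuit.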

My colleagues from Boelter Hall are surely wondering why I stand here before you blathering about an engineering triviality as if it were the 8th wonder of the world. I have three reasons for doing this. First, I will try to show that there is a lot of unexploited wisdom in practices that engineers take for granted.

SLIDE 42: PATH DIAGRAMS

**************************************************************

Finally, these diagrams capture, in my opinion, the very essence of causation -- the ability to predict the consequences of abnormal eventualities and new manipulations. In this diagram, for example, it is possible to predict what coat pattern the guinea-pig litter is likely to have if we change environmental factors, shown here as input (E) in green, or even genetic factors, shown in red as intermediate nodes between parents and offspring (H). Such predictions cannot be made on the basis of algebraic or correlational analysis.

Viewing causality this way explains why scientists pursue causal explanations with such zeal, and why attaining a causal model is accompanied with a sense of gaining "deep understanding" and "being in control."

SLIDE 43: DUCK MACHINE

**************************************************************

Interestingly, when we have such understanding we feel "in control" even if we have no practical way of controlling things. For example, we have no practical way to control celestial motion, and still the theory of gravitation gives us a feeling of understanding and control, because it provides a blueprint for hypothetical control. We can predict the effect on tidal waves of unexpected new events -- say, the moon being hit by a meteor or the gravitational constant suddenly diminishing by a factor of 2 -- and, just as important, the gravitational theory gives us the assurance that ordinary manipulation of earthly things will NOT control tidal waves. It is not surprising that causal models are viewed as the litmus test distinguishing deliberate reasoning from reactive or instinctive response. Birds and monkeys may possibly be trained to perform complex tasks such as fixing a broken wire, but that requires trial-and-error training. Deliberate reasoners, on the other hand, can anticipate the consequences of new manipulations WITHOUT EVER TRYING those manipulations.

SLIDE 44: EQUATIONS VS. DIAGRAMS

**************************************************************

But are these equations EQUIVALENT to the diagram on the right? Obviously not! If they were, then let us switch the variables around, and the resulting two equations should be equivalent to the circuit shown below. But these two circuits are different. The top one tells us that if we physically manipulate *Y* it will affect *Z*, while the bottom one shows that manipulating *Y* will affect *X* and will have no effect on *Z*. Moreover, performing some additional algebraic operations on our equations, we can obtain two new equations, shown at the bottom, which point to no structure AT ALL; they simply represent two constraints on three variables, without telling us how they influence each other.

Let us examine more closely the mental process by which we determine the effect of physically manipulating *Y*, say setting *Y* to 0.

SLIDE 45: INTERVENTION AS SURGERY ON MECHANISM

**************************************************************

We now see how this model of intervention leads to a formal definition of causation: "*Y* is a cause of *Z* if we can change *Z* by manipulating *Y*, namely, if after surgically removing the equation for *Y*, the solution for *Z* will depend on the new value we substitute for *Y*." We also see how vital the diagram is in this process. THE DIAGRAM TELLS US WHICH EQUATION IS TO BE DELETED WHEN WE MANIPULATE *Y*. That information is totally washed out when we transform the equations into an algebraically equivalent form, as shown at the bottom of the screen -- from this pair of equations alone, it is impossible to predict the result of setting *Y* to 0, because we do not know what surgery to perform -- there is no such thing as "the equation for *Y*".

IN SUMMARY, INTERVENTION AMOUNTS TO A SURGERY ON EQUATIONS, GUIDED BY A DIAGRAM, AND CAUSATION MEANS PREDICTING THE CONSEQUENCES OF SUCH A SURGERY.
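This summary can be illustrated with a minimal sketch. The linear equations below are invented for illustration (they are not the ones on the screen): two models with identical solution sets but opposite mechanism orderings give different answers to the surgery do(*Y* = 0), and only the diagram tells us which equation to delete:

```python
def chain_xyz(x=1.0, do_y=None):
    """Model X -> Y -> Z:  y = 2x,  z = 3y.
    do(Y) deletes Y's equation; Z's mechanism stays intact."""
    y = 2 * x if do_y is None else do_y
    z = 3 * y
    return x, y, z

def chain_zyx(z=6.0, do_y=None):
    """Model Z -> Y -> X:  y = z/3,  x = y/2.
    Algebraically the same constraints, but reversed mechanisms:
    now do(Y) propagates to X and leaves Z untouched."""
    y = z / 3 if do_y is None else do_y
    x = y / 2
    return x, y, z

# Without intervention the two models are indistinguishable:
assert chain_xyz() == chain_zyx()  # both give (1.0, 2.0, 6.0)

# Under the surgery do(Y = 0) they disagree:
print(chain_xyz(do_y=0))  # (1.0, 0, 0):   Z responds, X does not
print(chain_zyx(do_y=0))  # (0.0, 0, 6.0): X responds, Z does not
```

The equations alone cannot distinguish the two functions; the causal structure, encoded here in which variable each function computes from which, is exactly the information the diagram supplies.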

This is a universal theme that goes beyond physical systems. In fact, the idea of modeling interventions by "wiping out" equations was first proposed by an ECONOMIST, Herman Wold, in 1960, but his teachings have all but disappeared from the economics literature. History books attribute this mysterious disappearance to Wold's personality, but I tend to believe that the reason goes deeper: early econometricians were very careful mathematicians; they fought hard to keep their algebra clean and formal, and could not agree to have it contaminated by gimmicks such as diagrams. And, as we see on the screen, the surgery operation makes no mathematical sense without the diagram, as it is sensitive to the way we write the equations.

Before expounding on the properties of this new mathematical operation, let me demonstrate how useful it is for clarifying concepts in statistics and economics.

SLIDE 46: INTERVENTION AS SURGERY - CONTROLLED EXPERIMENTS

**************************************************************

It actually consists of two parts, randomization and INTERVENTION. Intervention means that we change the natural behavior of the individual: we separate subjects into two groups, called treatment and control, and we convince the subjects to obey the experimental policy. We assign treatment to some patients who, under normal circumstances, will not seek treatment, and we give placebo to patients who otherwise would receive treatment. That, in our new vocabulary, means SURGERY -- we are severing one functional link and replacing it by another. Fisher's great insight was that connecting the new link to a random coin flip GUARANTEES that the link we wish to break is actually broken. The reason is that a random coin is assumed unaffected by anything we can measure on a macroscopic level, including, of course, a patient's socio-economic background.
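A small simulation can illustrate Fisher's insight. The model and all numbers below are invented for illustration: a background factor U drives both treatment-seeking and the outcome, so the naive treated-versus-control contrast is biased; replacing the treatment mechanism by a coin flip severs the U-to-treatment link and recovers the true effect:

```python
import random

random.seed(0)
TRUE_EFFECT = 2.0

def outcome(treated, u):
    # U confounds: it affects the outcome directly
    return TRUE_EFFECT * treated + u

def natural_treatment(u):
    # Under normal circumstances, subjects with high U seek treatment
    return 1 if u > 0 else 0

def run(n, randomized):
    treated_y, control_y = [], []
    for _ in range(n):
        u = random.gauss(0, 1)
        # SURGERY: the coin flip replaces the natural treatment mechanism
        t = random.randint(0, 1) if randomized else natural_treatment(u)
        (treated_y if t else control_y).append(outcome(t, u))
    return sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)

print(run(100_000, randomized=False))  # biased upward, well above 2.0
print(run(100_000, randomized=True))   # close to the true effect, 2.0
```

The observational contrast mixes the treatment effect with the difference in U between self-selected groups; randomization makes treatment independent of U, which is exactly the severed link described above.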

This picture provides a meaningful and formal rationale for the universally accepted procedure of randomized trials. In contrast, our next example uses the surgery idea to point out inadequacies in widely accepted procedures.

SLIDE 47: EXAMPLE 2 - POLICY ANALYSIS

**************************************************************

In this set-up it is, of course, impossible to connect our policy to a coin and run a controlled experiment; we do not have the time for that, and we might ruin the economy before the experiment is over. Nevertheless, the analysis that we SHOULD CONDUCT is to infer the behavior of this mutilated model from data governed by a non-mutilated model.

I said SHOULD CONDUCT, because you will not find such an analysis in any economics textbook. As I mentioned earlier, the surgery idea of Herman Wold was stamped out of the economics literature in the 1970's, and all discussions on policy analysis that I could find assume that the mutilated model prevails throughout. The fact that taxation is under government control at the time of evaluation is assumed to be sufficient for treating taxation as an exogenous variable throughout when, in fact, taxation is an endogenous variable during the model-building phase and turns exogenous only when evaluated. Of course, I am not claiming that reinstating the surgery model would enable the government to balance its budget overnight, but it is certainly something worth trying.

Let us examine now how the surgery interpretation resolves Russell's enigma concerning the clash between the directionality of causal relations and the symmetry of physical equations. The equations of physics are indeed symmetrical, but when we compare the phrases "*A* CAUSES *B*" vs. "*B* CAUSES *A*" we are not talking about a single set of equations. Rather, we are comparing two world models, represented by two different sets of equations: one in which the equation for *A* is surgically removed, the other in which the equation for *B* is removed. Russell would probably stop us at this point and ask: "How can you talk about TWO world models, when in fact there is only one world model, given by all the equations of physics put together?" The answer is: YES. If you wish to include the entire universe in the model, causality disappears because interventions disappear -- the manipulator and the manipulated lose their distinction. However, scientists rarely consider the entirety of the universe as an object of investigation. In most cases the scientist carves a piece from the universe and proclaims that piece IN -- namely, the FOCUS of investigation. The rest of the universe is then considered OUT or BACKGROUND, and is summarized by what we call BOUNDARY CONDITIONS. THIS CHOICE OF INs AND OUTs CREATES ASYMMETRY IN THE WAY WE LOOK AT THINGS, AND IT IS THIS ASYMMETRY THAT PERMITS US TO TALK ABOUT "OUTSIDE INTERVENTION", HENCE, CAUSALITY AND CAUSE-EFFECT DIRECTIONALITY.

SLIDE 48: HAND-EYE ONLY

**************************************************************

SLIDE 49: HAND-EYE W/ OUT-IN

**************************************************************

SLIDE 50: HAND-EYE W/ IN-OUT

**************************************************************

SLIDE 51: FROM PHYSICS TO CAUSALITY

**************************************************************

We discussed earlier how important the computational facility of algebra was to scientists and engineers in the Galilean era. Can we expect such algebraic facility to serve causality as well? Let me rephrase it differently: Scientific activity, as we know it, consists of two basic components:

SLIDE 52: TELESCOPE

**************************************************************

SLIDE 53: HAMMER

**************************************************************

SLIDE 54: LABORATORY

**************************************************************

SLIDE 55: NEEDED: ALGEBRA OF DOING

**************************************************************

But suppose we ask a different question: "What is the chance it rained if we MAKE the grass wet?" We cannot even express our query in the syntax of probability, because the vertical bar is already taken to mean "given that I see". We can invent a new symbol "DO", and each time we see a DO after the bar we read it "GIVEN THAT WE DO" -- but this does not help us compute the answer to our question, because the rules of probability do not apply to this new reading. We know intuitively what the answer should be: P(rain), because making the grass wet does not change the chance of rain. But can this intuitive answer, and others like it, be derived mechanically, so as to comfort our thoughts when intuition fails?

The answer is YES, and it takes a new algebra: First, we assign a symbol to the new operator "given that I do". Second, we find the rules for manipulating sentences containing this new symbol. We do that by a process analogous to the way mathematicians found the rules of standard algebra.
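The rain-and-grass intuition can be checked on a toy model before any calculus is invoked. The probabilities below are invented for illustration: rain and a sprinkler independently wet the grass; SEEING wet grass raises the chance of rain, while DOING (the surgery that replaces the wet-grass mechanism) leaves it at P(rain):

```python
P_rain = 0.3
P_sprinkler = 0.4

def p_wet(rain, spr):
    # Grass is wet exactly when it rained or the sprinkler ran
    return 1.0 if (rain or spr) else 0.0

def joint(r, s, w):
    pr = P_rain if r else 1 - P_rain
    ps = P_sprinkler if s else 1 - P_sprinkler
    pw = p_wet(r, s) if w else 1 - p_wet(r, s)
    return pr * ps * pw

# SEEING: P(rain = 1 | wet = 1), by ordinary conditioning on the joint
num = sum(joint(1, s, 1) for s in (0, 1))
den = sum(joint(r, s, 1) for r in (0, 1) for s in (0, 1))
p_rain_given_see_wet = num / den

# DOING: surgery deletes the wet-grass mechanism and sets wet = 1 by hand,
# so wet carries no information about rain: P(rain | do(wet = 1)) = P(rain)
p_rain_given_do_wet = P_rain

print(p_rain_given_see_wet)  # about 0.52: wet grass is evidence of rain
print(p_rain_given_do_wet)   # 0.3: wetting the grass tells us nothing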

SLIDE 56: NEEDED: ALGEBRA OF DOING

**************************************************************

In exactly the same fashion, we can deduce the rules
that govern our new symbol: *do(x)*.
We have an algebra for seeing, namely, probability theory.
We have a new operator, with a brand new red outfit and a
very clear meaning, given to us by the surgery procedure.
The door is open for deduction and the
result is give in the next slide.

SLIDE 57: CAUSAL CALCULUS

**************************************************************
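For readers following this transcript without the slides: the calculus referred to here consists of three inference rules, published in Pearl's 1995 Biometrika paper. Sketched here from memory, with $G_{\overline{X}}$ denoting the graph with arrows INTO $X$ deleted and $G_{\underline{X}}$ the graph with arrows OUT OF $X$ deleted, they read:

```latex
% Rule 1 (insertion/deletion of observations)
P(y \mid do(x), z, w) = P(y \mid do(x), w)
  \quad \text{if } (Y \perp\!\!\!\perp Z \mid X, W) \text{ in } G_{\overline{X}}

% Rule 2 (action/observation exchange)
P(y \mid do(x), do(z), w) = P(y \mid do(x), z, w)
  \quad \text{if } (Y \perp\!\!\!\perp Z \mid X, W) \text{ in } G_{\overline{X}\,\underline{Z}}

% Rule 3 (insertion/deletion of actions)
P(y \mid do(x), do(z), w) = P(y \mid do(x), w)
  \quad \text{if } (Y \perp\!\!\!\perp Z \mid X, W) \text{ in } G_{\overline{X}\,\overline{Z(W)}}
```

where $Z(W)$ is the set of $Z$-nodes that are not ancestors of any $W$-node in $G_{\overline{X}}$.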

SLIDE 58: OUTLINE

**************************************************************

SLIDE 59: SMOKING - CANCER

**************************************************************

These studies came under severe attack from the tobacco industry, backed by some very prominent statisticians, among them Sir Ronald Fisher. The claim was that the observed correlations can also be explained by a model in which there is no causal connection between smoking and lung cancer. Instead, an unobserved genotype might exist which simultaneously causes cancer and produces an inborn craving for nicotine. Formally, this claim would be written in our notation as *P(cancer | do(smoke)) = P(cancer)*, stating that making the population smoke or stop smoking would have no effect on the rate of cancer cases. A controlled experiment could decide between the two models, but such experiments are impossible, and now also illegal, to conduct.

This is all history. Now we enter a hypothetical era where representatives of both sides decide to meet and iron out their differences. The tobacco industry concedes that there might be some weak causal link between smoking and cancer and representatives of the health group concede that there might be some weak links to genetic factors. Accordingly, they draw this combined model, and the question boils down to assessing, from the data, the strengths of the various links. They submit the query to a statistician and the answer comes back immediately: IMPOSSIBLE. Meaning: there is no way to estimate the strength from the data, because any data whatsoever can perfectly fit either one of these two extreme models. So they give up, and decide to continue the political battle as usual.

Before parting, a suggestion comes up: perhaps we can resolve our differences if we measure some auxiliary factors. For example, since the causal-link model is based on the understanding that smoking affects lung cancer through the accumulation of tar deposits in the lungs, perhaps we can measure the amount of tar deposits in the lungs of sampled individuals, and this might provide the necessary information for quantifying the links. Both sides agree that this is a reasonable suggestion, so they submit a new query to the statistician: Can we find the effect of smoking on cancer, assuming that an intermediate measurement of tar deposits is available? The statistician comes back with good news: IT IS COMPUTABLE and, moreover, the solution is given in closed mathematical form. HOW?
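The computation the statistician performs is what is now called the front-door adjustment. A sketch with invented numbers: smoking (S) affects cancer (C) only through tar (T), while a hidden genotype (U) confounds S and C. The formula uses only observed quantities, and we can check it against the ground truth obtained by direct surgery on the model:

```python
import itertools

# Hypothetical conditional probabilities (illustrative numbers only)
P_u = {0: 0.5, 1: 0.5}                       # hidden genotype
P_s_given_u = {0: 0.2, 1: 0.8}               # P(S=1 | U=u)
P_t_given_s = {0: 0.1, 1: 0.9}               # P(T=1 | S=s): smoking deposits tar
P_c_given_t_u = {(0, 0): 0.1, (0, 1): 0.4,   # P(C=1 | T=t, U=u)
                 (1, 0): 0.5, (1, 1): 0.8}

def joint(u, s, t, c):
    pu = P_u[u]
    ps = P_s_given_u[u] if s else 1 - P_s_given_u[u]
    pt = P_t_given_s[s] if t else 1 - P_t_given_s[s]
    pc = P_c_given_t_u[(t, u)] if c else 1 - P_c_given_t_u[(t, u)]
    return pu * ps * pt * pc

def P(query, **given):
    """Conditional probability computed from the OBSERVED joint (U summed out)."""
    num = den = 0.0
    for u, s, t, c in itertools.product((0, 1), repeat=4):
        v = dict(u=u, s=s, t=t, c=c)
        p = joint(u, s, t, c)
        if all(v[k] == val for k, val in given.items()):
            den += p
            if all(v[k] == val for k, val in query.items()):
                num += p
    return num / den

def front_door(s):
    """P(C=1 | do(S=s)) = sum_t P(t|s) * sum_s' P(C=1 | t, s') P(s')."""
    return sum(P(dict(t=t), s=s) *
               sum(P(dict(c=1), t=t, s=s2) * P(dict(s=s2)) for s2 in (0, 1))
               for t in (0, 1))

def truth(s):
    """Ground truth by surgery: delete the equation for S and set S = s."""
    return sum(P_u[u] *
               (P_t_given_s[s] if t else 1 - P_t_given_s[s]) *
               P_c_given_t_u[(t, u)]
               for u in (0, 1) for t in (0, 1))

for s in (0, 1):
    assert abs(front_door(s) - truth(s)) < 1e-9
print(front_door(1) - front_door(0))  # the causal effect of smoking on cancer
```

Note that `front_door` never touches U; it uses only quantities estimable from data on S, T, and C, which is exactly why the tar measurement resolves the deadlock.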

SLIDE 60: PROOF

**************************************************************

You are probably wondering whether this derivation solves the smoking-cancer debate. The answer is NO. Even if we could get the data on tar deposits, the model above is quite simplistic, as it is based on certain assumptions to which both parties might not agree. For instance, that there is no direct link between smoking and lung cancer, unmediated by tar deposits. The model would then need to be refined, and we might end up with a graph containing 20 variables or more. There is no need to panic when someone tells us: "you did not take this or that factor into account". On the contrary, the graph welcomes such new ideas, because it is so easy to add factors and measurements into the model. Simple tests are now available that permit an investigator to merely glance at the graph and decide if we can compute the effect of one variable on another.

Our next example illustrates how a long-standing problem is solved by purely graphical means -- proven by the new algebra. The problem is called THE ADJUSTMENT PROBLEM or "the covariate selection problem" and represents the practical side of Simpson's paradox.

SLIDE 61: SIMPSON'S PARADOX

**************************************************************

Equally disturbing is the fact that no one has been able to tell us which factors SHOULD be included in the analysis. Such factors can now be identified by simple graphical means.

The classical case demonstrating Simpson's paradox took place in 1975, when UC Berkeley was investigated for sex bias in graduate admission. In this study, overall data showed a higher rate of admission among male applicants, but, broken down by departments, data showed a slight bias in favor of admitting female applicants. The explanation is simple: female applicants tended to apply to more competitive departments than males, and in these departments, the rate of admission was low for both males and females.
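The reversal can be reproduced with a few lines of arithmetic. The admission numbers below are illustrative, not the actual Berkeley figures: each department admits women at a slightly higher rate, yet the aggregate favors men, because women apply mostly to the harder department:

```python
# (applicants, admitted) for a hypothetical two-department university
data = {
    "easy": {"men": (800, 560), "women": (100, 72)},   # ~70% admission rate
    "hard": {"men": (200, 40),  "women": (900, 200)},  # ~20% admission rate
}

def rate(apps, admitted):
    return admitted / apps

# Department by department, women are admitted at a HIGHER rate
for dept, d in data.items():
    print(dept, f"men {rate(*d['men']):.0%}", f"women {rate(*d['women']):.0%}")

# Aggregated over departments, men are admitted at a HIGHER rate
tot = {g: tuple(sum(d[g][i] for d in data.values()) for i in (0, 1))
       for g in ("men", "women")}
print("overall",
      f"men {rate(*tot['men']):.0%}",    # 60%
      f"women {rate(*tot['women']):.0%}")  # 27%
```

The aggregate comparison mixes the department choice (a cause of both sex-composition and admission rate) into the contrast, which is precisely the covariate-selection question the adjustment problem addresses.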

SLIDE 62: FISHNET

**************************************************************

Another example involves a controversy called "reverse regression", which occupied the social science literature in the 1970's. Should we, in salary discrimination cases, compare salaries of equally qualified men and women, or, instead, compare qualifications of equally paid men and women? Remarkably, the two choices led to opposite conclusions. It turned out that men earned a higher salary than equally qualified women, and SIMULTANEOUSLY, men were more qualified than equally paid women. The moral is that all conclusions are extremely sensitive to which variables we choose to hold constant when we are comparing, and that is why the adjustment problem is so critical in the analysis of observational studies.

SLIDE 63: THE STATISTICAL ADJUSTMENT PROBLEM

**************************************************************

SLIDE 64: GRAPHICAL SOLUTION OF THE ADJUSTMENT PROBLEM

**************************************************************

SLIDES 65-69: GRAPHICAL SOLUTION OF THE ADJUSTMENT PROBLEM (CONT)

**************************************************************

ENDING STATEMENT

I now wish to summarize briefly the central message of this lecture. It is true that testing for cause and effect is difficult. Discovering causes of effects is even more difficult. But causality is not MYSTICAL OR METAPHYSICAL. It can be understood in terms of simple processes, and it can be expressed in a friendly mathematical language, ready for computer analysis.

SLIDE 70: ABACUS HELD BY CHILD

*********************************************************

SLIDE 71: BOETHIUS ON MATH NOTATION

*********************************************************

But the really challenging problems are still ahead: We still do not have a causal understanding of POVERTY and CANCER and INTOLERANCE, and only the accumulation of data and the insight of great minds will eventually lead to such understanding. The data is all over the place, the insight is yours, and now an abacus is at your disposal too. I hope the combination amplifies each of these components. Thank you.

[Remarks: technical details can be found in J. Pearl, "Causal diagrams for empirical research" (with discussion), Biometrika, 82(4), 669-710, December 1995, and J. Pearl, "Structural and probabilistic causality," in D.R. Shanks, K.J. Holyoak, and D.L. Medin (Eds.), The Psychology of Learning and Motivation, Vol. 34, Academic Press, San Diego, CA, 393-435, 1996. Copies are available at Cognitive Systems Laboratory Publications.]
