Chapter 7

Causality

Correlation Is Not Causality

Chicken or Egg?

Siblings usually have similar heights. Tall people are expected to have tall siblings, and short people are expected to have short siblings. At least this is the conventional wisdom. Which sibling’s height explains the height of the others? Is it the first child’s height that sets the height expectations for the rest of the siblings, or the last child’s? In other words, whose height causes the height of the other siblings? Although most people would agree that sibling heights are highly correlated, few would expect one child’s height to explain a sibling’s height. The cause, or the explanatory variable that determines sibling heights, is the height of the parents. Nevertheless, a regression of the height of one sibling on the height of another would produce a highly significant coefficient and favorable values for the other measures of model fit. In this example, it is obvious that the high correlation between two siblings is not causal; both heights share a common cause.
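The sibling example can be illustrated with a small simulation. The following is a minimal sketch, not an analysis of real data: the helper names (ols, simulate_siblings), the sample size, and the height parameters are all assumptions chosen for illustration. Each sibling's height is generated from the parents' height plus independent noise, so by construction neither sibling affects the other.

```python
import numpy as np

def ols(y, *columns):
    """Return OLS coefficients [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def simulate_siblings(n=5000, seed=0):
    """Hypothetical data: both siblings depend only on the parents' height."""
    rng = np.random.default_rng(seed)
    parents = rng.normal(170.0, 7.0, n)         # assumed mid-parent height, cm
    sib_a = parents + rng.normal(0.0, 5.0, n)   # no link from sib_b to sib_a
    sib_b = parents + rng.normal(0.0, 5.0, n)   # no link from sib_a to sib_b
    return parents, sib_a, sib_b

parents, sib_a, sib_b = simulate_siblings()

naive = ols(sib_a, sib_b)                 # sibling regressed on sibling
controlled = ols(sib_a, sib_b, parents)   # parents' height added to the model

print("slope on sibling, parents omitted: ", round(naive[1], 3))
print("slope on sibling, parents included:", round(controlled[1], 3))
print("slope on parents:                  ", round(controlled[2], 3))
```

In this constructed setting the sibling-on-sibling slope is large when the parents' height is omitted and falls to roughly zero once it is included, which is what a common cause would be expected to produce.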

How about the correlation between income and consumption? Does income determine consumption, does consumption determine income, or is there something else that determines both? Economic theory states that the causal direction must be from income toward consumption. As income changes, the level of consumption changes because a fraction of each additional unit of income, known as the “marginal propensity to consume out of income,” is spent on consumption. The direction of causality is logical. When A causes B in this strict sense, A can occur without B occurring, but B cannot occur unless A occurs. This type of causality exists between parents and children. A similar, but not identical, relationship exists between income and consumption. Typically, earning income entitles one to consumption, while the opposite is not true. Nevertheless, one can imagine consumption without income, such as through borrowing, charity, or theft. The important thing is that consumption does not cause income.

Regression analysis can and does provide evidence for causality, but the reason for causality is rooted in economic theory, not in the results of regression analysis. Recall how in Chapter 1 a random variable was regressed on another random variable. Obviously, the random variable that was used as the independent variable did not cause the random variable used as the dependent variable. Regression analysis either provides evidence to disprove a hypothesis or fails to do so. Whether the hypothesis is false or groundless does not change the computations of regression analysis and, therefore, the outcome. Results of a regression analysis should never be used to form hypotheses about a model or its coefficients. In fact, claims about regression coefficients must be made, using economic theory, before any data are gathered.

The Role of Theory

Demonstrating a causal relationship requires a valid theory. The theory states the requirement and the outcome. For example, the theory of demand states that when there is an exogenous increase in income, the demand curve will shift to the right, and the price and quantity demanded will increase. Any time the theory’s requirements are met, the theory’s prediction applies. This is a testable hypothesis using regression analysis. Note that the claim is that an increase in income causes the above-mentioned outcomes. Ignoring this causal factor, obtaining data on quantity demanded and price, and regressing the first on the second would result in the incorrect conclusion that a price increase causes an increase in the quantity demanded. Careless use of data and regression analysis without theory could prove disastrous. This prediction of demand theory holds everywhere and for every good, except for an inferior good. An inferior good is a good with negative income elasticity; an increase in income would result in a decrease in quantity demanded for an inferior good. Therefore, the customary assertion that an increase in income causes an increase in quantity demanded does not apply to inferior goods.

Most studies using cross-sectional data can be causal in nature. In such studies, a theory is stated. The theory is believed to apply to all populations under study, such as firms, individuals, regions, or countries, and there is a causal effect between the independent variable(s) and the dependent variable. If there are exceptions, such as the case of inferior goods in the above example, they are noted. Such studies are suitable for verification of theories.
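The demand example above, in which income shifts the demand curve, can be sketched with simulated data. Everything in the sketch is an assumption made for illustration: the linear demand and supply equations, their coefficients, the size of the supply shocks, and the measurement noise added to recorded quantity. Demand is taken to be q = 100 - 2p + 0.5·income and supply to be q = 10 + p + shock, and the observed price and quantity are the market-clearing values.

```python
import numpy as np

def ols(y, *columns):
    """Return OLS coefficients [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def simulate_market(n=2000, seed=1):
    """Hypothetical market-clearing prices and quantities."""
    rng = np.random.default_rng(seed)
    income = rng.normal(100.0, 20.0, n)       # exogenous demand shifter
    supply_shock = rng.normal(0.0, 2.0, n)    # e.g., weather or capacity
    # Solve 100 - 2p + 0.5*income = 10 + p + supply_shock for p.
    price = (90.0 + 0.5 * income - supply_shock) / 3.0
    quantity = 10.0 + price + supply_shock
    quantity = quantity + rng.normal(0.0, 0.5, n)   # measurement noise (assumed)
    return income, price, quantity

income, price, quantity = simulate_market()

naive = ols(quantity, price)                 # income ignored
controlled = ols(quantity, price, income)    # income held constant

print("price slope, income ignored: ", round(naive[1], 3))
print("price slope, income included:", round(controlled[1], 3))
print("income slope:                ", round(controlled[2], 3))
```

Under these assumptions the regression that ignores income yields a positive price slope, because most of the price variation comes from income-driven shifts of the demand curve. Including income in the model recovers a negative price slope close to the assumed value of -2. The result depends on the simplifying assumption that income is the only source of demand shifts.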

Direction of Causality

Causality is more evident in some studies than in others. For example, in a study of the relationship between the height of a father and the height of his son, the direction of causality is clearly from father to son and never the other way around. The direction of the cause-and-effect relationship between education and income is not as clear. It seems that people with higher levels of education should have higher levels of income, indicating a causal relationship in which education determines income. However, it is easy to argue that people with higher (family) income attain higher levels of education, which places the causal relationship in the other direction, where income determines educational attainment. In fact, children of poor families have lower educational attainment, which in turn perpetuates poverty. A child’s educational attainment cannot be the cause of low family income, while low family income can, potentially, cause low educational attainment of the children of the family. In this case, using family income instead of individual income clearly establishes the direction of causality. We tend to believe or hope that the direction is from education to income and not the other way around, but reality and logic cannot be ignored. We are not, however, claiming that education does not have a positive effect on income. Sometimes, by using lagged variables, one can establish the direction of causality. However, it might not be possible to use this methodology to test the possibility of education causing income because, among other things, income levels are serially correlated, which means this year’s income is highly correlated with last year’s income.

Sometimes the direction of causality is misunderstood because of poor comprehension of the theory. In economics, the theory of demand states that a change in price results in a change in quantity demanded in the opposite direction, except in the case of a Giffen good. A novice might argue that if the quantity declines, the price must increase. The novice is correct in that a decline in quantity would result in an increase in price, but the cause is not the decline in quantity. Rather, the cause is whatever made the quantity decline. The statement, as presented, is a statement about the quantity supplied, not the quantity demanded. Furthermore, the supply would not decline by itself; something must cause the reduction in supply, such as a natural disaster or a change in production capacity. It is not wise to ignore the origin of a cause and look at only part of a cause-and-effect chain.

What if the subject of the study does not have an existing theory, or the necessary data to conduct research based on a theory cannot be obtained? Similar problems exist when there are conflicting or contradictory theories, when the assumptions or the requirements of the theory cannot be met, or when the theory does not provide a specific or measurable outcome. Examples abound, but only a few are needed to make the point clear. Stock market prices follow a random walk model, which is stochastic in nature and cannot be estimated with conventional models such as regression. Factors that affect weekly sales are nonlinear and are affected by the calendar date as well as seasonal effects and, therefore, cannot be estimated with linear models such as regression. When an inappropriate linear regression model is used, the outcome is meaningless.

Association Without Causality

In many cases, problems that render regression-based causal models ineffective can be remedied by other methods. For example, a good estimator, or even a predictor, of today’s temperature is yesterday’s temperature and the temperature on the same date a year before. An economic example is the price of a home, which depends on the prices of the neighboring homes, location, and other hedonic factors rather than on economic theory; this does not mean that the economic theory of supply and demand does not have a role in determining the price of a home. The point is that it is somewhat meaningless to say that yesterday’s temperature “caused” today’s temperature. Although regressing the temperature of a day on the temperature of the day before results in a “good fit,” it does not prove causality. Both yesterday’s and today’s temperatures depend on the time of year, which changes the distance from the sun and the angle of the earth toward the sun. Here, the existence of the sun, the distance from it, and the earth’s angle toward the sun are the causes of a day’s temperature, and all other factors are control variables.
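The temperature example can also be sketched numerically. The series below is simulated, not observed: the seasonal mean is assumed to follow a sine wave over a 365-day year (a stand-in for the position of the earth relative to the sun), and each day's deviation from that mean is assumed to be independent noise. The function names and all parameter values are hypothetical.

```python
import numpy as np

def ols(y, *columns):
    """Return OLS coefficients [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def r_squared(y, *columns):
    """R-squared of the OLS fit of y on the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    resid = y - X @ ols(y, *columns)
    return 1.0 - resid.var() / y.var()

def simulate_temperature(years=10, seed=2):
    """Hypothetical daily temperatures driven only by the season."""
    rng = np.random.default_rng(seed)
    day = np.arange(365 * years)
    season = 15.0 + 10.0 * np.sin(2 * np.pi * day / 365.0)
    temp = season + rng.normal(0.0, 3.0, day.size)   # independent daily noise
    return day, temp

day, temp = simulate_temperature()
today, yesterday = temp[1:], temp[:-1]
angle = 2 * np.pi * day[1:] / 365.0
sin_t, cos_t = np.sin(angle), np.cos(angle)          # time-of-year controls

naive = ols(today, yesterday)
controlled = ols(today, yesterday, sin_t, cos_t)

print("R-squared, today on yesterday:      ", round(r_squared(today, yesterday), 3))
print("slope on yesterday, season omitted: ", round(naive[1], 3))
print("slope on yesterday, season included:", round(controlled[1], 3))
```

In this constructed series yesterday's temperature fits today's temperature well, yet its coefficient is close to zero once the time of year is in the model. Real temperature series also carry day-to-day persistence from weather systems, which this sketch deliberately leaves out.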

Many time series have characteristics that do not lend themselves to causal regression analysis. There are specific regression-type analyses that apply to such data, but they are beyond the scope of this book. In general, noncausal models, especially those that depend on or use time series data, are reliable in the short run but not in the long run. Causal models are reliable in the long run but might be affected by short-run shocks and perform more poorly in the short run. When regression is applied to cases where there are no theoretically stated causal relationships, the results indicate only an association and not causation. Therefore, causality must be established before the regression is performed and not inferred from the outcome of a regression analysis. A famous historical example is the statement that sunspot activity causes business cycles. A high degree of association, represented by a large R² value, does not indicate causation.
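The warning about large R² values in time series can be illustrated with a short simulation. This is a sketch under stated assumptions: the two series in each pair are generated independently of each other, so there is no causal link by construction, and the series length, the number of repetitions, and the function names are arbitrary choices made for illustration. The sketch compares the average R² obtained from pairs of independent random walks with that obtained from pairs of independent noise series.

```python
import numpy as np

def r_squared(y, x):
    """R-squared of a simple regression of y on x with an intercept."""
    r = np.corrcoef(x, y)[0, 1]
    return r * r

def mean_r_squared(random_walk, n_obs=200, n_reps=300, seed=3):
    """Average R-squared over many pairs of unrelated series."""
    rng = np.random.default_rng(seed)
    values = []
    for _ in range(n_reps):
        x = rng.normal(size=n_obs)
        y = rng.normal(size=n_obs)          # generated independently of x
        if random_walk:
            # Cumulative sums turn the noise into random walks.
            x, y = np.cumsum(x), np.cumsum(y)
        values.append(r_squared(y, x))
    return float(np.mean(values))

print("mean R-squared, independent noise:       ", round(mean_r_squared(False), 3))
print("mean R-squared, independent random walks:", round(mean_r_squared(True), 3))
```

Although neither series affects the other in either case, the random walk pairs tend to show a much larger R² than the noise pairs, so the size of R² alone says little about causation in data of this kind.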

Ceteris Paribus

Students of economics are familiar with the concept of ceteris paribus, which means other things being equal. Economists noticed long ago that numerous factors affect a particular phenomenon. For example, in demand theory, the relationship of interest is the link between price and quantity. In economics, the relationship is stated in terms of quantity being a function of price, which means a change in price would cause a change in quantity demanded. A change in quantity would also change price; however, the change in quantity is a supply issue, as stated earlier. This does not mean that nothing else affects the quantity that one demands at a given price. Changes in taste, lifestyle, needs, family status, etc. all affect quantity demanded even if the price does not change. Imagine what would happen to the quantity demanded of many goods for a student who graduates, finds a job, gets married, has children, etc. These are assumed to cause a shift in the demand schedule, while changes in price cause a change in quantity demanded. In order to create a demand theory, all of the other factors must be assumed to remain constant. In the natural sciences, the factors that are not of interest can be kept constant, while the same is not possible in the social sciences. It would be unreasonable to tell a control group not to have children or not to finish their education so you can study their demand function. In regression analysis, these factors are included in the model and are called control variables. Although these variables are not of interest, their impact on the dependent variable must nevertheless be accounted for and reported. Note that in multiple regression the contribution of each variable to the dependent variable is conditional on the presence of all the other variables in the model. In other words, each variable is treated as if it were the last variable to be added to the model. Therefore, βi is the impact of a one-unit change in Xi given that all the other variables are already in the model and their effect has been accounted for.
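The "last variable added" interpretation of βi can be checked numerically. The sketch below uses simulated data with assumed coefficients, and the names (partial_effect_demo, ols_fit) are hypothetical. It estimates the price coefficient in two ways: directly from the multiple regression of quantity on price and income, and in two steps, by first removing the part of both quantity and price that income explains and then regressing the leftover quantity on the leftover price. This two-step equivalence is known as the Frisch-Waugh-Lovell result.

```python
import numpy as np

def ols_fit(y, X):
    """Return the OLS coefficient vector for design matrix X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def partial_effect_demo(n=500, seed=4):
    """Compare the multiple-regression slope with the two-step slope."""
    rng = np.random.default_rng(seed)
    income = rng.normal(50.0, 10.0, n)
    price = 20.0 + 0.1 * income + rng.normal(0.0, 2.0, n)   # correlated with income
    quantity = 80.0 - 1.5 * price + 0.4 * income + rng.normal(0.0, 3.0, n)

    ones = np.ones(n)

    # Multiple regression: quantity on intercept, price, and income.
    full = ols_fit(quantity, np.column_stack([ones, price, income]))

    # Two-step version: strip out what the control (income) explains,
    # then regress what is left of quantity on what is left of price.
    controls = np.column_stack([ones, income])
    price_resid = price - controls @ ols_fit(price, controls)
    quantity_resid = quantity - controls @ ols_fit(quantity, controls)
    two_step = (price_resid @ quantity_resid) / (price_resid @ price_resid)

    return full[1], two_step

full_slope, two_step_slope = partial_effect_demo()
print("price slope from multiple regression:", round(full_slope, 6))
print("price slope from two-step procedure: ", round(two_step_slope, 6))
```

The two slopes agree up to rounding error, which is one way to see that the coefficient on a variable measures only the variation left after the control variables have been accounted for.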

The study of the causal relationship between two factors while keeping other things equal does not preclude the possibility of studying the causal effect of one of the control variables in its own right. For instance, in the study of demand, the causal relationship of interest is between price and quantity demanded. In the process, it is necessary to keep income constant. However, the causal relationship between income and quantity demanded has been pursued in the literature with as much interest as the price–quantity relationship and is still covered in all microeconomics books. The income–consumption curve is also known as the Engel curve. The slope of the Engel curve is used to group goods as inferior, normal, necessity, and luxury goods. In a study of an Engel curve, the price of the good, among other things, is assumed to remain constant.
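The grouping of goods can be sketched with income elasticity, the unit-free counterpart of the slope of the Engel curve. The sketch below is illustrative only: the function names are hypothetical, the elasticity is computed with the midpoint (arc) formula between two points on an Engel curve, and the treatment of the boundary values 0 and 1 is an assumption. It follows the usual convention that a negative income elasticity marks an inferior good, a value between 0 and 1 a necessity, and a value above 1 a luxury, with necessities and luxuries both being normal goods.

```python
def arc_income_elasticity(income_1, quantity_1, income_2, quantity_2):
    """Midpoint (arc) income elasticity between two points on an Engel curve.

    Prices and other factors are assumed to be the same at both points.
    """
    pct_quantity = (quantity_2 - quantity_1) / ((quantity_1 + quantity_2) / 2.0)
    pct_income = (income_2 - income_1) / ((income_1 + income_2) / 2.0)
    return pct_quantity / pct_income

def classify_good(income_elasticity):
    """Group a good by its income elasticity (boundary handling is assumed)."""
    if income_elasticity < 0:
        return "inferior"
    if income_elasticity == 0:
        return "income-neutral"
    if income_elasticity <= 1:
        return "normal (necessity)"
    return "normal (luxury)"

# Hypothetical points: (income, quantity) before and after an income increase.
examples = {
    "good A": (100, 10, 200, 8),
    "good B": (100, 10, 200, 12),
    "good C": (100, 10, 200, 30),
}
for name, points in examples.items():
    elasticity = arc_income_elasticity(*points)
    print(name, round(elasticity, 3), classify_good(elasticity))
```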
