There are four questions to consider before starting a research agenda.
- What is the causal relationship of interest?
  - The most interesting research in social science is about cause and effect.
  - A causal relationship is useful for making predictions about the consequences of changing circumstances or policies; it tells us what would happen in alternative or counterfactual worlds.
- What experiment could ideally be used to capture the causal effect of interest?
  - Ideal experiments are most often hypothetical.
- What is your identification strategy?
  - Angrist and Krueger (1999) used the term identification strategy to describe the manner in which a researcher uses observational data (i.e., data not generated by a randomized trial) to approximate a real experiment.
- What is your mode of statistical inference?
  - The answer to this question describes the population to be studied, the sample to be used, and the assumptions made when constructing standard errors.
The most credible and influential research designs use random assignment. There are three problems we want to emphasize about using experiments to uncover causal effects.
The Selection Problem
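First, we define a binary treatment variable D(i) that equals 1 if individual i goes to the hospital and 0 otherwise. The observed outcome, health status, is denoted by Y(i). Each individual has two potential outcomes, Y(0i) and Y(1i).
In other words, Y(0i) is the health status of individual i had he not gone to the hospital, and Y(1i) is his health status if he goes.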
We can write the observed outcome as Y(i) = Y(0i) + (Y(1i) - Y(0i)) * D(i), and we call Y(1i) - Y(0i) the causal effect of hospitalization for individual i.
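The comparison of average health conditional on hospitalization status is formally linked to the average causal effect by the following decomposition:
E[Y(i) | D(i) = 1] - E[Y(i) | D(i) = 0] = (E[Y(1i) | D(i) = 1] - E[Y(0i) | D(i) = 1]) + (E[Y(0i) | D(i) = 1] - E[Y(0i) | D(i) = 0])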
The first term (the average treatment effect on the treated) is the average causal effect of hospitalization on those who were hospitalized, while the second term is the selection bias: the difference in average Y(0i), the health status without hospitalization, between those who were and were not hospitalized.
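Random assignment of D(i) solves the selection problem because it makes D(i) independent of potential outcomes. In particular, E[Y(0i) | D(i) = 1] = E[Y(0i) | D(i) = 0], so the selection bias term is zero and the simple comparison of means recovers the average causal effect.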
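The selection problem can be illustrated with a small simulation. The sketch below is not from the source. The numbers are made up for illustration: the health scale, a constant effect of 0.5, and the rule that people in worse health are more likely to go to the hospital are all assumptions. It compares the naive difference in means with the average effect on the treated and the selection bias, first under self-selection and then under random assignment.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical potential outcomes: health without (y0) and with (y1) hospitalization.
y0 = rng.normal(loc=3.0, scale=1.0, size=n)
effect = 0.5  # assumed constant causal effect, chosen only for illustration
y1 = y0 + effect


def naive_difference(d):
    """Difference in mean observed outcomes between D = 1 and D = 0."""
    y = y0 + (y1 - y0) * d  # observed outcome
    return y[d == 1].mean() - y[d == 0].mean()


def decomposition(d):
    """Return (effect on the treated, selection bias) for assignment vector d."""
    att = (y1[d == 1] - y0[d == 1]).mean()
    bias = y0[d == 1].mean() - y0[d == 0].mean()
    return att, bias


# Self-selection (assumed rule): people with low y0 are more likely to go.
d_selected = (y0 + rng.normal(size=n) < 3.0).astype(int)

# Random assignment: a coin flip, independent of the potential outcomes.
d_random = rng.integers(0, 2, size=n)

for label, d in [("self-selected", d_selected), ("randomized", d_random)]:
    att, bias = decomposition(d)
    print(f"{label}: naive={naive_difference(d):.3f} ATT={att:.3f} bias={bias:.3f}")
```

Under the self-selection rule the bias term is negative and large enough to flip the sign of the naive comparison, even though hospitalization helps everyone in this simulated world. Under the coin flip the bias term is close to zero.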
The quasi-experimental study of class size by Angrist and Lavy (1999) illustrates the manner in which non-experimental data can be analyzed in an experimental spirit.
Regression Analysis of Experiments
Regression is a useful tool for the study of causal questions, including the analysis of data from experiments. Suppose (for now) that the treatment effect is the same for everyone, say Y(1i) - Y(0i) = ρ, a constant. Then the observed outcome can be written as Y(i) = α + ρ * D(i) + η(i), where α = E[Y(0i)] and η(i) = Y(0i) - E[Y(0i)]. Selection bias amounts to correlation between the regression error term, η(i), and the regressor, D(i).
Regression plays an exceptionally important role in empirical economic research. Some regressions are simply descriptive tools, as in much of the research on earnings inequality.
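As a rough check on the regression view of an experiment, the sketch below simulates a randomized treatment with a constant effect and regresses the outcome on a constant and the treatment dummy. It is not from the source. The intercept of 3.0, the effect of 0.5, and the normal error are illustrative assumptions. With a single binary regressor, the OLS slope coincides with the difference in means between treated and untreated units.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
alpha, rho = 3.0, 0.5  # hypothetical E[Y(0i)] and constant treatment effect

eta = rng.normal(size=n)        # error term: Y(0i) minus its mean
d = rng.integers(0, 2, size=n)  # randomly assigned, so independent of eta
y = alpha + rho * d + eta       # observed outcome

# OLS of y on a constant and the treatment dummy.
X = np.column_stack([np.ones(n), d])
(alpha_hat, rho_hat), *_ = np.linalg.lstsq(X, y, rcond=None)

diff_in_means = y[d == 1].mean() - y[d == 0].mean()
print(f"rho_hat={rho_hat:.3f}  difference in means={diff_in_means:.3f}")
```

If the treatment were instead correlated with the error term, the same regression would pick up the selection bias along with the treatment effect.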