A Reader’s Diary: Mastering ’Metrics


Joshua D. Angrist, Jörn-Steffen Pischke

This book provides a great informal and practical overview of the strategies used to infer causality in modern economics research. It explains these strategies by breaking down landmark studies that use them. It also presents a clear picture of the statistical methods without delving deep into the maths. Overall, I found it a great complement to a more formal econometrics class. It is certainly not technical enough to teach you how to use the methods, but it is still a good and accessible starting point.

The book goes over:

Randomised controlled trials: e.g. medical trials – the researcher randomly assigns participants to two groups, so that the groups are similar on average, and gives a treatment to only one of them to assess its effect. Unfortunately, such trials are almost impossible to conduct in most economic contexts.
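
To make the logic concrete, here is a minimal simulated sketch in Python with NumPy. Everything in it is an assumption made for illustration and not taken from the book: the sample size, the outcome scale and the treatment effect of 2.0. It only shows why a plain difference in means works when assignment is random.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
true_effect = 2.0  # assumed treatment effect, chosen for illustration

# Random assignment: a coin flip decides who gets the treatment.
treated = rng.random(n) < 0.5

# Each person has their own baseline; the treated get the effect on top.
baseline = rng.normal(50.0, 10.0, n)
outcome = baseline + true_effect * treated

# Because assignment is random, the two groups have similar baselines on
# average, so the difference in mean outcomes estimates the effect.
rct_estimate = outcome[treated].mean() - outcome[~treated].mean()
print(f"estimated effect: {rct_estimate:.2f} (true effect: {true_effect})")
```

The estimate will not equal 2.0 exactly, since the groups differ by chance, but it should land close to it at this sample size.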

Regression: This technique controls for possible confounders – e.g. more educated people earn more, but children of richer parents are also more educated and earn more. To separate the effect of education on wages from the effect of parental income, we use regression to compare people with the same parental income but different levels of education. This is a basic technique that is used in almost every empirical research paper.
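
A small simulation can show what "controlling for" parental income does. The sketch below uses NumPy's least-squares solver; the variable names and all coefficients (a true return to education of 1.0, a parental income effect of 3.0) are hypothetical numbers picked for the example, not estimates from any study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Assumed data-generating process: parental income raises both
# education and wages, so it confounds the education-wage relationship.
parental_income = rng.normal(0.0, 1.0, n)
education = 12 + 2.0 * parental_income + rng.normal(0.0, 1.0, n)
wage = 1.0 * education + 3.0 * parental_income + rng.normal(0.0, 1.0, n)

ones = np.ones(n)

# Regression of wage on education only: picks up the parental income effect too.
X_naive = np.column_stack([ones, education])
naive_coef = np.linalg.lstsq(X_naive, wage, rcond=None)[0]

# Regression that also includes parental income as a control.
X_ctrl = np.column_stack([ones, education, parental_income])
ctrl_coef = np.linalg.lstsq(X_ctrl, wage, rcond=None)[0]

print(f"education coefficient without control: {naive_coef[1]:.2f}")
print(f"education coefficient with control:    {ctrl_coef[1]:.2f}")
```

Without the control, the education coefficient is well above the assumed 1.0; with it, the coefficient should come out close to 1.0. This only works here because the confounder is observed and included, which is the main limitation of the approach in practice.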

Instrumental variables: Imitating a randomised trial by finding a naturally occurring process that is similar to random assignment – e.g. charter school attendees earn more than public school attendees. But people who apply to charter schools are more motivated and might have been successful anyway. In some US states, oversubscribed schools choose applicants via lottery. This means that the only systematic difference between successful and unsuccessful applicants is their luck in the draw, not their motivation. Therefore, the differences in earnings between the successful and unsuccessful applicants can be attributed to their schooling.
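
The lottery logic can be sketched with simulated data. One detail the summary above skips is that not every winner attends and some losers get in another way, so a common form of the estimate (often called the Wald estimator) divides the winner-loser gap in earnings by the winner-loser gap in attendance. All numbers below, including the effect of 5.0 and the attendance rules, are assumptions invented for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
true_effect = 5.0  # assumed effect of attending, chosen for illustration

motivation = rng.normal(0.0, 1.0, n)  # not observed by the researcher
won = rng.random(n) < 0.5             # lottery offer, independent of motivation

# Assumed attendance rule: most winners attend, and a few highly
# motivated losers find another way in.
attends = np.where(won, motivation > -0.8, motivation > 1.3)

earnings = 30.0 + true_effect * attends + 4.0 * motivation + rng.normal(0.0, 1.0, n)

# Naive comparison of attendees with non-attendees mixes in motivation.
naive = earnings[attends].mean() - earnings[~attends].mean()

# Wald estimator: effect of winning on earnings, scaled by the
# effect of winning on attendance.
reduced_form = earnings[won].mean() - earnings[~won].mean()
first_stage = attends[won].mean() - attends[~won].mean()
iv_estimate = reduced_form / first_stage

print(f"naive: {naive:.2f}, IV: {iv_estimate:.2f}, true: {true_effect}")
```

In this setup the naive comparison overstates the effect because attendees are more motivated, while the lottery-based estimate should sit close to the assumed 5.0.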

Regression discontinuity: Using arbitrary cutoffs to study causal effects – e.g. the only systematic difference between people who are 17.95 years old and those who are 18.05 years old is that the latter can legally drink. This means that a jump in outcomes between the two age groups should reflect only the effect of legal access to alcohol.
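
As a rough sketch of how the jump at the cutoff might be estimated, the code below simulates an outcome that rises smoothly with age and jumps by 3.0 at age 18, then fits a separate straight line on each side of the cutoff within a narrow window. The outcome, the size of the jump and the bandwidth of half a year are all hypothetical choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
true_jump = 3.0  # assumed jump at the cutoff, chosen for illustration

age = rng.uniform(17.0, 19.0, n)
x = age - 18.0                      # distance from the cutoff
legal = (x >= 0).astype(float)      # 1 if old enough to drink legally

# Outcome trends smoothly with age and jumps at the cutoff.
outcome = 10.0 + 2.0 * x + true_jump * legal + rng.normal(0.0, 1.0, n)

# Keep only observations close to the cutoff.
bandwidth = 0.5
near = np.abs(x) <= bandwidth

# Linear fit with its own slope on each side; the coefficient on
# `legal` is the estimated jump at the cutoff.
X = np.column_stack([
    np.ones(near.sum()),
    legal[near],
    x[near],
    x[near] * legal[near],
])
coef = np.linalg.lstsq(X, outcome[near], rcond=None)[0]
rd_estimate = coef[1]

print(f"estimated jump: {rd_estimate:.2f} (true jump: {true_jump})")
```

Allowing for the age trend matters: simply comparing averages on either side of the cutoff would fold part of the smooth trend into the estimated jump.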

Differences in differences: Comparing the change in the treated group with the general trend – e.g. Israel liberalised after a wave of Russian Jewish immigrants. But the whole region liberalised – was it the immigrants or the trend? We look at Israel’s relative standing in the region – perhaps it was the 10th most liberalised country before, and the 2nd most liberalised after. This change in relative standing should reflect just the event we’re studying.
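
The arithmetic behind the name is simple enough to write out. The sketch below uses made-up liberalisation index scores instead of the rankings in the example above; the function name and every number are hypothetical. It subtracts the change in the comparison group from the change in the treated group, which is only a fair estimate if the two would have followed the same trend without the event.

```python
def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group minus change in the comparison group."""
    treated_change = treated_after - treated_before
    control_change = control_after - control_before
    return treated_change - control_change


# Hypothetical index scores, invented purely for illustration.
did_estimate = diff_in_diff(
    treated_before=4.0, treated_after=7.0,   # the country that had the event
    control_before=5.0, control_after=6.0,   # the rest of the region
)
print(did_estimate)  # 2.0
```

Here the treated country improved by 3.0 points while the region improved by 1.0, so 2.0 points are attributed to the event and the rest to the general trend.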