Lecture 6 - Identification: IV


Topics in Econometrics - M2 ENS Lyon

Vincent Bagilet

2025-10-14

Introduction

Goal of the session

  • Strengthen intuition on Instrumental Variables (IV)
  • Review key assumptions
  • Focus on some specific points:
    • LATE
    • Weak instruments
  • Learning through applied papers
  • Implement some analyses yourself

An example

Air pollution and health



  • How would you measure health effects of air pollution?

  • Why not regress a health outcome (mortality or hospital admissions for instance) on air pollution?

    • OVB: air pollution is correlated with unobserved factors that also affect health (eg economic activity, income, healthcare, etc)
    • Measurement error (if correlated with treatment)
    • How to separate between pollutants?
  • How would you proceed?

    • Need to find a source of exogeneity
    • Something that affects air pollution levels but does not otherwise affect health

An instrument for air pollution

Why does that work?

Intuition

  • Wind $\Rightarrow$ “as-good-as-random” variation in exposure to pollutants
  • Mimics randomized experiment
  • Wind affects pollution levels: wind moves pollutants

    • ie relevance
  • Wind direction varies quasi-randomly over time

    • ie independence (or exogeneity)
  • Wind itself doesn’t affect health (after controlling for temperature and humidity)

    • ie exclusion restriction
  • Individuals downwind of a pollution source receive more pollution (no defiers)

    • ie monotonicity

Fundamentals

Other ideas of instruments?






  • Why are they each valid instruments?

Formal intuition

  • $y_i = \alpha + \beta x_i + u_i$ with $\text{Cov}(x, u) \neq 0$

  • Problem: $x_i$ is endogenous

  • IV solution:

    • Find a variable $z$ that shifts $x$ for reasons unrelated to unobservables in $y$
    • It isolates variation in $x$ that is unrelated to $u$ and recovers $\beta$
    • Focuses on the variation that is explained by the instrument

Potential outcomes

| Group | Treatment status | Effect of instrument |
|---|---|---|
| Compliers | $d_i(z_i = 1) = 1,\; d_i(z_i = 0) = 0$ | Instrument increases the probability of treatment |
| Defiers | $d_i(z_i = 1) = 0,\; d_i(z_i = 0) = 1$ | Instrument decreases the probability of treatment |
| Never-takers | $d_i(z_i = 1) = 0,\; d_i(z_i = 0) = 0$ | Instrument has no effect |
| Always-takers | $d_i(z_i = 1) = 1,\; d_i(z_i = 0) = 1$ | Instrument has no effect |
  • Effect of the instrument on the treatment varies across individuals
  • We only observe $z_i$ and $d_i$, not these groups

Warning

The IV estimator identifies the effect for compliers only (a Local Average Treatment Effect, LATE)

Identifying assumptions


| Assumption | Formal expression | Intuition |
|---|---|---|
| Independence, exogeneity | $\text{Cov}(z, v) = 0$ | No unobserved confounders affecting both $z$ and $y$ |
| Exclusion restriction | $\text{Cov}(z, u) = 0$ | $z$ affects $y$ only through $d$ |
| Relevance | $\text{Cov}(z, d) \neq 0$ | $z$ does affect $d$ |
| Monotonicity | $d_i(z_i = 1) \ge d_i(z_i = 0)$: $d$ weakly increases with $z$ (no defiers) | $z$ is an incentive, does not discourage treatment |

  • Exclusion restriction $\Rightarrow ITT_{\text{never-takers}} = ITT_{\text{always-takers}} = 0$
  • Monotonicity $\Rightarrow$ compliers are the only individuals whose treatment status could be altered by the instrument

Validity and exclusion restriction

  • Cannot show that the exclusion restriction holds
  • Need to convince the audience that it does with logical arguments
  • Need to find “clever” instruments
  • But some instruments may turn out to be not so great


Exclusion-restriction violations

  • Sometimes an instrument affects many (many) other variables $\Rightarrow$ it may affect the outcome through another path (Mellon 2024)

LATE

What does the IV estimate?



$$
\begin{aligned}
\beta_{IV} & := \dfrac{\text{Cov}(y_i, z_i)}{\text{Cov}(d_i, z_i)} = \dots \\
& = \dfrac{\mathbb{E}[y_i \mid z_i = 1] - \mathbb{E}[y_i \mid z_i = 0]}{\mathbb{E}[d_i \mid z_i = 1] - \mathbb{E}[d_i \mid z_i = 0]} = \dots \\
& = \underbrace{\mathbb{E}\left[y_i(1) - y_i(0) \;\middle|\; d_i(1) = 1,\; d_i(0) = 0 \right]}_{\text{LATE on the compliers}}
\end{aligned}
$$
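One possible way to connect these three expressions, sketched here for a binary instrument and a binary treatment under the four identifying assumptions (this is the usual textbook argument, not necessarily the derivation used in class). Here $y_i(1), y_i(0)$ are potential outcomes indexed by treatment and $d_i(1), d_i(0)$ are potential treatments indexed by the instrument.

```latex
\begin{aligned}
\text{Cov}(y_i, z_i)
  &= \text{Var}(z_i)\,\big(\mathbb{E}[y_i \mid z_i = 1] - \mathbb{E}[y_i \mid z_i = 0]\big)
  && \text{(binary } z_i\text{; same for } d_i\text{)} \\
y_i
  &= y_i(0) + \big(y_i(1) - y_i(0)\big)\, d_i
  && \text{(observed outcome)} \\
\mathbb{E}[y_i \mid z_i = 1] - \mathbb{E}[y_i \mid z_i = 0]
  &= \mathbb{E}\big[\big(y_i(1) - y_i(0)\big)\,\big(d_i(1) - d_i(0)\big)\big]
  && \text{(independence, exclusion)} \\
  &= \mathbb{E}\big[y_i(1) - y_i(0) \mid d_i(1) = 1,\, d_i(0) = 0\big]\,
     \Pr\big(d_i(1) = 1,\, d_i(0) = 0\big)
  && \text{(monotonicity)} \\
\mathbb{E}[d_i \mid z_i = 1] - \mathbb{E}[d_i \mid z_i = 0]
  &= \Pr\big(d_i(1) = 1,\, d_i(0) = 0\big)
  && \text{(independence, monotonicity)}
\end{aligned}
```

The first line turns the ratio of covariances into the ratio of differences in means, since $\text{Var}(z_i)$ cancels. Dividing the fourth line by the fifth, which relevance guarantees is non-zero, leaves the average effect on compliers.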

IV ≠ ATE

  • The IV estimator only captures the causal effect for those individuals whose treatment status changes because of the instrument (the compliers)
  • ie the ATE for compliers (a LATE)

LATE - Intuition


  • The instrument isolates the local variation in treatment driven by $z$
  • If treatment effects are heterogeneous, IV does not recover the overall average treatment effect (ATE)
  • Instead, it identifies the effect local to those influenced by the instrument
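A small simulation can make this concrete. It is a sketch under invented numbers: the shares of each compliance type (40% compliers, 30% always-takers, 30% never-takers), the effects in `EFFECTS` and the function `simulate_late` are all hypothetical, chosen so that the complier effect (1.0) differs from the ATE (1.3).

```python
import random

# Hypothetical treatment effects by compliance type (heterogeneous on purpose)
EFFECTS = {"complier": 1.0, "always-taker": 3.0, "never-taker": 0.0}

def simulate_late(n=100_000, seed=2):
    """Return the Wald estimate and the sample ATE under the assumed DGP."""
    rng = random.Random(seed)
    sum_y, sum_d, count = [0.0, 0.0], [0, 0], [0, 0]  # indexed by z
    total_effect = 0.0
    for _ in range(n):
        r = rng.random()
        kind = "complier" if r < 0.4 else "always-taker" if r < 0.7 else "never-taker"
        z = int(rng.random() < 0.5)  # instrument assigned at random
        # No defiers: compliers take treatment only when z = 1
        d = int(kind == "always-taker" or (kind == "complier" and z == 1))
        y = rng.gauss(0, 1) + EFFECTS[kind] * d
        sum_y[z] += y
        sum_d[z] += d
        count[z] += 1
        total_effect += EFFECTS[kind]
    itt_y = sum_y[1] / count[1] - sum_y[0] / count[0]  # effect of z on y
    itt_d = sum_d[1] / count[1] - sum_d[0] / count[0]  # effect of z on d
    return itt_y / itt_d, total_effect / n

wald, ate = simulate_late()
print(f"Wald (IV) estimate: {wald:.2f}  ATE: {ate:.2f}")
```

The Wald ratio lands near the complier effect rather than the ATE, because always-takers and never-takers contribute nothing to either difference in means.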


Take-away message

  • Always ask yourself: who are the compliers?
  • The LATE is causal, but is local: it depends on the instrument used
  • Using a different instrument identifies a different LATE

LATE - Example


  • Want to estimate the causal effect of education ($d$) on earnings ($y$)

  • Education not randomly assigned

  • Studied in Card (1993)

  • Solution:

    • Use an instrument $z$, such as proximity to a college
    • Is it a valid instrument? Yes, arguably. It affects the likelihood of getting more education but does not affect earnings directly
  • It estimates:

    • The ATE of education on earnings for those whose education decisions are changed by the instrument
    • ie students who attend college because they live nearby

The LATE differs from the ATE and the ATT




| Estimand | Definition | Estimates the effect for? | Identified by |
|---|---|---|---|
| ATE | $\mathbb{E}[y_i(1) - y_i(0)]$ | The entire population | Randomized experiment |
| ATT | $\mathbb{E}[y_i(1) - y_i(0) \mid d_i = 1]$ | Those who actually receive treatment | Selection on observables |
| LATE | $\mathbb{E}[y_i(1) - y_i(0) \mid d_i(1) = 1,\; d_i(0) = 0]$ | Those whose treatment status is affected by the instrument (the compliers) | Instrumental Variables (IV) |

Weak instruments

Variation used for identification

  • Variation used for identification may come from only a few observations: compliers

  • In IV, use the variation that is explained by the instrument

  • Throws out variation not explained by the instrument

  • Throwing out variation increases variance $\Rightarrow \sigma^2_{\beta_{IV}} \ge \sigma^2_{\beta_{OLS}}$

  • May end up with too little variation to estimate the effect of the treatment
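A standard way to quantify this loss, under assumptions not stated on the slide (one instrument, homoskedastic errors, large samples, and the same error variance $\sigma_u^2$ plugged into both formulas for comparison), where $\rho_{zd}$ denotes the correlation between $z$ and $d$:

```latex
\operatorname{Avar}\big(\widehat{\beta}_{OLS}\big) = \frac{\sigma_u^2}{n\,\sigma_d^2},
\qquad
\operatorname{Avar}\big(\widehat{\beta}_{IV}\big) = \frac{\sigma_u^2}{n\,\sigma_d^2\,\rho_{zd}^2}
\quad\Rightarrow\quad
\frac{\operatorname{Avar}\big(\widehat{\beta}_{IV}\big)}{\operatorname{Avar}\big(\widehat{\beta}_{OLS}\big)}
= \frac{1}{\rho_{zd}^2} \ge 1
```

For instance, with $\rho_{zd} = 0.1$ the IV variance is 100 times the OLS variance, ie standard errors are 10 times larger.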

Weak Instruments




  • What is a weak instrument?
    • An instrument that does not strongly predict treatment $d_i$
    • In the first stage ($d_i = \pi_0 + \pi_1 z_i + v_i$) a weak instrument has $\pi_1 \approx 0$
  • Why problematic?
    • IV uses the variation in $d$ explained by $z$
    • If $z$ barely changes $d$, the IV estimate is noisy and can be biased
    • Small changes in sample or error can lead to large changes in the estimate

Consequences


  1. Large variance: Confidence intervals can be very wide
  2. Finite-sample bias: IV can be biased towards OLS (especially in small samples)
  3. Distorted inference: Standard t-tests and F-tests become unreliable
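The first two consequences can be seen in a small Monte Carlo exercise. This is a sketch under an assumed data-generating process (the function names, sample size, number of replications and values of `pi1` are hypothetical): the same model is simulated with a strong and a weak first stage, and the IV estimates are summarized by their median and interquartile range.

```python
import random
import statistics

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

def iv_draws(pi1, n=100, reps=500, beta=1.0, seed=3):
    """IV estimates over repeated samples; pi1 is the first-stage coefficient."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        z, d, y = [], [], []
        for _ in range(n):
            u = rng.gauss(0, 1)                  # confounder
            zi = rng.gauss(0, 1)                 # valid instrument
            di = pi1 * zi + u + rng.gauss(0, 1)  # first stage
            z.append(zi)
            d.append(di)
            y.append(beta * di + u)              # true effect: beta = 1
        draws.append(cov(z, y) / cov(z, d))
    return draws

def summarize(draws):
    """Median and interquartile range (robust to the heavy tails of weak IV)."""
    q1, med, q3 = statistics.quantiles(draws, n=4)
    return med, q3 - q1

med_strong, iqr_strong = summarize(iv_draws(pi1=1.0))
med_weak, iqr_weak = summarize(iv_draws(pi1=0.05))
print(f"strong: median={med_strong:.2f}, IQR={iqr_strong:.2f}")
print(f"weak:   median={med_weak:.2f}, IQR={iqr_weak:.2f}")
```

In this design OLS converges to $1 + 1/(\pi_1^2 + 2)$, so roughly 1.5 when $\pi_1 = 0.05$. With the weak first stage the IV estimates are far more dispersed and their median drifts away from the true value of 1 towards that OLS limit, even though the instrument is valid by construction.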


Take-away message

Even if the instrument is valid, if it is weak, IV estimates are not trustworthy


  • Testing for relevance: compute the first-stage F-stat (rule of thumb: ok if F-stat > 10)
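With a single instrument, the first-stage F-statistic is the squared t-statistic on $\pi_1$. A minimal sketch, assuming homoskedastic errors and no controls (the helper `first_stage_f` and the toy data are hypothetical; in applied work a robust version reported by an IV package would usually be preferred):

```python
def first_stage_f(z, d):
    """F-statistic for H0: pi_1 = 0 in d = pi_0 + pi_1 * z + v (one instrument)."""
    n = len(z)
    zbar, dbar = sum(z) / n, sum(d) / n
    szz = sum((zi - zbar) ** 2 for zi in z)
    szd = sum((zi - zbar) * (di - dbar) for zi, di in zip(z, d))
    pi1 = szd / szz                   # first-stage slope
    pi0 = dbar - pi1 * zbar           # first-stage intercept
    ssr = sum((di - pi0 - pi1 * zi) ** 2 for zi, di in zip(z, d))
    var_pi1 = (ssr / (n - 2)) / szz   # homoskedastic variance of the slope
    return pi1 ** 2 / var_pi1         # with one instrument, F = t^2

# Toy data, for illustration only
z = [0, 1, 2, 3]
d = [1, 3, 2, 5]
print(round(first_stage_f(z, d), 2))  # about 4.48, below the rule of thumb of 10
```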

Estimation

IV Estimator and 2SLS

  • Wald estimator: sample analog of the estimand

$$\widehat{\beta}_W = \dfrac{\widehat{\text{Cov}}(y, z)}{\widehat{\text{Cov}}(d, z)}$$

  • Numerically equivalent (in the simple case of one instrument and no controls) to the Two-Stage Least Squares (2SLS) estimator $\widehat{\beta}_{2SLS}$ obtained through:

    1. Regress $d$ on instrument(s) $z$ and controls and retrieve predicted values

    $$d_i = \pi_0 + \pi_1 z_i + v_i \quad \Rightarrow \quad \widehat{d}_i = \widehat{\mathbb{E}}[d_i \mid z_i]$$

    2. Regress $y$ on predicted treatment $\widehat{d}$ (and the same controls)

    $$y_i = \alpha + \beta \widehat{d}_i + e_i$$
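The equivalence between the two estimators can be verified by hand. The sketch below uses simulated data from an assumed model (one instrument, no controls; the helper `ols` and all coefficients are hypothetical) and runs the two stages explicitly.

```python
import random

def ols(x, y):
    """Intercept and slope of a simple regression of y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Assumed DGP: u makes d endogenous, z is a valid instrument, true beta = 2
rng = random.Random(4)
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
d = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [1.0 + 2.0 * di + ui for di, ui in zip(d, u)]

# Stage 1: regress d on z and keep the predicted values
pi0, pi1 = ols(z, d)
d_hat = [pi0 + pi1 * zi for zi in z]

# Stage 2: regress y on the predicted treatment
_, beta_2sls = ols(d_hat, y)

# Wald estimator: reduced-form slope over first-stage slope = Cov(y, z) / Cov(d, z)
_, reduced_form = ols(z, y)
beta_wald = reduced_form / pi1

print(f"2SLS: {beta_2sls:.4f}  Wald: {beta_wald:.4f}")
```

The two numbers agree up to floating-point error, which is what "numerically equivalent" means here.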





Warning

  • SEs from a manually run 2nd stage are incorrect (they do not account for the first-stage estimation)
  • Need to adjust for the two stages
  • 2SLS packages (eg ivreg, fixest) automatically adjust SEs
  • Alternatively, use bootstrapping to get correct standard errors
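To see where the naive standard error goes wrong, the sketch below computes both versions by hand under homoskedasticity, for one instrument and no controls (the function `two_sls_with_se` and the simulated model are hypothetical). The only difference is which residuals enter the error variance: the naive version uses $\widehat{d}$, the corrected one uses the observed $d$ with the 2SLS coefficients.

```python
import math
import random

def ols(x, y):
    """Intercept and slope of a simple regression of y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def two_sls_with_se(z, d, y):
    """2SLS slope with the naive and the corrected homoskedastic SE."""
    n = len(z)
    pi0, pi1 = ols(z, d)
    d_hat = [pi0 + pi1 * zi for zi in z]
    alpha, beta = ols(d_hat, y)
    mean_hat = sum(d_hat) / n
    s_hat = sum((dh - mean_hat) ** 2 for dh in d_hat)
    # Naive: residuals of the second-stage regression, built from d_hat
    ssr_naive = sum((yi - alpha - beta * dh) ** 2 for yi, dh in zip(y, d_hat))
    # Corrected: same coefficients, but residuals built from the observed d
    ssr_correct = sum((yi - alpha - beta * di) ** 2 for yi, di in zip(y, d))
    se_naive = math.sqrt(ssr_naive / (n - 2) / s_hat)
    se_correct = math.sqrt(ssr_correct / (n - 2) / s_hat)
    return beta, se_naive, se_correct

# Assumed DGP: true beta = 2, u makes d endogenous
rng = random.Random(5)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
d = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [1.0 + 2.0 * di + ui for di, ui in zip(d, u)]

beta, se_naive, se_correct = two_sls_with_se(z, d, y)
print(f"beta={beta:.3f}  naive SE={se_naive:.4f}  corrected SE={se_correct:.4f}")
```

In this particular design the naive SE is several times too large. The direction and size of the gap depend on the data-generating process, so the naive SE can also be too small.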

Control function approach

  • Alternative approach for the IV
  • Steps:
    1. Regress $d$ on $z$ and controls
    2. Retrieve the residuals (the part of $d$ unexplained by $z$)
    3. Regress $y$ on $d$ and these residuals
  • Adjusting for these residuals partials this unexplained variation out
  • Leaves only the variation explained by the instrument
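The three steps can be sketched by hand on simulated data. This assumes a linear model with one instrument and no controls; the helpers `ols` and `ols2` and the data-generating process are hypothetical. In this linear case the coefficient on $d$ coincides with the ratio-of-covariances estimate.

```python
import random

def ols(x, y):
    """Intercept and slope of a simple regression of y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def ols2(x1, x2, y):
    """Slopes of y on x1 and x2 (with intercept), from the 2x2 normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Assumed DGP: true beta = 2, u makes d endogenous
rng = random.Random(6)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
d = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [1.0 + 2.0 * di + ui for di, ui in zip(d, u)]

# Steps 1 and 2: first stage, then keep the residuals
pi0, pi1 = ols(z, d)
v_hat = [di - pi0 - pi1 * zi for zi, di in zip(z, d)]

# Step 3: regress y on d and the first-stage residuals
beta_cf, rho = ols2(d, v_hat, y)

# Comparison: ratio of covariances (reduced form over first stage)
beta_wald = ols(z, y)[1] / pi1
print(f"control function: {beta_cf:.4f}  Wald: {beta_wald:.4f}  rho: {rho:.3f}")
```

The coefficient `rho` on the residuals picks up the part of $d$ that is correlated with the error. Under this DGP it is clearly different from zero, which reflects the endogeneity built into the simulation.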

Practical advice

  • Support relevance by showing a large F-statistic for the 1st stage
    • F > 10 (but the larger the better)
  • Think about who your compliers are
    • Different instruments $\Rightarrow$ different estimands
  • Make sure instrument is relevant w.r.t. the policy of interest
  • If weak IV (and you don’t want to give up)
  • Adjust for all relevant pre-treatment variables (predictors of $y$ not affected by $d$)
  • For models non-linear in $d$, properties of 2SLS do not necessarily hold
    • Consider alternative estimation strategies (e.g., control function method)

Summaries

Strengths and weaknesses



  • + Compelling identification strategy
  • + Can use IVs to address attenuation bias that may result from measurement error in $d$


  • - $\beta_{IV}$ has finite sample bias. Stems from randomness in estimates of $d$
  • - $\beta_{IV}$ less efficient than OLS, precision decreases further with weak instruments
  • - Weak instruments can make $\beta_{IV}$ way less efficient and even more biased than $\beta_{OLS}$
  • - In many settings (e.g., models non-linear in $d$), 2SLS can be very biased

Exercises

  1. Run two regressions on the wooldridge::card data using fixest::feols to compare the 2SLS and the control function approaches
    • Key variables:
      • Outcome: lwage (log wage)
      • Treatment: educ (years of schooling)
      • Instrument: nearc4 (near 4-year college)
      • Controls: exper, black, south, smsa, fatheduc, motheduc
  2. Build a simple and naive simulation to explore how using an instrument affects the bias due to measurement error (depending on the correlation between measurement error and the treatment)

Summary




Take away messages

  • IV can be a particularly helpful identification strategy

  • An IV estimates a LATE on compliers and not the ATE

  • Weak instruments should be avoided

  • There are 3 ways of estimating an IV (ratio of covariances, 2SLS, control function)

References

Andrews, Isaiah, James H. Stock, and Liyang Sun. 2019. “Weak Instruments in Instrumental Variables Regression: Theory and Practice.” Annual Review of Economics 11 (1): 727–53. https://doi.org/10.1146/annurev-economics-080218-025643.
Arceo, Eva, Rema Hanna, and Paulina Oliva. 2016. “Does the Effect of Pollution on Infant Mortality Differ Between Developing and Developed Countries? Evidence from Mexico City.” The Economic Journal 126 (591): 257–80. https://doi.org/10.1111/ecoj.12273.
Card, David. 1993. “Using Geographic Variation in College Proximity to Estimate the Return to Schooling.” Working Paper 4483. Working Paper Series. National Bureau of Economic Research. https://doi.org/10.3386/w4483.
Deryugina, Tatyana, Garth Heutel, Nolan H. Miller, David Molitor, and Julian Reif. 2019. “The Mortality and Medical Costs of Air Pollution: Evidence from Changes in Wind Direction.” American Economic Review 109 (12): 4178–4219. https://doi.org/10.1257/aer.20180279.
Mellon, Jonathan. 2024. “Rain, Rain, Go Away: 194 Potential Exclusion-Restriction Violations for Studies Using Weather as an Instrumental Variable.” American Journal of Political Science. https://doi.org/10.1111/ajps.12894.
Moretti, Enrico, and Matthew Neidell. 2011. “Pollution, Health, and Avoidance Behavior: Evidence from the Ports of Los Angeles.” Journal of Human Resources 46 (1): 154–75. https://doi.org/10.1353/jhr.2011.0012.
Schlenker, Wolfram, and W. Reed Walker. 2016. “Airports, Air Pollution, and Contemporaneous Health.” The Review of Economic Studies 83 (2): 768–809. https://doi.org/10.1093/restud/rdv043.