Simulations for regression analysis

Why are simulations useful in regression analysis, and how do we implement them?

Date

September 16, 2025

Objective

After this session, you should be able to implement a basic simulation for regression analysis in R and use it to test some of the assumptions you made in your own analysis.

Summary

In this session, we first discuss why simulations are worth implementing. The basic idea is that, in a simulation, the data generating process (DGP) is known. This lets us evaluate how well our analysis performs and study what happens when one of its assumptions does not hold. To build a simulation, start with a simple DGP and check whether the analysis performs well in this rather “pristine” setting. Then make the DGP progressively more complex. In the second part of the lecture, we implement a simulation in R.
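As a rough illustration of this workflow, the sketch below simulates a linear DGP whose true slope is known, estimates it many times with `lm()`, and then makes the DGP more complex by letting an omitted variable affect the outcome. The function name `simulate_estimate`, the parameter values, and the choice of an omitted confounder as the violated assumption are all illustrative assumptions, not the code used in the lecture.

```r
set.seed(1)

# Hypothetical DGP: y = 1 + beta * x + gamma * w + u, where w is correlated with x.
# All parameter values are illustrative assumptions.
simulate_estimate <- function(n = 200, beta = 2, gamma = 0) {
  w <- rnorm(n)
  x <- 0.5 * w + rnorm(n)             # x depends partly on w
  u <- rnorm(n)                       # idiosyncratic noise
  y <- 1 + beta * x + gamma * w + u
  fit <- lm(y ~ x)                    # w is deliberately left out of the model
  unname(coef(fit)["x"])              # return the estimated slope on x
}

n_iter <- 500

# "Pristine" setting: w has no effect on y, so omitting it is harmless.
est_pristine <- replicate(n_iter, simulate_estimate(gamma = 0))

# More complex setting: w now affects y, so omitting it biases the slope.
est_confounded <- replicate(n_iter, simulate_estimate(gamma = 1))

mean(est_pristine)     # should be close to the true slope of 2
mean(est_confounded)   # should sit visibly above 2 (omitted variable bias)
```

Because the true slope is set by hand, any gap between the average estimate and 2 can be attributed to the analysis rather than to unknown features of the data.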

Session Outline

  1. What are simulations?
  2. Why do simulations?
    • To understand econometric concepts
    • To design a study
    • For tests and checks
    • As a rhetorical tool
  3. How to implement a simulation?
  4. Coding a simulation in R together

Materials

Open slides in html

Open slides in pdf

Exercise

During the lecture, we emulate an RCT studying the impact of additional courses on students’ grades, in this document.

In addition, you have a graded assignment for next week.
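An RCT like the one in this exercise could be emulated along the following lines. This is only a minimal sketch under assumed values (grades out of 20, a true effect of 1 point, 400 students, equal assignment probability); the function name `simulate_rct` is hypothetical and the lecture document may set things up differently.

```r
set.seed(42)

# One simulated trial. All numbers are illustrative assumptions.
simulate_rct <- function(n = 400, true_effect = 1) {
  treated <- rbinom(n, size = 1, prob = 0.5)        # random assignment to extra courses
  grade <- 10 + true_effect * treated + rnorm(n, sd = 3)
  fit <- lm(grade ~ treated)
  est <- summary(fit)$coefficients["treated", ]     # row for the treatment coefficient
  c(estimate = unname(est["Estimate"]),
    p_value = unname(est["Pr(>|t|)"]))
}

n_iter <- 1000

# replicate() returns a 2 x n_iter matrix; transpose it to one row per trial.
results <- as.data.frame(t(replicate(n_iter, simulate_rct())))

mean(results$estimate)          # average estimated effect across trials
mean(results$p_value < 0.05)    # share of trials that detect the effect (power)
```

Changing `n` or `true_effect` and rerunning shows how sample size and effect size drive the share of significant results, which is one way simulations can help design a study.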

Specific resources for this lecture

References

Gelman, Andrew, Jennifer Hill, and Aki Vehtari. 2021. Regression and Other Stories.
Huntington-Klein, Nick. 2022. The Effect: An Introduction to Research Design and Causality. 1st edition. Chapman and Hall/CRC.