Syllabus - Topics in Econometrics (ECO5106)
MSc Advanced Economics - ENS Lyon
Course website
All the information related to this class can be found on the course website.
Instructor
Course Objectives and Overview
This course explores some of the practical issues and challenges routinely faced when implementing an applied econometric analysis. It aims to make students aware of these challenges and to give them tools to spot others by themselves.
This course aims to give students a deeper understanding of:
- How regression works “under the hood”
- Common pitfalls and challenges in empirical work
- Advanced causal identification strategies and their assumptions
- How design, modeling, and analysis choices shape empirical results
- Using simulations to explore estimator behavior and diagnose potential problems.
Outline
More specifically, the course content will be divided into eight three-hour lectures:
- Overview and fundamental hurdles
- Simulations
- Design beyond identification
- Design: Identification and Fixed Effects
- Data visualization
- Design: IV and RDD
- Modelling
- Analysis
Prerequisites
Prerequisites for this class include foundational knowledge in econometrics, statistical theory, causal inference, mathematics for economists, and familiarity with statistical software, in particular R and the Tidyverse.
Grading and assignments
Your grade for this class will be divided between several kinds of assignments and grading mechanisms:
Assignment | Percentage of final grade | Due date |
---|---|---|
Final report | 30 % | November 7, 8pm |
Final presentation | 20 % | November 4, 8:30am |
Participation | 10 % | - |
Replication | 20 % | October 14, 8:30am |
Homework | 20 % | See below |
Homework
Your homework will consist of graded assignments and/or readings. Readings are mandatory: we will discuss the papers together in class, and everyone will be invited to participate in the discussion. Homework is due before the beginning of each lecture, according to the following schedule:
Due before lecture 1 (Hurdles - 11/09): -
Due before lecture 2 (Simulations - 16/09):
- Non-graded assignment: Simple Simulation and Leverage
Due before lecture 3 (Design - 23/09): -
Due before lecture 4 (Repeated observations - 24/09):
- Assignment: Simulation, school grades and statistical power
- Reading: De Chaisemartin and d’Haultfœuille (2023)
Due before lecture 5 (Data viz - 30/09):
Due before lecture 6 (IV - 08/10):
- Assignment: Graded project proposal
- Reading: Paper you got assigned for the Replication Games
Due before lecture 7 (Modelling - 14/10):
Due before lecture 8 (Analysis - 21/10):
- Graded assignment
- Reading
Final Project
Overview
This project can be completed in teams of two, or alone if you prefer. Your team will design and implement a simulation that replicates an analysis one of you might pursue in your master’s thesis. Think of this as an opportunity to get an early start on a potential research idea, though it does not need to be something you will actually work on in your thesis.
You will generate synthetic data, specify a data-generating process (DGP), define an identification strategy, and estimate a regression model to recover a causal effect of interest. Because you generate the data, you will know the “true” effect, which allows you to examine how well your empirical strategy recovers it.
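As a rough illustration of this workflow, here is a minimal sketch in Python using only the standard library (the course itself uses R and the Tidyverse, so treat this as a language-neutral outline). The DGP, the function names `simulate_data` and `ols_slope`, and all parameter values are hypothetical choices made for the example, not a required design.

```python
import random

def simulate_data(n, true_effect, seed):
    """Draw n observations from a hypothetical DGP: y = 1 + true_effect * d + noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        d = 1 if rng.random() < 0.5 else 0           # randomly assigned binary treatment
        y = 1.0 + true_effect * d + rng.gauss(0, 1)  # outcome with standard normal noise
        data.append((d, y))
    return data

def ols_slope(data):
    """OLS slope of y on d (with an intercept): cov(d, y) / var(d)."""
    n = len(data)
    mean_d = sum(d for d, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    cov = sum((d - mean_d) * (y - mean_y) for d, y in data)
    var = sum((d - mean_d) ** 2 for d, _ in data)
    return cov / var

TRUE_EFFECT = 2.0  # known because we wrote the DGP ourselves

# Repeat the simulation many times to see how the estimator behaves on average.
estimates = [ols_slope(simulate_data(500, TRUE_EFFECT, seed)) for seed in range(200)]
mean_estimate = sum(estimates) / len(estimates)
print(f"true effect: {TRUE_EFFECT}, mean estimate over 200 draws: {mean_estimate:.3f}")
```

Because treatment is randomly assigned in this sketch, the average estimate should sit close to the true effect. The more interesting cases arise once the DGP departs from this ideal setting.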
This project aims to make you think carefully about the type of data you need for your master’s thesis, its structure, the identification strategies you could use, the parameter you would actually be estimating, and the hurdles you might encounter in practice.
You should use this exercise to:
- Define a clear research question and identification strategy for your master’s thesis or research project
- Think about the design of your analysis: the structure of the actual data you may use in your thesis and which variables to include
- Consider potential threats to identification and to the estimation of the effects of interest
- Evaluate how some undesirable features might prevent you from retrieving the true effect of interest
- Reflect on what challenges might arise when applying the same design to real-world data.
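One threat of this kind can be sketched as follows, again in standard-library Python rather than the R used in class. The example assumes a hypothetical confounder `u` that drives both treatment take-up and the outcome. The helper names (`slope`, `residualize`) and all coefficients are illustrative assumptions. The adjusted estimate relies on the Frisch-Waugh-Lovell result: regressing the residualized outcome on the residualized treatment gives the same coefficient as a regression that controls for `u`.

```python
import random

def slope(x, y):
    """OLS slope of y on x, with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

def residualize(v, u):
    """Residuals from regressing v on u, with an intercept."""
    n = len(v)
    b = slope(u, v)
    mu, mv = sum(u) / n, sum(v) / n
    return [vi - mv - b * (ui - mu) for vi, ui in zip(v, u)]

rng = random.Random(1)
TRUE_EFFECT = 1.0
n = 20000

# Hypothetical DGP: u raises both the probability of treatment and the outcome.
u = [rng.gauss(0, 1) for _ in range(n)]
d = [1.0 if ui + rng.gauss(0, 1) > 0 else 0.0 for ui in u]
y = [TRUE_EFFECT * di + 1.5 * ui + rng.gauss(0, 1) for di, ui in zip(d, u)]

naive = slope(d, y)                                      # omits u
adjusted = slope(residualize(d, u), residualize(y, u))   # controls for u

print(f"true: {TRUE_EFFECT}, naive: {naive:.2f}, adjusted: {adjusted:.2f}")
```

In this setup the naive estimate overstates the effect, while the adjusted one should land near the true value. With real data the confounder may be unobserved, which is exactly the situation a simulation lets you study.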
The final product should resemble a short research paper, with the crucial difference that your data are simulated. Your report should also highlight potential pitfalls that you might face when implementing the same analysis on actual data.
When generating your data, start extremely simple. You will add complexity and make your analysis more realistic later. And start working on this assignment early!
Project Proposal
Write a short document (2 pages max) that briefly presents:
- The context and motivation
- Your research question
- Your main econometric specification
- What you intend to test/explore in your analysis (for instance the sample size needed, the impact of choices regarding the level of fixed effects chosen, etc)
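To make the sample-size question concrete, here is a minimal Monte Carlo sketch of statistical power, written in standard-library Python (the course uses R). It assumes a hypothetical two-arm experiment with unit-variance normal outcomes and a normal-approximation test at the 5% level. The function names `rejects_null` and `power`, the effect size, and the sample sizes are illustrative assumptions.

```python
import random
from statistics import mean, variance

def rejects_null(n, effect, rng):
    """Simulate one experiment with n units per arm and test H0: no effect."""
    control = [rng.gauss(0, 1) for _ in range(n)]
    treated = [rng.gauss(effect, 1) for _ in range(n)]
    diff = mean(treated) - mean(control)
    se = (variance(treated) / n + variance(control) / n) ** 0.5
    return abs(diff / se) > 1.96  # normal approximation to the 5% critical value

def power(n, effect, n_sims=500, seed=0):
    """Share of simulated experiments in which the null is rejected."""
    rng = random.Random(seed)
    return sum(rejects_null(n, effect, rng) for _ in range(n_sims)) / n_sims

for n in (25, 50, 100, 200):
    print(f"n per arm = {n:>3}: estimated power = {power(n, effect=0.3):.2f}")
```

Estimated power should rise with the sample size. Setting `effect=0.0` gives a quick check that the rejection rate stays close to the nominal 5% level.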
Deliverables
For this project, we ask you to produce three deliverables. The report and .html documents are due on November 7, 8pm.
- A short standalone report, at most 7 pages long, structured as a concise research paper, including:
  - Context and motivation. Brief but rooted in theory.
  - Research question.
  - Empirical specification. Describe both the specification you would use if you were working on actual data and the one you will use on your simulated data (if different).
  - Data section. Introduce the structure of your different simulations [1] and the key choices you made. Keep it brief, with full details in the Quarto file.
  - Modeling and analysis results. The most important aspect is to discuss some challenges you may face (e.g. heterogeneous effects, strong correlation between your FE and treatment, etc.) and to explain how you tested and explored them.
  - Discussion of lessons learned for your research project (e.g. structure and granularity of the data needed, aspects you will need to be careful about in your analysis) and potential real-world challenges you may face but did not simulate or explore here.
- A .html document generated with Quarto presenting your whole analysis. You should return a rendered and standalone .html version of a Quarto document. It should be roughly similar to the documents describing the simulations we implemented together (for instance here). Report your code and describe your choices concisely but extensively.
- A 20 min presentation of your project (on November 4, 8:30am).
Replication
For the replication, you will participate in the replication games organised by the CERGIC and the Institute 4 Replication on October 9 at the ENS de Lyon. Participation is required, as part of this class.
Please form 3 teams (of 4-5) and indicate the names of your team members in the following registration form: https://www.surveymonkey.ca/r/Replication_Games_Lyon_2025
There will also be an online pre-game meeting organised by the I4R to describe how the day will unfold. You will receive the link to the meeting via email. This meeting will take place on Tuesday September 23, at 1pm.
You are asked to write a report on your replication and to send it to me by October 14, 8:30am.
Bibliography
The course website provides a series of references to complement and go beyond the material taught in this class. Among those, key handbook references are:
- Angrist, Joshua D. and Jörn-Steffen Pischke, Mostly Harmless Econometrics: An Empiricist’s Companion: the causal inference bible for economists
- Cunningham, Causal Inference, The Mixtape: a nice overview of causal inference approaches, with R, Python and Stata code and examples
- Huntington-Klein, The Effect: An Introduction to Research Design and Causality: an excellent causal inference textbook centered on intuitions and not maths.
- Gelman, Hill and Vehtari, Regression and Other Stories: presents regression from a fresh stats perspective, centered on intuition, simulations and design.
Footnotes
[1] You will have various simulations with increasing levels of complexity.