Syllabus - Topics in Econometrics (ECO5106)

MSc Advanced Economics - ENS Lyon

Course website

All information related to this class is available on the course website.

Instructor

Course Objectives and Overview

This course explores some of the practical issues and challenges routinely faced when carrying out an applied econometric analysis. It aims to make students aware of these challenges and to give them the tools to spot others on their own.

This course aims to give students a deeper understanding of:

  • How regression works “under the hood”
  • Common pitfalls and challenges in empirical work
  • Advanced causal identification strategies and their assumptions
  • How design, modeling, and analysis choices shape empirical results
  • Using simulations to explore estimator behavior and diagnose potential problems.

Outline

More specifically, the course content is divided into eight three-hour lectures:

  1. Overview and fundamental hurdles
  2. Simulations
  3. Design beyond identification
  4. Design: Identification and Fixed Effects
  5. Data visualization
  6. Design: IV and RDD
  7. Modelling
  8. Analysis

Prerequisites

Prerequisites for this class include foundational knowledge in econometrics, statistical theory, causal inference, mathematics for economists, and familiarity with statistical software, in particular R and the Tidyverse.

Grading and assignments

Your grade for this class will be divided between several kinds of assignments and grading mechanisms:

Assignment | Percentage of final grade | Due date
Final report | 30% | November 7, 8pm
Final presentation | 20% | November 4, 8:30am
Participation | 10% | -
Replication | 20% | October 14, 8:30am
Homework | 20% | See below

Homework

Your homework will consist of graded assignments and/or readings. Readings are mandatory: we will discuss the papers together in class, and everyone will be invited to participate in the discussion. Homework is due before the beginning of each lecture, according to the following schedule:

Final Project

Overview

This project can be completed in teams of two, or alone if you prefer. Your team will design and implement a simulation that replicates an analysis one of you might pursue in your master’s thesis. Think of this as an opportunity to get an early start on a potential research idea, though it does not need to be something you will actually work on in your thesis.

You will generate synthetic data, specify a data-generating process (DGP), define an identification strategy, and estimate a regression model to recover a causal effect of interest. Because you generate the data, you will know the “true” effect, which allows you to examine how well your empirical strategy recovers it.
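As a rough illustration of this workflow, the sketch below simulates a deliberately simple DGP with one confounder and compares a naive regression with one that controls for the confounder. It is written in plain Python only to stay dependency-free (the course itself works in R), and every name and parameter value in it (`simulate_data`, `true_effect`, the coefficients 0.8 and 1.5) is a hypothetical choice made for the example, not part of the assignment.

```python
import random


def simulate_data(n, true_effect=2.0, seed=42):
    """Draw n observations from an assumed DGP with one confounder u."""
    rng = random.Random(seed)
    y, d, u = [], [], []
    for _ in range(n):
        u_i = rng.gauss(0, 1)                       # confounder
        d_i = 0.8 * u_i + rng.gauss(0, 1)           # treatment depends on u
        y_i = true_effect * d_i + 1.5 * u_i + rng.gauss(0, 1)  # outcome
        y.append(y_i)
        d.append(d_i)
        u.append(u_i)
    return y, d, u


def mean(v):
    return sum(v) / len(v)


def ols_slope(y, x):
    """Slope of a bivariate OLS regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residualize(v, w):
    """Residuals from regressing v on w (with intercept)."""
    b = ols_slope(v, w)
    a = mean(v) - b * mean(w)
    return [vi - a - b * wi for vi, wi in zip(v, w)]


def estimate_effects(n=5000, true_effect=2.0, seed=42):
    """Return (naive, controlled) estimates of the effect of d on y."""
    y, d, u = simulate_data(n, true_effect, seed)
    naive = ols_slope(y, d)  # omits u, so it is biased under this DGP
    # Frisch-Waugh-Lovell: partial u out of both y and d, then regress.
    controlled = ols_slope(residualize(y, u), residualize(d, u))
    return naive, controlled


if __name__ == "__main__":
    naive, controlled = estimate_effects()
    print("true effect: 2.0")
    print("naive estimate:", round(naive, 3))
    print("controlled estimate:", round(controlled, 3))
```

Because the true effect is set by hand, the gap between each estimate and that value can be read directly as the error of the corresponding empirical strategy under this particular DGP.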

Objective of the Project

This project aims to make you think carefully about the type of data you need for your master’s thesis, its structure, the identification strategies you could use, the parameter you would actually be estimating, and the hurdles you might encounter in practice.

You should use this exercise to:

  • Define a clear research question and identification strategy for your master’s thesis or research project
  • Think about the design of your analysis: the structure of the actual data you may use in your thesis and which variables to include
  • Consider potential threats to identification and to the estimation of the effects of interest
  • Evaluate how some undesirable features might prevent you from recovering the true effect of interest
  • Reflect on what challenges might arise when applying the same design to real-world data.

The final product should resemble a short research paper, with the crucial difference that your data are simulated. Your report should also highlight potential pitfalls that you might face when implementing the same analysis on actual data.

Important

When generating your data, start extremely simple. You will add complexity and make your analysis more realistic later. And start working on this assignment early!

Project Proposal

Write a short document (2 pages max) that briefly presents:

  • The context and motivation
  • Your research question
  • Your main econometric specification
  • What you intend to test or explore in your analysis (for instance the sample size needed, the impact of the level of fixed effects chosen, etc.)
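To give a sense of what exploring "the sample size needed" can look like, here is a minimal sketch that simulates many small experiments and records how often a zero effect is rejected at each sample size. It assumes a balanced binary treatment, unit-variance normal noise, and a two-sided 5% test based on a normal approximation. The function names, the effect size of 0.2, and the number of replications are illustrative assumptions, and the sketch is in plain Python rather than the R used in class.

```python
import random
import statistics


def rejects_null(n, true_effect, rng):
    """Simulate one balanced experiment of size n and test for a zero effect.

    Returns True if the difference in means is significant at the 5% level
    (two-sided, normal approximation).
    """
    half = n // 2
    control = [rng.gauss(0, 1) for _ in range(half)]
    treated = [true_effect + rng.gauss(0, 1) for _ in range(half)]
    diff = statistics.mean(treated) - statistics.mean(control)
    se = (statistics.variance(treated) / half
          + statistics.variance(control) / half) ** 0.5
    return abs(diff / se) > 1.96


def simulated_power(n, true_effect=0.2, n_sims=500, seed=1):
    """Share of simulated experiments of size n that reject a zero effect."""
    rng = random.Random(seed)
    hits = sum(rejects_null(n, true_effect, rng) for _ in range(n_sims))
    return hits / n_sims


if __name__ == "__main__":
    for n in (50, 200, 800):
        print(n, simulated_power(n))
```

The same loop structure carries over to other design choices: replace the sample size with whatever feature you want to vary and track how the estimates respond.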

Deliverables

For this project, we ask you to produce three deliverables. The report and the .html document are due on November 7, 8pm.

  • A short standalone report, at most 7 pages long, structured as a concise research paper, including:

    • Context and motivation. Brief but rooted in theory.
    • Research question.
    • Empirical specification. Describe both the specification you would use if you were working on actual data and the one you will use on your simulated data (if different).
    • Data section. Introduce the structure of your different simulations[1] and the key choices you made. Keep it brief, with full details in the Quarto file.
    • Modeling and analysis results. The most important aspect is to discuss some challenges you may face (e.g. heterogeneous effects, strong correlation between your FE and treatment, etc.) and to explain how you tested and explored them.
    • Discussion of lessons learned for your research project (e.g. structure and granularity of the data needed, aspects you will need to be careful about in your analysis) and potential real-world challenges you may face but did not simulate and explore here.
  • A .html document generated with Quarto presenting your whole analysis. Submit a rendered, standalone .html version of a Quarto document. It should be roughly similar to the documents describing the simulations we implemented together (for instance here). Report your code and describe your choices concisely but thoroughly.

  • A 20-minute presentation of your project (on November 4, 8:30am)

Replication

For the replication, you will take part in the replication games organised by the CERGIC and the Institute for Replication (I4R) on October 9 at the ENS de Lyon. Participation is required as part of this class.

Please form 3 teams of 4-5 and enter the names of your team members in the following registration form: https://www.surveymonkey.ca/r/Replication_Games_Lyon_2025

There will also be an online pre-game meeting organised by the I4R to describe how the day will unfold. It will take place on Tuesday, September 23, at 1pm. You will receive the link to the meeting via email.

You are asked to write a report on your replication and to send it to me by October 14, 8:30am.

Bibliography

The course website provides a series of references to complement and go beyond the material taught in this class. Among those, key handbook references are:

Footnotes

  1. You will have various simulations with increasing levels of complexity.