class: right, middle, inverse, title-slide

.title[
# Lecture 1 - Overview and fundamental hurdles
]
.subtitle[
## Topics in Econometrics
]
.author[
### Vincent Bagilet
]
.date[
### 2025-09-11
]

---
class: right, middle, inverse

# Introduction and Steps of Econometric Analyses

???

I introduce myself, they introduce themselves (ask them what they'd like to do later)

---
class: titled, middle

# Objective of the class

- Discuss **practical issues** and challenges that one may face when doing applied economics:
  - Help you be aware of some of them
  - Provide you with some tools to be able to spot others by yourself
- These hurdles may arise at any step of the research process
- *Topics* class: will not cover everything but instead pick points within topics

---

# Steps of Applied Economics Analyses

<br>

--

.pull-left[
- Define question/topic
- Find, get, wrangle and clean data
- Summary statistics
- Define an identification strategy
- Build a regression model
]

.pull-right[
- Estimate your model
- Specification checks
- Additional inference
- Robustness checks
- Communicate
]

???

- I need a running example here. My MA thesis on gasoline prices?
- 4: You had a class on this last year, that's fun, that's also very much discussed in economics. Same for 7
- 6: You had a lot of classes and there are a lot of resources on this. That's also very much discussed in economics but clearly, not fun
- 5: Talked about it last year but will talk more about it this year
- For other points, pitfalls less discussed

---
class: titled, middle

# A More Structured Version

- **Design**: decisions of data collection and measurement
  - *eg*, decisions related to sample size and ensuring exogeneity of the treatment
- **Modeling**: define statistical models
  - In between design and analysis
- **Analysis**: estimation and questions of statistical inference
  - *eg* standard errors, hypothesis tests, and estimator properties

???
- We can actually group these points into relevant categories: step back and take some perspective
- Let's talk about the main thing in the previous list that we will not discuss at all because it is not the purpose of the class: good research questions

---
class: titled, middle

# Outline of the class

<br>

1. Overview and fundamental hurdles
1. Simulations
1. Design: beyond identification
1. Design: identification (Fixed Effects and related)
1. Data visualization
1. Design: identification (IV and RDD)
1. Modeling
1. Analysis

???

- Provide more details on the content of each section
  1. Data visualization: how can we use data viz to put some of our modeling and identification hypotheses to the test?

---
class: right, middle, inverse

# Research questions

---
class: titled, middle

# What is a good research question?

--

- It **can be answered**
  - There is some sort of objective answer
- It should **improve our understanding of the world**
  - Should **inform theory** in some way
- Takes us from theory to a hypothesis (a statement about what we will observe in the world)
- A solid econometric analysis only matters to the extent that you have a good research question (and the converse may hold as well)

???

---
class: titled, middle

# Example

- Impact of boat motor size on cod catch in Norway, under a catch cap
- Not that interesting in itself, is it? It would be more interesting if, for instance, we:

--

- Look at this from a game theory and forced technological adoption perspective
- Find a way to use this case to say something *new* or *different* on the management of renewable natural resources
- One can produce radically different papers on the same topic and setting
- **Use theory to shed light on your specific case *and* your specific case to inform theory**

???

- Example: motor size of boats in Norway on cod catch, in a situation where we have a catch cap/quota. Not really interesting in itself, is it?
- But it might be more interesting if, for instance, you look at this from a game theory perspective and a perspective of forced technological adoption, and use this theory to shed light on your particular case and your particular case to inform theory
- But

---
class: titled, middle

# Identifying a research question

- Can start with a research question/hypothesis or from theory
- **Or** can find a natural experiment and come up with a question
- Know your literature to identify **gaps**
- We are interested in **why** and not **what**
- **Avoid data mining**: it can help, but only to identify questions to test on **other** data sets

---
class: titled, middle

# Is your research question good?

- **Relevance**: is it interesting, important or policy relevant?
- **Potential results**: what would any result tell you about your theory?
- **Feasibility**: is the right data available?
- **Scale**: how many resources would you need?
- **Research design**: is there a good one that would allow you to answer your question?
- **Keep it simple**: avoid building several questions into one

???

---
class: right, middle, inverse

# Logistics

---

# Website

<center>
<img src="data:image/png;base64,#images/website.png" width="931" />

https://vincentbagilet.github.io/metrics_m2/
</center>

---
class: titled, middle

# A typical lecture

1. I introduce concepts and intuition
1. We discuss a paper together (when a reading is assigned)
1. Some R coding together *and* on your own

---
class: titled, middle

# Grading and assignments

| Assignment         | Percentage of final grade  | Due date           |
|--------------------|:--------------------------:|-------------------:|
| Final report       | 30 %                       | November 7, 8pm    |
| Final presentation | 20 %                       | November 4, 8:30am |
| Participation      | 10 %                       | -                  |
| Replication        | 20 %                       | October 13, 8pm    |
| Homework           | 20 %                       | Throughout         |

---
class: titled, middle

# Final project

- In pairs
- Build a simulation to replicate an analysis you may do in your master's thesis
- Generate realistic fake data, run your analysis and discuss your results
- Project proposal, short report, presentation

---
class: titled, middle

# Structure of the final project

- Structure it as a short research paper (pitch):
  - Quick motivation and context
  - Research question
  - Data section: describe how you generate your data. Start very simple, add complexity later
  - Modeling and analysis: describe your model, your choices and the outputs of your regressions
  - Discussion: what did you learn from this exercise?

---
class: titled, middle

# Homework (due before lecture)

<br>

1. -
2. Non-graded assignment
3. -
4. Graded assignment + reading
5. Graded assignment
6. Graded project proposal + reading
7. Replication + reading
8. Graded assignment + reading

---
class: right, middle, inverse

# Fundamental hurdles

---
class: titled, middle

# Typical pitfalls in economics research

--

.pull-left[
- Spurious correlation
- Reverse causality
- Confounders
]

.pull-right[
- Model misspecification
- External validity
- Insufficient power
]

???
- Spurious correlation: that's in part why theory matters
- Examples of estimates informing policy:
  - Impact of a public policy on an outcome (eg Social Cost of Carbon)
  - If we want to choose which of several policies to implement, etc

---

## Spurious correlation

<a href="https://www.tylervigen.com/spurious/correlation/1402_viewership-of-the-big-bang-theory_correlates-with_google-searches-for-how-to-make-baby" target="_blank"><img src="data:image/png;base64,#images/tbbt_baby.png" width="800" style="display: block; margin: auto;" /></a>

---

## Reverse causality

<a href="https://ourworldindata.org/grapher/solar-pv-prices-vs-cumulative-capacity?time=earliest..2022" target="_blank"><img src="data:image/png;base64,#slides_01_hurdles_files/figure-html/pv-1.png" width="70%" style="display: block; margin: auto;" /></a>

---
layout: true

## Confounders

---

[https://forms.gle/tHcTYqaKPeDTAUn56](https://forms.gle/tHcTYqaKPeDTAUn56)

<img src="data:image/png;base64,#images/qr_vege.png" width="400" style="display: block; margin: auto;" />

---

<img src="data:image/png;base64,#slides_01_hurdles_files/figure-html/graph_vege-1.png" width="70%" style="display: block; margin: auto;" />

---

<img src="data:image/png;base64,#slides_01_hurdles_files/figure-html/vege_smooth-1.png" width="70%" style="display: block; margin: auto;" />

---

<img src="data:image/png;base64,#slides_01_hurdles_files/figure-html/vege_gender-1.png" width="70%" style="display: block; margin: auto;" />

---

<img src="data:image/png;base64,#slides_01_hurdles_files/figure-html/DAG-1.png" width="80%" style="display: block; margin: auto;" />

---
layout: false

## Model misspecification

<img src="data:image/png;base64,#slides_01_hurdles_files/figure-html/anscombe-1.png" width="60%" style="display: block; margin: auto;" />

---
class: titled, middle

# Examples of more subtle hurdles

- Bad controls
- Leverage and outliers
- Measurement error
- Clustering level
- How do identification strategies actually work (*eg* FE and TWFE)
- More complex identification strategies (*eg*, shift-share)

---
class: right, middle, inverse

# Avoiding hurdles

---

# How to avoid hurdles?

--

**Learn**, understand metrics and applied research

--

<img src="data:image/png;base64,#images/duh.png" width="400" style="display: block; margin: auto;" />

???

- Seriously, that can be helpful to avoid doing some wrong stuff
- Knowing that some things do not work, etc

---
class: center

<a href="https://en.wikipedia.org/wiki/The_Barque_of_Dante_(Manet)" target="_blank"><img src="data:image/png;base64,#images/manet_delacroix.jpg" width="580" /></a>

--

*The Barque of Dante* by Manet, after a painting by Delacroix

**Replication**, a helpful learning tool

???

- *La Barque de Dante*, after Eugène Delacroix, by Édouard Manet

---
layout: true

# How to avoid hurdles?

---

### Derive the maths

<br><br>

- Sometimes relatively straightforward and very illuminating
  - *eg* drivers of the variance of your estimator: `\(\mathbb{V}_{\hat{\beta}} = \dfrac{\sigma_u^2}{n \sigma_x^2}\)`
- Deriving the maths can be more complex and time-consuming

---

### Simulations can help

- Simple simulations are very easy to implement
- Can reveal **what does not work**
- Can help you identify **where the issue comes from**
- We will discuss that with an example in a second

---

### Implement checks

--

.pull-left[
- Check if the model seems to represent the DGP
- Check if our identification hypotheses seem to hold
- Check if the hypotheses for estimation seem to hold
]

--

.pull-right[
<br><br><br>
Look at the consequences if this does not hold
]

-------

- Robustness checks
- Evaluate the design retrospectively

???

- Think about: identification strategy, but other aspects as well
- We will study these other

---
class: titled, middle
layout: false

# Objectives for this class

- Build a mindful mindset
  - Help you be aware of some common hurdles
  - Provide you with some tools to be able to spot others by yourself
- Learn how to implement simulations

???
- Sim: learn how to implement them but also learn about their usefulness
- Objective for me (beyond helping you learn important material): develop one of my ongoing research articles
- Learn something that may not be often discussed but that is really helpful

---
class: right, middle, inverse

# Simulations

## Usefulness through an example

---
class: titled, middle

# A simple example: OVB

- How does an **omitted variable** affect our point estimate of interest? Why?

--

- Under which conditions is an omitted variable an issue?
- How does it affect the point estimate? The s.e.?
- How does that vary with various parameters? *eg* correlation between variables (sign and magnitude)
- Start very simple, then add complexity to the process
- Let's move to R

???

- You may already know this; it is just a very simple example
- Do you know everything there is to know? How does it affect

---
class: right, middle, inverse

# Lecture summary

---
class: titled, middle

# What did we do today?

- Discussed the structure of applied econometric research and where we may encounter hurdles
- Discussed logistics
- Reviewed some of the common hurdles encountered in applied research
- Saw how to implement a simple simulation to understand the impact of an omitted variable

---
class: titled, middle

## What did you learn, like, and dislike?

???

- First class here, with you
- First time teaching my own material
- First time teaching something like this
- I want to have your opinion for next time
- Was it too simple, too complex?

---
class: right, middle, inverse

# Thank you!
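
---

# Appendix: a sketch of the OVB simulation

The in-class exercise is done in R. As a language-neutral illustration, here is a minimal Python sketch of the same idea, using only the standard library. Everything in it is an assumption made for illustration, not the course's own code: the data-generating process, the parameter names (`beta`, `gamma`, `delta`) and the helper functions are hypothetical. An omitted variable `w` affects `y` and is correlated with the regressor `x`. The sketch compares the slope from the "short" regression (omitting `w`) with the slope from the "long" regression (controlling for `w`, computed by first residualizing `x` on `w`).

```python
import random


def slope(x, y):
    """OLS slope of y on x (with an intercept): cov(x, y) / var(x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def residualize(x, w):
    """Residuals from an OLS regression of x on w (with an intercept)."""
    b = slope(w, x)
    mean_x = sum(x) / len(x)
    mean_w = sum(w) / len(w)
    return [a - mean_x - b * (c - mean_w) for a, c in zip(x, w)]


def expected_short(beta, gamma, delta):
    """Large-sample short-regression slope under the assumed DGP:
    beta + gamma * cov(x, w) / var(x), with cov(x, w) = delta
    and var(x) = delta**2 + 1."""
    return beta + gamma * delta / (delta ** 2 + 1)


def simulate_ovb(n=5000, beta=1.0, gamma=2.0, delta=0.5, seed=1):
    """Assumed (hypothetical) DGP:
    w ~ N(0, 1), x = delta * w + e, y = beta * x + gamma * w + u,
    with e and u independent N(0, 1) draws."""
    rng = random.Random(seed)
    w = [rng.gauss(0, 1) for _ in range(n)]
    x = [delta * wi + rng.gauss(0, 1) for wi in w]
    y = [beta * xi + gamma * wi + rng.gauss(0, 1) for xi, wi in zip(x, w)]
    short_est = slope(x, y)                 # w omitted: biased when gamma * delta != 0
    long_est = slope(residualize(x, w), y)  # w controlled for (Frisch-Waugh-Lovell)
    return short_est, long_est


if __name__ == "__main__":
    short_est, long_est = simulate_ovb()
    print("short regression slope:", round(short_est, 3))
    print("long regression slope: ", round(long_est, 3))
    print("expected short slope:  ", expected_short(1.0, 2.0, 0.5))
```

Varying `delta` (sign and magnitude of the correlation between `x` and `w`) and `gamma` shows when the omitted variable matters: under this assumed DGP the bias vanishes when either is zero, and its sign follows the sign of their product.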