Present how we can analyze text data: the types of tasks we can perform and the different approaches for doing so.
Summary
We saw how to pre-process text data in the previous lecture. The next step is to transform this unstructured text into a numeric format that we can use in analysis. In this lecture, we introduce the usual numeric representations of text and discuss classical approaches for building sparse matrix representations (populated, for instance, with counts or TF-IDF weights). We then discuss more advanced methods for building dense representations, such as word embeddings. These representations allow us to apply the supervised ML algorithms that you studied in the rest of the class, as well as unsupervised methods (for instance, LDA). Finally, we briefly discuss recent developments in NLP, which make it possible to compute context-specific representations and to implement more complex economic analyses.
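To make the idea of a sparse representation concrete, here is a minimal sketch in plain Python. The toy corpus, the variable names, and the specific weighting (raw term count times log(N / document frequency)) are assumptions chosen for illustration; TF-IDF has several common variants, and libraries typically apply additional smoothing or normalization.

```python
import math
from collections import Counter

# Hypothetical toy corpus, assumed already lower-cased and cleaned.
docs = [
    "the central bank raised rates",
    "the bank cut rates",
    "the court issued a ruling",
]

# Tokenize on whitespace and build a sorted vocabulary.
tokenized = [doc.split() for doc in docs]
vocab = sorted({tok for doc in tokenized for tok in doc})

# Document-term count matrix: one row per document, one column per term.
term_counts = [Counter(doc) for doc in tokenized]
counts = [[tc[term] for term in vocab] for tc in term_counts]

# Document frequency: number of documents in which each term appears.
n_docs = len(tokenized)
df = {term: sum(term in tc for tc in term_counts) for term in vocab}

# TF-IDF under the assumed variant: count * log(N / df).
tfidf = [
    [count * math.log(n_docs / df[term]) for count, term in zip(row, vocab)]
    for row in counts
]

for doc, row in zip(docs, tfidf):
    weights = {t: round(w, 3) for t, w in zip(vocab, row) if w > 0}
    print(doc, "->", weights)
```

Under this variant, a term that appears in every document (here, "the") receives a weight of zero, while terms specific to one document receive the largest weights. Most entries of both matrices are zero, which is why these representations are called sparse.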