Resources
Here is a list of resources and references that can help with this class and with going beyond the material it covers.
If you find a good resource that is not listed here and could interest your classmates (or me), please add it to this shared document.
Textbooks
- Gelman, Hill and Vehtari, Regression and Other Stories: presents regression from a fresh stats perspective, centered on intuition, simulations and design.
- Huntington-Klein, The Effect: An Introduction to Research Design and Causality: an excellent causal inference textbook centered on intuitions and not maths.
- Cunningham, Causal Inference, The Mixtape: a nice overview of causal inference approaches, with R, Python and Stata code and examples.
- Angrist, Joshua D. and Jörn-Steffen Pischke, Mostly Harmless Econometrics: An Empiricist’s Companion: the causal inference bible for economists.
R
R Textbooks
Here is a list of useful resources.
- R for Data Science: definitely the best resource to learn R and the tidyverse by yourself.
- R cheatsheets: great summaries of the functions in key tidyverse packages.
- The tidyverse style guide: a very short book to help you write more legible R code.
- Tidy design principles: a more advanced book to help you write better R code.
Useful R Packages
Data viz
Data visualization is key to any statistical analysis, whether it is used to explore the data, build a model, check hypotheses or present and communicate results. Here are a few resources that can help you build nice and relevant data viz in R.
- R for Data Science provides a nice introduction to ggplot2, THE package to build plots in R. To go further, you can read this book.
- Modern Data Visualization with R and Data Visualization - A practical introduction both discuss key data viz concepts and teach you how to apply them in R with ggplot2.
- Fundamentals of Data Visualization does not include R code directly but offers a great introduction to data viz; the R code can be found on the book’s GitHub.
Finding data
Data are obviously central to statistical analyses, but identifying relevant datasets is often not an easy task. Here is a non-exhaustive list of places to look for useful datasets.
- Google Dataset Search: a search engine for datasets hosted in thousands of repositories across the Web. Info here.
- Data Europa: European public sector datasets.
- Data.gouv: open platform for French public data.
- Data Is Plural: a weekly newsletter providing useful and often original datasets. There is a spreadsheet archiving all the datasets.
- A curated list of French open data sources.
- Quantitative Social Science Data: a curated list of economics datasets.
- Awesome Public Datasets: a list of topic-centric public data sources.
- PolData: a list of political science datasets.
- ICPSR: stores, curates, and provides access to scientific data so others can reuse it.
- Harvard Dataverse: a repository of datasets used in published research articles.