Summaries

This paper aims to identify tangible design parameters that might lead to inaccurate estimates of relatively small effects, the short-term health effects of air pollution. Based on our findings, we build a list of recommendations and pitfalls to avoid in future studies.

Abstract

This paper identifies tangible design parameters that might lead to inaccurate estimates of relatively small effects, the short-term health effects of air pollution. A lack of statistical power not only makes relatively small effects difficult to detect but resulting published estimates might exaggerate true effect sizes. We first document the existence of this issue in the epidemiology and economics literature of interest. Then, we identify its drivers using real data simulations that replicate most prevailing inference methods. Finally, we argue relevance to other settings and propose a principled workflow to evaluate and avoid exaggeration when conducting a non-experimental study.

Plain language summary

The negative health consequences of long term exposition to air pollution are well known to the general public. Yet, a vast scientific literature has shown that short term impacts can also be disastrous. Air pollution peaks increase the number of deaths and admissions in hospitals for respiratory and cardiovascular causes, even immediately on the day of the event. Importantly, the literature has shown that even low levels of air pollution can impair health. All this evidence led to the implementation of public policies to mitigate these adverse health effects. Inaccurate estimations of the short term health effects of air pollution may lead to an inadequate public policy response to this issue.

These effects are often of small magnitude, especially when considering the impact of small pollution peaks. To detect and measure them accurately, one therefore needs methods that are particularly precise. A precise study will return a small range of values within which one can be confident that the true effect lies. If a method is not precise enough, this range will be large and it will not be possible to know what the exact effect is. Research in statistics has shown that, coupled with the usual statistical practices, imprecision often leads to exaggerating effect sizes.

To evaluate whether existing studies are likely to overestimate these effects, we carry out an extensive review of the literature. We analyze the precision of papers in this literature and find that while many studies present an adequate precision, a substantial share of studies do not and are likely to overestimate the effects. The analysis of published results highlights potential shortcomings of the literature but does not enable us to identify what causes these issues. To pinpoint drivers of these issues, we run simulations. We simulate health effects and air pollution shocks and test the ability of different methods to pick up these fake effects. We find that some methods systematically perform poorly and greatly exaggerate the effects. For all methods, we show that using a small sample, focusing on a sub-population or studying a limited number of air pollution peaks leads to an overestimation of the effects.

Our findings induce a series of guidance for future studies investigating the short term health effects of air pollution. Overall, researchers should pay careful attention to the precision of their studies to avoid overestimation. In some cases, the use of certain statistical methods should be avoided. In general, sample size should be large enough to be able to measure the effects accurately. We recommend that researchers run simulations before starting their analyses in order to evaluate whether they are at risk of overestimating the effects they are trying to measure. After running their analysis, we advise researchers to run an other battery of quick tests to confirm that the effect they find is not exaggerated.