The Association of Opening K-12 Schools and Colleges with the Spread of COVID-19 in the United States: County-Level Panel Data Analysis

This paper empirically examines how the opening of K-12 schools and colleges is associated with the spread of COVID-19 using county-level panel data in the United States. Using data on foot traffic and K-12 school opening plans, we analyze how an increase in visits to schools and opening schools with different teaching methods (in-person, hybrid, and remote) is related to the 2-weeks forward growth rate of confirmed COVID-19 cases. Our debiased panel data regression analysis with a set of county dummies, interactions of state and week dummies, and other controls shows that an increase in visits to both K-12 schools and colleges is associated with a subsequent increase in case growth rates. The estimates indicate that fully opening K-12 schools with in-person learning is associated with a 5 (SE = 2) percentage points increase in the growth rate of cases. We also find that the positive association of K-12 school visits or in-person school openings with case growth is stronger for counties that do not require staff to wear masks at schools. These results have a causal interpretation in a structural model with unobserved county and time confounders. Sensitivity analysis shows that the baseline results are robust to timing assumptions and alternative specifications.


Policy Research Working Paper 8929
This paper presents the three-year impacts of an improved biomass cookstove on child and adult health in rural Ethiopia. After near complete stove adoption during an initial one-year randomized controlled trial, 60 percent of treatment households continued to use the improved stoves three years on and experienced reductions in hazardous airborne particulate matter. The study finds that treatment status is associated with a precisely estimated 0.3-0.4 standard deviation improvement in height-for-age of young children exposed during their first years of life, compared with a control group of households that never used the improved stove. This is a substantial effect with implications for greater health and well-being throughout the life course. However, the study finds no changes in the respiratory symptoms or physical functioning of older children and adult cooks in treated households relative to control households. The results advance understanding of the health impacts of hazardous air pollution while also refining the design and implementation options for interventions geared toward improving well-being in similar environments.
This paper is a product of the Development Research Group, Development Economics. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://www.worldbank.org/prwp. The lead author and task team leader may be contacted at drlafave@colby.edu and mtoman@worldbank.org.

Introduction
The disease burden from indoor air pollution results in an estimated 4 million premature deaths per year and is one of the largest drivers of poor health across the world (Lim et al., 2013;GBD, 2018).
While the diminished quality of air in outdoor environments is, appropriately, a growing concern, the burning of solid biomass materials for cooking and heating within homes is the primary source of the pollution burden (Smith et al., 2012). Forty percent of the world relies on solid fuels such as wood, coal, charcoal, and dung for household cooking with resulting emissions disproportionately harming women and young children in the form of lower-respiratory infections, pneumonia, low birth weights, growth deficits, and fatigue (Alexander et al., 2018;Dadras and Chapman, 2017;Fullerton et al., 2008;Pope et al., 2010;Walker et al., 2007;WHO, 2015).
In addressing these and other health concerns, organizations have sought to improve the quality of indoor air by reducing the use of biomass fuel. Improved cookstove technologies (ICTs), which increase efficiency and reduce emissions by more thoroughly burning fuels, are often recommended as affordable ways for households to transition away from open-fire cooking (Jeuland and Pattanayak, 2012) and improve health outcomes if demonstrably clean (WHO, 2016). For example, in Ethiopia, the setting of this study, the federal government has proposed distributing ICTs to 30 million households by 2030 and identified reducing demand for biomass by increasing fuel efficiency as a strategic priority in the energy sector (FDRE, 2011, 2015). This paper contributes to a growing body of work that assesses health and environmental benefits of improved cookstoves in field settings (Bensch and Peters, 2015;Hanna et al., 2016;Mortimer et al., 2017;Smith et al., 2007;Quansah et al., 2017). We examine the health impacts of the Mirt stove, a particular type of ICT engineered to bake the Ethiopian staple bread, injera. Working with a representative set of approximately 500 households across 36 communities in rural Ethiopia, we distributed Mirt stoves to a random subset of households and study stove use, indoor air quality, and health outcomes of women and children three and a half years after the initial intervention. While tailored to a specific food, and thus not intended to entirely replace traditional cooking methods, the Mirt stove is well suited to the context as firewood for cooking injera is the end-use of over half of all primary energy consumed in Ethiopia (Bizzari, 2010). Evidence from laboratory-based controlled cooking tests shows Mirt reduces fine airborne particulate matter by 41 percent relative to traditional open-fire cooking (Teshome, 2007).
While the exact gradient of the pollution-health relationship is not yet well established in the literature (WHO, 2016), the intensive energy use of injera cooking and laboratory-based values represent considerable reductions with potential health gains (Burnett et al., 2014).
We build on an initial one-year impact evaluation of Mirt dissemination that showed the stoves saved wood when cooking standardized injera batches and users highly valued the benefits provided by the stove (Gebreegziabher et al., 2018). We also found they were adopted and used by over 90 percent of treatment households in the 12 months after their distribution (Beyene et al., 2015). This rate is significantly higher than a number of high-profile prior trials involving improved stoves (e.g. Hanna et al., 2016;Mobarak et al., 2012;WHO, 2006a), and in line with more recent work studying low-cost stoves particularly well-suited to the local environment and cooking practices (e.g. Bensch and Peters, 2015;Rosa et al., 2014).
In the present study, we pay particular attention to secondary health outcomes that may benefit from several years of reduced indoor air pollution, including child growth, a marker of inflammation and early-life health sensitivity during the first years of life that is a powerful predictor of well-being in adulthood (Hoddinott et al., 2013), as well as symptoms of respiratory disease and measures of physical function. Comparing across those randomly offered the Mirt stove with those that were not, our findings suggest the Mirt treatment increased height-for-age for children exposed to this technology before the age of three by approximately 0.3 to 0.4 standard deviation, while the respiratory symptoms and physical functioning of adult cooks and older children did not improve. These patterns are consistent with a critical early period in life when health interventions see their largest positive impacts (e.g. Cusick and Georgieff, 2016;Walker et al., 2007) and the effects of accumulated exposure causing lasting damage in older individuals. The comparison across treatment and control households is supported by within-household estimates that focus on those exposed during the sensitive early-life period to isolate the Mirt treatment effect from confounding factors. The observed effect on child growth is both precisely estimated and meaningful in magnitude.
A wide body of evidence links early-life health to adult health, educational attainment, cognitive function, and earnings (e.g. Adair et al., 2013;Almond et al., 2018;Fogel, 2004;Glewwe and Miguel, 2008;Grimard and Laszlo, 2013;Groppo and Kraehnert, 2016;Hoddinott et al., 2008;LaFave and Thomas, 2017;Strauss and Thomas 1998;Wisniewski, 2016). For example, analysis of the Ethiopian Rural Household Survey in Dercon and Porter (2014) suggests the observed height gains in the treatment group would result in approximately 3 to 4 percent increases in annual earnings as an adult.
To better understand the underlying mechanisms behind the effects, we simultaneously collected measures of fine airborne particulate matter with a diameter of 2.5 micrometers or less (PM2.5) in a random subsample of study households. These concentrations of matter, averaging approximately 1,250 µg/m³, or 50 times more than the World Health Organization's (WHO) threshold for healthy exposure (WHO, 2006b), decreased by 10 percent on average for all households in the Mirt treatment arm, relative to the control group. Meanwhile, households with young children, which also cook more frequently, experienced reductions of 24 percent on average. When examining whether measured health outcomes relate directly to particulate matter, we demonstrate a clear gradient between reduced household air pollution and increased child growth, although we do not find a statistically significant or otherwise meaningful relationship between the level of particulate matter and respiratory symptoms or adult physical functioning.
These results advance our understanding of the health impacts of hazardous air pollution, while also refining design and implementation options for interventions geared toward mitigation in similarly situated environments. Health benefits for adults and older children do not arise in the data, but, given the link between early-life health and education, cognition, adult health, and earnings in later life, the demonstrated advancement for young children represents a substantial and meaningful impact.

Ethiopian context and Mirt improved cookstove technology
Despite enjoying economic growth over the preceding decade, Ethiopia is known as a "hotspot" for forest depletion, and it is one of the top four countries in the world to simultaneously maintain a high level of nonrenewable fuelwood consumption per capita and disease burden from household air pollution (Bailis et al., 2015). Neonatal disorders and lower respiratory infections, both linked to household air pollution (Amegah et al., 2014;Smith et al., 2000;Perez-Padilla et al., 2010), are two of the top three leading causes of death in Ethiopia (GBD, 2018), and improved cooking technology is virtually nonexistent in rural areas.
This study focuses on the dissemination and use of Mirt stoves in rural Ethiopia. Mirt (translated as "best") stoves are a locally developed technology specifically designed to cook injera, the staple bread of Ethiopia. The stove was designed by the Ethiopian Ministry of Water and Energy, is produced from locally available raw materials, and sells at a market price of 100-250 birr (approximately $10.00 USD at the start of the study). Repeated controlled cooking trials done in a laboratory setting suggest the Mirt reduces fuelwood use by 50.7 percent, carbon monoxide emissions by 92.3 percent, and particulate matter by 41 percent compared to traditional open-fire or three-stone technology (Teshome, 2007). Given the estimated fuel savings and a typical injera baking frequency of two to three times per week, the break-even period of two months compares favorably with the stove's five-year life expectancy.
While GIZ, a German governmental organization, has supported the stoves since 1998, very few households in rural settings have access to the Mirt technology or other improved cookstoves.
No improved stoves were present in our study households at baseline, and data from the 2013 Ethiopian Rural Socioeconomic Survey indicate that 98.4 percent of households in rural areas cook exclusively over traditional open-fires (Central Statistics Agency of Ethiopia, 2013).

Initial randomized controlled trial and adoption estimates
In 2013 we conducted an initial one-year randomized evaluation of Mirt-stove adoption, its fuel savings, and users' willingness to pay for the product. The baseline of the study surveyed households from 36 communities (Gots or sub-Kebeles) randomly selected to represent the forest cover in Amhara, Oromiya, and Southern Nations, Nationalities, and Peoples' (SNNP) regional states. The three states cover approximately 70 percent of the land area and 80 percent of the population of Ethiopia.
Within each enumeration area, we randomly drew 14 households from a local census, and assigned them into treatment and control groups at the household level within each community. The 10 households in the treatment group received a Mirt stove and were allocated an additional behavioral treatment based on price (free vs. subsidized), incentives based on use (none vs. payment for recorded use), or a social networking intervention (home training only vs. home training plus community training). The control and treatment households were balanced on covariates at baseline, as well as across arms within the treatment group (see Beyene et al., 2015 for detail on the sampling and baseline analysis). Initial adoption of the stoves was assessed from both self-reports by the primary cook within the household and by electronic stove-use monitors (SUMs) attached to each stove to record time and temperature readings. Interviewers returned to the households to record SUM measurements four times throughout the first year following the randomization. Adoption was high and remained high throughout the year: 12 months after the intervention, use rates were approximately 90 percent, with no statistically significant differences across the treatment arms. The frequency of use remained high as well, with treatment households averaging 2.5 cooking events per week on the Mirt stoves, a rate consistent with the traditional frequency for baking and storing injera (Beyene et al., 2015). Important for interpretation of the health results presented below, baseline results suggest households chose to continue using traditional three-stone open-fire stoves for all non-injera cooking.
Mirt stoves are highly specialized for cooking injera, with the main burner approximately 50 cm in diameter, leaving it necessary to have a second stove to cook other foods such as stews and coffee.
The Mirt technology was adopted for its specific use, with overall pollution and health improvements constrained by the continued use of open fires by both treatment and control households.

Three-year follow up and health outcomes
This study reports on the three-year follow-up investigation conducted in late 2016, approximately 40 months after the establishment of the baseline and the initial distribution of the stoves. Given the initial high adoption rates, we added collections of household indoor air pollution and health outcomes in the follow-up survey to assess the secondary outcomes potentially associated with adoption of the improved biomass stoves. We focused on children and adult cooks, the two groups thought to be most at risk from exposure to indoor air pollution from biomass cooking.
Households in the follow-up survey were re-interviewed with minimal and balanced attrition across treatment and control arms. Of the original 504 households, 480 were re-interviewed, yielding a 95 percent re-contact rate. Uptake of the original stoves remained quite high as 60 percent of the treatment households continued to use their original Mirt stoves with no observed differences across the price, incentive, or networking groups (p-value = 0.67).
In the current study, we specifically focus on three sets of secondary health outcomes linked to indoor air pollution and, potentially, Mirt stove use: child growth, symptom reports of respiratory-related conditions, and measures of physical function captured through activities of daily living.
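The sample sizes quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text (36 communities, 14 households per community, 480 re-interviews); the per-arm totals assume that exactly 10 of the 14 households in every community were assigned to treatment, which is how the design reads but is not tabulated here.

```python
# Sample-size arithmetic implied by the sampling description.
communities = 36
households_per_community = 14
treated_per_community = 10  # assumption: 10 of 14 in every community

baseline_households = communities * households_per_community
treated_households = communities * treated_per_community
control_households = baseline_households - treated_households

reinterviewed = 480
recontact_rate = reinterviewed / baseline_households

print(baseline_households)       # 504
print(treated_households)        # 360
print(control_households)        # 144
print(f"{recontact_rate:.1%}")   # 95.2%
```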
First, young children during the key early-life developmental period are thought to be the most at risk from household air pollution as they inhale and absorb higher levels of particles given the same exposure as adults (Bruce et al., 2013;Kurt et al., 2016;Sturm, 2012). To assess health among this population, a trained enumerator measured height (or recumbent length for children under two years of age) which was then standardized into a z-score relative to a representative, well-nourished child of the same age in months and gender using the Centers for Disease Control and Prevention growth tables (CDC, 2012). Height, conditional on age and gender, is both straightforward to measure and a well-established indicator of health status during very early childhood, reflecting both the genetic endowment and the influence of the disease environment during the in utero period and the first two to three years of life, commonly referred to as the "first 1,000 days" (Cusick and Georgieff, 2016;Martorell and Habicht, 1986;Victora et al., 2010;Waterlow et al., 1977). As there is limited potential for catch-up growth to offset early-life deficits, child height is a powerful predictor of attained height as an adult and is thereby associated with reduced mortality and morbidity, as well as greater economic prosperity, educational attainment, and cognitive function (Fogel, 2004;Glewwe and Miguel, 2008;Hoddinott et al., 2008;LaFave and Thomas, 2017;Strauss and Thomas 1998).
Understanding the biological mechanisms linking biomass smoke exposure and child growth is an active area of inquiry (e.g. Burnett et al., 2014;Gordon et al., 2014;Rylance et al., 2015), with a recent National Institutes of Health review identifying the link between household air pollution and child growth as a specific priority (Martin et al., 2013). Exposure to particulate matter from biomass smoke is a risk factor through in utero exposure as well as direct inhalation during early life (Jayachandran, 2009;Pope et al., 2010). Prenatal impacts occur both as a result of the effect of pollution on the mother and the transfer across the placenta of toxins present in wood smoke that reduce nutrient flows and disrupt the central nervous system of the fetus (Perera et al., 1998;Perera et al., 1999;Poursafa and Kelishadi, 2011). Additional medical evidence links exposure to wood smoke among children to increased inflammation and a weakened immune system (Rylance et al., 2015).
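The text does not spell out the standardization formula. CDC growth charts are published as age- and sex-specific L, M, and S parameters, so one plausible reading is the usual LMS transformation, sketched below. The reference values in `ref` are hypothetical placeholders chosen for illustration, not entries from the CDC tables, and the function name is our own.

```python
import math

def lms_zscore(x, L, M, S):
    """Z-score of measurement x under the LMS method.

    L is the Box-Cox power, M the reference median, S the coefficient
    of variation, all for the child's age in months and sex.
    """
    if L == 0:
        return math.log(x / M) / S
    return ((x / M) ** L - 1.0) / (L * S)

# HYPOTHETICAL reference parameters for one age/sex cell (not CDC values).
ref = {"L": 1.0, "M": 95.0, "S": 0.04}

# A child measuring 90.44 cm against this hypothetical reference.
z = lms_zscore(90.44, **ref)
print(round(z, 2))  # -1.2
```

With L equal to 1 the transformation reduces to (x - M) / (M * S), the familiar distance from the median in standard-deviation units.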
Second, we assess the presence of symptoms indicating respiratory disease for all children and each adult cook in the household. Participants were asked whether they have experienced a series of symptoms over the prior four weeks, including various types of coughs, difficulty breathing, wheezing, and eye problems. Prior studies show that such symptoms are closely linked to clinical markers of respiratory infection and chronic obstructive pulmonary disease (e.g. Bensch and Peters, 2015;Hanna et al., 2016;Pattanayak and Pfaff, 2009) and sensitive to particulate matter exposure (e.g. Duflo et al., 2008).
Finally, each primary cook within the household also provided information on their physical functioning through their ability to perform regular tasks. For each activity of daily living (ADL), the cook reported whether they could do the activity easily, whether they could do it by themselves but with some difficulty, whether they needed assistance, or whether they could not perform the task at all. We included difficult activities such as walking 5km or carrying a heavy load, tasks known as intermediate ADLs, as well as basic ADLs such as standing from sitting and performing routine housework.
Measurement of indoor air pollution in the cooking areas during the follow-up wave was collected in a random subsample of 204 of the 480 households. Particulate matter with an aerodynamic diameter of 2.5 micrometers or less (PM2.5) was recorded using light-scattering sensors and gravimetric pumps and filters. The health module and indoor air pollution measurements were not collected in the baseline survey and were added only for the follow-up wave. In the empirical analysis that follows, we pay particular attention to this fact and provide evidence that it does not bias the identification of our causal treatment estimates given the randomized design of the study.

Empirical Approach
Our baseline analysis compares outcomes across treatment and control households with the control group representing the counterfactual if the treatment group had not been offered Mirt stoves. We complement this difference approach when examining child growth by exploiting the physiology of child development where height is only sensitive to interventions in the first three years of life (Cusick and Georgieff, 2016;Martorell and Habicht, 1986). This allows us to exploit variation between younger and older siblings to identify the effect by a difference-in-difference between children within a household.

Baseline treatment and control comparison
Our baseline models estimate the difference across individuals in treatment and control households.
For a given individual i in household h and community c, we estimate the following model:
$$y_{ihc} = \alpha + \beta \, I(\text{Treatment})_{hc} + X_{ihc}'\gamma + \mu_c + \varepsilon_{ihc} \qquad (1)$$
where y is a health outcome and $\beta$ is the coefficient of interest that captures the causal effect of one's household being assigned to the treatment group. As treatment was randomly assigned and orthogonal to the vector of additional control variables measured at baseline, X, the inclusion of controls aids only in improving the precision of the estimates. Additional covariates include the following: gender; flexible polynomials in years of age for adults or age in months for children; household size and demographic composition; the age, education, and gender of the head-of-household; dwelling characteristics; and a composite wealth index. Models also include community fixed effects, $\mu_c$, to capture observed and unobserved features of local areas that may impact health such as access to services and local markets.
We observe a median of 3 children under the age of 15 per study household and therefore are able to estimate household random effects extensions of equation (1) for child outcomes. This approach allows the unobserved error to contain a fixed component common to all household members yet uncorrelated with treatment status, a plausible specification given the randomized nature of the treatment.
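To make the role of the community fixed effects concrete, the sketch below estimates the coefficient on treatment after demeaning the outcome and the treatment indicator within each community, which is numerically equivalent to including community dummies. It is a simplified illustration: it omits the control vector X and the clustered standard errors, and the six observations are made-up toy values, not study data.

```python
from collections import defaultdict

def itt_with_community_fe(rows):
    """OLS coefficient on `treated` with community fixed effects.

    rows: iterable of (community, treated, y). The outcome and the
    treatment indicator are demeaned within community, then the slope
    is the ratio of the demeaned cross-product to the demeaned
    treatment sum of squares.
    """
    by_community = defaultdict(list)
    for community, treated, y in rows:
        by_community[community].append((treated, y))

    num = 0.0
    den = 0.0
    for obs in by_community.values():
        t_bar = sum(t for t, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for t, y in obs:
            num += (t - t_bar) * (y - y_bar)
            den += (t - t_bar) ** 2
    return num / den

# TOY data: (community, treated, outcome); values are illustrative only.
toy_rows = [
    ("A", 1, -0.9), ("A", 1, -1.1), ("A", 0, -1.3),
    ("B", 1, -1.4), ("B", 0, -1.6), ("B", 0, -2.0),
]
beta_hat = itt_with_community_fe(toy_rows)
print(round(beta_hat, 2))  # 0.35
```

Because assignment was randomized within community, the within-community comparison is the natural one; adding baseline covariates would change precision rather than the target of estimation.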
Equation (1) estimates intent-to-treat (ITT) effects identified by the random assignment of treatment status. This causal estimate includes both the 60 percent who continued using the Mirt stove and the 40 percent who had stopped using it at some point during the prior three and a half years. As stoppage is potentially a confounding choice, we also present estimates of actual stove use using an instrumental variable strategy. These two-stage models replace I(Treatment) with a variable measuring the fraction of time since the baseline the household has used the stove (Hanna et al., 2016). This variable takes a value of 1 for households still using the stove. For those who have abandoned the Mirt, use is measured based on SUM temperature sensor readings and reports by the cook on when the household stopped using the stove. The mean of this stove use measure is 0.89, suggesting that the 40 percent of households who had stopped using the stove at the time of the follow-up survey used the Mirt for approximately 29 months on average. Randomly assigned treatment status then serves as an instrumental variable for stove use in a first stage. As some households in the treatment group do not use the Mirt stove but households in the control group were not observed to start using Mirt on their own, our setting satisfies the necessary condition of one-sided noncompliance to interpret this estimate as the average treatment effect of actual Mirt use (Imbens and Rubin, 2015).
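The 29-month figure follows from the numbers in this paragraph, and the same quantities illustrate how an instrumental variable estimate scales an ITT effect. The sketch below first backs out the implied use among households that stopped, then applies a simple Wald ratio. The Wald form assumes no covariates and zero stove use in the control group; the ITT value of 0.30 is a hypothetical input, not an estimate from the paper.

```python
months_elapsed = 40        # baseline to follow-up, as stated in the text
share_still_using = 0.60   # use measure equals 1 for these households
mean_use_share = 0.89      # mean of the stove-use measure

# Solve 0.89 = 0.60 * 1 + 0.40 * x for x, the mean use share among
# households that had stopped by the follow-up survey.
share_stopped = 1 - share_still_using
use_share_among_stoppers = (mean_use_share - share_still_using) / share_stopped
months_used_by_stoppers = use_share_among_stoppers * months_elapsed
print(round(months_used_by_stoppers))  # 29

def wald_estimate(itt_effect, first_stage):
    """IV estimate with a single binary instrument: ITT / first stage."""
    return itt_effect / first_stage

# With no control-group use, the first stage is the mean use share.
hypothetical_itt = 0.30  # HYPOTHETICAL value for illustration only
print(round(wald_estimate(hypothetical_itt, mean_use_share), 3))  # 0.337
```

Since the first stage is below one, the use-based estimate is mechanically larger than the ITT estimate by a factor of roughly 1/0.89.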

Child growth
Our analysis of child linear growth or height-for-age makes use of the additional feature that the timing of the intervention within a child's life is critical. A significant literature in nutrition and epidemiology defines the first 1,000 days of a child's life to be a crucial period when health interventions can significantly affect child height (e.g. Currie and Vogl, 2013;Hoddinott et al., 2008), after which reduced exposure to indoor air pollution is expected to have no effect on growth (Berkey et al., 1984). Given the significance of this cut-off, improvements in height-for-age should only occur for those children young enough to benefit from the Mirt treatment. Similar identification strategies based on the biology of height-for-age have been used to examine the child health impacts of primary care services (Frankenberg et al., 2005), pension income (Duflo, 2003), cash transfers (Farfan et al., 2011), and supplementary child feeding programs (Giles and Satriawan, 2015), among others.
With 40 months between the baseline and follow-up waves, we are able to separate children into two groups based on their age and expected effect of exposure to the Mirt stove:
$$HAZ_{ihc} = \alpha + \beta_1 \, I(\text{Treatment})_{hc} \times I(\text{Young})_{ihc} + \beta_2 \, I(\text{Treatment})_{hc} \times I(\text{Old})_{ihc} + X_{ihc}'\gamma + \mu_h + \varepsilon_{ihc} \qquad (2)$$
where a child's height-for-age z-score is the dependent variable, I(Young) indicates children exposed during their first three years of life, and I(Old) indicates children three years and older in the baseline wave. Additional baseline controls, in X, include the same factors as in model (1). Compared to the treatment vs. control difference in equation (1), the model in equation (2) incorporates two key additional features. First, the inclusion of household fixed effects, $\mu_h$, is now possible, as treatment effectively varies at the individual level. This allows for the fixed effects to capture any shared characteristics common to all children in a household and restricts identification to within-household comparisons. This strategy accounts for a large set of potentially confounding variables that have plagued past associations of stove use and health. Second, it identifies an expected placebo effect for those treatment children three years and older in the baseline wave: Compared to children in the control group of the same age, these children should see no change in height and $\beta_2$ should be zero.
Mean height-for-age in the sample is -1.2 standard deviations, suggesting that the average child is 1.2 standard deviations shorter than a healthy child of the same age (in months) and sex. Comparing across treatment and controls suggests that the treatment group is 0.06 standard deviations taller, but the difference is not statistically significant, as seen in column 4.
Figure 2 below illustrates the treatment and control height-for-age measure for children up to age 10, constructed by nonparametric regressions of height-for-age on age in months at the 2016 follow-up survey. We illustrate the sensitive period of growth with vertical lines as those children exposed during their first three years of life were up to 76 months old in the follow-up survey.
Comparing across treatment and control households, there is a marked difference in the height of children at early ages. As the intervention had been in place for 40 months, children in the treatment group 40 months old and younger have had complete potential exposure to the Mirt stove, while those between 40 and 76 months have an amount of exposure decreasing in their age. Regression models below further examine these differential effects. Figure 3 below represents the corresponding treatment and control comparison for child and adult symptom reports. Children report an average of 2.3 symptoms in the control group compared to 2.0 in the treatment group, although the difference is not statistically significant (p-value = 0.14).
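The 40- and 76-month cut-offs follow from the timing: a child still inside the 36-month sensitive window at baseline is at most 36 + 40 = 76 months old at follow-up. The sketch below computes, for a child's age at follow-up, how many months of the sensitive window fell after stove distribution. It is an illustration under simplifying assumptions: it counts postnatal months only (in utero exposure is ignored) and treats the stove as available for the full 40 months.

```python
SENSITIVE_MONTHS = 36        # first three years of life
MONTHS_SINCE_BASELINE = 40   # stove distribution to follow-up

def sensitive_months_exposed(age_at_followup):
    """Months of the first 36 months of life lived after distribution."""
    age_at_baseline = max(0, age_at_followup - MONTHS_SINCE_BASELINE)
    window_end = min(age_at_followup, SENSITIVE_MONTHS)
    return max(0, window_end - age_at_baseline)

for age in (24, 40, 58, 76, 100):
    print(age, sensitive_months_exposed(age))
# 24 24
# 40 36
# 58 18
# 76 0
# 100 0
```

Children aged 40 months or younger were exposed for their whole postnatal sensitive window, and exposure then falls by one month for each additional month of age until it reaches zero at 76 months.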

Summary Statistics and Descriptive Evidence
The gaps are generally quite small between the groups but tend toward the expected direction with treatment individuals 8 percentage points less likely to report any symptoms at all (p-value = 0.02).
Regression results below assess the gap between treatment and control including additional covariates.
severe or moderate difficulty with the activity. As in Figure 3, the differences across the groups are minimal and statistically insignificant. Lack of significant differences across the groups persists using alternative severity thresholds. We move next to the corresponding regression results.
Table 2 presents estimates of the impacts of treatment status and stove use on child growth from equation (2). Columns 1 through 3 focus on intent to treat effects of treatment assignment. All models include additional individual and household level controls and standard errors adjusted for clustering at the household level given the multiple children per household structure of the data.
Notes to Table 2: The table reports intent to treat estimates of stove offers in columns 1-3 and instrumental variable effects of stove use in columns 4-6. Columns 2 and 5 include household level random effects and columns 3 and 6 include household fixed effects. Dependent variable is child height-for-age (z-scores), standardized according to the CDC growth tables. Sample is all children ages 15 and under. All models include a cubic polynomial in age in months, sex, mother's height and age, and father's age. Household level controls are omitted from regressions with household fixed effects, and include household size and demographic composition, age, education and sex of the household head, dwelling characteristics, community fixed effects, and a composite wealth index. Standard errors clustered at the household level in parentheses. ** p<0.05, * p<0.10
Column 1 presents treatment estimates for both those children exposed while in the sensitive period and older children. The estimates suggest being in a household that was offered a Mirt stove while in the sensitive period of growth is associated with a 0.277 standard deviation increase in predicted height-for-age relative to the control children of the same age.
This is a sizeable effect and similar in scale to the estimated benefits of improved cookstoves on birth outcomes such as gestational age and birth weight (e.g. Alexander et al., 2018), suggesting part of the mechanism may work through in-utero exposure.

Child growth
The underlying physiology predicts there should be no impact on older children in the treatment group. The point estimate of -0.013 standard deviation (p-value = 0.92) points to this conclusion, as treatment children older than 36 months at the baseline saw no benefit in terms of their height-for-age at follow-up compared to children in the control group. This is in line with established literatures in nutrition and epidemiology, and it supports the experimental design of our study as it implies there were no pre-existing differences in height-for-age across treatment and control children.
Column 2 of Table 2 exploits the multiple children per household structure of the data and includes household random effects. Estimates in Column 2 suggest a precisely estimated 0.367 standard deviation treatment effect for young children (p-value = 0.013) and a null effect for older children.
Column 3 is the most demanding of the specifications as it includes household fixed effects and is identified only from comparisons of siblings within the same household. Given the withinhousehold design, it is not possible to identify the effects for both younger and older children presented in columns 1 and 2, as older treatment children become the comparison group for younger treatment children. All observed and unobserved characteristics common at the household level are absorbed into the fixed effect, and biases threatening the causal interpretation of the estimates would have to vary within treatment and control households in such a way as to be correlated both with treatment status and only positively impact the youngest children.
The 0.383 estimate in Column 3 suggests that the height-for-age advantage of young children over their older siblings is 0.383 standard deviation larger in treatment households than in control households. The result is again precisely estimated (p-value = 0.023). This is a large and meaningful finding, corresponding to approximately a 2 cm gain for a 3-year-old child.
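As a minimal sketch of the within-household comparison behind Column 3, assume a balanced toy panel with exactly one young and one older child per household. In that special case the sibling comparison reduces to the difference in young-minus-old height-for-age gaps between treatment and control households. The data values, variable names, and the helper `sibling_gap_did` are hypothetical and are not taken from the study; the actual specification also includes individual-level controls.

```python
from statistics import mean

# Hypothetical toy data: (household_id, treated, young_haz, old_haz).
# One young and one older sibling per household; all values are made up.
households = [
    ("h1", 1, -1.2, -1.8),
    ("h2", 1, -1.0, -1.5),
    ("h3", 0, -1.6, -1.7),
    ("h4", 0, -1.9, -1.8),
]

def sibling_gap_did(rows):
    """Difference in young-minus-old sibling gaps, treated vs. control.

    Differencing siblings within a household removes anything common to
    the household, which is what a household fixed effect absorbs.
    """
    treated_gaps = [young - old for _, t, young, old in rows if t == 1]
    control_gaps = [young - old for _, t, young, old in rows if t == 0]
    return mean(treated_gaps) - mean(control_gaps)

estimate = sibling_gap_did(households)
print(round(estimate, 3))
```

Because only within-household gaps enter the calculation, the level of height-for-age in any given household does not affect the estimate, which is why a separate effect for older children cannot be recovered from this specification.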
Columns 4 through 6 report treatment effects of stove use measured as the share of months the household used the stove. These are instrumental variable estimates where stove use-by-age interactions are instrumented with treatment status interacted with the corresponding ages. The results suggest use of the stove over the full period relates to gains in height-for-age between 0.32 and 0.45 standard deviation for children in the sensitive period of growth, with no benefits for older children.
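The logic of scaling the intent-to-treat effect by actual stove use can be illustrated with the simplest instrumental-variable case: one binary instrument (the stove offer), one endogenous regressor (the share of months the stove was used), and no controls. In that case the IV estimate is the Wald ratio of the ITT effect to the first-stage difference in use. The models in Table 2 are richer (use-by-age interactions and controls), so the following is only a sketch under those simplifying assumptions, with made-up numbers and a hypothetical helper `wald_iv`.

```python
from statistics import mean

def wald_iv(offer, use_share, outcome):
    """Wald ratio: (ITT effect on outcome) / (effect of the offer on use)."""
    def group_mean(values, flag):
        return mean(v for z, v in zip(offer, values) if z == flag)

    itt = group_mean(outcome, 1) - group_mean(outcome, 0)
    first_stage = group_mean(use_share, 1) - group_mean(use_share, 0)
    return itt / first_stage

# Hypothetical toy data (not from the study).
offer = [1, 1, 1, 1, 0, 0, 0, 0]                       # stove offered?
use_share = [1.0, 0.8, 0.6, 0.8, 0.0, 0.0, 0.0, 0.0]   # share of months used
haz = [-1.0, -1.2, -1.4, -1.2, -1.5, -1.7, -1.3, -1.5]  # height-for-age z

iv_estimate = wald_iv(offer, use_share, haz)
print(round(iv_estimate, 3))
```

When use among offered households is below 100 percent, the first-stage difference is below one, so the use-based estimate exceeds the ITT estimate in magnitude.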
Both results are important: these are sizable gains for young children, and the expected null (placebo) result for older children again suggests this is a valid empirical approach.

Table 3 summarizes the ITT and use impacts on acute respiratory symptoms and measures of physical functioning for older children and adults. Unlike height-for-age, these domains are potentially malleable across the age distribution; thus estimates are determined by differences across households.

Respiratory symptoms and adult physical functioning
Models include individual, household, and community-level controls as in equation (1) and standard errors are adjusted for clustering. Appendix Tables A3, A4, and A5 report results for the 20 individual symptoms and 12 activities of daily living used to construct the summary measures in Table 3.
Columns 1 through 3 report impacts on acute symptoms for older children. The results suggest there are no precisely estimated links between the Mirt intervention and a child's symptom burden over the prior 4-week period. There is no evidence of an ITT effect for the total number of symptoms in Column 1, an indicator for any symptoms in Column 2, or for cough-specific symptoms in Column 3. There is an imprecisely estimated -5.3 percentage point use estimate for reporting any symptoms in Column 2 (p-value = 0.097). Given a control group mean of 51.6 percent, this represents an approximately 10 percent reduction but is on the knife-edge of statistical significance.
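The "approximately 10 percent" figure follows from dividing the percentage-point estimate by the control group mean. The short check below uses only the two numbers quoted in the text; the variable names are illustrative.

```python
use_estimate_pp = -5.3    # use estimate for "any symptoms", percentage points
control_mean_pct = 51.6   # control group mean, percent

# Relative change = percentage-point change / baseline level.
relative_change = use_estimate_pp / control_mean_pct
print(f"{relative_change:.1%}")  # -10.3%
```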
While many of the point estimates are negative and suggest reduced symptom loads for children in treatment households, the results should be interpreted as suggestive at best. For the 20 specific symptoms in Appendix Table A3, we estimate one reduction at the 5 percent significance level (headaches) and one at the 10 percent level, effects which are balanced out by estimated increases in wheezing and shortness of breath. Such a pattern is consistent with pure chance given the expected error rates and the number of outcomes. Adjusting for multiple comparisons only further weakens the results. The null findings also hold true when examining subgroups of the population based on specific 5-year age groups (ages 5 and under, 6-10, and 11-15), regions, and baseline wealth quantiles.

Table 3 Notes: Table reports estimates of the impact of stove offers and stove use. ITT and stove use estimates come from separate regressions. The dependent variable is the number of symptoms reported in columns 1, 4, and 8 and linear probability models with binary outcome indicators for remaining columns. Sample is all children ages 15 and under. Additional controls include cubic polynomial in age in months, sex, mother's age and height, father's age, household size and demographic composition, household head age, education, and sex, dwelling characteristics, community fixed effects, and a wealth index. Robust standard errors in parentheses.
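To illustrate why a handful of significant coefficients among the 20 symptom-level estimates is consistent with chance, the sketch below computes the expected number of false positives and the probability of at least one, under the simplifying assumptions that all 20 null hypotheses are true and that the tests are independent. Neither assumption comes from the study (symptoms are likely correlated), and the helper `chance_findings` is hypothetical.

```python
n_outcomes = 20  # individual symptoms examined

def chance_findings(alpha, n=n_outcomes):
    """Expected false positives and P(at least one) under independent true nulls."""
    expected = alpha * n
    p_at_least_one = 1 - (1 - alpha) ** n
    return expected, p_at_least_one

expected_05, p_any_05 = chance_findings(0.05)
expected_10, p_any_10 = chance_findings(0.10)

# Bonferroni-style per-test threshold for a 5 percent family-wise error rate.
bonferroni_threshold = 0.05 / n_outcomes

print(expected_05, round(p_any_05, 3))
print(expected_10, round(p_any_10, 3))
print(bonferroni_threshold)
```

Under these assumptions, roughly one rejection at the 5 percent level and two at the 10 percent level would be expected from 20 tests even if the intervention had no effect on any symptom.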
The remainder of Table 3 presents results for adults. Columns 4 through 6 show that there are no meaningful links between a cook's reported symptoms and either assigned treatment status or stove use. Columns 7 and 8 report effects for summary measures of the activities of daily living outcomes assessing the likelihood of experiencing any specific difficulties and the number of reported difficulties. The estimated effects are small in magnitude and not statistically different from zero. The null physical functioning results hold using ordered probit models to examine the gradient of difficulties for each outcome and when considering binary outcomes for severe levels of limitations rather than the severe-or-moderate threshold used in Table 3. As with children, the null findings hold after dividing the adult cooks by age, by region, and by wealth quantile.
Taken as a whole, the results suggest a nuanced pattern of significant gains in early-life growth for young children, yet no noticeable impacts for either adult cooks or older children. Understanding the physiological mechanisms facilitating growth effects alongside null results on other health domains for older individuals remains a key component in this research program. We next present evidence examining the link between pollution measurements and health outcomes.

Mechanisms - Pollution Concentrations and Health Outcomes
As a step toward explaining the high rates of Mirt use, positive child growth effects, and the absence of other health effects, we examine the relationship between indoor air quality measures and key health outcomes. Indoor exposure to pollutants from the combustion of solid fuels was measured with pump and filter kits and light scattering meters in a random subset of 201 households: 98 treatments and 103 controls. More details on the data collection process and treatment effects of the intervention on air pollution levels are provided in Bluffstone et al. (2018).
The recorded levels of particulate matter in the treatment and control households are consistent with measurements in similar environments (Chen et al., 2016; Dutta et al., 2007).

Here, we make use of pollution readings to descriptively assess the possibility of pollution-health gradients. For each of our key outcomes we estimate regressions of an individual's health marker against the natural log of their household's particulate concentration. As the shape of these functions is not well-established in the literature, particularly at the observed, high levels of PM2.5, this should be interpreted as only suggestive evidence of a link rather than the exact magnitude of the health-pollution slope.

Indoor air pollution and child growth
Panel A of Table 4 reports results with a child's height-for-age z-score as the dependent variable and the log of the maximum particulate matter concentrations as the key variable on the right-hand side.
Following the known sensitive period of growth, in columns 1 through 3 we focus on children currently under 36 months of age who would be contemporaneously impacted by the measured pollution level. There is a clear, statistically significant link between reductions in log maximum PM2.5 and improvements in child height-for-age. Column 3, which includes individual and household level controls, suggests that a 24 percent decrease in maximum PM2.5, the observed decrease across treatment and control households with young children, relates to a 0.1 standard deviation increase in height-for-age ($0.24 \times (e^{-0.569} - 1)$). This effect is sizable yet smaller than the 0.3 standard deviation estimate in Table 2.
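The parenthetical expression in the preceding paragraph can be evaluated directly. The snippet below only reproduces the arithmetic as written in the text, taking 0.24 and -0.569 from that expression; the variable names are illustrative and are not the authors'.

```python
import math

coef_log_max_pm = -0.569   # value appearing in the text's expression
pm_decrease = 0.24         # observed proportional decrease in maximum PM2.5

# Evaluate 0.24 * (e^(-0.569) - 1) exactly as written in the text.
value = pm_decrease * (math.exp(coef_log_max_pm) - 1)
magnitude = abs(value)

print(round(magnitude, 3))
```

The expression itself evaluates to a negative number; its magnitude, about 0.10, corresponds to the 0.1 standard deviation figure quoted in the text.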
Columns 4 through 6 of Panel A repeat the analysis including all children under 76 months old who had the potential to benefit from the intervention during their sensitive period of growth.
This approach requires that pollution readings captured at the follow-up wave reflect exposure across the 40-month period. The negative and statistically significant relationship between maximum PM2.5 and height-for-age persists, although the magnitude falls by approximately 40 percent. The 37-to-76-month group independently has an estimate of -0.201 (p-value = 0.11), suggesting that the link between measured pollution and height-for-age is strongest for children in the sensitive growth period.

Table 4 Notes: Table reports results from regressions of health outcomes on log maximum indoor PM2.5 concentrations. Panel A examines child height for age for children under 36 months old in columns 1-3 and under 76 months old in columns 4-6. Panel B examines respiratory symptoms for children birth to 15 years in columns 1 and 2 and adult cook outcomes in columns 4 through 6. Individual level controls include a cubic polynomial in age, sex, mother's height and age, and father's age. Household level controls include household size and demographic composition, age, education and sex of the household head, dwelling characteristics, site fixed effects, and a composite wealth index. Standard errors robust to clustering at the household level used for child outcomes and community level used for adult outcomes. *** p<0.01, ** p<0.05, * p<0.1

Respiratory symptoms, adult ADLs, and pollution
While there is suggestive evidence of a pollution-height-for-age link at the observed levels of particulate concentration, we find no such link for respiratory symptoms or activities of daily living. Panel B of Table 4 presents estimates for summary measures of child symptoms in columns 1 and 2, and adult cook outcomes in columns 3 through 6. All regressions include additional individual and household level controls.
Of the six estimates, none is statistically significant. The null results do not appear to be driven by small sample sizes, as many of the point estimates are economically small in magnitude.
Taken together with Panel A, the null findings in Panel B suggest a plausible pollution-health mechanism that would lead to growth effects for young children, even without observed health benefits for adult cooks and older children. This pattern is plausibly consistent with physiological scarring attributed to cumulative exposure to high levels of particulate matter in the past for adults and older children (Naeher et al., 2007).

Discussion
We examine the health impacts of transitioning to the Mirt improved biomass cookstove in a randomized trial conducted in rural Ethiopia. Three years after the initial distribution of the stoves, households randomly assigned to receive the units continued to use the technology at a high rate, and young children in these households experienced significant gains in height-for-age on the order of 0.3 to 0.4 standard deviation.
However, adult cooks and older children experienced no gains in measured respiratory symptoms or markers of physical function. While little is known about the specific health benefits of pollution reductions at the high levels of PM2.5 observed in the study, we find evidence suggesting the change in PM2.5 due to the Mirt stove is enough to cause improvements in height-for-age, though not enough to see noticeable differences in older individuals. Compared to results seen in laboratory tests, the continued use of open-fire traditional cooking alongside the improved stove appears to attenuate the health benefits of the Mirt stove in actual households.
From a policy perspective these gains may appear modest given the null impact on adults and older children. Still, over the course of a child's life they represent significant benefits due to the later-life impacts of early-life health improvements. Children who are taller early in life are more likely to complete primary and secondary education, develop greater cognitive capacity, earn more as adults, and live longer, healthier lives. The results presented here suggest exposure to high levels of particulate matter from biomass smoke is a contributing factor to a significant health burden for young children, and that its mitigation may result in significant improvements in well-being around the world.

Appendix Tables
Appendix Table A1 Notes: Table reports estimate of impact of stove offers and stove use. ITT and use estimates come from separate regressions. The dependent variable is the number of symptoms reported in Column 1 and linear probability models for presence of specific symptoms in columns 2 through 22. Sample is primary cook in the household. Additional controls include age indicators, sex, household size and demographic composition, household head age, education, and sex, dwelling characteristics, site fixed effects, and a wealth index. Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1

Appendix Table reports results from regressions of health outcomes on log mean indoor PM2.5 concentrations. Panel A examines child height for age for children birth to 36 months in columns 1-3 and birth to 76 months in columns 4-6. Panel B examines respiratory symptoms for children birth to 15 years in columns 1 and 2 and adult cook outcomes in columns 4 through 6. Individual level controls include a cubic polynomial in age, sex, mother's height and age, and father's age. Household level controls include household size and demographic composition, age, education and sex of the household head, dwelling characteristics, site fixed effects, and a composite wealth index. Standard errors robust to clustering at the household level used for child outcomes and community level used for adult outcomes. * p<0.1