Author's response to Poole, C. Commentary: How Many Are Affected? A Real Limit of Epidemiology

We read with interest Charlie Poole's commentary [1] on our paper, "Redundant causation from a sufficient cause perspective," [2] in which he questions the utility of the sufficient component cause (SCC) model for examining differences between etiologic and excess effects. Poole contends that the concept we term "redundant causation" is uncomplicated and (we presume) well understood. He questions whether "it needs to be explained in terms any deeper than those of potential outcomes" [1]. His critique of our paper focuses on our hypothetical and simplistic example of sufficient causes (SCs) of liver cancer. To be of value, Poole believes our example must be realistic and must bring "aspects of the potential outcome and sufficient cause models, and their interface, into sharp relief" [1]. His concerns raise larger issues about the roles of simplifications and the SCC model in methods research in general. We address each of these below.

Simplifications
As Poole suggests, the conceptual and mathematical underpinnings of redundant causation (which we refer to colloquially as "redundancy") and its potential impact on etiologic versus excess effects have been explained by others in philosophy and epidemiology [3-10]. We disagree, however, that redundancy is "uncomplicated," fully examined in the literature, or well understood by epidemiologists. One purpose of our paper was to make the discussion of redundancy more accessible to a broad epidemiologic audience. To this end, we used a hypothetical and intentionally simplified liver cancer example to help crystallize this discussion for those who prefer tangible examples.
Poole was bothered by many simplifications in our liver cancer example, including the lack of competing risks, lack of disease recurrence, lack of shared causal components, and presence of ubiquitous components. Although these simplifications may be unrealistic, we feel they were appropriate for illustration purposes. Indeed, the use of simplifying assumptions is ubiquitous in methodological work. To discover underlying principles, the complexity of real-life situations must be controlled. Charges of oversimplification, therefore, should be grounded in a discussion of which simplifying assumptions are legitimate and which are not.
It seems to us that simplifying assumptions are legitimate if they are: (1) possible, or (2) impossible (or highly unlikely), but violating the assumption does not change the argument. For example, a monotonicity assumption is often invoked in methods work; it is often assumed that a risk factor is either causal or protective for the disease, but not both. This assumption is legitimate because it is possible. There are risk factors that meet this assumption. One can validly illustrate a methods principle invoking this assumption because there are realistic circumstances under which this principle would hold. Once the principle is understood within this context, the next step would be to examine the impact of relaxing this assumption [11,12].
Another frequently invoked assumption is an infinitely large sample size (see, for example, [13]). This assumption is, by definition, impossible: no study can have an infinitely large sample. Nonetheless, this assumption can be legitimate because it often does not change the principle under investigation; it just allows the principle to be explicated more clearly.
In our liver cancer example, we think that all the simplifications meet the criteria for legitimate assumptions. Our assumptions are either possible (e.g., that alcohol consumption is the final component to complete the sufficient cause), or impossible but do not change the argument (e.g., that no one dies during the study period).
Ironically, in the methods literature, the absence of redundancy is often invoked as a simplifying assumption. Although Poole suggests that redundancy is well understood, its inevitability is often disregarded in methods work. That is, "no redundancy" is an impossible assumption that sometimes does change the principles under study. This illegitimacy of the "no redundancy" assumption became apparent when the effects of parallelism on the assessment of synergy entered the literature [14] and is a problem, we think, in some of the literature on mediation [15,16].
By invoking simplifying assumptions in our liver cancer example, we showed that the presence of even one individual with redundant sufficient causes can lead to a discrepancy between an etiologic and an excess effect measure. To us, the next natural questions are "under what circumstances will redundancy lead to large discrepancies between etiologic and excess effects?" followed by "how realistic are those circumstances?" We have begun to address these additional questions (in manuscripts that are in various states of preparation), and we hope that other epidemiologists will be intrigued and help round out the answers.
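The effect of a single redundant individual can be sketched with a toy population. What follows is an illustration only, not the liver cancer example from the original paper. It assumes two hypothetical sufficient causes, SC1 = {E, A} (exposure E plus a partner A) and SC2 = {B} (no exposure involved), and it assumes, as a further simplification, that exposure counts as etiologic whenever SC1 is complete, with the timing of completion ignored. The names `Person`, `diseased`, and `summarize` are invented for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    has_a: bool  # hypothetical partner A: completes SC1 = {E, A} when exposed
    has_b: bool  # hypothetical component B: completes SC2 = {B} without exposure


def diseased(person: Person, exposed: bool) -> bool:
    """Potential outcome under a given exposure: disease if any SC is complete."""
    return (exposed and person.has_a) or person.has_b


def summarize(population: list[Person]) -> dict[str, int]:
    """Count etiologic, excess, and redundant cases if everyone were exposed."""
    # Assumption: exposure is etiologic whenever SC1 is complete (timing ignored).
    etiologic = sum(p.has_a for p in population)
    # Excess cases: diseased if exposed, but not diseased if unexposed.
    excess = sum(diseased(p, True) and not diseased(p, False) for p in population)
    # Redundant individuals: both sufficient causes are complete under exposure.
    redundant = sum(p.has_a and p.has_b for p in population)
    return {"etiologic": etiologic, "excess": excess, "redundant": redundant}


population = (
    [Person(True, False)] * 3     # SC1 only
    + [Person(True, True)] * 1    # both SCs: the single redundant individual
    + [Person(False, True)] * 2   # SC2 only
    + [Person(False, False)] * 4  # neither SC
)
counts = summarize(population)
print(counts)  # {'etiologic': 4, 'excess': 3, 'redundant': 1}
```

Under these assumptions the etiologic count exceeds the excess count by exactly the number of redundant individuals, here one person out of ten.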

Utility of the SCC model
Many methodologic innovations in epidemiology have been forged using potential outcomes (POs). In recent years, drawing connections between POs and SCCs seems to have become a topic of interest [10,12,17-24]. However, many of these discussions start with a set of POs and then connect them to an underlying general SCC model. We (the authors) begin methodologic research by specifying the directed acyclic graph (DAG) that depicts the relationships we intend to examine. From there, we apply the DAG rules to draw an SCC model that underlies the DAG. The potential outcomes arise directly from that specific SCC model and the prevalences of its components. We recognize that our perspective on the links among a DAG, an SCC model, and a set of POs is not typical; we have begun to address this elsewhere [25,26].
The SCC model has been used to illustrate several epidemiologic principles in the literature. For instance, in the Modern Epidemiology text, Rothman and Greenland use an SCC model to show how the strength of an association depends on the prevalence of the causal partner [27] and to demonstrate the concepts of synergy and parallelism: that the extent of synergy is determined by the prevalences of all partners of the synergistic factors, and that additive interaction is synergy minus parallelism [28]. Notably, the most recent version of Modern Epidemiology invokes a simple SCC model to describe how redundant causation arises [29].
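The first of these principles, that the strength of an association depends on the prevalence of the causal partner, can be sketched numerically. This is an illustration under stated assumptions and is not taken from the cited text: two hypothetical sufficient causes, SC1 = {E, A} and SC2 = {B}, with A and B independent of each other and of exposure E. The function name `association` and the prevalence values are invented for the sketch.

```python
def association(p_partner: float, p_other: float) -> tuple[float, float]:
    """Risk difference and risk ratio for exposure E in a hypothetical two-SC model.

    p_partner: prevalence of A, the partner of E in SC1 = {E, A}
    p_other:   prevalence of B, which alone completes SC2 = {B}
    Assumes A and B are independent of each other and of E.
    """
    # Exposed people are diseased if either SC1 or SC2 is complete.
    risk_exposed = 1 - (1 - p_partner) * (1 - p_other)
    # Unexposed people can only be diseased through SC2.
    risk_unexposed = p_other
    return risk_exposed - risk_unexposed, risk_exposed / risk_unexposed


# Same exposure, same SC2 prevalence; only the partner's prevalence changes.
for p_partner in (0.1, 0.5, 0.9):
    rd, rr = association(p_partner, p_other=0.2)
    print(f"P(A)={p_partner:.1f}  risk difference={rd:.2f}  risk ratio={rr:.2f}")
```

In this sketch the biologic role of the exposure is identical in every row; the association strengthens only because the partner A becomes more common.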
The second purpose of our paper was to make the discussion of redundancy more comprehensive. Like parallelism, redundancy is most clearly seen through the SCC lens. The SCC model makes clear: (1) how redundancy arises, (2) that it is a naturally occurring, inevitable phenomenon, and (3) which factors influence the proportion of redundant individuals in a population for a particular disease at a particular point in time. POs allow us to see the critical mathematical distinction between disease etiology and excess, but do not allow us to see what factors influence the distribution of the POs, which ultimately drives the strength of the etiologic effect and thus its discrepancy from the excess effect we actually measure.
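How component prevalences shape the proportion of redundant individuals can be sketched as follows. This is a hedged illustration, not a result from the original paper. It assumes one exposure-containing sufficient cause, SC1 = {E, A}, plus any number of competing sufficient causes that do not involve E, with all components mutually independent and no shared components. It also assumes that exposure counts as etiologic whenever SC1 is complete. The function name `redundancy_breakdown` and the prevalence values are hypothetical.

```python
from math import prod


def redundancy_breakdown(p_partner: float, p_competing: list[float]) -> dict[str, float]:
    """Proportions among the exposed in a hypothetical SCC model with independent components.

    p_partner:   prevalence of A, the partner of exposure E in SC1 = {E, A}
    p_competing: prevalence of completion for each competing SC that lacks E
    """
    # Probability that no competing sufficient cause is complete.
    p_no_competitor = prod(1 - q for q in p_competing)
    etiologic = p_partner                          # SC1 complete (assumed definition)
    excess = p_partner * p_no_competitor           # SC1 complete and nothing else
    redundant = p_partner * (1 - p_no_competitor)  # SC1 plus at least one other SC
    return {"etiologic": etiologic, "excess": excess, "redundant": redundant}


one_competitor = redundancy_breakdown(0.4, [0.1])
two_competitors = redundancy_breakdown(0.4, [0.1, 0.2])
print(one_competitor)
print(two_competitors)
```

Under these assumptions the etiologic proportion is fixed by the partner's prevalence, while each added or more prevalent competing sufficient cause shifts individuals from the excess group into the redundant group, widening the gap between the two measures.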

Conclusions
Our examination of redundancy shows that even in the simplest (and perhaps most unrealistic) scenarios (e.g., when none of the sufficient causes shares components and there are no associations between components), redundancy can influence our effect estimates. It appears that the more complicated the SCC model (e.g., the more SCs, shared components, or associations between components in the SCC model), the more redundant individuals there will be, and the larger the discrepancy between the etiologic and excess effects. We do not believe that these principles could be revealed using potential outcomes alone.