Granick revisited: Synthesizing evolutionary and ecological evidence for the late origin of bacteriochlorophyll via ghost lineages and horizontal gene transfer

Photosynthesis—both oxygenic and more ancient anoxygenic forms—has fueled the bulk of primary productivity on Earth since it first evolved more than 3.4 billion years ago. However, the early evolutionary history of photosynthesis has been challenging to interpret due to the sparse, scattered distribution of metabolic pathways associated with photosynthesis, long timescales of evolution, and poor sampling of the true environmental diversity of photosynthetic bacteria. Here, we reconsider longstanding hypotheses for the evolutionary history of phototrophy by leveraging recent advances in metagenomic sequencing and phylogenetics to analyze relationships among phototrophic organisms and components of their photosynthesis pathways, including reaction centers and individual proteins and complexes involved in the multi-step synthesis of (bacterio)-chlorophyll pigments. We demonstrate that components of the photosynthetic apparatus have undergone extensive, independent histories of horizontal gene transfer. This suggests an evolutionary mode by which modular components of phototrophy are exchanged between diverse taxa in a piecemeal process that has led to biochemical innovation. We hypothesize that the evolution of extant anoxygenic photosynthetic bacteria has been spurred by ecological competition and restricted niches following the evolution of oxygenic Cyanobacteria and the accumulation of O2 in the atmosphere, leading to the relatively late evolution of bacteriochlorophyll pigments and the radiation of diverse crown group anoxygenic phototrophs. This hypothesis expands on the classic “Granick hypothesis” for the stepwise evolution of biochemical pathways, synthesizing recent expansion in our understanding of the diversity of phototrophic organisms as well as their evolving ecological context through Earth history.

photosynthesis, which supported the biosphere early in its history (Kharecha et Hao et al. 2020). However, the origin and early evolution of anoxygenic photosynthesis are not well understood due to the antiquity of these events and the lack of clear signatures in the rock record (Ward and Shih 2019). The earliest widely accepted evidence for anoxygenic photosynthesis in the rock record is in the form of depth-dependent organic carbon production in preserved microbial mats from ~3.4 Ga (Tice and Lowe 2004, Tice and Lowe 2006), and it is clear that both forms of reaction center had evolved in time to be brought back together in stem group Cyanobacteria to invent oxygenic photosynthesis and trigger the Great Oxygenation Event ~2.3 Ga (Shih et al. 2017a). However, this leaves over a billion years in which photosynthesis was likely present on Earth and actively evolving, and it is unclear how long ago the reaction centers diverged or how long the stem lineage persisted. While molecular clocks and other studies of evolution in deep time often assume that crown group synapomorphies are acquired either at the base of the crown group (e.g. Battistuzzi et al. 2004, Schirrmeister et al. 2013) or at the base of the total group (e.g. Magnabosco et al. 2018), there is no a priori way of determining where along a stem lineage these traits were acquired. It is therefore essential to recognize the uncertainty inherent in long stem lineages and to make use of all available basal lineages and sister groups in phylogenetic analyses to break up long branches and reduce uncertainty in the timing of acquisition of important traits (e.g. Shih et al. 2017b).
While simple forms of photoheterotrophy can be supported by ion-pumping rhodopsins, there are no known examples of this metabolism driving carbon fixation; true photosynthesis is only known to be supported by a more complicated pathway involving multiple components including an electron transport chain (including Complex III or Alternative Complex III), phototrophic reaction centers or photosystems, synthesis and modification of chlorophyllide pigments for light harvesting, and optionally carbon fixation to enable photoautotrophy. The capacity for reaction center-based phototrophy is scattered across the tree of Bacteria, with a handful of phototrophic lineages separated by many nonphototrophic groups (Figure 1). While early hypotheses for the evolutionary history of photosynthesis invoked vertical inheritance and extensive loss in most lineages (e.g. Woese 1987), more recent work is more consistent with this distribution being driven by horizontal gene transfer (HGT) of phototrophy (Zeng et  The extant distribution of chlorophyll versus bacteriochlorophyll synthesis is an additional puzzle. Bacteriochlorophyll is biochemically more complex to synthesize and uses lower quality light (Chew and Bryant 2007) yet is found today in bacteria using more ancient anoxygenic phototrophy (Table 1). Oxygenic Cyanobacteria, in contrast, use chlorophyll, which harvests higher quality light and is biochemically more straightforward to synthesize. This has led to hypotheses for the evolution of (bacterio)chlorophylls ranging from the stepwise evolution of progressively more complex pathways, with chlorophyll being more evolutionarily ancient than bacteriochlorophyll (i.e. the Granick hypothesis, Granick 1965), to hypotheses invoking the complete bacteriochlorophyll synthesis pathway in the last common ancestor of extant phototrophs followed by secondary simplification of the pathway in Cyanobacteria (Xiong et al. 2000).
The synthesis of bacteriochlorophyll and chlorophyll both proceed through chlorophyllide a, and so the two pathways share a "backbone" of common steps, while bacteriochlorophyll synthesis has additional later steps, including those performed by the BchXYZ complex, which is homologous to the BchLNB complex involved in the synthesis of chlorophyllide a from protochlorophyllide a (Supplemental Table 1, Supplemental Figure 1). This homology provides an opportunity for querying the relative evolutionary histories of bacteriochlorophyll and chlorophyll: if the two complexes have congruent phylogenies with the exception of the absence of BchXYZ in Cyanobacteria, this would argue for a secondary simplification of the pathway as suggested by Xiong et al. (2000). However, incongruent phylogenies would indicate an independent history of HGT, such as might be expected if bacteriochlorophyll synthesis was secondarily acquired in anoxygenic phototrophs relatively later in their evolutionary history after the divergence of crown group lineages.
Here, we perform phylogenetic analyses of various components of the phototrophy pathway, including individual steps in (bacterio)chlorophyll synthesis, and demonstrate that phylogenetic trees of these components have incongruent topologies, reflecting independent histories of horizontal gene transfer. This suggests a mix-and-match style of modular evolution by which bacteria acquire separate components of phototrophy from separate sources at different times in their evolutionary histories. Moreover, we provide support for an early diverging "ghost lineage" of extinct or undiscovered phototrophs responsible for evolutionary divergence of RC2 and BchXYZ. The genes encoding these complexes were later transferred into crown group phototrophs, leading to evolutionary novelty such as coupled photosystems and bacteriochlorophyll synthesis. We then propose an ecological model for the evolution of phototrophy in deep time, whereby competition with Cyanobacteria and environmental partitioning by O2 tolerance has led to innovation, diversification, and specialization by crown group anoxygenic phototrophs, leading to diverse bacteriochlorophyll-synthesizing anoxygenic phototrophs today that radiated after the GOE, supplanting ecologically and genetically distinct chlorophyll-synthesizing anoxygenic phototrophs which fueled primary productivity in Archean time.

Phylogenetic analyses of reaction center and (bacterio)chlorophyll synthesis proteins
We have constructed phylogenies of essential proteins for phototrophy including reaction centers and (bacterio)chlorophyll synthesis (listed in Supplemental Table 1) using representatives from all known clades of phototrophic bacteria (Table 1). Consensus trees for phototrophy proteins are shown in Figure 2, with individual phylogenies available as Supplemental Information (Supplemental . While these trees have difficulty recovering deep evolutionary relationships due to long evolutionary distances relative to protein lengths, a problem previously recognized for interpreting the evolutionary history of these proteins (e.g. Lockhart et al. 1996, Bryant et al. 2012), they robustly recover overall topologies and sister-group relationships that differ between different proteins. Breaking up long branches by adding newly discovered phototrophic taxa (e.g. Chloracidobacteria and Baltobacterales) improves resolution of deep branches relative to previous attempts with more limited datasets (e.g. Xiong et al. 2000). The topologies of phylogenetic trees constructed here are largely incongruent (e.g. organismal tree, RC tree, backbone chlorophyll synthesis, and bacteriochlorophyll synthesis, Figure 2), indicating that components of the phototrophic apparatus have undergone independent horizontal gene transfer events over the course of their evolutionary history (Doolittle 1986).
While branch support within diverse, shallow radiations such as the Cyanobacteria and the Proteobacteria was consistently poor (e.g. Supplemental Figures 3-16), and well-supported roots were not recovered for all proteins investigated (e.g. long branches between BchL/N/B and BchX/Y/Z made root placement inconsistent, Supplemental Figures 13-16), the consistent topology between consensus BchLNB and BchHDIM trees clustered at higher taxonomic levels (i.e. order through phylum) and the robust rooting of BchH and BchM via characterized homolog outgroups allow extrapolation of branching order from these trees to the consensus backbone (bacterio)chlorophyll synthesis tree. While uncertainty in early branching order may affect the directionality of HGT invoked in hypotheses for the evolutionary history of phototrophy, major sister-group relationships and incongruent topologies between different complexes are robustly recovered, supporting overall evolutionary trends even if the exact number, timing, and directionality of HGT events are uncertain. As a result, interpretations of the role of HGT in driving the evolution of phototrophy are considered robust here, even if the scenario depicted in Figure 3 is only a hypothesis.
Backbone (bacterio)chlorophyll synthesis genes have a somewhat different evolutionary history than downstream bacteriochlorophyll-specific genes, which have a distinct evolutionary history from phototrophic reaction centers, which have independent histories from carbon fixation pathways, all of which are incongruent with organismal phylogenies. Taken together, the reticulated and piecemeal nature of these incongruencies suggests that the evolution of phototrophy, like many other metabolic pathways, is modular.

Evolutionary history of photosynthesis-associated proteins
The incongruent topologies of trees built with various phototrophy-related proteins (Figure 2) suggest that these proteins may not share a unified evolutionary history but may instead have undergone independent horizontal gene transfer events. By overlaying components of the phototrophy apparatus onto the backbone tree made from the consensus topology of proteins involved in shared early steps of (bacterio)chlorophyll synthesis, we can reconstruct a hypothetical evolutionary history of the genes, even if a largely unconstrained co-occurring history of organismal transfer must also be occurring throughout stem lineages (Figure 3). For example, the node representing the last common ancestor of (bacterio)chlorophyll synthesis genes in Chlorobi and Chloroflexi almost certainly did not occur in crown group members of either of those phyla or their last common ancestor, but instead independent HGT events introduced phototrophy to those groups following the divergence of their (bacterio)chlorophyll pathways in other host lineages. In some cases, we see transfer of the complete phototrophy pathway (e.g. into Gemmatimonadetes from Proteobacteria, Zeng et al. 2014), while in others it appears that some components were transferred independently of others (e.g. bacteriochlorophyll synthesis and reaction centers into Chlorobi and Chloroflexi, respectively). In many cases, we see transfer of phototrophy but not carbon fixation (e.g. Anaerolineae Below, we describe a hypothesis for the evolutionary history of phototrophy genes that is consistent with all of the available data, from the "ur-phototroph" (with the prefix "ur-" denoting the first or proto-version of a thing, here referencing the first phototrophic lineage to evolve, i.e. the first member of stem and total group phototrophs) until the radiation of extant clades of phototrophs (the crown group).
The history presented here relies on undiscovered, likely extinct, "ghost lineages" which served as the source of genes to crown group phototrophs during ancient transfer events. The concept of "ghost lineages" is carried over from traditional fossil record-based phylogenetics, where it refers to a lineage which is inferred to exist but which has no known fossil record (Norell et al. 1992, Norell 1993). Importantly, this history is considered from the perspective of phototrophy genes, not of the organisms which harbor them.
The ur-phototroph likely utilized a single ancestral reaction center (almost certainly a homodimer that formed a stem lineage prior to the RC1/RC2 divergence, and which may have been biochemically more similar to heliobacterial RC1 than other extant types, made up of a relatively simple and inefficient homodimer which loosely bound mobile quinones to drive cyclic electron flow, which was adapted in the various reaction center and photosystem lineages to optimize reactions and eventually to adapt to oxygen, Orf et al. 2018) and chlorophyll a, synthesized using a DPOR complex ancestral to both BchLNB and BchXYZ. Eventually, two lineages of phototrophs diverged, either due to speciation of a single organismal lineage or due to HGT of phototrophy genes into a second host organism. One lineage (the "ghost lineage") possessed a reaction center that evolved into RC2 and a BchLNB-like complex that eventually evolved into BchXYZ but which functioned as a DPOR complex. The other lineage possessed an ancestral RC1 and BchLNB in order to synthesize chlorophyll. The RC1 lineage diversified, with RC1 and chlorophyll synthesis in several lineages. Eventually, HGT of RC2 from the ghost lineage into stem group Cyanobacteria led to the evolution of oxygenic photosynthesis. This triggered ecological restructuring and widespread evolutionary adaptation, including further HGT from the ghost lineage into other lineages of anoxygenic phototrophs. This included HGT of the BchXYZ complex from the ghost lineage, perhaps into stem group Proteobacteria, leading to the coupling of BchLNB and BchXYZ in series to lead to bacteriochlorophyll synthesis and the ability of anoxygenic phototrophs to better compete in deeper, lower oxygen regions of microbial mats and water columns (see below section "Ecological perspectives on the evolution of photosynthesis"). 
Further HGT of RC2 from the ghost lineage into other anoxygenic phototroph lineages led to further adaptation and specialization of anoxygenic phototroph lineages to specialized environments. The long branches between homologous proteins in the BchLNB and BchXYZ complexes are consistent with their early divergence sometime during Archean time before the radiation of crown group phototrophs; the congruence between BchLNB and other backbone chlorophyll synthesis genes suggests that these genes have largely been inherited together into extant phototrophs, while the differing topology of BchXYZ indicates an independent history of HGT (Supplemental Figure 2), in this scenario driven by HGT of BchXYZ from the ghost lineage into crown group anoxygenic phototrophic clades, allowing them to produce bacteriochlorophyll.
For simplicity, one ghost lineage is described here, but it is likely there were many lineages that are not represented in characterized phototrophs today, both diverging from the stem of the phototroph tree as well as from throughout the crown group. For example, the ghost lineage invoked here is hypothesized to use a type 2 reaction center (though this may have been a more ancient homodimeric form, as heterodimerization postdated the divergence of RC2 and PSII, Cardona 2013), but there are also suggestions from the carbon isotope record of ancient Wood-Ljungdahl-utilizing Type 1 phototrophs (Ward and Shih 2019), indicating that there are multiple undiscovered or extinct phototroph lineages.
While some phototrophic Chloroflexi in the Anaerolineae class lack BchLNB (Klatt et al. 2011, Ward et al. 2018a, Ward et al. 2020), this appears to be a derived trait based on their placement in BchHDI and BchXYZ trees. The genomes of these organisms are derived from metagenomic data and so are somewhat incomplete, but the probability that the organisms contain these genes but that they were not recovered in the MAG is extremely low (estimated by MetaPOAP as ~10^-12, Ward et al. 2018b). Instead, this appears to be a case of secondary loss potentially coupled with bifunctionalization of BchXYZ to perform the reduction of both the C17/C18 double bond normally reduced by BchLNB as well as the C7/C8 double bond normally reduced by this complex. This seems feasible, as chimeras of other homologs of these genes have been demonstrated to be functionally exchangeable (e.g. Cheng et al. 2005). Following the evolutionary history proposed in Figure 3, the BchXYZ complex is descended from a BchLNB-like complex for the reduction of the C17/C18 double bond that was later coopted to instead reduce the C7/C8 double bond to enable bacteriochlorophyll synthesis. However, isolation of phototrophic Anaerolineae and biochemical characterization of their BchXYZ complexes will be necessary to test this hypothesis.
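The reasoning behind such a false-negative estimate can be illustrated with a deliberately simplified sketch. This is not MetaPOAP's statistical model (see Ward et al. 2018b for that); it assumes each gene is recovered independently with probability equal to the estimated genome completeness, and the function names and the completeness values in the usage note are hypothetical.

```python
import math


def prob_all_missed(completeness, n_genes):
    """Probability that n_genes present in an organism are all absent from its MAG.

    Simplifying assumption (not MetaPOAP's model): each gene is recovered
    independently with probability equal to the estimated completeness.
    """
    if not 0.0 <= completeness <= 1.0:
        raise ValueError("completeness must be between 0 and 1")
    if n_genes < 0:
        raise ValueError("n_genes must be non-negative")
    return (1.0 - completeness) ** n_genes


def prob_missed_across_genomes(completenesses, n_genes):
    """Probability that the same n_genes are missed in every one of several
    independently assembled MAGs (again assuming independence)."""
    return math.prod(prob_all_missed(c, n_genes) for c in completenesses)
```

Under these assumptions, three genes missing from a single hypothetical 90% complete MAG would be a false negative with probability about 10^-3, and the probability shrinks multiplicatively as more independent MAGs show the same absence, which is why a consistent absence across several genomes is strong evidence for true loss.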

Ecological perspectives on the evolution of photosynthesis
Overlaid onto the history of HGT-driven evolution of phototrophy is also a history of adaptation to changing environmental conditions, particularly the rise of atmospheric oxygen. The earliest evidence for photosynthesis on Earth is found in rocks ). Many modern anoxygenic phototrophs are therefore not relicts of the Archean Earth, but instead have undergone billions of years of evolution, leading to extant phototrophs that are a palimpsest of genetic innovation and horizontal gene transfer, with over a billion years of evolution hidden in stem lineages. Straightforward comparative genomic and phylogenetic techniques extrapolate only from extant diversity and therefore may overlook nuance and complexity in the early evolution of pathways before the emergence of the last common ancestors of extant clades. A complementary approach is to develop theories for the ecological drivers of the evolution of phototrophy. Such theories can produce hypotheses that are testable with the biological record, and can be used to choose between competing hypotheses that are equally supported by the biological record but differ in ecological viability.
While in principle the innovation and expansion of bacteriochlorophyll synthesis could have occurred before the evolution of oxygenic photosynthesis in Cyanobacteria, we propose that it was instead the expansion of oxygenic phototrophs and the rise of oxygen that likely led to the rampant HGT of the capacity for bacteriochlorophyll synthesis and overall success of this strategy among taxonomically and ecologically diverse anoxygenic phototrophs (Figure 4).
Before . As a result, these environments would have been permissive to anoxygenic phototrophs, allowing them to thrive in shallow water environments and as the major primary producers in microbial mats (Ward and Shih 2019). At this time, anoxygenic phototrophs would have been free to exploit abundant, high-quality light conditions to which chlorophyll a is well adapted. The earliest anoxygenic phototrophs therefore would have had little to no evolutionary pressure to evolve more biochemically complex bacteriochlorophyll pigments. Anoxygenic phototrophs may have specialized to particular environments where electron donor compounds or other nutrients were especially abundant (e.g. volcanic environments or near shallow hydrothermal vents), but otherwise would have had little barrier to dispersal and colonization of widespread environments. As Cyanobacteria are quite oxygen tolerant relative to other phototrophs and are not dependent on limited electron donors, these organisms would have been capable of colonizing all shallow aquatic environments, displacing anoxygenic phototrophic communities and leading to a distribution of phototroph types more similar to that seen today. Oxygenic phototrophs are able to grow in well-oxygenated surface environments with largely unobstructed sunlight, and so have had little evolutionary pressure to evolve away from using ancestral chlorophyll pigments. Anoxygenic phototrophs, in contrast, would have been unable to compete with Cyanobacteria in surface environments where they would experience oxygen toxicity as well as limited availability of electron donors due to biological or abiotic oxidation driven by O2. As a result, anoxygenic phototrophs would be restricted to environments deeper in water columns or microbial mats where oxygen concentrations are lower and electron donors are more available.
As these environments typically underlie cyanobacterial populations (e.g. in microbial mats), anoxygenic phototrophs are light-limited in the wavelengths absorbed by cyanobacterial pigments, including chlorophyll (e.g. Pierson et al. 1987). These organisms therefore have experienced significant evolutionary pressure to evolve light-harvesting pigments that are shifted to different wavelengths that reach these deep layers. Bacteriochlorophylls fill this role well: bacteriochlorophyll pigments have peak absorbances at longer wavelengths than chlorophyll, resulting in less energy absorbed per photon but allowing them to work deeper in the water column or microbial mat, after much of the light has been absorbed (both by the medium and by Cyanobacteria with chlorophylls) (Fenchel and Staarup 1971, Larkum et al. 2018). This scenario is consistent with the ancestral use of chlorophyll in anoxygenic lineages followed by multiple HGT-enabled acquisitions of BchXYZ and therefore bacteriochlorophyll synthesis in anoxygenic lineages after the GOE. Ecological exclusion of anoxygenic phototrophs from high-light surface environments by competition with Cyanobacteria and the distribution of O2 and electron donors likely therefore provided the evolutionary pressure for the initial evolution of bacteriochlorophyll pigments as well as to drive horizontal gene transfer leading to the widespread adoption of bacteriochlorophyll pigments by diverse anoxygenic phototrophs.
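The trade-off between wavelength and energy per photon follows from the Planck relation E = hc/λ. The short sketch below makes it concrete; the two wavelengths in the usage note are approximate, illustrative in vivo absorption maxima chosen for this example and are not values reported in this study.

```python
# Exact SI values of the physical constants (2019 SI redefinition).
PLANCK_J_S = 6.62607015e-34
LIGHT_SPEED_M_S = 299_792_458.0
JOULES_PER_EV = 1.602176634e-19


def photon_energy_ev(wavelength_nm):
    """Energy of a single photon in electronvolts, from E = h * c / wavelength."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m / JOULES_PER_EV


# Illustrative (assumed) absorption maxima: ~680 nm for chlorophyll a and
# ~870 nm for bacteriochlorophyll a in vivo.
energy_chl_ev = photon_energy_ev(680.0)
energy_bchl_ev = photon_energy_ev(870.0)
```

With these assumed maxima, a photon captured by the longer-wavelength pigment carries roughly a fifth less energy (about 1.4 eV versus 1.8 eV), which is the cost paid for harvesting light that passes through overlying chlorophyll-containing layers.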
As anoxygenic phototrophic lineages are typically restricted to particular niche environments, island biogeography-like evolutionary radiations have likely led to the diversification of anoxygenic phototrophs seen today, including the diversification of their bacteriochlorophyll pigments. Today, oxygenic phototrophs (Cyanobacteria, algae, and plants) are essentially ubiquitous in habitable Earth surface environments. Cyanobacteria therefore have a largely cosmopolitan distribution, with related species found in diverse environments around the world. However, anoxygenic phototrophs are typically restricted to more limited environments such as stratified water columns (Imhoff 2014

Conclusions:
The origin and early evolution of phototrophy are not clearly revealed by the rock record, and care must be taken in reading the biological record; in particular, the application of comparative biology to investigating the early evolution of traits in stem lineages before the last common ancestor of extant members of a clade is inherently challenging (Ward and Shih 2019). However, large-scale comparative phylogenetics of organismal relationships and phototrophy-related proteins can provide some insight into the early evolution of photosynthesis, especially given that the independent histories of genes involved with phototrophy allow some insight into the nature of stem groups and extinct lineages in instances where genes from these organisms have been horizontally transferred and then inherited into extant organisms. These data are particularly useful once coupled to consideration of the ecological context of the organisms in question, which can help to choose between otherwise equally viable hypotheses. As we have shown here, phylogenetic relationships provide abundant evidence of horizontal gene transfer of phototrophy-related proteins, though the directionality of transfers and the relationships among the deepest branches remain ambiguous in many cases. These relationships are consistent with an early evolution of chlorophyll synthesis followed by a post-GOE radiation of bacteriochlorophyll-synthesizing anoxygenic phototrophs driven by ecological competition with, and niche partitioning by, Cyanobacteria.
The evolution of photosynthesis is modular, involving not only independent HGT of carbon fixation and reaction centers but also of separate components of (bacterio)chlorophyll synthesis. The relationships among characterized extant phototrophs suggest one or more "ghost lineages", some of which likely diverged prior to the radiation of the crown group. Chlorophyll synthesis appears to be more ancient than bacteriochlorophyll synthesis, consistent with the Granick hypothesis (Granick 1965). This also results in an evolutionary scenario for the origin of bacteriochlorophyll synthesis analogous to the origin of coupled photosystems for oxygenic photosynthesis, whereby divergence of paralogs followed by reintroduction into a single host organism allows biochemical innovation (e.g. Swithers et al. 2012). The early divergence between RC1 and RC2, and between BchLNB and BchXYZ, suggested by this history is consistent with the long branches between these sets of homologs relative to divergence within each.
The modular nature of the evolution of phototrophy is similar to broader trends in the evolution of pathways including denitrification (Chen and Strous 2013), methanotrophy (Chistoserdova 2011, Ward et al. 2019c), and high-potential metabolism in general ). The ability of microbes to exchange components of preexisting metabolisms and recombine them into new and innovative pathways appears to be a major driver of innovation.
The innovation of bacteriochlorophyll by coupling the homologous BchLNB and BchXYZ complexes in series is analogous to the innovation of coupling divergent photosystems to drive oxygenic photosynthesis. Although the early evolution of phototrophy involved extensive shuffling of phylogenetic relationships via HGT and a role for extinct stem lineages, we can understand this history by careful comparative phylogenetics as long as we have sufficient sampling of extant diversity. Coupling this understanding to analysis of the rock record, potentially supported by molecular clock analysis, may allow us to tie absolute ages to these phylogenies, which otherwise cast events only in relative time. For example, the organic biomarker and molecular clock evidence for the evolution of phototrophy in the Chlorobi by 1.6 Ga (Brocks et al. 2005), molecular clock estimates for the origin of phototrophic Chloroflexia around 1 Ga (Shih et al. 2017b), and the rise of atmospheric oxygen due to total group oxygenic Cyanobacteria around 2.3 Ga can provide some constraints. The fact that each of the clades of anoxygenic phototrophs appears to have acquired bacteriochlorophyll synthesis in stem lineages before the radiation of individual crown groups indicates that the radiations of extant anoxygenic phototroph clades occurred relatively late in Earth history after the origin of oxygenic photosynthesis in Cyanobacteria. This is consistent with estimates for increasing diversity of bacterial lineages through geologic time (Louca et al. 2018). The taxonomic affinity and phenotypes of ancient phototrophs responsible for fueling Archean productivity remain unknown, perhaps due to the extinction of these bacterial lineages following the rise of oxygen.
The ecological partitioning of anoxygenic phototrophs into localized environments with lower oxygen concentrations and lower light energy may have led to an island biogeography-like adaptive diversification of different phototrophic lineages to distinct environments. Discrete anoxygenic phototrophic lineages have adapted to particular preferred environments, such as hot springs for Chloroflexi and Chloracidobacteria  Kimble et al. 1995). Adaptations to these specialized environments may have been driven by island biogeography-like evolutionary trends in which particular phototroph lineages have been isolated and evolved not in specific locations but in particular geochemical settings. Consistent with other analyses of (bacterio)chlorophyll synthesis protein phylogenies (e.g. Xiong et al. 2000, Bryant et al. 2012), we were unable to recover a robust, consistent branching order of deep divergences in many protein families (e.g. BchLNB and BchXYZ), though sister group relationships appear robust for each protein.
The inconsistent branching order of deep divergences in rooted phylogenies of individual subunits of the BchLNB and BchXYZ complexes may be an artifact of long branch attraction and saturation of variable sequences in relatively small soluble proteins over billions of years of evolution, or this difference in branching order may reflect actual differences in evolutionary history, whereby subunits of these complexes underwent independent horizontal gene transfer events early in their histories. It is difficult (if not impossible) to distinguish between these possibilities given the limited sequence data and long evolutionary timescales with which we are left. Nonetheless, broad evolutionary trends and shallower sister-group relationships were robustly recovered, and these clearly indicate that different components of the complete phototrophy pathway have independent evolutionary histories.
These limitations to interpreting the early evolutionary history of (bacterio)chlorophyll synthesis reflect a larger problem that must be confronted in phylogenetic analysis over long geological timescales: sufficient information may not be left in the biological record to answer all questions (e.g. Meyer et al. 1986). The mutational saturation rate of protein sequences, particularly for short, poorly conserved soluble proteins, limits the evolutionary timescale over which sequence-based phylogenies remain meaningful. The amount of evolutionary time represented by the diversity of (bacterio)chlorophyll synthesis proteins is on the edge of the range at which phylogenetic relationships are meaningfully recoverable, and the amount of evolutionary distance involved in outgroups necessary for rooting further complicates the recovery of meaningful evolutionary histories of these proteins (e.g. Lockhart et al. 1996). We therefore find particular value in integrating ecological scenarios for the evolution of phototrophy and (bacterio)chlorophyll synthesis as an independent means of supplementing the limited resolution available from extant sequence data.
This model predicts several potentially testable hypotheses, including the potential existence of previously undiscovered early-diverging chlorophyll a-synthesizing anoxygenic phototrophs, including the "ghost lineage" depicted in Figure 3, which utilizes a basal (perhaps still homodimeric) form of RC2 along with a BchXYZ-like complex to produce chlorophylls. Environmental metagenomic sequencing has proven incredibly powerful for recovering novel phototrophic lineages (e.g. Bryant et al. , and so it remains conceivable that a larger diversity of phototrophs may be recovered which can test the evolutionary history described here. However, it remains a strong possibility that this lineage has gone extinct, or has lost the capacity for phototrophy.

Methods:
To compare evolutionary relationships among steps in (bacterio)chlorophyll synthesis, phylogenies were constructed for individual proteins involved in the conversion of protoporphyrin IX to protochlorophyllide a (the "backbone" (bacterio)chlorophyll synthesis pathway shared by all reaction center-based phototrophs) as well as the subsequent conversion of protochlorophyllide a to specific chlorophyll and bacteriochlorophyll pigments found in only some phototrophs. Congruence of tree topology between phylogenies of different proteins was taken as indicative of shared evolutionary history (i.e. vertically inherited or horizontally transferred together, but not transferred individually), while incongruence was taken as an indication of independent histories of horizontal gene transfer (i.e. HGT of a subset of (b)chl synthesis proteins rather than of the entire pathway).
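The congruence criterion can be illustrated with a minimal sketch. It assumes trees are supplied as nested tuples of hypothetical taxon labels (not the actual protein trees), and it uses the Robinson-Foulds distance, one common way to count topological disagreements; the source does not state that this particular metric was used.

```python
def leaf_set(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree):
    """Collect the non-trivial splits of a tree, treating it as unrooted."""
    all_leaves = leaf_set(tree)
    reference = min(all_leaves)  # arbitrary but fixed leaf used to canonicalize splits
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaf_set(node)
        other = all_leaves - side
        # Skip trivial splits (one leaf, or nothing, on either side).
        if len(side) > 1 and len(other) > 1:
            # Always store the half that lacks the reference leaf, so each
            # split has exactly one representation regardless of rooting.
            splits.add(other if reference in side else side)
        for child in node:
            visit(child)

    visit(tree)
    return splits


def robinson_foulds(tree_a, tree_b):
    """Count splits found in exactly one of the two trees (0 = congruent)."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must contain the same taxa")
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


# Toy topologies with placeholder taxa (hypothetical, for illustration only).
protein_1 = (("A", "B"), "C", ("D", "E"))
protein_2 = (("A", "B"), ("C", ("D", "E")))   # same topology, drawn rooted
protein_3 = (("A", "C"), "B", ("D", "E"))     # A and C grouped instead

print(robinson_foulds(protein_1, protein_2))  # 0: congruent topologies
print(robinson_foulds(protein_1, protein_3))  # 2: one split differs in each tree
```

A distance of zero corresponds to the "shared history" case described above, while a nonzero distance flags incongruence that would then need to be weighed against branch support before being read as HGT.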
Protein translations of all bacterial genomes available from GenBank were downloaded on 3 April 2018. This database was supplemented with data from Ward et al. Following initial tree construction, iterative trimming of alignments was performed to remove incomplete sequences and functionally divergent homologs not involved in (bacterio)chlorophyll synthesis (e.g. nitrogenase NifD and NifH sequences returned by BchN and BchL searches, respectively). Outgroups were retained for proteins that displayed relatively short evolutionary distances and for which deep branches were robustly recovered (e.g. BchH and CobD), but trees were left unrooted when long evolutionary distances to outgroup sequences led to inconsistent rooting and deep branching order (e.g. between BchL and BchX). Congruent topologies between rooted and unrooted trees in related steps in (bacterio)chlorophyll synthesis (e.g. BchH with BchI and BchD) were considered sufficient to produce rooted consensus trees for steps in the (bacterio)chlorophyll synthesis pathway (i.e. insertion of Mg into protoporphyrin IX for BchH/D/I, the first committed step in (bacterio)chlorophyll synthesis, shared between all reaction center-based phototrophs).
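One part of the trimming step, the removal of incomplete sequences, can be sketched as a gap-fraction filter on an aligned FASTA file. This is an illustrative assumption rather than the procedure actually used: the function names and the 50% gap threshold are hypothetical, and the removal of functionally divergent homologs (which depends on inspecting tree placement) is not covered.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a {name: sequence} dictionary."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def drop_incomplete(alignment, max_gap_fraction=0.5):
    """Keep aligned sequences whose gap fraction is at or below the threshold.

    max_gap_fraction is a hypothetical cutoff chosen for illustration.
    """
    kept = {}
    for name, seq in alignment.items():
        if seq and seq.count("-") / len(seq) <= max_gap_fraction:
            kept[name] = seq
    return kept


# Toy alignment with made-up sequences (not real BchL/BchN data).
toy = """>seq_full
MKTAYIAKQR
>seq_minor_gaps
MK--YIAKQR
>seq_fragment
------AKQR
"""

filtered = drop_incomplete(read_fasta(toy))
print(sorted(filtered))  # seq_fragment (60% gaps) is dropped
```

In an iterative workflow, a filter of this kind would be rerun after each realignment, since removing fragments changes the gap pattern of the remaining sequences.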
For phototroph lineages only characterized via incomplete metagenome-assembled genomes (e.g. Eremiobacterota, Anaerolineae), the likelihood that missing genes may be present in the source genome was estimated with the False Negative estimate function of MetaPOAP (Ward et al. 2018b).
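The intuition behind a false-negative estimate can be shown with a deliberately simplified model. The sketch below is not MetaPOAP's implementation or interface; it assumes that each gene is recovered independently with probability equal to the estimated genome completeness, and the function name is hypothetical.

```python
def prob_all_missing(completeness, n_genes):
    """Probability that n_genes present in the source genome are all absent
    from the assembled bin, under a simple independence assumption.

    completeness: estimated fraction of the genome recovered (0 to 1).
    n_genes: number of marker genes expected if the pathway were present.
    """
    if not 0.0 <= completeness <= 1.0:
        raise ValueError("completeness must be between 0 and 1")
    if n_genes < 1:
        raise ValueError("n_genes must be at least 1")
    return (1.0 - completeness) ** n_genes


# Example: a bin estimated at 80% complete, checking one gene versus three.
for n in (1, 3):
    print(n, round(prob_all_missing(0.8, n), 4))
```

Under this toy model, the chance of missing an entire multi-gene pathway by incompleteness alone falls quickly as more independent genes are considered. Real genes in a pathway are often colocalized, which violates the independence assumption and makes joint absence more likely than this sketch suggests.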
For steps in (bacterio)chlorophyll synthesis catalyzed by proteins that are more promiscuous and can be recruited into and from distinct pathways (e.g. methyltransferases), and for steps that can be catalyzed by multiple poorly characterized proteins (e.g. C-8 vinyl reductase, Chew and Bryant 2007b), we do not report phylogenies as these were deemed unreliable for recording robust deep evolutionary relationships.
Due to very low support values for deep nodes and extensive artifacts, concatenated protein trees (e.g. BchLNB) were deemed unreliable, likely as a result of extensive, independent HGT of individual subunits (particularly within the Proteobacteria).
Occurrences of reaction center-based phototrophy within a phylum are indicated by shading of the entire phylum, color coded by reaction center type. For clarity, the entire phylum is highlighted even when phototrophy is restricted to only some members (e.g. the phototrophic Heliobacteria within the much broader, predominantly nonphototrophic, Firmicutes phylum).
3) Cartoon of overlaid phylogenies demonstrating the hypothesized history of HGT and the ghost lineage. Underlying topology derived from backbone (bacterio)chlorophyll synthesis genes (BchH/D/I/M) (black). BchLNB and BchXYZ are derived from a common ancestor in stem group phototrophs. BchLNB (green) was inherited together with BchHDIM into extant phototrophs; BchXYZ (red) diverged in the ghost lineage before being introduced into extant anoxygenic phototrophs via HGT (a first HGT event introduced it into the stem of the proteobacterial lineage; a second HGT event introduced it into the stem of the WPS2/Chlorobi/Chloroflexi lineage; subsequent HGT introduced it into the Chloracidobacteria and Heliobacteria lineages). Type 1 reaction centers (RC1 and PSI) and Type 2 reaction centers (RC2 and PSII) diverged in stem lineage phototrophs. Type 1 reaction centers (peach) were vertically inherited into extant phototrophic lineages. Type 2 reaction centers (blue) diverged in the same ghost lineage as BchXYZ, and were introduced into extant clades via HGT (first into stem group Cyanobacteria, leading to PSII, then into stem group Proteobacteria and the stem lineage of WPS2, Chloroflexi, and Chlorobi). The most parsimonious history consistent with the data involves a secondary replacement of RC2 with RC1 in Chlorobi. Alternative evolutionary histories are similarly parsimonious, but all involve many events of HGT of individual phototrophy components and most involve secondary loss and replacement in some lineages. The inclusion of one or more ghost lineages improves parsimony and provides a good explanation for long branches between RC1/RC2 and BchLNB/BchXYZ homolog pairs.
4) Cartoon timeline of phototroph evolution as hypothesized here. The "ancestral" phototroph was anoxygenic and utilized chlorophyll pigments, adapted to high light.
At least two lineages of phototrophs diverged during Archean time, giving rise to the ancestors of Type 1 Reaction Centers and the BchLNB complex, and the Type 2 Reaction Centers and the BchXYZ complex. Eventually, HGT of a Type 2 RC into an RC1 and BchLNB-containing proto-cyanobacterium enabled the evolution of oxygenic photosynthesis, still using chlorophyll pigments. As oxygenated surface waters and competition with Cyanobacteria forced anoxygenic phototrophs into lower light regions of the water column or microbial mats (where oxygen is lower and electron donors are more abundant), they underwent adaptation to lower quality light. This included the HGT of BchXYZ complexes into BchLNB-containing lineages, allowing the innovation of bacteriochlorophyll pigments, probably first with bchl a. Eventually, anoxygenic phototrophs diversified in terms of organisms (including extant groups), pigments (bchl c-g), and reaction centers (further HGT of RC2). Oxygenic phototrophy diversified via eukaryotic endosymbiosis (primary and higher order) and colonization of land by plants. While the relative timing of these events can be inferred from comparative biology, absolute timing of many of these events is only poorly constrained if at all, based on the sparse microfossil and biomarker record of early phototrophic microbes and molecular clock estimates for the antiquity of crown group clades.
C) Consensus tree for BchX, BchY, and BchZ, used for the conversion of chlorophyllide a to 3-vinyl-bacteriochlorophyllide a, the first dedicated step in the synthesis of bacteriochlorophylls a, b, and g, and therefore found in all characterized anoxygenic phototrophs. Long branches between BchX/Y/Z and closest outgroups (BchL/N/B) resulted in poorly supported root placement, so the tree is presented unrooted. The topology of this tree is incongruent with those presented in A) and B), suggesting independent histories of HGT of bacteriochlorophyll-specific genes versus shared backbone (bacterio)chlorophyll synthesis genes.