Uncoiling CNLs: Structure/Function Approaches to Understanding CC Domain Function in Plant NLRs

Abstract
Plant nucleotide-binding leucine-rich repeat receptors (NLRs) are intracellular pathogen receptors whose N-terminal domains are integral to signal transduction after perception of a pathogen-derived effector protein. The two major plant NLR classes are defined by the presence of either a Toll/interleukin-1 receptor (TIR) or a coiled-coil (CC) domain at their N-terminus (TNLs and CNLs). Our knowledge of how CC domains function in plant CNLs lags behind that of how TIR domains function in plant TNLs. CNLs are the most abundant class of NLRs in monocotyledonous plants, and further research is required to understand the molecular mechanisms of how these domains contribute to disease resistance in cereal crops. Previous studies of CC domains have revealed functional diversity, making categorization difficult, which in turn makes experimental design for assaying function challenging. In this review, we summarize the current understanding of CC domain function in plant CNLs, highlighting the differences in modes of action and structure. To aid experimental design in exploring CC domain function, we present a ‘best-practice’ guide to designing constructs through use of sequence and secondary structure comparisons and discuss the relevant assays for investigating CC domain function. Finally, we discuss whether using homology modeling is useful to describe putative CC domain function in CNLs through parallels with the functions of previously characterized helical adaptor proteins.


Introduction
Plants use intracellular immune receptors to perceive virulence proteins (effectors) secreted into the host cell by microbial pathogens during infection (Dangl and Jones 2001). These receptors generally belong to the superfamily of nucleotide-binding (NB) leucine-rich repeat (LRR) receptors (NLRs) (Kourelis and van der Hoorn 2018), which are also components of mammalian innate immunity pathways (Duxbury et al. 2016, Meunier and Broz 2017). Plant NLR proteins are commonly divided into two classes based on their N-terminal domains: coiled-coil- (CC) containing NLRs (CNLs), and Toll/interleukin-1 receptor- (TIR) containing NLRs (TNLs). Mammalian NLRs typically detect PAMPs (pathogen-associated molecular patterns) (Broz and Monack 2013), although recently NLRP1B was shown to be activated by proteolytic cleavage mediated by bacterial pathogen effectors (Bachovchin et al. 2018, Sandstrom et al. 2018). In contrast, plant NLRs have only been shown to respond to specific effector molecules. A recognition event between a plant NLR and pathogen effector typically results in a form of programmed cell death known as the hypersensitive response (HR). This response results in host cell death, but also isolates the pathogen, preventing colonization of the plant and disease (Heidrich et al. 2012). NLRs offer genetic solutions to preventing disease in crops without requiring the use of costly and unsustainable pesticides, and have been widely used in plant breeding programs (Mundt 2018). This has stimulated research focused on understanding how these receptors detect effectors and initiate immunity-related signaling.
Plant NLRs have a modular, multidomain architecture, and specific roles for each domain have been described that together contribute to the function of the full receptor (Takken and Goverse 2012). The C-terminal LRR domain has been implicated in direct binding of effectors in some NLRs (Jia et al. 2000, Dodds et al. 2006); however, it also appears to have a role in autoinhibition of the receptor preceding effector interaction (Ade et al. 2007, Faustin et al. 2007). Activation of immune signaling by NLRs appears to involve ADP/ATP exchange within the central NB domain (also known as NB-ARC in plants) (Bernoux et al. 2016), which may modulate conformational changes within the receptor in response to effector detection (Takken et al. 2006, Bernoux et al. 2016). The NB domain of plant NLRs shares similarities with the NACHT domain (the nucleotide-binding domain of mammalian NLRs), which undergoes conformational change in mammalian NLRs, as observed in cryo-electron microscopy (cryo-EM) structures of NLRC4 apoptosomes and the crystal structures of the NB domain of the NLR-like apoptosis protein, APAF1 (Reubold et al. 2011, Tenthorey et al. 2017). Recently, multiple studies have categorized the function of supplementary domains found only in some NLRs, which may be found attached to the N- or C-terminus of the protein, or even incorporated between the other domains of the receptor (Cesari et al. 2014, Maqbool et al. 2015, Kroj et al. 2016, Sarris et al. 2016, De la Concepcion et al. 2018). Known as integrated domains (IDs), these domains most probably have their evolutionary origin as host effector targets and are associated with direct effector perception (Cesari et al. 2014, Baggs et al. 2017). Finally, located at the N-terminus is either the TIR domain or the CC domain.
Both TIR domains and CC domains are thought to be the receptor modules required for downstream signal transduction post-NLR activation (Takken and Goverse 2012); however, CC domains from a variety of different NLRs have also been implicated in guardee or effector perception (Khan et al. 2016). The TIR and CC domains divide plant NLRs into the TNL and CNL classes (Meyers et al. 1999, Meyers et al. 2003). Research to date has established that the TIR domain has a role in signaling by plant TNLs; however, less is known about signaling by CC domains in CNLs. While TNLs can provide resistance to disease in solanaceous, brassicaceous and other crops, they contribute less to the immune systems of cereals, as cereal NLR repertoires consist almost entirely of CNLs (Bai et al. 2002, Meyers et al. 2002). With cereal crops contributing approximately 50% of the world's daily caloric intake (FAO 2003), further research into the signaling capacity of CC domains and CNL function is of high priority.
Although considered the predominant signaling units of NLRs, TIR domains and CC domains are structurally and functionally very different from one another. TIR domains adopt a conserved flavodoxin-like fold consisting of five α-helices surrounding a five-strand β-sheet, as observed in crystal structures of TIR domains from a variety of different plant, animal and bacterial species (Ve et al. 2015). Signal transduction mediated by plant TIR domains has been intimately linked to their ability to self-associate, with two self-association interfaces formed by the surface-exposed regions of the αA and αE, and αD and αE helices, respectively (Bernoux et al. 2011, Williams et al. 2014, Zhang et al. 2017). In contrast, CC domains are largely helical proteins, and there is some debate concerning their overall structure (discussed in this review). Further, despite a growing number of studies, the function of the CC domain in NLR signaling downstream of effector perception remains unclear.
Here, we review current knowledge of CC domain- and CNL-mediated signaling in plant immunity and highlight some methods for investigating CC domain structure and function. We discuss the current classification of CNLs in the context of function, provide some guidelines on how to design CC domain constructs for structural and functional studies through analyses of sequence and secondary structure, and finally discuss putative functions for the variety of different CC domains found in CNLs.

Highly Unclassified: Functional Analyses of CC Domains Complicate Current Classifications
Previously, CNLs have been characterized based on motifs in the NB domain, and not the CC domain, as low sequence similarity and the absence of consistent motifs in the CC domain made analyses with resources such as Pfam difficult (Meyers et al. 1999, Meyers et al. 2002, Meyers et al. 2003, Finn et al. 2016). More recently, an increasing number of studies have focused on the CC domain, and three major features have emerged that are frequently used to describe their function: (i) their ability to trigger cell death when transiently expressed in model host plants such as Nicotiana benthamiana; (ii) the need for self-association to signal cell death; and (iii) the presence/absence of CC-specific motifs, such as the EDVID motif. A result of these studies is that CC domains are frequently grouped into several classes: CC EDVID, CC R, CC (often referred to as the canonical or classical CC domain; herein referred to as CC CAN for clarity) (Collier et al. 2011), and the I2-like and SD-CC classes, found only in solanaceous plants (Hamel et al. 2016). The CC EDVID class, which includes NLRs such as Sr33, MLA10, Rx, SlNRC4, Rp1-D21 and RGA5, is named for the highly conserved EDVID motif that is suggested to be involved in intramolecular interactions with the NB domain (Rairdan et al. 2008, Bai et al. 2012, Wang et al. 2015, Leibman-Markus et al. 2018). The CC R subclass is characterized by NLRs with a CC domain that shares similarity to RPW8 (Collier et al. 2011); classical/canonical CC CAN domains are the CC domains from NLRs that do not fit into the previous two categories, with examples such as RPS2 and RPS5 (Qi et al. 2012). CC domains belonging to the SD-CC subclass include a large auxiliary domain N-terminal to the CC domain, known as a solanaceous domain (SD); this class includes the well-characterized NLRs Sw-5b and Prf (Mucyn et al. 2006, De Oliveira et al. 2016).
Finally, the I2-like CNL family is centered around CC domains with similarity to the CC domain of the tomato NLR, I2. While also possessing an EDVID motif, I2-like CNLs are different from their CC EDVID counterparts, segregating into their own monophyletic clade (Pan et al. 2000, Couch et al. 2006, Rairdan et al. 2008). CNLs with I2-like CC domains include I2, R3a, L and N' (Collier et al. 2011, Hamel et al. 2016).
Recently, it has been found that many NLRs (both TNLs and CNLs) function synergistically either as pairs (Ashikawa et al. 2008, Narusaka et al. 2009, Cesari et al. 2014) or as part of intricate signaling networks (Collier et al. 2011, Zhu et al. 2011, Wu et al. 2017). What role the CC domain of CNLs plays in heterologous pairs and/or NLR networks, for example in mediating oligomerization or signaling, is largely unknown. While many genes encoding CNLs have been identified in plant genomes, only a handful of studies addressing the function of CNL proteins, and the signaling mechanisms of the CC domain, have been performed. For this review, we selected CNLs for which functional data for their CC domains are available, and included experimental data which have tested at least one of the three following functions: (i) capacity to induce cell death; (ii) ability to self-associate; and (iii) ability to interact with a cofactor (for references, see Fig. 1). This includes three CC domains from the CC CAN and CC R subclasses, four CC domains from the I2-like subclass and the rest comprise CC domains of the CC EDVID subclass. While there have been multiple studies performed on the N-terminal SD-CC domains of Sw-5b and Prf SD-CNLs (Gutierrez et al. 2010, Saur et al. 2015, De Oliveira et al. 2016), there are few data for the CC domain function alone. This creates difficulties when attempting to delineate CC domain functions from the functions of the SD and other N-terminal domains, as in the case of Prf. Therefore, CC domains from SD-CNLs have not been included in the subsequent analyses.
Fig. 1 Reported functions of the CC domains analyzed in this review. (A) Venn diagram of three functions: the ability to signal cell death (yellow), the ability to self-associate (green) and the ability of the CC domain to interact with a cofactor (blue). The selected CC domains analyzed here have been placed in regions that correlate with observed functions. CC domains from all subclasses can be found across all regions of the Venn diagram, demonstrating little correlation between function and subclass. The one exception to this is Bs2 from the CC CAN subclass, for which there are no observed functions in any of the three categories and which is depicted in an orange circle separated from the other CNLs. (B) A table of reported functions of the CC domains analyzed here, accompanied by the studies in which they were observed. As with (A), little correlation can be seen between CC domain function and subclass assignment, with the exception of CC domains that belong to the monophyletic CC R and I2-like subclasses.
When using our defined functional groups, it becomes clear that division into CC EDVID, CC R or CC CAN subclasses does not align directly with function (Fig. 1). The exceptions are the CC R (comprising members of the ADR1 family) and I2-like subclasses, which distinctly segregate in sequence and function from other CNLs (Collier et al. 2011, Hamel et al. 2016). The functional diversity is clearest within the single CC EDVID subclass (Fig. 1). CC EDVID domains differ in their ability to self-associate, signal cell death and directly interact with a cofactor. Using the RPM1, MLA10 and Rx CC domains as examples, several subclass contradictions can be observed based on reported functions (Rairdan et al. 2008, Casey et al. 2016, El Kasmi et al. 2017). The CC domains of MLA10, RPM1 and Rx are all of the CC EDVID type, but only the MLA10 CC domain is capable of autonomously signaling cell death (Rairdan et al. 2008, Casey et al. 2016, El Kasmi et al. 2017). Much like MLA10, the RPM1 CC domain self-associates; however, RPM1 CC does not autonomously signal cell death (El Kasmi et al. 2017), whereas the Rx CC domain does not signal cell death or self-associate (in the 1-122 CC domain construct only) (Moffett et al. 2002, Casey et al. 2016). Finally, the RPM1 and Rx CC domains, but not the MLA10 CC domain, interact with a cofactor that is essential for the function of the receptor (Moffett et al. 2002, Sacco et al. 2007, Bai et al. 2012, El Kasmi et al. 2017). It is worth noting that the 1-144 construct of the Rx CC domain has been observed to form large homomeric protein complexes in size-exclusion chromatography (SEC; Townsend et al. 2018); however, without biophysical analyses it is difficult to determine whether these complexes represent an ordered oligomeric assembly or simply aggregation.
Regardless, the different functions described above show that caution must be applied when addressing CC domain function in the context of the CC EDVID classification. While the EDVID motif has been shown to have a role in mediating interactions between the CC domain and the NB domain (Rairdan et al. 2008, Bai et al. 2012), its presence alone should not be taken as an indicator of CC domain (or NLR) function, as described here.
A further example highlights shared CC domain functions that span the previously described classes. RPM1 and RPS5 CC domains share similar functions, with both capable of self-association and cofactor interactions (RPM1 with RIN4, and RPS5 with PBS1), despite belonging to the CC EDVID and CC CAN subclasses, respectively (Ade et al. 2007, El Kasmi et al. 2017). Moreover, neither the RPM1 nor the RPS5 CC domain is capable of independently signaling cell death. There are several other examples of CC domains from different classes that do not signal cell death but interact with a guardee/cofactor, including Rx and RPS2 (Rairdan et al. 2008, Qi et al. 2012).
Taken together, these observations highlight that the commonly used classifications of CC domains are not especially useful for confidently defining function. Therefore, care should be taken if these classifications are used when designing functional studies for CC domains from uncharacterized NLRs. The differences in CC domain function, coupled with the small sample size of proteins studied to date, highlight the difficulties in forming classes and, consequently, in predicting putative functions for these proteins based on sequence. One limiting factor in characterizing the function of CC domains in plant NLRs is the difficulty in assigning domain boundaries, and therefore in designing appropriate constructs to analyze. Next, we give a short guide on how to analyze the protein sequences of CC domains with the goal of designing experiments to assay their function robustly.

Predicting CC Domain Boundaries with Sequence and Secondary Structure Prediction to Guide Functional Studies
Due to a lack of knowledge, it is frequently necessary to make subjective decisions about the boundaries of protein domains when attempting to assess function. This can result in inappropriate constructs that complicate functional annotation. One potential source of error is to use domain boundaries based on homology or similarity to previously assayed proteins. For example, in TIR domains from plant NLRs, despite the core flavodoxin fold being conserved in the protein (Ve et al. 2015), there is variation in the extent of the domains, and surrounding regions, that are required for in planta cell death phenotypes (Bernoux et al. 2011, Williams et al. 2014, Schreiber et al. 2016). In this case, applying knowledge of construct boundaries from one protein to study the function of another would result in inappropriate conclusions. The same considerations should be applied when designing constructs to assay the function of CC domains, to avoid potential misrepresentation of protein function. As, to date, the explicit function of CC domains in plant NLRs remains undefined, any initial constructs designed to assay activity should be overly inclusive, preferably spanning the region of the NLR from the N-terminus to the beginning of the NB domain (the boundaries of which can be better defined). These constructs should then be assayed for cell death signaling in planta, cofactor association and self-association to establish a functional reference point before, perhaps, trying to delimit minimal regions required for function (however, it is worth noting that additional residues at the C-terminus of signaling domains have been observed to inhibit signaling in planta; Bernoux et al. 2011). Not all CC domains studied to date autonomously signal cell death in planta, and therefore a non-cell death-inducing construct does not imply a lack of biological relevance.
Care must be exercised when interpreting assays with new CC domain constructs, and best practice would include trialing several CC domain constructs before making conclusions concerning function.
Two of the more useful, and extensively used, tools for assessing construct boundaries are secondary structure prediction and Pfam domain analysis (Finn et al. 2016). In many cases, Pfam is an excellent resource for initial identification of regions of interest; however, it can be unsuitable for guiding bespoke domain boundaries. It is well established that Pfam struggles with CC domain identification (Meyers et al. 2003); therefore, when designing initial CC domain constructs, it is advisable to include the amino acid sequence from the first N-terminal residue of the NLR up to the first Pfam-predicted residue of the NB domain. As an example, Pfam analysis of the NLR Sr33 defines the CC domain as encompassing residues 6-134; however, structure/function studies have revealed that residues 1-142 are required for the cell death phenotype, and any less prevents this activity (Casey et al. 2016). Hence, were the Sr33 CC domain boundaries defined solely by Pfam, and subsequently used as the basis for construct design, this would generate a protein that does not signal cell death, potentially leading to the loss of biologically relevant information. Additional sequence analysis tools should be used when designing CC domain constructs shorter than the region from the N-terminus of the NLR to the start of the NB domain. As shown in Fig. 2, the secondary structure composition of CC domains varies greatly, even within subclasses, and therefore care must be taken when designing constructs. Programs such as COILS and PSIPRED, as well as 3-D homology modeling servers such as Phyre2 and I-TASSER, run a secondary structure prediction as a part of their pipelines (Lupas et al. 1991, Buchan et al. 2013, Kelley et al. 2015, Yang and Zhang 2015). In general, these programs predict the position of α-helices and β-strands, accompanied by a confidence score.
This knowledge is very valuable as keeping protein secondary structure units intact, without partial removal or truncation, will probably be important for protein stability. In the context of structural biology, secondary structure prediction is also useful to avoid long, disordered regions in the protein, which can result in solubility issues, promote aggregation or make crystallization difficult (Dong et al. 2007). While the prediction of secondary structure is a very useful tool, much like Pfam, domain boundaries should not solely rely on the outputs of this software, but rather be used as a guide. Best practice would be to generate several CC constructs, with additional residues at the C-terminus of the last secondary structure to ensure that any predicted α-helix or β-strand is fully covered.
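The boundary guidance in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a published tool: the function names (`structured_segments`, `propose_c_termini`), the padding values and the toy prediction string are all hypothetical, and the input is assumed to be a per-residue three-state string (H = helix, E = strand, C = coil) of the kind that secondary structure predictors report. Given a draft C-terminal boundary, the sketch extends it so that no predicted element is truncated, adds a few padded variants, and always includes the most inclusive construct ending one residue before the predicted start of the NB domain.

```python
from typing import List, Tuple


def structured_segments(ss: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, type) for each run of 'H' or 'E' (1-based, inclusive)."""
    segments = []
    run_start, run_type = None, None
    # A trailing coil sentinel closes any helix/strand that reaches the last residue.
    for pos, state in enumerate(ss + "C", start=1):
        if state != run_type:
            if run_type in ("H", "E"):
                segments.append((run_start, pos - 1, run_type))
            run_start, run_type = pos, state
    return segments


def propose_c_termini(ss: str, nb_start: int, draft_end: int,
                      padding: Tuple[int, ...] = (0, 3, 10)) -> List[int]:
    """Suggest C-terminal residues for a series of constructs starting at residue 1.

    nb_start  : first residue of the NB domain (e.g. from a Pfam prediction)
    draft_end : initial guess for the last residue of the CC construct
    padding   : hypothetical numbers of extra residues to add after the last element
    """
    max_end = nb_start - 1  # most inclusive construct: residue 1 to the NB domain
    safe_end = draft_end
    for start, end, _ in structured_segments(ss):
        if start <= draft_end < end:  # the draft boundary would cut this element
            safe_end = end
            break
    return sorted({min(safe_end + pad, max_end) for pad in padding} | {max_end})


# Toy prediction (invented, not a real NLR): two helices at residues 3-12 and 16-25.
toy_ss = "CC" + "H" * 10 + "CCC" + "H" * 10 + "CCCCC"
print(propose_c_termini(toy_ss, nb_start=31, draft_end=20))
```

Here the draft boundary at residue 20 falls inside the second predicted helix, so the shortest suggested construct ends at residue 25 rather than 20. The output is only a starting list; as discussed above, several constructs should still be tested experimentally.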

Assaying CC Domain Function: Comparing Results and Understanding Technique Limitations
Generalizing plant NLR CC domain function has proven challenging. As discussed previously, the ability to cause cell death upon heterologous expression in model plants, self-association and whether the domain interacts with a cofactor are the most commonly assigned activities. Each of these activities is informative concerning CC domain function, but the conclusions can be subjective, in particular where a lack of activity is observed, as this could simply be due to suboptimal experimental design.
After assembling suitable constructs through incorporating best estimates of domain boundaries, assays to assess the CC domain functions described above can be employed. The in planta 'HR assay', often performed in heterologous hosts such as N. benthamiana or N. tabacum, is used as an indicator of cell death signaling consistent with plant immunity pathways. As it is less likely that negative (lack of cell death) responses will be reported in the literature, it is difficult to assess how common it is for CC domains to lead to cell death on expression. Good examples of how to use the HR assay to evaluate CC domain autoactivity are found in Bai et al. (2012), Cesari et al. (2016), Casey et al. (2016) and El Kasmi et al. (2017). In each of these studies, several CC domain constructs were tested with different domain boundaries to explore the extent required for signaling, or whether signaling was not observed, as in the case of RPM1 (El Kasmi et al. 2017).
To study CC domain self-association, or association with cofactors, yeast two-hybrid (Y2H), co-immunoprecipitation (CoIP) and analytical SEC (also known as gel filtration) have all been extensively used. While powerful techniques, Y2H and CoIP (from plant tissue, usually after co-expression in N. benthamiana) suffer from false negatives and positives due to the context of the assay. Y2H assays can positively report the interaction of two proteins, but this may not be biologically relevant, and false negatives can occur from inhibition of interactions by the reporter fusions, or because a partner of the interaction (e.g. a 'bridging molecule') is missing in yeast. While CoIP assesses associations derived from plant tissue (that can also be affected by 'bridging molecules'), extraction conditions may both positively promote and negatively influence interactions between the proteins of interest. In cases where differences are observed between interactions in Y2H and CoIP analyses, further experiments should be conducted.
One such technique is SEC, but this is a purely in vitro assay that relies on heterologous expression and purification of the protein(s) of interest, most commonly in Escherichia coli. SEC reports protein shape, which is correlated with size, and can be used to study protein self-association or complex formation by comparison of retention times with known standards. CC domain self-association in vitro has been observed as weak and transient, and SEC alone may not have the resolution required to measure self-association confidently (Casey et al. 2016). The nature of oligomeric, self-associating complexes (or complexes comprised of different proteins) observed in SEC can be further investigated using other in-solution biophysical techniques, such as small-angle X-ray scattering (SAXS), or multiangle laser light scattering (MALS). These techniques derive accurate measurements of the average molecular mass of the sample, but do require specialist equipment that may not be routinely available. Examples of MALS and SAXS analyses within the plant immunity field include those of CC domains (Casey et al. 2016), characterization of TIR domain self-association (Bernoux et al. 2011, Williams et al. 2014, Zhang et al. 2017) and investigation of heterocomplex formation between the RxLR effector PexRD54 and the host autophagy-related protein, ATG8 (Maqbool et al. 2016). Furthermore, the stoichiometry of protein complexes can be difficult to determine by SEC, and additional techniques, such as MALS, surface plasmon resonance (SPR) or isothermal titration calorimetry (ITC), can be used to assess stoichiometry of the complex and (for the latter two) determine binding affinities when investigating heterocomplexes (Maqbool et al. 2015, De la Concepcion et al. 2018).
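The comparison of SEC retention with known standards is commonly done by fitting a straight line to log10(molecular mass) against elution volume. The sketch below illustrates that arithmetic only; the function names are hypothetical, and the calibration points and monomer mass are invented numbers chosen to lie on a simple line, not measurements from any column or protein. Because SEC responds to shape as well as mass, the result is an apparent mass, which is one reason the text recommends MALS or SAXS for confirmation.

```python
import math
from typing import List, Tuple


def fit_calibration(standards: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Least-squares fit of log10(mass in Da) = slope * elution_volume + intercept."""
    xs = [volume for volume, _ in standards]
    ys = [math.log10(mass) for _, mass in standards]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def apparent_mass(volume: float, slope: float, intercept: float) -> float:
    """Apparent molecular mass (Da) for a peak eluting at the given volume."""
    return 10 ** (slope * volume + intercept)


def apparent_stoichiometry(volume: float, monomer_mass: float,
                           slope: float, intercept: float) -> float:
    """Ratio of apparent mass to the sequence-derived monomer mass."""
    return apparent_mass(volume, slope, intercept) / monomer_mass


# Invented calibration points (elution volume in ml, mass in Da), for illustration only.
STANDARDS = [(10.0, 100_000.0), (12.5, 31_622.8), (15.0, 10_000.0)]
slope, intercept = fit_calibration(STANDARDS)

# Hypothetical domain with a 15.8 kDa monomer eluting at 12.5 ml.
print(round(apparent_stoichiometry(12.5, 15_800.0, slope, intercept), 1))
```

A ratio near 2 in such a calculation would be consistent with a dimer, but an elongated monomer could elute at the same volume, so the number should be treated as a prompt for further biophysical analysis rather than a conclusion.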
Overcoming the limitations of individual techniques to study plant NLR CC domain function requires multiple approaches. One of the best predictors of protein function is 3-D structure, and currently structural data on CC domains are limited. For example, to date, there is no structure of a plant NLR CC domain that fully covers a construct that induces cell death in model plants. In the next section, we assess the current understanding of CC domain structure, and ask whether homology modeling is useful to provide insight into plant NLR CC domain function, specifically to inform further experiments.

CC Domain Structure: A Protein Interaction Scaffold or Another Function?
The Sr33 CC domain [determined by nuclear magnetic resonance (NMR)], and the MLA10 and Rx CC domains (determined by X-ray crystallography), are the only structures of plant CC domains available to date (Maekawa et al. 2011, Hao et al. 2013, Casey et al. 2016). The Sr33 and Rx CC domains comprise a four-helix bundle fold (Hao et al. 2013, Casey et al. 2016). In contrast, in the crystal structure of the MLA10 CC domain, the protein forms an obligate dimer made up of two helix-loop-helix monomers (Maekawa et al. 2011) (Fig. 3A). Interestingly, while the structure of the MLA10 CC domain does not resemble the structures of Rx CC, or of MLA10's ortholog Sr33, a comparative study of the biophysical characteristics of the three CC domains reveals that they probably share the same four-helix bundle fold in solution (Casey et al. 2016). This raises intriguing questions about the helix-loop-helix dimer structure of MLA10. It is possible that this conformation represents either a crystallographic artifact, or a biologically significant oligomerization state that may also be observed for other plant NLR CC domains (El Kasmi and Nishimura 2016).
As previously mentioned, none of the existing plant NLR CC domain structures represents a functional signaling unit, at least in the context of induced cell death in model plants (Fig. 3). For both the Sr33 and MLA10 CC domains, an additional 22 residues are required at the C-terminus for this activity that were not included in the expressed protein for structural studies (Casey et al. 2016). Secondary structure predictions of Sr33 and MLA10 suggest that these additional 22 residues are involved in the completion of a fourth α-helix (Fig. 3B), and it was shown that these additional residues are also necessary for the self-association of the MLA10 and Sr33 CC domains (Casey et al. 2016). It is noteworthy that the structure of the Rx CC domain includes the entirety of a predicted four-helix bundle, but this construct does not induce cell death. Therefore, it would appear that the presence of the entire fourth α-helix in the Rx CC is not sufficient for cell death activity.
In the absence of easy access to structural information, homology modeling offers an opportunity to gain insight into protein structure/function relationships, but should always be used with caution to prevent falling into the 'functional homology trap' (Moréra et al. 1994, Lahm et al. 2003, Launay and Simonson 2008). For homology modeling of plant NLR CC domains specifically, only three template structures are available, and this limited pool of information can lead to bias in the outputs. For this reason, we have removed homology models from the analysis below when these structures have been used as templates. We took each of the CC domains detailed in Fig. 1, and submitted the sequences to the protein structure prediction server PHYRE2 (Kelley et al. 2015). Intriguingly, two structures were consistently identified as reasonable templates for homology modeling of these CC domains (Fig. 4A, B). For NRG1 and ADR1, the highest confidence hit was to the NMR structure of the N-terminal domain of the mixed-lineage kinase domain-like (MLKL) protein, and for all other CC domains the highest confidence hit was the CARD (caspase-activation and recruitment domain) of the Caenorhabditis elegans CED-4. The similarity between the MLKL N-terminal domain or the CED-4 CARD and the CC domains of plant NLRs, as predicted by PHYRE2, may be useful to inform further studies of plant NLR CC domain function.
MLKL is required for the activation of necroptosis, an auxiliary form of cell death thought to be triggered after suppression of apoptosis. The four-helix bundle region of MLKL was necessary for the insertion of the protein into the plasma membrane post-oligomerization, causing pore formation and resulting in the collapse of cell integrity (Su et al. 2014). Further, the four-helix bundle of MLKL shares similar biochemical properties with CC domains, being highly amphipathic with a highly charged solvent-exposed surface and hydrophobic core (Su et al. 2014, Casey et al. 2016). The homology model of the ADR1 CC domain generated from MLKL with PHYRE2 has a high confidence score of 99.5% over the 133 residues able to be modeled (13-146; input 1-150) (Fig. 4A). Intriguingly, the MLKL structure (PDB: 2MSV) was the strongest hit, higher than that of MLA10, which is found ubiquitously as the 'best template' when modeling CC domains. Combinatorial extension- (CE) based superimposition of the ADR1 homology model with the four-helix bundle Sr33 NMR structure demonstrates a clear structural similarity, with a root mean square deviation of 3.88 Å over 96 residues (Fig. 4C). This, combined with the similar biochemical properties of MLKL and CC domains, suggests that this homology model may be a reasonable approximation of the CC domain structure. However, similarity to the N-terminus of MLKL is only observed for the CC R domains.
Fig. 3 The structures of the MLA10 and Sr33 CC domains do not represent functional HR signaling units; this function is likely to be compromised by the truncation of the fourth α-helix, as seen in the secondary structure prediction. The Rx CC domain structure comprises an entire four-helix bundle; however, this region does not display autonomous cell death signaling in model plants.
Fig. 4 Homology modeling of CC domains compared with experimentally determined structural data. Initial homology models were generated for each of the CC domains previously analyzed in Figs. 1 and 2. For each of the CC domains, sequences from the distal N-terminus to the start of the NB-ARC domain (as predicted by Pfam) were used to generate the models. Only two templates were consistently selected for modeling by PHYRE2 (when excluding the MLA10 CC domain crystal structure, PDB: 3QFL). These were the NMR structure of the N-terminal domain of MLKL (PDB: 2MSV) for CC domains of the CC R subclass, and the crystal structure of the CARD of CED-4 (PDB: 2A5Y) for all other CC domains from the CC EDVID, CC CAN and I2-like subclasses. Models of the ADR1 and Sr33 CC domains were generated by one-to-one threading as representatives of the CC domain homology models based on the two templates, 2MSV and 2A5Y, using domain boundaries defined by secondary structure and cell death signaling capacity in planta. (A) The homology model of the ADR1 CC domain (right, in violet) is shown as a representative of the CC R subclass. Although sharing only 17% sequence identity with MLKL (structure on the left, shown in blue), the model generated covered 89% of the query sequence, modeling residues 13-146 (133 of 150 residues input) with 99.5% confidence. (B) The homology model of the Sr33 CC domain (right, shown in cyan), chosen as the representative of the CC EDVID, CC CAN and I2-like subclasses. The Sr33 CC domain shares 10% sequence identity with the CED-4 CARD (structure on the left, shown in green), and the homology model generated covers 61% of the query sequence, modeling residues 44-132 (88 of 144 residues input) with a confidence of 95.5%. (C) Left: superimposition of the CC domain homology model of ADR1 (violet) with the NMR structure of the Sr33 CC domain (red) using combinatorial extension. Of the 133 residues modeled, 96 residues of the ADR1 homology model could be superimposed on the Sr33 CC domain NMR structure with a root mean square deviation of 3.88 Å. This shows similarity in the overall fold, and suggests that the ADR1 CC homology model may represent a reasonable structure for this CC domain. Right: superimposition of the Sr33 CC domain homology model (cyan) with the NMR structure of the Sr33 CC domain (red) using combinatorial extension. The Sr33 homology model does not represent an accurate depiction of the Sr33 CC domain, as seen by the poor superimposition on the Sr33 NMR structure, with 56 of the 88 modeled residues superimposed with a root mean square deviation of 6.04 Å. This is despite the high confidence score assigned by PHYRE2 to the model.
As previously mentioned, other homology models of CC domains (all of the CC EDVID, CC CAN and I2-like subclasses) were generated using the CARD of CED-4 (PDB: 2A5Y) as a template by the prediction program. The structural folds of CARD domains, including that of CED-4 (Fig. 4B), are known as death domains (DDs), and these are most often observed as the signaling domains of proteins involved in animal immunity (including NLRs) and apoptotic pathways (Vajjhala et al. 2017). Although often highly divergent in amino acid sequence, DDs share a globular structure consisting of six antiparallel amphipathic helices that form a helical bundle.
DDs form homotypic interactions with other DD-containing proteins, and regularly assemble into larger oligomeric structures through induced proximity promoted by the oligomerization of the C-terminal domains, as seen in APAF-1 and CED-4 apoptosomes (Qi et al. 2010). The theoretical structural homology of the CC EDVID and CC CAN classes to DDs fits well with current models of plant NLR activation, in which oligomerization may be required for signal transduction by the N-terminal domains (Duxbury et al. 2016). While these observations are appealing, the homology models based on the CARD domain of CED-4 are problematic, and highlight the potential pitfalls of structural modeling. Using the Sr33 CC domain homology model as an example, CE-based superimposition of the model on the Sr33 CC NMR structure reveals that the homology model does not conform to the known structure (Fig. 4C). It is important to note that the confidence of the Sr33 CC domain homology model was high (95.5%), much like that of the ADR1 homology model; however, of the 88 residues that were modeled (44-132; input 1-144), only 56 could be aligned to the Sr33 NMR structure, with a root mean square deviation of 6.04 Å (Fig. 4C). Therefore, care must be taken if using confidence scores alone to assess model quality. For structural modeling of CC domains to be sufficient to guide functional studies, additional experimentally derived structures are required.
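To make the comparison above concrete, the following minimal Python sketch tabulates the values reported in the text and the Fig. 4 legend for the two representative models. It shows how query coverage and the fraction of modeled residues that superimpose on the experimental structure can be read alongside the PHYRE2 confidence score. The class name `ModelCheck` and its field names are hypothetical and chosen only for illustration; no quality threshold is implied, and the sketch does not reproduce any PHYRE2 or CE calculation.

```python
from dataclasses import dataclass


@dataclass
class ModelCheck:
    """Summary figures for one homology model (hypothetical container)."""
    name: str
    input_residues: int     # length of the query sequence submitted
    modeled_residues: int   # residues the server was able to model
    aligned_residues: int   # residues superimposed on the Sr33 CC NMR structure by CE
    rmsd_angstrom: float    # root mean square deviation over the aligned residues
    confidence_pct: float   # confidence score reported for the model

    @property
    def coverage(self) -> float:
        """Fraction of the query sequence covered by the model."""
        return self.modeled_residues / self.input_residues

    @property
    def aligned_fraction(self) -> float:
        """Fraction of modeled residues that could be superimposed."""
        return self.aligned_residues / self.modeled_residues


# Values taken from the text and the Fig. 4 legend.
adr1 = ModelCheck("ADR1 CC (template 2MSV)", 150, 133, 96, 3.88, 99.5)
sr33 = ModelCheck("Sr33 CC (template 2A5Y)", 144, 88, 56, 6.04, 95.5)


def summarize(m: ModelCheck) -> str:
    """Format one line of the comparison table."""
    return (
        f"{m.name}: confidence {m.confidence_pct:.1f}%, "
        f"coverage {m.coverage:.0%}, "
        f"superimposed {m.aligned_fraction:.0%} of modeled residues, "
        f"RMSD {m.rmsd_angstrom:.2f} A"
    )


for model in (adr1, sr33):
    print(summarize(model))
```

Both models carry confidence scores above 95%, yet the Sr33 model covers less of its query (61% versus 89%), superimposes over a smaller share of its modeled residues, and does so with a markedly higher root mean square deviation. This is the pattern that the confidence score alone does not reveal.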

Concluding Remarks
In this review we have summarized the current understanding of CC domain structure/function in plant CNLs, and discussed the limitations of CC domain classifications. We highlight the importance of careful consideration in defining CC domain boundaries prior to structural and functional analysis to avoid unintended loss of activity (e.g. cell death induction, ability to self-associate) and maximize biological relevance. From the work presented here, it is clear that significant gaps remain in our knowledge of how CC domains of plant CNLs are involved in transducing defense-related signaling in cells in response to pathogens. Further studies are required to understand how these domains contribute to disease resistance in some of the world's critical food crops.

Disclosures
The authors have no conflicts of interest to declare.