Non-homologous DNA end joining and alternative pathways to double-strand break repair

DNA double-strand breaks (DSBs) are the most dangerous type of DNA damage because they can result in the loss of large chromosomal regions. In all mammalian cells, DSBs that occur throughout the cell cycle are repaired predominantly by the non-homologous DNA end joining (NHEJ) pathway. Defects in NHEJ result in sensitivity to ionizing radiation and the ablation of lymphocytes. The NHEJ pathway utilizes proteins that recognize, resect, polymerize and ligate the DNA ends in a flexible manner. This flexibility permits NHEJ to function on a wide range of DNA-end configurations, with the resulting repaired DNA junctions often containing mutations. In this Review, we discuss the most recent findings regarding the relative involvement of the different NHEJ proteins in the repair of various DNA-end configurations. We also discuss the shunting of DNA-end repair to the auxiliary pathways of alternative end joining (a-EJ) or single-strand annealing (SSA) and the relevance of these different pathways to human disease.


Immunoglobulin heavy chain class switch recombination
The DNA recombination process by which the immunoglobulin heavy chain isotype is changed from producing IgM to producing IgG, IgA or IgE.

Microhomology
One or more base pairs of complementarity at the two DNA ends of a break.

Pol X family polymerases
Subfamily of DNA polymerases; based on homology it includes Pol β, Pol μ, Pol λ and terminal deoxynucleotidyltransferase (TdT).

BRCA1 C terminus
(BRCT). Protein domain of approximately 100 aa that binds to phosphoproteins that are often involved in the DNA damage response.

Artemis belongs to the metallo-β-lactamase family of nucleases, which are characterized by conserved metallo-β-lactamase and β-CASP domains (FIG. 2a). This family of nucleases can hydrolyse DNA or RNA in various configurations 10 . Artemis has intrinsic 5ʹ exonuclease activity on ssDNA, even without DNA-PKcs 11 . At duplex DNA ends, Artemis, in complex with DNA-PKcs, has endonuclease activity on both the 5ʹ and the 3ʹ DNA overhangs (which are often created at pathological DNA breaks) and on DNA hairpins that are formed during V(D)J recombination. This DNA hairpin opening process during V(D)J recombination specifically requires Artemis, and thus patients lacking Artemis suffer from severe combined immunodeficiency (SCID) owing to a V(D)J recombination defect in antigen receptor gene assembly 12,13 . Amino acids 402-403 of Artemis interact with the FAT domain of DNA-PKcs (FIG. 2b), whereas the carboxy-terminal region of Artemis (aa 454-458) interacts with its own amino-terminal catalytic domain (aa 1-7) to inhibit the endonuclease activity 14,15 . Artemis (aa 485-495) also interacts with the N-terminal region of DNA ligase IV 16,17 .
Of ionizing radiation-induced DSBs, 20-50% require Artemis for repair 18,19 . It is unclear whether the remaining DSBs have DNA-end configurations that can be joined without the benefit of any nuclease. Other nucleases that might contribute to the repair of these DSBs include aprataxin and PNKP-like factor (APLF; also known as PALF) [20][21][22] , the MRN complex (MRE11-RAD50-NBS1 (Nijmegen breakage syndrome protein 1; also known as nibrin)), CtBP-interacting protein (CtIP; also known as RBBP8), Werner syndrome ATP-dependent helicase (WRN), flap endonuclease 1 (FEN1) and exonuclease 1 (EXO1) 23 . The abundance and localization of these nucleases at DSB sites may determine which nucleases are responsible for the most resection at DSBs (see Supplementary information S1 (table) for a list of the known cellular abundance of human NHEJ and auxiliary repair proteins). But for the limited resection that occurs during most NHEJ events, the Artemis-DNA-PKcs complex seems to be the primary nuclease 8 .
The polymerases. DNA polymerase μ (Pol μ) and Pol λ are the two members of the Pol X family polymerases that are involved in NHEJ in humans 24,25 . These polymerases interact with Ku through their N-terminal BRCA1 C terminus (BRCT) domains 26 (FIG. 2c). Primary cells derived from mice with genetic knockouts of both Pol μ and Pol λ exhibit little or no sensitivity to ionizing radiation, although knockouts in cell lines can have limited deficit in DSB repair in some assays 27,28 . Both Pol μ and Pol λ can incorporate either dNTPs or rNTPs 24,25 , and any ribonucleotides that are incorporated are likely to be subsequently removed by base excision repair 29 . Both polymerases can incorporate in a template-dependent or a template-independent manner 27 , although Pol μ does the latter more than Pol λ 30,31 .

DNA end breathing
Breaking of the hydrogen bonds between one or more base pairs in the anti-parallel strands of the DNA duplex at a break.
Pol X family members also include Pol β and terminal deoxynucleotidyltransferase (TdT; also known as DNTT), but Pol β does not contain a BRCT domain to allow interaction with the Ku complex, and TdT is only expressed in early B lymphocytes and T lymphocytes during V(D)J recombination. DNA polymerases outside the Pol X family can incorporate nucleotides during NHEJ but only in a template-dependent manner [32][33][34][35][36] .
The ligase complex. DNA ligase IV and X-ray repair cross-complementing protein 4 (XRCC4) (FIG. 2d) are the most central components of NHEJ in eukaryotes 1 . XRCC4 stimulates DNA ligase IV enzyme activity in biochemical assays 37 . XRCC4-like factor (XLF; also known as Cernunnos in humans or Nej1 in yeast) is a 33 kDa protein with weak sequence homology and structural similarity to XRCC4. The N-terminal head domain of XLF interacts with the N-terminal head domain of XRCC4 (REF. 39), and the XRCC4-XLF complex forms a sleeve-like structure around a DNA duplex 41 . This proposed sleeve would presumably stabilize the positioning of the ends before covalent ligation, but this is still an area of active investigation. PAXX (paralogue of XRCC4 and XLF) is a recently discovered 22 kDa protein with structural similarity to XRCC4 and XLF 42,43 . The C terminus of PAXX (aa 199-201) interacts with Ku, and PAXX mutants are more sensitive to ionizing radiation and DSB-inducing agents 42,44,45 . It will be interesting to discover how PAXX participates in such a large assembly of other NHEJ proteins, especially in the context of chromatin.
Polynucleotide kinase, aprataxin and tyrosyl DNA phosphodiesterase 1. Additional proteins participate in NHEJ when the chemistry of the DNA ends requires further processing. For example, a 5ʹ end lacking a phosphate would require phosphorylation by polynucleotide kinase (PNK; also known as PNKP). Human PNK is also a phosphatase, which is important for removing 3ʹ phosphates that can arise from some types of oxidative damage 46 .
Ligase IV sometimes initiates but does not complete a covalent join, and this can result in the formation of an intermediate or an aborted ligation product in which an AMP group remains covalently bound to the 5ʹ end of one of the strands at the DSB. The enzyme aprataxin is required to remove the AMP group as part of a deadenylation reaction 47 . Both PNK and aprataxin bind to XRCC4 via their forkhead-associated domain (FHA), which is located near their N termini, but only after the kinase CK2 has phosphorylated XRCC4 (REF. 48).
Tyrosyl DNA phosphodiesterase 1 (TDP1) is the only identified enzyme that can specifically process 3ʹ-phosphoglycolates (3ʹ-PGs), which are by-products of ionizing radiation-induced DSBs at 3ʹ ends 49 . These 3ʹ-PGs are unligatable ends that can account for 10% of radiation-induced DSBs 50 . However, TDP1 mutants in human cells show only marginal radiosensitivity, suggesting that another enzyme could be involved in processing 3ʹ-PGs 51 .
End structure directs repair subpathway
Structural and biochemical studies support a model in which different sets of NHEJ proteins serve to align the two DNA ends in an end-to-end configuration (FIG. 3). One parameter that affects DNA end joining is how much transient base pairing can occur between the two DNA ends before joining or, in other words, the degree of microhomology between the ends. However, after several base pairs of DNA end breathing, any two DNA ends will share at least one nucleotide of homology that can be used for annealing, even if it is only by non-Watson-Crick base pairing 52 . Some DNA ends can be joined together using only the ligase complex, but other DNA ends require the action of polymerases or nucleases, which together form different NHEJ subpathways.
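The notion of terminal microhomology can be made concrete with a short sketch. It assumes two 3ʹ overhangs written 5ʹ to 3ʹ, counts only Watson-Crick pairs, and considers only annealing of the extreme 3ʹ tips; non-Watson-Crick pairing and end breathing, both mentioned above, are ignored. The function names are illustrative and do not come from the cited studies.

```python
# Illustrative sketch: longest terminal microhomology between two 3' overhangs.
# Assumptions (not from the Review): Watson-Crick pairing only, overhangs
# written 5'->3', and annealing restricted to the 3' tips of both overhangs.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def terminal_microhomology(overhang_a: str, overhang_b: str) -> int:
    """Length of the longest stretch in which the 3' tip of overhang_a can
    base pair, in antiparallel orientation, with the 3' tip of overhang_b."""
    max_k = min(len(overhang_a), len(overhang_b))
    for k in range(max_k, 0, -1):
        if overhang_a[-k:] == reverse_complement(overhang_b[-k:]):
            return k
    return 0


if __name__ == "__main__":
    print(terminal_microhomology("TTGATC", "AAGATC"))  # tips share GATC
    print(terminal_microhomology("AAAA", "CCCC"))      # no terminal pairing
```

A return value of 0 corresponds to ends that lack terminal microhomology under these simplified rules, which is the situation in which the text describes a greater reliance on Ku, polymerases or nucleases.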

Blunt-end ligation by Ku-XRCC4-DNA ligase IV.
Biochemical studies have demonstrated that NHEJ of blunt DNA ends lacking microhomology relies on Ku for efficient joining (FIG. 3a). By contrast, DNA ends joined using microhomology do not require Ku, indicating that Ku becomes more important the less the ends are able to form terminal base pairs 31 . Ku has a high affinity for DNA ends (Kd = 6 × 10⁻¹⁰ M) and can promote the binding of XRCC4-DNA ligase IV to the DNA ends 53 . The C terminus of DNA ligase IV contains two BRCT domains, which bind Ku 54 , and the region between the two BRCT domains also binds to a homodimer of XRCC4 (FIG. 2d). Thus, XRCC4 associates with DNA ligase IV in a 2:1 ratio, which could contribute to the bridging between the two DNA ends [55][56][57] . This Ku-XRCC4-DNA ligase IV complex is required for the efficient reconstitution of the NHEJ pathway using human proteins 58 . The addition of DNA-PKcs, Artemis and Pol μ does not further stimulate ligation, suggesting that the direct ligation of blunt ends is preferred over their processing (FIG. 3a).
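For a sense of scale, the quoted dissociation constant can be converted into an expected occupancy of DNA ends. The sketch below assumes simple one-site equilibrium binding with free Ku in excess over DNA ends, which is a simplification and not a claim made by the cited studies. The function name and the example concentrations are illustrative; only the Kd value is taken from the text.

```python
# Illustrative sketch: equilibrium occupancy of DNA ends by Ku.
# Assumption (not from the Review): one-site binding, free Ku >> DNA ends,
# so fraction bound = [Ku] / ([Ku] + Kd).

KD_KU_MOLAR = 6e-10  # Kd of Ku for DNA ends, as quoted in the text


def fraction_bound(free_ku_molar: float, kd_molar: float = KD_KU_MOLAR) -> float:
    """Fraction of DNA ends occupied at a given free Ku concentration."""
    if free_ku_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_ku_molar / (free_ku_molar + kd_molar)


if __name__ == "__main__":
    # Hypothetical free Ku concentrations, chosen only to show the curve.
    for conc in (6e-11, 6e-10, 6e-9, 6e-8):
        print(f"[Ku] = {conc:.0e} M -> fraction bound = {fraction_bound(conc):.3f}")
```

Under these assumptions half of the ends are occupied when free Ku equals the Kd, and occupancy approaches saturation once free Ku exceeds the Kd by one to two orders of magnitude.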
The relatively high efficiency of blunt-end ligation by human Ku and XRCC4-DNA ligase IV contrasts with in vivo observations in Saccharomyces cerevisiae, in which blunt-end joining is inefficient 59,60 . However, it is possible that such inefficiency in yeast could be the consequence of more aggressive DNA end resection that exposes long 3ʹ overhangs in preparation for homologous recombination (HR). The 6.6 Å and the more recent 4.3 Å crystal structures of DNA-PKcs raise the possibility of dimerization of DNA-PKcs, and one could speculate that this contributes to bridging of the two DNA ends before ligation 61,62 . DNA is not present in these crystal structures, and thus one can only speculate about its location. The ligation of DNA ends with only Ku and XRCC4-DNA ligase IV provides biochemical evidence that DNA end bridging is not reliant on DNA-PKcs or on NHEJ factors other than Ku and XRCC4-DNA ligase IV 5 . It is clear that signal joint formation during V(D)J recombination also does not require any NHEJ proteins other than Ku and XRCC4-DNA ligase IV 1 , and this is consistent with the biochemistry of blunt end ligation (FIG. 3a).
Nuclease-dependent subpathways. DNA-PKcs weakly interacts with DNA but its binding increases 100-fold when Ku is present 63 . The FAT domain of DNA-PKcs binds to the C terminus (aa 718-732) of Ku80 (REF. 64) (FIG. 2a). One of the major roles of DNA-PKcs is to interact with and activate the endonuclease activity of Artemis at DNA ends. DNA-PKcs autophosphorylation upon binding of the DNA end activates Artemis endonuclease activity 6 . DNA-PKcs also phosphorylates the C-terminal domain of Artemis 65 (FIG. 2b). It is likely that autophosphorylated DNA-PKcs promotes the dissociation of the C-terminal inhibitory region of Artemis (aa 454-458) from the N-terminal catalytic domain (aa 1-7) of Artemis (FIG. 2b). Once activated, Artemis removes 5ʹ and 3ʹ DNA overhangs to create DNA end structures that can be ligated by the XRCC4-DNA ligase IV complex 26,66 . At 5ʹ overhangs, Artemis directly cuts at the ss-dsDNA boundary (FIG. 3b). However, when processing 3ʹ overhangs and DNA hairpins, Artemis preferentially leaves a 4-nucleotide 3ʹ overhang (FIG. 3c). DNA hairpins are structurally similar to DNA overhangs, because they have a sterically constrained hairpin tip that results in only transient base pairing of the terminal base pairs (4 nucleotides), thereby creating a ss-ds boundary 67 . From these observations, Artemis activity on duplex DNA can be explained using a model in which Artemis-DNA-PKcs binds to the ss-dsDNA boundary to occupy 4 nucleotides along the single-stranded segment at the boundary 8 . This binding is followed by nicking on the 3ʹ side of the 4 nucleotides.
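The preferred cutting outcomes described above can be summarized as a small rule table. The sketch below encodes only the two stated preferences (5ʹ overhangs are cut at the ss-dsDNA boundary; 3ʹ overhangs are trimmed to 4 nucleotides). The handling of 3ʹ overhangs of 4 nucleotides or fewer is an assumption made for illustration, as the text does not address that case, and the class and function names are hypothetical.

```python
# Illustrative sketch of the preferred Artemis-DNA-PKcs cutting outcomes.
# Assumption (not from the Review): 3' overhangs of <= 4 nt are left untrimmed.
from dataclasses import dataclass

PREFERRED_3P_OVERHANG_NT = 4  # length of the 3' overhang preferentially left


@dataclass(frozen=True)
class CutProduct:
    end_type: str        # "blunt" or "3'"
    overhang_nt: int     # overhang length remaining after the cut
    removed_nt: int      # nucleotides removed from the overhang


def preferred_artemis_product(polarity: str, overhang_nt: int) -> CutProduct:
    """Return the preferred product for a 5' or 3' overhang of given length."""
    if overhang_nt < 0:
        raise ValueError("overhang length must be non-negative")
    if polarity not in ("5'", "3'"):
        raise ValueError("polarity must be \"5'\" or \"3'\"")
    if overhang_nt == 0:
        return CutProduct("blunt", 0, 0)
    if polarity == "5'":
        # Cut directly at the ss-dsDNA boundary: the whole overhang is removed.
        return CutProduct("blunt", 0, overhang_nt)
    if overhang_nt <= PREFERRED_3P_OVERHANG_NT:
        # Assumed: nothing to trim when the overhang is already short.
        return CutProduct("3'", overhang_nt, 0)
    return CutProduct("3'", PREFERRED_3P_OVERHANG_NT,
                      overhang_nt - PREFERRED_3P_OVERHANG_NT)


if __name__ == "__main__":
    print(preferred_artemis_product("5'", 6))
    print(preferred_artemis_product("3'", 10))
```

These are preferences rather than fixed outcomes; real cut positions vary around the preferred site.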
In addition to stable ss-dsDNA boundaries, Artemis acts at blunt DNA ends that breathe to an open state, thereby forming transient ss-ds boundaries 8 . Such blunt DNA ends may be generated by chemotherapeutic agents, reactive oxygen species or ionizing radiation 68 . A more comprehensive version of this model, which can explain the essential structural features of all the DNA substrates at which Artemis functions, including blunt DNA ends (transient ss-ds boundaries), proposes that Artemis recognizes all ss-dsDNA boundaries through putative contact points in the duplex DNA that are either adjacent to the 5ʹ or 3ʹ overhang, or adjacent to the hairpin 9 . To achieve hydrolysis of the phosphodiester backbone, the Artemis active site would then act within the single-stranded portion of the overhang or the hairpin. Although this model must await the elucidation of a DNA-Artemis co-crystal, it does explain all the known cutting patterns of Artemis.
Although the role of Artemis in V(D)J recombination is well characterized, its role in NHEJ is less clear. One role of Artemis in NHEJ is when ionizing radiation-induced DSBs have a 3ʹ-PG terminus [69][70][71] . These DNA ends are unable to undergo ligation because this step requires a 3ʹ hydroxyl on one end and a 5ʹ phosphate on the other. As discussed above, TDP1 can remove these 3ʹ modifications. However, TDP1-mutant cells are only marginally radiosensitive; Artemis mutants, however, are sensitive to ionizing radiation, and therefore it is likely that Artemis is involved in removing the damaged strand (FIG. 3e). Indeed, it has been shown biochemically that the Artemis-DNA-PKcs complex is able to process these ends 72,73 . More recently, the C-terminal region of Artemis (aa 485-495) has been shown to interact with the N-terminal head domain of ligase IV 16,17,74 (FIG. 2b). This interaction may promote Artemis activity by recruiting Artemis to 3ʹ overhangs through the DNA ligase IV binding 5 .
Biochemical reconstitution of NHEJ with purified proteins has shown that Artemis resects 5ʹ and 3ʹ overhangs to generate regions of microhomology for NHEJ to occur 5 . In instances in which the overhangs have the potential for microhomology after partial resection of the overhang, the endonuclease activity of Artemis exposes the nucleotides within a stretch of ssDNA (FIG. 3c). However, in instances in which there are no regions of substantial microhomology in the overhangs, the Artemis-DNA-PKcs complex often resects into the duplex to generate overhangs that expose microhomology 5 . Interestingly, Artemis-DNA-PKcs does not strongly stimulate the ligation of blunt-ended DNA. This suggests that, even though Artemis-DNA-PKcs is able to resect at blunt ends, these ends are usually joined directly without resection 5,8 (FIG. 3a). By contrast, the ligation of incompatible overhangs is strongly stimulated by the presence of the Artemis-DNA-PKcs complex, which is probably recruited to the DNA end only when resection is required.
Polymerase-dependent subpathways. Pol μ and Pol λ are recruited to the DNA end by interaction of their N-terminal BRCT domain with the Ku-DNA complex 26 (FIG. 2c). Pol μ primarily has template-independent polymerase activity, whereas Pol λ primarily has template-dependent polymerase activity (fill-in synthesis) 30 . This difference in activity is due to structural variations of these polymerases in a region known as loop 1 (REF. 75). This loop is structurally flexible and provides hydrogen bonding with the DNA template strand, allowing Pol μ to add nucleotides without an actual template.
In reactions that involve only the Ku-XRCC4-DNA ligase IV complex, Pol μ strongly promotes the ligation of incompatible 3ʹ overhangs 31 . At these overhangs, Pol μ can add nucleotides in a template-independent manner, generating regions of microhomology for subsequent base pairing and ligation 31 . Pol μ is also required for the joining of two DNA substrates with short (1 nucleotide or 2 nucleotides) incompatible 3ʹ overhangs 28 . In biochemical reactions involving DNA-PKcs and Artemis, Pol μ strongly stimulates the joining of two mismatched 3ʹ overhangs by promoting the formation of terminal microhomology 5 (FIG. 3d). Sequences at the resulting junctions reveal nucleotides that represent template-independent nucleotide addition by Pol μ, as well as an absence of nucleotide resection by Artemis-DNA-PKcs 5 .
Pol λ primarily promotes the ligation of terminally compatible overhangs that require fill-in synthesis 28,31 . These situations arise when opposing DNA ends terminally base pair but leave a gap that needs to be filled in before ligation. Unsurprisingly, Pol λ has little effect on NHEJ of completely mismatched 3ʹ overhangs because these overhangs do not provide a template strand 5 .
Ligation by the XLF and PAXX subpathways. XLF and PAXX are the most recently characterized NHEJ factors that have been shown to support ligation by the ligase IV complex. Both XLF and PAXX have structural similarity to XRCC4 (REFS 39,42). XLF forms homodimers and its head domain binds to XRCC4 (REF. 76). The XLF head domain also interacts with the Ku-DNA complex 77 . PAXX also forms homodimers, and its C terminus has been found to associate with Ku 42,43 (FIG. 2d). XLF and PAXX promote NHEJ of a subset of DNA ends that require maximal stabilization by the ligase complex. In biochemical reactions involving only Ku and the XRCC4-DNA ligase IV complex, XLF was shown to only stimulate the ligation of short incompatible 3ʹ overhangs 31 . However, in another study involving Ku, DNA-PKcs and XRCC4-DNA ligase IV, XLF was shown to promote the ligation of all mismatched and noncohesive overhangs 78 . This difference may be partly due to the dependence of XLF on the length of the dsDNA present, as ~70 bp fragments were used in the first study mentioned above compared with >3 kb linearized plasmids in the second study. Alternatively, DNA-PKcs could be interfering with XLF interactions.
PAXX was shown to promote the ligation of two blunt ends in reactions involving only Ku and the XRCC4-DNA ligase IV complex 42 (FIG. 3a). PAXX may also promote the ligation of a blunt end to a 3ʹ overhang in reactions involving Ku, XLF and the XRCC4-DNA ligase IV complex 43 . However, a more recent biochemical study that also included Artemis and Pol μ failed to demonstrate this effect, but did show that XLF and PAXX stimulate the NHEJ of 5ʹ incompatible overhangs 5 (FIG. 3b). These data suggest that the role of XLF and PAXX may be to help to stabilize Ku along with other NHEJ proteins at a DNA end under conditions in which terminal microhomology is not available.
Inactivation of XLF and PAXX together is synthetic lethal in mice and reduces V(D)J recombination in human B lymphocytes [79][80][81][82] . These data suggest a possible redundant role of XLF and PAXX; however, additional roles may become apparent when DNA substrates can be studied in the context of chromatin.

Shunting to auxiliary pathways
When NHEJ is compromised owing to the lack of one or more of its key protein components, the activity of the other end joining pathways becomes apparent. These pathways typically involve much more extensive resection of the DNA ends to reveal sequence homology, the annealing of which stabilizes the two ends of a break to allow for more efficient joining and ligation 1 . The a-EJ pathway 83 (FIG. 4) (also known as microhomology-mediated end joining and Pol θ-mediated end joining) requires microhomology that ranges between 2 bp and 20 bp. At the low end of this range, a-EJ overlaps with NHEJ, which usually requires ≤4 bp of microhomology; nonconservative homology-directed repair pathways (which involve the loss of nucleotides), such as SSA, require >20 bp of homology 33,34,84,85 (FIG. 5). The conservative HR pathway (in which no nucleotides are lost) generally requires lengths of homology longer than 100 bp (HR is beyond the scope of this Review and is discussed in detail elsewhere [86][87][88] ).
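The homology ranges quoted above overlap, which a short sketch makes explicit. It maps a homology length to the pathways whose typical range includes it, using only the thresholds given in the text. These thresholds are approximate ("usually", "generally"), and actual pathway choice also depends on Ku binding, resection and the cell cycle, none of which is modelled here. The function name is illustrative.

```python
# Illustrative sketch: pathways whose typical homology range includes a
# given length. Thresholds are those quoted in the text and are approximate;
# this is not a predictor of pathway choice in cells.


def pathways_compatible_with(homology_bp: int) -> list:
    """Return the repair pathways whose typical homology usage covers homology_bp."""
    if homology_bp < 0:
        raise ValueError("homology length must be non-negative")
    pathways = []
    if homology_bp <= 4:          # NHEJ: usually <= 4 bp (0 bp is allowed)
        pathways.append("NHEJ")
    if 2 <= homology_bp <= 20:    # a-EJ: 2-20 bp of microhomology
        pathways.append("a-EJ")
    if homology_bp > 20:          # SSA: > 20 bp of homology
        pathways.append("SSA")
    if homology_bp > 100:         # HR: generally > 100 bp
        pathways.append("HR")
    return pathways


if __name__ == "__main__":
    for length in (0, 3, 10, 21, 150):
        print(length, pathways_compatible_with(length))
```

The overlap at 2-4 bp reflects the statement that a-EJ and NHEJ share the low end of the microhomology range.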

Templated insertions
Nucleotide additions at a double-strand break repair junction that seem to be direct or inverted repeat copies derived from either strand of either of the two DNA ends.
Extensive DNA-end resection is prevented by Ku 89 . The high abundance of Ku in cells (Supplementary information S1 (table)) increases the likelihood that Ku is the first protein to bind to a broken DNA end and, therefore, that repair is carried out through NHEJ (FIG. 4). There is evidence that the DNA damage response protein p53-binding protein 1 (53BP1; Rad9 in S. cerevisiae) acts as an antagonist to end resection, along with replication timing regulatory factor 1 (RIF1) 90 . 53BP1 and mediator of DNA damage checkpoint protein 1 (MDC1) are recruited to DSBs through several modified histone residues and seem to have distinct roles in DSB repair 86,91,92 . Further work is required to elucidate specifically how 53BP1 recruitment inhibits extensive end resection. Overcoming this barrier to resection, however, is the first step to enable either a-EJ or SSA.
Alternative end joining. Given the rarity of humans with NHEJ mutations, it is unclear whether a-EJ represents a standing pathway or whether the components of the pathway usually serve other functions in dsDNA processing, such as in replication, recombination or repair, and only become involved in end joining when NHEJ is compromised. Importantly, a-EJ requires Pol θ [93][94][95][96][97][98][99] and may also include poly(ADP-ribose) polymerase 1 (PARP1), CtIP and the MRN complex [100][101][102][103] . The endonuclease function of MRN, which is stimulated by phosphorylated CtIP, seems to initiate a-EJ by processing DNA ends to generate 15-100-nucleotide 3ʹ overhangs (FIG. 4). MRN proteins are considerably less abundant than Ku (Supplementary information S1 (table)) and are therefore less likely to bind to dsDNA ends, suggesting a limited role for MRN in end joining when Ku is present 104 . However, in vitro studies demonstrated that the endonuclease function of MRN may remove protein adducts from the ends of DNA, suggesting a possible role for MRN and CtIP in a subset of reactions in NHEJ 105,106 . PARP1 is an ADP-ribosylating enzyme that is involved in sensing DNA damage and promoting the a-EJ pathway 107 .
Cells with mutations in both NHEJ proteins (to allow detection of a-EJ) and Pol θ have a marked reduction in a-EJ to nearly undetectable levels [93][94][95][96][97][98][99] . Pol θ is encoded by the POLQ gene and belongs to the A family of DNA polymerases. It has a C-terminal polymerase domain and, uniquely among DNA polymerases, an N-terminal helicase-like domain 94,108,109 . Pol θ has been shown to stabilize the annealing of two long 3′ ssDNA overhangs (often known as 3′ tails) with as little as 2 bp of homology, extending one 3ʹ DNA end by using the annealing partner as a template 95 . This creates a more stable annealed intermediate that can be sealed by either DNA ligase I or DNA ligase III. The polymerase activity of Pol θ probably prevents further extensive resection of ends, thereby minimizing the potential formation of large deletions by SSA. Pol θ also has terminal transferase activity and thus can add nucleotides to provide microhomology that is not already present 97 .
A subset of Pol θ-mediated end joining products includes templated insertions 95,99,109 . Some NHEJ templated insertions also arise owing to the activity of the error-prone polymerases, Pol μ and Pol λ 26 , but it seems that Pol θ creates longer (>10 nucleotides) templated insertions, which initiate from a short length (often 2 or more bp) of microhomology 95,99,109 . Short templated insertions (usually <10 nucleotides, and not necessarily associated with microhomology) are also seen in some normal murine lymphoid V(D)J recombination junctions 110 and in a substantial proportion (20-50%) of human lymphoid translocations. These are likely to be mediated by Pol μ or Pol λ [111][112][113][114] . The junctional sequences in a large majority of human lymphoid translocations are most consistent with NHEJ, and the hairpin opening at the D or J coding ends is clearly mediated by Artemis-DNA-PKcs-Ku 111 . If Pol θ rather than Pol μ or Pol λ is responsible for the longest (>10 nucleotides) templated insertions, it is possible that Pol θ could modify some of these DNA ends in the context of NHEJ, but this possibility must be investigated. Experiments in Drosophila melanogaster predicted much of what is now being discovered in mammalian systems regarding Pol θ 96,109 . It is important to note that D. melanogaster does not have Pol μ or Pol λ, and it will be interesting to determine which templated insertions are carried out by which polymerase in mammalian cells.
Pol θ can also function when the annealed microhomologies are embedded within the long 3ʹ ssDNA tails that are generated by extensive resection [94][95][96][97][98] . This would create non-homologous 3ʹ ssDNA tails that would need to be removed before extension by Pol θ. Therefore, nuclease activities from other repair pathways may be utilized during a-EJ. For example, the xeroderma pigmentosum group F (XPF)-ERCC1 nuclease complex, APLF or Artemis-DNA-PKcs could conceivably be used.
It is possible that a-EJ is slower than NHEJ. For example, in immunoglobulin class switch recombination, when DNA ligase IV is missing, DNA ligase I or DNA ligase III can substitute for DNA ligase IV, but with approximately tenfold slower kinetics 101,102 . This illustrates that even the most central NHEJ proteins such as ligase IV have back-up components, but that these back-up enzymes function with slower kinetics and lower repair efficiency. The slower end joining repair kinetics could be due to a requirement for more resection to reveal additional microhomology to stabilize the junction before the final ligation step. a-EJ junctional sequences in humans have microhomology lengths that are usually >4 bp, and often >10 bp (REF. 115) (FIG. 5). This observation suggests that Pol θ is active after the resection by a nuclease. Future work will help to identify all the components of a-EJ and explain how a-EJ is distinct from NHEJ 116 . The kinetics of repair by these various pathways is also an important factor to consider. In addition, the ataxia telangiectasia mutated (ATM)-mediated DNA damage response may be important for the balance of NHEJ and a-EJ, given that the absence of ATM favours NHEJ 117 . It is also important to note that humans who do not have major NHEJ components are exceedingly rare. Therefore, the a-EJ proteins and enzymes may have functions other than merely as substitutes for an absence of NHEJ that is too lethal to be usually found in mammals.
Single-strand annealing. a-EJ has more in common with SSA than it does with NHEJ as both a-EJ and SSA require extensive resection to reveal microhomology. By contrast, NHEJ often uses 1-4 bp of microhomology but this is not a requirement 5 . Neither the a-EJ pathway nor the SSA pathway is reliant on Ku, and the binding of Ku to DNA ends may need to be attenuated for a-EJ and SSA to proceed. These pathways also rely on the initiation of extensive resection by the MRN complex and CtIP, which generate 15-100-nucleotide 3ʹ ssDNA tails. It is at this point that the a-EJ and SSA pathways diverge. In a-EJ, annealing of microhomology seems to be sufficient for Pol θ to extend one of the DNA strands to stabilize the intermediate for ligation.
SSA requires the exposure of more sequence homology; therefore, more extensive resection is required 118 . The 3ʹ ssDNA tails created by MRN and CtIP are further extended by the action of the nuclease EXO1 or Bloom syndrome RecQ-like helicase (BLM) or DNA replication helicase/nuclease 2 (DNA2) (acting as part of a complex) to generate longer 3ʹ ssDNA tails 119,120 (FIG. 4). The long 3ʹ ssDNA tails do not remain exposed, but are bound by multiple copies of the replication protein A (RPA) complex, the components of which are abundant within the cell (Supplementary information S1 (table)). RPA forms a filament on the ssDNA to prevent the formation of secondary structures. During HR, the RecA homologue RAD51 replaces RPA to allow for homology search and strand invasion 121 (FIG. 4). BRCA1, BRCA2 and RAD54 may have a role in promoting HR, but this is beyond the scope of this Review. SSA, however, is a RAD51-independent mechanism that generally depends on the presence of 3ʹ ssDNA tails that share suitable sequence homology to form a stable annealing intermediate. The promiscuity of joining partners is what makes SSA non-conservative and prone to generating deletions and translocations. The annealing of complementary ssDNA tails is mediated by the strand annealing protein RAD52 (FIG. 4). Before ligation, the unannealed, non-homologous portions of the 3ʹ ssDNA tails must be processed and removed. In this case, SSA uses the nucleotide excision repair complex XPF-ERCC1 (FIG. 4) and the mismatch repair complex MSH2-MSH3, further highlighting the overlap among repair pathways 85,122,123 .
The influence of the cell cycle. Although binding of DNA ends by Ku inhibits extensive resection by MRN and CtIP, and favours repair by NHEJ, extensive resection is also dependent on the cell cycle owing to the action of cyclin-dependent kinases (CDKs) 124,125 . Factors that promote extensive end resection are more active during S and G2 phases, favouring HR when a sister chromatid is present. This is another reason why repair by NHEJ is dominant throughout the cell cycle, whereas repair by HR and SSA is favoured in S and G2 phases. Targets of CDKs include the DNA damage response checkpoint proteins ATM and ataxia telangiectasia and Rad3-related (ATR), as well as enzymes that promote extensive resection 125 . For example, CDK2 phosphorylates CtIP at Thr847 (REF. 126), and phosphorylated CtIP may form a complex with BRCA1 and MRN in

Box 1 | Non-homologous end joining and human diseases
Non-homologous end joining (NHEJ) is not the cause of DNA double-strand breaks (DSBs). Rather, DNA breakage occurs owing to a variety of causes, and NHEJ simply restores chromosomal structure, usually with the loss of a few nucleotides from one DNA end or both ends 1 . The role of NHEJ in repairing DSB sites during the formation of human chromosomal translocations has recently been discussed elsewhere 111 . NHEJ is the dominant pathway for the joining phase during chromosomal translocations in human cells 111,135 , although this may be different in murine cells [136][137][138] . The contribution of alternative end joining (a-EJ) to disease, including chromosomal translocations, has not yet been proved or, at least, fully evaluated, except in cases in which there is an existing mutation in another major DSB repair pathway 116 .
Spontaneous mutations in NHEJ proteins are exceedingly rare in humans 139 . Mutations in Artemis arise in cases of consanguinity, especially in Athabascan-speaking Native Americans 140 . Artemis mutations can have a range of phenotypes from deficiency in antibody production to severe combined immunodeficiency (SCID) owing to deficiency in V(D)J recombination, as discussed in detail elsewhere 139,141 . The same range of Artemis mutations also has a corresponding range of effects on patient responses to therapeutic ionizing radiation 141,142 .
Rare mutations in DNA ligase IV also have a range of phenotypic severity 142 . Mutations in other components of the ligase IV complex, such as in X-ray repair cross-complementing protein 4 (XRCC4) and XRCC4-like factor (XLF), also have widely varying degrees of severity, including immunodeficiency, progeria-like features, microcephaly, growth retardation, autoimmunity and infections 83,139,143,144 . A subset of human mutations in XRCC4 can cause dwarfism without causing immunodeficiency 83 . Mutations in DNA-dependent protein kinase catalytic subunit (DNA-PKcs) have been described in only a few patients, and cause SCID, radiosensitivity and varying related abnormalities [145][146][147] .
Within an individual, repair in somatic cells is relevant to both cancer and ageing. But it is also interesting to consider the role of DSBs on a broader scale. On a population level and in regard to inherited genomic changes (such as in inherited disorders), meiotic cells are the relevant cell type in which DSB repair may lead to chromosomal changes. In somatic cells, NHEJ occurs 50-fold more frequently than DNA polymerase θ (Pol θ)-mediated a-EJ 95 . By contrast, in meiotic cells (based on the sequences at repaired breaks in inherited human disorders), Pol θ-mediated a-EJ and single-strand annealing may account for nearly as many joining events as NHEJ 148,149 .
Pol θ activity may be important in tumours that are deficient in homologous recombination 116 . Based on the template switching that has been observed in some studies 150 , we speculate that Pol θ activity may also be responsible for end joining in a subset of chromothripsis events that occur in a minority of genomes in human neoplasms, whereas NHEJ may be responsible for end joining of another subset of chromothripsis junctions.

Chromothripsis
Shattering of chromosomal regions followed by random repair of the DNA fragments in some human neoplasms and inherited disorders.

a process that is important for removing the inhibitory 53BP1 from histones near the DNA ends, thereby allowing for longer resection 90,91,127,128 . In addition, CDK control over Dna2 in yeast and EXO1 in humans further limits the extent of end resection that can occur outside of S and G2 (REFS 118,129,130). In a recent study, DNA-PKcs phosphorylation of ATM was shown to contribute to the regulation of pathway choice between NHEJ and HR 131 .
Therefore, in G1 phase, NHEJ is favoured by more than 50-fold for the repair of DSBs owing to both the level of Ku and the suppression of the extensive end resection that is mediated by CtIP and MRN. Even in S and G2 phases, when extensive end resection can take place, the resection machinery must still overcome the presence of Ku at DNA ends either by outcompeting Ku for DNA-end binding or by processing the DNA ends to the point at which Ku binding is less favoured. The ratio of NHEJ to HR in wild-type mammalian somatic cells, even during S phase and G2 phase, is estimated to be 4:1 (REF. 132).
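As a rough illustration of what these ratios imply, the sketch below converts a pathway ratio into the share of repair events handled by the favoured pathway. It is only arithmetic under simplifying assumptions that are not in the text: each comparison is treated as a two-pathway competition (a-EJ and SSA are ignored), and "more than 50-fold" is taken at its lower bound of exactly 50:1. The function name `share_from_ratio` is hypothetical.

```python
from fractions import Fraction


def share_from_ratio(favoured: int, other: int = 1) -> Fraction:
    """Share of events handled by the favoured pathway for a favoured:other ratio."""
    return Fraction(favoured, favoured + other)


# S and G2 phases: NHEJ:HR estimated at 4:1 (two-pathway simplification).
s_g2_nhej = share_from_ratio(4)

# G1 phase: NHEJ favoured by more than 50-fold; 50:1 is used as a lower bound.
g1_nhej = share_from_ratio(50)

print(f"S/G2 NHEJ share: {float(s_g2_nhej):.0%}")
print(f"G1 NHEJ share at exactly 50:1: {float(g1_nhej):.1%}")
```

Under these assumptions, a 4:1 ratio corresponds to NHEJ handling four of every five breaks, and a 50:1 ratio to 50 of every 51. Neither figure accounts for the substrate-accumulation effects that arise when one pathway is absent.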
If Ku is absent (which is exceedingly rare in normal human tissues, as well as in neoplastic human tissues), a-EJ may be favoured over SSA in G1 phase, owing to the limited amount of resection that a-EJ involves. It remains to be determined what dictates the use of a-EJ versus SSA in S and G2 phases. However, time is likely to be a key determinant because the longer a DSB remains unrepaired, the more end processing can occur to generate longer 3ʹ ssDNA tails to favour SSA. Finally, quantification of the relative ratio of various pathways is complicated because the absence of one pathway results in the accumulation of substrate for other pathways to a level that may not reflect the situation in wild-type cells.
A recent study examined how resection differences in G1 versus G2 phase relate to NHEJ and end joining pathway choice, but this study used doses of ionizing radiation that were sufficient to induce non-physiological levels of ~200 DSBs per cell 133 . The authors also fused S/G2-arrested cells to asynchronous cells, examined the G1 subset and inferred that several HR enzymes might also be involved in NHEJ. It will be interesting to discover how these findings relate to the more typical situation in cells in which far fewer DSBs are present. In any event, the study highlights the complex relationship between end resection and pathway choice and reiterates that many key questions have yet to be fully answered.

Conclusions and future perspectives
Recent biochemical and genetic studies have provided clearer mechanistic insights into which NHEJ proteins are used depending on the DNA end configuration. For the joining of two blunt DNA ends, Ku and XRCC4-DNA ligase IV are sufficient, and the addition of other NHEJ proteins does not substantially improve the joining. By contrast, the joining of DSBs that require nuclease or polymerase activity is more dependent on Artemis-DNA-PKcs and the Pol X polymerases. When NHEJ is missing key components, as is the case in rare human genetic disorders (BOX 1) or in experimental animal models, a-EJ becomes increasingly important in somatic cells. The potential roles of a-EJ in meiosis, in the rare chromothripsis events in tumour cells, or in other DSB transactions in somatic cells (such as random integration of exogenous DNA), are areas for future study.
Many questions remain. The first set of questions relates to DNA repair pathway choice. How much variation exists between different somatic cells or between mitotic versus meiotic cells in the use of NHEJ or in pathway choice? Ku is abundant and can thread onto dsDNA ends despite ssDNA overhangs that are longer than 20 nucleotides 134 . What other proteins affect repair pathway choice? Could pathway choice simply be stochastic? Could it be determined purely by the relative abundance of the proteins in the competing pathways? Could it be determined by the number of DSBs present in the cell? Which pathways contribute to chromothripsis in tumour cells and the formation of heritable chromosomal rearrangements in meiotic cells?
The second set of questions involves evaluating the contribution of Pol θ to NHEJ. Can Pol θ participate in a small subset of NHEJ events, in addition to its involvement in a-EJ? Many lymphoid chromosomal translocation junctions have templated insertions, which have been thought to be generated by Pol μ or Pol λ. Could some of these insertions actually be generated by Pol θ? Ku does not bind (or recruit) Pol θ, but does Ku obstruct Pol θ? How are non-homologous 3ʹ DNA tails removed before extension by Pol θ? Without such removal, the a-EJ pathway cannot proceed. Conversely, are all templated insertions generated by Pol θ, or are some generated by Pol μ or by Pol λ? The answer to this final question will determine whether some junctions form with the participation of both NHEJ proteins and this key protein of a-EJ.