Fine-Scale Recombination Maps of Fungal Plant Pathogens Reveal Dynamic Recombination Landscapes and Intragenic Hotspots

Meiotic recombination is an important driver of evolution. Variability in the intensity of recombination across chromosomes can affect sequence composition, nucleotide variation, and rates of adaptation. In many organisms, recombination events are concentrated within short segments termed recombination hotspots. The variation in recombination rate and positions of recombination hotspot can be studied using population genomics data and statistical methods. In this study, we conducted population genomics analyses to address the evolution of recombination in two closely related fungal plant pathogens: the prominent wheat pathogen Zymoseptoria tritici and a sister species infecting wild grasses Z. ardabiliae. We specifically addressed whether recombination landscapes, including hotspot positions, are conserved in the two recently diverged species and if recombination contributes to rapid evolution of pathogenicity traits. We conducted a detailed simulation analysis to assess the performance of methods of recombination rate estimation based on patterns of linkage disequilibrium, in particular in the context of high nucleotide diversity. Our analyses reveal overall high recombination rates, a lack of suppressed recombination in centromeres, and significantly lower recombination rates on chromosomes that are known to be accessory. The comparison of the recombination landscapes of the two species reveals a strong correlation of recombination rate at the megabase scale, but little correlation at smaller scales. The recombination landscapes in both pathogen species are dominated by frequent recombination hotspots across the genome including coding regions, suggesting a strong impact of recombination on gene evolution. A significant but small fraction of these hotspots colocalize between the two species, suggesting that hotspot dynamics contribute to the overall pattern of fast evolving recombination in these species.

on large numbers of individuals and produce only low-resolution rate estimates because of the relatively low number of meiotic events that can practically be observed (Stumpf and McVean 2003). Furthermore, many microbial eukaryotic species, including important pathogens, are difficult or even impossible to cross under laboratory conditions (Taylor et al. 2015). While experimental measures of recombination rate can be challenging in many species, advances in statistical analyses provide powerful tools to generate fine-scale recombination maps using population genomic data (e.g., Myers et al. 2005;Chan et al. 2012;Wang and Rannala 2014). These methods are based on genome-wide patterns of linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs) and have the potential to capture the history of recombination events in a population sample. Thus, recombination studies based on population genomic data have provided detailed insights into the genomics of recombination in a range of species (Winckler et al. 2005;Horton et al. 2012;Singhal et al. 2015;Hunter et al. 2016). In many organisms, but not all, the majority of recombination events tend to concentrate in short segments termed recombination hotspots (Petes 2001;Chan et al. 2012). In the human genome, .25,000 recombination hotspots have been identified, with a number of them showing a .100-fold increase in recombination rates and exhibiting a strong impact on the overall recombination landscape and genome evolution in general Winckler et al. 2005;Jeffreys and Neumann 2009).
Comparative analyses of recombination maps between closely related species have shed light on the dynamics of recombination landscapes in different taxa. A comparative analysis of recombination landscapes of chimpanzees and humans found a strong correlation of recombination rates at broad scales (whole-chromosome and megabase scale), whereas fine-scale recombination rates were considerably less conserved because of nonoverlapping recombination hotspots (Auton et al. 2012). The localization of recombination hotspots in primates and mice is in large part determined by PRDM9, a histone methyltransferase with an array of DNA-binding, Zn-finger domains (Myers et al. 2010). In some species-including species without PRDM9 such as yeast, plants, birds, and some mammals-recombination hotspots associate with particular functional features such as transcription start and stop sites as well as CpG islands (Horton et al. 2012;Choi et al. 2013;Lam and Keeney 2015;Singhal et al. 2015;Smeds et al. 2016). A model developed to explain the association of recombination hotspots and functional elements proposes that a depletion of nucleosome occupancy at these sites increases the accessibility of the recombination machinery (Kaplan et al. 2009;de Castro et al. 2012). Indeed, in the fission yeast Schizosaccharomyces pombe and the Brassicaceae plant Arabidopsis thaliana, meiotic recombination hotspots were shown to colocalize with nucleosome-depleted regions, supporting a link between chromatin structure and recombination in these species (de Castro et al. 2012;Wijnker et al. 2013).
Although many pathogens and parasites are sexual, the impact of recombination on the evolution of their genomes has been rarely addressed (Awadalla 2003). Genome studies have revealed exceptionally high rates of sequence evolution in some filamentous pathogens, including oomycetes and fungi (Raffaele and Kamoun 2012;Möller and Stukenbrock 2017). TEs, in particular, have been shown to play an important role in shaping the architecture and size of these pathogen genomes. TEs have often been found to be enriched in specific genomic compartments such as accessory chromosomes and repeatrich regions that further encode virulence-related genes (reviewed in Möller and Stukenbrock 2017). Increased mutation rates in TE-rich regions have been shown to contribute to the rapid evolution of new virulence specificities in pathogens and can contribute to the rapid generation of new genetic variation in pathogen genomes in the absence of sexual recombination (Daverdin et al. 2012;de Jonge et al. 2013;Dutheil et al. 2016). While TEs may contribute to the rapid evolution of specific genome compartments there are few population genetic studies of genome evolution, genome evolution in eukaryotic pathogens. As recombination can be an important driver of overall genome evolution in pathogen species, we set out to investigate patterns of recombination in plant pathogenic fungi. We focused on the economically important wheat pathogen Zymoseptoria tritici, which causes septoria leaf blotch on wheat. Z. tritici originated in the Middle East during the Neolithic revolution and has coevolved and dispersed with its host since early wheat domestication (Stukenbrock et al. 2007). A close relative of Z. tritici, Z. ardabiliae, has been isolated from wild grass species in the Middle East (Stukenbrock et al. 2012). The two pathogen species diverged recently but have nonoverlapping host ranges and show some differences in morphology and host infection patterns (Stukenbrock et al. 2011(Stukenbrock et al. , 2012. Both species undergo frequent sexual recombination, which results in the formation of ascospores that serve as a means of long distance wind dispersal and primary infection of new hosts (Stukenbrock et al. 2011). The colinear genomes of Z. tritici and Z. ardabiliae share 90% nucleotide similarity on average, thus providing an excellent resource for comparative analyses of genome evolution (Stukenbrock et al. 2011). The 40-Mb haploid genome of the reference Z. tritici isolate comprises 21 chromosomes, of which 8 are accessory chromosomes (Goodwin et al. 2011). These highly variable chromosomes are characterized by presence/absence variation, structural variation, high repeat content, and low gene densities (Goodwin et al. 2011;Grandaubert et al. 2015). Interestingly, the accessory chromosomes are partly conserved among several species in the genus Zymoseptoria, suggesting that these small chromosomes have been maintained over long evolutionary times, predating the divergence of species (Stukenbrock et al. 2011).
In a previous study, we applied a whole-genome coalescence approach to generate a map of incomplete lineage sorting of the ancestral species of Z. tritici and another closely related species, Z. pseudotritici (Stukenbrock et al. 2011). We found evidence of a high recombination rate in the ancestral species (genome average 46 cM/Mb) and showed a significantly higher proportion of sites showing incomplete lineage sorting in regions with high recombination rate. The existence of high recombination rates in the genus Zymoseptoria was recently supported by experimental data. Croll et al. (2015) generated a linkage map of Z. tritici from two independent crosses of Swiss field isolates. This map, based on actual crossover events along the 40-Mb genome, confirms the high recombination rates (genome average 66 cM/Mb, measured in windows of 20 kb) in the present-day pathogen species. Interestingly, the study also reported large differences between the two independent crosses of Z. tritici, suggesting that the recombination landscape is highly dynamic in this pathogen (Croll et al. 2015).
In this study, we addressed the evolution of recombination rate in fungal pathogens. We applied a population genomics approach to generate a fine-scale recombination map of the two recently diverged species Z. tritici and Z. ardabiliae. This allowed us to infer and compare finescale, genome-wide patterns of recombination rates in the two species and investigate the evolution of recombination landscapes. We first confirm the exceptionally high recombination rates, as also observed in a previous coalescence-based genome analysis and as shown by experimental crosses (Stukenbrock et al. 2011;Croll et al. 2015). Furthermore, we identify 2578 and 862 recombination hotspots in Z. tritici and Z. ardabiliae, respectively. Intriguingly, detailed analyses of the recombination hotspots show not only a comparatively higher hotspot frequency in the wheat pathogen but also the occurrence of stronger hotspots in Z. tritici. Our findings confirm that recombination rate landscapes are highly dynamic across time in the two fungal pathogens. Furthermore, the prominence of dynamic recombination hotspots in genes suggests a high impact on gene evolution; a finding that is unprecedented in other species.

Genome data
The life cycle of Z. tritici is predominantly haploid and the genome analyses conducted here thus rely on haploid genome data. The 40-Mb reference genome of the Z. tritici isolate IPO323 was sequenced at the Joint Genome Institute using Sanger sequencing (Goodwin et al. 2011). Two Iranian Z. tritici isolates and four Iranian Z. ardabiliae isolates were sequenced in a previous study using Illumina sequencing (Supplemental Material, Table S1) (Stukenbrock et al. 2011). We used genome data from an additional 10 isolates of Z. tritici that originate from wheat fields in Denmark, France, and Germany (Grandaubert et al. 2017). In this study, we report the genome sequences of 13 isolates of Z. ardabiliae that originate from wild grasses collected in the province of Ardabil in Iran (Table S1). DNA extraction was performed as previously described (Stukenbrock et al. 2011). Library preparation and paired-end sequencing using an Illumina HiSeq2000 platform were conducted at Aros, Skejby, Denmark.
The 13 resequenced Z. ardabiliae genomes were assembled from 100-bp, paired-end reads using the de novo assembly algorithm of the CLC Genomics Workbench version 5.5 (QIAGEN, Aarhus, Denmark). The assemblies were created using standard settings for paired-end reads. We used a previously published RNA-sequencing-based annotation to distinguish the parameter estimates for coding and noncoding sequences . To predict the genes that encode effectors, we used the software EffectorP (Sperschneider et al. 2016), with default settings, on genes predicted by SignalP (Petersen et al. 2011) to encode a secreted protein.

Genome alignment and SNP calling
Genome alignments were separately created for each population using the MultiZ program from the TBA package (Blanchette et al. 2004). Default parameters were used, although LastZ was used instead of BlastZ for pairwise alignments. Genome alignments were projected against the two reference genomes of each species: IPO323 for Z. tritici and STO4IR-1.1.1 for Z. ardabiliae (Goodwin et al. 2011;Stukenbrock et al. 2011). The projected alignments in MAF format were filtered using the MafFilter program ) with the following filters: (1) each syntenic block was realigned using MAFFT (Katoh et al. 2009), and blocks with .10 kb were split for computational efficiency; (2) only blocks where all individuals were present were retained (13 Z. tritici and 17 Z. ardabiliae); (3) a window of 10 bp was slid by 1 bp, and windows containing at least two indel events were discarded and the containing blocks were split; (4) a window of 10 bp was slid by 1 bp, and windows with a total of .100 gap characters were discarded and the containing blocks were split; and (5) all blocks were merged according to the reference genome with empty positions filled by "N," which resulted in one masked alignment per chromosome for Z. tritici and one masked alignment per contig for Z. ardabiliae. The chromosome and contig alignments were further divided into nonoverlapping windows of 1 Mb (data set 1) or 100 kb (data set 2). The MafFilter program was further used to estimate statistics on the alignments at each filtering step, and to compute the nucleotide diversity (Watterson's u) from the final filtered genome alignments.

Estimating recombination
Filtered alignments (1-Mb windows, data set 1) were exported as fasta files for the LDhat and LDhelmet packages. The program "convert" from the LDhat package was used to convert fasta files into input loci files for the program "interval" (Auton and McVean 2007). Only fully resolved biallelic positions were exported (see Table 1 for the details of SNP numbers). Likelihood tables were generated for u values of 0.0005, 0.005, and 0.05. The interval program was run with 10,000,000 iterations and sampled every 5000 iterations with a burn-in of 100,000 iterations. LDhelmet was run with the parameters suggested in the user manual (Chan et al. 2012; https://sourceforge.net/ projects/ldhelmet/). Comparison of recombination maps on the same set of SNPs was performed using standard principal component analysis, as implemented in the R package ade4 (Dray and Dufour 2007). A table was computed with one column per method (LDhat and LDhelmet, each with u set to 0.0005, 0.005, or 0.05) and one line per analyzed SNP pair, and the two first principal components were kept to plot a correlation circle ( Figure 1A).
To assess the robustness of the recombination maps, alternative maps for Z. tritici were constructed using the same protocol (1) after discarding all singletons, and (2) after removing five individuals to ensure absence of population structure. All maps were compared to the previously published genetic map of Croll et al. (2015) in windows of 20 kb. Correlations were assessed using Kendall's rank correlation test, and confidence intervals were obtained by bootstrapping windows.
We calculated average recombination rates in windows and regions by taking the average of recombination estimates between every pair of SNPs, weighted by the physical distance between the SNPs. Pairs of SNPs for which the confidence interval of the recombination estimate was higher than two times the mean were discarded and therefore not used in the average computation. Using the gene annotations available for the two reference species , we calculated the following information for each gene: (1) the average recombination rate in exons, (2) the average recombination rate in introns, (3) the average recombination rate in the 500 bp flanking the 59 region, and (4) in the 500 bp flanking the 39 region. We also calculated the average recombination rate for each intergenic region (500 bp from/to genes). GFF3 files from Grandaubert et al. (2015) were retrieved and processed using the GenomeTools package to add intron annotations (Gremme et al. 2013). The resulting gene annotations were analyzed in R together with recombination maps (R Core Team 2013).

Assessment of LD-based recombination estimates by simulation
We used the SCRM coalescent simulator (Staab et al. 2015) to simulate polymorphism data with a constant mutation rate but variable recombination rate. Recombination rates were drawn randomly from an exponential distribution with a mean of 0.02. Segments with a piecewise constant recombination rate were taken randomly from an exponential distribution with a mean of 100 kb. Sample sizes of 10, 30, and 100 individuals were tested for comparison with a population mutation rate equal to 0.05, 0.005, 0.0005, and 0.00005. We generated a locus of 10 Mb for simulations with u equal to 0.005, 0.0005, and 0.00005; but only 1 Mb for simulations with u equal to 0.05, as the resulting output file from LDhat would otherwise become excessively large due to the high number of SNPs. The true recombination rate used at each position of the alignment was recorded for later comparison. The output of SCRM was converted to LDhat input format using Python scripts. Recombination rates were estimated using the interval program from the LDhat package (Auton and McVean 2007). For simulations with u = 0.05 and 0.005, a likelihood lookup table with u = 0.01 was used; whereas a lookup table with u = 0.001 was used for simulations with u = 0.0005 and 0.00005. The inferred recombination rate at each position was then compared to the true rate. A variant of this simulation procedure was used to assess the impact of population structure on the inference of recombination rate. The SCRM coalescent simulator was used with a five-islands population model, with sample sizes of 2, 3, 4, 5, and 6 per deme, resulting in a total of 20 individuals. Migration rates were assumed to be all identical between demes, and values of M = 4N e 3 m = 1, 10, and 100 were tested. Regions of 1 Mb were simulated with u = 0.005 for each migration rate.

Reference species alignment and comparison
The two reference strains IPO323 (Z. tritici) and ST11IR-11.4.1 (Z. ardabiliae) were aligned using LastZ (Blanchette et al. 2004). The resulting genome alignment was used to map the coordinates of Z. ardabiliae SNPs to the Z. tritici genome, using the MafFilters LiftOver filter . A total of 893,171 (86%) positions could be mapped from Z. ardabiliae to Z. tritici and were used for further analyses.
Nonoverlapping windows containing at least 100 analyzed SNPs in each species were generated for the comparison of recombination rates between the two species.

Multi-scale correlations
We calculated the average recombination rates in windows of varying sizes and retained only windows that contained at least 1% of the polymorphic positions. To enforce a similar statistical power among different window sizes, a number of windows were chosen randomly. The same number of randomly chosen windows was used for the distinct comparisons. To assess the sampling variance, 1000 independent samplings (with replacement) were performed for each window size. Window sizes of 0.5, 1,2,4,8,16,32,64,128,256,512, and 1024 kb were tested, with 27 windows sampled in each case. We measured correlation coefficients using the Spearman, Kendall, and Pearson's correlation coefficients. Spearman and Kendall's coefficients are ranked based; therefore they do not assume binormality as Pearson's coefficient does. Because recombination rates are typically exponentially distributed, Pearson's coefficient was measured for the log rates instead of the raw r rates. Spearman's coefficient assumes that the variables are continuously distributed; therefore it does not resolve ties. Thus "jittering" was used to randomly resolve ties in the input variables (R function jitter, with default parameters). Conversely, Kendall's coefficient assumes ordinal input variables. Therefore, using the three correlation measures allows assessing the robustness of the correlation signal. A graphical representation was performed using the ggplot2 package for R, which performed local polynomial regression fitting for the curves (Wickham 2016).

Mapping of hotspots
Hotspots were detected using the LDhot program (Auton et al. 2014). For computational efficiency, LDhot was run on the 100-kb alignments (data set 2). A background recombination map was first estimated for each alignment using the interval program of LDhat with a u value of 0.005 (Auton and McVean 2007). The resulting maps were highly correlated with the maps based on 1-Mb alignments, and showed little effect of the discretization scheme. The background recombination map was used as input to LDhot with default parameter values and 1000 simulations. Significant hotspots were filtered for further analysis. First, only the hotspots with a value of r between 5 and 100 across the hotspot coordinates were selected, because higher values are most likely artifacts and the performance of LDhot is low for weak hotspots (Auton et al. 2014). A few hotspots with extremely large sizes (.2 kb) were further discarded. This process identified 9133 hotspots in Z. tritici and 1287 hotspots in Z. ardabiliae. We calculated the mean background rate in each detected hotspot and in the two 20-kb flanking regions. We further selected hotspots for which the within-hotspot rate was at least 10 times higher than the flanking regions. Thus 2578 and 862 hotspots were identified in Z. tritici and Z. ardabiliae, respectively. The Z. ardabiliae hotspots were mapped onto the Z. tritici genome using MafFilter's LiftOver function . We considered a hotspot in Z. tritici as colocalizing with a hotspot in Z. ardabiliae if the distance between them was ,1 kb, and if no other hotspot was found between the two. We compared statistics on the distribution of hotspots by randomizing the hotspot positions while keeping their original size, for each chromosome independently. To do so, we used the following procedure: 1. Compute the total "interhotspots" distance, L, as the sum of all distances between consecutive hotspots. 2. Draw random distinct positions uniformly in [1, L]. These positions are the starting positions of each randomized interval. 3. Order and then expand each interval to match its original size and compute the corresponding end positions. Correct the coordinates to account for previous intervals.
To account for variable coverage along the genome, we also simulated intervals corresponding to chromosome regions that were not included in our analysis, using the same procedure as for hotspot randomization. Each randomized set of hotspots therefore contains the same amount of callable sites as the actual analysis. We assessed the significance of the number of colocalizing hotspots using 10,000 permutations.

Models of GC-content evolution
The two reference strains IPO323 (Z. tritici) and ST11IR-11.4.1 (Z. ardabiliae) were aligned using LastZ (Blanchette et al. 2004). Several filtering steps were further applied to the alignment. First, each synteny block was realigned using the MAFFT aligner (Katoh et al. 2009) after splitting blocks .10 kb for computational efficiency, which resulted in an alignment of 27,918,318 bp that included both species. Second, a window of 30 bp was slid by 1 bp along the alignment. Windows with .29 gaps in total between the two species were further discarded, which resulted in 27,237,601 filtered positions. To minimize the effect of selection on GC patterns, we further discarded regions in the alignment that were annotated as protein-coding genes in one or both species. This resulted in a total alignment of 9,143,114 bp. The alignment was further divided into windows ranging from 1 to 4 kb and only data from the essential chromosomes (Z. tritici chromosomes 1-13) were retained. The final alignment contained 2052 cleaned windows containing sequences for both species with no synteny break, and it encompassed 3,179,581 bp. A model of sequence evolution was independently fitted on each window using maximum likelihood (Dutheil and Boussau 2008). The HKY85 model was used as a basis allowing three frequency parameters [(G + C)/(A + C + G + T), A/(A + T), and G/(G + C)] in addition to the transition over transversion ratio (Hasegawa et al. 1985). We fitted a nonhomogeneous, nonstationary model of substitution, allowing us to estimate three distinct GC contents for Z. tritici, Z. ardabiliae and their common ancestor. Other parameters were considered constant between species and their ancestor. A molecular clock was assumed (so that the two branches leading to Z. tritici and Z. ardabiliae were equal in length) and a four-class gamma distribution of rates with a shape parameter fixed to 0.5 was used. We further calculated the observed GC content in each species for each window. The average recombination rate was calculated for each window containing at least 1% polymorphic positions (leaving 1642 windows).
A similar analysis was conducted using recombination rate estimated from Croll et al. (2015), which was calculated in 20 kb windows. The corresponding pairwise alignment regions were extracted and filtered, and coding regions from both species were discarded; resulting in 1948 windows of at least 1 kb where a nonhomogeneous, nonstationary model of substitution could be fitted.

Results and Discussion
Genome alignments and SNP calling A total of 30 whole haploid genome sequences was used to infer the recombination landscapes of the two species Z. tritici and Z. ardabiliae. First, we generated de novo genome assemblies of 10 Z. tritici and 13 Z. ardabiliae isolates previously not studied (Table S1). The haploid genomes, including an additional three Z. tritici and four Z. ardabiliae genomes already published (Stukenbrock et al. 2011), were aligned for each species; resulting in multiple genome alignments of 40.8 Mb for Z. tritici and 32.4 Mb for Z. ardabiliae. (Table 1) Recombination analyses rely on SNP data. However, erroneously called SNPs or alignment errors can greatly bias LD inference. To generate high-quality SNP data sets, we therefore filtered the genome alignments (see Materials and Methods) to retain only the alignment blocks in which all isolates were represented. This filtering yielded genome alignments of 27.7 and 28.2 Mb for Z. tritici and Z. ardabiliae, respectively (Table 1). We further filtered the alignments to mask ambiguously aligned positions, leading to a final alignment size of 27.3 Mb for Z. tritici and 27.7 Mb for Z. ardabiliae. Less than 2% of the final alignment contained repetitive sequences, including TEs. In the case of Z. tritici, repeat regions have been filtered out during the alignment quality checking; while in the case of Z. ardabiliae, for which no telomere-to-telomere sequencing is available, most repeats were poorly assembled and therefore virtually absent from the original alignment (Table 1). After filtering we identified 1.48 million SNPs in Z. tritici and 1.07 million SNPs in Z. ardabiliae, which corresponds to the nucleotide diversities, measured as Watterson's u, of 0.0139 in Z. tritici and 0.0087 in Z. ardabiliae (Table 1). Thus, despite the larger sample size, Z. ardabiliae shows a much lower SNP density and sequence diversity than the wheat pathogen Z. tritici.

Inference of fine-scale recombination maps
We estimated and compared the local recombination rates in Z. tritici and Z. ardabiliae using two methods implemented in the LDhat (Auton and McVean 2007) and LDhelmet (Chan et al. 2012) packages. Both methods estimate the local population recombination rates based on the LD between SNPs in a given genome data set using a composite likelihood method. The methods infer the population-scaled recombination rate r across the genome, based on an a priori specified population mutation rate u. The parameter r relates to the actual recombination frequency by the equation r = 2N e 3 r for haploid individuals, where N e is the effective population size and r is the per site rate of recombination per generation across the region. Inferring r from r therefore requires knowledge of N e . Furthermore, in Zymoseptoria species, sexual reproduction is not obligatory and may vary from year to year with environmental conditions and the availability of compatible hosts and mating partners, rendering the estimation of r very difficult without any additional knowledge of the amount of clonal reproduction. To avoid the bias of incorrect assumptions, we therefore further analyzed and compared the recombination maps of Z. tritici and Z. ardabiliae based on the parameter r.
As u substantially varies along genomes and between species, we generated recombination maps using three scaled effective population size values as inputs (u = 0.05, 0.005, and 0.0005). For both LDhat and LDhelmet, we find that the three different input u values only have a marginal influence on the recombination rate estimates obtained ( Figure 1A). We therefore proceeded with the recombination map estimated using a u of 0.005, similar to the median of u values estimated in 10-kb windows in Z. tritici (u = 0.0139) and in Z. ardabiliae (u = 0.0087) ( Table 1).
To assess the performance of the two methods and input parameters for the fungal data sets, we first compared the inferred recombination maps of Z. tritici with data from previously published genetic maps (Croll et al. 2015). We compared both the LDhat and LDhelmet recombination maps with the genetic maps created from two sexual crosses of Swiss Z. tritici isolates, 3D733D1 and SW53SW39 (Croll et al. 2015). The recombination maps estimated by LDhat and LDhelmet from SNP data both correlate with the genetic maps, confirming that the composite likelihood methods allow us to assess the recombination landscapes in the fungal pathogens ( Figure  1B and Table 2). We find a significant correlation between the LDhat map and the two genetic maps (3D733D1: Kendall's rank correlation test, t = 0.27, and P , 2.2e 216 ; SW53SW39: Kendall's rank correlation test, t = 0.23, and P , 2.2e 216 ). Using an average recombination rate of the 3D733D1 and SW53SW39 crosses, the correlation further increases (Kendall's rank correlation test, t = 0.29, P , 2.2e 216 ) ( Figure 1B and Table 2). While correlated, the new recombination maps of Z. tritici encompasses .1 million SNPs and thereby provides a considerably finer resolution of the recombination landscape in Z. tritici than previously obtained from experimental crosses (based on 23,000 SNPs) (Croll et al. 2015). The same correlation analyses using the LDhelmet map show consistent results with slightly lower correlations (Kendall's rank correlation test, t = 0.24 for the cross 3D733D1, 0.20 for the cross SW53SW39, and 0.25 using the average of the two crosses; all P , 2.2e 216 ) ( Table 2). These correlations, although highly significant, have relatively small size effects. However, it is also noteworthy that the correlation between the two Swiss crosses 3D733D1 and SW53SW39 is only 0.43 (Kendall's rank correlation test, P , 2.2e 216 ), supporting a high variability in recombination even between individual crosses of Z. tritici (Table 2). Based on the comparison of the outputs of LDhat and LDhelmet, we decided to use the LDhat map as our reference population map for the remainder of this study. We next investigated the impact of possible confounding factors on the recombination rate estimates, including SNP densities, possible sequencing errors, population structure, and natural selection.
SNP density and filtering based on confidence intervals: LDhat and LDhelmet have been developed for recombination analyses in animals (Auton and McVean 2007;Auton et al. 2012;Chan et al. 2012), and their performance on data from haploid eukaryotes with high recombination rates has not been tested. Therefore, we next assessed the robustness of the composite likelihood approach using simulations with distinct sample sizes and SNP densities. We report that the interval program infers recombination rate with the highest reliability for intermediate diversity levels (u = 0.0005 or 0.005). Furthermore, while larger sample sizes decrease the variance in the estimate, we show that LDhat reliably infers recombination when as few as 10 haploid genomes are used (Figure 2). We observe that r generally tends to be underestimated, and its estimation variance is larger for small sample sizes. However, better estimates can be obtained by discarding all estimates where the width of the 95% confidence interval is larger than or equal to two times the mean. Interestingly, this filtering has the strongest effect for highly diverse regions (u = 0.05), where the raw estimates of LDhat appear to be highly underestimated even for large sample sizes (n = 100). Discarding estimates with large confidence intervals efficiently suppresses this bias ( Figure  2). We also note that the inference bias is stronger for low recombination rates, and that this effect is independent of the sample size ( Figure 2). Based on these simulation results, we similarly filtered our recombination estimates based on the 95% confidence interval reported by LDhat.
This filtering discards 49 and 20% of all SNP pairs for Z. tritici and Z. ardabiliae, respectively. The large difference between the two data sets is imputable to the much higher nucleotide diversity of Z. tritici. When compared with the genetic map (Croll et al. 2015), the filtered map of Z. tritici shows a correlation of 0.34 (Kendall's rank correlation test, P , 2.2e 216 ; Table 2). Interestingly, correlations between the genetic map and the LD map inferred with LDhat increase with increased window size: using 500-kb windows, the correlation becomes 0.43 (Kendall's t, P = 0.000206).
Putative sequencing errors: Sequencing errors can affect LD estimates as they appear as unlinked singletons in population genomic data sets. Such potential effects are partially accounted for at two levels in our analyses. First, our data sets are based on de novo assembly of each individual genome, which already corrects for putative sequencing errors in the sequencing read outs. Second, many of the SNPs discarded for having a large confidence interval in the estimation of recombination rates by LDhat are singletons. To further assess the potential impact of putative sequencing errors, we ran LDhat on the Z. tritici data set after discarding all singletons and filtering for confidence interval as described above. The resulting recombination map appeared to be highly correlated to the map including all singletons ( Figure S1 and Table 2), and the correlation of the filtered map with the crossover map was significant, but weaker than when including them (Kendall's t = 0.3083, P , 2.2e 216 ) ( Table 2). We therefore conclude that putative sequencing errors have no significant impact on our inferred recombination maps.
Effect of population structure: Previous studies reported that Z. tritici strains are sampled from a globally panmictic population (Linde et al. 2002;Zhan et al. 2003). However, in a recent study based on whole-genome data, we report evidence for slight population structure, notably between Iranian isolates vs. European isolates (Grandaubert et al. 2017). To assess whether this structure could bias our recombination estimates, we generated a new recombination map using LDhat on a reduced sample of eight strains of Z. tritici. We excluded the two Iranian isolates in our data set as well as three German isolates forming a separate, yet nonsignificant, cluster. We report that the resulting map is highly correlated to our recombination map ( Figure S1) as well as to the previously published genetic map (Kendall's t = 0.3237, P , 2.2e 216 ) ( Table  2), suggesting that population structure has little effect on our inference of recombination rate. Because the resulting correlation with the crossover map was slightly lower than when using the complete data set, we use the complete data set for the further analyses.
As little is known about the population structure of Z. ardabiliae, we conducted additional simulations to assess the putative impact of structure on the inference of recombination rate. We used a five-islands structure model, with sample sizes equal to 2, 3, 4, 5, and 6 in each deme, with a total sample size of 20; which is comparable to the 17 genomes of the Z. ardabiliae data set analyzed here. Migration rates between demes were symmetrical and all equal, and we tested several rates. We find that while r is systematically overestimated in the presence of population structure, it is remarkably proportional to the true value, in particular after filtering on the confidence intervals ( Figure S2). Population structure, if any, is therefore not expected to bias our comparison of recombination rates along the genome. In addition, these results suggest that the true recombination rate in Z. ardabiliae is potentially even lower than the value reported here.
Coding sequences: Recombination inference based on patterns of LD is affected by various patterns of selection. The genomes of Z. tritici and Z. ardabiliae are gene dense and protein-coding genes occupy nearly 50% of the sequences. We therefore considered the impact of selection on our recombination inference in the two species, assuming lower selection in noncoding regions. To this end, we compared the previously published crossover map with estimates of r exclusively in the intergenic regions, excluding coding sequences and 500 bp up-and downstream of the annotated genes ( Figure S1 and Table 2). These analyses based on noncoding sequences and filtering of SNPs based on the confidence interval of recombination rate estimates resulted in correlations of 0.22 for the LDhat map and the average of the two genetic crosses (Kendall's rank correlation test, P , 2.2e 216 ) and 0.24 for the LDhelmet map (Kendall's rank correlation test, P , 2.2e 216 ). Thus, the best correlations of LD based on the recombination maps and genetic crosses are obtained when coding regions are included ( Table 2). The finding suggests that the composite likelihood method provides robust estimates of recombination, even in regions likely to deviate from purely neutral evolution. Based on these simulation results, we chose to use the LDhat-inferred recombination rates on the full genome, with an input u of 0.005 and filtered according to confidence intervals for both Z. tritici and Z. ardabiliae.
A fivefold higher population-scaled recombination rate in Z. tritici The inference of r across the genomes of Z. tritici and Z. ardabiliae reveals highly heterogeneous recombination landscapes in both species (Figure 3 and File S1). We find a fivefold higher recombination rate in Z. tritici than in Z. ardabiliae: the mean values of r are 0.0217 and 0.0045 for Z. tritici and Z. ardabiliae, respectively. As r = 2N e 3 r, where r is the actual recombination rate per generation per nucleotide and N e is the effective population size, this fivefold difference might reflect differences in r or global differences in N e . Furthermore, the inferred parameter r reflects the historical rates of recombination in the two species, which may have varied according to different demographic events since their divergence. Nonetheless, the nucleotide diversity estimated by Watterson's u is 1.6 times higher in Z. tritici than in Z. ardabiliae, indicating that different population sizes alone cannot explain the observed difference in recombination rates, assuming that the two species have comparable mutation rates. The higher value of r estimated in Z. tritici thus likely reflects a higher actual recombination rate (in the past or presently) in the wheat pathogen compared to Z. ardabiliae.

Recombination on small arms of acrocentric chromosomes
Physical factors, such as chromosome length, centromere position, or distance to the centromere, have been reported to affect broadscale recombination patterns in eukaryotes (Jensen-Seaman et al. 2004). To investigate the rate and distribution of crossover events along the genomes of the two Zymoseptoria species, we correlated the inferred recombination maps with features of the well-characterized karyotype of Z. tritici. The reference genome sequence of Z. tritici consists of 21 fully sequenced chromosomes, including 8 socalled accessory chromosomes showing presence/absence polymorphisms between individuals (Goodwin et al. 2011). Furthermore, the exact positions of the centromeres for all chromosomes have been characterized experimentally using a chromatin immunoprecipitation assay targeting the centromere-specific protein CenH3 (Schotanus et al. 2015). An interesting finding is that the chromosomes in Z. tritici are either acrocentric or near acrocentric, and every chromosome consequently consists of one long and one short chromosome arm (Schotanus et al. 2015). Because a complete chromosome assembly is not available for Z. ardabiliae, we mapped the recombination estimates of Z. ardabiliae on the genome of Z. tritici to assess the impact of the karyotype structure on recombination rate variation. Similar to findings from other species (Jensen-Seaman et al. 2004;Munch et al. 2014), we observe a negative correlation between recombination rate and the size of the 13 core chromosomes (Kendall's t = 20.59 with P = 4.29e 23 for Z. tritici and t = 20.72 with P = 2.84e 24 for Z. ardabiliae; Figure 4A). This pattern is generally explained by the necessity of one crossing over to occur per chromosome or chromosome arm per generation, resulting in a higher recombination rate on smaller chromosomes (e.g., Kong et al. 2002;Smeds et al. 2016). The significant correlation of the recombination map of Z. ardabiliae with the genome structure of Z. tritici is an indication of a conserved karyotype of the ancestral species of Z. tritici and Z. ardabiliae.
Given the acrocentric nature of the Z. tritici chromosomes, we considered to what extent recombination also occurs on the short chromosome arms. If meiosis involves one crossover event per chromosome, then the recombination rate should be correlated with the chromosome size and not the chromosome arm length. However, if meiosis involves one crossover event per chromosome arm, then a higher frequency of recombination should occur on shorter chromosome arms. Correlations between recombination rates and chromosome arm lengths also show negative values, yet they are only significant in Z. ardabiliae (Kendall's t = 20.14 with P = 0.3356 for Z. tritici and t = 20.42 with P = 2.16e 23 for Z. ardabiliae; Figure  4B). The negative correlation observed at the chromosome-arm level suggests that meiosis in the Zymoseptoria pathogens requires at least one crossing over per chromosome arm and that the small chromosome arms consequently also recombine. The weaker correlations and lack of significance in Z. tritici could be due to a fast evolution of centromere positions, erasing the signal of arm-specific recombination rates.
Extremely weak or absent GC-biased gene conversion in Z. tritici and Z. ardabiliae In many species, recombination strongly affects evolution of GC content by a mechanism called GC-biased gene conversion (gBGC) (Duret and Galtier 2009;Mugal et al. 2015). The effect of gBGC has been demonstrated in mammals (Piganeau et al. 2002;Duret and Galtier 2009), birds (Weber et al. 2014), plants (Serres-Giardi et al. 2012, and even bacteria (Lassalle et al. 2015). However, gBGC has been poorly addressed in fungal species beyond the yeast model, which represents one of the rare organisms for which gBGC was experimentally demonstrated (Mancera et al. 2008). To study the possible occurrence and impact of gBGC in the Z. tritici and Z. ardabiliae genomes, we studied the patterns of GC content along the genomes of the two species. We fitted a nonhomogeneous, nonstationary model of substitution in 10-kb windows in intergenic regions, allowing us to estimate the equilibrium GC content (frequency of GC toward which the sequences evolve) in the extant species. We inferred the dynamics of GC content by comparing the actual GC content of the sequence (observed GC content) with the equilibrium GC content (Duret and Arndt 2008). We find that both the observed and equilibrium GC are highly correlated between Z. tritici and Z. ardabiliae (Kendall's rank correlation test, t = 0.69 and 0.45, P , 2.2e 216 for the observed and equilibrium GC content, respectively, essential chromosomes only; Figure S3). However, although both species show similar observed GC content (median of 53.3% for Z. tritici and 53.6% for Z. ardabiliae) they also show contrasting patterns, with the GC content found to be slightly increasing in Z. ardabiliae (median equilibrium GC content on autosomes of 53.8, significantly higher that the observed GC content, Wilcoxon paired rank test, P = 0.04712) while it is decreasing in Z. tritici (median equilibrium GC content of 51.6%, which is significantly lower than the observed GC content, Wilcoxon paired rank test, P = 2.728e 215 ).
To assess the impact of recombination on GC evolution, we correlated the equilibrium GC content in Z. tritici and Z. ardabiliae to the recombination maps in the two species. We find overall negative, yet weak or nonsignificant, correlations between GC content and recombination rate ( Figure S3), both for observed (Kendal's t = 20.047, P = 0.04304 for Z. tritici and t = 20.054, P = 0.02253 for Z. ardabiliae) and equilibrium GC content (Kendal's t = 20.02, P = 0.5082 for Z. tritici and t = 0.01, P = 0.7128 for Z. ardabiliae). These results do not support gBGC as a major mechanism shaping GC content in the two fungal pathogen genomes. To test whether this conclusion could be an artifact of recombination rates estimated from population data, we also correlated the equilibrium GC content with the two previously published genetic maps (Croll et al. 2015). Consistent with our finding from the LDhat-based recombination map, we confirm an absence of correlation between the equilibrium GC content and the crossover rate and GC content in Z. tritici, (Kendall's rank test, t = 0.006 and P = 0.7035 for observed GC; and t = 20.024, P = 0.1149 for equilibrium GC content).
The absence of correlation between GC content and recombination could also be because of a lack of statistical power due to the overall very homogeneous GC content and large-scale recombination landscapes (recall Figure 3), and the notable absence of isochores that characterize genome composition in other organisms, e.g., in mammals (Galtier et al. 2001). As a complementary line of evidence, we investigated the segregation patterns of AT and GC alleles at AT/ GC biallelic sites in intergenic regions of both Z. tritici and Z. ardabiliae, as gBGC is expected to increase the frequency of GC alleles (Escobar et al. 2011). We find that the frequency of GC alleles is virtually identical to the frequency of AT alleles in Z. tritici and only slightly higher in Z. ardabiliae (Table 3), supporting an absence or only weak effect of gBGC in Z. tritici and Z. ardabiliae, respectively.

No suppression of recombination in centromeres
Recombination is normally found to be absent in centromeric regions where spindles attach during chromosome segregation (see review by Petes 2001). A known exception is Drosophila mauritiana, which, in contrast to D. melanogaster and D. simulans, shows no suppression of recombination in centromeres (True et al. 1996). The centromeres of core and accessory chromosomes in Z. tritici range from 5.5 to 14 kb in size and do not locate in AT-rich regions (Schotanus et al. 2015) as is otherwise observed for centromeres of other species such as Neurospora crassa (Smith et al. 2011). Correlating the recombination map of Z. tritici with centromere positions, we observe (as in D. mauritiana) no significant suppression in recombination rate across the centromeric chromosome regions (Wilcoxon signed rank test on 11 chromosomes for which recombination rate in the centromeric region could be inferred, P = 0.5771) (Figure 3 and Table 4). The centromeres of Z. tritici exhibit several features common to neocentromeres such as a short length (10,000 bp in length), lack of enriched repetitive DNA, and weakly transcribed genes (Schotanus et al. 2015). We hypothesize that recombination in centromeric sequences has additional implications for evolution of the centromeres in these fungi. A more detailed characterization of chromosome structures and centromere locations in Z. ardabiliae is necessary to better understand karyotype evolution in these grass pathogens.

Absence of recombination on accessory chromosomes
The small accessory chromosomes have previously been well characterized in Z. tritici (Goodwin et al. 2011). They differ considerably from the core chromosomes as they display a higher repeat content, lower gene density, overall lower transcription rate, and are enriched with different chromatin modifications (Stukenbrock et al. 2010;Kellner et al. 2014;  Grandaubert et al. 2015;Schotanus et al. 2015). Electrophoretic separation of accessory chromosomes from several isolates of Z. ardabiliae has shown that this species also comprises accessory chromosomes (Stukenbrock et al. 2011). In this study, we used sequence homology to define the accessory components of the Z. ardabiliae genome. We find that the aligned fragments of the accessory chromosomes show very low recombination rates in both species (median r = 0.0059 in Z. tritici and median r = 0.0001 in Z. ardabiliae over 13 10-kb windows where both genomes could be aligned, which is 25 and 2% of the autosomal rates, respectively) ( Figure 4C). The lower recombination rates reflect the lower effective population size of accessory chromosomes that are present at lower frequencies in populations of Z. tritici and Z. ardabiliae compared to the core chromosomes. Furthermore, we speculate that frequent structural rearrangements on accessory chromosomes can prevent homologous chromosome pairings and also contribute to the low recombination rates. Our findings add further evidence to support different evolutionary modes of the two sets of chromosomes (core and accessory chromosomes) contained in the same genome. Suppression of recombination is also found on mating-type chromosomes in other fungi including species of Neurospora and Microbotryum Petit et al. 2012;Hood et al. 2013). These regions are characterized by an increased accumulation of TEs and structural variants as well as nonadaptive mutations in coding sequences as a consequence of suppressed recombination Badouin et al. 2015).
We also observe a remarkable drop in the recombination rate on the right arm of chromosome 7 (File S1). The right arm of chromosome 7 displays several similarities to the DNA of the accessory chromosomes, including a lower gene density, higher repeat content, and less gene transcription . Furthermore, the entire chromosome arm is enriched with the heterochromatic mark H3K27me3, which is similarly enriched on the accessory chromosomes (Schotanus et al. 2015). We previously proposed that this particular chromosome region represents a recent translocation of an accessory chromosome to a core chromosome (Schotanus et al. 2015). This hypothesis is consistent with the observation that the recombination rate of the chromosome arm resembles the overall reduced recombination rate of the accessory chromosomes (File S1).
High recombination rates in coding sequences of Z. tritici In primates and birds, recombination increases at CpG islands and around transcription start and end sites (Auton et al. 2012;Singhal et al. 2015;Smeds et al. 2016). In the honeybee, recombination rates in introns and intergenic regions are significantly higher than recombination rates in 39 and 59 UTRs and coding sequences (Wallberg et al. 2015). It has been proposed that altered chromatin structures, such as destabilized nucleosome occupancy at CpG islands and promoters contribute to this fine-scale variation in recombination rate (Jones 2012). To determine whether specific sequence features in the fungal pathogen genomes similarly affect the overall recombination landscape, we inferred and compared the mean recombination rates in exons, introns, intergenic regions, and 59 and 39 flanking regions (500-bp upstream and downstream coding DNA sequence regions, respectively) with a minimum of three filtered SNPs ( Figure  5A). Overall, we observe significant differences but with small size effects in fine-scale rates of recombination across different genome regions (Kruskal-Wallis test with post hoc comparisons, false discovery rate set to 1%). In both Z. tritici and Z. ardabiliae, we find the lowest recombination rates in introns and the highest rates in intergenic sequences ( Figure  5A). A lower value of r = 2N e r can result from a reduced N e , a reduced r, or both. N e in the proximity of genes is expected to be lower due to the presence of background selection (Nordborg et al. 1996;Hobolth et al. 2011;Scally et al. 2012). The highly similar observed recombination rates in coding and noncoding sequences in Z. tritici and Z. ardabiliae suggests that r is not suppressed in these regions in the same way as is observed in other organisms. The pattern indicates that other mechanisms define fine-scale recombination rates in these fungi which lead to high recombination frequencies in protein-coding sequences.
Because of the relatively high rates of recombination in exons of Z. tritici and Z. ardabiliae, we sought to determine whether recombination could play a particular role in plantpathogen coevolution. Plant pathogens interfere with host defenses and manipulate the host metabolism by the secretion of so-called effector proteins produced to target molecules from the host (Lo Presti et al. 2015). Antagonistic coevolution of these interacting proteins is often reflected in accelerated evolution and signatures of positives selection (Stukenbrock and McDonald 2009). To assess the role of recombination on effector evolution, we first predicted effector proteins computationally in the secretomes of both species using the EffectorP software (Sperschneider et al. 2016). This approach identified 868 putative effector proteins in Z. tritici and 1122 and Z. ardabiliae.
By comparing the recombination rates in different genomic regions encoding effector and noneffector genes, we show a significantly lower recombination rate in exons and introns of effector proteins in Z. ardabiliae (Wilcoxon rank test, P = 1.305e 24 for exons and 2.534e 25 for introns, P-values corrected for multiple testing) ( Figure 5B). The differences are mostly driven by an excess of zero estimates in effector-encoding regions in Z. ardabiliae, as visible on the distribution of measures ( Figure 5B). Discarding these regions with a mean recombination of zero leads to nonsignificant differences between effector and noneffector genes. A recombination rate estimated to zero can either be due to suppression of recombination in the region or due to an estimation error. Introns and exons with a recombination estimate of zero in Z. ardabiliae are found to be shorter and to have a higher SNP density. While these differences are significant, they are of a small size and are unlikely to be a cause of estimation error, and the suppression of recombination in some effector genes of Z. ardabiliae therefore appears to be a biological signal whose origin remains to be elucidated by detailed analysis of these regions.
Large-scale but not fine-scale correlation of recombination landscapes in Z. tritici and Z. ardabiliae Recombination landscapes have been compared in different model species to assess the extent of conservation of recombination rate variation. Broadscale recombination rates in zebra finches and long-tailed finches have similar levels and present correlation factors as high as 0.82 and 0.86 at the 10-kb and 1-Mb scales, respectively (Singhal et al. 2015). Similarly, broadscale recombination rates in human and chimpanzee tend to be conserved with few exceptions, such as the human chromosome 2 which originates from a chromosome fusion in the human lineage (Auton et al. 2012). However, when comparing the recombination rates of more distantly related mammal species, the correlation of recombination rates decreases even when comparing homologous syntenic blocks (Jensen-Seaman et al. 2004). In studies of mammals and fruit flies, it is considered that the recombination landscape evolves as a result of evolution of other sequence variables (Jensen-Seaman et al. 2004) and the dynamics of fine-scale recombination rates, including the positions of hotspots (Winckler et al. 2005;Chan et al. 2012).
To address the evolution of recombination landscapes in Z. tritici and Z. ardabiliae, we compared the genome-wide recombination maps of the two species. We previously reported that the genomes of the two species show a high extent of colinearity and we found a mean sequence divergence of d xy = 0.13 substitutions per site (Stukenbrock et al. 2011). Here, we first aligned the two reference genomes of Z. tritici and Z. ardabiliae to compare recombination rates in homologous genome regions (Figure 6; see Materials and Methods). Next, we calculated the average recombination rate in nonoverlapping windows with at least 100 SNPs in each species, which resulted in 3851 windows for which recombination in both species could be averaged. The two maps show a moderate yet highly significant correlation (Kendall's rank correlation test, t = 0.2327, P , 2.2e 216 ; Figure 7A), which suggests certain similarities in the recombination landscape of the two fungi. To determine the scale at which the maps are most correlated (broad-or fine-scale recombination rates), we further investigated how the correlations vary when various window sizes are used. We find that the correlations, consistently inferred with different correlation measures, peak at the 0.5-1 Mb scale ( Figure 7B), suggesting that the recombination landscape is conserved at large scales but shows rapid evolution at smaller scales. These results mirror findings from other eukaryotic species (e.g., Winckler et al. 2005;Singhal et al. 2015) and suggest that distinct mechanisms determine the recombination landscape at fine and broad scales in these two species.

Frequency and intensity of recombination hotspots is higher in Z. tritici
The fine-scale LDhat recombination maps clearly reveal the presence of distinct peaks of recombination in both Z. tritici and Z. ardabiliae (Figure 3). We used the program LDhot to call positions of statistically significant recombination hotspots (Auton et al. 2014) and applied highly stringent selection criteria (see Materials and Methods) to obtain positions of the most significant hotspots for which the within-hotspot rate was at least 10 times higher than the flanking regions ( Figure 8A). Interestingly, our approach revealed a considerably greater number of recombination hotspots in Z. tritici (2578 hotspots) than in Z. ardabiliae (862 hotspots). Furthermore, we find a significant difference in the size of the hotspot regions between the two species. In general, the recombination hotspots span significantly shorter regions in Z. tritici (median 39 bp) than in Z. ardabiliae (66 bp, Wilcoxon ranked test P , 2.2e 216 ). We also compared the intensity of the recombination hotspots, as estimated by LDhot (r across hotspot) and also find the median value of r in hotspots to be significantly higher in Z. tritici (median of 16.44 compared with 8.42 for Z. ardabiliae, Wilcoxon rank test P , 2.2e 26 ). The higher frequency of more intense hotspots in Z. tritici not only reveals a different hotspot landscape in the wheat pathogen, it also suggests that the overall higher recombination rate we observe in Z. tritici partly is explained by the different recombination hotspot architecture. These differences to some extent mirror the larger density of SNPs in Z. tritici that enables a finer resolution of the hotspot distribution and structure, and could potentially be affected by a different demography and population structure in the two species. We also speculate that recombination hotspots in these fungi have evolved since the divergence of Z. tritici and Z. ardabiliae. To address the extent of conservation in hotspot positions, we correlated the hotspot maps of the two species.
The position of recombination hotspots is defined by different mechanisms in different taxa, e.g., PRDM9 in primates and transcription start and end sites in other species such as birds Singhal et al. 2015). Consequently, hotspot positions are highly conserved in some species (Tsai et al. 2010;Singhal et al. 2015), while highly variable in others (Myers et al. 2010). We mapped Z. ardabiliae hotspots on the Z. tritici genomes and counted the number of colocalizing hotspots in the two species. We considered a hotspot in Z. tritici as colocalizing with a hotspot in Z. ardabiliae if the distance between the two hotspots is ,1 kb and if no other hotspot is present in between. We report that only 149 hotspots are colocalizing (6% of hotspots in Z. tritici and 20% of hotspots in Z. ardabiliae). This number is however significantly more than expected by chance (P , 9.99e 25 , permutation test; Figure 8B). These results are consistent with the previously reported genetic maps of Z. tritici, which also show little overlap of hotspot positions between two Swiss crosses (Croll et al. 2015). Conversely, the patterns are highly different from Saccharomyces species in which hotspot positions are highly conserved and associated with functional elements across the yeast genomes (Tsai et al. 2010).
Given the dense genomes of Z. tritici and Z. ardabiliae, we assessed the number of hotspots mapped to coding sequences. Of the 2578 Z. tritici hotspots, 132 are located in introns and 1435 are located in exons. Interestingly, in Z. ardabiliae, we find 44 hotspots in introns and only 396 in exons. We plotted the number of hotspots as a function of the number of called sites in each region ( Figure  8C). We observe a general trend in which the number of detected hotspots increases with the number of called sites as a power law (linear relationship in log space), and with more hotspots detected in Z. tritici. In contrast to patterns of previously studied species, this reveals the presence of hotspots in all parts of the genome, including coding regions. We do not observe a significant enrichment close to transcription start sites (upstream regions) like in yeast (Lam and Keeney 2015). We further note that comparatively fewer hotspots locate in intergenic regions of Z. tritici, these regions displaying a density of hotspots similar to what is expected in Z. ardabiliae for the observed number of callable sites. We hypothesize two nonexclusive possible origins for this result: (1) the number of callable sites is higher in Z. tritici intergenic regions than in Z. ardabiliae, due to the lack of telomere-to-telomere assembly of a reference genome for this species. The missing regions could potentially bias our estimate of hotspot densities in intergenic regions. (2) Another possible explanation is that the comparatively larger number of hotspots in Z. tritici is due to an increased hotspot density in protein-coding genes in this species, which raises the question whether intragenic recombination hotspots represent a selected feature during evolution of the wheat-infecting lineage.

Conclusions
Pathogens need to adapt rapidly to overcome immune responses in their host (Jones and Dangl 2006). Several examples from animal and plant pathogens document exceptionally high rates of genome rearrangements, including changes in ploidy and full chromosome gains or losses (e.g., Ma et al. 2010;Croll et al. 2013;Hickman et al. 2013Hickman et al. , 2015. So far, the importance of meiotic recombination in rapid evolution of pathogens has been poorly addressed. Our analyses demonstrate extraordinarily high recombination rates in two fungal plant pathogens and thereby suggest that sexual recombination can also be a major driver of rapid pathogen evolution. The overall higher recombination rate and the increased density of recombination hotspots in the crop pathogen Z. tritici are remarkable. Z. tritici and Z. ardabiliae share a recent common ancestor, but exist and evolve in highly different environments. While Z. ardabiliae infects wild grasses in a natural ecosystem, Z. tritici infects a crop host and propagates only in managed ecosystems. Agricultural management strategies, dense host populations, and increased gene flow between geographically distant populations are factors that contribute to the different population structure of Z. tritici. We hypothesize that an increased rate of recombination in coding sequences of Z. tritici was selected as it favored the rapid generation of new alleles and allele combinations (Brunner et al. 2008). The exceptionally high recombination rate in Z. tritici allows the pathogen to rapidly overcome new host resistances and explains the current difficulties of controlling this important wheat pathogen. Daniel Croll for providing genetic data from experimental crosses of Z. tritici. E.H.S. is supported by intramural funding from the Max Planck Society, Germany; a personal grant from the State of Schleswig-Holstein, Germany; and a grant from the German Research Council, Deutsche Forschungsgemeinschaft, grant number HO 4435/1-1 in the framework of the SPP1819. J.Y.D. is supported by intramural funding from the Max Planck Society, Germany. The authors declare that they have no competing interests.