DFEnitely different: Genome-wide characterization of differences in mutation fitness effects between populations

The effect of a mutation on fitness may differ between populations, depending on environmental and genetic context. Experimental studies have shown that such differences exist, but little is known about the broad patterns of such differences or the factors that drive them. To quantify genome-wide patterns of differences in mutation fitness effects, we extended the concept of a distribution of fitness effects (DFE) to a joint DFE between populations. To infer the joint DFE, we fit parametric models that included demographic history to genomic data summarized by the joint allele frequency spectrum. Using simulations, we showed that our approach is statistically powerful and robust to many forms of model misspecification. We then applied our approach to populations of Drosophila melanogaster, wild tomatoes, and humans. We found that mutation fitness effects are overall least correlated between populations in tomatoes and most correlated in humans, corresponding to overall genetic differentiation. In D. melanogaster and tomatoes, mutations in genes involved in immunity and stress response showed the lowest correlation of fitness effects, consistent with environmental influence. In D. melanogaster and humans, deleterious mutations showed a lower correlation of fitness effects than tolerated mutations, hinting at the complexity of the joint DFE. Together, our results show that the joint DFE can be reliably inferred and that it offers extensive insight into the genetics of population divergence.

: The joint allele frequency spectrum and joint distribution of fitness effects. A: Populations that have recently diverged or have gene flow between them will share genetic variants. Some of those variants will have a different effect on fitness in the diverged population (s 2 ) than in the ancestral population (s 1 ). B: The joint distribution of fitness effects (DFE) is defined over pairs of selection coefficients (s 1 , s 2 ). Insets show joint allele frequency spectra corresponding to selection coefficients for pairs of variants that are strongly or weakly deleterious in each population. In each frequency spectrum, the number of segregating variants at a given pair of allele frequencies is exponential with the color depth. C: One potential model for the joint DFE is a bivariate lognormal distribution, illustrated here for strong correlation. D: Another potential model is a mixture of components corresponding to equality (ρ = 1) and independence (ρ = 0) of fitness effects. E: As illustrated by these simulated allele frequency spectra, stronger correlations of mutation fitness effects lead to more shared polymorphism. Here p 1 is the weight of the ρ = 1 component in the mixture model. Using simulated data, we tested the statistical power of joint DFE inference and its robustness to model misspecification, focusing on inferences of the mixture proportion p 1 for the perfectly correlated component of the joint DFE (Fig. 1D). See Methods for simulation details. A: Precise inference of p 1 was possible even for modest sample sizes. B: Precise inference of p 1 was also possible even when the populations were only recently diverged. Here T is the populationscaled divergence time. C: The bias in inferred values of p 1 was modest when data were simulated with a demographic model including both exponential growth and migration but analyzed assuming instantaneous growth or no migration. D: The bias in p 1 was also modest when data were simulated with either dominant (h = 0.75) or recessive (h = 0.25) derived alleles, but analyzed assuming additive alleles (h = 0.5). E: The bias in p 1 was also modest when data were simulated under a mixture model with gamma-distributed components, but analyzed assuming lognormal components. F: When data were simulated under a bivariate lognormal model with given correlation coefficient ρ, the inferred mixture proportion p 1 was similar to the simulated ρ, for both symmetric and asymmetric bivariate lognormal simulations.
numerically poorly behaved as the correlation coefficient approaches one and the distribution becomes very 68 thin. We thus also considered a mixture model that consisted of a component corresponding to perfect cor-69 relation, with weight p 1 , and a component corresponding to zero correlation, with weight (1 − p 1 ) (Fig. 1D).

70
The mixture proportion p 1 can be interpreted similarly to the correlation coefficient in the bivariate normal 71 distribution.

72
The correlation of the joint DFE profoundly affects the expected AFS (Fig. 1E) the inferred mean µ and variance σ of the marginal DFEs (data not shown), but it did not substantially bias 104 the mixture proportion p 1 (Fig 2D).

105
The parametric form assumed for the joint DFE is another potential source of model misspecification. To 106 test how this might bias inference, we first simulated a true mixture model in which the marginal distributions 107 were gamma, rather than lognormal. In this case, we found that inference of the mixture proportion was 108 not substantially biased (Fig. 2E). We also considered fitting the mixture lognormal model (Fig. 1D) to data 109 simulated under a bivariate lognormal model (Fig. 1C). In this case, we found that the inferred values of 110 p 1 we larger than the simulated correlation coefficient ρ, although they were similar (Fig. 2F). The mixture We applied our joint DFE inference approach to D. melanogaster , wild tomatoes, and humans, using data  To begin our analysis, we first fit models of demographic history to synonymous variants in each pop- Figure 3: Model fits to allele frequency spectra. A) For Drosophila, we fit an isolation-with-migration demographic model to the synonymous data. B) That model fit the data well, as evidenced by the small and mostly uncorrelated residuals. C) The nonsynonymous data showed a substantial reduction in shared polymorphism. Those data were best fit by a joint DFE mixture model ( Fig. 1D) with p 1 = 0.97, and the resulting model had small residuals. D) For humans, we fit an isolation-with-migration demographic model that included growth before divergence. E) That model also fit the data well. (In these plots, we projected down to a smaller sample size, to smooth variation due to the sparsity of the spectrum.) F) The nonsynonymous human data were well-fit by a joint DFE mixture model with p 1 = 0.99. G) For wild tomatoes, we fit an isolation-with-migration model to data from two closely related species. H) That model fit the data moderately well. I) When we fit the nonsynonymous tomato data with a joint DFE mixture model, we inferred p 1 = 0.91, with similar fit quality. We next estimated the joint DFE using all nonsynonymous variants in each proteome, using our lognormal 137 mixture model for the joint DFE (Fig. 1D). For D. melanogaster , we found that mutation fitness effects 138 between Zambian and French populations were highly correlated, with p 1 = 0.967 ± 0.022 (Table S4). For 139 humans we found an even higher correlation that was statistically indistinguishable from perfect correlation, 140 p 1 = 0.990±0.010 (Table S5). For tomatoes, we found a substantially smaller correlation, p 1 = 0.906±0.019.

141
In all three cases, the resulting models fit the nonsynonymous joint frequency spectrum well, with similar 142 patterns of residuals to the demographic models fit to synonymous data ( Fig. 3C,F,&I).

143
To investigate the biological basis of the joint DFE, we then considered genes of different function based  Table S4). Although the p-values for many individual terms were 147 modest, considering all tested GO terms, Fisher's combined probability test yielded a p-value of p ∼ 10 −6 , 148 suggesting that our results are highly enriched for deviations below p 1 = 1. For tomatoes, we found an even 149 wider range of inferred correlations, with the lowest being genes involved in responding to abiotic stimulus 150 at p 1 = 0.858 ± 0.093 (Fig. 5, Table S6). For humans, we found that all GO terms yielded values of p 1 151 that were statistically indistinguishable from one ( Fig. S2). Among the D. melanogaster terms, we found 152 no correlation between the inferred mixture proportion p 1 and other parameters in the joint DFE ( Fig. S3),

153
suggesting that the variation we see in p 1 is not driven simply by variation in overall constraint.

154
In humans, we further explored the biological context of the joint DFE by considering genes that are 155 involved in disease and that interact with pathogens. We found a weak signal that genes involved in Mendelian 156 disease tend to have less correlated selection coefficients than other genes (Fig. 6). We also found that genes 157 that interacted with multiple viruses had highly correlated selection coefficients, but those that were known 158 to interact with only a single virus exhibited a significantly lower correlation (Fig. 6).

159
We used the variation among our inferences for D. melanogaster GO terms to test the robustness of our 160 analyses to various modeling choices. We first fit simpler models of demographic history with instantaneous 161 growth or no migration to the synonymous data (Table S1), then used those models as the basis of joint DFE analysis. Although these demographic models fit the data much less well than our main model, the inferred values of p 1 for the GO terms were highly correlated with those from our main model (Fig. S4) Moreover, we found that the mixture proportion p 1 was dramatically smaller for the deleterious class than 179 the tolerated class in humans and D. melanogaster (Fig. 4&6), but not in wild tomatoes (Fig. 5).

181
We developed an approach for inferring the joint distribution of fitnesses effects of new mutations between 182 pairs of populations. We tested our approach with simulation studies, finding that it is statistically powerful 183 and robust to many forms of model misspecification (Fig. 2). We then applied it to D. melanogaster , wild 184 tomatoes, and humans (Fig. 3). Among these examples, we found the lowest correlation of fitness effects in 185 tomatoes and the highest in humans. In D. melanogaster and tomatoes, we found variation in the correlation 186 of the DFE among genes with different functions (Fig. 4&5). In D. melanogaster and humans, we found that 187 the correlation of the DFE is lower for mutations that are more deleterious (Fig. 4&6). Lastly, in humans,

188
we found that genes involved in interactions with specific viruses also exhibited lower correlation (Fig. 6).

189
Together, these results illustrate the biological insights that can be gained by considering the joint DFE correlation of fitness effects may be smaller for more deleterious variants (Fig. 4&6). Moreover, including 204 positive selection was not necessary to explain the data we analyzed (Fig. S5A), but it may be necessary in 205 other cases. Overall, our simple models of the joint DFE fit the data well, but more complex models may 206 be more informative.

207
In our models for the joint DFE, we assumed that the marginal DFE was identical between the two 208 populations. In other words, we assumed that the mean and the variance of mutation fitness effects did not marginal DFEs (Fig. 2F).

215
In both humans and D. melanogaster , we found the lowest correlation of mutation fitness effects in genes 216 related to immunity. In D. melanogaster , genes annotated as functioning in the immune system had the 217 lowest correlation among all GO terms tested (Fig. 4). It is well established that immune system genes with these findings. In humans, we found that genes known to physically interact with only a single virus 224 showed significantly lower correlation of fitness effects than genes known to interact with no or multiple 225 viruses (Fig. 6). This may be because genes that are known to interact with only a single virus are more 226 likely to be affected by differences in pathogen content between environments. Thus both our human and 227 D. melanogaster results are consistent with differences in selection due to distinct pathogen environments in 228 Africa and Europe.

274
Genomic data 275 We Calling success was uneven across the genome, so our projection excludes many sites from Chr3L and Chr3R 281 (Fig.S6). To annotate SNPs to their corresponding genes and as synonymous or nonsynonymous, we used followed their scheme for assigning individuals to species. To more easily apply SIFT, we used the NCBI 290 genome remapping service to convert the data from SL2.50 coordinates to SL2.40.

291
Fitting demographic history models 292 We used dadi to fit models for demographic history to spectra for synonymous mutations (Gutenkunst et al.  (Table S1). We also analyzed simpler models assuming instantaneous growth or no 296 migration. For the human analysis, we used dadi grid points of [800,1000,1200], and we found that an IM 297 model with growth in the ancestral population fit the data well (Table S3). For the tomato analysis, we 298 used dadi grid points of [300,400,500], and fit the same isolation-with-migration model that Beddows et al.

299
(2017) fit to the data, obtaining very similar parameter estimates (our Table S2 vs. their Table S4). Cached allele frequency spectra were created for the corresponding demographic models, using the same grid 310 points settings. Models of the joint DFE were then fit to nonsynonymous data by maximizing the likelihood 311 of the data, assuming a Poisson Random Field (Sawyer & Hartl 1992). In these fits, the population-312 scaled mutation rate for nonsynonymous mutations θ N S was held at fixed at a given ratio to the rate for 313 synonymous mutations θ S in the same subset of genes, as inferred from our demographic history model. For

321
We then created a spectra cache that included that γ + value (Fig. S1).

322
We separately analyzed SNPs from genes associated with different Gene Ontology terms. We used terms 323 from the Generic GO subset, which is a set of curated terms that are applicable to a range of species (The

324
Gene Ontology Consortium 2000). We considered the direct children of GO:0008510 "Biological Process", 325 and any gene annotated with a child of a given term was assumed to also be annotated by the parent term.

326
Thus in our analysis a given gene may be present in multiple GO term subsets. We used Ensembl Biomart 327 (Cunningham et al. 2019) to retrieve the annotated GO terms for each gene. 328 We also separately analyzed SNPs classified by SIFT as deleterious or tolerated (Vaser et al. 2016).

329
To do so, we considered only sites assigned "regular" confidence and ignored "low" confidence sites. To

337
For humans, we also divided genes into classes based on their role in disease and interactions with viruses.      melanogaster data, each panel shows a different subset of genes/mutations. In each panel, the histogram shows results from conventional bootstrap fitting, while the smooth curve is a normal distribution centered at the maximum likelihood inferred value and standard deviation estimated using the Godambe approach.