Controlling for off-target genetic effects using polygenic scores improves the power of genome-wide association studies

Ongoing increases in the size of human genotype and phenotype collections offer the promise of improved understanding of the genetics of complex diseases. In addition to the biological insights that can be gained from the nature of the variants that contribute to the genetic component of complex trait variability, these data have brought forward the prospect of predicting complex traits and the risk of complex genetic diseases from genotype data. Optimal realization of these objectives requires ongoing methodological developments, designed to identify true trait-associated variants and accurately predict phenotype from genotype. These methods must be computationally efficient, in order to remain tractable in the context of high variant densities and very large sample sizes. Here we show that the power of linear mixed models that are in widespread use for GWAS can be increased significantly by modeling off-target genetic effects using polygenic scores derived from SNPs that are not on the same chromosome as the target SNP. Using simulated and real data we found that this can result in a substantial increase in the number of variants passing genome-wide significance thresholds. This increase in power to detect trait-associated variants also translates into an increase in the accuracy with which the resulting polygenic score predicts the phenotype from genotype data. Our results suggest that advances in methods for phenotype prediction can be exploited to improve the control of off-target genetic effects, leading to more accurate GWAS results and further improvements in phenotype prediction.


Introduction
applied the method to standing height data from the UK Biobank and determined the number and characteristics of additional variants that were detected. For an objective assessment of performance on real data, where the true associations are unknown, 49 we divided the data into test and training sets and predicted the phenotype in the test set. The improved performance on the 50 critical task of complex phenotype prediction illustrates the utility of the PGS as a means of accounting for background genetic 51 variance.

53
We simulated data to evaluate the impact of including the PGS as a fixed effect in GWAS. The simulations consisted initially of a 54 normally-distributed continuous trait in 100,000 individuals. The trait had a narrow-sense heritability (h 2 ) of 0.5 and there were 55 1,000 causal SNPs with normally-distributed effects on the trait (see Methods for details). We incorporated the LOCO PGS 56 as a fixed effect in a linear mixed model using GCTA fastGWA 23 (we refer to this as fastGWA-PGS) and found a substantial 57 improvement in power to detect the known causal SNPs (Fig. 1). In 100 simulations we found that fastGWA-PGS recovered, on 58 average, 76 additional causal variants below the conventional P-value threshold of 5x10 −8 compared to fastGWA, corresponding 59 to a relative increase in power of 17%. The contribution to phenotype variance of off-target SNPs can also be modelled as a 60 random effect in a linear mixed model. This approach is applied by Bolt-LMM, which uses a normal mixture random effect, 61 with a component corresponding to SNPs with large effects; however, adding the PGS as a fixed effect to Bolt-LMM (which we 62 refer to here as Bolt-LMM-PGS) led to an improvement in power (Fig. 1). In 100 simulations Bolt-LMM-PGS recovered, on 63 average, 52 more causal variants than Bolt-LMM without the PGS fixed effect, corresponding to a 10% relative improvement in 64 power. Bolt-LMM achieved higher power than fastGWA (Fig 2b), at a substantial cost in terms of computational speed, as has been shown previously 23 . However, the power of fastGWA-PGS exceeded that of Bolt-LMM and was intermediate between 66 Bolt-LMM and Bolt-LMM-PGS. These results suggest that first calculating a PGS from an initial GWAS and including this as 67 a fixed effect in a linear mixed model can provide an improved means to control for off-target genetic effects. 68 We calculated receiver operator characteristic (ROC) curves to investigate whether the increased number of causal variants 69 recovered when we included the LOCO PGS as a fixed effect reflected a reduction in P-values across the board or also an 70 improvement in the ordering of the variants, when the variants are ordered by the evidence of an association with the phenotype.

4/17
between fastGWA-PGS and Bolt-LMM. Over the 100 simulations the mean sensitivity of fastGWA-PGS was consistently 80 higher than Bolt-LMM, at a given specificity (the largest increase in mean sensitivity was 0.021, which corresponded to a 81 relative increase in sensitivity of 3.2% at a specificity of 0.9991; Fig. S1, Additional File 1).

82
Dependence on trait heritability and number of causal variants 83 We simulated data over a range of values of h 2 and of the number of causal SNPs to investigate the impact of including the 84 LOCO PGS as a fixed effect on GWAS over the range of these simulation parameters. Due to the computational intensity of   95 Including the PGS as fixed effect resulted in a higher proportion of causal variants recovered across the full range of sample 96 sizes investigated (Fig. 4). Even at N=10,000 more causal variants were recovered with fastGWA-PGS than with fastGWA or 97 Bolt-LMM. This was somewhat surprising given that it is assumed that large sample sizes are required for accurate phenotype 98 prediction from PGS 24 . As the sample size increased, the number of variants recovered increased sharply for all methods 99 (fastGWA, fastGWA-PGS and Bolt-LMM), before beginning to level out at around 100,000 participants.

8/17
Application to phenotype prediction 112 One way to determine objectively whether fastGWA-PGS outperforms fastGWA on real data is to apply both methods on the 113 key task of phenotype prediction. We partitioned the UKB height data into an 80%/20% training and testing datasets and 114 calculated the PGS scores using a partitioning and thresholding (P+T) method as well as a more powerful Bayesian method to 115 estimate the phenotype in the test dataset (see Methods for details). In both cases fastGWA-PGS consistently outperformed 116 fastGWA across all inclusion thresholds (Posterior inclusion probability (PIP) and P-value). The maximum R 2 using P+T was 117 achieved at a P value threshold of 0.05 for both fastGWA (R 2 = 0.129) and fastGWA-PGS (R 2 = 0.138), corresponding to a 118 relative improvement in the proportion of the phenotype explained of 7% (Fig. 5a). The Bayesian model provided substantially 119 better prediction of the phenotype than the P+T method when variants below 0.4 PIP were included in the model and, again, given threshold being used in prediction, from improvement in the ordering of SNPs, so that a better set of SNPs are being used 126 or, lastly, from better estimates of the SNP effect sizes. To distinguish between these possibilities we repeated the prediction 127 using the same number of SNPs for fastGWA-PGS and fastGWA and then for the same set of SNPs for both (see Methods for 128 details). The performance gain was maintained when we used the same number of SNPs, but was much smaller when we used 129 exactly the same set of SNPs (Table S3, Additional File 1).

131
Omitting covariates that are associated with a response and independent of an effect of interest can result in a reduction in the 132 efficiency of the estimation of the effect of interest 25 . Complex traits are associated with the genotype of many loci across the 133 genome, but the effects of the off-target SNPs are frequently not explicitly modelled in GWAS and, instead tests of association 134 are performed for one target SNP at a time. We evaluated a simple two-stage approach to accounting for off-target genetic 135 effects that consists of performing an initial GWAS and using the summary statistics to calculate a polygenic score and then 136 including the polygenic score, derived from SNPs not on the same chromosome as the target SNP, as a fixed effect in a second 137 round of association testing. We found that this led to a substantial improvement in GWAS power using simulated data. A 138 popular approach to account for the effects of off-target SNPs is to make use of a mixture distribution for the random effect in  (Fig. 1); however, this improvement came at a substantial cost, as fastGWA is much more computationally 145 efficient than Bolt-LMM 23 . 146 The increase in power using the PGS fixed effect was largest for simulated phenotypes with high heritability and a large 147 number of causal variants (Fig. 3b). In these cases the many off-target SNPs collectively explain a substantial proportion of the 148 phenotypic variance and summarizing the contribution of these off-target SNPs to the phenotype via the LOCO PGS is likely to 149 result in a better estimate of the effect of the target SNP and its standard error. The boost in performance derived from including 150 the LOCO PGS as a fixed effect was evident even for relatively small study sizes. Across all the simulation parameters we 151 investigated, the performance of the fastGWA-PGS was never worse than fastGWA without the LOCO PGS. We also note that 152 we calculated the LOCO PGS using SNPs that were selected based on a fixed P-value threshold. Further increases in power 153 may be possible by optimizing the SNPs that are used to calculate the PGS separately for each omitted chromosome. 154 We also applied the method to real data (standing height in individuals of British ancestry in the UK Biobank). Height was The use of polygenic scores to contribute to phenotype prediction from genotype is an increasingly important application of 178 the results of GWAS 30 . Recent work has shown that PGS can identify sizeable proportions of a population with substantially 179 greater risk for CAD, type II diabetes (T2D), arterial fibrillation (AF), infammatory bowel disease (IBD) and breast cancer 16 .

180
A separate study using the FinnGen dataset compared the overall lifetime risk of coronary heart disease (CHD), T2D, AF, 181 breast cancer and prostate cancer between individuals having an average compared to a high PGS score. A high PGS score 182 was associated with a 21% to 38% increase in overall lifetime risk with age of onset 4 to 9 years earlier than in the average To test the performance of fastGWA-PGS on the task of predicting standing height, the UK Biobank data was partitioned into 247 80/20 and an independent 20/80 training and test datasets. We used two PGS strategies, pruning and thresholding (P+T) and  For the P+T method, the same fastGWA-PGS methodology was used as in the simulation study and PGS scores were 253 calculated in the test set using P-value thresholds (0.001, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 1). The Bayesian method replaces the P+T 254 method in the PGS-LOCO step with SBayesR. To limit the effect of over-fitting, we excluded variants with a posterior inclusion 255 probability (PIP) score of less than 0.2 in the PGS-LOCO calculation. As SBayesR uses Bayesian inference to estimate the 256 true effect sizes, Plink2 was used to calculate the PGS-LOCO score from the genotype and SBayesR effect size data before 257 association testing. SBayesR was then applied once more on the fastGWA-PGS summary statistics over a range of PIP scores.

258
The model fit was assessed for each method by fitting a linear model to the values of the phenotype in the test set as a function 259 of their predicted values. 260 We repeated the phenotype prediction using the same number of SNPs (74,000) with fastGWA-PGS and fastGWA. We