Schizophrenia, autism spectrum disorders and developmental disorders share specific disruptive coding mutations

People with schizophrenia are enriched for rare coding variants in genes associated with neurodevelopmental disorders, particularly autism spectrum disorders and intellectual disability. However, it is unclear if the same changes to gene function that increase risk to neurodevelopmental disorders also do so for schizophrenia. Using data from 3,444 schizophrenia trios and 37,488 neurodevelopmental disorder trios, we show that within shared risk genes, de novo variants in schizophrenia and neurodevelopmental disorders are generally of the same functional category, and that specific de novo variants observed in neurodevelopmental disorders are enriched in schizophrenia (P = 5.0 × 10^-6). The latter includes variants known to be pathogenic for syndromic disorders, suggesting that schizophrenia be included as a characteristic of those syndromes. Our findings imply that, in part, neurodevelopmental disorders and schizophrenia have shared molecular aetiology, and therefore likely overlapping pathophysiology, and support the hypothesis that at least some forms of schizophrenia lie on a continuum of neurodevelopmental disorders.


Introduction
Schizophrenia is a severe psychiatric disorder associated with a decreased life expectancy and marked variation in clinical presentation, course and outcome. The disorder is highly heritable and polygenic, with risk alleles distributed widely across the genome 1 . Common risk alleles collectively contribute to around a third of the genetic liability 2,3 , and at least 8 rare copy number variants have been identified as risk factors 4,5 . Exome-sequencing studies have also shown a contribution to risk from ultra-rare protein-coding variants; the de novo mutation rate is modestly elevated above the expected population rate, and there is an excess of ultra-rare damaging coding variants (frequency < 0.0001 in population) in genes with evidence for strong selective constraint against protein-truncating variants (PTVs) [6][7][8][9] .
SETD1A is currently the only gene in which rare coding variants are associated with schizophrenia at genome-wide significance 10 .
All of the rare CNVs known to be associated with schizophrenia also confer risk for NDDs, that is, they are pleiotropic 5 . However, these CNVs are, with one exception, multigenic and therefore it is not established that genic pleiotropy exists whereby the same genes within the CNVs increase liability to each of the disorders 11 . The only example of a single-gene schizophrenia susceptibility CNV is NRXN1. Consistent with genic pleiotropy, exonic deletions of NRXN1 increase liability to schizophrenia and NDDs, but there is marked heterogeneity in the exons affected and the deletion sizes 12 , leaving uncertainty as to whether precisely the same mutation can cause schizophrenia and NDD. In contrast, sequencing studies provide strong support for the hypothesis of genic pleiotropy, with genes that are enriched for ultra-rare coding variants in people with NDDs being enriched for de novo variants in people with schizophrenia 7,8 . Moreover, SETD1A is not only genome-wide significantly associated with schizophrenia, it is also associated with DD 13 . While SETD1A is enriched for PTVs in both schizophrenia and DD 10,13 , the degree to which pleiotropic effects across these disorders are confined to the same functional class of variant within genes is unknown. Moreover, although a specific PTV (c.4582-2delAG>-) in SETD1A has been observed multiple times in people with schizophrenia and DD 10 , little is known generally about pleiotropic effects from individual rare coding variants in schizophrenia and NDDs (i.e. true or allelic pleiotropy); this is not a trivial point, as only allelic pleiotropy implies that equivalent changes in gene function confer risk to both schizophrenia and NDDs. For example, some PTVs may cause disease by disrupting dosage-sensitive genes (i.e. haploinsufficiency), whereas others in the same gene might result in truncated proteins that have pathogenic gain-of-function or dominant-negative effects [14][15][16] .
Moreover, different PTVs in the same gene can affect different transcripts 17 , leading to different effects. Similar considerations apply to missense variants whose functions are usually unknown, but which can have very different functional consequences for the same gene, translating to different pathogenic effects 18 .
In the current study, we analysed sequencing data from schizophrenia and new large NDD cohorts to investigate the nature of the pleiotropic effects of rare coding variants on schizophrenia and NDDs. Specifically, given that neurodevelopmental impairment is typically more severe in NDDs than in schizophrenia 19 , we hypothesised that there would be a tendency for pleiotropic genes to be enriched for a more severe class of mutation in NDDs than in schizophrenia. However, in contrast to expectation, we found that genes enriched for specific classes of de novo variant in people with NDDs were also enriched for congruent variant classes in people with schizophrenia. We followed this finding with a more stringent, and conservative, test of allelic pleiotropy, and demonstrated an enrichment in schizophrenia of specific variants that have been observed de novo in NDDs. Our findings provide strong evidence for true pleiotropic effects from rare coding variants across these disorders, thus indicating that the same changes in gene function can have very different neurodevelopmental and psychiatric outcomes.

Genic pleiotropy
In schizophrenia, de novo PTVs were significantly enriched in 127 genes associated with NDD through the same mutation class (rate ratio = 4.89; Table 1). No significant enrichment was observed for schizophrenia de novo missense variants in PTV NDD genes (Table 1).
The excess of de novo PTVs in schizophrenia was ~3.7 fold greater than that observed for missense variants in PTV NDD genes (Table 1).
For 103 missense NDD genes, we again observed enrichment of the same class of de novo variant in schizophrenia (rate ratio = 1.86) and no enrichment for de novo PTVs (Table 1).
The excess of schizophrenia de novo missense variants in NDD missense genes was greater than that observed for schizophrenia de novo PTVs (Table 1), but this was not statistically significant, possibly due to the small number of observations. For 53 genes that are independently associated with de novo PTVs and missense variants in NDDs, both classes of variant were enriched in schizophrenia (Table 1), thus providing further evidence that congruent classes of variants within genes confer risk for these disorders.
Table 1 note: A Poisson regression model was used to evaluate differences between the rate of schizophrenia de novo PTVs and missense variants in NDD associated genes.

NDD gene set
Across the full exome, we found that the per-gene association test statistics for PTV enrichment in schizophrenia were positively associated with the association test statistics for PTVs in NDDs (beta = 0.18; P = 2.97 × 10^-11), but found no evidence that PTV enrichment in schizophrenia was related to NDD missense test statistics (beta = -0.018; P = 0.67). We found a trend for missense enrichment in schizophrenia to be positively associated with the NDD missense test statistics (beta = 0.056; P = 0.069) but not with NDD PTV significance (beta = 0.033; P = 0.37). These results provide support for congruent variant classes, particularly for PTVs, contributing to risk for schizophrenia and NDDs beyond genes that meet exome-wide significance in NDD.

Allelic pleiotropy
Of 46,772 unique single-nucleotide variants observed as de novo mutations in NDD studies (defined as NDD variants), 17 were also observed de novo in 3,444 schizophrenia trios.
Table note: MPC = "Missense badness, PolyPhen-2, constraint" pathogenicity score 21 ; pLI = "probability of loss-of-function intolerance" 22 .
Schizophrenia de novo variants were significantly enriched for variants in the NDD primary set (9 observed, 1.20 expected; P = 5.0 × 10^-6, Table 3). The enrichment for de novo primary NDD variants in schizophrenia was significantly greater than the general enrichment for the same types of de novo mutation in constrained genes and coding sequences (i.e. PTVs in LoF intolerant genes, and MPC ≥ 2 mutations) (P = 1.01 × 10^-5; rate ratio (95% CI) = 6.91 (3.11, 13.38); Supplementary Table S1). The enrichment for specific NDD variants in schizophrenia is therefore not simply a reflection of the known modest excess of constrained de novo variants in the disorder. Schizophrenia de novo variants were not enriched in the negative control set (Table 3), which suggests that our finding is not caused by inaccuracies in estimating the expected de novo mutation rates.
We next looked at the specific classes of NDD mutation enriched in schizophrenia and found significant enrichment for both NDD PTVs (P = 0.01; rate ratio (95% CI) = 6. […]
We sought replication of the association between NDD variants in the primary set and schizophrenia using exome sequencing data from 4,070 schizophrenia cases and 5,712 controls. The rate of variants from the primary set was ~2 fold higher in schizophrenia cases than in controls (P = 0.036; Table 4). The rates of both missense variants and PTVs in the primary set were increased in schizophrenia cases compared with controls, although only the latter was significantly higher (P = 0.024; Supplementary Table S3). The rate of NDD variants in the negative control set did not differ between cases and controls ( […]
Table 4 note: Firth's penalised logistic regression models were used to evaluate the burden of NDD variants in 4,070 schizophrenia cases and 5,712 controls. As this analysis included frameshift variants, the number of NDD variants in the primary and negative control sets differs from that presented in Table 3. P-values are uncorrected and one-tailed. NDD = neurodevelopmental disorders; CI = confidence interval.
We were able to obtain additional phenotype data for each of the 9 schizophrenia probands who carried a de novo variant from the primary variant set (Supplementary […]
Given that neurodevelopmental impairment is typically more severe in NDDs than in schizophrenia 19 , we hypothesised that there would be a tendency for pleiotropic genes to be enriched for a more severe class of mutation in NDDs than in schizophrenia. However, in contrast to our expectation, we found that genes associated with de novo variants in NDDs were enriched for the same class of variant in schizophrenia, with the evidence being particularly strong for PTVs. Conversely, there was no evidence of schizophrenia de novo variant enrichment in genes associated with a different class of variant in NDD. We next conducted a more stringent analysis of allelic pleiotropy, by testing only the same set of rare coding variants in schizophrenia that occurred in people with NDDs. Here, our findings supported our hypothesis that constrained variants observed in NDDs are enriched in schizophrenia, thus providing evidence for pleiotropy at the allelic level.
Our study suggests that the same rare variants can confer risk to a range of neurodevelopmental and psychiatric outcomes, including DD, ASD and schizophrenia, thus supporting the hypothesis that at least some fraction of schizophrenia can be conceived of as part of a continuum of NDDs 19 . We did not find support for our prediction that the type of mutation is a major factor in determining the severity of neurodevelopmental outcome, though we cannot exclude the possibility that instances of this will become apparent through large-scale sequencing studies. Rather, our findings suggest that clinical outcome reflects additional genetic, environmental or stochastic factors that can modify the effects of deleterious mutations. Indeed, there is evidence that the outcome for pathogenic CNVs is influenced by common genetic variation [25][26][27] . The situation for rare coding variants in schizophrenia is less clear 8 , but it is likely that similar considerations will apply. Our findings also indicate that the same changes in gene function can underlie both NDDs and schizophrenia, pointing to a shared molecular aetiology and therefore likely overlapping pathophysiology.
Schizophrenia and intellectual disability co-occur more often than is expected by chance, with around 3-5% of cases of schizophrenia being co-morbid 19,28 . The present findings further support the idea that co-morbidity is due to partly shared pathophysiology. However, it is unlikely that shared pathophysiology is restricted to those with co-morbid diagnoses.
Many of the schizophrenia trios, including 3 of the 9 probands with a primary de novo NDD variant, were from studies that specifically excluded individuals with intellectual disability (Supplementary Table S8). […] Supplementary Table S7.
In conclusion, we performed the first genome-wide study of allelic pleiotropy from rare coding variants for schizophrenia and NDDs. We show that sets of genes associated with NDDs are enriched for congruent classes of variant in schizophrenia, and identify specific variants enriched for pleiotropic effects across both disorders. Collectively, our findings support the hypothesis that schizophrenia forms part of a continuum of NDDs including ASD and developmental disorders. Our study points to a shared molecular aetiology and the need for more work exploring the mechanistic and clinical relationships between NDDs and schizophrenia.

Ethics statement
All research conducted as part of this study was approved by the Research Ethics Committee for Wales and consistent with regulatory and ethical guidelines.

Schizophrenia de novo data
De novo variants from 3,444 schizophrenia proband-parent trios (2,121 male and 1,323 female probands) were obtained from 11 published studies (Supplementary Table S8).
De novo variants were re-annotated using Ensembl Variant Effect Predictor (version 96) 38 .
PTVs included stop-gain, frameshift, or splice donor/acceptor variants. Missense variants were annotated with their "Missense badness, Polyphen-2, constraint" (MPC) score, which is a pathogenicity metric that combines predictions of variant deleteriousness with measures of regional missense constraint 21 . We prioritised missense variants with MPC scores ≥ 2 in our analyses, as this class of variant has been shown to be enriched in ASD cases compared with controls 20 .
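The annotation rules above can be sketched as a small classifier. This is a minimal illustration rather than the authors' pipeline: the function name `classify_variant`, the category labels and the input format are hypothetical, and the consequence strings are assumed to be the Sequence Ontology terms reported by the Variant Effect Predictor.

```python
from typing import Optional

# Sequence Ontology consequence terms assumed to correspond to the PTV
# definition in the text (stop-gain, frameshift, splice donor/acceptor).
PTV_TERMS = {
    "stop_gained",
    "frameshift_variant",
    "splice_donor_variant",
    "splice_acceptor_variant",
}

# Missense variants with an MPC score at or above this value are prioritised.
MPC_THRESHOLD = 2.0


def classify_variant(consequence: str, mpc: Optional[float] = None) -> str:
    """Return a coarse functional class for one annotated variant.

    The labels returned here are hypothetical and used only for illustration.
    """
    if consequence in PTV_TERMS:
        return "PTV"
    if consequence == "missense_variant":
        # A missing MPC score is treated as not meeting the threshold.
        if mpc is not None and mpc >= MPC_THRESHOLD:
            return "missense_MPC>=2"
        return "missense_other"
    return "other"
```

For example, `classify_variant("missense_variant", mpc=2.3)` falls in the prioritised missense class, whereas the same consequence with `mpc=1.1` does not.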

Genic pleiotropy
Neurodevelopmental disorder gene sets
NDD associated genes were identified from the Deciphering Developmental Disorders study 13 . In that study, 180 and 156 genes were, respectively, associated with de novo PTV and missense variants at exome-wide significance (P value < 2.5 × 10^-6). 53 genes were independently associated at this threshold with both PTVs and missense variants. We stratified the NDD associated genes into 3 independent groups: PTV specific (127 genes), missense specific (103 genes) and PTV + missense (53 genes), and tested each group for enrichment for de novo variants in the schizophrenia probands. The genes included in these sets are provided in Supplementary Table S9. We did not include ASD associated genes in these sets as independent PTV and missense P values were not reported in the largest published ASD study 20 .
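The stratification into three non-overlapping groups is plain set arithmetic, sketched below. The gene identifiers are toy placeholders chosen only to reproduce the reported set sizes (180 PTV genes, 156 missense genes, 53 shared), and the helper name `stratify` is hypothetical.

```python
def stratify(ptv_genes: set, missense_genes: set) -> dict:
    """Split two overlapping gene sets into three disjoint groups."""
    both = ptv_genes & missense_genes
    return {
        "PTV specific": ptv_genes - both,
        "missense specific": missense_genes - both,
        "PTV + missense": both,
    }


# Toy identifiers, not real gene symbols: gene_0..gene_179 stand in for the
# 180 PTV genes, gene_127..gene_282 for the 156 missense genes (53 shared).
ptv = {f"gene_{i}" for i in range(180)}
missense = {f"gene_{i}" for i in range(127, 283)}

groups = stratify(ptv, missense)
sizes = {name: len(genes) for name, genes in groups.items()}
```

With these inputs the group sizes are 127, 103 and 53, matching the 127 PTV and 103 missense NDD genes tested in the results.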

Statistics
For 3,444 schizophrenia trios, we used published gene mutation rates to estimate the number of de novo variants expected to occur under the null in the NDD gene sets 39,40 .
Where possible, gene mutation rates were adjusted for sequencing coverage; the use of unadjusted per-gene mutation rates would overestimate the expected number of de novo variants in these trios, and produce more conservative enrichment results (see 8 […]
This approach allows for the background enrichment of schizophrenia de novo variants in non-NDD genes to differ between PTV and missense variants. We also used a Poisson regression model to evaluate the relationship between schizophrenia de novo variant enrichment and gene level P values for PTV and missense variants in NDDs simultaneously. NDD gene P values were taken from 13 . Unlike the gene set analysis, which required an arbitrary significance threshold for a gene being considered NDD associated (i.e. P < 2.5 × 10^-6), this Poisson regression was applied to all genes.
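As a rough illustration of comparing enrichment between two variant classes, the sketch below computes observed/expected rate ratios and an exact conditional binomial test. This is a swapped-in simplification, not the Poisson regression described above: it does not adjust for background enrichment in non-NDD genes, and the function names and all counts in the usage lines are hypothetical.

```python
from math import comb


def rate_ratio(observed: int, expected: float) -> float:
    """Observed de novo count divided by the count expected under the null."""
    return observed / expected


def conditional_binomial_p(obs_a: int, exp_a: float,
                           obs_b: int, exp_b: float) -> float:
    """One-sided P value that class A is more enriched than class B.

    If both counts are Poisson with the same rate ratio, then conditional on
    the total n = obs_a + obs_b, obs_a ~ Binomial(n, exp_a / (exp_a + exp_b)).
    The P value is the upper tail P(X >= obs_a) of that distribution.
    """
    n = obs_a + obs_b
    p0 = exp_a / (exp_a + exp_b)
    return sum(
        comb(n, k) * p0 ** k * (1.0 - p0) ** (n - k)
        for k in range(obs_a, n + 1)
    )


# Hypothetical counts for illustration only (not taken from Table 1).
rr_ptv = rate_ratio(10, 2.0)        # 10 observed PTVs, 2.0 expected
rr_missense = rate_ratio(12, 10.0)  # 12 observed missense, 10.0 expected
p_diff = conditional_binomial_p(10, 2.0, 12, 10.0)
```

A small `p_diff` indicates that the first class is more strongly enriched than the second; a regression with covariates, as used in the study, is needed when background rates must be modelled.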

Neurodevelopmental disorder variants
NDD variants were identified from de novo variants observed in the largest ASD and DD proband-parent sequencing studies (total NDD trios = 37,488;

Statistics
Tri-nucleotide mutation rates were used to estimate the expected per-generation mutation rates for NDD variants 21 . These mutation rates were then used to derive the number of NDD variants expected to occur de novo under the null hypothesis in the 3,444 schizophrenia trios. As mutation rates have not been empirically established for indels, only single-nucleotide variants were considered (Table 5).
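The step from per-variant mutation rates to an expected count, and from there to an enrichment P value, can be sketched as follows. This is a minimal illustration under stated assumptions: rates are taken to be per haploid genome per generation (hence the factor of 2 per trio), the count is modelled as Poisson, only the one-sided upper tail is computed, and the function names are hypothetical.

```python
from math import exp
from typing import Iterable


def expected_de_novo(mutation_rates: Iterable[float], n_trios: int) -> float:
    """Expected number of de novo hits across a variant set in n_trios trios.

    Assumes each rate is per haploid genome per generation, so every proband
    contributes two opportunities (one per parental chromosome).
    """
    return 2 * n_trios * sum(mutation_rates)


def poisson_upper_tail(observed: int, expected: float) -> float:
    """P(X >= observed) for X ~ Poisson(expected)."""
    term = exp(-expected)  # P(X = 0)
    cdf = 0.0
    for k in range(observed):
        cdf += term                 # accumulate P(X = k)
        term *= expected / (k + 1)  # advance to P(X = k + 1)
    return 1.0 - cdf


# Observed and expected counts reported for the primary NDD variant set.
p_primary = poisson_upper_tail(9, 1.20)
```

With 9 observed and 1.20 expected, the one-sided tail probability is on the order of 5 × 10^-6, the same order as the reported P value.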
The numbers of schizophrenia de novo variants overlapping our primary and negative control variant sets were compared to those expected under the null using a two-tailed […]
NDD variants in our primary and negative control sets were further evaluated using a Swedish schizophrenia case-control exome sequencing data set, which consists of 4,070 cases and 5,712 controls 9 . Case-control exome sequencing data were analysed using Hail (https://github.com/hail-is/hail). To test for an excess burden of NDD variants in cases compared with controls, a one-tailed Firth's penalized-likelihood logistic regression model was used, correcting for the first 10 principal components derived from the sequencing data, and for the exome-wide burden of synonymous variants, sequencing platform and sex. To focus the case-control analysis on ultra-rare alleles, as those are more likely to be pathogenic, we excluded variants with an allele count > 5 in gnomAD 22 […]
MCOD wrote the manuscript, which was read, edited and approved by all authors.

Data availability
All schizophrenia de novo variants were obtained from the published sources outlined in Supplementary Table S8. De novo variants from ASD trios were obtained from Satterstrom et al 2020 20 , and de novo variants from DD trios were obtained from Kaplanis et al 2019 13 .
DD gene level association statistics were obtained from Kaplanis et al 2019 13 .