Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations

It is well established that autism spectrum disorders (ASD) have a strong genetic component. However, for at least 70% of cases, the underlying genetic cause is unknown1. Under the hypothesis that de novo mutations underlie a substantial fraction of the risk for developing ASD in families with no previous history of ASD or related phenotypes (so-called sporadic or simplex families2,3), we sequenced all coding regions of the genome, i.e. the exome, for parent-child trios exhibiting sporadic ASD, including 189 new trios and 20 previously reported4. We also sequenced the exomes of 50 unaffected siblings corresponding to these new (n = 31) and previously reported trios (n = 19)4, for a total of 677 individual exomes from 209 families. Here we show that de novo point mutations are overwhelmingly paternal in origin (4:1 bias) and positively correlated with paternal age, consistent with the modest increased risk for children of older fathers to develop ASD5. Moreover, 39% (49/126) of the most severe or disruptive de novo mutations map to a highly interconnected beta-catenin/chromatin remodeling protein network ranked significantly for autism candidate genes. In proband exomes, recurrent protein-altering mutations were observed in two genes, CHD8 and NTNG1. Mutation screening of six candidate genes in 1,703 ASD probands identified additional de novo, protein-altering mutations in GRIN2B, LAMC3, and SCN1A. Combined with copy number variant (CNV) data, these results suggest extreme locus heterogeneity but also provide a target for future discovery, diagnostics, and therapeutics.


Sample Overlap
Three of the previously reported families (12325, 12680, and 12647) are the only samples known to overlap with other studies (Sanders et al. 2012).

Rate of De Novo CNVs
We expected the de novo CNV rate for this cohort to be lower than for other ASD cohorts because 77% (94/122) had previously screened negative for large, disruptive de novo events.
Nonetheless, our observed rate of de novo CNVs (6/122, ~5%) is in line with other recent estimates for ASD 1,2, possibly owing to the increased resolution of exome sequencing for detecting gene disruptions.

Effect of Multiple Genetic Lesions on Intellectual Functioning
We performed a multivariate analysis to examine the effect of the number of "extreme" de novo coding mutations (0, 1, or 2 or more) and the presence of either de novo or rare inherited copy number variation (122/189 probands) on nonverbal IQ (NVIQ) and verbal IQ (VIQ) (Supplementary Fig. 5). Extreme mutations (n = 62) were defined as de novo protein-truncating mutations, mutations intersecting known OMIM and ASD candidate genes, and CNVs predicted to be gene breaking and pathogenic. In the sample of 122 individuals for whom CNV analysis had been completed, we observed a significant decrease in NVIQ with increased numbers of events (F(2,116) = 5.45, p < 0.01, partial η² = 0.09), but not in VIQ (F(2,116) = 1.13, p = ns, partial η² = 0.02). This result in NVIQ was strengthened, but not exclusively driven, by the presence of CNVs (F(2,116) = 0.97, p = ns, partial η² = 0.02); there was no main effect of strictly having a CNV on cognitive ability (F(2,116) = 0.71, p = ns, partial η² = 0.006). Post hoc analyses indicated that individuals with one event and individuals with two or more events scored significantly lower in NVIQ than individuals with no events (mean difference = 18.0 points, p < 0.05, Cohen's d = 0.63; and 38.5 points, p < 0.01, d = 1.69, respectively). The significant difference in NVIQ between individuals with no de novo coding mutations and individuals with two or more mutations was also observed in the complete sample of 189 individuals (F(2,186) = 6.129, p < 0.01, partial η² = 0.06) (Fig. 1c).
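As a reading aid, partial η² can be recovered from a reported F statistic and its degrees of freedom through the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to three of the values quoted above; the function name and the labels are illustrative and are not taken from the original analysis.

```python
def partial_eta_squared(f_stat: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)


# (label, F, df_effect, df_error) as quoted in the text above.
reported = [
    ("NVIQ, CNV-screened sample (n = 122)", 5.45, 2, 116),
    ("VIQ, CNV-screened sample (n = 122)", 1.13, 2, 116),
    ("NVIQ, complete sample (n = 189)", 6.129, 2, 186),
]

for label, f_stat, df_effect, df_error in reported:
    eta = partial_eta_squared(f_stat, df_effect, df_error)
    print(f"{label}: partial eta squared = {eta:.2f}")
```

Rounded to two decimals, these reproduce the effect sizes of 0.09, 0.02, and 0.06 reported for those three tests.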

IPA Analysis
Within our 49 PPI network members, IPA detected the most significant functional enrichment in [...] (Table 13).
We then performed an additional IPA analysis on the 126 genes identified in 209 samples.
The top interconnected network consists of 22 genes (15 of which are PPI members), with CTNNB1 as a central node (Supplementary Fig. 11). To further investigate the potential role of CTNNB1 interactors in autism, we selected all direct upstream interacting genes of beta-catenin in IPA and noted that 8/358 (p = 0.0030) were present in our mutation list. Furthermore, we note that CTNNB1 is directly linked to multiple highly interconnected genes in the PPI network (MYBBP1A, PBRM1, RUVBL1, TBL1XR1, and CHD8), suggesting that additional mutated genes involved in CTNNB1 function are represented in autism. This enrichment for CTNNB1 interactors further supports the hypothesis that the WNT/beta-catenin pathway may play a role in the etiology of autism 3.
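The text does not state which test produced the p-value for the 8/358 overlap. One common way to assess an overlap of this kind is a one-sided hypergeometric test, sketched below. The background size of 20,000 genes is a purely hypothetical placeholder, and using the 126-gene list as the number of draws is an assumption, so this sketch is not expected to reproduce the reported p = 0.0030.

```python
import math


def hypergeom_upper_tail(k: int, population: int, successes: int, draws: int) -> float:
    """P(X >= k) when `draws` genes are sampled without replacement from
    `population` genes, of which `successes` belong to the set of interest."""
    total = math.comb(population, draws)
    tail = sum(
        math.comb(successes, i) * math.comb(population - successes, draws - i)
        for i in range(k, min(successes, draws) + 1)
    )
    return tail / total


# Hypothetical inputs: only 358 and 8 come directly from the text.
BACKGROUND_GENES = 20_000   # assumed background, NOT from the paper
INTERACTORS = 358           # direct upstream interactors of beta-catenin
MUTATED_GENES = 126         # assumed size of the mutation list
OBSERVED_OVERLAP = 8

p_example = hypergeom_upper_tail(
    OBSERVED_OVERLAP, BACKGROUND_GENES, INTERACTORS, MUTATED_GENES
)
print(f"one-sided hypergeometric p under these assumptions: {p_example:.4g}")
```

The result depends strongly on the chosen background, which is why that parameter is flagged as an assumption rather than a reconstruction of the authors' analysis.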

Phenotyping Summaries for Selected Families
Family 13844. Proband is second of three children with an older sister (13844.s1) and younger brother (13844.s2).
Patient ID: 13844.fa Summary: Father is an adult non-Hispanic white male. Age at conception of proband is 40.
Normative range of social responsiveness, but elevated score for rigidity on broader autism phenotype. Some signs of alcoholism (use, attempting to cut down, annoyed by criticism about drinking, feeling bad about drinking, eye opening experience). No current or past medication use endorsed. Some college education. Annual household income = $101-130K. Father has head circumference of 58.5 cm (z = 1.57) and normative BMI. No comorbid diagnoses endorsed.

The proportion of individuals within various NVIQ bins with 1+ event across "disruptive" classification schemes: any event, any nonsynonymous event, any severe nonsynonymous event, and our "top candidate" list.

UW:
The variant calling threshold was a minimum of 8x in each member of a trio. Bases screened were calculated from trio-concordant positions at 8x (n), converted to diploid bases (2n) and adjusted for sex chromosomes. Probands and siblings were calculated separately as trio units. The observed mutation rate was calculated by dividing the total number of events by the total number of bases. The exact 95% Poisson confidence intervals were generated from the observed counts and then divided by the total number of bases to obtain the rate confidence intervals.
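The UW calculation can be sketched as follows, assuming the "exact" interval is the usual two-sided Poisson (Garwood) interval, solved here by bisection on the Poisson CDF so that only the standard library is needed. The function names and the example inputs are hypothetical, and the sex-chromosome adjustment mentioned above is omitted.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam), with terms computed in log space."""
    if k < 0:
        return 0.0
    if lam == 0.0:
        return 1.0
    log_lam = math.log(lam)
    total = sum(math.exp(i * log_lam - lam - math.lgamma(i + 1)) for i in range(k + 1))
    return min(1.0, total)


def _solve_lambda(k: int, target: float) -> float:
    """Find lam such that poisson_cdf(k, lam) == target (the CDF decreases in lam)."""
    lo, hi = 0.0, 1.0
    while poisson_cdf(k, hi) > target:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if poisson_cdf(k, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_poisson_ci(count: int, alpha: float = 0.05) -> tuple[float, float]:
    """Two-sided exact confidence interval for a Poisson mean given one observed count."""
    lower = 0.0 if count == 0 else _solve_lambda(count - 1, 1.0 - alpha / 2.0)
    upper = _solve_lambda(count, alpha / 2.0)
    return lower, upper


def mutation_rate_with_ci(n_events: int, concordant_positions: int):
    """Rate per diploid base with CI; sex-chromosome adjustment is NOT modeled."""
    diploid_bases = 2 * concordant_positions  # n positions -> 2n diploid bases
    lower, upper = exact_poisson_ci(n_events)
    return n_events / diploid_bases, lower / diploid_bases, upper / diploid_bases


# Hypothetical inputs, for illustration only.
rate, ci_low, ci_high = mutation_rate_with_ci(n_events=100, concordant_positions=4_000_000_000)
print(f"rate = {rate:.3e} per base (95% CI {ci_low:.3e} to {ci_high:.3e})")
```

Dividing the count-scale interval by the number of bases, as the text describes, treats the number of bases screened as fixed and known.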

Yale:
The variant calling threshold was a minimum of 20x unique reads in each member of the quartet and a minimum of 8x unique non-reference reads in the child. Bases screened were estimated by assessing the number of nucleotides in each family with a minimum of 20x unique reads in each member of the quartet and converting to diploid bases (2n). Observed mutation rates were calculated by dividing the total events per sample by the number of nucleotides analyzed per sample and averaging across individuals. The 95% confidence intervals were calculated from the variance of this measure.
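A minimal sketch of the Yale-style calculation follows. It assumes that "calculated from the variance" means a normal-approximation interval around the mean of the per-sample rates, using the sample variance and z = 1.96; the source does not spell this out. All names and inputs are hypothetical.

```python
import math
import statistics


def mean_rate_with_ci(events_per_sample, diploid_bases_per_sample, z: float = 1.96):
    """Average per-sample mutation rate with a normal-approximation confidence interval.

    Each sample's rate is its event count divided by the diploid bases analyzed
    for that sample; the interval uses the standard error of the mean.
    """
    rates = [e / b for e, b in zip(events_per_sample, diploid_bases_per_sample)]
    mean_rate = statistics.fmean(rates)
    # Sample variance (n - 1 denominator); requires at least two samples.
    standard_error = math.sqrt(statistics.variance(rates) / len(rates))
    return mean_rate, mean_rate - z * standard_error, mean_rate + z * standard_error


# Hypothetical inputs, for illustration only.
events = [1, 0, 2, 1]
bases = [6.0e7, 5.8e7, 6.1e7, 5.9e7]
rate, ci_low, ci_high = mean_rate_with_ci(events, bases)
print(f"mean rate = {rate:.3e} per base (95% CI {ci_low:.3e} to {ci_high:.3e})")
```

Unlike the pooled UW estimate, this approach weights every sample equally regardless of how many bases it contributed, and with few events per sample the lower bound can fall below zero.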