The Shock of the New: Progress in Schizophrenia Genomics

A growing list of common and rare genetic risk variants is being implicated in schizophrenia susceptibility. As with other complex genetic disorders, most of the variance in genetic risk is still to be attributed. What can be learned from progress to date? The available data challenge how we conceptualize schizophrenia and suggest strong aetiological links with other psychiatric and developmental disorders. With the identification of rare copy number risk variants implicating specific genes (e.g. VIPR2 and NRXN1), it is increasingly possible to investigate molecular aetiology in patient subgroups to establish whether schizophrenia represents one or many different disease processes. This review summarizes recent research progress and suggests how the tools of modern genomics and neuroscience can best be applied to understand this devastating disorder.


INTRODUCTION
The 'shock of the new' referred to a reflexive, premature and generally negative response provoked by the emergence of modern art [1]. Inevitably, with time, distance and perspective, a more balanced consensus evolved, as it does for other 'new' things. The early genomic data for schizophrenia provide a fascinating parallel: many responses have been prematurely negative [2] or have focussed on what genome-wide association studies (GWAS) did not find (the 'missing heritability debate') [3] rather than on the many novel, interesting findings that are emerging.
We believe that genomic and other research tools are available to help us come to grips with these findings. Translating them into pathophysiological insights will take time, but is important for several reasons. The onset of schizophrenia is typically in early adulthood, but the evolution and severity of symptoms and the course of illness are variable despite modern treatments. Schizophrenia is a public health problem, as approximately 1% of the adult population is affected and life expectancy is reduced by an average of 20-25 years [4]. Because schizophrenia is substantially heritable, the search for risk genes is not new, but it has faced significant obstacles.
Schizophrenia is diagnosed clinically based on a triad of symptom domains (DSM-IV) [5]. One of these symptom groups, psychosis (delusions and hallucinations), is almost ubiquitous during illness episodes. The second, involving disorganization of thinking and behavior, is also strongly associated with psychotic episodes. The final group of 'negative' symptoms represents a loss of social function and volition and is more insidious and less amenable to current therapeutics. Many patients are only ever affected in one or two of these domains and, within domains, symptoms are also heterogeneous. Although clinically useful, it is uncertain how this diagnosis maps to underlying biology. The extent to which these symptom groups overlap with other psychiatric disorders, medical disorders [6], and even normal human experience [7] is underappreciated but striking (Fig. 1).
*Address correspondence to this author at the Dept. of Psychiatry, Trinity Centre for Health Sciences, St. James' Hospital, Dublin 8, Ireland; Tel: +35318962468; Fax: +35318963405; E-mail: acorvin@tcd.ie
One implication is that psychosis may be an endpoint for many different pathological processes and even a variant of normal human experience in states of stress or restricted consciousness.
Social, behavioural and cognitive abnormalities, which also feature in schizophrenia, overlap prominently with other disorders including learning disability and autistic spectrum disorders [8]. This symptom overlap causes ongoing controversy about how the clinical boundaries of the disorder should be drawn [9]. At the same time, a new and more radical view of schizophrenia, based on modern genomics, is emerging. Many of the early findings detailed below have been surprising, even shocking, but (with the benefit of hindsight) should not have been entirely unexpected.

SCHIZOPHRENIA: THE PHENOTYPE
Given the rich genetic epidemiological literature, genetics has long offered the promise of insight into the molecular mechanisms involved in schizophrenia risk. Estimates based on twin data suggest substantial heritability (h² ≈ 0.80-0.85) [10], although many patients have no first-degree relatives with the disorder. Model fitting of twin data indicates that schizophrenia overlaps with another major psychotic condition, bipolar disorder [11]. In fact, data from high-risk studies of the offspring of mothers with schizophrenia suggest that this liability extends to include other psychotic disorders and related personality disorders, termed the 'schizophrenia spectrum disorders' [12]. This overlap is not reflected in the early molecular genetics literature for several reasons. Schizophrenia is a clinical diagnosis and in some cases it can be difficult to reach a clear diagnosis, for example in cases with mixed psychotic and affective symptoms [13]. Because of concern about misclassification errors, these types of cases were excluded from genetic analyses. Researchers were also concerned that extending studies beyond the established heritable core diagnosis would reduce study power by including cases with less clear genetic aetiology.
A remarkable exception to this orthodoxy involved the mapping of the gene Disrupted-in-Schizophrenia-1 (DISC1), which was identified in a large Scottish kindred. A balanced translocation between chromosomes 1 and 11 strongly cosegregates with mental disorder in this family. The index case had a diagnosis of conduct disorder and within the family, 18 of 29 (70%) translocation carriers had a major mental disorder (including schizophrenia, bipolar disorder or major depressive disorder), whereas none of the non-carriers had such a diagnosis [14]. More than three decades of follow-up in the family indicate that this mutation has a large effect on liability to both schizophrenia and mood disorder in carriers. Recent large-scale epidemiological studies across disorders demonstrate conclusively that this is a more general phenomenon. In a study of more than 9 million Swedish individuals, Lichtenstein and colleagues [15] confirmed that first-degree relatives of probands with either schizophrenia or bipolar disorder are at increased risk of both disorders. The shared liability extends to increased risk of autism [16] and a broad range of mental disorders in relatives of schizophrenia patients [17]. Rates of schizophrenia are also known to be three times higher in people with intellectual disability.
Collectively, these data suggest that the genetic liability to schizophrenia substantially overlaps with that of other psychiatric and neurodevelopmental disorders.

WHAT IS THE GENETIC MODEL?
Based on genetic epidemiological data on risk in different classes of relatives, it was suggested that several (or more) risk genes interact with each other and with environmental risk factors to cause schizophrenia [18]. This model framed the common disease common variant (CDCV) hypothesis, but was challenged by an opposing view that susceptibility involves the influence of rare genetic variants: the common disease rare variant (CDRV) hypothesis. An added dimension is whether schizophrenia represents a single disease entity or many different disease processes or molecular mechanisms: a rare disease rare variant (RDRV) model. Only in the last few years have the molecular methods required to begin testing these models become available; what is emerging is a genetic architecture involving both common and rare risk variation.
Early molecular studies, using linkage and candidate gene association approaches (Zone A in Fig. 2), excluded large, common single gene effects. The recent genome-wide association studies (GWAS) discussed in the next section have started to confirm some common risk variants of modest effect (Zone B in Fig. 2). From studies of copy number variation we know that examples like the DISC1 translocation, or 22q11.2 deletion syndrome (discussed below), are not unique oddities within 'mainstream' schizophrenia. A growing number of rare variants are being discovered and large-scale genome sequencing will allow more comprehensive investigation for rare risk variants (Zone C in Fig. 2).
It remains an open question as to whether the risk variants identified contribute to one or many different molecular mechanisms and represent one or many diseases.

Common Variant Common Disorder
In the pre-genome era, linkage and candidate gene studies provided a relatively meager return of potential susceptibility loci for schizophrenia and putative candidate genes. This list of candidate genes (including Dysbindin and Neuregulin-1) has received some degree of statistical support. A more definitive list may be found on the SZGene database [20]. As these variants are in some cases poorly assayed by the platforms used for the larger genome-wide association studies (GWAS) detailed below, their status remains equivocal [21].
A series of GWAS [22][23][24][25] and a large meta-analysis of GWAS data, conducted through the Psychiatric GWAS Consortium (PGC), have provided genome-wide significant evidence for at least nine susceptibility loci, as shown in Table 1.
As has been the consistent theme across common disorders, initial schizophrenia GWAS findings explain only a modest proportion of the variance in susceptibility. A significant proportion of the remainder may involve a polygenic component including hundreds, if not thousands, of common alleles of small effect. Using a polygene score method, the International Schizophrenia Consortium [23] identified substantial overlap in common putative risk alleles of small effect across both schizophrenia and bipolar samples and estimated that these explained at least one-third of total variation in liability. From the emerging GWAS data, many of the associated loci appear to confer liability to both schizophrenia and bipolar disorder [27].
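As a rough illustration of the polygene score idea described above, the sketch below sums risk-allele counts weighted by log odds ratios taken from a discovery sample. The SNP identifiers, effect sizes and genotypes are invented for illustration and are not taken from the International Schizophrenia Consortium analysis [23], whose actual procedure (including SNP selection and testing in independent target samples) is more involved.

```python
import math

# Hypothetical discovery-sample odds ratios per risk allele (illustrative values only).
discovery_odds_ratios = {
    "snp_a": 1.10,
    "snp_b": 1.05,
    "snp_c": 1.08,
}

def polygene_score(genotype, odds_ratios):
    """Sum of risk-allele counts (0, 1 or 2) weighted by log odds ratio."""
    score = 0.0
    for snp, odds_ratio in odds_ratios.items():
        allele_count = genotype.get(snp, 0)  # missing SNPs are counted as 0 risk alleles
        score += allele_count * math.log(odds_ratio)
    return score

# Two made-up individuals from a target sample.
person_low = {"snp_a": 0, "snp_b": 1, "snp_c": 0}
person_high = {"snp_a": 2, "snp_b": 2, "snp_c": 1}

low_score = polygene_score(person_low, discovery_odds_ratios)
high_score = polygene_score(person_high, discovery_odds_ratios)
print(f"low: {low_score:.4f}, high: {high_score:.4f}")
```

In a real analysis the scores would be computed across many thousands of SNPs and then compared between cases and controls in a sample independent of the discovery data.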

Rare Variants and Schizophrenia
The 22q11.2 deletion syndrome (22q11.2DS; also known as velo-cardio-facial syndrome (VCFS)) is caused by the most common large microdeletion in the human genome and has an incidence of 1 in ~4000 live births. The phenotype is highly variable and can affect multiple organs and tissues, but carriers have a 30-fold increased risk of schizophrenia [28]. An increased, if less substantial, risk of schizophrenia is reported in Marfan syndrome [29] and, conversely, a reduced risk is reported in Down syndrome [8]. Until relatively recently these findings have been seen as novel curiosities that make a limited contribution to what is a relatively common disorder.
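A back-of-envelope calculation shows why such findings were long regarded as curiosities. The sketch below combines the figures quoted above (an incidence of about 1 in 4000 and a 30-fold increase in risk) with the approximately 1% population risk cited in the Introduction. It assumes, simplistically, that carrier frequency in adults equals birth incidence and that the 30-fold figure can be treated as a relative risk against the population baseline; both are assumptions made for illustration.

```python
carrier_frequency = 1 / 4000   # 22q11.2 deletion incidence quoted in the text
baseline_risk = 0.01           # ~1% population risk of schizophrenia
relative_risk = 30             # 30-fold increase quoted in the text

# Absolute risk of schizophrenia in deletion carriers under these assumptions.
carrier_risk = baseline_risk * relative_risk

# Share of all schizophrenia cases expected to carry the deletion (Bayes' rule):
# P(carrier | case) = P(case | carrier) * P(carrier) / P(case)
fraction_of_cases = carrier_risk * carrier_frequency / baseline_risk

print(f"risk in carriers: {carrier_risk:.0%}")
print(f"share of cases carrying the deletion: {fraction_of_cases:.2%}")
```

Under these assumptions a large effect in carriers still translates into well under 1% of all cases, because the deletion itself is rare.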
This changed as we became able to detect submicroscopic deletions and duplications in the human genome [30][31] and discovered them to be more common than expected. A seminal paper by Walsh and colleagues [32] identified an increased rate of novel deletions and duplications of genes in schizophrenia cases, particularly young onset cases. These mutations were reported to disproportionately disrupt signaling networks controlling neurodevelopment. Two large consortium studies identified association with copy number change at chromosome 1q21.1 and deletions of chromosome 15q13.3 [33][34]. Subsequent studies have reported evidence for association with further CNVs, implicating deletions at 2p16.3, 3q29 and 15q11.2 and duplications of 7q36.3 and 16p11.2. Relative to the common SNP variants these mutations are rare, and cumulatively they have so far been found in 2-3% of cases [35]. Many of these CNVs span multiple genes, but the 2p16.3 and 7q36 loci implicate specific genes, NRXN1 and VIPR2 respectively, which brings them sharply into view for further investigation.
Unexpectedly, all of these CNVs confer susceptibility to a range of other developmental phenotypes including mental retardation, autism, ADHD, seizure disorders and obesity [36][37][38][39]. As an example, the 15q13.3 deletion increases risk for a wide range of clinical features including schizophrenia, autism, seizure disorder, learning disability and cardiac malformations, but a subset of carriers have no discernible clinical findings [40]. By contrast, CNVs appear to be less common in bipolar disorder than in control populations [41]. This is not simply related to the size of the genomic region; at the VIPR2 locus, implicated in schizophrenia and also in autism cases, the overlapping copy variable region was localized to exons 3 and 4 of the gene [42]. For each of these loci further studies are required to identify the range of phenotypic expression that exists and the disease penetrances. The form of phenotype expression may reflect the influence of other genetic (e.g. common or rare variants) or environmental factors, or other stochastic effects during neurodevelopment.
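One way to approach penetrance, sketched here under stated assumptions, is to apply Bayes' rule to carrier frequencies in cases and controls. The frequencies below are invented for illustration and the function name is hypothetical. The calculation treats control frequency as a proxy for frequency in unaffected individuals and considers schizophrenia alone, ignoring the other phenotypes listed above.

```python
def penetrance(freq_in_cases, freq_in_controls, prevalence):
    """P(disease | carrier) from carrier frequencies in cases and unaffected controls."""
    carriers_affected = freq_in_cases * prevalence
    carriers_unaffected = freq_in_controls * (1 - prevalence)
    return carriers_affected / (carriers_affected + carriers_unaffected)

# Hypothetical CNV seen in 0.2% of cases and 0.02% of controls, with 1% prevalence.
estimate = penetrance(freq_in_cases=0.002, freq_in_controls=0.0002, prevalence=0.01)
print(f"estimated penetrance: {estimate:.1%}")
```

With these made-up inputs, a tenfold enrichment in cases still implies that most carriers would not develop schizophrenia, which is in keeping with the observation that a subset of carriers have no clinical findings.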

DISCUSSION
Genomic data is re-shaping our understanding of schizophrenia. As with other common disorders, common variants of small effect are being implicated. Some of these variants appear to also contribute risk to other psychiatric phenotypes, in particular bipolar disorder. A point of difference from bipolar disorder, which links schizophrenia to other neurodevelopmental disorders, is the identified contribution of rare CNVs, which collectively may account for some proportion of total susceptibility to schizophrenia within the population.
It is too early to know whether the contribution of this rare variation is as significant as it is proving to be in autism [35,43]. However, these data represent a starting point to test new hypotheses using the modern tools of genomics and neuroscience research to make real breakthroughs in our understanding of the molecular mechanisms involved.

Reshaping How We Define the Disorder
The few pieces of the genetic puzzle that we now have suggest that a radical re-shaping of the clinical boundaries that define the disorder may be required. At a genetic level schizophrenia overlaps with both adult and childhood disorders of a neurodevelopmental aetiology. Knowing this allows novel studies to assess the biological underpinnings of these clinical entities. Further studies of common variation, for example using polygene analyses and cross-disorder meta-analyses (currently underway within the Psychiatric GWAS Consortium), will clarify the extent of this overlap. The range of phenotypes associated with known rare variants suggests that it is reasonable to extend GWAS of common variation to include related medical phenotypes (e.g. seizure disorder). This approach may not go far enough. As previously discussed, a striking number of different biological causes can lead to schizophrenia symptoms; this may involve many brain circuits. The discovery of rare, highly penetrant mutations makes feasible a reverse approach, where biological dissection of these mutations, rather than heterogeneous clinical phenotypes, can shape understanding of causation for at least a subset of patients currently defined as having schizophrenia.

Larger, Bigger, Better?
The Psychiatric GWAS Consortium is now trying to extend its sample to increase power to detect more modest genetic effects. The pace of gene discovery so far is typical of complex disease [44] and data from other disorders suggest that, if the consortium succeeds in doubling the current sample (to >40,000 cases), this is likely to generate many new loci [45]. An obvious question is what can be learned from genetic variants that individually increase risk of the disorder from 1% to 1.1%. The endpoint of GWAS is the discovery of biological pathways underlying complex disorders, rather than individual risk loci. For other disorders and traits, including inflammatory bowel disease and body mass, confirmed risk genes fall into specific molecular pathways which extend our understanding of the genetics and biology of these traits [46][47]. Arguably this is already happening in schizophrenia, where both the micro-RNA MIR137 and zinc finger protein ZNF804A genes appear to be involved in regulating the function of other genes. MIR137 is a particularly promising example, as it is implicated in the regulation of adult neurogenesis [48] and four of the eleven loci identified in the PGC meta-analyses of schizophrenia and bipolar disorder are predicted MIR137 targets [49].

Table 1. Schizophrenia susceptibility loci with genome-wide significant evidence.
Locus | Gene | SNP(s) | Reference
... | ... | ... | [23,26]
2q32 | zinc finger protein 804A (ZNF804A) | rs1344706 | [22]
2q32.3 | prostate-specific transcript 1 (PCGEM1) | rs17662626 | [26]
6p21-6p22 | major histocompatibility complex (MHC) | rs6913660, rs6932590, rs13211507, rs3131296 | [24]
8p21.3 | matrix metallopeptidase 16 (MMP16) | rs7004633 | [26]
8p23.2 | CUB and Sushi multiple domains 1 (CSMD1) | rs10503253 | [26]
10q24.32 | cyclin M2 (CNNM2) | rs7914558 | [26]
11q24 | neurogranin (NRGN) | rs12807809 | [24]
18q21 | transcription factor 4 (TCF4) | rs9960767 | [24]
Having better estimates of small effect sizes will also be useful for molecular pathway based analyses of the GWAS data [50]. In principle, jointly examining whether a group of related genes in a functional pathway are associated with a disease may be more powerful than testing individual markers. Before GWAS, this approach was hampered by our limited understanding of biology: candidate genes for analysis stemmed from existing hypotheses (e.g. of glutamatergic dysfunction), leading analyses to the circular conclusion that 'enrichment' for association within these genes confirmed the hypothesis. A number of different pathway based methods have been applied to schizophrenia GWAS data [51][52][53]. Although these studies overlapped in the samples examined, with the exception of pathways involved in cell adhesion [51,53], there is little agreement as to which pathways are being implicated. This may reflect differences in methodology, including different pathways being examined, but is also likely to reflect 'noise' in the data due to the many false positive findings in smaller datasets, where measures of individual risk at a SNP level are imprecise. If this is the case, testing with larger samples may provide more consistent signals and be valuable in improving pathway annotation. Having better estimates of common, small genetic risk effects may also address a second important question. If there are a large number of common risk variants within the population, does the total number of variants carried by an individual predict risk or outcome for the disorder? This question could be addressed by modelling total risk SNP burden.
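The basic logic of the pathway-based tests discussed above can be sketched with a hypergeometric enrichment calculation: given a set of genes flagged by GWAS, is a pathway over-represented among them? This is a generic illustration rather than any of the methods used in the cited studies [51][52][53], and all gene counts are invented.

```python
from math import comb

def enrichment_p_value(total_genes, pathway_genes, associated_genes, overlap):
    """Hypergeometric upper tail: probability of seeing at least `overlap` pathway
    genes when `associated_genes` are drawn at random from `total_genes`."""
    max_overlap = min(pathway_genes, associated_genes)
    tail = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, associated_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / comb(total_genes, associated_genes)

# Hypothetical numbers: 20,000 genes, 100 in a cell-adhesion pathway,
# 200 genes near associated SNPs, 6 of which fall in the pathway.
p_value = enrichment_p_value(20000, 100, 200, 6)
print(f"enrichment p-value: {p_value:.2e}")
```

A calculation of this kind ignores gene size, linkage disequilibrium and overlap between pathways, which is one reason different published methods can disagree; it is also only as reliable as the list of associated genes fed into it.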

Smaller, Better, Best?
According to a recent review of the emerging CNV literature, fewer than 5% of schizophrenia patients carry at least one of the risk CNVs identified to date [35]. Assessing their pathogenic significance is still a challenge (this issue is discussed more fully in Lee & Scherer, 2010 [54]). For instance, recurrent CNVs involving the putative schizophrenia risk genes NRXN1 and erbB4 are reported in control populations. More information on the prevalence in the general population and the phenotypic consequences of carrying these mutations for diagnosis and prognosis is urgently required. With the exception of 22q11.2DS we know little about their clinical features [28]. Some of these mutations may have a core of shared phenotypic features, as is the case with certain ASDs (e.g. Prader-Willi/Angelman syndrome), but others may have a wide range of phenotypic effects.
Whole genome sequencing data will become available for hundreds, if not thousands, of schizophrenia patients in the next couple of years. Sampling all classes of rare genetic mutations may substantially increase our estimate of how important rare mutations are in the aetiology of schizophrenia. Based on deep re-sequencing data, a recent study suggests that there is an increased rate of potentially deleterious de novo mutations in schizophrenia and autism patients [55]. Attaching pathogenic significance to rare, or even private, point mutations will be even more challenging than for CNVs. The likely first step will be to assess the full spectrum of potentially causative variants at known common (e.g. ZNF804A and TCF4) and rare risk genes (e.g. VIPR2 and NRXN1). A second step will be to establish whether implicated genes can be logically grouped based on their biology, and whether other genes in these pathways also harbor risk mutations (as was the case for DISC1 [56]). This will bring into focus much smaller, more detailed studies that assess the impact of particular mutations within carriers and their families, establish whether they are inherited or occur de novo, and determine whether mutations can be grouped, based on biology, for phenotypic or pharmacogenetic studies.
Other study designs may also be informative. One possibility is to sample families affected by a 'syndromal' form of schizophrenia who have additional phenotypes including learning disability, autism, seizure disorder or other developmental difficulties. This approach has been remarkably successful for severe neurodevelopmental disorders, where a genetic cause is identifiable in more than 60% of cases [57]. We can learn several lessons from these studies. Firstly, mutations that block or impair the formation of full-length proteins have more severe phenotypic consequences. If these types of mutations contribute appreciably to schizophrenia they will be the most easily detected by whole-genome sequencing analysis. Secondly, for any gene there is likely to be a spectrum of mutations within the human population [58]. For genes where null mutations are known to cause severe neurodevelopmental phenotypes affecting gross brain structure (e.g. DCX), less deleterious mutations of the same gene are associated with more subtle brain phenotypes [59]. It may be reasonable to test genes known to be involved in severe brain developmental phenotypes for less deleterious mutations, which could be relevant to schizophrenia or psychosis. As an example, Pitt-Hopkins Syndrome (PTHS), which is a developmental disorder with severe learning disability, can be caused by haploinsufficiency of TCF4 or deletions/missense mutations of NRXN1. Both of these genes are implicated in schizophrenia [60].

One or Many Diseases?
At this point we do not know whether the term 'schizophrenia' captures one or many different diseases. We have identified common risk variants, which may implicate many different brain circuits and predispose to psychosis through one or many different mechanisms involving information processing or salience, i.e. one's ability to attend to the most relevant or important aspect of available sensory information. Is this a generic psychosis risk for some or all of the disorders that predispose to psychosis (in Fig. 1)? The identification of rare variants with much larger effects on individual risk raises further questions. Do the CNVs at VIPR2 and NRXN1 represent entirely different diseases? If so, do they implicate different signaling pathways or do they converge on a common molecular mechanism, for example, involving GSK3β signaling? Future models of psychotic disorder may be defined by genetic risk, where groups with higher rates of risk will be identified and possibly defined as having distinct diseases at a molecular level (e.g. Fig. 3).

CONCLUSION
The modern research toolbox allows mutations to be studied using approaches that can address these questions. For example, as demonstrated by Brennand and colleagues [61], human cellular models of schizophrenia are increasingly feasible. They reported a cellular model based on neuron-type cells derived from schizophrenia patients using human induced pluripotent stem cell (hiPSC) technology. Combined with gene expression profiling, this identified altered expression of many components of the cyclic AMP and WNT signaling pathways in a small number of patients. As the authors acknowledge, this analysis assumes that schizophrenia is one disease. Applying this approach using patients grouped by mutation is a potentially powerful way of addressing the question of whether 'schizophrenia' involves one or many disease processes. Modern imaging methods, such as diffusion tensor imaging, make it possible to examine neural circuitry in these same subjects. In parallel, these same mutations can be modeled directly in animal systems to generate cellular phenotypes and investigate neural circuits in vivo using viral tracing [62] and optogenetic [63] methods. The application of these approaches, with better models, promises new insights into molecular aetiology and potentially novel therapeutics. From some future perspective the diagnosis and treatment of neurodevelopmental disorders based on clinical symptomatology may appear shocking indeed.