A direct physical interaction between Nanog and Sox2 regulates embryonic stem cell self-renewal

Embryonic stem (ES) cell self-renewal efficiency is determined by the Nanog protein level. However, the protein partners of Nanog that function to direct self-renewal are unclear. Here, we identify a Nanog interactome of over 130 proteins including transcription factors, chromatin modifying complexes, phosphorylation and ubiquitination enzymes, basal transcriptional machinery members, and RNA processing factors. Sox2 was identified as a robust interacting partner of Nanog. The purified Nanog–Sox2 complex identified a DNA recognition sequence present in multiple overlapping Nanog/Sox2 ChIP-Seq data sets. The Nanog tryptophan repeat region is necessary and sufficient for interaction with Sox2, with tryptophan residues required. In Sox2, tyrosine to alanine mutations within a triple-repeat motif (S X T/S Y) abrogate the Nanog–Sox2 interaction, alter expression of genes associated with the Nanog–Sox2 cognate sequence, and reduce the ability of Sox2 to rescue ES cell differentiation induced by endogenous Sox2 deletion. Substitution of the tyrosines with phenylalanine rescues both the Sox2–Nanog interaction and efficient self-renewal. These results suggest that aromatic stacking of Nanog tryptophans and Sox2 tyrosines mediates an interaction central to ES cell self-renewal.


Introduction
Embryonic stem (ES) cell self-renewal efficiency depends on the level of expression of components of the pluripotency gene regulatory network. Among these, Oct4, Sox2 and Nanog play central roles. While the levels of Oct4 and Sox2 are relatively uniform in undifferentiated ES cells, the levels of Nanog vary considerably (Hatano et al, 2005; Chambers et al, 2007; Singh et al, 2007) with high levels of Nanog directing efficient self-renewal (Chambers et al, 2003, 2007). However, the mechanisms by which Nanog delivers this function in ES cells are not fully understood. In particular, although Nanog has been reported to interact with several proteins (Wu et al, 2006; Liang et al, 2008; Costa et al, 2013), the full extent of the Nanog interactome is not known.
In the past few years, proteomic approaches have been employed to characterize and begin to understand the network of biochemical interactions controlling pluripotent cell function. This has resulted in the identification of additional proteins that interact with the key transcription factors Nanog, Sox2 and Oct4 to control and maintain the pluripotent state. Pioneering studies by Wang et al (2006) identified a Nanog-centred interactome of 17 proteins that extended to other transcription factors including Oct4, Zfp281, Nac1, Rex1 and Nr0b1. This list of Nanog-interacting proteins has since been extended (Liang et al, 2008) with a recent interactome identifying a total of 27 Nanog interactors (Costa et al, 2013). This relatively small number is in contrast to the larger number of interactors identified in recent Oct4 (Pardo et al, 2010; van den Berg et al, 2010; Ding et al, 2012) and Sox2 (Gao et al, 2012) interactomes. Interactome studies have the potential to contribute to the elucidation of the mechanisms by which specific factors function. Central to this is the identification of the interacting amino-acid side chains on partner proteins as well as the functional significance of their association. To date, biochemical characterization of protein-protein interactions in pluripotent cells has been relatively sparse with most effort analysing the interaction between Sox2 and Oct4 (Yuan et al, 1995; Ambrosetti et al, 1997, 2000; Remenyi et al, 2003; Kim et al, 2008; Chen et al, 2008a; Lam et al, 2012). From a biochemical perspective, little is known about how Nanog fits into the tight relationship between Oct4 and Sox2.
Previously, we described a method to identify partner proteins interacting with nuclear proteins of interest in ES cells and used this to identify an extensive interaction network for the transcription factor Oct4 (van den Berg et al, 2010). Here, this technique is applied to Nanog, resulting in identification of a Nanog interactome which includes over 130 Nanog partners in ES cells. From this, the direct interaction between Nanog and Sox2 was selected for further characterization, pinpointing individual residues required for the interaction and defining the functional consequences of elimination of the interaction between these central pluripotency regulators.

Identification of a Nanog interactome
An ES cell line expressing epitope-tagged Nanog protein was obtained by transfection of E14Tg2a cells with a construct in which the constitutive CAG promoter directs expression of (FLAG) 3 Nanog, linked via an IRES to puromycin resistance. Puromycin-resistant colonies were expanded and the resulting cell lines were analysed by immunoblotting. A cell line was identified (hereafter called F-Nanog) that expressed (FLAG) 3 Nanog at close to endogenous levels (Figure 1A). A qRT-PCR analysis of F-Nanog and E14Tg2a wild-type cells showed no significant difference in the expression level of the ES cell-specific genes Oct4, Sox2 and Rex1 (Figure 1B). In agreement with recent reports of autorepression by Nanog (Fidalgo et al, 2012; Navarro et al, 2012b), F-Nanog cells show a strong decrease in expression of endogenous Nanog, which has the fortuitous consequence of maximizing the proportion of Nanog protein immunoprecipitated by anti-FLAG reagents.
Figure 1 Characterization of E14Tg2a Flag Nanog cell line. (A) Expression levels of Nanog protein in E14Tg2a and E14Tg2a F-Nanog cells compared by immunoblot analysis using β-actin as a loading control. Note the reduced expression of endogenous Nanog protein in cells transfected with (Flag) 3 Nanog, consistent with autorepression of the Nanog gene by Nanog protein (Navarro et al, 2012a). (B) Expression levels of Sox2, Oct4 and Rex1 in E14Tg2a F-Nanog relative to E14Tg2a which was set to 1. Error bars are s.e.m. of three biological replicates. (C) Coomassie-stained SDS-polyacrylamide gel of the FLAG immunoprecipitation from E14Tg2a F-Nanog and control E14Tg2a cells. (D) Proteins detected by mass spectrometry analysis are grouped in classes. Transcription factors are shown in blue circles, NuRD components are in green, Trrap/p400 complex is in yellow, PcG components are in red, E2F6 complex is in purple, Sin3a complex is in burgundy, N-CoR complex is in khaki, LSD1 complex is in white, Mll complex is in blue green, chromatin remodelling/transcriptional regulation proteins are in dark orange, transcriptional machinery proteins are in pale green, proteins involved in phosphorylation are in pale blue, proteins involved in ubiquitination are in amber, proteins involved in RNA processing are in fuchsia, proteins involved in cell cycle or DNA replication are in coral, proteins involved in DNA repair are in pink and other proteins are in grey. (E) Nuclear extracts from E14Tg2a F-Nanog cells (top) or from RCNbH-B(t):F-Nanog (bottom) were immunoprecipitated as indicated and immunoblots analysed with the indicated antibodies. In the bottom panel, C refers to control samples from RCNbH-B(t) parental cells. Source data for this figure is available on the online supplementary information page.
Nuclear extracts were prepared from F-Nanog cells and parental E14Tg2a cells and used for FLAG-affinity purifications as previously described (van den Berg et al, 2010). A Coomassie-stained gel of the eluted fractions showed several bands absent from the control E14Tg2a sample, indicating good signal-to-background ratio (Figure 1C). Mass spectrometry analysis was then performed on two independent affinity purifications from both F-Nanog and E14Tg2a control cells. An extensive set of Nanog partners was identified that could be grouped into several functional categories (Table I; Supplementary Table I; Figure 1D). The group with the highest representation is transcription factors, other groups present being chromatin modification complexes (e.g., NuRD and N-CoR), proteins involved in phosphorylation or ubiquitination, basal transcriptional machinery members and RNA processing proteins. Mass spectrometric analysis of an independent cell line generated by expressing the same (Flag) 3 Nanog expression cassette in a Nanog-null ES cell line (RCNbH-B(t)) (Chambers et al, 2007) was used to verify candidate Nanog-interacting proteins (Table I; Supplementary Table I). Only the proteins identified in two out of three purifications are included in Table I and Supplementary Table I. Interactions between Nanog and Sox2, RNA polymerase II (RNAPolII), Nac-1, Sall4 and the NuRD subunit Mta2 were also observed by immunoblotting (Figure 1E). The Nanog interactome substantially overlaps with the published interactome of Oct4, Esrrb, Sall4, Nr0b1 and Tcfcp2l1 (van den Berg et al, 2010) (Figure 2). Interestingly, Chd7 and the Ncor1 complex, which are not part of the Oct4/Esrrb/Sall4/Nr0b1/Tcfcp2l1 interactome, do interact with Nanog (Figure 2). This may reflect the robust interaction of Nanog with Sox2 (Table I; Supplementary Table I) as both Chd7 and the Ncor1 complex interact with Sox2 (Engelen et al, 2011).
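The two-out-of-three inclusion rule described above can be illustrated with a minimal sketch. The protein identifiers and per-purification hit lists below are hypothetical placeholders rather than data from this study, and the function name `reproducible_hits` is an illustrative choice, not part of any published pipeline.

```python
from collections import Counter

def reproducible_hits(purifications, min_runs=2):
    """Return identifiers found in at least `min_runs` purifications."""
    counts = Counter()
    for hits in purifications:
        counts.update(set(hits))  # count each protein once per purification
    return sorted(p for p, n in counts.items() if n >= min_runs)

# Hypothetical placeholder identifiers (NOT results from the paper).
runs = [
    {"protein_A", "protein_B", "protein_C"},
    {"protein_A", "protein_C", "protein_D"},
    {"protein_A", "protein_B", "protein_E"},
]
print(reproducible_hits(runs))
```

Raising `min_runs` to 3 would keep only the proteins seen in every purification, trading sensitivity for stringency.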

Mapping the domain of Nanog interacting with Sox2
Due to the key role of Sox2 in ES cell biology, further characterization of the interaction between Nanog and Sox2 was undertaken. To determine whether the interaction between Nanog and Sox2 could be detected in wild-type ES cells, E14Tg2a nuclear extract was incubated either with an anti-Sox2 antibody and immunoprecipitates examined for the presence of Nanog or with an anti-Nanog antibody and immunoprecipitates examined for Sox2. Nanog was detected in Sox2 immunoprecipitates (Figure 3A) and Sox2 was also detected in Nanog immunoprecipitates (Figure 3B). To map the sites of interaction, co-transfections of (Flag) 3 Sox2 with (HA) 3 Nanog or Nanog deletion mutants were performed (Figure 3C) in E14/T cells (Chambers et al, 2003). Nuclear extracts from ES cells transfected with (HA) 3 Nanog and (HA) 3 -tagged mutants lacking the N-terminus, the DNA binding homeodomain (HD) or the C-terminus of Nanog were immunoprecipitated with the HA antibody and after SDS-PAGE, immunoblots were probed for the presence of interacting Sox2 using a Flag antibody. (Flag) 3 Sox2 does not interact with a Nanog mutant lacking the C-terminal domain but the interaction between Sox2 and Nanog variants lacking either the N-terminus or the HD remained intact (Figure 3C).
To identify the subregion of the Nanog C-terminal domain responsible for the interaction with Sox2, (Flag) 3 Sox2 was cotransfected with (HA) 3 Nanog variants carrying mutations within the C-terminal domain. Co-immunoprecipitations showed that deletion of the tryptophan repeat (WR) region, within the C-terminal domain of Nanog, but not residues C-terminal to the WR, abrogated the interaction with Sox2 (Figure 3D). Importantly, a Nanog mutant in which all 10 tryptophan residues in the WR region were mutated to alanines, (HA) 3 Nanog WR W10-A , also failed to interact with Sox2, pinpointing the tryptophan residues as critical determinants of the interaction with Sox2. To determine whether the interaction of Nanog and Sox2 was direct, Sox2 was coexpressed in E. coli alongside a fusion between Maltose Binding Protein and either the Nanog tryptophan repeat or the Nanog tryptophan repeat in which all the tryptophans were replaced by alanines (MBP-WR or MBP-WR W10-A ) (Figure 3E). The MBP-fusion proteins were then purified on an amylose column and any interacting Sox2 was detected by immunoblotting with a Sox2 antibody. MBP-WR, but not MBP-WR W10-A , was able to co-precipitate Sox2 (Figure 3E). Taken together, these experiments indicate that Nanog and Sox2 interact directly, that the interaction with Sox2 can be mediated by the Nanog WR domain alone and that tryptophan residues within the WR are required for interaction with Sox2. In addition, the ability of these proteins to interact in E. coli implies that post-translational modifications are not required for interaction between Nanog and Sox2.

The region of Sox2 interacting with Nanog
To identify the region of Sox2 involved in the interaction with Nanog, we investigated mutants carrying deletions within the C-terminal domain, the HMG DNA binding domain or residues at the N-terminus of Sox2 (Figure 4A). Each of these mutants was co-expressed with (HA) 3 Nanog in E14/T cells, nuclear extracts prepared and the HA antibody used to co-immunoprecipitate (HA) 3 Nanog and interacting proteins. Samples were then analysed by SDS-PAGE and immunoblotting. (Flag) 3 Sox2 mutants lacking the N-terminal region, the DNA binding domain or the C-terminal 56 amino acid residues [(Flag) 3 Sox2 1-263] were still able to interact with Nanog (Figure 4A). However, (Flag) 3 Sox2 1-204 does not interact with Nanog, suggesting that the serine-rich region is involved in the interaction with the Nanog WR.
The persistence of the Nanog-Sox2 interaction in nuclear extracts that have been treated with the nuclease benzonase, to eliminate interactions mediated via DNA bridging, suggests that DNA binding is not required for the Nanog-Sox2 interaction. Moreover, the above results indicate that Nanog and Sox2 can interact in the absence of a DNA binding domain on either of the proteins (Figures 3C and 4A). To consolidate the notion that the Nanog-Sox2 interaction is fully DNA independent, we show by co-immunoprecipitation of (Flag) 3 Sox2ΔHMG and (HA) 3 NanogΔHD that Nanog and Sox2 molecules that lack the DNA binding domains can still interact (Figure 4B).
Our analysis of the ability of (HA) 3 Nanog to co-immunoprecipitate Sox2 mutants (Figure 4) suggested that the serine-rich region, from residues 205 to 263, plays a key role in the Nanog interaction. To narrow down the region of Sox2 interacting with Nanog, further deletion mutants within this region were generated (Figure 5A). Co-immunoprecipitation analyses show that while a Sox2 mutant truncated after residue 233 retained the ability to interact with Nanog, Sox2 mutants with deletion of residues between 205 and 233, or truncated after residue 212 were unable to interact with Nanog (Figure 5B). These analyses identify a critical Nanog-interacting region in Sox2 between residues 212 and 233. This sequence is highly enriched for hydroxyamino acids (12/21 residues) and, like the WR of Nanog, is devoid of acidic and basic side chains. Moreover, careful examination of this 21 amino-acid region highlighted three repeats of the sequence S X T/S Y that may be responsible for mediating the interaction with Nanog. To determine the potential importance of these motifs for the Nanog-Sox2 interaction, additional truncations were made after residues 218 and 226, which truncate Sox2 after repeat 1 or 2, respectively. This analysis indicates that repeat 1 is sufficient for interaction with Nanog but that together repeats 1 and 2 interact with Nanog with an efficiency approaching that of wild-type Sox2 (Figure 5B). To examine the sequences required on Sox2 in more detail, a series of point mutations were generated within the repeats (Figure 6). Individual or combinatorial contributions of each of the three repeats to Nanog binding were initially examined (Figure 6A and B). Mutation of individual repeats suggests an order of importance for Nanog interaction of repeat 1 > repeat 3 > repeat 2 (Figure 6A). This is supported by analyses of the combinatorial mutants where mutation of repeats 1 + 3 almost entirely eliminates the ability of Sox2 to interact with Nanog (Figure 6B).
Mutations of the amino acids at positions 1, 3 or 4 in all the three repeats were next analysed. Combined mutations at positions 1 and 3 had negligible effects (Figure 6C), implying that residues at these positions are not required for Nanog interaction. In contrast, the combined mutation of the tyrosines at position 4 indicates that these residues play a key role in the interaction with Nanog (Figure 6C). Together, these experiments suggest that the tyrosines are the residues directly interacting with the Nanog WR, with the tyrosines in repeats 1 and 3 being more important in this regard than the tyrosine in repeat 2. The Sox2 tyrosine residues could interact with Nanog via the hydroxyl, the phenyl ring or both. Since the tryptophans in the WR are critical for the Nanog/Sox2 interaction, this raises the hypothesis that hydrophobic stacking of the aromatic rings in the Sox2 tyrosines and the Nanog WR tryptophans mediates the interaction. If this were the case, then the tyrosine hydroxyl groups should be unimportant for the interaction between Nanog and Sox2. To test this hypothesis, the tyrosines were mutated to phenylalanine. The direct comparison of the interaction between (HA) 3 Nanog and (Flag) 3 Sox2:YYY4A or (Flag) 3 Sox2:YYY4F by co-immunoprecipitations clearly shows that substitution of the tyrosine residues with phenylalanines rescues the Nanog interaction, indicating that it is the benzene ring of these amino-acid residues that is required for the interaction to occur (Figure 6D).
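The repeat pattern and the position-4 substitutions described above can be sketched in code. The example below is a minimal illustration only: it uses an artificial toy peptide (not the Sox2 sequence), and the function names are hypothetical. It locates S X T/S Y tetrapeptides with a regular expression and replaces the tyrosine at position 4 with alanine or phenylalanine.

```python
import re

# S-X-(T/S)-Y: serine, any residue, threonine or serine, tyrosine.
# The lookahead allows overlapping occurrences to be reported.
MOTIF = re.compile(r"(?=(S.[ST]Y))")

def find_repeats(seq):
    """Return (0-based start, matched tetrapeptide) for each motif occurrence."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(seq)]

def mutate_position4(seq, replacement):
    """Replace the tyrosine at position 4 of every motif occurrence."""
    residues = list(seq)
    for start, _ in find_repeats(seq):
        residues[start + 3] = replacement
    return "".join(residues)

# Artificial toy peptide (NOT the Sox2 sequence) carrying three repeats.
toy = "AASGTYGGSPSYAASQTYGG"
print(find_repeats(toy))
print(mutate_position4(toy, "A"))  # tyrosine-to-alanine variant
print(mutate_position4(toy, "F"))  # tyrosine-to-phenylalanine variant
```

In this sketch both variants lose the hydroxyl group at position 4, but only the phenylalanine variant keeps an aromatic ring, which is the distinction the co-immunoprecipitation comparison relies on.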

Identification of Nanog/Sox2 binding motif in vitro
To investigate possible DNA sequences bound by the Nanog/Sox2 complex, (His) 6 -tagged Nanog and unmodified Sox2 were co-expressed in E. coli for use in Systematic Evolution of Ligands by Exponential Enrichment (SELEX). As controls, MBP-Nanog and (His) 6 -Sox2 were expressed individually. Purification from bacterial lysate containing co-expressed proteins on a nickel column followed by elution with imidazole yielded two proteins of the expected size for Nanog and Sox2. These were recognized by α-Nanog and α-Sox2 antibodies (Figure 7A), with N-terminal sequencing establishing the identities of the two bands as Nanog and Sox2. The Nanog-Sox2 interaction is robust, since the proteins co-purify through subsequent ion exchange (Figure 7B). The Nanog-Sox2 complex bound to the Ni-agarose, MBP-Nanog bound to amylose resin and (His) 6 -Sox2 bound to Ni-agarose were used for SELEX, the bound oligonucleotides cloned and the sequences determined (Figure 7C) and used to derive the motifs shown (Figure 7D). The motif obtained from Nanog alone has a TAAT core sequence followed by CG, consistent with the motif obtained previously by SELEX (Mitsui et al, 2003) and the nucleotide preferences of the isolated Nanog HD in EMSAs (Jauch et al, 2008). Sox2 also gives a motif highly similar to that determined by SELEX (CA/TTTGA/T) (Harley et al, 1994; Maruyama et al, 2005). The motif obtained from the Nanog/Sox2 complex is bipartite with bases 10-15 similar to the motif obtained by us and others for Sox2 alone (Harley et al, 1994; Maruyama et al, 2005) and bases 5-7 showing similarity to the central core of the Nanog motif identified by SELEX (TAAT) in this work and by others (Mitsui et al, 2003). However, the published Nanog motif has a high degree of confidence over a four base sequence (TAAT) while the Nanog-Sox2 binding sequence shows high certainty for only three bases (TAA) with the preference for the 3′-flanking CG no longer apparent.
This difference may reflect an alteration in the binding specificity of Nanog when in complex with Sox2. Interestingly, the SELEX motif shows high similarity to a Nanog/Sox2 motif identified by de novo methods from ChIP-Seq data (Hutchins et al, 2013), which notably also contains a 2-bp gap between the major binding nucleotide groups (Figure 7D). Therefore, a combined motif was generated and used to search available ChIP-Seq data sets. Analysis of three independent ChIP-Seq data sets (Chen et al, 2008b; Marson et al, 2008; Whyte et al, 2013) identified 3257 Nanog/Sox2 overlapping peaks, which are common to the three data sets (out of a total of 16 454 from all Nanog/Sox2 overlapping peaks in the three data sets). Of these 3257 high confidence peaks, 29.1% (948 peaks) contain the motif. The motif occurs in a significantly smaller fraction of the Nanog only or Sox2 only peaks (4898 peaks out of a total of 31 271 peaks; 15.7%; hypergeometric P-value < 1 × 10⁻¹⁰). Examples of occurrences of the motif relative to the nearest gene are shown (Figure 7E; Supplementary Table II).
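The enrichment arithmetic quoted above can be checked with a short sketch. The text does not state exactly how the hypergeometric test was set up, so the background used here is an assumption: the 3257 shared peaks and the 31 271 Nanog-only or Sox2-only peaks are pooled into one population, and the upper tail P(X ≥ 948) is computed for a draw of 3257 peaks. The function names are illustrative.

```python
from math import exp, lgamma

def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def hypergeom_upper_tail(k_obs, population, successes, draws):
    """P(X >= k_obs) when `draws` items are sampled without replacement."""
    failures = population - successes
    log_total = log_comb(population, draws)
    log_terms = [
        log_comb(successes, k) + log_comb(failures, draws - k) - log_total
        for k in range(k_obs, min(successes, draws) + 1)
        if draws - k <= failures
    ]
    peak = max(log_terms)  # factor out the largest term for numerical stability
    return exp(peak) * sum(exp(t - peak) for t in log_terms)

# Counts quoted in the text.
shared_peaks, shared_with_motif = 3257, 948
single_peaks, single_with_motif = 31271, 4898

# ASSUMPTION: both peak classes are pooled to form the background population.
population = shared_peaks + single_peaks
successes = shared_with_motif + single_with_motif

p_value = hypergeom_upper_tail(shared_with_motif, population, successes, shared_peaks)
print(f"{100 * shared_with_motif / shared_peaks:.1f}% vs "
      f"{100 * single_with_motif / single_peaks:.1f}%")
print(p_value < 1e-10)
```

Working in log space avoids overflow in the very large binomial coefficients involved; under the pooled-background assumption the tail probability falls below the 1 × 10⁻¹⁰ threshold reported in the text.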

The Nanog-Sox2 interaction is critical for Sox2 function
To investigate the functional significance of the interaction between Nanog and Sox2, we took advantage of ES cells carrying a conditional Sox2 knock-out allele (Sox2CKO). In this cell line, one of the Sox2 alleles is flanked by loxP sites (Favaro et al, 2009), while the other Sox2 allele has been replaced with a β-geo cassette (Zappone et al, 2000; Avilion et al, 2003). These cells also have a constitutively expressed CreER T2 -IRES-Puro transgene integrated randomly in the genome. Upon addition of tamoxifen, CreER T2 is translocated to the nucleus and excises the Sox2 gene between the loxP sites (Figure 8A). As ES cells from which Sox2 activity has been removed are unable to self-renew and instead differentiate into trophectoderm-like cells (Masui et al, 2007), this cell line was used to test whether Sox2 mutant molecules impaired in Nanog binding could rescue the Sox2 null phenotype (Figure 8B). Sox2CKO cells expressing a GFP control plasmid completely differentiate upon tamoxifen treatment (Figure 8B). As expected, transfection with an unmutated (Flag) 3 Sox2 cDNA rescued this differentiation phenotype. In contrast, cells expressing the (Flag) 3 Sox2:YYY4A transgene showed a decrease in self-renewal activity with 50% fewer undifferentiated colonies compared to wild-type Sox2 (Figure 8C). In accordance with the interaction data, expression of (Flag) 3 Sox2:YYY4F fully rescued the differentiation phenotype (Figure 8C). To examine the possibility that the reduced colony formation by the Sox2:YYY4A cells was due to a reduced expression level, an immunoblot for Sox2 was performed. However, the amount of Sox2 expressed is comparable between Sox2:YYY4A and other lines and does not differ from the endogenous Sox2 level expressed by the parental line (Figure 8D). These data suggest that the interaction with Nanog is a key component in the function of Sox2 in ES cell self-renewal.
To further investigate the effect of disrupting the Nanog-Sox2 interaction, the expression of genes present in the ChIP-Seq data sets was examined in cell lines expressing wild-type or mutant Sox2 (YYY4A). Of 13 genes analysed, 5 showed consistent differences by qRT-PCR when the Nanog/Sox2 complex was disrupted (Figure 8E). The genes that show altered expression include transcription factors reported to be important for ES cell identity (Rex1 and Klf5; Shi et al, 2006; Parisi et al, 2010), the gene encoding the chromatin remodelling protein Myst4 (Ura et al, 2011) as well as the cell-surface markers Ncam and Itga9 (Rugg-Gunn et al, 2012). In addition, Oct4, which does not contain the Nanog/Sox2 motif, did not change expression level in the absence of a functional Nanog/Sox2 complex. It is therefore likely that the effect of disrupting the Nanog/Sox2 complex on self-renewal is a consequence of the misregulation of the genes controlled by the two proteins in complex.

Discussion
By taking advantage of improved methodology (van den Berg et al, 2010), the Nanog interactome has been expanded to over 130 proteins, which can be subdivided into a number of different categories (Table I; Supplementary Table I). Many of the proteins identified in the interactome are components of large multi-subunit complexes involved in chromatin modification, for several of which all the known subunits are detected. Most of these are considered to be transcriptional repressors (NuRD, Polycomb Group protein (PcG), the atypical Polycomb complex E2F6, Sin3a and N-CoR) that bind to genomic sites adjacent to differentiation-specific genes to mediate repression (Jepsen and Rosenfeld, 2002; McDonel et al, 2009; Surface et al, 2010; Qin et al, 2012). Emerging evidence suggests that NuRD and PcG complexes are also found at sites that are actively transcribed (Brookes et al, 2012; Reynolds et al, 2012). How the NuRD complex is directed to target genes is not fully understood but Nanog and/or other NuRD-interacting transcription factors may target the complex to the relevant sites in the genome. In this respect, it is interesting that inducing Nanog protein results in enhanced binding of both Nanog and NuRD to the Nanog enhancer (Fidalgo et al, 2012).
Another proposed role for the chromatin modification complexes is to maintain repressed genes in a state that allows a rapid response to external cues. Evidence for this comes from the co-localization of enzymatically active PRC complexes and the paused form of RNA PolII at a large number of developmentally important genes (Brookes et al, 2012). This could allow alterations in the signalling environment to promptly increase the level of gene expression. The interaction of Nanog with both PRC and RNA PolII may reflect this poised state of some genes. The association with the chromatin modification machinery is common to transcription factors involved in maintenance of ES cell pluripotency (Liang et al, 2008; Pardo et al, 2010; van den Berg et al, 2010; Ding et al, 2012). However, the range of complexes binding to individual factors differs, with SWI/SNF not directly connecting to Nanog (this study; Wang et al, 2006) but interacting with other transcription factors (van den Berg et al, 2010). Recent data showing that Esrrb can substitute for Nanog function in ES cells (Festuccia et al, 2012) could in part be explained by the fact that Esrrb and Nanog bind to a number of the same chromatin modification complexes.
The Nanog interactome includes a number of proteins that have not previously been identified in an ES cell transcription factor interactome. In addition to TET-1, which has also been shown to interact with Nanog (Costa et al, 2013), these include the RNA processing proteins Ilf3, Rbm9, Pum1/2 and the transcription factors Zfp326, Arid5b, Zfp609. Examining the function of these molecules and the significance of their interaction with Nanog will provide further detail on how the extensive protein interaction network functions to control pluripotency.
In this study, we have focussed on the interaction between Nanog and Sox2 because of the central role of these proteins in the pluripotency gene regulatory network. Sox2 has been shown to interact with another key pluripotency factor, Oct4, by interaction of side chains within the DNA binding domains (Ambrosetti et al, 1997, 2000). In the case of Nanog and Sox2, interaction occurs through sequences outwith the DNA binding domains. Nevertheless, the sequence of the SELEX motif suggests that this interaction results in a specific spatial relationship of DNA binding domains of both proteins on DNA. The sequence of Sox2 that mediates interaction with Nanog is a triple repeat of the sequence S X S/T Y. Experiments analysing Sox2 mutants for their ability to rescue ES cells from differentiation induced by Sox2 deletion demonstrate the importance of the interaction of Nanog with Sox2. Mutation of the tyrosines in the S X S/T Y motifs to alanines reduces the formation of undifferentiated ES cell colonies to 50% of the level achieved using a nonmutant Sox2 cDNA in the absence of any difference in protein levels expressed by the transgene. Therefore, the 50% drop in undifferentiated colonies observed in the presence of Sox2:YYY4A is a result of the misregulation of Nanog/Sox2 gene targets. The use of the SELEX motif identified as a Nanog/Sox2 target sequence together with a previously published de novo target sequence (Hutchins et al, 2013) allowed potential target genes of the Nanog/Sox2 complex to be identified. A number of these genes show altered expression upon abrogation of the Nanog/Sox2 interaction (e.g., Ncam, Itga9, Klf5 and Myst4). However, not all the genes tested are sensitive to loss of the interaction between Nanog and Sox2 (Supplementary Table II).
This could suggest that in such cases the hydrophobic interaction of Nanog and Sox2 proteins is not required for chromatin binding, or that only in some cases is the associated gene sensitive to disruption of the interaction. The latter is reminiscent of our finding that only a subset of loci that bind Nanog respond to the presence of Nanog by modulating expression of a nearby gene (Festuccia et al, 2012).
In ES cells, composite Oct/Sox binding sites have been proposed to be redundantly regulated by Sox4, Sox11 and Sox15 (Masui et al, 2007). However, this redundancy does not extend to blockade of differentiation caused by Sox2 deletion. Consistent with this, Sox4, Sox11 and Sox15 are not present in the Nanog interactome and none of these Sox proteins contains a sequence that matches the S X S/T Y motif.
Figure 6 Identification of amino-acid residues within Sox2 (213-233) interacting with Nanog. (A) Top, schematic representation of hydroxyamino acid mutations in repeats 1, 2 or 3 in Sox2. Bottom, E14/T cells were transfected with (HA)3Nanog and the indicated (FLAG)3Sox2 mutants. Immunoblots of the HA immunoprecipitates were analysed by immunoblotting with an anti-FLAG or an anti-HA antibody. I is 1% of input. (B) Top, schematic representation of the combinatorial mutations of the hydroxyamino acids in repeats 1, 2 and 3 of Sox2. Bottom, E14/T cells were transfected with (HA)3Nanog and the indicated (FLAG)3Sox2 mutants. Immunoblots of the HA immunoprecipitates were analysed by immunoblotting with an anti-FLAG or an anti-HA antibody. I is 1% of input. (C) Top, schematic representation of the mutations of the hydroxyamino acids in positions 1, 3 or 4 of repeats 1, 2 and 3 of Sox2. Bottom, E14/T cells were transfected with (HA)3Nanog and the indicated (FLAG)3Sox2 mutants. Immunoblots of the HA immunoprecipitates were analysed by immunoblotting with an anti-FLAG or an anti-HA antibody. I is 1% of input. (D) Top, schematic representation of the mutations of the hydroxyamino acids in position 4 of repeats 1, 2 and 3 of Sox2. Bottom, E14/T cells were transfected with (HA)3Nanog and the indicated (FLAG)3Sox2 mutants. Immunoblots of the HA immunoprecipitates were analysed by immunoblotting with an anti-FLAG or an anti-HA antibody. I is 1% of input. Source data for this figure is available on the online supplementary information page.

(C) Sequence of 22 oligonucleotides that contribute to the motif generated by the de novo discovery program MEME. (D) Top panel, SELEX motifs generated for Nanog and Sox2 expressed individually from a total of 19 (Nanog) and 15 (Sox2) sequences submitted to MEME; middle panel, SELEX motif generated for Nanog/Sox2 complex from 38 sequences submitted to MEME; bottom panel, representation of the de novo Nanog/Sox2 motif (Hutchins et al, 2013) and the combined motif from SELEX sequence for Nanog/Sox2 and de novo Nanog/Sox2 motif. Motifs in the bottom panels were generated with Web Logo 3.3. (E) Nanog and Sox2 ChIP-seq peaks located near the transcriptional start sites of Zfp42, Klf5, Ncam1 and Myst4. The peaks that contain the Nanog/Sox2 motif are highlighted in the shaded box; Nanog (N) and Sox2 (S) peaks in the Chen (C), Marson (M) and Whyte (W) data sets. Source data for this figure is available on the online supplementary information page.

The three copies of the S X S/T Y motif in Sox2 occur within a 15-residue sequence in which 9 residues are hydroxyamino acids. Despite this preponderance of hydroxyamino acids, it is the aromatic rings of the tyrosine residues that are critical mediators of the interaction with Nanog. This conclusion is derived from the fact that alanine substitution of all three serines at position 1 of the repeats or all three serines/threonines at position 3 of the repeats allowed continued efficient binding to Nanog, whereas alanine substitution of all three tyrosines decreased the Nanog interaction severely. Moreover, the fact that the Nanog interaction could be rescued when the tyrosines were substituted by phenylalanines indicates that the tyrosine hydroxyl groups are not required for the interaction and is highly suggestive that the two proteins interact by stacking of the aromatic rings. This is consistent with the fact that tyrosine and tryptophan residues cluster at protein-protein interaction 'hot spots' (Bogan and Thorn, 1998; DeLano, 2002). Functionally relevant stacking of tryptophan and tyrosine residues has also been demonstrated in the formation of an aromatic gate in apo flavodoxin (Genzor et al, 1996), in the regulation of galactose oxidase activity (Rogers et al, 2007) and, of particular relevance to this study, in the interlocking of tyrosine and tryptophan residues at the interaction interface of human nuclear receptor pregnane X receptor (PXR) that mediates protein homodimerization (Noble et al, 2006).
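The logic of the repeat analysis can be illustrated with a short sketch. The code below locates S X S/T Y repeats in a sequence and substitutes the tyrosine at position 4 of each repeat, mirroring the tyrosine-to-alanine and tyrosine-to-phenylalanine mutants. The 15-residue string `TOY_REGION` is a hypothetical sequence constructed only to contain three repeats and 9 hydroxyamino acids; it is not the Sox2 sequence, and the function names are illustrative assumptions.

```python
import re

# Hypothetical 15-residue stretch with three S-X-S/T-Y repeats.
# This is NOT the real Sox2 sequence; it only illustrates the motif logic.
TOY_REGION = "SGSYASQTYGSMSYA"

# Position 1: S, position 2: any residue, position 3: S or T, position 4: Y.
MOTIF = re.compile(r"S.[ST]Y")


def find_repeats(seq):
    """Return (start index, matched text) for each non-overlapping repeat."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(seq)]


def substitute_tyrosines(seq, replacement):
    """Replace the position-4 tyrosine of every repeat with `replacement`."""
    residues = list(seq)
    for start, _ in find_repeats(seq):
        residues[start + 3] = replacement
    return "".join(residues)


def count_hydroxy(seq):
    """Count hydroxyamino acids (serine, threonine, tyrosine)."""
    return sum(residue in "STY" for residue in seq)


if __name__ == "__main__":
    print(find_repeats(TOY_REGION))
    print(substitute_tyrosines(TOY_REGION, "A"))  # tyrosine-to-alanine mutant
    print(substitute_tyrosines(TOY_REGION, "F"))  # tyrosine-to-phenylalanine mutant
    print(count_hydroxy(TOY_REGION))
```

Phenylalanine keeps the aromatic ring but lacks the hydroxyl group, so the phenylalanine mutant in this sketch no longer matches the hydroxyamino acid pattern while retaining the feature proposed to mediate stacking.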
The lack of a requirement for the hydroxyl groups on Sox2 for the Nanog interaction is underscored by experiments using bacterially expressed recombinant proteins, which demonstrate that the Nanog-Sox2 interaction occurs in the absence of post-translational modifications. However, this does not exclude the possibility that post-translational modifications affect the interaction between Nanog and Sox2. The interaction between the two proteins occurs through polypeptide stretches devoid of strongly charged amino-acid side chains. Potential modification of hydroxyl groups on the Sox2 interaction surface, whether on the tyrosine or on the neighbouring serine and threonine residues, would introduce charged moieties that would be expected to interfere with the interaction between the hydrophobic interacting residues. Moreover, recent work indicates that hydroxyl groups on Sox2 can also be modified by addition of N-acetylglucosamine, although the effect on Sox2 function is unclear (Jang et al, 2012). The fact that Nanog interacts with proteins that mediate post-translational modifications such as phosphorylation and ubiquitination is consistent with the observation that Nanog is phosphorylated (Yates and Chambers, 2005; Moretto-Zita et al, 2010) and ubiquitinated (Moretto-Zita et al, 2010). In addition, Nanog partners could also be affected by such modifications because of physical proximity to the relevant enzymes. The role of these modifications and how they influence interactions between transcription factors and/or transcription factor function in ES cells is an important area for future investigation.
The high number of Nanog-interacting proteins identified in this study suggests that Nanog acts as a 'hub' protein (Han et al, 2004; Mullin and Chambers, 2012). The ability of individual partner proteins to interact with a hub protein like Nanog depends on the affinity of the interaction and the availability of the binding sites on both the hub protein and the partner, as has been discussed previously (Han et al, 2004; Mullin and Chambers, 2012). Since both competitive and non-competitive interactions are simultaneously possible, it will be important to determine which factors compete for the same regions of Nanog. Of particular relevance will be whether factors that bind through the WR interact through a precise subregion of the WR or whether there is variability in the exact sequence bound by a specific partner. To date, only Sox2 and Nac1 have been demonstrated to interact directly with the WR. Loss of the tryptophans of the WR has also been demonstrated to abrogate the interaction of Nanog with Sall4, Nr0b1, Zfp198 and Zfp281, but a direct interaction has not yet been shown for these proteins. Where multiple factors bind the WR, it is possible that binding of one factor could increase the affinity of another protein for an adjacent site in the same region, resulting in co-operative binding of two or more factors. A potential example of this could be Nac1, which has been reported to bind the C-terminal WR subunit (Ma et al, 2009). It is possible that both competitive and non-competitive binding to distinct sites on Nanog occurs simultaneously, allowing the assembly of large, functionally active complexes. An additional level of complexity arises from the possibility that the Nanog/Sox2 interaction may occur with either monomeric or dimeric Nanog (Mullin et al, 2008; Wang et al, 2008). Potential mechanisms that affect the Nanog dimerization equilibrium, such as covalent modifications, could thereby play an important part in regulating interactions and subsequent downstream events.
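The dependence of hub occupancy on affinity and availability can be made concrete with a minimal equilibrium sketch. The code below assumes a textbook model that was not used in this study: a single binding site on the hub, mutually exclusive (competitive) binding, and partners in sufficient excess that free and total concentrations are approximately equal. The partner names, concentrations and dissociation constants are hypothetical values chosen only for illustration.

```python
def fractional_occupancy(partners):
    """Fraction of a single hub site occupied by each competing partner.

    `partners` maps a partner name to (free concentration, Kd), both in the
    same units. Assumes simple competitive binding at equilibrium, so that
    occupancy_i = (c_i / Kd_i) / (1 + sum_j c_j / Kd_j).
    """
    weights = {name: conc / kd for name, (conc, kd) in partners.items()}
    denominator = 1.0 + sum(weights.values())
    return {name: weight / denominator for name, weight in weights.items()}


if __name__ == "__main__":
    # Hypothetical partners at equal concentration but tenfold different affinity.
    example = {
        "partner_A": (100.0, 50.0),   # concentration, Kd (arbitrary nM)
        "partner_B": (100.0, 500.0),
    }
    occupancy = fractional_occupancy(example)
    unbound = 1.0 - sum(occupancy.values())
    print(occupancy, unbound)
```

In this model, raising the concentration of one partner lowers the occupancy of every competitor; co-operative binding at adjacent sites, as discussed above, would require additional interaction terms that this sketch deliberately omits.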

ES cell culture
Mouse ESC lines were cultured on gelatin-coated dishes without feeders in GMEM/β-mercaptoethanol/10% FCS/LIF (GMEMβ/FCS/LIF) as described (Smith, 1991). Nanog null RCNβH-B(t) cells have been described (Chambers et al, 2007): briefly, these cells have an IRES-HygromycinR-pA or an IRES-βgeo-pA replacement of Nanog sequences from intron I through to the 3'UTR. Sox2 conditional knock-out cells were obtained by re-targeting ES cells heterozygous for a Sox2 flox allele (Favaro et al, 2009) with a Sox2-β-geo 'knock-in' targeting vector (Zappone et al, 2000; Avilion et al, 2003). This was followed by stable transfection of a pPyCAG-CreERT2 IP construct (Figure 8A). Puromycin-resistant clones were screened for efficient deletion of the Sox2 flox allele following tamoxifen treatment to select the Sox2CKO clone used here.

Protein purification
Preparation of nuclear extracts and purification of Flag-tagged proteins were performed as described (van den Berg et al, 2010). Briefly, nuclear extract was prepared from cells (Dignam et al, 1983) and Flag-tagged protein was purified using 60 µl Flag-agarose beads per 1.5 ml of nuclear extract, during which samples were treated with 150 U/ml DNase Benzonase (4°C, 3 h) to decrease spurious protein purification due to DNA bridging. Nanog and interacting proteins were then eluted using Flag peptide (0.2 mg/ml). For production of proteins in E. coli, MBP-WR/Sox2 and MBP-WR W104A/Sox2 were cloned into pET Duet (Novagen) and expressed in BL21(DE3) cells. Cells expressing MBP-tagged proteins were lysed in 10 mM Tris pH 8.0, 100 mM NaCl, the lysate was passed over amylose resin (NEB, E8021S) and washed, and proteins were eluted with 10 mM maltose. Co-purifying protein was detected by immunoblotting. For SELEX, Nanog was cloned into pMalc2e (NEB) in frame with MBP and expressed in BL21 cells by addition of 1 mM IPTG. Cells were lysed in 10 mM Tris pH 8.0, 200 mM NaCl and the protein was purified on amylose resin. Sox2 was cloned into pET15b (Novagen) and expressed in BL21(DE3) by addition of 1 mM IPTG. Protein was purified by lysing cells in 25 mM Tris pH 8.0, 30 mM imidazole, 500 mM NaCl and passing the lysate over nickel resin (His-select, Sigma, P6611). For co-expression, Nanog and Sox2 were cloned into pET Duet to encode (His)6-Nanog and unmodified Sox2, and expressed in BL21(DE3) cells induced with 1 mM IPTG. Cells were lysed in 25 mM Hepes pH 7.6, 1 M NaCl, 5 mM imidazole and the lysate was incubated in batch mode with nickel resin. Ion-exchange purification of Nanog/Sox2 was performed at pH 7.6 on a 1-ml CM Sepharose FF column (GE Healthcare, 17-5056-01). Bound protein was eluted using a gradient of 0-1 M NaCl over 20 column volumes.
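As a worked illustration of the quantities given above, the following sketch computes the nominal NaCl concentration at a given point in the ion-exchange gradient and the Benzonase units required for a given extract volume. It assumes a linear gradient and ignores system dead volume, neither of which is stated in the protocol; the function names and default arguments are illustrative only.

```python
def nacl_at_volume(elution_volume_ml, column_volume_ml=1.0, gradient_cv=20.0,
                   start_molar=0.0, end_molar=1.0):
    """Nominal NaCl concentration (M) after `elution_volume_ml` of gradient.

    Assumes a linear gradient (an assumption, not stated in the protocol)
    running from `start_molar` to `end_molar` over `gradient_cv` column volumes.
    """
    fraction = elution_volume_ml / (column_volume_ml * gradient_cv)
    fraction = min(max(fraction, 0.0), 1.0)  # clamp outside the gradient
    return start_molar + fraction * (end_molar - start_molar)


def benzonase_units(extract_volume_ml, units_per_ml=150.0):
    """Total Benzonase units for a given nuclear extract volume."""
    return units_per_ml * extract_volume_ml


if __name__ == "__main__":
    # On a 1-ml column, 20 column volumes is 20 ml, i.e. 0.05 M NaCl per ml.
    print(nacl_at_volume(5.0))   # concentration a quarter of the way through
    print(benzonase_units(1.5))  # units for 1.5 ml of nuclear extract
```

Under these assumptions, a protein eluting 5 ml into the gradient corresponds to a nominal 0.25 M NaCl, and 1.5 ml of extract requires 225 U of Benzonase.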