Enhancer priming by H3K4 methylation safeguards germline competence

Germline specification in mammals occurs through an inductive process whereby competent cells in the post-implantation epiblast reactivate a naïve pluripotency expression program and differentiate into primordial germ cells (PGC). The intrinsic factors that endow epiblast cells with the competence to respond to germline inductive signals remain largely unknown. Here we show that early germline genes that are active in the naïve pluripotent state become homogeneously dismantled in germline competent epiblast cells. In contrast, the enhancers controlling the expression of major PGC genes transiently and heterogeneously acquire a primed state characterized by intermediate DNA methylation, chromatin accessibility, and H3K4me1. This primed enhancer state is lost, together with germline competence, as epiblast cells develop further. Importantly, we demonstrate that priming by H3K4me1/2 enables the robust activation of PGC enhancers and is required for germline competence and specification. Our work suggests that H3K4me1/2 is directly involved in enhancer priming and the acquisition of competence.



INTRODUCTION:
Competence can be defined as the ability of a cell to differentiate towards a specific cell fate in response to intrinsic and extrinsic signals 1 . While the extracellular signals involved in the induction of multiple cell fates have been described 2 , the intrinsic factors that determine the cellular competence to respond to those signals remain elusive.
One major example illustrating the dynamic and transient nature of competence occurs early during mammalian embryogenesis, as the primordial germ cells (PGC), the precursors of the gametes, become specified. In mice, following implantation and the exit from naïve pluripotency (E4.5-E5.5), PGC are induced from the formative epiblast around E6.0-E6.25 at the proximo-posterior end of the embryo 3.
The induction of PGC occurs in response to signals emanating from the extraembryonic tissues surrounding the epiblast: BMP4 from the extraembryonic ectoderm and WNT3 from the visceral endoderm. Furthermore, regardless of their position within the embryo, formative epiblast cells (~E6.0-6.25) are germline competent when exposed to appropriate signals, but this ability gets lost as the epiblast progresses towards a primed pluripotency state (>E6.5) 4 . Nevertheless, only a fraction (typically <20%) of the formative epiblast cells can give rise to PGCs when exposed to the appropriate signals 5 , suggesting that the formative epiblast is heterogeneous in terms of its intrinsic germline competence. Importantly, the intrinsic factors that confer germline competence on the formative but not the primed epiblast cells remain obscure 6,7 . This reflects the difficulties with investigating the molecular basis of germline competence and PGC specification in general, due to the limited cell numbers that can be obtained in vivo from mouse peri-implantation embryos (E4.5-E6.5). These limitations were overcome by establishing a robust in vitro differentiation system whereby mouse embryonic stem cells (ESC) grown under 2i conditions (naïve pluripotency) can be sequentially differentiated into epiblast-like cells (EpiLC; formative pluripotency) and primordial germ cell-like cells (PGCLC) 5 .
This system revealed transcription factors (TFs) [8][9][10] and epigenomic reprogramming events involved in PGC specification [11][12][13] and led to a better understanding of the mouse peri-implantation transitions in general 14 . Hence, several TFs, including FOXD3 and OTX2, were found to promote the transition from naïve to formative pluripotency by coordinating the silencing of naïve genes and the activation of early post-implantation epiblast markers 7,15,16 . Subsequently, FOXD3 and OTX2 restrict the differentiation of formative epiblast cells into PGCs and, thus, the silencing of these TFs is required for germline specification. Previous work from our lab showed that the regulatory function of FOXD3 during these developmental transitions involves binding to and silencing of enhancers shared between naïve pluripotent cells and PGCLC 7 . Interestingly, during the transition from naïve to formative pluripotency, FOXD3-bound enhancers lose H3K27ac but retain H3K4me1, suggesting that they do not become fully decommissioned but transiently primed instead. This led us to propose that the priming of PGCLC enhancers in the formative epiblast might confer germline competence 6,7 .
Pre-marking of enhancers with H3K4me1 (i.e. enhancer priming) can precede their activation and has been implicated in endodermal differentiation, hematopoiesis and brown adipogenesis [17][18][19][20]. H3K4me1 at enhancers is catalyzed by the SET domains of the histone methyltransferases MLL3 (KMT2C) and MLL4 (KMT2D), which are part of the COMPASS complex 21,22. In agreement with the relevance of enhancer priming by H3K4me1, the knockout (KO) of Mll3/4 or Mll4 alone impairs enhancer activation and results in differentiation defects in various lineages [23][24][25][26][27][28][29][30]. Despite their manifold and relevant findings, one limitation of studies based on KO models is that they cannot discriminate between the catalytic and non-catalytic functions 20,31 of MLL3/4 and, thus, cannot directly assess the relevance of H3K4me1 for enhancer function. This question was recently addressed by generating ESC lines with amino acid substitutions in the SET domains of MLL3 and MLL4 that enabled the decoupling of their methyltransferase activity from their non-catalytic functions 32. In agreement with previous reports [23][24][25]33, in Mll3/4 double KO ESC the loss of H3K4me1 at active enhancers was accompanied by a strong reduction in H3K27ac levels, RNA polymerase II binding and enhancer RNA (eRNA) production. In contrast, the loss of H3K4me1 in Mll3/4 catalytic mutant ESC (Mll3/4 dCD) only partially reduced H3K27ac at enhancers and had no effect on RNA polymerase II binding or eRNA levels. Furthermore, the Mll3/4 dCD ESC displayed minor gene expression changes compared to the Mll3/4 double KO ESC, suggesting that the function of MLL3/4 as long-range co-activators is largely independent of H3K4me1 32. In addition, work in Drosophila melanogaster showed that while the KO of Trr, the homolog of Mll3/4 in flies, was embryonic lethal, an amino acid substitution in the SET domain of Trr that globally reduced H3K4me1 did not impair development or viability 34.
Altogether, these reports imply that H3K4me1 might be dispensable for the maintenance of enhancer activity as well as for enhancer priming. Nevertheless, subsequent work with Mll3/4 dCD ESC showed that the recruitment of chromatin remodelers 35 and the establishment of long-range chromatin interactions 25 require H3K4me1. Moreover, MLL3/4 catalyze the deposition of not only H3K4me1 but also H3K4me2 at enhancers 34, and both of these histone marks antagonize de novo CpG methylation 36.
As the function of H3K4me1 has been specifically interrogated in naïve ESC and flies, which are both largely devoid of CpG methylation, it is possible that, under strong de novo DNA methylation conditions ( e.g. exit from naïve pluripotency), H3K4me1 might protect enhancers from CpG methylation and render them competent for future activation 37 .
Here we performed an extensive comparison of the transcriptional and epigenetic features of formative and primed epiblast cells to gain insights into the molecular basis of germline competence. Importantly, these comparisons revealed that PGCLC enhancers, which tend to be already active in the naïve state, retain H3K4me1 and remain CpG hypomethylated in formative epiblast cells in comparison to primed pluripotency ones. Most importantly, we show that, in the absence of H3K4me1/2, PGCLC enhancers do not get properly re-activated in PGCLC and germline specification is impaired. Overall, our work demonstrates that enhancer priming by H3K4me1/2 is a major determinant of germline competence and highlights the importance of the epigenetic state of enhancers for the robust deployment of developmental gene expression programs.

Characterization of the PGCLC in vitro differentiation system by single-cell RNA-seq
To overcome the scarcity and transient nature of PGCs in vivo, we used an in vitro system that faithfully recapitulates PGC specification 5  can not be efficiently differentiated into PGCLC and, thus, display limited germline competence (Fig. 1a).
Previous work indicated that the acquisition of germline competence by the formative epiblast ( i.e. E5.5-6.5 epiblast; EpiLC) requires the complete dismantling of the naïve gene expression program, which is subsequently re-activated during PGC induction 38 . However, another possibility that has not been thoroughly investigated is that germline competence is associated with a few cells of the formative epiblast in which the naïve expression program is totally or partially retained. To assess these two alternative scenarios, we performed single-cell RNA sequencing (scRNA-seq) across multiple stages of PGCLC differentiation (the scRNA-seq data can be explored using Supplementary Data 1). t-SNE analysis of the resulting single cell transcriptomes (Supplementary Data 2) showed that cells tended to cluster within their corresponding differentiation stage and were characterized by the expression of stage specific markers (Fig. 1b,d). However, d2 and d4 EB showed cellular heterogeneity and formed distinct subclusters. One of these subclusters was identified as PGCLC based on the high expression of major PGC markers (e.g. Prdm14 , Prdm1, Tfap2c, Dppa3 ) (Fig. 1d, Supplementary Fig. 1a,b). Furthermore, the additional subpopulations within d4 EB were annotated based on the expression of cell identity markers identified by single-cell transcriptional profiling of E8.25 mouse embryos 39 . Remarkably, these subclusters were similar to the extraembryonic tissues ( i.e. extraembryonic ectoderm, extraembryonic mesoderm and endothelium) that surround PGCs in the proximo-posterior end of the mouse embryo following germline specification in vivo (Fig. 1c, Supplementary Fig. 1a). Overall, these observations demonstrate the quality of our scRNA-seq data and further support the robustness of the PGCLC in vitro differentiation system to investigate germline competence and specification.

Germline competence involves a robust dismantling of the naïve gene expression programme
To better characterize the in vitro-derived PGCLC, we performed differential expression analysis by comparing the expression of the PGCLC clusters with the remaining cells of the EB and d2 EpiLC (Supplementary Table 1). This led to the identification of a set of genes containing previously described PGC markers (e.g. Nanog, Tfap2c, Prdm14, Prdm1, Dppa3) (Fig. 1d). In agreement with previous reports 5, we found that many of these genes were highly expressed in ESC, rapidly and strongly silenced in individual EpiLC and EpiSC and finally reactivated in PGCLC (Fig. 1e, Supplementary Fig. 1c). The similarity in gene expression programs between ESC and PGCLC is further supported by the t-SNE plot (Fig. 1b), in which the clusters corresponding to these two cell types are close to each other.
Fig. 1 (legend, panels e-f): e.) Expression dynamics of the PGCLC genes during PGCLC formation. The PGCLC genes were identified by differential expression analysis of the d2 and d4 PGCLC cluster versus the remaining cells of the EB and d2 EpiLC using a likelihood ratio test. Each dot represents the mean expression of all PGCLC genes in a single cell of the indicated stage. d2 and d4 EB refers to all the cells analyzed within the corresponding EB except those considered as PGCLC. f.) Violin plots showing transcriptional noise, defined as cell-to-cell transcript variation for the 500 most variable genes, for ESC, EpiLC and EpiSC. Lower transcriptional noise indicates higher transcriptional similarity between the cells belonging to a particular stage. All stages were compared to d2 EpiLC using Wilcoxon tests (*: p-value < 2.2 × 10^-16).
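The transcriptional noise shown in Fig. 1f is described as cell-to-cell transcript variation over the 500 most variable genes. The following is a minimal sketch of one way such a measure could be computed. The distance used here (1 minus the Pearson correlation between pairs of cells) and the function names are assumptions made for illustration, since the text does not state the exact metric.

```python
from itertools import combinations
from statistics import mean, pvariance


def top_variable_genes(expr, n_top=500):
    """Return the n_top genes with the highest variance across cells.

    expr: dict mapping gene name -> list of expression values (one per cell).
    """
    ranked = sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)
    return ranked[:n_top]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (must not be constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def transcriptional_noise(expr, n_top=500):
    """Pairwise cell-to-cell distances over the most variable genes.

    Assumption: distance = 1 - Pearson correlation. Lower values mean that
    the cells of a stage are transcriptionally more similar to each other.
    """
    genes = top_variable_genes(expr, n_top)
    n_cells = len(next(iter(expr.values())))
    profiles = [[expr[g][c] for g in genes] for c in range(n_cells)]
    return [
        1.0 - pearson(profiles[i], profiles[j])
        for i, j in combinations(range(n_cells), 2)
    ]


if __name__ == "__main__":
    # Toy data: three genes measured in three cells; cells 0 and 1 are identical.
    toy = {"g1": [1, 1, 5], "g2": [2, 2, 1], "g3": [3, 3, 0]}
    print(transcriptional_noise(toy, n_top=3))
```

The resulting per-stage distributions of pairwise distances are what a violin plot would display, and they could then be compared between stages with a rank-based test.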
Evaluation of the PGCLC gene set expression dynamics in EpiLC indicated that the silencing of these genes occurs rapidly as the formative pluripotency state is established (Fig. 1d,e), which we also confirmed at the protein level using publicly available proteomic data 40 (Supplementary Fig. 1d). Furthermore, in germline competent d2 EpiLC, all cells clustered together and neither a distinct subpopulation indicative of a retained naïve pluripotency expression program nor signs of precocious germline induction could be identified with different computational methods 41,42 (Supplementary Fig. 1e). Congruently, the cell-to-cell variability in gene expression levels, defined as transcriptional noise, was significantly lower in d2 EpiLC than in the preceding or subsequent cellular states (Fig. 1f). This is in full agreement with the E5.5 epiblast bearing the lowest transcriptional noise during mouse peri-implantation development 43. Notably, using published scRNA-seq data 44, we confirmed that the PGCLC genes also become rapidly and homogeneously silenced in vivo following implantation (>E4.5, Supplementary Fig. 1f). Lastly, we analyzed bulk RNA-seq data generated in Otx2-/- 16 and Prdm14-/- 11 EpiLC, which despite displaying increased and decreased germline competence, respectively, showed normal gene expression profiles in EpiLC (Fig. 1g).  Fig. 1d,e), which, nevertheless, display limited germline competence. Therefore, the extinction of the naïve program seems to be necessary but not sufficient for the acquisition of germline competence, suggesting that differences, other than transcriptional, should exist between competent (EpiLC, E5.5 epiblast) and non-competent (EpiSC, >E6.5 epiblast) epiblast cells. Based on previous reports 7,45,46, we hypothesized that enhancers involved in PGC specification might display epigenetic differences between EpiLC and EpiSC that could explain the distinct germline competence of these two epiblast stem cell populations.
To test this hypothesis, we identified active PGCLC enhancers by using H3K27ac ChIP-seq data generated in d2 and d6-sorted PGCLC 12 Supplementary Fig. 2a). Furthermore, and consistent with previous observations 7 and the expression dynamics of the PGCLC genes, the majority of PGCLC enhancers were initially active in ESC (i.e. high H3K27ac levels), lost H3K27ac in EpiLC and became progressively reactivated in d2 and d6 PGCLC (Fig. 2b). The loss of H3K27ac was specific for PGCLC enhancers and did not affect EpiLC enhancers (Supplementary Fig. 2b).
To explore whether PGCLC enhancers display any differences in their epigenetic status between germline competent (i.e. EpiLC) and non-competent (i.e. EpiSC) epiblast cells, we performed ChIP-seq experiments for several histone modifications in ESC, EpiLC and EpiSC (Fig. 2c). In agreement with their high H3K27ac levels in ESC, PGCLC enhancers were also enriched in other active histone marks (i.e. H3K4me1/2/3) in ESC. All these active chromatin features decreased upon exit from naïve pluripotency, but H3K4me1 was partly retained in EpiLC in comparison to EpiSC (Fig. 2d, Supplementary Fig. 2c). In contrast, histone modifications associated with polycomb repressive complexes (i.e. H3K27me2/3) increased upon exit from the naïve pluripotent stage. However, such an increase only occurred at a small subset of the PGCLC enhancers and no major differences were observed between EpiLC and EpiSC (Supplementary Fig. 2d). Lastly, we also investigated H3K9me2/3, two repressive histone modifications mostly found within heterochromatin and capable of restricting cellular competence 47,48. H3K9me2/3 progressively increased as cells exited naïve pluripotency, reaching significantly higher levels in EpiSC than in EpiLC (Supplementary Fig. 2e).
Fig. 2 (legend, panels c-g): c.) Summary of the ChIP-seq signals for the PGCLC enhancers in ESC, EpiLC and EpiSC. Mean ChIP-seq signals were calculated for all PGCLC enhancers using a -/+ 1 kb window (RPGC: reads per genomic content). d-f.) H3K4me1 (d), ATAC-seq (e) and CpG methylation (f) signals within PGCLC enhancers are shown in ESC, EpiLC and EpiSC as median profile (top) and heatmap plots (bottom). In the heatmap plots the PGCLC enhancers were ordered as in b. For each epigenomic dataset, the mean signals within -/+ 1 kb of the PGCLC enhancers were determined and those obtained in EpiLC were compared to the ones measured in ESC or EpiSC using paired Wilcoxon tests (n = 511, *: p < 2.2 × 10^-16). CpG methylation data was obtained from Zylicz et al. 2018 89,90. g.) The magnitude of the differences in all epigenetic features between EpiLC and EpiSC at PGCLC enhancers, determined as effect size from the paired Wilcoxon tests in c-f. The error bar represents the confidence interval.
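The Fig. 2 legend describes mean signals computed within -/+ 1 kb of each PGCLC enhancer and then compared between cell types in a paired fashion. The sketch below illustrates only the windowing step on a toy per-base coverage track. The data layout (a plain list of per-base values) and the function names are hypothetical and chosen for illustration, not taken from the authors' pipeline.

```python
def mean_signal(coverage, center, flank=1000):
    """Mean signal in a window of -/+ flank bases around an enhancer center.

    coverage: list of per-base signal values along one chromosome (0-based).
    center:   0-based position of the enhancer midpoint.
    The window is clipped at the chromosome ends.
    """
    start = max(0, center - flank)
    end = min(len(coverage), center + flank + 1)
    window = coverage[start:end]
    return sum(window) / len(window) if window else 0.0


def paired_signals(cov_a, cov_b, centers, flank=1000):
    """Per-enhancer mean signal in two conditions, kept as matched pairs."""
    return [
        (mean_signal(cov_a, c, flank), mean_signal(cov_b, c, flank))
        for c in centers
    ]


if __name__ == "__main__":
    # Toy tracks: condition A retains signal around position 12, B does not.
    track_a = [0] * 10 + [2] * 5 + [0] * 10
    track_b = [0] * 25
    print(paired_signals(track_a, track_b, centers=[12], flank=2))
```

Keeping the two conditions as matched pairs per enhancer is what makes a paired test, such as the paired Wilcoxon test named in the legend, applicable to the resulting values.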
Since all the previous ChIP-seq experiments were done in E14 ESC, we used another mouse ESC strain (R1) and confirmed that EpiLC displayed higher H3K4me1 levels at PGCLC enhancers in both genetic backgrounds (Supplementary Fig. 2f). As chromatin accessibility has been linked to enhancer priming 44,49, we performed ATAC-seq experiments in ESC, EpiLC and EpiSC and observed that PGCLC enhancers remained more accessible in EpiLC than in EpiSC (Fig. 2e).
Furthermore, chromatin accessibility within PGCLC enhancers was orthogonally assessed by analyzing ChIP-seq data for POU5F1 (OCT4) 50,51. Although Pou5f1 is similarly expressed in the three cell types (Fig. 1d), its binding within PGCLC enhancers was considerably higher in ESC and EpiLC than in EpiSC (Supplementary Fig. 2g). Given that CpG methylation is a repressive epigenetic modification that frequently co-occurs with H3K9me2/3 and is antagonized by H3K4 methylation 36,52, we also analyzed public whole-genome bisulfite sequencing data generated across pluripotent states 45  Furthermore, such differences were not observed around the transcription start sites (TSS) of the PGCLC genes, which also lost H3K27ac, but retained high levels of
In conclusion, PGCLC enhancers are active in ESC and, as they become decommissioned (loss of H3K27ac), they transiently acquire a primed state in EpiLC (intermediate H3K4me1 and CpG methylation levels) before becoming fully silenced in EpiSC (CpG hypermethylation, gain of H3K9me2/3 and loss of H3K4me1). Based on these observations, we hypothesize that the priming of PGCLC enhancers in EpiLC might enable their subsequent re-activation in PGCLC and, thus, contribute to germline competence.

The formative epiblast displays epigenetic heterogeneity within PGCLC enhancers
To evaluate whether the priming of PGCLC enhancers also occurs in germline competent cells in vivo, we analyzed genome-wide CpG methylation data from mouse epiblasts 53  Previous single cell analyses of CpG methylation revealed that, during mouse peri-implantation stages, the formative epiblast is particularly heterogeneous, especially within enhancers with low CpG content (i.e. 2.5% CpG) 54. Interestingly, PGCLC enhancers displayed a low CpG content (2.4%, Supplementary Fig. 3a), which made us hypothesize that the intermediate mCpG levels observed for these enhancers in the formative epiblast could be the result of cell-to-cell variation. To evaluate this idea, we analyzed a comprehensive single cell DNA methylation data set from different epiblast stages (i.e. E4.5, E5.5 and E6.5) 44 and measured the CpG methylation heterogeneity by comparing the methylation status of individual CpGs within PGCLC enhancers 55 (Fig. 3b). This analysis revealed that the formative stage (E5.5) displays the highest variation in CpG methylation, with ~30% of the compared CpG sites being differentially methylated between E5.5 epiblast cells (Fig. 3c,d). Moreover, the CpG methylation heterogeneity of the E5.5 epiblast was more pronounced for PGCLC enhancers than for other enhancers or the whole genome (Fig. 3e, Supplementary Fig. 3b,c). As the CpG coverage was comparable across epiblast stages and the mCpG levels at PGCLC enhancers correlated with the genome-wide levels, the differences in mCpG heterogeneity are unlikely to be caused by technical reasons (Supplementary Fig. 3d).
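The heterogeneity measure described above compares the methylation status of individual CpGs between cells. A minimal sketch of such a pairwise dissimilarity is given below, assuming binary methylation calls per CpG and averaging the per-pair fractions over all cell pairs that share at least one covered site. Both the averaging scheme and the function names are assumptions for illustration, as the exact procedure is not spelled out in the text.

```python
from itertools import combinations


def pairwise_dissimilarity(cell_a, cell_b):
    """Fraction of CpG sites covered in both cells with discordant calls.

    Each cell is a dict mapping CpG position -> 0 (unmethylated) or
    1 (methylated). Returns None when the two cells share no covered site.
    """
    shared = cell_a.keys() & cell_b.keys()
    if not shared:
        return None
    discordant = sum(cell_a[pos] != cell_b[pos] for pos in shared)
    return discordant / len(shared)


def mcpg_heterogeneity(cells):
    """Mean pairwise dissimilarity over all comparable pairs of cells."""
    values = []
    for a, b in combinations(cells, 2):
        d = pairwise_dissimilarity(a, b)
        if d is not None:
            values.append(d)
    return sum(values) / len(values) if values else None


if __name__ == "__main__":
    # Toy example: three cells with sparse, partly overlapping CpG coverage.
    cells = [
        {1: 1, 2: 0, 3: 1},
        {1: 1, 2: 1, 3: 1},
        {2: 0, 4: 1},
    ]
    print(mcpg_heterogeneity(cells))
```

Restricting each comparison to sites covered in both cells matters for sparse single-cell data, where most CpGs are observed in only a subset of cells.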
Next, we reasoned that the CpG methylation heterogeneity observed in the E5.5 epiblast for the PGCLC enhancers could reflect differences in developmental timing and, thus, in germline competence. To explore this possibility, individual cells of each epiblast stage were classified in two groups (high and low mCpG) based on their mean mCpG levels within PGCLC enhancers. Then, cells within the high and low mCpG groups were compared regarding their chromatin accessibility and gene expression levels (Fig. 3f-h). The E4.5 epiblast cells with higher mCpG levels showed decreased chromatin accessibility within PGCLC enhancers and lower expression of PGCLC genes, which could indicate an incipient exit from naïve pluripotency and, hence, a more advanced developmental age. In contrast, although the E5.5 epiblast cells with higher mCpG levels also displayed lower chromatin accessibility within PGCLC enhancers, they did not show major differences in the expression of the PGCLC genes compared to E5.5 cells with low mCpG levels (Fig. 3f-h). Therefore, in the formative epiblast, the epigenetic status of the PGCLC
Fig. 3 (legend, panels a-b): a.) The in vivo CpG methylation levels of the PGCLC enhancers were analyzed in E4.5, E5.5 and E6.5 epiblasts using data from Zhang et al. 2018 53. The mCpG levels of each PGCLC enhancer (using a -/+ 1 kb window) in E5.5 were compared to those measured in E4.5 or E6.5 using a paired Wilcoxon test (n = 511, *: p < 2.2 × 10^-16). b.) Illustration of the mCpG heterogeneity estimation, which is based on the mCpG dissimilarity concept 91
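The grouping of cells into high and low mCpG classes can be sketched as follows. The text only states that cells were classified by their mean mCpG levels within PGCLC enhancers, so the median cutoff used here is an assumption, as are the function and variable names.

```python
from statistics import mean, median


def split_by_mcpg(cells):
    """Split cells into low and high mCpG groups.

    cells: dict mapping cell id -> dict of CpG position -> 0/1 call,
           restricted to CpGs inside the enhancers of interest.
    Assumption: the cutoff is the median of the per-cell mean mCpG levels;
    cells at or below the cutoff are assigned to the low group.
    Returns (low_ids, high_ids), each sorted by cell id.
    """
    levels = {
        cell_id: mean(calls.values())
        for cell_id, calls in cells.items()
        if calls  # skip cells without any covered CpG
    }
    cutoff = median(levels.values())
    low = sorted(cid for cid, level in levels.items() if level <= cutoff)
    high = sorted(cid for cid, level in levels.items() if level > cutoff)
    return low, high


if __name__ == "__main__":
    toy = {
        "c1": {1: 0, 2: 0},
        "c2": {1: 1, 2: 0},
        "c3": {1: 1, 2: 1},
        "c4": {1: 1, 2: 1, 3: 0},
    }
    print(split_by_mcpg(toy))
```

Once the two groups are defined, accessibility and expression values measured in the same cells can be compared between them, as done for Fig. 3f-h.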

Assessment of PGCLC enhancer function during PGCLC specification
Before investigating PGCLC enhancer priming in more depth, we evaluated the functional relevance of some representative enhancers during PGCLC specification.
We selected three PGCLC enhancers associated with Esrrb, Klf5, and Lrrc31 that showed the preponderant epigenetic dynamics described in Fig. 2 (Fig. 4a, Supplementary Fig. 4a). We used CRISPR/Cas9 to delete them individually in ESC containing a DPPA3-GFP reporter that facilitates subsequent PGCLC quantification (Supplementary Fig. 4b). Two clonal lines for each individual enhancer deletion were generated and analysed for their PGCLC differentiation capacity. The deletion of the enhancers associated with Lrrc31 and Klf5 significantly diminished PGCLC specification (Fig. 4b, Supplementary Fig. 4c). Furthermore, both enhancer deletions strongly reduced the expression of Klf5 and Lrrc31 not only in d4 EB but already in ESC (Supplementary Fig. 4d,e)
Fig. 4 (legend, panels b-f): b.) WT ESC and ESC lines with the indicated PGCLC enhancer deletions were differentiated into PGCLC. The number of PGCLC present within d4 EB was quantified using a DPPA3-GFP reporter. Each PGCLC quantification was performed in biological duplicates and two different clonal lines were used for each enhancer deletion (n=2x2). The percentages of PGCLC obtained when differentiating the ESC with the enhancer deletions were compared to those obtained when differentiating WT ESC using Wilcoxon tests. c.) Esrrb expression levels were measured by RT-qPCR in d4 EB differentiated from WT ESC and the two different ESC clonal lines with the Esrrb enhancer deletion shown in a. The expression values were normalized to two housekeeping genes (Eef1a1 and Hprt). Error bars represent standard deviations from 6 measurements (two clonal lines x three technical replicates). d.) Genome-browser view showing the PGCLC enhancers (E1-E3) found within the Prdm14 locus and their H3K27ac, H3K4me1 and CpG methylation dynamics during PGCLC formation. e.) Prdm14 expression levels were measured by RT-qPCR in d4 EB differentiated from WT ESC and ESC with the indicated Prdm14 enhancer deletions (two clonal lines for each). The expression values were normalized to two housekeeping genes (Eef1a1 and Hprt). Error bars represent standard deviations from 6 measurements (two clonal lines x three technical replicates). f.) WT ESC and ESC lines with the indicated Prdm14 enhancer deletions were differentiated into PGCLC. PGCLC were measured as CD15+ CD61+ cells within d4 EB. Each PGCLC quantification was performed in biological duplicates and two different clonal lines were used for each enhancer deletion (n=2x2). The percentages of PGCLC obtained when differentiating the ESC with the enhancer deletions were compared to those obtained with WT ESC using Wilcoxon tests.
early PGCLC specification (i.e. d2 EB). Together with the epigenetic profiles observed within the Prdm14 locus (Fig. 4d), these results suggest that, rather than components of an ESC super-enhancer, the E1-E3 elements differentially contribute to Prdm14 expression in either ESC (i.e. E2) or PGCLC (i.e. E1). Furthermore, in agreement with the role of Prdm14 as a PGC master regulator 57, we found that the individual E1-E3 deletions significantly impaired PGCLC differentiation (Fig. 4f, Supplementary Fig. 4f). Since the E2 and E3 deletions already reduced the expression of Prdm14 in ESC, their effects on PGCLC specification could arise secondarily due to compromised naïve pluripotency 58,59. In contrast, the E1 enhancer directly contributes to PGCLC specification, as its deletion affected Prdm14 expression during PGCLC differentiation but not in ESC.
In summary, these results indicate that PGCLC enhancers frequently control the expression of their target genes already in ESC, further supporting that a significant set of enhancers is functionally shared between naïve pluripotency and PGCLC 7 .
Most importantly, the enhancer deletions that reduced the expression of their target genes preferentially (i.e. Esrrb enhancer) or solely ( i.e. E1 Prdm14 enhancer) in PGCLC support the functional relevance of the identified PGCLC enhancers during germline specification.

The priming of PGCLC enhancers is associated with permissive chromatin and topological features
The priming of enhancers has been suggested to facilitate their future activation and the subsequent induction of their target genes 60 Fig. 5a,b). In line with previous reports, the overexpression of either PRDM14 or NANOG upon differentiation of EpiLC into PGCLC yielded a high percentage of PGCLC (Fig. 5a, Supplementary Fig. 5c). In contrast, the overexpression of PRDM14 or NANOG upon differentiation from EpiSC resulted in considerably fewer PGCLC. To assess whether PGCLC enhancers were differentially accessible to these TFs in EpiLC and EpiSC, we performed ChIP-seq experiments after a short induction of HA-tagged PRDM14 or NANOG in both EpiLC and EpiSC.
Importantly, PGCLC enhancers were more strongly bound by PRDM14-HA and NANOG-HA in EpiLC than in EpiSC (Fig. 5b). Furthermore, the binding of NANOG-HA to the PGCLC enhancers in EpiLC resulted in increased H3K27ac and H3K4me2 levels, indicating that NANOG overexpression can specifically ( i.e. H3K27ac or H3K4me2 did not increase at EpiLC enhancers) and more effectively activate PGCLC enhancers in EpiLC than in EpiSC (Fig. 5c,d     EpiSC using the Prdm14 TSS as a viewpoint (Fig. 5e). In agreement with their initial active state, in ESC the three Prdm14 enhancers (i.e. E1-E3) were in close spatial proximity to the Prdm14 TSS. Remarkably, the contacts between Prdm14 and the E1-E3 enhancers were partly preserved in EpiLC but strongly diminished in EpiSC (Fig. 5e).
Overall, these results suggest that the priming of PGCLC enhancers in EpiLC might contribute to germline competence by conferring these enhancers with permissive chromatin and topological features that facilitate their activation and regulatory function.

H3K4me1/2 is necessary for germline competence
Next, we wanted to specifically assess the relevance of H3K4me1 for PGCLC enhancer priming and germline competence. To do so, we used a previously generated mESC line (dCD ESC) that is catalytically deficient for the H3K4 methyltransferases MLL3 (KMT2C) and MLL4 (KMT2D) 32 . In these dCD ESC, H3K4me1 is completely lost from enhancers, while the binding of MLL3/4 and their associated complexes is largely maintained 32 (Fig. 6a). Gene expression, H3K27ac and eRNA levels within active enhancers were only mildly affected in dCD ESC, indicating that H3K4me1 is dispensable for the maintenance of enhancer activity 32,34 .
However, the importance of H3K4me1 for the priming and subsequent activation of enhancers upon ESC differentiation has not been investigated yet. Using previously generated ChIP-seq data 32, we found that PGCLC enhancers were strongly and similarly bound by MLL3/4 in WT and dCD ESC (Fig. 6b). Moreover, H3K4me1 levels within PGCLC enhancers were strongly reduced in dCD ESC as well as upon their differentiation into EpiLC and EpiSC (Fig. 6c, Supplementary Fig. 6a). The loss of H3K4me1 and the milder reduction of H3K27ac within PGCLC enhancers were also observed in an independent MLL3/4 catalytic mutant ESC line (dCT; Fig. 6a, Supplementary Fig. 6a). Additionally, the dCD cells displayed a strong reduction in H3K4me2 within PGCLC enhancers (Supplementary Fig. 6b). In agreement with the histone modification changes observed for active enhancers in MLL3/4 catalytic mutant ESC lines 34, the dCD EpiLC also displayed strong H3K4me1/2 losses and milder H3K27ac reductions within other enhancer categories (e.g. EpiLC enhancers, Supplementary Fig. 6c). Lastly, in agreement with the protective role of H3K4me1/2 against DNA methylation 36,62, mCpG levels within PGCLC enhancers were elevated in dCD ESC and EpiLC (Fig. 6d).
Fig. 6 (legend, panel j): j.) Gene expression changes for the PGCLC genes in dCD vs. R1 WT d4 EB. PGCLC genes were separated into two groups depending on whether they were associated with a nearby PGCLC enhancer (n=113 genes) or not (n=38 genes). As a control group, 100 random genes were sampled. The log2 fold changes for the PGCLC genes associated with enhancers were compared to the fold changes measured for PGCLC genes without enhancers or the randomly sampled genes, respectively, using one-tailed Wilcoxon tests. Log2 fold changes were estimated with DESeq2.
Supplementary Fig. 6d). In agreement with MLL3 and MLL4 being functionally redundant 70, such PGCLC differentiation defects were not observed when using cells that were catalytically deficient for MLL4 but not for MLL3 (i.e. 4CT) (Fig. 6e).
Furthermore, the compromised PGCLC differentiation of the dCD cells was still observed at a later time point (day 6, Supplementary Fig. 6e), indicating that the observed defects are not simply explained by a delay in PGCLC specification.
As H3K4me1 priming has been suggested to facilitate enhancer activation 19,60,71, we addressed this question in the context of PGCLC differentiation by performing H3K27ac ChIP-seq in EpiLC, EpiSC and d4 EB derived from WT and dCD ESC (Fig. 6f,g and Supplementary Fig. 6f). In EpiLC and EpiSC, H3K27ac levels within PGCLC enhancers were similarly low in both WT and dCD cells. In contrast, most of the PGCLC enhancers, including those associated with major PGCLC regulators, gained H3K27ac in WT but not in dCD d4 EB, indicating that, upon PGCLC differentiation, PGCLC enhancers did not get properly activated in the absence of H3K4me1 (Fig. 6f,g and Supplementary Fig. 6f). Next, to investigate the transcriptional consequences of this defective PGCLC enhancer re-activation, we performed RNA-seq experiments in WT and dCD d4 EB (Supplementary Table 3). The expression of PGCLC genes in dCD cells was particularly reduced among genes associated with at least one PGCLC enhancer, which included relevant PGCLC markers and regulators (e.g. Prdm1, Prdm14, Tfap2c, Dppa3) (Fig. 6h-j).
Nonetheless, gene expression changes in dCD cells were in general moderate and some PGCLC genes and enhancers were not affected (Fig. 6h, Supplementary Fig. 6f), which could perhaps be attributed to the compensatory activity of other histone methyltransferases (e.g. MLL1/KMT2A 72 , MLL2/KMT2B 73,74 Fig. 6g,h and Supplementary Table 3). These minor, but measurable, gene expression changes in dCD ESC and EpiLC could also contribute to the observed PGCLC differentiation defects. In any case, our results show that H3K4me1/2 is required for proper PGCLC specification and support the importance of PGCLC enhancer priming for germline competence.

H3K4me1/2 is required for the increased competence of OTX2 deficient epiblast cells
Recently, it has been shown that the deletion of Otx2 markedly increases and prolongs germline competence in epiblast cells 16. Therefore, we wondered whether the priming of PGCLC enhancers could be involved in the extended germline competence of Otx2 -/- cells. Firstly, we found that OTX2 binds to PGCLC enhancers in both ESC and EpiLC (Supplementary Fig. 7a), suggesting that this TF might be directly involved in their decommissioning upon exit from naïve pluripotency 16. Next, we confirmed the increased and prolonged germline competence of Otx2 -/- cells, which can be robustly differentiated into PGCLC after keeping them for up to eight days in EpiSC culture conditions (Fig. 7a, Supplementary Fig. 7b). H3K4me1 and H3K4me2 ChIP-seq experiments revealed that the increased germline competence of Otx2 -/- cells was coupled with the retention of H3K4me1/2 at PGCLC enhancers but did not affect EpiLC enhancers (Fig. 7b, Supplementary Fig. 7c,d). Moreover, bisulfite sequencing of the representative Esrrb enhancer showed that the increased competence of Otx2 -/- cells was also reflected in reduced mCpG levels within PGCLC enhancers (Fig. 7c). Although some PGCLC enhancers displayed elevated H3K27ac levels in Otx2 -/- EpiLC (Supplementary Fig. 7e), this did not result in major gene expression changes (Supplementary Fig. 1g), suggesting that the increased germline competence of these cells could be preferentially linked to PGCLC enhancer priming rather than activation.
To directly assess whether the extended germline competence of Otx2 -/- cells requires the priming of PGCLC enhancers by H3K4me1/2, we deleted  Fig. 7f) and differentiated them into PGCLC. Notably, both dCD and dCD Otx2 -/- EpiLC showed a strong and similar reduction in their PGCLC differentiation capacity (Fig. 7d). Furthermore, genome-wide CpG methylation analysis revealed that PGCLC enhancers were considerably more methylated in dCD Otx2 -/- EpiLC than in Otx2 -/- EpiLC (Fig. 7e). Nevertheless, despite their reduced germline competence, dCD Otx2 -/- EpiLC displayed mCpG levels within PGCLC enhancers comparable to those measured in WT cells (Fig. 7e, Supplementary Fig. 7g), suggesting that, in addition to protecting from CpG methylation, H3K4me1/2 might contribute to germline competence through additional regulatory mechanisms (e.g. permissive 3D chromatin architecture, Fig. 5e). Most importantly, these results further support the importance of H3K4me1/2 and enhancer priming for germline competence (Fig. 7f).
Figure 7: a.) E14 WT and Otx2 -/- ESC were differentiated into d2 EpiLC, d4 EpiSC or d8 EpiSC, which were then further differentiated into PGCLC. PGCLC were quantified as the proportion of CD15+ CD61+ cells found within d4 EB. The error bar represents the standard deviation from two biological replicates. b.) H3K4me1 ChIP-seq experiments were performed in d2 EpiLC, d4 EpiSC and d8 EpiSC differentiated from E14 WT and Otx2 -/- ESC. H3K4me1 levels within PGCLC enhancers are shown as median profile (top) and heatmap plots (bottom). In the heatmap plots the PGCLC enhancers were ordered as in Fig. 2b. c.) The CpG methylation levels of the Esrrb enhancer were determined by bisulfite sequencing in the same cell types and stages described in b. The columns of the plots correspond to individual CpG dinucleotides located within the Esrrb enhancer. Unmethylated CpGs are shown in blue, methylated CpGs in red and CpGs that were not covered in gray. At least 10 alleles were analyzed in each cell type (rows). d.) R1 WT, Otx2 -/-, dCD and dCD Otx2 -/- ESC were differentiated into PGCLC. PGCLC were quantified as the proportion of CD15+ CD61+ cells found within d4 EB. The PGCLC differentiations for the dCD Otx2 -/- ESC were performed in three biological replicates and using two different clonal lines (n=3x2). The other PGCLC measurements were performed in biological triplicates. e.) Genome-wide bisulfite sequencing experiments were performed in d2 EpiLC differentiated from R1 WT, Otx2 -/-, dCD and dCD Otx2 -/- ESC. mCpG levels within PGCLC enhancers are shown as median profile (top) and heatmap plots (bottom). In the heatmap plots the PGCLC enhancers were ordered as in Fig. 2b. f.) Model illustrating the priming of PGCLC enhancers during PGCLC differentiation and its relevance for germline competence. In the naïve pluripotent stage, PGCLC enhancers are active (H3K27ac + H3K4me1 + CpG hypomethylation), while with further development into the formative (EpiLC) and primed (EpiSC) pluripotent states, the PGCLC enhancers become either heterogeneously primed (intermediate H3K4me1 + CpG methylation) or fully decommissioned (H3K9me2/3 + CpG hypermethylation), respectively. Upon PGCLC differentiation, the primed but not the decommissioned PGCLC enhancers can be re-activated, indicating that PGCLC enhancer priming is important for germline competence. Congruently, Otx2 -/- EpiLC display increased PGCLC enhancer priming and germline competence, while the opposite is true for H3K4me1/2-deficient EpiLC, in which PGCLC specification is impaired.

DISCUSSION:
Enhancer priming by H3K4me1 has been suggested to precede and facilitate enhancer activation 17,19,60,71. However, direct evidence supporting the functional relevance of H3K4me1 is scarce, partly due to the difficulty of separating the enzymatic and non-enzymatic functions of histone methyltransferases 75  Similarly, we found that the induction of PGCLC genes in dCD cells was moderately and partly disrupted. Therefore, we propose that priming by H3K4me1/2 might facilitate, rather than be essential for, enhancer activation and the robust induction of developmental gene expression programs. This facilitator role of H3K4me1/2 might involve not only protection from DNA methylation but also increased chromatin accessibility and physical proximity between genes and enhancers 25.
Future work in additional species and cellular transitions will establish the relevance and prevalence of H3K4me1/2 enhancer priming.
In the context of PGCLC differentiation, our scRNA-seq analysis suggests that, despite the remarkable transcriptional similarities between the naïve pluripotency and PGCLC states, germline competence is not attained until the naïve expression programme is robustly and homogeneously dismantled. Remarkably, the transcriptional homogeneity of the formative epiblast is accompanied by a simultaneous increase in epigenetic heterogeneity within enhancers. Hence, in contrast to ESC or the naïve epiblast, where CpG methylation levels within enhancers are negatively correlated with the expression of major naïve pluripotency regulators 54,81, in the formative epiblast the epigenetic heterogeneity within PGCLC enhancers seems to be uncoupled from the transcriptional status of the PGCLC genes. Therefore, in formative epiblast cells, mCpG levels within enhancers might represent a marker for inferring both developmental timing and germline competence. Moreover, the epigenetic heterogeneity of PGCLC enhancers implies that epiblast cells might acquire germline competence in an unsynchronized and transient manner shortly after exiting the naïve pluripotent state, thus offering a plausible explanation as to why only a fraction of formative epiblast cells can be differentiated into PGCs when exposed to high inductive signals 4,5. Similarly, the extended germline competence of epiblast cells under certain genetic or metabolic conditions 16,46 might also be explained by an increase in the fraction of cells in which PGCLC enhancers are epigenetically primed, as shown for Otx2 -/- EpiLC (Fig. 7).
On the other hand, it is still unclear how PGCLC enhancers transiently acquire a  18,19 . In the case of germline competence, we uncovered that the priming of PGCLC enhancers in formative epiblast cells involves the transient and partial retention of H3K4me1 from a preceding active state. The mechanisms involved in this H3K4me1 retention are unknown, but we can envision at least two non-mutually exclusive scenarios: (i) an active maintenance mechanism similar to the one reported in other cell lineages 18,19 , whereby the persistent binding of MLL3/4 and of certain TFs in EpiLC enables the retention of H3K4me1 within PGCLC enhancers; (ii) a passive mechanism whereby MLL3/4 binding to PGCLC enhancers is already lost in EpiLC, but H3K4me1 can still be transiently retained due to the slow dynamics of H3K4 demethylation 86,88 . On the other hand, recent work based on chromatin accessibility or multi-omic profiling indicates that while the commitment towards certain cell lineages might involve the activation of already primed enhancers, in other cases this can occur through de novo enhancer activation 44,49 .
Future work will elucidate the prevalence and regulatory mechanisms by which the epigenetic state of enhancers can contribute to cellular competence and the robust deployment of developmental gene expression programs.

ACKNOWLEDGMENTS:
We thank the Rada-Iglesias lab members for insightful comments and critical reading

Declaration of Interests
The authors declare no competing interests.

Data availability
The sequencing datasets generated during this study will be made available through GEO upon publication.

Generation of the Mll3/4 dCD, Mll3/4 dCT and Mll4 CT ESC lines
The Mll3/4 dCD ESC line was previously generated and described 32

Quantification of PGCLC by flow cytometry
In general, PGCLC were quantified using antibody staining and flow cytometry 94 .
Briefly, after four days of PGCLC differentiation, the resulting EB were dissociated and stained for 45 minutes with antibodies against CD61 ( Biolegends ) and CD15

ATAC-seq
The

ChIP-seq and ChIPmentation
Cells were cross-linked with 1 % formaldehyde for 10 min, followed by quenching

ChIP-bisulfite sequencing
ChIP-bisulfite experiments were performed as described in Thomson et al. 2010 100 with slight modifications. Firstly the ChIP protocol described above was followed.
Then, the previously generated lineage assignments 44 were used to solely select epiblast cells for further analysis.

scRNAseq data analysis
The code used to define PGCLC genes will be fully available through Github. Briefly,

ATAC-seq data processing
For ATAC-seq data processing, paired end reads were mapped to the mouse genome ( mm10 ) using BWA 108 . Read duplicates and reads within blacklisted regions were discarded.
Given the concordance of the ATAC-seq replicates (Spearman correlation of a 2 kb window for ESC: 0.86; EpiLC: 0.88; EpiSC: 0.86), BAM files for each stage were merged and converted into bigWig files by normalization to 1x sequencing depth with deeptools-3.3.1 109. The processed ATAC-seq data can also be found in the UCSC browser session mentioned above.
For the statistical analysis, we first determined the average signals of the ChIP-seq, ATAC-seq or genome-wide CpG bisulfite sequencing data within -/+ 1 kb of the PGCLC enhancers in ESC, EpiLC and EpiSC using deeptools-3.3.1 109. Then, for the signal comparison relative to d2 EpiLC, the p-values were estimated with a paired, two-sided Wilcoxon test (confidence level: 0.95). The effect sizes of the paired Wilcoxon tests were calculated by dividing the z-statistics by the square roots of the sample sizes using rstatix, with boot.ci used to approximate the confidence intervals.
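The effect size described above (z-statistic divided by the square root of the sample size) can be illustrated with a minimal Python sketch. This is not the rstatix implementation used in the study: the function name is hypothetical, zero differences are simply dropped, n is taken as the number of non-zero pairs, and the z-statistic comes from the normal approximation with a tie correction, all of which are assumptions made for illustration.

```python
import math

def paired_wilcoxon_effect_size(x, y):
    """Return (z, r) for paired samples, with r = |z| / sqrt(n).

    Assumptions of this sketch: zero differences are discarded and
    n is the number of remaining (non-zero) pairs.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their mid-rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        mid_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mid_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    # Signed-rank statistic and its normal approximation.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean_w) / math.sqrt(var_w)
    return z, abs(z) / math.sqrt(n)
```

In this sketch, x and y would hold the average signal of each PGCLC enhancer in two stages (for example EpiSC and d2 EpiLC), so that each enhancer contributes one pair.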

CpG methylation analysis
The local bisulfite sequencing experiments performed for selected PGCLC enhancers were analyzed with BISMA 112.
Genome-wide DNA methylation data were analyzed with Bismark 113. However, as the considered data sets were prepared with slightly different protocols, the preprocessing steps were adjusted accordingly. For the whole-genome bisulfite sequencing data from 2i+LIF ESC (GSE41923), d2 EpiLC and EpiSC (GSE70355), adapter trimming was performed with Trim Galore (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) using the default settings; for data sets generated by post-bisulfite adaptor tagging (PBAT), either 9 bp (data from d2 EpiLC and PGCLC (DRA003471)) or 6 bp (genome-wide methylation data generated in this study) were removed. The DNA methylation data from E4.5-E6.5 epiblasts were generated by STEM-seq (GSE76505) and, in this case, adapter sequences were removed with cutadapt 114 and Trim Galore, respectively. For all the previous samples, reads were mapped with Bismark-v0. 16

CpG methylation heterogeneity and scNMT-data analysis
With the scNMT-seq method the transcriptome, methylome (CpG methylation) and chromatin accessibility (GpC methylation) are recorded from the same single cell 116 .
Among all the single cells analyzed in E4. 5 Table 2) using bedtools ( https://github.com/arq5x/bedtools2 ) from which the mean methylation of all covered CpGs was determined in each single cell. Next, the average PGCLC enhancer methylation for all the epiblast cells within a developmental stage was determined and the cells in each stage were classified as either low or high if their mean PGCLC enhancer methylation was below or above the stage average, respectively (Fig. 3f).
From the parsed chromatin accessibility data, the mean mGpC level for each PGCLC enhancer within a -/+ 1kb window was determined and the values of all PGCLC enhancers were summarized for each cell. For the scRNAseq data, the mean expression of all PGCLC genes (FPKM-normalized) was determined per cell.
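The low/high classification of single cells described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function name and the input layout (one tuple per cell with a cell identifier, a stage label and the mean PGCLC enhancer methylation) are hypothetical, and cells exactly at the stage average, a case the text does not specify, are assigned to "high" here.

```python
from collections import defaultdict
from statistics import mean

def classify_cells_by_stage(cells):
    """Label each cell "low" or "high" relative to its stage average.

    cells: iterable of (cell_id, stage, mean_enhancer_methylation).
    Returns (labels, stage_average) as two dictionaries.
    """
    cells = list(cells)

    # Average PGCLC enhancer methylation over all cells of each stage.
    by_stage = defaultdict(list)
    for _, stage, level in cells:
        by_stage[stage].append(level)
    stage_average = {stage: mean(levels) for stage, levels in by_stage.items()}

    # Compare each cell with the average of its own stage.
    labels = {}
    for cell_id, stage, level in cells:
        labels[cell_id] = "low" if level < stage_average[stage] else "high"
    return labels, stage_average
```

Because the threshold is computed per stage, a cell labelled "low" at E6.5 can still carry more enhancer methylation than a cell labelled "high" at E5.5.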

4C-seq
Reads were assigned to samples based on the first 10 bases of the read.
Subsequently, the primer sequence was removed from the read and the remaining sequence starting before the restriction site for NlaIII (CATG) was trimmed to 41 bases. These 41 bases were aligned to the mouse reference genome ( mm10 ) using HISAT2 117 . From these alignments, RPM (reads per million) normalized bedgraph files were generated for downstream visualization and analysis 118 .
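The read assignment, trimming and RPM steps above can be sketched in Python. This is one reading of the description, not the pipeline used in the study: the function names, the primer dictionary and the assumption that each reading primer ends with the NlaIII site (so that the retained 41 bases start at CATG) are hypothetical choices made for illustration.

```python
def process_4c_read(read, primers, site="CATG", key_len=10, keep=41):
    """Assign a read to a sample and return (sample, trimmed_sequence).

    primers: dict mapping sample name -> primer sequence, assumed to end
    with the restriction site. Returns None if the read cannot be used.
    """
    # Samples are identified by the first `key_len` bases of the read.
    key_to_sample = {primer[:key_len]: name for name, primer in primers.items()}
    sample = key_to_sample.get(read[:key_len])
    if sample is None:
        return None

    primer = primers[sample]
    if not read.startswith(primer) or not primer.endswith(site):
        return None

    # Drop the primer but keep the restriction site, then trim to `keep` bases.
    start = len(primer) - len(site)
    fragment = read[start:start + keep]
    if len(fragment) < keep:
        return None
    return sample, fragment


def rpm_normalize(counts):
    """Scale raw read counts to reads per million (RPM)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    return {key: value * 1e6 / total for key, value in counts.items()}
```

In practice the trimmed sequences would be written to FASTQ for alignment with HISAT2, and the RPM scaling would be applied to per-fragment read counts before writing the bedgraph files.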

Bulk RNA-sequencing
Paired-end RNA-seq data were mapped with STAR 119 to the mouse reference genome (Ensembl gene annotation, v99) and reads within genes were counted with featureCounts 120. The log2 fold changes (Supplementary Table 3), rlog and read count normalization were estimated with DESeq2 from all replicates generated 121.

Figure S1: Characterization of the PGCLC differentiation system by scRNA-seq (related to Figure 1). a.) t-SNE plot based on the scRNA-seq data generated in d4 EB (n = 368 cells) (left) showing the expression of a representative marker for each main cluster. For each of the main four clusters described in Fig. 1c, the expression of representative gene markers is shown for individual cells as a UMI count heatmap (right). b.) D4 EB cells were sorted by FACS into CD15+ CD61+ (i.e. PGCLC) or CD61- (i.e. non-PGCLC control cells). Then, the expression of the indicated genes, including the PGC markers Prdm14, Nanos3 and Nanog, was determined by RT-qPCR in those two cell populations. The error bars represent standard deviations from three technical replicates. c.) Expression dynamics of the PGCLC genes during PGCLC formation. Each dot represents the mean expression of one PGCLC gene as measured in all individual cells belonging to the indicated stages. EB refers to any cell of the EB except those identified as PGCLC. d.) Box plots displaying the protein levels measured for the PGCLC genes at different time points during the differentiation of ESC into EpiLC. Dots represent the mean protein levels expressed by each PGCLC gene. The data for each EpiLC time point were normalized to the protein levels measured in ESC and presented as log2 values. The proteomic data were obtained from Yang et al. 2019 40. e.) Computational approaches used to investigate potential transcriptional priming of the PGCLC expression program in EpiLC.
Left: The t-SNE plot of d2 EpiLC scRNA-seq data did not reveal any obvious clusters suggestive of differential cell identity based on gene expression. Right: The RNA velocity analysis, which takes into account unspliced transcript reads for lineage tracing 41, did not show any evidence of transcriptional priming of d2 EpiLC (red) towards the PGCLC cluster (gray). In contrast, this same analysis revealed that a fraction of the cells profiled for the d2 EB (with a mixed mesodermal/PGCLC identity; gray arrow) seem to acquire a transcriptional program similar to the one found for PGCLC (shadowed in gray), indicating the feasibility of the RNA velocity method. f.) The dismantling of the PGCLC genes upon exit from the naïve pluripotent state (E4.5) in vivo was evaluated using scRNA-seq data obtained from Argelaguet et al. 2019 44. Top: t-SNE plot based on scRNA-seq data from E4.5, E5.5 and E6.5 epiblasts. Middle: The dots represent the mean expression for all PGCLC genes in individual cells belonging to the indicated stages. Bottom: The violin plots show the transcriptional noise, defined as cell-to-cell transcript variation for the 500 most variable genes, for E4.5, E5.5 and E6.5 epiblast cells. The * indicates that the transcriptional noise for the indicated stage was significantly different in comparison to E5.5 (Wilcoxon test, *: p < 2.2 × 10^-16). g.) Scatter plots comparing the transcriptomes of WT d2 EpiLC (x-axis) with those of d1.5 EpiLC, d3 EpiLC, Prdm14 -/- d2 EpiLC or Otx2 -/- d2 EpiLC (y-axis). All genes considered as expressed are shown as dots, with the PGCLC genes highlighted in green. The gene expression values are rlog normalized (DESeq2). The RNA-seq data were obtained from 11,40,50.

CpG density is defined as the fraction of CpGs found within 100 bp of each PGCLC enhancer sequence. For each PGCLC enhancer, the average CpG density within a -/+ 500 kb window was considered. PGCLC enhancers with a CpG density > 0.15, which account for about 4.5 %, were omitted in the graph. b.) CpG methylation heterogeneity heatmap showing the pairwise comparisons between individual cells based on their genome-wide mCpG levels. The CpG methylation heterogeneity values are presented with a blue-red scale (blue means that cells are epigenetically similar and red that they are different). Above the heatmap, the developmental stages of the analyzed cells (E4.5, E5.5 or E6.5) and the average mCpG/CpG ratio (blue-orange scale) measured for all PGCLC enhancers within each single cell are shown (n = 258 cells). Cells were ranked according to genome-wide mCpG levels within each stage. c.) Violin plots showing genome-wide CpG methylation heterogeneity measured in E4.5, E5.5 and E6.5 epiblast cells. The CpG methylation heterogeneity in the E5.5 epiblast cells was compared to the one measured in the E4.5 or E6.5 cells using Wilcoxon tests (*: p < 2.2 × 10^-16). d.) Scatterplot showing the correlation between (left) genome-wide mCpG or (right) CpG coverage and the mean mCpG/CpG ratio of PGCLC enhancers for individual cells belonging to E4.5, E5.5 and E6.5 epiblasts (n=258 cells), respectively. The correlation coefficient (R) and p-value were calculated using Spearman correlation.