Haematopoietic stem and progenitor cells from human pluripotent stem cells

A variety of tissue lineages can be differentiated from pluripotent stem cells by mimicking embryonic development through stepwise exposure to morphogens, or by conversion of one differentiated cell type into another by enforced expression of master transcription factors. Here, to yield functional human haematopoietic stem cells, we perform morphogen-directed differentiation of human pluripotent stem cells into haemogenic endothelium followed by screening of 26 candidate haematopoietic stem-cell-specifying transcription factors for their capacity to promote multi-lineage haematopoietic engraftment in mouse hosts. We recover seven transcription factors (ERG, HOXA5, HOXA9, HOXA10, LCOR, RUNX1 and SPI1) that are sufficient to convert haemogenic endothelium into haematopoietic stem and progenitor cells that engraft myeloid, B and T cells in primary and secondary mouse recipients. Our combined approach of morphogen-driven differentiation and transcription-factor-mediated cell fate conversion produces haematopoietic stem and progenitor cells from pluripotent stem cells and holds promise for modelling haematopoietic disease in humanized mice and for therapeutic strategies in genetic blood disorders.

LCOR, RUNX1, SPI1) were consistently detected in myeloid, B, and T cells of five engrafted recipients, suggesting that these factors conferred multi-lineage haematopoietic reconstitution potential (Extended Data Fig. 4c). These seven transcription factors were recovered both in iPSC and in hESC-derived cells. Additionally, ZKSCAN1, SSBP2, MAFF, DACH1, and SOX4 were detected in some animals, perhaps reflecting their potential to enhance engraftment under some experimental conditions. Distinct factors were recovered when screening for in vitro colony-forming potential (Extended Data Fig. 4d).
We next determined whether the seven common transcription factors were necessary and sufficient to support multi-lineage engraftment of haemogenic endothelium in vivo. We transduced haemogenic endothelium with these seven transcription factors, injected cells intrafemorally into sub-lethally irradiated NSG mice, and treated the mice with doxycycline for 2 weeks. We observed multi-lineage engraftment in bone marrow and thymus at 12 weeks (Fig. 1c). We sought to determine the minimal combination of transcription factors required for multi-lineage engraftment by a factor-minus-one approach. No exclusion of a single factor abolished engraftment, although omitting RUNX1, ERG, LCOR, HOXA5, or HOXA9 compromised multi-lineage reconstitution and reduced total chimaerism in bone marrow at 8 weeks ( Fig. 1d and Extended Data Fig. 4e). These data suggest that, at a minimum, RUNX1, ERG, LCOR, HOXA5, and HOXA9 facilitate engraftment and multilineage differentiation.

Transcription factors confer multi-lineage engraftment
We monitored mice engrafted with haemogenic endothelium transduced with the defined 7 transcription factors (HE-7TF cells) and documented multi-lineage engraftment with erythroid cells (GLY-A + ), myeloid cells (CD33 + ), B cells (CD19 + ), and T cells (CD3 + ) in 5 of 13 recipients at 12 weeks. The remaining eight recipients were engrafted with B cells and T cells and either erythroid or myeloid cells (Fig. 2a, b). We next validated the self-renewal capacity of haemogenic endothelium-derived cells by secondary transplantation. We transplanted marrow from three primary mice (9,11,and 16;Fig. 2b) that showed multilineage engraftment at 8 and 12 weeks into secondary animals. Multiple secondary recipients engrafted with multi-lineage haematopoiesis at 8, 14, and 16 weeks ( Fig. 2b and Extended Data Fig. 5a-c). To quantify the frequency of these secondary repopulation units, we transplanted secondary recipient mice with 1,000 and 3,000 CD34 + cells isolated from the marrow of primary engrafted recipients, and observed multi-lineage engraftment in approximately one-third of animals. Among 10 secondary mice injected with 3,000 CD34 + cells from primary recipients of HE-7TF, a total of 3 recipients showed multi-lineage reconstitution in independent experiments, whereas 1,000 CD34 + cells from HE-7TF engrafted 0 out of 5 mice (Fig. 2b, c). Using ELDA software, the calculated stem-cell frequency for our HE-7TF cells was 1 in 10,093 cells ( Fig. 2c and Extended Data Fig. 5d).
To determine which of the seven transcription factors were consistently recovered from secondary engrafted mice, we performed PCR amplification of myeloid, B and T cells from two mice, and detected LCOR, HOXA5, HOXA9, and RUNX1 in every lineage. ERG was noted in only myeloid and B cells, while SPI1 and HOXA10 seemed dispensable (Extended sufficient to confer multi-lineage engraftment. We engineered two polycistronic lentiviral vectors to transduce iPS-derived haemogenic endothelium with LCOR-P2A-HOXA9-T2A-HOXA5 and RUNX1-P2A-ERG, and detected multi-lineage erythroid cells, neutrophils, B cells, and T cells in multiple engrafted recipients at 12 weeks (5-TFs poly ; Fig. 2b, left, and Extended Data Fig. 5f, g). Bone marrow transplanted from these mice into irradiated secondary recipients reconstituted multi-lineage haematopoiesis in two of five recipients at 8-12 weeks (Fig. 2b, middle). Our combined experience showed that when haemogenic endothelium was transduced with the entire 26-factor library, 11 out of 40 mice showed engraftment, 5 of which were multi-lineage (3 from iPSC and 2 from hESC). Transduction with 7 transcription factors yielded engraftment in 33 out of 76 mice, with 9 multi-lineage, whereas transduction with 5 transcription factors yielded engraftment in 15 out of 30 mice, with 5 multi-lineage. Engraftment and multi-lineage contribution were assessed from bone marrow at 4-16 weeks (Extended Data Fig. 3).
To investigate the source of our long-term engraftable cells, we performed the following three sets of experiments. First, we transduced haemogenic endothelium cells on day 0 or day 3 of EHT followed by transplantation into mice, and observed that day 3 EHT cells transduced with seven transcription factors resulted in engraftment, whereas day 0 EHT cells did not (Extended Data Fig. 6a). This emphasizes the role of EHT culture in generating engraftable cells, and suggests that the seven transcription factors promote engraftment not in haemogenic endothelium cells but in the context of cells undergoing EHT. Second, to show the specificity of the seven transcription factors for haemogenic endothelium cells undergoing EHT, we transduced human umbilical vein endothelial cells, but failed to observe the appearance of a CD34 + haematopoietic population (Extended Data Fig. 6b). Finally, we transduced our seven transcription factors on day 1 of EHT, when there were no detectable haematopoietic cells 21 and cultured them for 6 days, after which we sorted CD34 + CD43 + CD45 + triple-positive cells and CD34 + CD43 − CD45 − single-positive cells. We transplanted these two cell populations into non-irradiated c-Kit-deficient immune-deficient recipients 22 , and monitored engraftment capacity after 8 weeks of transplantation. Two out of five mice transplanted with triple-positive cells showed multi-lineage engraftment, whereas five mice transplanted with single-positive cells did not (Extended Data Fig. 6c, d), suggesting that haemogenic endothelium cells undergoing EHT are the source of long-term engraftment in our system. In summary, these three experiments suggest that haematopoietic cells that emerge from definitive haemogenic endothelium during the in vitro process of EHT, when transduced by a core set of transcription factors, obtain the capacity for multilineage haematopoietic reconstitution in recipient mice.
To examine the extent to which the HE-7TF cells recapitulated HSC or progenitor-intrinsic gene expression programs, we compared RNA sequencing (RNA-seq) data from CD34 + CD38 − CD45 + HSPC populations harvested from bone marrow after 12 weeks of engraftment to publicly available data sets for HSCs and progenitors 23,24 . Global transcriptomic analysis showed that HE-7TF cells strongly correlated with haemogenic endothelium, engrafted cord blood HSCs, and fresh cord blood CD34 + cells (Pearson correlation > 0.7; Extended Data Fig. 7a). HE-7TF cells clustered closest to the haemogenic endothelium cells, thereby implying incomplete conversion of HE-7TF cells to an HSC-like state (Extended Data Fig. 7a). Furthermore, gene set enrichment analysis indicated a positive correlation with signatures associated with transcription factors required for specification and/or maintenance of HSCs (Extended Data Fig. 7b). Gene expression signatures correlated with activation of chemokine receptor signalling, and integrin signalling, pathways known to influence HSC homing and engraftment [25][26][27] . In summary, HE-7TF cells have activated specific transcription factor modules and pathways that reflect their capacity for multilineage haematopoietic engraftment and reconstitution in transplant recipients.
Furthermore, we observed that the expression levels of HOXA5, HOXA9, HOXA10, and SPI1 transgenes in HE-7TF cells were comparable to those in HSCs (Extended Data Fig.  7c), whereas the expression of LCOR, RUNX1, and ERG was suppressed (Extended Data Fig. 7c), indicating that the converted HSPC-like cells partly reprogram only certain loci to establish expression patterns characteristic of native HSCs. Recently, it was shown that the overexpression of HOXA genes in iPS-HPCs failed to adequately activate HOX target genes 28 , but in the case of HE-7TF cells, multiple HOXA target genes were successfully activated including SOX4 and ID2 (Extended Data Fig. 7c, right), highlighting the likely context-dependent expression of co-factors that are present in HE-7TF cells. Activation of HOX target genes also confirms the comparable expression level of HOX genes in the engrafted HE-7TF cells compared with other HSC/HSPCs, as well as upregulation of other HSC-specific transcription factors (GATA2, TGIF2, SOX4, EVI1) (Extended Data Fig. 7d).
To understand the heterogeneity of HE-7TF cells compared with cord blood HSCs, we performed Indrops single-cell RNA-seq on HE-7TF and cord blood HSCs 29 . Comparing the 500 most variable genes, we observed distinct clustering of HE-7TF and cord blood HSCs (Extended Data Fig. 7e, f); however, when we restricted our analysis to 62 canonical haematopoietic genes, we noted substantial overlap between cord blood HSCs and HE-7TF cells (Fig. 2d). Overall, these data demonstrate that HE-7TF cells recapitulate gene expression signatures consistent with their potential for multi-lineage haematopoietic engraftment, yet remain molecularly distinct from cord blood HSCs.

Characterization of differentiated cells
We examined erythroid, myeloid, and lymphoid cells recovered from engrafted mice, and compared their phenotypic and functional properties to those of cord blood HSC-derived cells. Definitive erythropoiesis is characterized by globin switching and enucleation 30 . Most erythroid cells generated from hPSCs express embryonic and fetal globins and retain nuclei 7 , whereas human erythroid cells recovered from our engrafted mice lacked expression of embryonic HBE, and expressed fetal HBG and adult HBB at levels comparable to colonyforming unit-erythroid (CFU-E) cells from human cord blood ( Fig. 3a and Extended Data Fig. 8a-d). Human GLY-A + cells derived from HE-7TF cells underwent enucleation, indicating developmental maturation of the engrafted human erythroid cells into definitive erythrocytes (Fig. 3b). To measure the response of human myeloid cells in NSG recipients to cytokine stimuli by activation of myeloperoxidase (MPO), we isolated human CD45 + CD15 + PECAM + neutrophils from engrafted bone marrow, and compared MPO production to engrafted cord-blood-derived cells. Phorbol myristate acetate (PMA) stimulation enhanced the production of MPO three fold relative to unstimulated neutrophils (Fig. 3c). Bona fide human HSCs generate functional T and B cells in NSG mice 19 . In the sera of NSG mice engrafted with HE-7TF cells, we detected human immunoglobulin-M and -G (IgM and IgG), indicating cooperative activity of T and B cells in mediating immunoglobulin class switching and antibody secretion (Fig. 3d). Moreover, inoculation with ovalbumin protein boosted levels of ovalbumin-specific IgM and IgG, indicating a functional immune response to a protein antigen (Fig. 3d). We isolated mature CD3 + T cells from bone marrow and observed interferon-γ (IFN-γ) production after re-stimulation using PMA/ionomycin (Fig. 3e). T cells develop from CD4 − CD8 − double-negative cells followed by CD4 + CD8 + double-positive cells that express surface T-cell antigen receptor (TCR)/CD3 complex, which differentiate to either CD4 or CD8 single-positive T cells in the thymus, and then migrate to blood and bone marrow 31 . At 8 weeks, HE-7TF cell-derived thymocytes were predominantly CD4 + CD8 + (55 ± 22%), with few CD4 + CD8 − (1.8 ± 0.42%) and CD4 − CD8 + (0.80 ± 0.36%) cells (Extended Data Fig. 8e). Human CD3 + T cells differentiated from cord blood HSCs in NSG mice possess either TCRαβ (>60%) or TCRγδ (<30%) 19 . HE-7TF cell-derived grafts produced most TCRαβ (89 ± 9.3%) and only low levels of TCRγδ (3.8 ± 6.6%) cells (Extended Data Fig. 8e). Development of a diverse population of antigen-specific T cells requires rearrangement of germline-encoded TCR genes 32 , largely mediated by recombination of the complementarity determining region 3 (CDR3) within variable (V) gene segments of the TCRA and TCRB genes. To determine clonotype diversity, we profiled the CDR3 region of TCRB on CD3 + T cells isolated from reconstituted mice using next-generation sequencing. We observed a high degree of combinatorial diversity in the V-gene segment usage of CD3 + T cells isolated from either cord-blood-engrafted NSG or HE-7TF cell-engrafted NSG mice, with the CDR3 length following a standard Gaussian distribution ( Fig. 3f and Extended Data Fig. 8f-h). In addition, splenocytes isolated from engrafted mice harboured human B cells, naive T cells, and memory T cells (Extended Data Fig. 8i versus cord blood engraft in Extended Data Fig.  8j). We then determined whether the myeloid and lymphoid progeny of HE-7TF cells were of clonal origin by comparing sequences of lentiviral integration sites 33 . Genomic DNA from myeloid (CD33 + ), B (CD19 + ), and T cells (CD3 + ) from bone marrow and thymus at 5 and 8 weeks after transplantation was sonicated, followed by adaptor-ligation PCR and deep sequencing (8-week-old mouse used is indicated in Fig. 2b). Among all recovered integration sites, those common to myeloid cells, B cells, and T cells represented up to 10% of the total and were found in each individual recipient (Extended Data Fig. 9a-c), indicating that reconstitution is driven, at least in part, by single clonal HSC-like cells with multi-lineage potential. None of the genes adjacent to integrated proviruses have previously been linked to clonal expansion in previous HIV/AIDS studies (Extended Data Fig. 9d).

Discussion
The generation of functional HSC-like cells from PSCs has been a long-sought goal in haematology research. By directed differentiation of hPSCs to haemogenic endothelium followed by in vivo screening of transcription factors for haematopoietic progenitor specification, we have identified seven transcription factors that together confer HSC-like engraftment, self-renewal, and multi-lineage capacity. Although there is still a molecular and functional gap between our engineered cells and bona fide HSCs in their robustness of engraftment and full recapitulation of terminally differentiated cells (summarized in Extended Data Fig. 10 and Supplementary Table 1), our combined approach of morphogen-driven differentiation and transcription-factor-mediated cell fate conversion produces HSPCs from PSCs.
Combinations of transcription factors have been introduced into differentiated blood cells 2,7 to endow HSC-like properties in a mouse system 2 , but transplantable human HSCs with multi-lineage capacity and immunological functionality have so far not been derived from hPSCs. Recent advances in the directed differentiation of PSCs to definitive haemogenic endothelium have provided a supportive context for screening of HSC-specifying transcription factors. Each of the identified transcription factors in our study plays a role in HSC development, maintenance of long-term HSCs, or lineage commitment. RUNX1 is essential for haematopoietic commitment of haemogenic endothelium and can convert endothelial cells to haematopoietic progenitor cells 3,34,35 . LCOR, a component of a histone deacetylation complex, is mutated in B-cell lymphoma, suggesting a role in Blymphopoiesis 36,37 , but this factor has not previously been implicated in HSC function and its role remains to be defined. SPI1 (also known as PU.1) is required for haematopoietic progenitor cell emergence and regulates myeloid specification 38 . HOX family members have been reproducibly implicated in haematopoiesis across species 39,40 . HOXA9 is the key homeotic gene that defines HSC identity 41,42 , interacting with ERG to support HSC renewal during embryogenesis and stress haematopoiesis [43][44][45] , suggesting a basis for the functional cooperation of HOXA9 and ERG in our system. HOXA5 is a transcriptional target of Notch signalling in T-cell progenitors along with HOXA9 and HOXA10, consistent with a role in T-lymphopoiesis 46 . These factors share binding sites in the genome and cooperate to recruit chromatin modulators (for example, RUNX1 and HOXA families) 45,47 to induce and maintain HSPCs.
Our factor-minus-one approach to the defined seven transcription factors suggested that they each contribute to a common gene regulatory network with some redundancy, as exclusion of individual factors did not fully abrogate engraftment.
Our study suggests that we are tantalizingly close to realizing the potential of derivation of HSC-like cells from PSCs. Such cells, when derived from patients with genetic blood disorders, offer considerable promise for modelling human blood disease, for humanizing mice for research applications, and for testing the capacity of gene therapy vectors or pharmacological agents to restore haematopoietic function. Our ultimate goal remains the derivation of bona fide transgene-free HSCs for applications in research and therapy.

Online Content
Methods, along with any additional Extended Data display items and Source Data, are available in the online version of the paper; references unique to these sections appear only in the online paper.

Methods
A step-by step protocol describing the HSPC conversion of human PSCs can be found at Protocol Exchange 48 .

hPSC culture, registration, and deposition of hPSC lines
All experiments were performed with H9 hESC (NIHhESC-10-0062), PB34 iPS 49 , MSC-iPS1 50 , 1045-iPSC, and 1157-iPSC established by the hES Core Facility at Boston Children's Hospital. Human ESCs and iPSCs were maintained on hESC-qualified Matrigel (BD) in mTeSR1 media (Stem Cell Technologies) or mouse embryonic fibroblasts (GlobalStem) feeders in DMEM/F12 + 20% KnockOutSerum Replacement (Invitrogen), 1 mM L-glutamine, 1 mM NEAA, 0.1 mM β-mercaptoethanol, and 10 ng ml − 1 bFGF on 10 cm gelatinized culture dishes. Medium was changed daily and cells were passaged 1:4 onto fresh feeders every 7 days using standard clump passaging with dispase. Morphology of PSCs was checked by microscopy daily. As a quality control, only dishes with more than 70% of typical PSC colonies were processed for embryoid body formation. Cell lines were tested for mycoplasma routinely.

Lentiviral gene transfer
At day 3 of EHT culture, haemogenic endothelium cells were beginning to produce potentially haematopoietic 'round' cells; the occurrence of this phenomenon was used as quality control of haemogenic endothelium induction and transition to haematopoietic cells for each batch of experiments. The infection medium was EHT culture medium supplemented with Polybrene (8 μg ml −1 , Sigma). Lentiviral infections were performed in a total volume of 250 μl (24-well plate). The multiplicity of infection for the factors was as follows: Library 3.0 for each, ERG 5.0, HOXA5 5.0, HOXA9 5.0, HOXA10 5.0, LCOR 5.0, RUNX1 5.0, SPI1 5.0, LCOR-HOXA9-HOXA5 2.0, and RUNX1-ERG 2.0. Haemogenic endothelium was vulnerable to damage during spinoculation, thus infections were performed static for 12 h, then 250 μl of fresh EHT medium was supplemented to dilute Polybrene. Parallel wells were cultured for an additional 3 days to measure infection efficiency by the percentage of GFP + DAPI cells by FACS, achieving 30-70% of infection efficiency.

In vitro screening via CFU
Followed by lentiviral gene transfer, cells were maintained for 5 days in EHT culture medium supplemented with doxycycline (2 μg ml − 1, Sigma) to induce transgene expression in vitro. Fifty thousand cells were plated into 3 ml complete methylcellulose (H4434; StemCell Technologies). Additional cytokines added were 10 ng ml −1 FLT3, 10 ng ml −1 IL6, and 50 ng ml −1 TPO (R&D Systems). The mixture was distributed into two 60 mm dishes and maintained in a humidified chamber at 37 °C for 14 days. Colonies were scored manually or using a BD Pathway 855 fluorescent imager. At 14 days, granulocyte, erythrocyte, monocyte, megakaryocyte (GEMM) colonies were picked up by P20 pipette. Between 10 and 20 GEMM colonies were picked with 2 or 3 biological replicates. A QIAamp DNA Micro Kit (Qiagen) was used to collect and prepare total genomic DNA for PCR detection of transgenes. Nested PCR reaction was as follows: first round with LNCX forward primer (5′-AGC TCG TTT AGT GAA CCG TCA GAT C-3′) and EGFP N reverse primer (5′-CGT CGC CGT CCA GCT CGA CCA G-3′), 95 °C 5 min, 36 cycles of (95 °C for 30 s, 60 °C for 30 s, 72 °C for 5 min), 72 °C for 5 min, 4 °C hold; second round with forward primer for each gene and HA reverse primer (5′-TCT GGG ACG TCG TAT GGG TA-3′), 95 °C 5 min, 36 cycles of (95 °C for 30 s, 60 °C for 30 s, 72 °C for 30 s), 72 °C for 5 min, 4 °C hold.

In vivo screening via transplantation
Twelve hours after lentiviral gene transfer, cells were recovered by dispase for 5 min at 37 °C, and washed by PBS three times to ensure no carry-over of virus. Cells were resuspended at 0.3 × 10 5 to 3.0 × 10 5 cells per 25 μl buffer (PBS + 2% FBS from StemCell Technologies) and kept on ice until injection. Thirty thousand to 3.0 × 10 5 cells were intrafemorally injected in to NOD/LtSz-scidIL2Rgnull (NSG) mice and treated with doxycycline as described below (see section on 'Mouse transplantation'). Up to 100 μl peripheral blood was collected every 2-4 weeks, to 14 weeks. Mice were euthanized and bone marrow and thymus removed at 8-14 weeks. For transgene detection in engrafted cells, each lineage of cells was FACS-sorted from bone marrow. Myeloid cells: CD33 APC (P67.6; BD), CD45 PE-Cy5 (J33; Coulter). B cells: CD19 PE (HIB19; BD), CD45 PE-Cy5 (J33; Coulter). T cells: CD3 PE-Cy7 (SK7; BD), CD45 PE-Cy5 (J33; Coulter). Between 10,000 and 50,000 cells were isolated with 2 or 3 biological replicates for multiple cell lines (iPSCs and ESCs). The QIAamp DNA Micro kit (Qiagen) was used to collect and prepare total genomic DNA for PCR detection of transgenes. Nested PCR reaction was performed similarly to the in vitro screening described in the above section.

Mouse transplantation
NOD/LtSz-scidIL2Rgnull (NSG) mice (The Jackson Laboratory) were bred and housed at the Boston Children's Hospital animal care facility. Animal experiments were performed in accordance with institutional guidelines approved by Boston Children's Hospital Animal Care Committee. Intrafemoral transplantations were conducted with 6-to 10-week-old female mice irradiated (250 rad) 12 h before transplantation. Before transplantation, mice were temporarily sedated with isoflurane. A 26-half-gauge needle was used to drill the femur and a 0.3 × 10 5 to 3.0 × 10 5 range of cells was transplanted in a 25 μl volume using a 28.5-gauge insulin needle. Sulfatrim was administered in drinking water to prevent infections after irradiation. Doxycycline Rodent Diet (Envigo-Teklad Diets; 625 p.p.m.) and doxycycline (1.0 mg ml −1 ) were added to the drinking water to maintain transgene expression in vivo for 2 weeks (ref. 2). Secondary transplantation was performed with 1,000-3,000 human CD34 + cells (isolated from bone marrow by magnetic cell isolation with CD34 microbeads) at 8 weeks. Isolated cells were resuspended at 1,000-3,000 cells per 25 μl buffer (PBS + 2% FBS from StemCell Technologies) and kept on ice until injection. Cells were intrafemorally injected in to NSG mice. Sorted CD34 + CD43 + CD45 + (25,000 cells) or CD34 + CD43 − CD45 − (25,000 cells) HE-7TF cells were either intrafemorally or intravenously injected. For non-irradiated c-Kit-deficient immune-deficient recipients, the NOD.Cg-Kit W-41J Tyr + Prkdc scid Il2rg tm1Wj l/ThomJ model was used (The Jackson Laboratory). Investigators were blinded for the analysis of mice. The experiments were not randomized. No statistical methods were used to predetermine sample size. For this analysis, multi-lineage engraftment was defined as chimaerism of human CD45 + cells in bone marrow encompassing four distinct lineages (myeloid, erythroid, B-and T-lymphoid, each comprising more than 1% of engrafted human CD45 + cells) in this study.

Flow cytometry
Cells grown in EHT culture or harvested animal tissues were stained with the following antibody panels. Haemogenic endothelium panel: Coulter), CD15 APC (HI98; BD), and CD31/PECAM PE (WM59; BD). All stains were performed with fewer than 1 × 10 6 cells per 100 μl staining buffer (PBS + 2% FBS) with 1:100 dilution of each antibody, for 30 min at 4 °C in the dark. Compensation was performed by automated compensation with anti-mouse Igk negative beads (BD) and cord blood MNC stained with individual antibodies. All acquisitions were performed on a BD Fortessa cytometer. For detection of engraftment, human cord-blood-engrafted mouse marrow was used as a control to set gating; sorting was performed on a BD FACS Aria II cell sorter.

Quantitative PCR
RNA extraction was performed using an RNAeasy Microkit (Qiagen). Reverse transcription was performed using Superscript III (> 5,000 cells) or VILO reagent (< 5,000 cells) (Invitrogen). Quantitative PCR was performed in triplicate with SYBR Green (Applied Biosystems). Transcript abundance was calculated using the standard curve method. Primers used for globin genes were as follows 52

ELISA of terminally differentiated cells
FACS-isolated neutrophils (CD15 + PECAM + CD45 + ) and T cells (CD3 + CD45 + ) were cultured in IMDM + 10%FBS overnight in 96-well plates (flat-bottom), seeding 5,000-20,000 cells per well obtained from mice engrafted over 10% in primary recipients, or pooled mice engrafted less than 5% in primary recipients. Then supernatant was taken and analysed by MPO-or IFN-γ-ELISA -Ready-SET-Go! Kit (eBioscience) according to the manufacturer's protocol. The amount of IFN-γ was normalized per 1,000 cells. PMA (20 ng ml −1 ) and ionomycin (1 μg ml −1 ) were added to either neutrophils or T cells, then cells were cultured overnight (6-18 h). Human Ig production was measured from 50 μl of serum from NSG mice at 8 weeks (IgM) and 14 weeks after engraftment (IgG). Immunization of mice was done with OVA (F5503, Sigma) with Freund's complete adjuvant (F5881, Sigma), followed by booster doses of Freund's incomplete adjuvant (F5506, Sigma) according to the manufacturer's instructions. Six to 14 weeks after engraftment, mice were injected with antigen OVA (0.1%) emulsified in complete adjuvant subcutaneously at two sites on the back, injecting 100 μl at each site. A booster injection of antigen OVA (0.1%) emulsified in incomplete adjuvant was administered 14 days after immunization. The booster was given as a single subcutaneous injection with 100 μl at one site on the back. A serum sample was isolated from mice 7 days after the first booster dose, and human ova-specific antibody concentration was tested with an ovalbumin-specific IgG, OVA sIgG, ELISA Kit (Mybiosource, MB S700766) for human Ova-specific IgG and a Human Anti-Ovalbumin (Gal d 2) IgM ELISA Kit (Alpha Diagnostic, 670-145-OVM) to detect human Ova-specific IgG and IgM, respectively. The technical replicates were done with three measurements of the same experimental setup.

T-cell receptor CDR3 sequencing
Human CD3 + T cells were FACS-isolated from thymus of engrafted NSG mice. Purified DNA was subjected to next-generation sequencing of CDR3 using immunoSEQ (Adaptive Biotechnology, Seattle, Washington, USA) and analysed with immunoSEQ Analyzer software (Adaptive Biotechnology).

Affymetrix SNP 6.0 genotyping of engrafted cells
Aliquots (250 ng) of genomic DNA from human CD45 + bone marrow cells (CD45 PE-Cy5 (J33; Coulter)) from engrafted NSG mice and original PSCs (two biological replicates) were digested with either Nsp1 or Sty1. A universal adaptor oligonucleotide was then ligated to the digested DNAs. The ligated DNAs were diluted with water and three 10 μl aliquots from each well of the Sty 1 plate and four 10 μl aliquots from each well of the Nsp 1 plate were transferred to fresh 96-well plates. PCR master mix was added to each well and the reactions cycled as follows: 94 °C for 3 min; 30 cycles of 94 °C for 30 s, 60 °C for 45 s, 68 °C for 15 s; 68 °C for 7 min; 4 °C hold. After PCR, the seven reactions for each sample were combined and purified by precipitation from 2-propanol/7.5 M ammonium acetate. The ultraviolet absorbance of the purified PCR products was measured to ensure a yield ≥4 μg μl −1 . Forty-five microlitres (≥ 180 μg) of each PCR product were fragmented with DNase 1 so the largest fragments were < 185 base pairs. The fragmented PCR products were then endlabelled with a biotinylated nucleotide using terminal deoxynucleotidyl transferase. For hybridization, the end-labelled PCR products were combined with hybridization cocktail, denatured at 95 °C for 10 min and incubated at 49 °C. Two hundred microlitres of each mixture was loaded on a GeneChip and hybridized overnight at 50 °C and 60 r.p.m. After 16-18 h of hybridization, the chips were washed and stained using the GenomeWideSNP6_450 fluidics protocol with the appropriate buffers and stains. After washing and staining, the GeneChips were scanned on a GeneChip Scanner 3000 using AGCC software. Genotype calls (chp files) were generated in Affymetrix Genotyping Console using the default parameters. The resulting chp files were analysed for familial relationships using the identity by state algorithm implemented in Partek Genomics Suite.

RNA-seq
Engrafted human CD34 + CD38 − CD45 + HSCs were isolated from bone marrow from either iPS-derived haemogenic endothelium-or cord-blood-injected NSG mice, then RNA was purified with an RNeasy Micro kit (Qiagen). Quality control of RNA was done by Bioanalyzer and qubit analysis. Samples that passed quality control were converted into libraries and sequenced by a Nextseq PE75 kit. Raw reads were aligned to the human genome/transcriptome using TopHat2 software 53 . Gene expression levels and reads per kilobase per million (RPKM) values were estimated using a htseq-count tool 54 and the edgeR package 55 . For a legitimate transcriptome-wide comparison, we retrieved raw RNAseq data of two published from the Gene Expression Omnibus database (Long non-coding RNA profiling of human lymphoid progenitors reveals transcriptional divergence of B-cell and T-cell lineages, accession number GSE69239; Distinct routes of lineage development reshape the human blood hierarchy across ontogeny, accession number GSE76234) and calculated RPKM values using a same analysis pipeline.

Single-cell RNA-seq with in-droplet-seq technology
Engrafted human CD34 + CD38 − CD45 + HSCs were isolated from bone marrow from either iPS-derived haemogenic endothelium-or cord-blood-injected NSG mice, then processed for in-droplet barcoding according to a previous report 29 . The library was QCed with Bioanalyzer and sequenced by Nextseq PE 75 kit. The t-SNE algorithm was used to visualize transcriptome similarities and population heterogeneity of cord blood HSCs and iPSC-derived HSCs. The t-SNE algorithm performs a dimensionality reduction of multidimensional single-cell RNA-seq data into a low-dimensional space, preserving pairwise distances between data points as well as possible, allowing a global visualization of subpopulation structure and cell-cell similarities. We used the R package tsne in our analyses. The t-SNE map was initialized with point-to-point distances computed by classical multidimensional scaling, and the R plot function was used to visualize t-SNE maps annotated by cord blood or iPSC-derived HSCs. Plots showing t-SNE maps coloured by expression of selected genes were created using the ggplot2 package. For subpopulation identification, we used the top 500 genes with the highest variance to elucidate global differences among single cells. To assess transcriptome similarities in terms of induction of haematopoietic genes in iPSC-derived HSCs, we used 62 haematopoietic genes for t-SNE analysis in Supplementary Table 2.

Gene set enrichment analysis
Gene set enrichment analysis was performed with the desktop client version (javaGSEA, http://software.broadinstitute.org/gsea/downloads.jsp) with default parameters. RPKM values from the 7F-HSPC were obtained from the RNA-seq (described previously). These values were normalized to a terminally differentiated cell and the normalized values were used to rank the most differentially expressed genes. These differentially expressed genes were used to run gene set enrichment analysis with gene sets obtained from mSigDB (KEGG, Hallmark, immunological, transcription factors, and chemical and genetic perturbations gene sets were used). In addition, gene sets specific to progenitors, cord blood, or fetal-liver HSC were obtaine d from previous reports 16,56 . FDR < 0.25 with P < 0.05 was considered significant.

Lentiviral integration detection by ligation-mediated PCR and next-generation sequencing
CD33 + myeloid cells, CD19 + B cells, and CD3 + T cells were isolated from bone marrow from haemogenic-endothelium-injected NSG mice. Genomic DNA was purified with a QIAamp DNA Micro kit (Qiagen). Ligation-mediated PCR-based detection of lentiviral integration sites was done with a Lenti-X Integration Site Analysis Kit (Clontech) according to the manufacturer's instructions. Sequencing-based detection (integration sequencing) was done as previously described 57 .

Data availability
RNA-seq from this study have been deposited in the Gene Expression Omnibus under accession number GSE85112. We retrieved raw RNA-seq data of two published from the Gene Expression Omnibus database (Long non-coding RNA profiling of human lymphoid progenitors reveals transcriptional divergence of B-cell and T-cell lineages, accession number GSE69239; Distinct routes of lineage development reshape the human blood hierarchy across ontogeny, accession number GSE76234). The data are all in the paper, or are available from the corresponding author upon reasonable request if not.

Extended Data
Extended Data Figure 1. Induction of haemogenic endothelium from hPSCs a, Embryoid bodies formed from hPSCs. Subsequent panels show FACS analysis of day 8 embryoid bodies before magnetic cell isolation. b, CD34 + FLK1 + population indicates haemogenic endothelium cells.CD235A and FLK1 plots show percentage of FLK1 + CD235A − cells in embryoid bodies, further gated with CD43 and CD34 plots to detect haemogenic endothelium (CD34 + FLK1 + CD235A − CD43 − ). c, d, Haemogenic endothelium isolated at day 8 ( Supplementary Fig. 1) was further cultured in EHT medium for the indicated number of days. qRT-PCR showed (c) downregulation of endothelial genes and (d) upregulation of haematopoietic genes. HSPC genes (RUNX1, SCL/TAL1) peak on day 3 of EHT culture; consequently, this time point was chosen for introducing transcription factors followed by transplantation in subsequent experiments. e, Microscopy and FACS analysis on day 7 of EHT showed the appearance of haematopoietic cells (CD34 + CD45 + ). Data shown as mean ± s.d. Figure 2. Rationale for selecting candidate transcription factors for library screening a, Heatmap of the expression profile of HSC-specific transcription factors (TFs) in haemogenic endothelium (CD34 + FLK1 + CD43 − CD235A − ) versus fetal-liver HSCs (CD34 + CD38 − CD90 + CD45 + ). Twelve HSC-specific transcription factors enriched in fetalliver HSCs relative to haemogenic endothelium (blue box) were cloned individually into a Dox-inducible lentiviral vector. b, The expression level of SOX17, a marker of haemogenic endothelium, was 2.4-fold higher in haemogenic endothelium (N = 7) than fetal-liver HSCs (N = 10). *P < 0.001. c, The library was supplemented with genes identified in previous screens. Candidates in red were drawn from a previous screen by the Rafii group 3 ; blue from the Rossi group 2 ; green from the Daley group 7 . The final library of 26 candidates is shown. d, Diagram of the Dox-ON pInducer-21 lentiviral vector used in this study (top). rtTA3 and eGFP are driven by EF1α -promoter; infection efficiency is indicated by the GFP signal. e, GFP analysis by FACS and fluorescence 3 days after infection of haemogenic endothelium cells. Routinely, over 50% transduction efficiency was achieved. For transplantation, haemogenic endothelium cells infected at day 3 EHT were incubated for 24 h and injected into mice. Dox was provided for 2 weeks in vivo after transplantation into sub-lethally irradiated immune-deficient NSG mice. f, Scheme for screening the 26 transcription factors library, and resulting haematopoietic chimaerism. hPSC-derived haemogenic endothelium was cultured for an additional 3 days in EHT medium, then infected with the library of 26 transcription factors. Infected cells (100,000) were injected intrafemorally into sub-lethally irradiated (250 rad) NSG mice, which were treated with doxycycline for 2 weeks to induce transgene expression in vivo. g, FACS analysis of bone marrow and thymus of an engrafted recipient is shown. Human CD45 + cells from bone marrow were analysed for CD33 + myeloid cells, CD19 + B cells, and CD3 + T cells as indicated. Thymic cells were analysed for human CD4 and CD8, with percentages of single and double-positive cells indicated. A photograph of an engrafted thymus is shown (bottom right). Data shown as mean ± s.d.  Tables 1 and 2 show compiled data. c, Representative photomicrographs and FACS plots of robust EHT (top) and failed EHT (bottom). The appearance of round cells budding from adherent haemogenic endothelium cells by direct microscopic visualization was assessed routinely. RUNX1c+24 reporter positivity of haemogenic endothelium cells at day 3 of culture likewise reflected robustness of EHT (20% versus 5%).

Extended Data Figure 4. Identification of transcription factors that confer multi-lineage haematopoiesis in vivo
a, SNP analysis of engrafted blood cells compared with hiPSC and cord blood MNCs. SNP array genotyping was conducted to confirm the origin of engrafted human cells from hiPSCs in representative mice. SNP genotypes for human CD45 + cells taken from bone marrow of engrafted mice, original hPSCs and reference cord blood MNCs were clustered, as shown, showing concordance of original hiPSCs and human cells recovered from bone marrow of engrafted mice. b, Tabular presentation of SNP data: concordance > 99% indicates identity between cell types. Green highlight shows that original hPSC line (34hiPSC or H9 hESC) corresponds with human CD45 + cells in engrafted mice. Comparison with different hPSC lines or cord blood MNCs did not achieve 99% concordance, validating the SNP array as a means of defining origin of cells. c, Transgene detection in engrafted cells of primary recipients. CD33 + myeloid cells, CD19 + B cells, and CD3 + T cells were isolated from the human CD45 + population of bone marrow at 10 weeks from five independent mice. Genomic DNA of each cell type was analysed by PCR to detect integrated lentivirus. Identification number of recipients is shown (numbers 1, 5, and 6 were engrafted with hiPSC-derived haemogenic endothelium; numbers 2 and 3 were engrafted with hESCderived haemogenic endothelium). L, left femur (injected side); R, right femur; +, positive control (lentiviral vector with each transcription factor). -, negative control (lentiviral vector without transcription factor). d, Overlap between transcription factors that conferred in vivo engraftment from hiPSC-and hESC-derived cells injected into mice and in vitro multilineage CFU potential. Transcription factors detected from genomic DNA PCR from in vitro colony screening and in vivo engraftment screening are shown. For in vivo screening, multiple cell lines (iPSC and ESC) were used. Factors detected by in vivo screening were overlapped with those detected by in vitro screening (RUNX1 and LCOR). Consistently, MYB and RORA were detected by in vitro screening as previously reported 7 Thymus was collected at 8 weeks and  analysed for T-cell markers (CD4, CD8, CD3, TCRαβ, and TCRγδ). TCR phenotyping of the CD3 + population is shown on the right. One out of three recipients showed the presence of TCRγδ. Three thymic engrafted mice from independent experiments each. f-h, TCR rearrangement of thymocytes from cord blood CD34 + and HE-7TF engrafted in NSG. CD3 + T cells were isolated from NSG mice engrafted with (f) cord blood HSCs or (g) HE-7TF cells. Purified DNA was subjected to next-generation sequencing of the CDR3 using immunoSEQ (Adaptive Biotechnology) and analysed with the immunoSEQ Analyzer software (Adaptive Biotechnology). A high degree of combinatorial diversity in the V-gene segment usage was observed in CDR3 length, following a standard Gaussian distribution. h, Frequency of clonotype of T cells. i, Flow cytometric phenotyping of spleens from engrafted HE-7TF cells versus cord blood (j). Spleens were collected at 8 weeks and human CD45 + cells were analysed for T-cell markers (TCRβ, CD4, CD45RO, and CD45RA), and B-cell marker (CD19).

Supplementary Material
Refer to Web version on PubMed Central for supplementary material. Lentiviral vector with green fluorescent protein (GFP) was used as negative control. N = 2 mice analysed in two independent experiments with three mice each (two mice each for GFP). *P < 0.05. Average lineage distribution from each group is shown (right). Data shown as mean ± s.d. Sugimura Fig. 3. c, Limiting dilution assay of HE-7TF cells during secondary transplantation. Three thousand or 1,000 CD34 + cells isolated from bone marrow of primary recipients were transplanted to secondary recipients. Multi-lineage engraftment was counted as response. Confidence intervals of 1/(stem cell frequency) were calculated by ELDA (http://bioinf.wehi.edu.au/software/elda/) according to Poisson distribution. Limiting dilution assay of cord blood was used as reference. d, t-Distributed stochastic neighbour embedding (t-SNE) plot of inDrops single-cell RNA-seq data for CD34 + CD38-CD45 + measurements from three independent experiments performed each with three technical replicates. c, Neutrophils. Human CD45 + cells from bone marrow of NSG mice engrafted with HE-7TF cells at 8 weeks, analysed for human PECAM and CD15. Myeloperoxidase activity of isolated CD45 + PECAM + CD15 + neutrophils was measured with or without PMA stimulation. Neutrophils from NSG engrafted with cord blood HSCs were used as reference. The basal level of MPO of haemogenic endothelium was 0.40-fold less than cord blood (P = 0.036). PMA stimulation increased MPO production 2.5-fold (P = 0.010) (cord blood) and 3.0-fold (P = 0.10) (seven transcription factor). Stimulated MPO production of HE-7TF was 0.47-fold (0.049) versus cord blood. *P < 0.05. N = 6 measurements from two independent experiments performed each with three technical replicates. d, Human immunoglobulin. Serum was isolated from NSG mice engrafted with cord blood HSCs or HE-7TF cells at 8 weeks and 14 weeks. Production of IgM (8 weeks) and IgG (14 weeks) was measured by enzyme-linked immunosorbent assay (ELISA). Serum from mock transplant and NSG engrafted with cord blood HSCs was used as reference. Human ovalbumin-specific IgM and IgG after immunization with ovalbumin (ova) in serum of mice engrafted with HE-7TF cells, as indicated (right two bars of each panel). *P < 0.05. N = 6 measurements from two independent experiments (three for cord blood versus HE) performed with three technical replicates. e, Production of IFN-γ from human CD3 + cells isolated from bone marrow of NSG mice engrafted with cord blood HSCs and HE-7TF cells at 8 weeks, and cultured with or without PMA/ionomycin stimulation for 6 h. CD3 + T cells from NSG engrafted with cord blood HSCs were used as reference. The basal level of IFN-γ of haemogenic endothelium was 0.53-fold (P = 0.073) versus cord blood. PMA stimulation increased IFN-γ production 4.4-fold (P = 0.17) (cord blood) and 3.0-fold (P = 0.16) (haemogenic endothelium). Stimulated IFN-γ production of haemogenic endothelium was 0.36-fold (0.039) versus cord blood. IFN-γ production from cord blood HSCs and haemogenic endothelium themselves are shown as reference. *P <0.05. N = 6 measurements from two independent experiments performed with three technical replicates. f, TCR repertoire of engrafted T cells. Human CD3 + thymocytes of NSG mice engrafted with HE-7TF cells at 8 weeks were analysed by immunoSEQ to detect TCR rearrangement. CD3 + thymocytes from cord blood HSC-engrafted NSG were used as a reference.