ABSTRACT: Erythropoiesis in mammals replenishes the circulating red blood cell (RBC) pool from hematopoietic stem/progenitor cells (HSPCs). Two distinct erythropoietic programs have been described. In the first trimester, hematopoietic precursors in the fetal yolk sac follow a primitive pattern of erythropoiesis. However, in the second trimester, hematopoietic stem cells (HSCs) from the fetal liver and later from the bone marrow differentiate by a definitive program of erythropoiesis to yield enucleated erythrocytes. RBCs can also be derived from human induced pluripotent stem cells (hiPSCs) and can express many of the red cell proteins required for normal erythrocyte function, presaging in vitro RBC production for clinical use. However, expansion and enucleation from hiPSCs are less efficient than with erythroblasts (EBs) derived from adult or cord blood progenitors. We hypothesized that substantial differential gene expression during erythroid development from hiPSCs compared to that from adult blood or cord blood precursors could account for these hitherto unexplained differences in proliferation and enucleation. We have therefore grown EBs from human adult and cord blood progenitors and from hiPSCs. Gene expression during erythroid culture from each erythroblast source was analyzed using algorithms designed to cluster co-expressed genes in an unsupervised manner, and the function of differentially expressed genes was explored by gene ontology. Using these methods, we identify specific patterns of gene regulation for adult- and cord-derived EBs, regardless of the medium used, that are substantially distinct from those observed during the differentiation of EBs from hiPSC progenitors, which largely follows a pattern of primitive erythropoiesis. A total of 74 samples were analyzed. Our primary goal was to compare erythropoiesis from hiPSC-derived EBs with that from adult or cord blood PBMC-derived EBs.
We wanted to provide a high-resolution atlas for investigators dissecting the programs of erythroid development and highlight the disparities between erythropoiesis in vitro from diverse stem/progenitor cell origins. Erythroblasts were grown in vitro from CD34+ erythroid progenitors isolated from Human Adult PBMCs (51 AB-EB samples), Human Cord PBMCs (10 CB-EB samples) or human induced pluripotent stem cells (13 hiPSC-EB samples). Three culture media were used for AB-EBs: Standard Erythroid Medium (SEM) containing 2% FBS (SEM-F), SEM in which FBS was replaced with 1% BSA (SEM-B) and SEM optimized for culture of EBs from hiPSCs (SEM-i). After culturing for 4, 7, 10, 12, or 14 days in SEM-F, AB-EBs were harvested in triplicate and sorted using flow cytometry (FACS) into enriched homogeneous erythroblast populations based on cell surface expression of CD36 (used for days 4-7), CD71 and CD235a. Magnetic beads specific for CD34 were used to isolate CD34+ day 0 progenitors. Magnetic beads were also used to isolate AB-EBs grown in SEM-F and SEM-i after 7 days (CD71+) and 14 days (CD235a beads). AB-EBs were also cultured in SEM-B for 4, 7, or 14 days and sorted by FACS. CB-EBs were cultured in SEM-F for 7 or 14 days, and sorted by FACS as for AB-EBs. HiPSC-EBs were cultured in SEM-i for 7 or 14 days, and isolated using magnetic beads as for AB-EBs. Total RNA was isolated from all these populations in triplicate (with a few exceptions), and also at day zero from CD34+ AB-EB progenitors, CB-EB progenitors and hiPSCs using magnetic beads. Additionally, CD31+ day zero hiPSCs were isolated using magnetic beads. RNA was extracted using mirVana (Life Technologies) and DNA removed using Turbo DNA-Free (Life Technologies). The microarray target was prepared from 100 ng total RNA for hybridization to Affymetrix GeneChip Human Transcriptome 2.0 ST microarrays (HTA2) using the Ambion WT protocol (Life Technologies) and Affymetrix labelling and hybridization kits (Affymetrix).
Labelled DNA mean yield was 13.5 μg (minimum: 8.5 μg; maximum: 20.5 μg). HTA2 arrays were hybridized with 5 μg of labelled DNA. The Affymetrix GeneChip Fluidics Station 450 was used to wash and stain the arrays with streptavidin–phycoerythrin, according to the standard protocol for eukaryotic targets (IHC kit, Affymetrix). Arrays were scanned with an Affymetrix GeneChip scanner 3000 at 570 nm. Intensity values were determined using GeneChip Operating Software (Affymetrix) and normalized by the Robust Multiarray Average algorithm using Affymetrix Expression Console software. Statistical analysis of differential expression was conducted using the Linear Models for Microarray Data package from the Bioconductor suite in R (www.bioconductor.org). The B values, p-values, and fold changes were used to select differentially expressed (DE) genes reaching a minimum linear expression value of 100 in all replicates of at least one sample group (p ≤ 0.01, fold change (FC) ≥ 2, B > 2.945). Principal component analysis (PCA) was conducted and displayed using Python packages. Hierarchical clustering (HC) analysis was performed using Python SciPy. Heat maps were generated using Python matplotlib. More advanced consensus clustering was conducted using the SMART and Bi-CoPaM algorithms in MATLAB, and biclustering was performed using the biclust package in R. Biopython was used to fetch 1 kb of genomic DNA sequence upstream of the transcriptional start sites (TSS) of each gene in the cluster. MEME suite was used for transcription factor binding site (TFBS) analysis, seeking homology with motifs in the Jolma database. Differentially expressed genes were examined for enriched functional ontologies using GeneCoDis. Expression profiles of selected genes were validated by qPCR. Gene expression profiles (GEPs) of AB-EBs in SEM-F were as expected for erythroid genes and similar to those observed for CB-EBs in SEM-F. The media used had little effect on AB-EB GEPs.
However, substantial, robust and specific GEPs were observed in hiPSC-EBs, which may, at least in part, underlie reduced proliferation and enucleation of erythroid cells from hiPSCs. These include genes involved in autophagy and the cell cycle.
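
The selection rule for differentially expressed genes described above (p ≤ 0.01, FC ≥ 2, B > 2.945, and a linear expression value of at least 100 in all replicates of at least one sample group) can be illustrated with a minimal Python sketch. This is not the pipeline used for the study, which relied on the Linear Models for Microarray Data package in R. The class and function names, the gene names, and all numeric values below are hypothetical, and the sketch assumes that fold change is supplied as a log2 value and that the FC threshold applies in either direction.

```python
from dataclasses import dataclass
from typing import Dict, List

# Thresholds quoted in the text.
P_MAX = 0.01      # p <= 0.01
FC_MIN = 2.0      # linear fold change >= 2
B_MIN = 2.945     # B > 2.945
EXPR_MIN = 100.0  # minimum linear expression value


@dataclass
class GeneStats:
    """Per-gene statistics for one contrast (hypothetical container)."""
    gene: str
    p_value: float
    log2_fc: float  # assumption: fold change is reported on a log2 scale
    b_value: float
    # sample group name -> linear-scale intensities of its replicates
    expr_by_group: Dict[str, List[float]]


def is_expressed(expr_by_group: Dict[str, List[float]],
                 minimum: float = EXPR_MIN) -> bool:
    """True if every replicate of at least one group reaches `minimum`."""
    return any(
        reps and all(value >= minimum for value in reps)
        for reps in expr_by_group.values()
    )


def is_de(g: GeneStats) -> bool:
    """Apply the four selection criteria to one gene."""
    # Assumption: up- and down-regulation are treated symmetrically.
    linear_fc = 2 ** abs(g.log2_fc)
    return (
        g.p_value <= P_MAX
        and linear_fc >= FC_MIN
        and g.b_value > B_MIN
        and is_expressed(g.expr_by_group)
    )


# Toy input with invented values, for illustration only.
toy = [
    GeneStats("GENE_A", 0.001, 1.5, 4.0,
              {"hiPSC_d14": [250.0, 310.0, 280.0], "AB_d14": [40.0, 55.0, 60.0]}),
    GeneStats("GENE_B", 0.001, -1.2, 3.5,   # no group has all replicates >= 100
              {"hiPSC_d14": [90.0, 150.0, 160.0], "AB_d14": [80.0, 400.0, 420.0]}),
    GeneStats("GENE_C", 0.05, 2.0, 3.0,     # p-value too high
              {"hiPSC_d14": [500.0, 520.0, 480.0], "AB_d14": [120.0, 130.0, 110.0]}),
    GeneStats("GENE_D", 0.002, -1.0, 3.0,   # exactly two-fold down-regulated
              {"hiPSC_d14": [120.0, 130.0, 140.0], "AB_d14": [260.0, 270.0, 250.0]}),
]

selected = [g.gene for g in toy if is_de(g)]
print(selected)
```

In this toy input, GENE_A and GENE_D pass all four criteria, GENE_B is excluded by the expression filter, and GENE_C is excluded by its p-value.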