Impact of regulatory variation across human iPSCs and differentiated cells [RNA-seq]
ABSTRACT: Induced pluripotent stem cells (iPSCs) are an essential tool for studying cellular differentiation and cell types that are otherwise difficult to access. Here we investigate the use of iPSCs and iPSC-derived cells to study the impact of genetic variation across different cell types and as models for the genetics of complex disease. We established a panel of iPSCs from 58 well-studied Yoruba lymphoblastoid cell lines (LCLs); 14 of these lines were further differentiated into cardiomyocytes. We characterized regulatory variation across individuals and cell types by measuring RNA, chromatin accessibility and DNA methylation. Regulatory variation between individuals is lower in iPSCs than in the differentiated cell types, consistent with the intuition that developmental processes are generally canalized. While most cell-type-specific regulatory effects lie in chromatin that is open only in the affected cell types, we find that 20% of cell-type-specific effects are in shared open chromatin. Finally, we developed deep neural network models to predict open chromatin regions in these cell types from DNA sequence alone and were able to use the sequences of segregating haplotypes to predict the effects of common SNPs on tissue-specific chromatin accessibility. Our results provide a framework for using iPSC technology to study regulatory variation in cell types that are otherwise inaccessible. Keywords: Expression profiling by high throughput sequencing
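The last step of the abstract, predicting a SNP's effect by scoring the sequences of both haplotypes with a sequence-based model, can be illustrated with a minimal Python sketch. The study's trained deep neural network is not reproduced here: `toy_accessibility_score`, the GATA weight matrix and the example sequence are hypothetical stand-ins chosen only so that the example runs. The one part that reflects the described approach is the comparison itself, where the same model scores the reference and alternate alleles and the difference is taken as the predicted effect.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 floats (A, C, G, T); unknown bases become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def toy_accessibility_score(seq, motif_weights):
    """Hypothetical stand-in for a trained network (NOT the study's model).

    Slides a weight matrix along the sequence, keeps the best-matching
    window, and squashes that value to (0, 1) with a sigmoid.
    """
    encoded = one_hot(seq)
    width = len(motif_weights)
    best = max(
        sum(encoded[start + i][j] * motif_weights[i][j]
            for i in range(width) for j in range(4))
        for start in range(len(encoded) - width + 1)
    )
    return 1.0 / (1.0 + math.exp(-best))

def snp_effect(ref_seq, snp_index, alt_base, score_fn):
    """Predicted effect of a SNP: score(alternate haplotype) - score(reference haplotype)."""
    alt_seq = ref_seq[:snp_index] + alt_base + ref_seq[snp_index + 1:]
    return score_fn(alt_seq) - score_fn(ref_seq)

# Illustrative weights: +2 for the motif base at each position, -1 otherwise.
GATA = [[2.0 if b == m else -1.0 for b in BASES] for m in "GATA"]

def score(seq):
    return toy_accessibility_score(seq, GATA)

# An A-to-C change inside the GATA match lowers the predicted accessibility.
effect = snp_effect("TTGATAACC", 3, "C", score)
print(effect)
```

A negative value means the alternate allele is predicted to reduce accessibility relative to the reference. Replacing `score` with a real trained model leaves `snp_effect` unchanged, which is why the comparison is kept separate from the scoring function.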
Project description:Induced pluripotent stem cells (iPSCs) are an essential tool for studying cellular differentiation and cell types that are otherwise difficult to access. We investigated the use of iPSCs and iPSC-derived cells to study the impact of genetic variation on gene regulation across different cell types and as models for studies of complex disease. To do so, we established a panel of iPSCs from 58 well-studied Yoruba lymphoblastoid cell lines (LCLs); 14 of these lines were further differentiated into cardiomyocytes. We characterized regulatory variation across individuals and cell types by measuring gene expression levels, chromatin accessibility and DNA methylation. Our analysis focused on a comparison of inter-individual regulatory variation across cell types. While most cell type-specific regulatory quantitative trait loci (QTLs) lie in chromatin that is open only in the affected cell types, we found that 20% of cell type-specific regulatory QTLs are in shared open chromatin. This observation motivated us to develop a deep neural network to predict open chromatin regions from DNA sequence alone. Using this approach, we were able to use the sequences of segregating haplotypes to predict the effects of common SNPs on cell type-specific chromatin accessibility.
Project description:Comparative genomics studies in primates are extremely restricted due to our limited access to samples from non-human apes. In order to gain better insight into the genetic processes that underlie variation in complex phenotypes in primates, we must have access to faithful model systems for a wide range of cell types. To facilitate this, we have generated a panel of 7 fully characterized chimpanzee induced pluripotent stem cell (iPSC) lines derived from healthy donors. To begin demonstrating the utility of comparative iPSC panels, we collected RNA-sequencing and DNA methylation data from the chimpanzee iPSCs and the corresponding fibroblast lines, as well as from 7 human iPSCs and their source lines, which encompass multiple populations and cell types. We observe much less within-species variation in iPSCs than in somatic cells, indicating that the reprogramming process erases many inter-individual differences. The low within-species regulatory variation in iPSCs allowed us to identify many novel inter-species regulatory differences of small magnitude. We used ChIP-seq to characterize the genome-wide distribution of two types of histone modifications (H3K27me3 and H3K27ac) in three of our chimpanzee iPSCs and compared them to histone modification data from three human iPSC lines from the Roadmap Epigenomics project:
Project description:Comparative genomics studies in primates are extremely restricted due to our limited access to samples from non-human apes. In order to gain better insight into the genetic processes that underlie variation in complex phenotypes in primates, we must have access to faithful model systems for a wide range of cell types. To facilitate this, we have generated a panel of 7 fully characterized chimpanzee induced pluripotent stem cell (iPSC) lines derived from healthy donors. To begin demonstrating the utility of comparative iPSC panels, we collected RNA-sequencing and DNA methylation data from the chimpanzee iPSCs and the corresponding fibroblast lines, as well as from 7 human iPSCs and their source lines, which encompass multiple populations and cell types. We observe much less within-species variation in iPSCs than in somatic cells, indicating that the reprogramming process erases many inter-individual differences. The low within-species regulatory variation in iPSCs allowed us to identify many novel inter-species regulatory differences of small magnitude.
Project description:Variability in induced pluripotent stem cell (iPSC) lines remains a roadblock for disease modeling and regenerative medicine. Using linear mixed models, we have described different sources of gene expression variability in RNA sequencing data from 317 human iPSC lines derived from 101 individuals. We found that ~50% of genome-wide expression variability is explained by variation across individuals and identified a set of expression quantitative trait loci that contribute to this variation. These analyses, coupled with allele-specific expression, show that iPSCs retain a subject-specific gene expression pattern. Pathway enrichment and key driver analyses, based on predictive causal gene networks, found that Polycomb targets explain a significant part of the non-genetic variability present in iPSCs within and across individuals. These publicly available iPSC lines and genetic datasets will be a resource to the scientific community and will open new avenues to reduce variability in iPSCs and improve their utility in disease modeling. SNP array data are from individuals included in an RNA-seq transcriptome profiling study of human induced pluripotent stem cells, which characterized gene expression variation across individuals and within multiple iPSC lines from the same individual. Genotyping was performed on patient blood. Data availability: SNP genotyping: dbGaP (current study); RNA-seq counts: GEO (http://www.ncbi.nlm.nih.gov/geo/), GSE79636; FASTQ files: SRA (http://www.ncbi.nlm.nih.gov/sra), SRP072417.
Project description:Comparative studies in primates are extremely restricted because we only have access to a few types of cell lines from non-human apes and to a limited collection of frozen tissues. In order to gain better insight into regulatory processes that underlie variation in complex phenotypes, we must have access to faithful model systems for a wide range of tissues and cell types. To facilitate this, we have generated a panel of 7 fully characterized chimpanzee (Pan troglodytes) induced pluripotent stem cell (iPSC) lines derived from fibroblasts of healthy donors. All lines are free of integration from exogenous reprogramming vectors, can be maintained using standard iPSC culture techniques, and have proliferative and differentiation potential similar to human and mouse lines. To begin demonstrating the utility of comparative iPSC panels, we collected RNA-seq data and methylation profiles from the chimpanzee iPSCs and their corresponding fibroblast precursors, as well as from 7 human iPSCs and their precursors, which were of multiple cell type and population origins. Overall, we observed much less regulatory variation within species in the iPSCs than in the somatic precursors, indicating that the reprogramming process has erased many of the differences observed between somatic cells of different origins. We identified 4,918 differentially expressed genes and 1,986 differentially methylated regions between iPSCs of the two species, many of which are novel inter-species differences and not observed between the somatic cells of the two species. Our panel will help realize the potential of iPSCs, and in combination with genomic technologies, transform studies of comparative evolution in primates. We obtained RNA sequencing and methylation profiles from 7 chimpanzee iPSCs and the fibroblasts used to generate them, as well as 7 human iPSCs and the LCLs and fibroblasts used to generate them.
Project description:There is substantial interest in the genetic regulatory framework that is established in early human development, and in the evolutionary forces that shaped early developmental processes in humans. Progress in these areas has been slow because it is difficult to obtain relevant biological samples. Recent technological developments in the generation and differentiation of induced pluripotent stem cells (iPSCs) provide the ability to develop in vitro models of early human and non-human primate developmental stages. We have previously established matched iPSC panels from humans and chimpanzees. Using these panels, we comparatively characterized gene regulatory changes through a four-day timecourse differentiation of iPSCs (day 1) into primitive streak (day 2), endoderm progenitors (day 3), and definitive endoderm (day 4). As might be expected, we found that differentiation stage (in effect, cell type) is the major driver of variation in gene expression levels in our study, followed by species. We then identified thousands of differentially expressed genes between humans and chimpanzees in each differentiation stage. Yet, when we considered gene-specific dynamic regulatory trajectories throughout the timecourse, we found that 75% of genes, including nearly all known endoderm developmental markers, have conserved trajectories in the two species. Interestingly, we observed a marked reduction of both intra- and inter-species variation in gene expression levels in primitive streak samples compared to the iPSCs, with a recovery of variation in endoderm progenitors. The reduction in variation in gene expression levels at a specific developmental stage, paired with the high degree of conservation of temporal expression across species, is consistent with the dynamics of developmental canalization. Overall, we conclude that endoderm development in iPSC-based models is highly conserved and canalized between humans and our closest evolutionary relative.