Project description: Comparative genomics studies in primates are extremely restricted due to our limited access to samples from non-human apes. In order to gain better insight into the genetic processes that underlie variation in complex phenotypes in primates, we must have access to faithful model systems for a wide range of cell types. To facilitate this, we have generated a panel of 7 fully characterized chimpanzee induced pluripotent stem cell (iPSC) lines derived from healthy donors. To begin demonstrating the utility of comparative iPSC panels, we collected RNA-sequencing and DNA methylation data from the chimpanzee iPSCs and the corresponding fibroblast lines, as well as from 7 human iPSCs and their source lines, which encompass multiple populations and cell types. We observe much less within-species variation in iPSCs than in somatic cells, indicating that the reprogramming process erases many inter-individual differences. The low within-species regulatory variation in iPSCs allowed us to identify many novel inter-species regulatory differences of small magnitude. We used ChIP-seq to characterize the genome-wide distribution of two types of histone modifications (H3K27me3 and H3K27ac) in three of our chimpanzee iPSCs and compared them to histone modification data from three human iPSC lines from the Roadmap Epigenomics project.
Project description: Comparative genomics studies in primates are extremely restricted due to our limited access to samples from non-human apes. In order to gain better insight into the genetic processes that underlie variation in complex phenotypes in primates, we must have access to faithful model systems for a wide range of cell types. To facilitate this, we have generated a panel of 7 fully characterized chimpanzee induced pluripotent stem cell (iPSC) lines derived from healthy donors. To begin demonstrating the utility of comparative iPSC panels, we collected RNA-sequencing and DNA methylation data from the chimpanzee iPSCs and the corresponding fibroblast lines, as well as from 7 human iPSCs and their source lines, which encompass multiple populations and cell types. We observe much less within-species variation in iPSCs than in somatic cells, indicating that the reprogramming process erases many inter-individual differences. The low within-species regulatory variation in iPSCs allowed us to identify many novel inter-species regulatory differences of small magnitude.
Project description: Different individuals of the same species are generally thought to have very similar genomes. However, there is growing evidence that structural variation in the form of copy number variation (CNV) and presence-absence variation (PAV) can lead to variation in the genome content of individuals within a species. In order to investigate the potential contribution of CNV and PAV to genomic diversity in maize, we used array comparative genomic hybridization (CGH) to compare gene content and copy number variation among 25 diverse maize inbreds and 14 genotypes of the wild ancestor of maize, teosinte. The microarray included multiple probes for each of the ~32,500 stringently filtered genes identified in the B73 reference genome. We identified 479 genes exhibiting higher copy number in some genotypes (UpCNV) and 3,410 genes that have either fewer copies or are missing in the genome of at least one genotype relative to B73 (DownCNV/PAV). Many of these DownCNV/PAV are examples of genes that are present in B73 but missing from the genome of several other genotypes. Over 70% of the CNV/PAV examples are identified in multiple genotypes, and the majority of events are observed in both maize and teosinte, suggesting that these reflect relatively old variants that are not associated with domestication or maize improvement. Many of the genes affected by CNV/PAV are either maize-specific or members of gene families, suggesting that the gene loss can be tolerated through buffering by redundant functions encoded elsewhere in the genome. Many plant genomes are relatively large and contain the remnants of whole genome duplications, which may provide the ability to tolerate high levels of structural variation. While this structural variation may not result in major qualitative variation due to genetic buffering, it may significantly contribute to quantitative variation.
Project description: The relationship between gene network structure and expression variation among individuals and species: Variation among individuals is a prerequisite of evolution by natural selection. As such, identifying the origins of variation is a fundamental goal of biology. We investigated the link between gene interactions and variation in gene expression among individuals and species, using the mammalian limb as a model system. We first built interaction networks for key genes regulating early (outgrowth; E9.5-11) and late (expansion and elongation; E11-13) limb development in mouse. This resulted in an Early (ESN) and Late (LSN) Stage Network. Computational perturbations of these networks suggest that the ESN is more robust. We then quantified levels of the same key genes among mouse individuals, and found that they vary less at earlier limb stages and that variation in gene expression is heritable. Finally, we quantified variation in gene expression levels among four mammals with divergent limbs (bat, opossum, mouse and pig), and found that levels vary less among species at earlier limb stages. We also found that variation in gene expression levels among individuals and variation among species are correlated for both earlier and later limb development. In conclusion, results are consistent with the robustness of the ESN buffering among-individual variation in gene expression levels early in mammalian limb development, and constraining the evolution of early limb development among mammalian species. Transcriptomic insights into the genetic basis of mammalian limb diversity: From bat wings to whale flippers, limb diversification has been crucial to the evolutionary success of mammals. We performed the first transcriptome-wide study of limb development in multiple species to explore the hypothesis that mammalian limb diversification has proceeded through the differential expression of conserved shared genes, rather than by major changes to limb patterning.
Specifically, we investigated the manner in which the expression of shared genes has evolved within and among mammalian species. We assembled and compared transcriptomes of bat, mouse, opossum, and pig fore- and hind limbs at the ridge, bud, and paddle stages of development. Results suggest that gene expression patterns exhibit larger variation among species during later than earlier stages of limb development, while within species results are more mixed. Consistent with the former, results also suggest that genes expressed at later developmental stages tend to have a younger evolutionary age than genes expressed at earlier stages. A suite of key limb-patterning genes was identified as being differentially expressed among the homologous limbs of all species. However, only a small subset of shared genes is differentially expressed in the fore- and hind limbs of all examined species. Similarly, a small subset of shared genes is differentially expressed within the fore- and hind limb of a single species and among the forelimbs of different species. Taken together, results of this study do not support the existence of a phylotypic period of limb development ending at chondrogenesis, but do support the hypothesis that the hierarchical nature of development translates into increasing variation among species as development progresses.
Project description: We performed whole-genome re-sequencing to comprehensively characterize genetic variation related to fruit development between kumquat (Fortunella japonica) and Clementine mandarin. A total of 5,865,235 single-nucleotide polymorphisms (SNPs) and 414,447 insertions/deletions (InDels) were identified in the two citrus species. Meanwhile, a total of 640,801 SNPs and 20,733 InDels were identified based on integrative analysis of the fruit genome and transcriptome. The variation features, genomic distribution, functional effects and other characteristics of these genetic variants were explored. A total of 1,090 differentially expressed genes (DEGs) were found during fruit development in kumquat and Clementine mandarin by RNA-sequencing. Gene Ontology analysis revealed that these genes were involved in various molecular functions and biological processes. Meanwhile, genetic variation in 939 DEGs and in 74 previously reported genes from multiple fruit development pathways was also identified. In addition, a global survey of gene splicing events identified 24,237 specific alternative splicing (AS) events in the two citrus species and showed that intron retention is the most prevalent pattern of alternative splicing.
Project description: This study examines genomic copy-number variation between two African cichlid species through array comparative genomic hybridization. Probe-level hybridization ratios were compared to copy number variation identified in Illumina and Pacific Biosciences genome assemblies from both species. Array comparative genomic hybridization was performed with 3 samples (1 replicate array setup) of genomic DNA from Maylandia zebra vs. Oreochromis niloticus XX clone genomic DNA from University of Stirling clonal lines.
Project description: Different individuals of the same species are generally thought to have very similar genomes. However, there is growing evidence that structural variation in the form of copy number variation (CNV) and presence-absence variation (PAV) can lead to variation in the genome content of individuals within a species. In order to investigate the potential contribution of CNV and PAV to genomic diversity in maize, we used array comparative genomic hybridization (CGH) to compare gene content and copy number variation among 25 diverse maize inbreds and 14 genotypes of the wild ancestor of maize, teosinte. The microarray included multiple probes for each of the ~32,500 stringently filtered genes identified in the B73 reference genome. We identified 479 genes exhibiting higher copy number in some genotypes (UpCNV) and 3,410 genes that have either fewer copies or are missing in the genome of at least one genotype relative to B73 (DownCNV/PAV). Many of these DownCNV/PAV are examples of genes that are present in B73 but missing from the genome of several other genotypes. Over 70% of the CNV/PAV examples are identified in multiple genotypes, and the majority of events are observed in both maize and teosinte, suggesting that these reflect relatively old variants that are not associated with domestication or maize improvement. Many of the genes affected by CNV/PAV are either maize-specific or members of gene families, suggesting that the gene loss can be tolerated through buffering by redundant functions encoded elsewhere in the genome. Many plant genomes are relatively large and contain the remnants of whole genome duplications, which may provide the ability to tolerate high levels of structural variation. While this structural variation may not result in major qualitative variation due to genetic buffering, it may significantly contribute to quantitative variation.
1-2 replications of 25 maize inbred and 14 teosinte genotypes were hybridized to an array designed from the ~32,400 genes in the maize B73 reference genome.