Project description: We perform a systematic classification of allelic imbalance in mouse hybrids derived from reciprocal crosses of divergent strains. We observe that deviation from balanced biallelic expression is common, occurring in ~20% of the mouse transcriptome. Allelic imbalance attributed to genotype is by far the most prevalent class and is typically tissue-specific. However, some genotype-based imbalance is maintained across tissues and is associated with greater genetic variation, especially in the 5’ and 3’ termini of transcripts. We further identify novel random monoallelic and imprinted genes, and find that genotype can compete with parental origin even in the setting of large imprinted regions. PolyA-selected RNA-sequencing in F1 hybrid and parental cells of M. m. musculus and M. m. castaneus origin.
Project description: We perform a systematic classification of allelic imbalance in mouse hybrids derived from reciprocal crosses of divergent strains. We observe that deviation from balanced biallelic expression is common, occurring in ~20% of the mouse transcriptome. Allelic imbalance attributed to genotype is by far the most prevalent class and is typically tissue-specific. However, some genotype-based imbalance is maintained across tissues and is associated with greater genetic variation, especially in the 5’ and 3’ termini of transcripts. We further identify novel random monoallelic and imprinted genes, and find that genotype can compete with parental origin even in the setting of large imprinted regions.
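The reasoning behind a reciprocal-cross design can be sketched in a few lines of Python. This is an illustrative sketch only, not the classification procedure used in the project: the function name, the 0.65 cutoff and the category labels are assumptions made for the example. It takes the fraction of reads from the strain A allele in each of the two reciprocal crosses. If the same strain's allele is favored in both crosses, the imbalance follows genotype; if the favored strain flips with the direction of the cross, it follows parental origin.

```python
def classify_imbalance(frac_a_cross1, frac_a_cross2, threshold=0.65):
    """Classify allelic imbalance for one gene from two reciprocal crosses.

    frac_a_cross1: fraction of reads from the strain A allele in the cross
                   where strain A is the mother.
    frac_a_cross2: fraction of reads from the strain A allele in the cross
                   where strain B is the mother.
    threshold:     hypothetical cutoff for calling an allele favored.
    """
    hi, lo = threshold, 1.0 - threshold

    # Same strain favored regardless of which parent contributed it.
    both_high = frac_a_cross1 >= hi and frac_a_cross2 >= hi
    both_low = frac_a_cross1 <= lo and frac_a_cross2 <= lo
    if both_high or both_low:
        return "genotype"

    # Strain A favored only when it is maternal: the maternal allele wins.
    if frac_a_cross1 >= hi and frac_a_cross2 <= lo:
        return "maternal"

    # Strain A favored only when it is paternal: the paternal allele wins.
    if frac_a_cross1 <= lo and frac_a_cross2 >= hi:
        return "paternal"

    return "balanced_or_unclear"


print(classify_imbalance(0.80, 0.85))
print(classify_imbalance(0.90, 0.10))
```

A real analysis would replace the fixed cutoff with a statistical test on read counts across replicates, and would need clonal or single-cell data to detect random monoallelic expression, which this sketch does not cover.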
Project description: The diploid fungal pathogen Candida albicans is a highly heterozygous organism, with numerous non-synonymous substitutions often seen between the two alleles of a gene. RNA-sequencing of the wild-type strain SC5314 has revealed 233 genes with significant levels of allelic expression imbalance. Overall percentage protein identity was significantly lower between these differentially expressed alleles. This suggests that two different, perhaps functionally divergent, proteins are being expressed at significantly different quantities by the two alleles of a single gene. Previously, gene expression levels have been correlated with structural factors such as GC content, ORF length and codon usage. Here, these factors were first correlated with overall gene expression data to decipher the relationship they have with gene expression in Candida albicans. These relationships were then used to assess the contribution of these factors to allelic expression imbalance. GC content and codon usage did not differ significantly between differentially expressed alleles, whereas ORF length was found to be significantly lower in the allele with the lower expression. This surprising result goes against the overall trend observed between length and gene expression. Differences in GC content and ORF length between alleles correlated strongly with percentage protein identity, suggesting an indirect link between these factors and allelic expression imbalance. One sample (SC5314, the wild-type strain) was assessed in triplicate and compared to the reference diploid genome.
Project description: Genetic changes that help explain the differences between two individuals might create or disrupt sites complementary to microRNAs (miRNAs), but the extent to which such polymorphic sites influence miRNA-mediated repression is unknown. Here, we describe a method to measure mRNA allelic imbalances associated with a regulatory site found in mRNA transcribed from one allele but not found in that transcribed from the other. Applying this method, called allelic imbalance sequencing, to sites for three miRNAs (miR-1, miR-133 and miR-122) provided quantitative measurements of repression in vivo without altering either the miRNAs or their targets. A substantial fraction of polymorphic sites mediated repression in tissues that expressed the cognate miRNA, and downregulation was correlated with site type and site context. Extrapolating these results to the other broadly conserved miRNAs suggests that when comparing two mouse strains (or two human individuals), polymorphic miRNA sites cause expression of many genes (often hundreds) to differ. There are four distinct samples (pool1-4), each of which is a pool of ~60 distinct PCR or RT-PCR products. The samples were loaded into different segments of the same pyrosequencing plate. Three sequencing runs (run1-3) were performed for each sample pool. Each amplicon in the four pools corresponds to a miR-1, miR-122, miR-133 or control site. To control for allele-specific PCR bias, genomic DNA was also used as a template for generating amplicons.
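The role of the genomic DNA control can be illustrated with a minimal Python sketch. It is not the project's analysis pipeline: the function name, argument names and the optional pseudocount are assumptions made for the example. The idea is that genomic DNA carries both alleles in equal amounts, so any deviation from a 1:1 read ratio in gDNA amplicons reflects amplification or sequencing bias, and dividing the cDNA ratio by the gDNA ratio removes that bias.

```python
import math


def normalized_log2_ratio(cdna_site, cdna_nosite, gdna_site, gdna_nosite,
                          pseudocount=0.0):
    """Return the bias-corrected log2 allelic ratio for one amplicon.

    cdna_site / cdna_nosite: read counts from RT-PCR products for the allele
                             with and without the miRNA site.
    gdna_site / gdna_nosite: read counts for the same two alleles when
                             genomic DNA was the template.
    pseudocount:             hypothetical small constant to avoid division
                             by zero for low-count amplicons.
    """
    cdna_ratio = (cdna_site + pseudocount) / (cdna_nosite + pseudocount)
    gdna_ratio = (gdna_site + pseudocount) / (gdna_nosite + pseudocount)
    # Negative values mean the site-containing allele is under-represented
    # in mRNA relative to what the gDNA control predicts.
    return math.log2(cdna_ratio / gdna_ratio)


# Example: half as many mRNA reads from the site-containing allele,
# with no bias visible in the gDNA control.
print(normalized_log2_ratio(100, 200, 150, 150))
```

A negative value is what repression through the polymorphic site would look like under this sketch; judging significance would additionally require the replicate sequencing runs and the control sites described in the record.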
Project description: The diploid fungal pathogen Candida albicans is a highly heterozygous organism, with numerous non-synonymous substitutions often seen between the two alleles of a gene. RNA-sequencing of the wild-type strain SC5314 has revealed 233 genes with significant levels of allelic expression imbalance. Overall percentage protein identity was significantly lower between these differentially expressed alleles. This suggests that two different, perhaps functionally divergent, proteins are being expressed at significantly different quantities by the two alleles of a single gene. Previously, gene expression levels have been correlated with structural factors such as GC content, ORF length and codon usage. Here, these factors were first correlated with overall gene expression data to decipher the relationship they have with gene expression in Candida albicans. These relationships were then used to assess the contribution of these factors to allelic expression imbalance. GC content and codon usage did not differ significantly between differentially expressed alleles, whereas ORF length was found to be significantly lower in the allele with the lower expression. This surprising result goes against the overall trend observed between length and gene expression. Differences in GC content and ORF length between alleles correlated strongly with percentage protein identity, suggesting an indirect link between these factors and allelic expression imbalance.
Project description: Normal-appearing airway samples from non-small cell lung cancer (NSCLC) patients were profiled using Illumina sequencing arrays. Allelic imbalance was detected in normal-appearing large and small airway samples and affected known lung cancer driver genes.