Comparison of mitochondrial transcription errors in wild type and mutant mitochondrial RNA polymerase overexpression flies
ABSTRACT: To determine the error rate of mitochondrial transcription, we analyzed 33 million and 37 million reads from flies overexpressing wild-type (WT) and mutant (E423P) mitochondrial RNA polymerase (POLRMT), respectively, and found that the error frequency of mitochondrial transcripts was more than 5-fold higher in E423P flies than in WT flies. To gain more insight into the molecular mechanisms that drive the error rate of transcription by POLRMT, we examined the distribution of errors along the mitochondrial genome. We also evaluated mitochondrial RNA processing by quantifying the frequency of single reads spanning two adjacent genes. There was no significant increase in unprocessed RNAs in E423P flies relative to WT flies. These observations indicate that overexpression of E423P POLRMT in adult flies leads to a statistically significant increase in mitochondrial transcript errors.
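As a rough illustration of the comparison described in the abstract, the sketch below computes a per-base error frequency and the fold change between two groups. The function names and all counts are hypothetical and chosen only for illustration; they are not data from the study, and the abstract does not say how errors were called or normalized.

```python
def error_frequency(mismatches: int, bases_sequenced: int) -> float:
    """Errors per sequenced base (assumed normalization, not confirmed by the source)."""
    if bases_sequenced <= 0:
        raise ValueError("bases_sequenced must be positive")
    return mismatches / bases_sequenced


def fold_change(mutant_freq: float, wt_freq: float) -> float:
    """Ratio of the mutant error frequency to the wild-type error frequency."""
    if wt_freq <= 0:
        raise ValueError("wt_freq must be positive")
    return mutant_freq / wt_freq


# Hypothetical counts for illustration only; not taken from the dataset.
wt_freq = error_frequency(mismatches=120, bases_sequenced=10_000_000)
e423p_freq = error_frequency(mismatches=660, bases_sequenced=10_000_000)
print(f"fold change: {fold_change(e423p_freq, wt_freq):.2f}")
```

A statistical test across replicates would still be needed to support a claim of significance; this sketch covers only the ratio itself.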
Project description:The goal of this study is to compare the frequency of unprocessed pre-RNAs in mitochondria. Total mitochondrial RNA profiles of adult flies were generated in triplicate. We analyzed over 100 million reads for each group and used featureCounts for quantification, allowing a read to be assigned to more than one gene. Using this analysis pipeline, we mapped about 30 million reads per sample to the Drosophila mitochondrial genome and found a frequency of unprocessed pre-RNAs of 0.43% in the PolrMT group, similar to that in PolrMT-E423P flies. These results indicate that overexpression of PolrMT-E423P does not affect pre-RNA processing.
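The frequency of unprocessed pre-RNAs described above can be sketched as the share of assigned reads that overlap more than one gene. This is a minimal sketch under stated assumptions: the input is a hypothetical per-read list of gene labels (for example, parsed from a per-read assignment report), the gene labels are placeholders, and using assigned reads as the denominator is an assumption, since the description does not state it.

```python
from typing import Iterable, Sequence


def unprocessed_fraction(assignments: Iterable[Sequence[str]]) -> float:
    """Fraction of assigned reads that span two or more distinct genes.

    `assignments` holds, for each read, the genes the read was assigned to.
    Reads with no assigned gene are skipped (assumed denominator choice).
    """
    assigned = 0
    spanning = 0
    for genes in assignments:
        distinct = set(genes)
        if not distinct:
            continue  # unassigned read
        assigned += 1
        if len(distinct) >= 2:
            spanning += 1  # read crosses a gene boundary: candidate unprocessed RNA
    if assigned == 0:
        raise ValueError("no assigned reads")
    return spanning / assigned


# Hypothetical reads and gene labels, for illustration only.
reads = [("ND1",), ("ND1", "tRNA-Leu"), (), ("COX1",)]
print(f"unprocessed fraction: {unprocessed_fraction(reads):.2%}")
```

Computing this fraction per sample and then comparing the triplicates between groups would mirror the comparison the description reports.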
Project description:To investigate differential mitochondrial tRNA gene expression in flies overexpressing wild-type (WT) or mutant (E423P) mitochondrial RNA polymerase (POLRMT), we performed Drosophila mitochondrial tRNA-seq according to the Hydro-tRNAseq method. tRNA-seq reads were subjected to 3'-adapter filtering, 3'-adapter trimming, and quality control. tRNA expression levels were measured by tag count: for each tRNA sequence-based profile, the number of mapped reads was used to estimate the expression level of each tRNA. tRNA expression profiles were calculated in two ways, based on uniquely mapped reads and on all mapped reads, respectively. We found no difference in the relative abundance of individual mitochondrial tRNAs between WT and E423P overexpression flies.
Project description:Estimating the relationships between individuals is a fundamental challenge in many fields. In particular, relationship estimation can provide valuable information for missing persons cases. The recently developed investigative genetic genealogy approach uses high-density single nucleotide polymorphisms (SNPs) to determine close and more distant relationships, with hundreds of thousands to tens of millions of SNPs generated either by microarray genotyping or by whole-genome sequencing. Current studies usually assume that the SNP profiles were generated with minimal errors. However, in missing persons cases, the DNA samples can be highly degraded, and the SNP profiles generated from these samples usually contain many errors. In this study, a robust machine learning approach was developed for estimating relationships from high-error SNP profiles. A hierarchical classification strategy was employed, first classifying the relationships by degree and then classifying the relationship types within each degree separately. For each classification, feature selection was implemented to improve performance. Both simulated and real data sets with various genotyping error rates were used to evaluate this approach, which was more accurate and more robust than the individual measures for SNP profiles with genotyping errors. In addition, the highest accuracy was obtained when the training and test sets had the same genotyping error rates; estimating the genotyping error rate of the SNP profiles is therefore critical to obtaining accurate relationship estimates.
Project description:Translation fidelity is the limiting factor in the accuracy of gene expression. With an estimated frequency of 10^-4, errors in mRNA decoding occur in a mostly stochastic manner. Little is known about the response of higher eukaryotes to chronic loss of ribosomal accuracy, that is, to an increase in the random error rate of mRNA decoding. Here, we present a global and comprehensive picture of the cellular changes in response to translational accuracy in mammalian ribosomes impaired by genetic manipulation. In addition to affecting established protein quality control pathways, such as elevated transcript levels for cytosolic chaperones, activation of the ubiquitin-proteasome system, and translational slowdown, ribosomal mistranslation led to unexpected responses. In particular, we observed increased mitochondrial biogenesis associated with import of misfolded proteins into the mitochondria and silencing of the unfolded protein response in the endoplasmic reticulum. This study describes the metabolomic analysis of HEK293 cell lines expressing mutant ribosomal protein RPS2 (human A226Y). The RPS2 A226Y mutation has been shown to cause misreading and readthrough. Results provide insight into the response to chronic mistranslation in mammalian cells.
Project description:While the importance of random sequencing errors decreases at higher DNA or RNA sequencing depths, systematic sequencing errors (SSEs) dominate at high sequencing depths and can be difficult to distinguish from biological variants. These SSEs can cause base quality scores to underestimate the probability of error at certain genomic positions, resulting in false positive variant calls, particularly in mixtures such as samples with RNA editing, tumors, circulating tumor cells, bacteria, mitochondrial heteroplasmy, or pooled DNA. Most algorithms proposed for correction of SSEs require a training data set, which is typically either from a part of the data set being “recalibrated” (Genome Analysis ToolKit, or GATK) or from a separate data set with special characteristics (SysCall). Here, we combine the advantages of these approaches by adding synthetic RNA spike-in standards to human RNA, and use GATK to recalibrate base quality scores with reads mapped to the spike-in standards. Compared to conventional GATK recalibration that uses reads mapped to the genome, spike-ins improve the accuracy of Illumina base quality scores by a mean of 5 units, and by as much as 13 units at CpG sites. In addition, since reads mapping to the genome are not used for recalibration, our method allows run-specific recalibration even for the many species without a comprehensive and accurate SNP database. We also use GATK with the spike-in standards to demonstrate that the Illumina RNA sequencing runs overestimate quality scores for AC, CC, GC, GG, and TC dinucleotides, while SOLiD has fewer dinucleotide SSEs but more SSEs for certain cycles. We conclude that using these DNA and RNA spike-in standards with GATK improves base quality score recalibration. Four human RNA samples with equimolar ERCC spike-in standards were sequenced on Illumina.
Two human brain/liver/muscle RNA mixtures with dynamic range of ERCC spike-in standards were sequenced on SOLiD.
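For readers unfamiliar with the units in the description above, base quality scores are conventionally Phred-scaled, Q = -10 log10(p), where p is the probability that a base call is wrong. The sketch below is not the GATK procedure; it only illustrates, with hypothetical function names and numbers, how an empirical quality could be derived from mismatches against a known spike-in sequence and compared with a reported score.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability (0 < p <= 1) to a Phred quality score."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -10 * math.log10(p)


def empirical_quality(mismatches: int, total_bases: int) -> float:
    """Phred-scaled observed error rate on reads mapped to a known sequence.

    Assumes every mismatch against the spike-in reference is a sequencing error.
    """
    if total_bases <= 0 or mismatches <= 0:
        raise ValueError("need positive mismatch and base counts")
    return error_prob_to_phred(mismatches / total_bases)


# Hypothetical example: bases reported as Q35 that show 10 mismatches in 10,000 calls.
reported_q = 35.0
observed_q = empirical_quality(mismatches=10, total_bases=10_000)
print(f"reported Q{reported_q:.0f}, empirical Q{observed_q:.1f}")
```

In this made-up example the reported score exceeds the empirical one, which is the kind of overestimate that recalibration aims to correct.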
Project description:We investigated the role of TEFM in vivo by creating heart- and skeletal-muscle specific knockout of this gene. We identified that TEFM is required for the processivity of POLRMT and its loss leads to significantly decreased mitochondrial transcription.