Project description: Liquid chromatography coupled to tandem mass spectrometry has become the main method for high-throughput identification and quantification of peptides and the inferred proteins. Discovery proteomics commonly employs data-dependent acquisition in combination with spectrum-centric analysis. The accumulation of data generated from thousands of samples by this method has approached saturation coverage of different proteomes. Recently, as a result of technological advances, methods based on data acquisition strategies compatible with peptide-centric scoring have reached similar proteome coverage in individual runs, as well as similar scalability. This is exemplified by SWATH-MS, which combines data-independent acquisition (DIA) with targeted data extraction of groups of transitions uniquely detecting a peptide. As the data matrices generated by these experiments continue to grow with respect to both the number of peptides identified per sample and the number of samples analyzed per study, challenges for error rate control have emerged. Here, we discuss the adaptation of statistical concepts developed for discovery proteomics based on spectrum-centric scoring to large-scale DIA experiments analyzed with peptide-centric scoring strategies, and provide some guidance on their application. We propose that, in order to increase the quality and reproducibility of published proteomic results, well-established confidence criteria should be reported at each level as we progress from spectral evidence to identified or detected peptides and inferred proteins. These confidence criteria should equally be applied to proteomic analyses based on spectrum- and peptide-centric scoring strategies.
Project description: To determine the error rate of mitochondrial transcription, we analyzed 33 and 37 million reads from flies overexpressing wild-type (WT) and mutant (E423P) mitochondrial RNA polymerase (POLRMT), respectively, and found that the error frequency of mitochondrial transcripts was more than 5-fold higher in E423P flies than in WT flies. To gain more insight into the molecular mechanisms that drive the error rate of transcription by POLRMT, we examined the distribution of errors along the mitochondrial genome. We also evaluated mitochondrial RNA processing by quantifying the frequency of single reads spanning two adjacent genes. Unprocessed RNAs were not significantly more frequent in E423P flies than in WT flies. These observations indicate that overexpression of E423P POLRMT in adult flies leads to a statistically significant increase in mitochondrial transcript errors.
Project description: To study the impact of the RNA polymerase II (Pol II) elongation rate on gene expression, we used CRISPR-Cas9 genome editing in S. pombe to generate a "slow" Pol II mutant with a decreased elongation rate. Although the mutation is well tolerated as far as cell growth is concerned, transcriptomic analyses revealed that the slow mutant tends to terminate transcription prematurely. We distinguished two mechanisms by which premature termination affects gene expression in the slow mutant: it either (1) shortens the 3' UTR, or (2) derepresses protein-coding genes by prematurely terminating upstream interfering RNAs. Strikingly, the genes affected by these mechanisms are enriched for genes involved in phosphate uptake and purine synthesis, two processes essential for the maintenance of the nucleotide pool of the cell. Together with evidence that processive elongation by Pol II depends on nucleotide availability, our results suggest that the Pol II elongation rate acts as both sensor and effector in response to nucleotide depletion.