Project description: Long-read proteogenomic data were used to create a sample-matched protein database for the WTC11 sample. This database includes many potential alternative protein isoforms (major and minor isoforms) per gene. The IS-PRM method called Tomahto was tested against DDA to demonstrate improved coverage of isoform-specific peptides. We call this overall method of long-read-RNA-informed Tomahto targeting LRP-IS-PRM, and we used it to obtain protein-level evidence of multiple isoforms derived from alternative splicing.
Project description: Because they can compromise genome integrity, transposable elements (TEs) have significant associations with human diseases. Short-read sequencing has been used to study TE expression; however, the highly repetitive nature of these elements makes multi-mapping a critical issue. Here we implement LocusMasterTE, an improved quantification method that integrates long-read sequencing. LocusMasterTE introduces transcripts per million (TPM) values computed from long-read sequencing as a prior distribution in the Expectation-Maximization (EM) model used for short-read TE quantification, so that multi-mapped reads are reassigned to the correct expression values. On simulated short reads, LocusMasterTE outperforms current quantification approaches and is particularly favorable for capturing newly inserted TEs. We also verified that TEs quantified by LocusMasterTE are clearly related to euchromatin and heterochromatin in cell line samples. With LocusMasterTE, we anticipate that more accurate quantification can be performed, allowing novel functions of TEs to be uncovered.
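The idea of using long-read TPM values as a prior in the EM step can be illustrated with a minimal sketch. This is not the LocusMasterTE implementation: the function name `em_with_long_read_prior`, the `prior_weight` and `eps` parameters, and the choice to add the prior as pseudo-counts in the M-step are all assumptions made for illustration. The input is a hypothetical read-by-locus compatibility matrix (1 where a short read aligns to a TE locus) and a vector of long-read TPM values per locus.

```python
import numpy as np


def em_with_long_read_prior(compat, long_read_tpm, prior_weight=1.0,
                            n_iter=200, tol=1e-10, eps=1e-6):
    """Toy EM for assigning multi-mapped reads to loci (illustrative only).

    compat        : (n_reads, n_loci) array, 1 where a read aligns to a locus.
    long_read_tpm : (n_loci,) long-read TPM values used as the prior.
    prior_weight  : assumed strength of the prior, in pseudo-reads.
    Returns (theta, resp): locus proportions and per-read assignment weights.
    """
    compat = np.asarray(compat, dtype=float)
    if np.any(compat.sum(axis=1) == 0):
        raise ValueError("every read must align to at least one locus")

    # Normalise the long-read TPM into a prior; eps keeps loci that are
    # missing from the long-read data from being forced to exactly zero.
    tpm = np.asarray(long_read_tpm, dtype=float) + eps
    prior = tpm / tpm.sum()

    n_loci = compat.shape[1]
    theta = np.full(n_loci, 1.0 / n_loci)  # start from uniform proportions

    for _ in range(n_iter):
        # E-step: split each read across its compatible loci by theta.
        weighted = compat * theta
        resp = weighted / weighted.sum(axis=1, keepdims=True)

        # M-step: expected read counts plus prior pseudo-counts (assumption).
        counts = resp.sum(axis=0) + prior_weight * prior
        new_theta = counts / counts.sum()

        converged = np.abs(new_theta - theta).max() < tol
        theta = new_theta
        if converged:
            break

    # Final E-step so the returned weights match the returned theta.
    weighted = compat * theta
    resp = weighted / weighted.sum(axis=1, keepdims=True)
    return theta, resp


# Four reads that all map equally well to two loci; the long-read data
# favour the first locus, so the prior breaks the tie in its direction.
compat = [[1, 1], [1, 1], [1, 1], [1, 1]]
long_read_tpm = [90.0, 10.0]
theta, resp = em_with_long_read_prior(compat, long_read_tpm)
print(theta)
```

In this toy case the short reads alone cannot distinguish the two loci, so the estimated proportions are driven by the long-read prior. With uniquely mapped reads in the matrix, the read evidence and the prior would both contribute, with `prior_weight` controlling the balance.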
Project description: Evaluation of short-read-only, long-read-only, and hybrid assembly approaches on metagenomic samples, demonstrating how they affect gene and protein prediction, which is relevant for downstream functional analyses. For a human gut microbiome sample, we use complementary metatranscriptomic and metaproteomic data to evaluate the metagenomic-based protein predictions.