Dataset Information

A technology-agnostic long-read analysis pipeline for transcriptome discovery and quantification

ABSTRACT: A technology-agnostic long-read analysis pipeline for transcriptome discovery and quantification

PROVIDER: PRJNA613060 | ENA

REPOSITORIES: ENA

Dataset's files

	File	Format
	SRR11336118_subreads.fastq.gz	Fastqsanger.gz
	SRR11336119_subreads.fastq.gz	Fastqsanger.gz

Similar Datasets

A technology-agnostic long-read analysis pipeline for transcriptome discovery and quantification

Project description:Alternative splicing is widely acknowledged to be a crucial regulator of gene expression and is a key contributor to both normal developmental processes and disease states. While cost-effective and accurate for quantification, short-read RNA-seq lacks the ability to resolve full-length transcript isoforms despite increasingly sophisticated computational methods. Long-read sequencing platforms such as Pacific Biosciences (PacBio) and Oxford Nanopore (ONT) bypass the transcript reconstruction challenges of short reads. Here we describe TALON, the ENCODE4 pipeline for analyzing PacBio cDNA and ONT direct-RNA transcriptomes. We apply TALON to three human ENCODE Tier 1 cell lines and show that while both technologies perform well at full-transcript discovery and quantification, each technology has its own distinct artifacts. We further apply TALON to mouse cortical and hippocampal transcriptomes and find that a substantial proportion of neuronal genes have more reads associated with novel isoforms than with annotated ones. The TALON pipeline for technology-agnostic, long-read transcriptome discovery and quantification tracks both known and novel transcript models, as well as their expression levels, across datasets, for both simple studies and larger projects such as ENCODE that seek to decode transcriptional regulation in the human and mouse genomes. This makes it possible to predict expression levels of genes and transcripts more accurately than is possible with short reads alone.

2019-06-15 | GSE132766 | GEO

A technology-agnostic long-read analysis pipeline for transcriptome discovery and quantification

Project description:Alternative splicing is widely acknowledged to be a crucial regulator of gene expression and is a key contributor to both normal developmental processes and disease states. While cost-effective and accurate for quantification, short-read RNA-seq lacks the ability to resolve full-length transcript isoforms despite increasingly sophisticated computational methods. Long-read sequencing platforms such as Pacific Biosciences (PacBio) and Oxford Nanopore (ONT) bypass the transcript reconstruction challenges of short reads. Here we describe TALON, the ENCODE4 pipeline for analyzing PacBio cDNA and ONT direct-RNA transcriptomes. We apply TALON to three human ENCODE Tier 1 cell lines and show that while both technologies perform well at full-transcript discovery and quantification, each one displays distinct artifacts. We further apply TALON to mouse cortical and hippocampal transcriptomes and find that a substantial proportion of neuronal genes have more reads associated with novel isoforms than with annotated ones. These data show that TALON is a technology-agnostic long-read transcriptome discovery and quantification pipeline capable of tracking both known and novel transcript models, as well as their expression levels, across datasets in both simple studies and larger projects. These properties will enable TALON users to move beyond the limitations of short-read data to perform isoform discovery and quantification in a uniform manner on existing and future long-read platforms.

2020-03-18 | GSE147118 | GEO

A technology-agnostic long-read analysis pipeline for transcriptome discovery and quantification

Project description:A technology-agnostic long-read analysis pipeline for transcriptome discovery and quantification

PRJNA548942 | ENA

Accurate long-read transcript discovery and quantification at single-cell resolution with Isosceles [scRNA-seq]

Project description:Accurate detection and quantification of mRNA isoforms from nanopore long-read sequencing remains challenged by technical noise, particularly in single cells. To address this, we introduce Isosceles, a computational toolkit that outperforms other methods in isoform detection sensitivity and quantification accuracy across single-cell, pseudo-bulk and bulk resolution levels, as demonstrated using synthetic and biologically-derived datasets. Isosceles improves the fidelity of single-cell transcriptome quantification at the isoform-level, and enables flexible downstream analysis. As a case study, we apply Isosceles, uncovering coordinated splicing within and between neuronal differentiation lineages. Isosceles is suitable to be applied in diverse biological systems, facilitating studies of cellular heterogeneity across biomedical research applications.

2023-11-27 | GSE248115 | GEO

Accurate long-read transcript discovery and quantification at single-cell resolution with Isosceles [RNA-seq]

2023-11-27 | GSE248114 | GEO

LocusMasterTE: long-read assisted short-read TE quantification [short]

Project description:With their ability to compromise genome integrity, transposable elements (TEs) have significant associations with human diseases. Short-read sequencing has been used to study the expression of TEs; however, the highly repetitive nature of these elements makes multimapping a critical issue. Here we implement LocusMasterTE, an improved quantification method that integrates long-read sequencing. By introducing transcripts per million (TPM) counts computed from long-read sequencing as a prior distribution in the Expectation-Maximization (EM) model used for short-read TE quantification, LocusMasterTE re-assigns multi-mapped reads to correct expression values. Based on simulated short reads, LocusMasterTE outperforms current quantitative approaches and is significantly favorable in capturing newly inserted TEs. We also verified that TEs quantified by LocusMasterTE are clearly related to euchromatin and heterochromatin in cell line samples. With LocusMasterTE we anticipate that more accurate quantification can be performed, allowing novel functions of TEs to be uncovered.

2023-09-01 | GSE225380 | GEO

Application of annotation-agnostic RNA sequencing data analysis tools for biomarker discovery in liquid biopsy.

Project description:We demonstrate an agnostic method to identify transcribed fragments from small RNA-seq data using plasma from ALS patients and healthy donors.

2022-09-22 | GSE183942 | GEO

RNA-seq of rat dorsal root ganglia after ligation to investigate central nervous system transcriptomics in chronic pain using agnostic splice site discovery methods

Project description:The study pursued dual goals: to advance mRNA-seq bioinformatics towards unbiased transcriptome capture and to demonstrate its potential for discovery in neuroscience by applying the approach to an in vivo model of neurological disease. We found that 12.4% of known genes were induced and 7% were suppressed in the dysfunctional (but anatomically intact) L4 dorsal root ganglion (DRG) 2 weeks after L5 spinal nerve ligation (SNL). A new algorithm for agnostic mapping of pre-mRNA splice junctions (SJ) achieved a precision of 97%. mRNA-seq of L4 DRG 2 weeks and 2 months after L5 spinal nerve ligation. CONTROL and SNL were used to identify differential gene expression between chronic pain and standard conditions in Rattus norvegicus. CONTROL, SNL, and PILOT were used to perform 'agnostic splice site discovery' in the nervous system transcriptome in Rattus norvegicus.

2010-05-01 | E-GEOD-20895 | biostudies-arrayexpress

LocusMasterTE: long-read assisted short-read RNA-seq TE quantification [long]

2023-09-01 | GSE225377 | GEO

ESPRESSO: Robust discovery and quantification of transcript isoforms from error-prone long-read RNA-seq data

Project description:Long-read RNA sequencing (RNA-seq) holds great potential for characterizing transcriptome variation and full-length transcript isoforms, but the relatively high error rate of current long-read sequencing platforms poses a major challenge. We present ESPRESSO, a computational tool for robust discovery and quantification of transcript isoforms from error-prone long reads. ESPRESSO jointly considers alignments of all long reads aligned to a gene and uses error profiles of individual reads to improve the identification of splice junctions and the discovery of their corresponding transcript isoforms. On both a synthetic spike-in RNA sample and human RNA samples, ESPRESSO outperforms multiple contemporary tools in not only transcript isoform discovery but also transcript isoform quantification. In total, we generated and analyzed ~1.1 billion nanopore RNA-seq reads covering 30 human tissue samples and three human cell lines. ESPRESSO and its companion dataset provide a useful resource for studying the RNA repertoire of eukaryotic transcriptomes.

2022-10-14 | GSE192955 | GEO
