Dataset Information

Analysis of chimeric reads from RNA-seq data

ABSTRACT: RNA-seq analysis of Drosophila pharate adults

ORGANISM(S): Drosophila melanogaster

PROVIDER: GSE50711 | GEO | 2014/04/22

SECONDARY ACCESSION(S): PRJNA218548

REPOSITORIES: GEO

Similar Datasets

Analysis of chimeric reads from RNA-seq data

Project description:RNA-seq analysis of Drosophila pharate adults. Total RNA extracted from four lines of Drosophila melanogaster pharate adults was depleted of ribosomal RNAs and sequenced.

2014-04-22 | E-GEOD-50711 | biostudies-arrayexpress

Vicinal: a method for the determination of ncRNA ends using chimeric reads from RNA-seq experiments.

Project description:Non-coding (nc)RNAs are important structural and regulatory molecules. Accurate determination of the primary sequence and secondary structure of ncRNAs is important for understanding their functions. During cDNA synthesis, RNA 3' end stem-loops can self-prime reverse transcription, creating RNA-cDNA chimeras. We found that chimeric RNA-cDNA fragments can also be detected at 5' end stem-loops, although at much lower frequency. Using the Gubler-Hoffman method, both types of chimeric fragments can be converted to cDNA during library construction, and they are readily detectable in high-throughput RNA sequencing (RNA-seq) experiments. Here, we show that these chimeric reads contain valuable information about the boundaries of ncRNAs. We developed a bioinformatic method, called Vicinal, to precisely map the ends of numerous fruitfly, mouse and human ncRNAs. Using this method, we analyzed chimeric reads from over 100 RNA-seq datasets, the results of which we make available for users to find RNAs of interest. In summary, we show that Vicinal is a useful tool for determination of the precise boundaries of uncharacterized ncRNAs, facilitating further structure/function studies.

| S-EPMC4027162 | biostudies-literature

RNA-Seq Analysis

Project description:RNA-Seq analysis carried out in three different backcrosses based on Iberian breed with Landrace, Pietrain and Duroc breeds

2017-10-31 | PRJEB23068 | EVA

MetaMap: an atlas of metatranscriptomic reads in human disease-related RNA-seq data.

Project description:With the advent of the age of big data in bioinformatics, large volumes of data and high-performance computing power enable researchers to perform re-analyses of publicly available datasets at an unprecedented scale. Ever more studies implicate the microbiome in both normal human physiology and a wide range of diseases. RNA sequencing technology (RNA-seq) is commonly used to infer global eukaryotic gene expression patterns under defined conditions, including human disease-related contexts; however, its generic nature also enables the detection of microbial and viral transcripts. We developed a bioinformatic pipeline to screen existing human RNA-seq datasets for the presence of microbial and viral reads by re-inspecting the non-human-mapping read fraction. We validated this approach by recapitulating outcomes from six independent, controlled infection experiments of cell line models and compared them with an alternative metatranscriptomic mapping strategy. We then applied the pipeline to close to 150 terabytes of publicly available raw RNA-seq data from more than 17,000 samples from more than 400 studies relevant to human disease using state-of-the-art high-performance computing systems. The resulting data from this large-scale re-analysis are made available in the presented MetaMap resource. Our results demonstrate that common human RNA-seq data, including those archived in public repositories, might contain valuable information to correlate microbial and viral detection patterns with diverse diseases. The presented MetaMap database thus provides a rich resource for hypothesis generation toward the role of the microbiome in human disease. Additionally, codes to process new datasets and perform statistical analyses are made available.

| S-EPMC6025204 | biostudies-literature

CRAC: an integrated approach to the analysis of RNA-seq reads.

Project description:A large number of RNA-sequencing studies set out to predict mutations, splice junctions or fusion RNAs. We propose a method, CRAC, that integrates genomic locations and local coverage to enable such predictions to be made directly from RNA-seq read analysis. A k-mer profiling approach detects candidate mutations, indels and splice or chimeric junctions in each single read. CRAC increases precision compared with existing tools, reaching 99.5% for splice junctions, without losing sensitivity. Importantly, CRAC predictions improve with read length. In cancer libraries, CRAC recovered 74% of validated fusion RNAs and predicted novel recurrent chimeric junctions. CRAC is available at http://crac.gforge.inria.fr.

| S-EPMC4053775 | biostudies-literature

Local sequence and sequencing depth dependent accuracy of RNA-seq reads.

Project description:Many biases and spurious effects are inherent in RNA-seq technology, resulting in a non-uniform distribution of sequencing read counts for each base position in a gene. Therefore, a base-level strategy is required to model the non-uniformity. Also, the properties of sequencing read counts can be leveraged to achieve a more precise estimation of the mean and variance of measurement. In this study, we aimed to unveil the effects on RNA-seq accuracy from multiple factors and develop accurate modeling of RNA-seq reads in comparison. We found that the overdispersion rate decreased when sequencing depth increased on the base level. Moreover, the influence of local sequence(s) on the overdispersion rate was notable but no longer significant after adjusting the effect from sequencing depth. Based on these findings, we propose a desirable beta-binomial model with a dynamic overdispersion rate on the base-level proportion of sequencing read counts from two samples. The current study provides thorough insights into the impact of overdispersion at the position level and especially into its relationship with sequencing depth, local sequence, and preparation protocol. These properties of RNA-seq will aid in improvement of the quality control procedure and development of statistical methods for RNA-seq downstream analyses.

| S-EPMC5550947 | biostudies-other

HLA typing from RNA-Seq sequence reads.

Project description:We present a method, seq2HLA, for obtaining an individual's human leukocyte antigen (HLA) class I and II type and expression using standard next generation sequencing RNA-Seq data. RNA-Seq reads are mapped against a reference database of HLA alleles, and HLA type, confidence score and locus-specific expression level are determined. We successfully applied seq2HLA to 50 individuals included in the HapMap project, yielding 100% specificity and 94% sensitivity at a P-value of 0.1 for two-digit HLA types. We determined HLA type and expression for previously un-typed Illumina Body Map tissues and a cohort of Korean patients with lung cancer. Because the algorithm uses standard RNA-Seq reads and requires no change to laboratory protocols, it can be used for both existing datasets and future studies, thus adding a new dimension for HLA typing and biomarker studies.

| S-EPMC4064318 | biostudies-literature

TransMeta simultaneously assembles multisample RNA-seq reads.

Project description:Assembling RNA-seq reads into full-length transcripts is crucial in transcriptomic studies and poses computational challenges. Here we present TransMeta, a simple and robust algorithm that simultaneously assembles RNA-seq reads from multiple samples. TransMeta is designed based on the newly introduced vector-weighted splicing graph model, which enables accurate reconstruction of the consensus transcriptome via incorporating a cosine similarity-based combing strategy and a newly designed label-setting path-searching strategy. Tests on both simulated and real data sets show that TransMeta consistently outperforms PsiCLASS, StringTie2 plus its merge mode, and Scallop plus TACO, the most popular tools, in terms of precision and recall under a wide range of coverage thresholds at the meta-assembly level. Additionally, TransMeta consistently shows superior performance at the individual sample level.

| S-EPMC9341511 | biostudies-literature

Targeted variant detection using unaligned RNA-Seq reads.

Project description:Mutations identified in acute myeloid leukemia patients are useful for prognosis and for selecting targeted therapies. Detection of such mutations using next-generation sequencing data requires a computationally intensive read mapping step followed by several variant calling methods. Targeted mutation identification drastically shifts the usual tradeoff between accuracy and performance by concentrating all computations over a small portion of sequence space. Here, we present km, an efficient approach leveraging k-mer decomposition of reads to identify targeted mutations. Our approach is versatile, as it can detect single-base mutations, several types of insertions and deletions, as well as fusions. We used two independent cohorts (The Cancer Genome Atlas and Leucegene) to show that mutation detection by km is fast, accurate, and mainly limited by sequencing depth. Therefore, km allows the establishment of fast diagnostics from next-generation sequencing data and could be suitable for clinical applications.

| S-EPMC6701478 | biostudies-literature

From reads to insight: a hitchhiker's guide to ATAC-seq data analysis.

Project description:Assay of Transposase Accessible Chromatin sequencing (ATAC-seq) is widely used in studying chromatin biology, but a comprehensive review of the analysis tools has not been completed yet. Here, we discuss the major steps in ATAC-seq data analysis, including pre-analysis (quality check and alignment), core analysis (peak calling), and advanced analysis (peak differential analysis and annotation, motif enrichment, footprinting, and nucleosome position analysis). We also review the reconstruction of transcriptional regulatory networks with multiomics data and highlight the current challenges of each step. Finally, we describe the potential of single-cell ATAC-seq and highlight the necessity of developing ATAC-seq specific analysis tools to obtain biologically meaningful insights.

| S-EPMC6996192 | biostudies-literature
