
Dataset Information

The PARA-suite: PAR-CLIP specific sequence read simulation and processing.


ABSTRACT:

Background

Next-generation sequencing technologies have profoundly impacted biology over recent years. Experimental protocols, such as photoactivatable ribonucleoside-enhanced cross-linking and immunoprecipitation (PAR-CLIP), which identifies protein-RNA interactions on a genome-wide scale, commonly employ deep sequencing. With PAR-CLIP, the incorporation of photoactivatable nucleosides into nascent transcripts leads to high rates of specific nucleotide conversions during reverse transcription. So far, the specific properties of PAR-CLIP-derived sequencing reads have not been assessed in depth.

Methods

We compared PAR-CLIP sequencing reads to regular transcriptome sequencing reads (RNA-Seq) to identify distinctive properties that are relevant for reference-based read alignment of PAR-CLIP datasets. We developed a set of freely available tools for PAR-CLIP data analysis, called the PAR-CLIP analyzer suite (PARA-suite). The PARA-suite includes error model inference, PAR-CLIP read simulation based on PAR-CLIP-specific properties, a full read alignment pipeline with a modified Burrows-Wheeler Aligner algorithm, and CLIP read clustering for binding site detection.
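
The simulation and error model inference steps can be illustrated with a minimal sketch. It is not the PARA-suite implementation or its API: the function names, the toy reference sequence and the conversion rate of 0.3 are all assumptions chosen for illustration. It also assumes that the "specific nucleotide conversions" are T-to-C substitutions, as is typical when 4-thiouridine is the photoactivatable nucleoside, and it ignores sequencing errors and indels.

```python
import random
from collections import Counter


def simulate_read(reference, start, length, conversion_rate, rng):
    """Copy a reference substring and convert each T to C with the given probability.

    Hypothetical helper for illustration only; not part of the PARA-suite.
    """
    fragment = reference[start:start + length]
    return "".join(
        "C" if base == "T" and rng.random() < conversion_rate else base
        for base in fragment
    )


def mismatch_profile(reference, reads_with_starts):
    """Count (reference base, read base) mismatches over ungapped alignments."""
    counts = Counter()
    for start, read in reads_with_starts:
        window = reference[start:start + len(read)]
        for ref_base, read_base in zip(window, read):
            if ref_base != read_base:
                counts[(ref_base, read_base)] += 1
    return counts


rng = random.Random(0)                       # fixed seed for reproducibility
reference = "ACGTTTGACCTTAGGTCATTTGCA" * 4   # toy reference, assumed
reads = []
for _ in range(200):
    start = rng.randrange(len(reference) - 20)
    reads.append((start, simulate_read(reference, start, 20, 0.3, rng)))

# With this simulator, the inferred profile contains only T-to-C mismatches.
profile = mismatch_profile(reference, reads)
print(profile)
```

In this toy setting the true read positions are known, which is what makes simulated data useful for measuring alignment accuracy: an aligner's reported position can be compared directly against the position the read was drawn from.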

Results

We show that differences in the error profiles of PAR-CLIP reads relative to regular transcriptome sequencing reads (RNA-Seq) make distinct processing advantageous. We examined the alignment accuracy of commonly applied read aligners on 10 simulated PAR-CLIP datasets using different parameter settings and identified the most accurate setup among those read aligners. We demonstrate the performance of the PARA-suite in conjunction with different binding site detection algorithms on several real PAR-CLIP and HITS-CLIP datasets. Our processing pipeline improved both alignment and binding site detection accuracy.

Availability

The PARA-suite toolkit and the PARA-suite aligner are available at https://github.com/akloetgen/PARA-suite and https://github.com/akloetgen/PARA-suite_aligner, respectively, under the GNU GPLv3 license.

SUBMITTER: Kloetgen A 

PROVIDER: S-EPMC5088580 | biostudies-literature | 2016

REPOSITORIES: biostudies-literature


Publications

The PARA-suite: PAR-CLIP specific sequence read simulation and processing.

Andreas Kloetgen, Arndt Borkhardt, Jessica I. Hoell, Alice C. McHardy

PeerJ, 2016-10-27



Similar Datasets

| S-EPMC3302668 | biostudies-literature
| S-EPMC5564215 | biostudies-literature
2018-07-03 | GSE114537 | GEO
| S-EPMC4053766 | biostudies-literature
| S-EPMC4691824 | biostudies-literature
| S-EPMC3508682 | biostudies-literature
| S-EPMC3549799 | biostudies-literature
| S-EPMC10256721 | biostudies-literature
2012-12-13 | GSE39682 | GEO
2012-12-13 | E-GEOD-39682 | biostudies-arrayexpress