Dataset Information

Heuristic pairwise alignment of de Bruijn graphs to facilitate simultaneous transcript discovery in related organisms from RNA-Seq data.


ABSTRACT: The advance of high-throughput sequencing has made it possible to obtain new transcriptomes and study splicing mechanisms in non-model organisms. In these studies, there is often a need to investigate the transcriptomes of two related organisms at the same time in order to find the similarities and differences between them. The traditional approach to address this problem is to perform de novo transcriptome assemblies to obtain predicted transcripts for these organisms independently and then employ similarity comparison algorithms to study them. Instead of obtaining predicted transcripts for these organisms separately from the intermediate de Bruijn graph structures employed by de novo transcriptome assembly algorithms, we develop an algorithm to allow direct comparisons between paths in two de Bruijn graphs by first enumerating short paths in both graphs, and iteratively extending paths in one graph that have high similarity to paths in the other graph to obtain longer corresponding paths between the two graphs. These paths represent predicted transcripts that are present in both organisms. Our approach generalizes the pairwise sequence alignment problem to allow the input to be non-linear structures, and provides a heuristic to reliably recover similar paths from the two structures. Our algorithm allows detailed investigation of the similarities and differences in alternative splicing between the two organisms at both the sequence and structure levels, even in the absence of reference transcriptomes or a closely related model organism.
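
The seed-and-extend idea described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the function names (`build_de_bruijn`, `enumerate_paths`, `extend_pair`, `align_graphs`), the parameters `seed_len` and `min_sim`, the greedy extension rule, and the use of Python's `difflib.SequenceMatcher` ratio as the similarity measure are all assumptions made for illustration. It also extends both paths together, which is a simplification of the procedure the abstract outlines.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in a read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def spell(path):
    """Return the sequence spelled by a path of overlapping nodes."""
    return path[0] + "".join(node[-1] for node in path[1:])


def enumerate_paths(graph, length):
    """Enumerate all simple paths that contain exactly `length` nodes."""
    paths = []

    def walk(path):
        if len(path) == length:
            paths.append(tuple(path))
            return
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in path:  # keep paths simple so enumeration terminates
                walk(path + [nxt])

    for node in sorted(graph):
        walk([node])
    return paths


def similarity(a, b):
    """Similarity in [0, 1]; a stand-in for a real alignment score (assumption)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def extend_pair(g1, g2, p1, p2, min_sim):
    """Greedily extend both paths by one node while similarity stays >= min_sim."""
    p1, p2 = list(p1), list(p2)
    while True:
        best = None
        for n1 in sorted(g1.get(p1[-1], ())):
            if n1 in p1:
                continue
            for n2 in sorted(g2.get(p2[-1], ())):
                if n2 in p2:
                    continue
                score = similarity(spell(p1 + [n1]), spell(p2 + [n2]))
                if score >= min_sim and (best is None or score > best[0]):
                    best = (score, n1, n2)
        if best is None:
            return p1, p2
        p1.append(best[1])
        p2.append(best[2])


def align_graphs(g1, g2, seed_len=3, min_sim=0.8):
    """Seed with similar short paths, then extend them into longer path pairs."""
    results = set()
    seeds2 = enumerate_paths(g2, seed_len)
    for s1 in enumerate_paths(g1, seed_len):
        for s2 in seeds2:
            if similarity(spell(s1), spell(s2)) >= min_sim:
                p1, p2 = extend_pair(g1, g2, s1, s2, min_sim)
                results.add((spell(p1), spell(p2)))
    return results
```

For example, building graphs with `build_de_bruijn(["ACGTTGCA"], 4)` and `build_de_bruijn(["ACGTAGCA"], 4)` and passing them to `align_graphs` pairs up paths from two sequences that differ by a single substitution. A realistic tool would replace the ratio-based score with a proper alignment score and would bound the number of enumerated paths, since exhaustive enumeration grows quickly on branching graphs.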

SUBMITTER: Fu S 

PROVIDER: S-EPMC4652555 | biostudies-literature | 2015

REPOSITORIES: biostudies-literature

Publications

Heuristic pairwise alignment of de Bruijn graphs to facilitate simultaneous transcript discovery in related organisms from RNA-Seq data.

Shuhua Fu, Aaron M. Tarone, Sing-Hoi Sze

BMC Genomics, 2015-11-10


Similar Datasets

| S-EPMC6122196 | biostudies-literature
| S-EPMC8901008 | biostudies-literature
| S-EPMC5872255 | biostudies-literature
| S-EPMC4120145 | biostudies-literature
| S-EPMC6612864 | biostudies-other
| S-EPMC4253301 | biostudies-literature
| S-EPMC8326735 | biostudies-literature
| S-EPMC3421212 | biostudies-literature
| S-EPMC6061703 | biostudies-literature
| S-EPMC8016496 | biostudies-literature