
Dataset Information


A novel codon-based de Bruijn graph algorithm for gene construction from unassembled transcriptomes.


ABSTRACT: Most gene prediction methods detect coding sequences from transcriptome assemblies in the absence of closely related reference genomes. Such methods are of limited application due to high transcript fragmentation and extensive assembly errors, which may lead to redundant or false coding sequence predictions. We present inGAP-CDG, which can construct full-length and non-redundant coding sequences from unassembled transcriptomes by using a codon-based de Bruijn graph to simplify the assembly process and a machine learning-based approach to filter false positives. Compared with other methods, inGAP-CDG exhibits a significant increase in predicted coding sequence length and robustness to sequencing errors and varied read length.
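As a rough illustration of the codon-based de Bruijn graph idea named in the abstract, the sketch below builds a graph whose nodes are runs of codons and whose edges advance by one codon (3 nt) rather than one nucleotide. It is not inGAP-CDG's implementation: the function names, the value of k, the toy reads, and the assumption that every read is already in the correct reading frame are all hypothetical choices made for the example.

```python
from collections import defaultdict


def codon_kmers(read, k, frame=0):
    """Yield windows of k consecutive codons, stepping one codon (3 nt) at a time."""
    # Assumption: `frame` is known; real reads would need frame detection.
    codons = [read[i:i + 3] for i in range(frame, len(read) - 2, 3)]
    for i in range(len(codons) - k + 1):
        yield tuple(codons[i:i + k])


def build_codon_graph(reads, k):
    """Map each (k-1)-codon node to the set of nodes that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for kmer in codon_kmers(read, k):
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk(graph, start):
    """Follow edges while the successor is unambiguous; return the spelled sequence."""
    path, node, seen = list(start), start, {start}
    while len(graph.get(node, ())) == 1:
        (node,) = graph[node]
        if node in seen:  # stop on a cycle
            break
        seen.add(node)
        path.append(node[-1])
    return "".join(path)


# Two hypothetical in-frame reads that overlap by three codons.
reads = ["ATGGCCAAAGGT", "GCCAAAGGTTGA"]
graph = build_codon_graph(reads, k=3)
contig = walk(graph, ("ATG", "GCC"))
print(contig)
```

Stepping by whole codons keeps every node in a single reading frame, which is one plausible reason such a graph would be simpler than a nucleotide-level one; the filtering of false positives by machine learning described in the abstract is not modelled here.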

SUBMITTER: Peng G 

PROVIDER: S-EPMC5114782 | biostudies-literature | 2016 Nov

REPOSITORIES: biostudies-literature


Publications

A novel codon-based de Bruijn graph algorithm for gene construction from unassembled transcriptomes.

Peng Gongxin, Ji Peifeng, Zhao Fangqing

Genome Biology, 2016 Nov 17, issue 1



Similar Datasets

| S-EPMC5591975 | biostudies-literature
| S-EPMC8521641 | biostudies-literature
| S-EPMC3694675 | biostudies-literature
| S-EPMC9628837 | biostudies-literature
| S-EPMC3517413 | biostudies-literature
| S-EPMC6412133 | biostudies-literature
| S-EPMC4896364 | biostudies-literature
| S-EPMC7057571 | biostudies-literature
| S-EPMC8147420 | biostudies-literature
| S-EPMC3848682 | biostudies-literature