Dataset Information

Improved structural annotation of protein-coding genes in the Meloidogyne hapla genome using RNA-Seq.


ABSTRACT: As high-throughput cDNA sequencing (RNA-Seq) is increasingly applied to hypothesis-driven biological studies, the prediction of protein-coding genes based on these data is usurping strictly in silico approaches. Compared with computationally derived gene predictions, structural annotation is more accurate when based on biological evidence, particularly RNA-Seq data. Here, we refine the current genome annotation for the Meloidogyne hapla genome utilizing RNA-Seq data. Published structural annotation defines 14,420 protein-coding genes in the M. hapla genome. Of these, 25% (3751) were found to exhibit some incongruence with RNA-Seq data. Manual annotation enabled these discrepancies to be resolved. Our analysis revealed 544 new gene models that were missing from the prior annotation. Additionally, 1457 transcribed regions were newly identified on the ends of as-yet-unjoined contigs. We also searched for trans-spliced leaders, and based on RNA-Seq data, identified genes that appear to be trans-spliced. Four 22-bp trans-spliced leaders were identified using our pipeline, including the known trans-spliced leader, which is the M. hapla ortholog of SL1. In silico predictions of trans-splicing were validated by comparison with earlier results derived from an independent cDNA library constructed to capture trans-spliced transcripts. The new annotation, which we term HapPep5, is publicly available at www.hapla.org.

SUBMITTER: Guo Y 

PROVIDER: S-EPMC4165543 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature

Publications

Improved structural annotation of protein-coding genes in the Meloidogyne hapla genome using RNA-Seq.

Guo Y, Bird DM, Nielsen DM

Worm, 2014-05-16



Similar Datasets

| S-EPMC3219749 | biostudies-literature
| S-EPMC6505119 | biostudies-literature
| PRJNA12686 | ENA
| PRJNA12707 | ENA
| S-EPMC2547418 | biostudies-literature
| S-EPMC6847864 | biostudies-literature
| S-EPMC4181115 | biostudies-literature
| S-EPMC2655684 | biostudies-literature
| S-EPMC8015272 | biostudies-literature
2010-11-10 | E-GEOD-21925 | biostudies-arrayexpress