Dataset Information

Integrative structural annotation of de novo RNA-Seq provides an accurate reference gene set of the enormous genome of the onion (Allium cepa L.).


ABSTRACT: The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp.

SUBMITTER: Kim S 

PROVIDER: S-EPMC4379974 | biostudies-literature | 2015 Feb

REPOSITORIES: biostudies-literature

Publications

Integrative structural annotation of de novo RNA-Seq provides an accurate reference gene set of the enormous genome of the onion (Allium cepa L.).

Kim Seungill, Kim Myung-Shin, Kim Yong-Min, Yeom Seon-In, Cheong Kyeongchae, Kim Ki-Tae, Jeon Jongbum, Kim Sunggil, Kim Do-Sun, Sohn Seong-Han, Lee Yong-Hwan, Choi Doil

DNA Research: an international journal for rapid publication of reports on genes and genomes, 2014-10-31, issue 1


Similar Datasets

| S-EPMC6409889 | biostudies-literature
| S-EPMC4564285 | biostudies-literature
| S-EPMC9933674 | biostudies-literature
| S-EPMC5023330 | biostudies-literature
2011-06-29 | GSE27132 | GEO
| S-EPMC8766973 | biostudies-literature
| S-EPMC7285762 | biostudies-literature
| S-EPMC7418993 | biostudies-literature
| S-EPMC5604068 | biostudies-literature
| S-EPMC7076509 | biostudies-literature