Dataset Information

Efficient assembly and annotation of the transcriptome of catfish by RNA-Seq analysis of a doubled haploid homozygote.

ABSTRACT:

Background

Upon the completion of whole genome sequencing, thorough genome annotation that associates genome sequences with biological meanings is essential. Genome annotation depends on the availability of transcript information as well as orthology information. In teleost fish, genome annotation is seriously hindered by genome duplication. Because of gene duplications, one cannot establish orthologies simply by homology comparisons. Rather, intensive phylogenetic analysis or structural analysis of orthologies is required for the identification of genes. To conduct phylogenetic analysis and orthology analysis, full-length transcripts are essential. Generating large numbers of full-length transcripts by traditional transcript sequencing is very difficult and extremely costly.

Results

In this work, we took advantage of a doubled haploid catfish, which has two identical sets of chromosomes and therefore, in theory, no allelic variation. As such, transcript sequences generated by next-generation sequencing can be favorably assembled into full-length transcripts. Deep sequencing of the doubled haploid channel catfish transcriptome was performed using the Illumina HiSeq 2000 platform, yielding over 300 million high-quality trimmed reads totaling 27 Gbp. Assembly of these reads generated 370,798 non-redundant transcript-derived contigs. Functional annotation of the assembly allowed identification of 25,144 unique protein-encoding genes. A total of 2,659 unique genes were identified as putative duplicated genes in the catfish genome because the assembly of the corresponding transcripts harbored paralogous sequence variants (PSVs) or multisite sequence variants (MSVs), which appear as pseudo-SNPs in the assembly. Of the 25,144 contigs with unique protein hits, around 20,000 contigs matched 50% of the length of reference proteins, and over 14,000 transcripts were identified as full-length with complete open reading frames. The characterization of consensus sequences surrounding the start codon and the stop codon confirmed the correct assembly of the full-length transcripts.
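As an illustration of what "full-length with complete open reading frames" can mean in practice, the following minimal Python sketch scans the three forward reading frames of a contig for the longest stretch that begins with ATG and ends at an in-frame stop codon. It is not the pipeline used in the study: the function name `longest_complete_orf`, the `min_codons` threshold, and the restriction to the forward strand and the standard stop codons are all assumptions made for this example.

```python
# Illustrative sketch only; not the method used in the study.
# Assumptions: forward strand only, standard stop codons, hypothetical threshold.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_complete_orf(seq, min_codons=30):
    """Return (start, end) of the longest ATG...stop ORF, or None if there is none.

    Coordinates are 0-based, end-exclusive, and include the stop codon.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None  # position of the first ATG of the ORF currently open
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                long_enough = (end - start) // 3 >= min_codons
                if long_enough and (best is None or end - start > best[1] - best[0]):
                    best = (start, end)
                start = None  # close this ORF and look for the next ATG
    return best
```

A contig for which this returns `None` has no start codon followed by an in-frame stop, so under these assumptions it would not be counted as carrying a complete open reading frame.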

Conclusions

The large set of transcripts assembled in this study is the most comprehensive set of genome resources ever developed from catfish. It will provide much-needed resources for functional genome research in catfish, serving as a reference transcriptome for genome annotation, analysis of gene duplication, gene family structures, and digital gene expression analysis. The putative set of duplicated genes provides a starting point for genome-scale analysis of gene duplication in the catfish genome, and should be a valuable resource for comparative genome analysis, genome evolution, and genome function studies.

SUBMITTER: Liu S

PROVIDER: S-EPMC3582483 | biostudies-literature | 2012 Nov

REPOSITORIES: biostudies-literature

Publications

Efficient assembly and annotation of the transcriptome of catfish by RNA-Seq analysis of a doubled haploid homozygote.

Liu S, Zhang Y, Zhou Z, Waldbieser G, Sun F, Lu J, Zhang J, Jiang Y, Zhang H, Wang X, Rajendran KV, Khoo L, Kucuktas H, Peatman E, Liu Z

BMC Genomics, 2012 Nov 5

PMID: 23127152

Similar Datasets

Project description: Customizable endonucleases are providing an effective tool for genome engineering. The resulting primary transgenic individuals (T0) are typically heterozygous and/or chimeric with respect to any mutations induced. To generate genetically fixed mutants, they are conventionally allowed to self-pollinate, a procedure which segregates individuals into mutant heterozygotes/homozygotes and wild types. The chances of recovering homozygous mutants among the progeny depend not only on meiotic segregation but also on the frequency of mutated germline cells in the chimeric mother plant. In Nicotiana species, the heritability of Cas9-induced mutations has not been demonstrated yet. RNA-guided Cas9 endonuclease-mediated mutagenesis was targeted to the green fluorescent protein (GFP) gene harbored by a transgenic tobacco line. Upon retransformation using a GFP-specific guide RNA/Cas9 construct, the T0 plants were either allowed to self-pollinate or propagated via regeneration, either from in vitro cultured embryogenic pollen, which gives rise to haploid/doubled haploid plants, or from leaf explants, which form plants vegetatively. Single or multiple mutations were detected in 80% of the T0 plants. About half of these mutations proved heritable via selfing. Regeneration from in vitro cultured embryogenic pollen allowed homozygous mutants to be produced more efficiently than via sexual reproduction. Consequently, embryogenic pollen culture provides a convenient method to rapidly generate a variety of genetically fixed mutants following site-directed mutagenesis. The recovery of a mutation not found among sexually produced and analyzed progeny was shown to be achievable through vegetative plant propagation in vitro, which eventually resulted in heritability when the somatic clones were selfed. In addition, some in-frame mutations were associated with functional attenuation of the target gene rather than its full knock-out. The generation of mutants with compromised rather than abolished gene functionality holds promise for future approaches to the conclusive functional validation of genes which are indispensable for the plant.
