Zhang2020 - Draft metabolic reconstruction model of Atlantic cod liver generated from gadMorTrinity and iHepatocyte2322
ABSTRACT: This model was auto-generated with the COBRA Toolbox for MATLAB. The gadMorTrinity de novo Trinity transcript assembly and peptide sequences are available at https://doi.org/10.6084/m9.figshare.c.5168303.v2
Project description:The goal of this study was to lay the groundwork for comparative transcriptomics of sex differences in the brain of wolf spiders, a non-model organism of the phylum Euarthropoda, by generating transcriptomes and analyzing gene expression. To examine sex-differential gene expression, short-read transcript sequencing and de novo transcriptome assembly were performed. Messenger RNA (mRNA) was isolated from dissected brain tissue of male and female subadult and mature wolf spiders (Schizocosa ocreata). The data consist of short-read sequences for the two life stages in each sex. Computational analyses of these data include de novo transcriptome assembly, using the Trinity and CAP3 assembly suites, and differential expression analysis using the edgeR package. Sample-specific and combined transcriptomes, gene annotations, and differential expression results are described in this data note and are available from the associated database submissions.
Project description:De novo assembly of eight immune system regions for individual HV31, generated using a multi-platform pipeline. A full description of the generation of these assemblies can be found at https://doi.org/10.1101/2021.02.03.429586.
Project description:We report, for the first time, the use of next-generation massively parallel sequencing and de novo transcriptome assembly to gain insight into the transcriptome of Hevea brasiliensis. Sequencing generated more than 12 million reads with an average length of 90 nt. In total, 48,768 unigenes (mean size = 488 bp) were assembled de novo, more than three times the number of Hevea brasiliensis sequences deposited in GenBank. Assembled sequences were annotated with gene descriptions, gene ontology terms, and clusters of orthologous groups terms. A total of 37,373 unigenes were successfully annotated, and more than 10% of unigenes aligned to known proteins of Euphorbiaceae. The unigenes contain a nearly complete collection of known rubber-synthesis-related genes. Our data provide the most comprehensive sequence resource available for studying the rubber tree and demonstrate the feasibility of Illumina sequencing and de novo transcriptome assembly in a species lacking genome information. The transcriptome of latex and leaf in Hevea brasiliensis.
Project description:The brown ghost knifefish (Apteronotus leptorhynchus) is a weakly electric teleost fish of particular interest as a model organism for a variety of research areas in neuroscience, including neurophysiology, neuroethology, and neurobiology. This versatile model system has been more recently used in the study of central nervous system development and regeneration during adulthood, as well as in the study of vertebrate aging and senescence. Despite substantial scientific interest in this species, no genomic resources are currently available. After evaluating several trimming and transcript reconstruction strategies, de novo assembly using Trinity uncovered at least 11,847 unique components (“genes”) containing full or near-full length protein sequences based on alignment to a reference set of known Actinopterygii protein sequences, with as many as 42,459 components containing at least a partial protein-coding sequence, providing broad coverage of the proteome. Shotgun proteomics confirmed translation of open reading frames from over 2,000 transcripts, including alternative splice variants. Assignment of tandem mass spectra obtained was shown to be greatly improved with the assembly compared with using databases of sequences from closely related organisms.
Project description:We describe a new strategy for LC-MS/MS-based full-length protein sequencing. Protein samples were nonspecifically hydrolyzed by three different methods (proteinase K, papain, and microwave-assisted acid hydrolysis) to increase the degree of overlap among peptides. After LC-MS/MS analysis, peptide sequences were generated by the de novo peptide sequencing program pNovo. A new sequence assembly program, based on a de Bruijn graph and an Overlap-Layout-Consensus strategy, was designed to generate full-length protein sequences from the pNovo results.
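The overlap idea behind this kind of assembly can be illustrated with a minimal sketch. This is not the authors' assembly program: it is a simplified greedy Overlap-Layout-Consensus merge with no de Bruijn graph and no handling of sequencing errors. The function names (`overlap`, `greedy_assemble`), the `min_len` threshold, and the example peptides are all hypothetical and chosen only for illustration.

```python
from itertools import permutations


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when no overlap of at least `min_len` residues exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(peptides, min_len: int = 3):
    """Greedily merge the pair of sequences with the longest exact overlap.

    Assumes error-free peptide strings; real de novo peptide calls would
    need error-tolerant matching, which this sketch does not attempt.
    """
    # Drop exact duplicates, then drop peptides fully contained in another.
    seqs = list(dict.fromkeys(peptides))
    seqs = [s for s in seqs if not any(s != t and s in t for t in seqs)]

    while True:
        best_len, best_a, best_b = 0, None, None
        for a, b in permutations(seqs, 2):
            k = overlap(a, b, min_len)
            if k > best_len:
                best_len, best_a, best_b = k, a, b
        if best_len == 0:
            # No remaining overlaps: what is left are the final contigs.
            return seqs
        seqs.remove(best_a)
        seqs.remove(best_b)
        seqs.append(best_a + best_b[best_len:])


# Hypothetical overlapping peptides, as nonspecific hydrolysis might produce.
contigs = greedy_assemble(["MKTAYIA", "AYIAKQR", "KQRQISF", "TAYI"])
print(contigs)
```

Nonspecific hydrolysis matters here because it yields peptides with staggered start and end points, so more pairs share a suffix-prefix overlap long enough to merge.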
Project description:Shotgun protein sequencing with meta-contig assembly.
Full-length de novo sequencing from tandem mass (MS/MS) spectra of unknown proteins such as antibodies or proteins from organisms with unsequenced genomes remains a challenging open problem. Conventional algorithms designed to individually sequence each MS/MS spectrum are limited by incomplete peptide fragmentation or low signal to noise ratios and tend to result in short de novo sequences at low sequencing accuracy. Our shotgun protein sequencing (SPS) approach was developed to ameliorate these limitations by first finding groups of unidentified spectra from the same peptides (contigs) and then deriving a consensus de novo sequence for each assembled set of spectra (contig sequences). But whereas SPS enables much more accurate reconstruction of de novo sequences longer than can be recovered from individual MS/MS spectra, it still requires error-tolerant matching to homologous proteins to group smaller contig sequences into full-length protein sequences, thus limiting its effectiveness on sequences from poorly annotated proteins. Using low and high resolution CID and high resolution HCD MS/MS spectra, we address this limitation with a Meta-SPS algorithm designed to overlap and further assemble SPS contigs into Meta-SPS de novo contig sequences extending as long as 100 amino acids at over 97% accuracy without requiring any knowledge of homologous protein sequences. We demonstrate Meta-SPS using distinct MS/MS data sets obtained with separate enzymatic digestions and discuss how the remaining de novo sequencing limitations relate to MS/MS acquisition settings.