Project description:Full-length transcriptome sequencing, gene function annotation and SSR prediction of Clematis calyx based on the single-molecule real-time (SMRT) sequencing platform.
Project description:A major informatic challenge in single-cell RNA-sequencing analysis is the precise annotation of datasets where cells exhibit complex multilayered identities or transitory states. Here, we present devCellPy, a highly accurate and precise machine learning-enabled tool that enables automated prediction of cell types across complex annotation hierarchies. To demonstrate the power of devCellPy, we construct a murine cardiac developmental atlas from published datasets encompassing 104,199 cells from E6.5-E16.5 and train devCellPy to generate a cardiac prediction algorithm. Using this algorithm, we observe a high prediction accuracy (>90%) across multiple layers of annotation and across de novo murine developmental data. Furthermore, we conduct a cross-species prediction of cardiomyocyte subtypes from in vitro-derived human induced pluripotent stem cells and unexpectedly uncover a predominance of left ventricular (LV) identity, which we confirm with an LV-specific TBX5 lineage tracing system. Together, our results show devCellPy to be a powerful tool for automated cell prediction across complex cellular hierarchies, species, and experimental systems.
Project description:As part of the ENCODE consortium, the GENCODE project is producing a reference gene set through manual and automated gene prediction. Selected transcript models are verified experimentally by RT-PCR amplification followed by sequencing. In batch IX, a set of de novo transcript models was tested with the aim of incorporating new long non-coding RNA models into the GENCODE annotation. The original set was built with Cufflinks from ENCODE RNAseq data derived from 15 cell lines by the Gingeras (CSHL) and Wold (CalTech) labs. A subset of multiexonic transcripts not overlapping the GENCODE v10 annotation was selected for this experiment.
Project description:This is an auto-generated model built with the COBRA Matlab toolbox. The gadMorTrinigy de novo Trinity transcript assembly and peptide sequences are available at https://doi.org/10.6084/m9.figshare.c.5168303.v2
2020-10-26 | MODEL2010090002 | BioModels
Project description:Chaerilus cimrmani De Novo Transcriptome Assembly and Annotation
| PRJNA1115122 | ENA
Project description:Chaerilus stockmannorum De Novo Transcriptome Assembly and Annotation + Genome Assembly and Annotation
Project description:Vitis vinifera cv. Tannat is widely cultivated in Uruguay for the production of high-quality red wines. Its most notable characteristic is an elevated content of polyphenolic compounds, which provide an intense purple color and remarkable antioxidant properties to the wine. To characterize the genetic components encoding this important phenotypic characteristic, the genome of the Uruguayan Tannat clone UY11 was sequenced to 134X coverage using Illumina technology and assembled with a mixed approach of de novo assembly and iterative mapping on the PN40024 reference genome. An approach based on both reference-guided annotation and de novo transcript assembly of RNA-Seq data allowed the definition of 3,673 genes not previously annotated in PN40024 that we consider novel, and the discovery of 2,228 genes not shared with the grapevine reference genome that we consider private to Tannat. Expression analysis showed that private genes contributed substantially (more than 50%) to the overall expression of enzymes involved in phenol and polyphenol biosynthesis, indicating that the dispensable portion of the grapevine genome contains many private genes that are likely to contribute to the peculiar phenotypic characteristics of this grapevine variety.