Project description:Shotgun protein sequencing with meta-contig assembly.
Full-length de novo sequencing from tandem mass (MS/MS) spectra of unknown proteins, such as antibodies or proteins from organisms with unsequenced genomes, remains a challenging open problem. Conventional algorithms designed to sequence each MS/MS spectrum individually are limited by incomplete peptide fragmentation or low signal-to-noise ratios and tend to produce short de novo sequences with low sequencing accuracy. Our shotgun protein sequencing (SPS) approach was developed to ameliorate these limitations by first finding groups of unidentified spectra from the same peptides (contigs) and then deriving a consensus de novo sequence for each assembled set of spectra (contig sequences). Although SPS enables much more accurate reconstruction of de novo sequences longer than can be recovered from individual MS/MS spectra, it still requires error-tolerant matching to homologous proteins to group smaller contig sequences into full-length protein sequences, which limits its effectiveness on sequences from poorly annotated proteins. Using low- and high-resolution CID and high-resolution HCD MS/MS spectra, we address this limitation with a Meta-SPS algorithm designed to overlap and further assemble SPS contigs into Meta-SPS de novo contig sequences extending as long as 100 amino acids at over 97% accuracy, without requiring any knowledge of homologous protein sequences. We demonstrate Meta-SPS using distinct MS/MS data sets obtained with separate enzymatic digestions and discuss how the remaining de novo sequencing limitations relate to MS/MS acquisition settings.
Project description:We present evidence for a new level of genome folding, whereby distant domains megabases apart fuse to form meta-domains. Within meta-domains, certain gene promoters pair with structural intergenic elements in the distant TAD. These long-range associations occur in a large fraction of Drosophila neurons, but support transcription in only a subset of cells in the nervous system. Most of the associated genes encode neuronal determinants, including those engaged in axonal guidance and adhesion. We used the single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) to identify regions of open chromatin at single-cell resolution.
Project description:We performed a meta-analysis of publicly available TET1, 5mC, 5hmC and genome-wide bisulfite profiling data, mostly from mouse embryonic stem cells (ESC). Genome-wide chromatin immunoprecipitation combined with deep sequencing (ChIP-seq) has revealed binding of the TET1 protein at CpG-island (CGI) promoters and at bivalent promoters. We show that TET1 also coincides with DNaseI hypersensitive sites (HS). The presence of TET1 at these three locations suggests that it may play a dual role: an active role at CpG islands and DNaseI hypersensitive sites and a repressive role at bivalent loci. In line with the presence of TET1, significant enrichment of 5hmC but not 5mC is detected at bivalent promoters and DNaseI HS. Surprisingly, 5hmC is not detected or is present at very low levels at CGI promoters, notwithstanding the presence of TET1 at these loci. Our meta-analysis suggests that asymmetric methylation is present at CA- and CT-repeats in the genome of some human ESC. Examination of the distribution of 5-methylcytosine and 5-hydroxymethylcytosine in the genome of mouse embryonic stem cells.
Project description:We present evidence for a new level of genome folding, whereby distant domains megabases apart fuse to form meta-domains. Within meta-domains, certain gene promoters pair with structural intergenic elements in the distant TAD. These long-range associations occur in a large fraction of Drosophila neurons, but support transcription in only a subset of cells in the nervous system. Most of the associated genes encode neuronal determinants, including those engaged in axonal guidance and adhesion. We used single-cell RNA sequencing (scRNA-seq) to analyze gene misexpression genotypes after deleting intergenic meta-loop anchors.
Project description:Long non-coding RNAs (lncRNAs) are essential regulators of a broad range of biological processes in plants. Rapid progress in next-generation sequencing technologies has enabled genome-wide identification of lncRNAs in multiple plant species. In this study, genome-wide lncRNA sequencing was used to identify cold-responsive lncRNAs at the booting stage in rice by comparing a tolerant variety, Kongyu131 (KY131), with a sensitive variety, Dongnong422 (DN422). GO and KEGG enrichment analyses were performed, focusing on the cis- and trans-target genes of differentially expressed lncRNAs. To identify cold-responsive genes, a meta-analysis was used to integrate cold-tolerance QTLs at the booting stage. In total, 13 cold-responsive target genes were obtained by KEGG enrichment analysis combined with meta-analysis, as confirmed by qRT-PCR. Finally, three of these genes were identified as responsive to cold stress. These results provide new insight into cold-resistance research in rice.