Project description:Based on our previous O-Search strategy, we have developed a new search method, O-Search-Pattern, to process searching for O-glycopeptide. In comparison of analyzing our human serum dataset generated from optimized energy, our new method can generate more GPSMs glycopeptide sequences than currently state-of-the-art search tools.
Project description:State-of-the-art algorithms for m6A detection and quantification via nanopore direct RNA sequencing have been continuously developed, little is known about their capacities and limitations, which makes a comprehensive assessment in urgent need. Therefore, we performed comprehensive benchmarking of 10 computational tools relying on current-based and base-calling “errors” strategies for m6A detection by nanopore sequencing.
Project description:Single-cell transcriptomics allows the identification of cellular types, subtypes and states through cell clustering. In this process, similar cells are grouped before determining co-expressed marker genes for phenotype inference. The performance of computational tools is directly associated to their marker identification accuracy, but the lack of an optimal solution challenges a systematic method comparison. Moreover, phenotypes from different studies are challenging to integrate, due to varying resolution, methodology and experimental design. In this work we introduce matchSCore (https://github.com/elimereu/matchSCore), a measure to fastly match cell populations across tools, experiments and technologies. We compared 14 computational methods and evaluated their accuracy in clustering and gene marker identification in simulated data sets. Further, we used matchSCore to project cell type identities across mouse or human cell atlas projects. Despite originated from different technologies, cell populations could be matched across datasets, allowing the assignment of clusters to reference maps and their annotation.
Project description:Pollen grains of Arabidopsis thaliana contain two haploid sperm cells enclosed in a haploid vegetative cell. Upon germination, the vegetative cell extrudes a pollen tube that carries the sperm to an ovule for fertilization. Knowing the identity, relative abundance, and splicing patterns of pollen transcripts will improve understanding of pollen and allow investigation of tissue-specific splicing in plants. Most Arabidopsis pollen transcriptome studies have used the ATH1 microarray, which does not assay splice variants and lacks specific probe sets for many genes. To investigate the pollen transcriptome, we performed high-throughput sequencing (RNA-Seq) of Arabidopsis pollen and seedlings for comparison. Gene expression was more diverse in seedling, and genes involved in cell wall biogenesis were highly expressed in pollen. RNA-Seq detected at least 4,172 protein coding genes expressed in pollen, including 289 assayed only by non-specific probe sets. Additional exons and previously unannotated 5’ and 3’ UTRs for pollen-expressed genes were revealed. We detected regions in the genome not previously annotated as expressed; 14 were tested and 12 confirmed by PCR. Gapped read alignments revealed 1,908 high-confidence new splicing events supported by 10 or more spliced read alignments. Alternative splicing patterns in pollen and seedling were highly correlated. For most alternatively spliced genes, the ratio of variants in pollen and seedling was similar, except for some encoding proteins involved in RNA splicing. This study highlights the robustness of splicing patterns in plants and the importance of on-going annotation and visualization of RNA-Seq data using interactive tools such as Integrated Genome Browser.