ABSTRACT: Our modified PANDAseq (Assembler) performs a modified statistical analysis using the sequencer-supplied quality (Q) scores to find the most likely overlap, computes assembled Q scores for the read overlap region, and handles more complex overlap layouts.
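To illustrate how quality scores can drive the choice of overlap, the following is a minimal sketch and not the Assembler's actual implementation. It assumes that the reverse read has already been reverse-complemented, that base-call errors are independent and uniform over the three alternative bases, and that each non-overlapping position contributes a flat probability of 1/4. All function names are hypothetical.

```python
import math


def phred_to_error(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def match_probability(a, b, qa, qb):
    """Probability that two observed bases come from the same true base.

    Assumption: errors are independent and uniform over the other 3 bases.
    """
    ea, eb = phred_to_error(qa), phred_to_error(qb)
    if a == b:
        # Both calls correct, or both wrong in the same way.
        return (1 - ea) * (1 - eb) + ea * eb / 3
    # One call wrong, or both wrong with a shared third true base.
    return (1 - ea) * eb / 3 + (1 - eb) * ea / 3 + 2 * ea * eb / 9


def overlap_log_score(fwd, fwd_q, rev, rev_q, overlap):
    """Log-score for aligning the last `overlap` bases of fwd to the start of rev."""
    start = len(fwd) - overlap
    # Assumption: every non-overlapping position contributes log(1/4).
    score = (len(fwd) + len(rev) - 2 * overlap) * math.log(0.25)
    for i in range(overlap):
        score += math.log(
            match_probability(fwd[start + i], rev[i], fwd_q[start + i], rev_q[i])
        )
    return score


def best_overlap(fwd, fwd_q, rev, rev_q, min_overlap=1):
    """Return the overlap length with the highest log-score."""
    max_overlap = min(len(fwd), len(rev))
    return max(
        range(min_overlap, max_overlap + 1),
        key=lambda n: overlap_log_score(fwd, fwd_q, rev, rev_q, n),
    )
```

For example, `best_overlap("TTTTTTGTAC", [30] * 10, "GTACAAAAAA", [30] * 10)` selects the four-base overlap `GTAC`. This simple layout covers only the case where the forward read ends inside the reverse read; the more complex layouts mentioned in the abstract are not modelled here.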
Project description:Here we describe a custom FMDV microarray and a companion feature and template-assisted assembler software (FAT-assembler) capable of resolving virus genome sequence using a moderate number of conserved microarray features. The results demonstrate that this approach may be used to rapidly characterize naturally occurring FMDV as well as an engineered chimeric strain of FMDV. The FAT-assembler, while applied to resolving FMDV genomes, represents a new bioinformatics approach that should be broadly applicable to interpreting microarray genotyping data for other viruses or target organisms
Project description:Epithelial cells were isolated by FACS from the mammary glands of adult (10 week old) female mice. A basal subpopulation of the epithelial cells was also isolated. Freshly sorted cells were submitted to a 10X Genomics Chromium System for single cell capture. cDNA synthesis and library preparation were performed according to the protocol supplied by the manufacturer. Sequencing was carried out on an Illumina NextSeq500 sequencer to generate 75 bp paired-end reads.
Project description:Epithelial cells were isolated by FACS from the mammary glands of pubescent (5 week old), estrus adult (10 week old) and diestrus adult (10 week old) female mice. Freshly sorted cells were submitted to a 10X Genomics Chromium System for single cell capture. cDNA synthesis and library preparation were performed according to the protocol supplied by the manufacturer. Sequencing was carried out on an Illumina NextSeq500 sequencer using parameters recommended by 10X Genomics.
Project description:The mammary glands of adult female mice were divided into ductal tissue and terminal end buds (TEBs). Basal and luminal epithelial cells were FACS sorted and nuclei extracted for 75bp paired-end ATAC-seq profiling using an Illumina NextSeq 500 sequencer.
Project description:The mammary glands of adult female mice were divided into ductal tissue and terminal end buds (TEBs). Basal and luminal epithelial cells were FACS sorted and RNA extracted for 75bp paired-end RNA-seq profiling using an Illumina NextSeq 500 sequencer.
Project description:Structure probing coupled with high-throughput sequencing holds the potential to revolutionize our understanding of the role of RNA structure in regulation of gene expression. Despite major technological advances, intrinsic noise and high coverage requirements greatly limit the applicability of these techniques. Here we describe a probabilistic modeling pipeline which accounts for biological variability and biases in the data, yielding statistically interpretable scores for the probability of nucleotide modification transcriptome-wide. We demonstrate on two yeast data sets that our method has greatly increased sensitivity, enabling the identification of modified regions on many more transcripts compared with existing pipelines. It also provides confident predictions at much lower coverage levels than previously reported. Our results show that statistical modeling greatly extends the scope and potential of transcriptome-wide structure probing experiments.
Project description:Phosphoproteomics methods are commonly employed in labs to identify and quantify the sites of phosphorylation on proteins. In recent years, various software tools have been developed that incorporate scores or statistics indicating whether a given phosphosite has been correctly identified, or that estimate the global false localisation rate (FLR) within a given data set for all sites reported. These scores have generally been calibrated using synthetic data sets, and their statistical reliability on real data sets is largely unknown. As a result, the reporting of incorrectly localised phosphosites is a considerable problem in the field, due to inadequate statistical control. In this work, we develop the concept of scoring and ranking modifications on a decoy amino acid, i.e. one that cannot be modified, to allow for independent estimation of global FLR. We test a variety of different amino acids as the decoy, on both synthetic and real data sets, demonstrating that the amino acid selection can make a substantial difference to the estimated global FLR. We conclude that while several different amino acids might be appropriate, the most reliable FLR results were achieved using alanine and leucine as decoys, although we prefer alanine due to the risk of confusion between leucine and isoleucine. We propose that the phosphoproteomics field should adopt the use of a decoy amino acid, so that there is better control of false reporting in the literature and in public databases that redistribute the data.
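The decoy amino acid idea can be sketched as a running estimate over ranked sites. This is an illustrative sketch only, not the authors' published procedure: the function name, the input layout of `(score, residue)` pairs, and the `target_to_decoy_ratio` normalisation factor (intended to correct for differing abundances of target and decoy residues) are all assumptions.

```python
def estimate_global_flr(sites, target_residues="STY", decoy_residue="A",
                        target_to_decoy_ratio=1.0):
    """Estimate a running global FLR from localisations on a decoy residue.

    sites: iterable of (score, residue) pairs, one per reported site.
    target_to_decoy_ratio: hypothetical correction for how much more
    (or less) frequent target residues are than the decoy residue.
    Returns a list of (score, estimated_flr), ordered by descending score.
    """
    ranked = sorted(sites, key=lambda s: s[0], reverse=True)
    targets = 0
    decoys = 0
    results = []
    for score, residue in ranked:
        if residue == decoy_residue:
            decoys += 1
        elif residue in target_residues:
            targets += 1
        else:
            continue  # ignore residues that are neither target nor decoy
        if targets == 0:
            flr = 1.0  # only decoy hits so far
        else:
            # A decoy hit is a known false localisation; scale it to the targets.
            flr = min(1.0, target_to_decoy_ratio * decoys / targets)
        results.append((score, flr))
    return results
```

With five ranked sites on S, T, A, Y and S and a ratio of 1.0, the single alanine hit gives an estimated FLR of 1/4 across the four target sites accepted at the lowest score threshold.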
Project description:The mammary glands of adult female mice were divided into ductal tissue and terminal end buds (TEBs). Basal and luminal epithelial cells were FACS sorted. RNA and nuclei were extracted for RNA-seq and ATAC-seq profiling using an Illumina NextSeq 500 sequencer. This SuperSeries is composed of the SubSeries listed below.
Project description:PRDM9 is a histone methyltransferase expressed in meiotic germ cells that determines the location of genetic recombination hotspots through binding of its allele-specific DNA-binding domain. Here we characterize the genome-wide chromatin modification for two human PRDM9 alleles (A and C) in human cell lines. HEK293 cells were transfected with both alleles and an empty vector control. The resulting chromatin was subjected to H3K4me3 ChIP followed by high-throughput sequencing. We find that the different PRDM9 alleles modified chromatin in largely distinct genomic regions in somatic cells, as determined by the protein's zinc-finger DNA-binding domains. Many of the allele-specific peaks overlap sites of meiotic double-strand breaks found in vivo in human germ cells, suggesting that transient expression of PRDM9 in somatic cells can reflect binding in vivo. PRDM9-dependent H3K4me3 sites were identified by comparing modified chromatin after expression of different human PRDM9 alleles in HEK293 cells.
Project description:Clear cell renal cell carcinoma is the most common type of renal cancer and forms tumors that are richly supplied with blood vessels. Here we examined the expression of genes at different stages of tumor progression to find which of them change significantly with increasing grade.