Project description:The locations of mammalian recombination hotspots are determined by PRDM9, a zinc finger histone methyltransferase that locally trimethylates histone H3 at residues K4 and K36. We previously reported two hypomorphic catalytic mutations, Prdm9-EP and Prdm9-EK, with different phenotypic effects. Prdm9-EP, but not Prdm9-EK, is compatible with female sub-fertility, while both mutations phenocopy the Prdm9-null condition in males. Here we directly compare and contrast the enzymatic effects of the two mutations in vitro and in vivo. We previously performed two biological H3K4me3 ChIP-seq replicates in spermatocytes isolated from Prdm9-EP homozygous males (GSE144144; SRX8588740 and SRX8588741), and re-processed previously reported H3K4me3 ChIP-seq data from spermatocytes isolated from wild-type B6 males (GSE52628; SRX381465 and SRX381466). We used those raw and processed files for this study (GSE144144). We also previously performed one biological H3K4me3 replicate in spermatocytes isolated from Prdm9-EP homozygous males (GSE112110; SRX4136625). We report an additional replicate here, and merged the two replicates for analysis; raw and processed files are reported here. We also performed ChIP-seq for H3K36me3 in both Prdm9-EP and Prdm9-EK homozygous spermatocytes. Raw and processed files are available here. For comparison, we re-mapped and re-analyzed H3K36me3 ChIP-seq data we previously reported from wild-type B6 spermatocytes (GSE76416; SRX1508234); processed files are available here.
Project description:The GSE114044 data was re-mapped and re-analyzed. Duplicated sample records were created for the convenient retrieval of the complete raw data from SRA.
Project description:Prokaryotic genome annotation is highly dependent on automated methods, as manual curation cannot keep up with the exponential growth of sequenced genomes. Current automated techniques depend heavily on sequence context and often underestimate the complexity of the proteome. We developed REPARATION (RibosomeE Profiling Assisted (Re-)AnnotaTION), a de novo algorithm that takes advantage of experimental evidence from ribosome profiling (Ribo-seq) to delineate translated open reading frames (ORFs) in bacteria, independent of genome annotation. Ribo-seq is a next-generation sequencing technique that provides a genome-wide snapshot of the positions of translating ribosomes along an mRNA at the time of the experiment. REPARATION evaluates all possible ORFs in the genome and estimates minimum thresholds to screen for spurious ORFs based on a growth curve model. We applied REPARATION to three annotated bacterial species to obtain a more comprehensive mapping of their translation landscape in support of experimental data. In all cases, we identified hundreds of novel ORFs, including variants of previously annotated ORFs and novel small ORFs (<71 codons). Our predictions were supported by matching mass spectrometry (MS) proteomics data and sequence conservation analysis. REPARATION is unique in that it makes use of experimental Ribo-seq data to perform de novo ORF delineation in bacterial genomes, and thus can identify putative coding ORFs irrespective of the sequence context of the reading frame.
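The step of evaluating all possible ORFs can be illustrated with a minimal sketch. This is not REPARATION's implementation: the start codon set (ATG, GTG, TTG), the function names, and the coordinate convention are assumptions made for illustration, and the Ribo-seq scoring and thresholding steps are not shown.

```python
# Illustrative sketch only; not the REPARATION code.
STARTS = {"ATG", "GTG", "TTG"}  # assumed bacterial start codons
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_on_strand(seq, min_codons=1):
    """Yield (start, end, codons) for every start codon with an in-frame stop.

    Coordinates are 0-based and end-exclusive; the span includes the stop
    codon, while `codons` counts coding codons only (stop excluded).
    """
    n = len(seq)
    for frame in range(3):
        open_starts = []  # start codons seen since the last stop in this frame
        for i in range(frame, n - 2, 3):
            codon = seq[i:i + 3]
            if codon in STARTS:
                open_starts.append(i)
            elif codon in STOPS:
                for s in open_starts:
                    codons = (i - s) // 3
                    if codons >= min_codons:
                        yield (s, i + 3, codons)
                open_starts = []


def enumerate_orfs(seq, min_codons=1):
    """List candidate ORFs on both strands as (strand, start, end, codons)."""
    seq = seq.upper()
    n = len(seq)
    found = [("+", s, e, c) for s, e, c in orfs_on_strand(seq, min_codons)]
    for s, e, c in orfs_on_strand(reverse_complement(seq), min_codons):
        # Map reverse-strand coordinates back onto the forward strand.
        found.append(("-", n - e, n - s, c))
    return found


if __name__ == "__main__":
    for orf in enumerate_orfs("ATGAAATAG"):
        print(orf)
```

Every start codon upstream of a stop is reported, so nested candidates that share a stop codon are all kept; choosing among them is left to the downstream evidence-based filtering described above.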
Project description:Observational, Multicenter, Post-market, Minimal risk, Prospective data collection of PillCam SB3 videos (including PillCam reports) and raw data files and optional collection of Enteroscopy reports
Project description:In the present study, RNA-seq was used to compare the expression profiles of lncRNAs in goat endometrium samples at gestational day 5 (pre-receptive endometrium, PE) and day 15 (receptive endometrium, RE). This yielded 18 gigabases (Gb) of sequence, representing approximately seven times the size of the genome (2.66 Gb). A total of 120 million raw reads were produced on the Illumina HiSeq 2500 platform. After discarding adaptor sequences and low-quality sequences, 90 to 97 million clean reads per sample were obtained, and the percentage of clean reads among raw tags in each library ranged from 75.79% to 81.03%. A total of 668 lncRNAs differed significantly in expression level (P < 0.05) between the PE and RE libraries; 98.35% of these differentially expressed lncRNAs (DELs) were mapped to “u” (unknown, intergenic transcript). Our results included 76,844 lncRNAs that corresponded to 42,933 protein-coding genes within a range of 1-100 kb. Notably, 783 target genes of the 200 DELs were annotated to 153 GO terms meeting our designated criterion of P < 0.05, and KEGG pathway annotation showed that 242 target genes of the DELs were annotated to 146 KEGG pathways.
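The "approximately seven times" figure follows from dividing the sequence yield by the genome size. A short check of that arithmetic, with a function name chosen here for illustration:

```python
def fold_coverage(total_bases_gb, genome_size_gb):
    """Return sequence yield expressed as a multiple of the genome size."""
    return total_bases_gb / genome_size_gb


if __name__ == "__main__":
    # Values quoted in the description: 18 Gb of sequence, 2.66 Gb genome.
    print(f"{fold_coverage(18.0, 2.66):.2f}x")
```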
Project description:In this project, we aim to analyze pair-wise the genomes, transcriptomes and proteomes of inbred rats originating from two different genetic backgrounds. These two strains are Brown Norway (BN-Lx) and Spontaneously Hypertensive Rats (SHR). First, we re-sequenced the genomes of both BN and SHR rats, followed by RNA-seq and proteomics of their liver tissues. We then appended novel predicted gene models, non-synonymous SNPs and INDELs (derived from genome re-sequencing), as well as transcript variants such as RNA editing and alternative splicing (derived from RNA-seq), all of which can diversify existing protein sequences, to the ENSEMBL rat FASTA (Build 68) to build an enhanced database. For proteomics studies, equal amounts of liver lysate were digested with trypsin, LysC, GluC, AspN and chymotrypsin and were individually fractionated with strong cation exchange chromatography. Doubly- and triply-charged fractions were analyzed with a TripleTOF 5600 with collision-activated dissociation (CAD), while electron-transfer dissociation (ETD) was applied to fractions containing peptides with three or more charges using an LTQ-Orbitrap Velos. Data analysis: Peak list generation: For Wiff files generated from the TripleTOF 5600, tandem MS spectra were de-isotoped and charge-deconvoluted, and peak lists were converted to Mascot generic format (MGF) files using AB Sciex Data Converter (version 1.1). For data generated from the LTQ-Orbitrap Velos, Raw files were converted to MGF files using Proteome Discoverer (version 1.3). The non-fragment filter was used to simplify ETD spectra and the Top N filter for the HCD spectra. Three MGF files were generated (one for HCD, one for ETD IT and one for ETD FT). The files with an orbitrap readout were de-isotoped and charge-deconvoluted. Database searching: All MGF files were queried with the Mascot search engine (version 2.3) via Proteome Discoverer version 1.3 (PD 1.3, Thermo Fisher) for submission.
The spectra were searched against an in-house database (NGS_COMBINED). One of the five different enzymes used (Trypsin/P, LysC/P, Chymotrypsin, GluC-DE and AspN_ambic) was selected for each file and up to 9 missed cleavages were allowed. Cysteine carbamidomethylation was set as a fixed modification, and oxidation of methionine and acetylation of the N-terminus as variable modifications. Peptide tolerance was initially set to 50 ppm and the MS/MS tolerance was set to 0.1 Da (TOF readout), 0.02 Da (orbitrap readout) and 0.5 Da (ion trap readout). All peptide-spectrum matches (PSMs) were evaluated with Percolator for validation. We classified each PSM based on its q value. For protein identification, we set a high-stringency filter of q = 0 (0% FDR). Peak lists that did not yield any peptide matches were exported with PD 1.3 for further analysis. De novo search with PEAKS: The exported unassigned peak lists were re-analyzed with another software suite, PEAKS Studio (version 6.0). The identification workflow was as follows. Peak lists were first filtered with a quality value of 0.65, as suggested by the manufacturer, followed by de novo spectra interpretation. In this step, both peptide tolerance and MS/MS tolerance were set as in the Mascot search. To broaden the search space for these unassigned spectra, we additionally set deamidation of asparagine and glutamine, and pyro-glu from glutamic acid and glutamine, as variable modifications, on top of the other modifications indicated above. The maximum number of variable PTMs allowed per peptide was set to 3. Finally, de novo interpreted PSMs were submitted to PEAKS DB database matching, this time allowing semi-enzymatic specificity and a maximum of 2 missed cleavages per peptide. The database used was NGS_COMBINED. FDR was estimated using decoy-fusion. The genomics and transcriptomics data are already deposited in the respective EBI repositories. Some of these data are derived from an already published manuscript.
The genomics data are from "Genetic basis of transcriptome differences between the founder strains of the rat HXB/BXH recombinant inbred panel" by Simonis et al. (PMID:22541052). DNA data in the Sequence Read Archive (SRA): BN-Lx genome: ERP001355 http://www.ebi.ac.uk/ena/data/view/ERP001355, SHR genome: ERP001371, BN reference genome: ERP000510 http://www.ebi.ac.uk/ena/data/view/ERP000510. RNA data in ArrayExpress: BN-Lx and SHR fragment RNA-seq data: E-MTAB-1029 http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-1029, BN-Lx and SHR paired-end RNA-seq data: to be submitted.
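The search settings described for this dataset combine a relative precursor tolerance (50 ppm) with a q-value cut-off (q = 0). The sketch below shows how such a ppm window and q-value filter might be computed. It does not reproduce Mascot, Percolator or Proteome Discoverer behaviour; the function names and the dictionary layout of a PSM are hypothetical.

```python
def ppm_window(theoretical_mz, tol_ppm):
    """Return the (low, high) m/z bounds for a tolerance given in ppm."""
    delta = theoretical_mz * tol_ppm / 1e6
    return theoretical_mz - delta, theoretical_mz + delta


def within_ppm(observed_mz, theoretical_mz, tol_ppm):
    """True if the observed m/z lies within tol_ppm of the theoretical m/z."""
    error_ppm = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return error_ppm <= tol_ppm


def filter_psms(psms, max_q=0.0):
    """Keep PSMs whose q value does not exceed max_q.

    Each PSM is assumed to be a dict with a 'q_value' key (hypothetical layout).
    """
    return [psm for psm in psms if psm["q_value"] <= max_q]


if __name__ == "__main__":
    low, high = ppm_window(1000.0, 50)
    print(f"50 ppm at m/z 1000: {low:.2f} to {high:.2f}")
    example = [
        {"peptide": "PEPTIDEA", "q_value": 0.0},   # made-up example entries
        {"peptide": "PEPTIDEB", "q_value": 0.01},
    ]
    print(filter_psms(example))
```

Because a ppm tolerance scales with m/z, the same 50 ppm setting corresponds to a wider absolute window for heavier precursors, unlike the fixed Da tolerances used for the MS/MS fragments.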