Dataset Information

ABSTRACT: Raw data from genome re-sequencing.

PROVIDER: PRJNA778385 | ENA

REPOSITORIES: ENA

Dataset's files

File	Format
SRR16832143_1.fastq.gz	Fastqsanger.gz
SRR16832143_2.fastq.gz	Fastqsanger.gz
SRR16832144_1.fastq.gz	Fastqsanger.gz
SRR16832144_2.fastq.gz	Fastqsanger.gz
SRR16832145_1.fastq.gz	Fastqsanger.gz

Showing files 1 - 5 of 28.
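
The file names listed for this dataset appear to follow a paired-end convention in which a `_1` or `_2` suffix on the run accession marks the mate. As a minimal sketch under that assumption, the following Python groups the listed names by run and reports which runs have both mates. The regular expression and the helper names `group_by_run` and `complete_pairs` are illustrative and are not part of any ENA or OmicsDI tool.

```python
import re
from collections import defaultdict

# File names copied from the listing on this page (only 5 of the 28 files are shown).
LISTED_FILES = [
    "SRR16832143_1.fastq.gz",
    "SRR16832143_2.fastq.gz",
    "SRR16832144_1.fastq.gz",
    "SRR16832144_2.fastq.gz",
    "SRR16832145_1.fastq.gz",
]

# Assumed naming convention: <run accession>_<mate 1 or 2>.fastq.gz
FASTQ_NAME = re.compile(r"^(?P<run>[A-Z]{3}\d+)_(?P<mate>[12])\.fastq\.gz$")


def group_by_run(file_names):
    """Map each run accession to a {mate_number: file_name} dict."""
    runs = defaultdict(dict)
    for name in file_names:
        match = FASTQ_NAME.match(name)
        if match is None:
            continue  # skip names that do not follow the assumed convention
        runs[match["run"]][int(match["mate"])] = name
    return dict(runs)


def complete_pairs(runs):
    """Return the sorted run accessions for which both mate files are present."""
    return sorted(run for run, mates in runs.items() if set(mates) == {1, 2})


if __name__ == "__main__":
    runs = group_by_run(LISTED_FILES)
    print(complete_pairs(runs))
```

In the visible part of the listing, the `_2` file for SRR16832145 falls outside the first five entries, so that run is not reported as complete until the remaining file names are added to the input.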

Similar Datasets

Project description: In this project, we aim to perform a pairwise analysis of the genomes, transcriptomes and proteomes of inbred rats from two different genetic backgrounds: Brown Norway (BN-Lx) and Spontaneously Hypertensive Rats (SHR). First, we re-sequenced the genomes of both BN and SHR rats, followed by RNA-seq and proteomics of their liver tissues. We then appended novel predicted gene models, non-synonymous SNPs and INDELs (derived from genome re-sequencing), as well as transcript variants such as RNA editing and alternative splicing (derived from RNA-seq) that can diversify existing protein sequences, to the ENSEMBL rat FASTA (Build 68) to build an enhanced database.

For the proteomics studies, equal amounts of liver lysate were digested with trypsin, LysC, GluC, AspN and chymotrypsin, and each digest was fractionated by strong cation exchange chromatography. Doubly and triply charged fractions were analyzed on a TripleTOF 5600 with collision-activated dissociation (CAD), while electron-transfer dissociation (ETD) on an LTQ-Orbitrap Velos was applied to fractions containing charge states of three and above.

Data analysis. Peak list generation: For Wiff files generated on the TripleTOF 5600, tandem MS spectra were de-isotoped and charge-deconvoluted, and peak lists were converted to Mascot generic format (MGF) files using AB Sciex Data Converter (version 1.1). For data generated on the LTQ-Orbitrap Velos, Raw files were converted to MGF files using Proteome Discoverer (version 1.3). The non-fragment filter was used to simplify ETD spectra and the Top N filter for the HCD spectra. Three MGF files were generated (one for HCD, one for ETD IT and one for ETD FT). The files with an Orbitrap readout were de-isotoped and charge-deconvoluted.

Database searching: All MGF files were submitted to the Mascot search engine (version 2.3) via Proteome Discoverer version 1.3 (PD 1.3, Thermo Fisher).
The spectra were searched against an in-house database (NGS_COMBINED). One of the five enzymes used (Trypsin/P, LysC/P, Chymotrypsin, GluC-DE and AspN_ambic) was selected for each file, and up to 9 missed cleavages were allowed. Cysteine carbamidomethylation was set as a fixed modification, and oxidation of methionine and acetylation of the N-terminus as variable modifications. Peptide tolerance was initially set to 50 ppm, and the MS/MS tolerance was set to 0.1 Da (TOF readout), 0.02 Da (Orbitrap readout) or 0.5 Da (ion trap readout). All peptide-spectrum matches (PSMs) were evaluated with Percolator for validation, and each PSM was classified by its q value. For protein identification, we set a high-stringency filter of q = 0 (0% FDR). Peak lists that did not yield any peptide matches were exported with PD 1.3 for further analysis.

De novo search with PEAKS: The exported unassigned peak lists were re-analyzed with a second software suite, PEAKS Studio (version 6.0). The identification workflow was as follows. Peak lists were first filtered with a quality value of 0.65, as suggested by the manufacturer, followed by de novo spectra interpretation. In this step, both peptide tolerance and MS/MS tolerance were set as in the Mascot search. To broaden the search space for these unassigned spectra, we additionally set deamidation of asparagine and glutamine, and pyro-Glu formation from glutamic acid and glutamine, as variable modifications, on top of the modifications indicated above. The maximum number of variable PTMs allowed per peptide was set to 3. Finally, de novo interpreted PSMs were submitted to PEAKS DB database matching, this time allowing semi-enzymatic specificity and a maximum of 2 missed cleavages per peptide. The database was set to NGS_COMBINED. FDR was estimated using decoy-fusion.

The genomics and transcriptomics data are already deposited in the respective EBI repositories. Some of these data are derived from an already published manuscript.
For the genomics data (from: "Genetic basis of transcriptome differences between the founder strains of the rat HXB/BXH recombinant inbred panel", Simonis et al., PMID: 22541052), DNA data are in the Sequence Read Archive (SRA): BN-Lx genome: ERP001355, http://www.ebi.ac.uk/ena/data/view/ERP001355; SHR genome: ERP001371; BN reference genome: ERP000510, http://www.ebi.ac.uk/ena/data/view/ERP000510. RNA data are in ArrayExpress: BN-Lx and SHR fragment RNA-seq data: E-MTAB-1029, http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-1029; BN-Lx and SHR paired-end RNA-seq data: to be submitted.
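
The search settings in the project description mix a relative precursor tolerance (50 ppm) with absolute MS/MS tolerances in Da that depend on the readout. To make the arithmetic concrete, here is a minimal Python sketch that converts a ppm tolerance into a Da window and checks whether a mass falls inside it. The function names are hypothetical, and measuring the ppm window relative to the theoretical mass is an assumption; the sketch does not reproduce how Mascot or PEAKS apply tolerances internally.

```python
# MS/MS tolerances in Da, keyed by readout type, as stated in the description.
MSMS_TOLERANCE_DA = {"tof": 0.1, "orbitrap": 0.02, "ion_trap": 0.5}

# Initial peptide (precursor) tolerance from the description.
PRECURSOR_TOLERANCE_PPM = 50.0


def ppm_to_da(mass, ppm):
    """Convert a relative tolerance in ppm into an absolute tolerance in Da."""
    return mass * ppm / 1e6


def precursor_matches(observed, theoretical, ppm=PRECURSOR_TOLERANCE_PPM):
    """True if the observed mass lies within +/- ppm of the theoretical mass.

    Assumption: the window is computed relative to the theoretical mass.
    """
    return abs(observed - theoretical) <= ppm_to_da(theoretical, ppm)


def fragment_matches(observed, theoretical, readout):
    """True if a fragment mass lies within the Da tolerance for the given readout."""
    return abs(observed - theoretical) <= MSMS_TOLERANCE_DA[readout]


if __name__ == "__main__":
    # At a mass of 1000, 50 ppm corresponds to a window of about +/- 0.05 Da.
    print(ppm_to_da(1000.0, PRECURSOR_TOLERANCE_PPM))
    print(precursor_matches(1000.04, 1000.0))
    print(fragment_matches(500.05, 500.0, "orbitrap"))
```

Because a ppm tolerance scales with mass while a Da tolerance does not, the same 0.05 Da deviation can pass the 0.1 Da TOF fragment check yet fail the 0.02 Da Orbitrap check.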

OmicsDI is part of the ELIXIR infrastructure
