Project description: Microarray technology provides a powerful tool for defining gene expression profiles of airway epithelium that lend insight into the pathogenesis of human airway disorders. The focus of this study was to establish rigorous quality control parameters to ensure that microarray assessment of the airway epithelium is not confounded by experimental artifact. Samples (total n=223) of trachea, large and small airway epithelium were collected by fiberoptic bronchoscopy of 144 individuals (42 healthy non-smokers, 49 healthy smokers, 11 symptomatic smokers, 22 smokers with lone emphysema with normal spirometry, and 20 smokers with COPD), then processed and hybridized to Affymetrix HG-U133 2.0 Plus microarrays. The pre- and post-chip quality control (QC) criteria established were: (1) RNA quality, assessed by RNA Integrity Number (RIN) ≥7.0 using Agilent 2100 Bioanalyzer software; (2) cRNA transcript integrity, assessed by signal intensity ratio of GAPDH 3' to 5' probe sets ≤3.0; and (3) the multi-chip normalization scaling factor ≤10.0. Of the 223 samples, 213 (95.5%) passed the QC criteria. In a data set of 34 arrays (10 samples failing QC criteria, 24 randomly chosen samples passing QC criteria), correlation coefficients for pairwise comparisons of expression levels for 100 housekeeping genes in which at least one array failed the QC criteria were significantly lower (average Pearson r = 0.90 ± 0.04) and more broadly dispersed than correlation coefficients for pairwise comparisons between any two arrays that passed the QC criteria (average Pearson r = 0.97 ± 0.01). Applying the QC cutoff criteria reduced the inter-array variability, as assessed by the coefficient of variation in the expression levels for 100 housekeeping genes, from 35.7% to 21.7%.
Based on the aberrant housekeeping gene data generated from samples failing the established QC criteria, we propose that the QC criteria outlined in this study can accurately distinguish high-quality from low-quality data and can be used to exclude poor-quality microarray samples before proceeding to higher-order biological analyses and interpretation. Affymetrix arrays were used to assess the quality of gene expression data in trachea, large airway and small airway epithelium obtained by fiberoptic bronchoscopy of 42 healthy non-smokers, 49 healthy smokers, 11 symptomatic smokers, 22 smokers with lone emphysema with normal spirometry, and 20 smokers with COPD.
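
The three cutoffs quoted in this description (RIN ≥7.0, GAPDH 3'/5' ratio ≤3.0, scaling factor ≤10.0) can be applied as a simple per-sample filter. The following is a minimal sketch, not the study's own code: the `ArrayQC` record, its field names, and the function names are hypothetical, and it assumes a sample must meet all three criteria to pass.

```python
from dataclasses import dataclass
from typing import List

# Cutoffs as quoted in the description above.
MIN_RIN = 7.0
MAX_GAPDH_3_TO_5_RATIO = 3.0
MAX_SCALING_FACTOR = 10.0


@dataclass
class ArrayQC:
    # Hypothetical record layout; field names are illustrative only.
    sample_id: str
    rin: float
    gapdh_3_to_5_ratio: float
    scaling_factor: float


def failed_criteria(qc: ArrayQC) -> List[str]:
    """Return the names of the criteria a sample fails (empty list if none)."""
    failures = []
    if qc.rin < MIN_RIN:
        failures.append("RIN")
    if qc.gapdh_3_to_5_ratio > MAX_GAPDH_3_TO_5_RATIO:
        failures.append("GAPDH 3'/5' ratio")
    if qc.scaling_factor > MAX_SCALING_FACTOR:
        failures.append("scaling factor")
    return failures


def passes_qc(qc: ArrayQC) -> bool:
    """Assumption: a sample passes only if it meets all three criteria."""
    return not failed_criteria(qc)
```

Returning the list of failed criteria, rather than only a boolean, makes it possible to record why a given array was excluded.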

Project description: Formalin-fixed, paraffin-embedded (FFPE) tissues have many advantages for identification of risk biomarkers, including wide availability and potential for extended follow-up endpoints. However, RNA derived from archival FFPE samples has limited quality. Here we identified parameters that determine which FFPE samples have the potential for successful RNA extraction, library preparation, and generation of usable RNAseq data. We optimized library preparation protocols designed for use with FFPE samples using seven FFPE and Fresh Frozen replicate pairs, and tested optimized protocols using a study set of 130 FFPE biopsies from women with benign breast disease. Metrics from RNA extraction and preparation procedures were collected and compared with bioinformatics sequencing summary statistics. Finally, a decision tree model was built to learn the relationship between pre-sequencing lab metrics and QC pass/fail status as determined by bioinformatics metrics. Samples that failed bioinformatics QC tended to have low median sample-wise correlation within the cohort (Spearman correlation < 0.75), low number of reads mapped to gene regions (< 25 million), or low number of detectable genes (< 11,400 detected genes with TPM > 4). The median RNA concentration and pre-capture library Qubit values for QC-failed samples were 18.9 ng/ul and 2.08 ng/ul respectively, which were significantly lower than those of QC-passed samples (40.8 ng/ul and 5.82 ng/ul). We built a decision tree model based on input RNA concentration and input library Qubit values, and achieved an F score of 0.848 in predicting QC status (pass/fail) of FFPE samples. We provide a bioinformatics quality control recommendation for FFPE samples from breast tissue by evaluating bioinformatic and sample metrics.
Our results suggest a minimum concentration of 25 ng/ul FFPE-extracted RNA for library preparation and 1.7 ng/ul pre-capture library output to achieve adequate RNA-seq data for downstream bioinformatics analysis.
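
The two suggested minimums (25 ng/ul extracted RNA and 1.7 ng/ul pre-capture library output) can be expressed as a pre-sequencing screen. This is a minimal sketch under stated assumptions: it is a plain two-threshold rule, not the decision tree model the description mentions, and the function and constant names are hypothetical.

```python
# Minimums as suggested in the description above (both in ng/ul).
MIN_RNA_NG_PER_UL = 25.0
MIN_PRECAPTURE_LIBRARY_NG_PER_UL = 1.7


def meets_suggested_minimums(rna_ng_per_ul: float, library_ng_per_ul: float) -> bool:
    """True if both pre-sequencing lab metrics reach the suggested minimums.

    Assumption: both thresholds must be met; the description does not say
    how the two values were combined in the authors' model.
    """
    return (
        rna_ng_per_ul >= MIN_RNA_NG_PER_UL
        and library_ng_per_ul >= MIN_PRECAPTURE_LIBRARY_NG_PER_UL
    )


# Median values reported for QC-failed and QC-passed samples.
print(meets_suggested_minimums(18.9, 2.08))  # failed-sample medians
print(meets_suggested_minimums(40.8, 5.82))  # passed-sample medians
```

With the reported medians, the QC-failed values fall short on RNA concentration (18.9 < 25) while the QC-passed values clear both minimums.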

Dataset Information

MiRTrace quality control of small RNA-Seq data prepared from low-input, degraded and contaminated HEK-293T RNA samples
