Project description:Purpose: Integrated genomics approaches have identified at least four distinct biological variants in medulloblastoma: WNT, SHH, group C, and group D. Non-WNT/Non-SHH tumors are associated with metastatic dissemination and an unfavorable prognosis. Additional markers may enhance outcome prediction in Non-WNT/Non-SHH medulloblastomas. Experimental Design: We combined transcriptomic and DNA copy-number analyses for 64 primary medulloblastomas. Bioinformatic tools were applied to discover marker genes of molecular variants. Differentially expressed transcripts were evaluated for prognostic value in the screening cohort. Immunopositivity for FSTL5 was correlated with molecular and prognostic subgroups for 235 non-overlapping medulloblastoma samples on two independent tissue microarrays (TMA). Results: Unsupervised clustering analyses of transcriptome profiles confirmed four distinct molecular variants. Stable subgroup separation was achieved using only the 300 most varying transcripts. Specific distributions of clinical and molecular characteristics were noted for each cluster. Distinct expression patterns of FSTL5 in each molecular subgroup were confirmed by quantitative real-time PCR. Immunopositivity of FSTL5 identified a large cohort of patients (84 of 235 patients; 36%) at high risk for relapse and death. Importantly, over 50% of Non-WNT/Non-SHH tumors displayed FSTL5 negativity, delineating a large patient cohort with an excellent prognosis who would be considered intermediate/high-risk based on current molecular subtyping. Conclusions: Comprehensive analyses of transcriptomic and genetic alterations delineate four distinct variants of medulloblastoma. The addition of FSTL5 immunohistochemistry to existing molecular stratification schemes can effectively identify those Non-WNT/Non-SHH tumors with a poor outcome.
Immunohistochemical staining for FSTL5 could be a high-quality and practical tool for stratification and prognostication in future clinical trials of medulloblastoma. Whole-genome transcriptional profiling of human medulloblastomas. Subgrouping based on mRNA expression profiles. Fresh frozen tumor material was collected during tumor resection. A dye-swap design was used for expression profiling. The reference was a pool of normal cerebellum tissue from 24 donors. Gene expression profiles illustrate distinct expression patterns at diagnosis. This submission represents the gene expression component of the study.
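The record above reports that stable subgroup separation was achieved with only the 300 most varying transcripts. The selection step can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `top_varying`, the dictionary layout, and the use of sample variance as the ranking statistic are all assumptions made here for the example.

```python
from statistics import variance


def top_varying(expr, n=300):
    """Return the IDs of the n transcripts with the largest sample variance.

    expr: dict mapping a transcript ID to its expression values across
          samples (hypothetical layout, chosen for this sketch).
    """
    # Rank transcripts from most to least variable across samples.
    ranked = sorted(expr, key=lambda t: variance(expr[t]), reverse=True)
    return ranked[:n]
```

The returned transcript IDs would then be passed to whichever unsupervised clustering method is in use; the clustering itself is not shown here.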
Project description:Sampling the natural world and built environment underpins much of science, yet systems for managing material samples and associated (meta)data are fragmented across institutional catalogs, practices for identification, and discipline-specific (meta)data standards. The Internet of Samples (iSamples) is a standards-based collaboration to uniquely, consistently, and conveniently identify material samples, record core metadata about them, and link them to other samples, data, and research products. iSamples extends existing resources and best practices in data stewardship to render a cross-domain cyberinfrastructure that enables transdisciplinary research, discovery, and reuse of material samples in 21st century natural science.
Project description:Expression and differential expression analysis of breast cancer patient samples and normal samples from breast reduction operations. Fresh frozen tumor biopsies from early breast cancer cases were collected from 920 patients included in the Oslo Micrometastasis (MicMa) Study -- Oslo I from various hospitals between 1995 and 1998 (Naume et al. "Presence of bone marrow micrometastasis is associated with different recurrence risk within molecular subtypes of breast cancer." Mol Oncol 2007, 1: 160-171; Wiedswang et al. "Detection of isolated tumor cells in bone marrow is an independent prognostic factor in breast cancer." J Clin Oncol 2003, 21: 3469-3478.). Breast tissue samples from breast reduction operations were provided by the Colosseum Clinic, Oslo in co-operation with Akershus University Hospital, Lørenskog and are referred to as normal tissue. Expression and differential expression were assessed using an Agilent custom microarray (244K, nONCOchip). The custom array contains probes for genomic regions that have been found to be differentially expressed (i) throughout cell cycle progression, (ii) in response to the anti-proliferative and pro-apoptotic p53 pathway, and (iii) in response to the anti-apoptotic and pro-proliferative STAT-3 pathway by employing TAS (Kampa et al. "Novel RNAs identified from an in-depth analysis of the transcriptome of human chromosomes 21 and 22". Genome Research, 14:331-42, 2004). In addition, the Agilent custom array (244K) includes probes for genomic regions predicted to contain a conserved secondary structure identified by RNAz (Washietl et al. "Fast and reliable prediction of noncoding RNAs." Proc Natl Acad Sci USA. 102:2454-9, 2005.) or Evofold (Pedersen et al. "Identification and classification of conserved RNA secondary structures in the human genome." PLoS Comput Biol. 2:e33, 2006.), as well as known non-coding RNAs from public databases, and the Agilent mRNA probe set 014850.
We analyzed 5 to 6 arrays each for breast cancer patient samples and normal samples.
Project description:Application of recent techniques to detect current pathogens in archival effluent samples collected and concentrated in 1987 led to the characterization of norovirus GGII.6 Seacroft, which was not recognized in a clinical sample until 1990. Retrospective studies will likely increase our knowledge about waterborne transmission of emerging pathogens.
Project description:In a typical predictive modeling task, we are asked to produce a final predictive model to employ operationally for predictions, as well as an estimate of its out-of-sample predictive performance. Typically, analysts hold out a portion of the available data, called a Test set, to estimate the model predictive performance on unseen (out-of-sample) records, thus "losing these samples to estimation." However, this practice is unacceptable when the total sample size is low. To avoid losing data to estimation, we need a shift in our perspective: we do not estimate the performance of a specific model instance; we estimate the performance of the pipeline that produces the model. This pipeline is applied on all available samples to produce the final model; no samples are lost to estimation. An estimate of its performance is provided by training the same pipeline on subsets of the samples. When multiple pipelines are tried, additional considerations that correct for the "winner's curse" need to be in place.
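The idea in the record above (estimate the performance of the pipeline, not of one model instance, then fit the final model on all samples) can be sketched with plain k-fold cross-validation. This is a minimal sketch under stated assumptions: the nearest-centroid "pipeline", the function names, and the choice of accuracy as the metric are hypothetical stand-ins, and no correction for the "winner's curse" is included because only a single pipeline is tried.

```python
import random
from statistics import mean


def fit_pipeline(X, y):
    """Hypothetical pipeline: a nearest-centroid classifier.

    Returns a predict(x) function; any learning procedure could stand here.
    """
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def predict(x):
        # Assign x to the class whose centroid is closest (squared distance).
        return min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
        )

    return predict


def cross_validated_accuracy(X, y, k=5, seed=0):
    """Estimate pipeline accuracy by training it on k subsets of the samples."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit_pipeline([X[i] for i in train], [y[i] for i in train])
        hits = sum(model(X[i]) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return mean(scores)


def train_and_estimate(X, y, k=5):
    """Fit the final model on ALL samples; report the pipeline's CV estimate."""
    estimate = cross_validated_accuracy(X, y, k)
    final_model = fit_pipeline(X, y)  # no samples are held back
    return final_model, estimate
```

The returned estimate describes the procedure `fit_pipeline`, not the specific returned model. When several candidate pipelines are compared and the best one is kept, this simple estimate becomes optimistic and would need the kind of correction the record mentions.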
Project description:In analytical ultracentrifugation it is often very useful to resuspend samples in situ after sedimentation experiments for further investigation. This can be achieved by manually subjecting the entire sample cell assembly to gentle motion that causes the air bubble in the sample compartment to repeatedly move through the solution and thereby cause convection. Here we describe a cell mixing device that can accomplish the same through axial rotation and slow rocking motion. This cell mixer is low-cost, open-source, and can be easily assembled from readily available components. It can efficiently mix multiple sample cells side-by-side and may be used with various centerpiece designs.
Project description:Background: The use of early morning sputum samples (EMS) to diagnose tuberculosis (TB) can result in treatment delay given the need for the patient to return to the clinic with the EMS, increasing the chance of patients being lost during their diagnostic workup. However, there is little evidence to support the superiority of EMS over spot sputum samples. In this new analysis of the REMoxTB study, we compare the diagnostic accuracy of EMS with spot samples for identifying Mycobacterium tuberculosis pre- and post-treatment. Methods: Patients who were smear positive at screening were enrolled into the study. Paired sputum samples (one EMS and one spot) were collected at each trial visit pre- and post-treatment. Microscopy and culture on solid LJ and liquid MGIT media were performed on all samples; those missing corresponding paired results were excluded from the analyses. Results: Data from 1115 pre- and 2995 post-treatment paired samples from 1931 patients enrolled in the REMoxTB study were analysed. Patients were recruited from South Africa (47%), East Africa (21%), India (20%), Asia (11%), and North America (1%); 70% were male, median age 31 years (IQR 24-41), 139 (7%) co-infected with HIV with a median CD4 cell count of 399 cells/μL (IQR 318-535). Pre-treatment spot samples had a higher yield of positive Ziehl-Neelsen smears (98% vs. 97%, P = 0.02) and LJ cultures (87% vs. 82%, P = 0.006) than EMS, but there was no difference for positivity by MGIT (93% vs. 95%, P = 0.18). Contaminated and false-positive MGIT were found more often with EMS than with spot samples. Surprisingly, pre-treatment EMS had a higher smear grading and shorter time-to-positivity, by 1 day, than spot samples in MGIT culture (4.5 vs. 5.5 days, P < 0.001). There were no differences in time to positivity in pre-treatment LJ culture, or in post-treatment MGIT or LJ cultures.
Comparing EMS and spot samples in those with unfavourable outcomes, there were no differences in smear or culture results, and positive results were not detected earlier in Kaplan-Meier analyses in either EMS or spot samples. Conclusions: Our data do not support the hypothesis that EMS are superior to spot sputum samples in a clinical trial of patients with smear positive pulmonary TB. Observed small differences in mycobacterial burden are of uncertain significance and EMS do not detect post-treatment positives any sooner than spot samples.
Project description:Unsupervised learning techniques, such as clustering and embedding, have become increasingly popular for grouping biomedical samples from high-dimensional biomedical data. Extracting the clinical data or sample metadata shared among biomedical samples of a given biological condition remains a major challenge. Here, we describe a powerful analytical method called Statistical Enrichment Analysis of Samples (SEAS) for interpreting clustered or embedded sample data from omics studies. The method derives its power by focusing on sample sets, i.e., groups of biological samples that were constructed for various purposes, e.g., manual curation of samples sharing specific characteristics or automated clusters generated by embedding sample omic profiles from multi-dimensional omics space. The samples in the sample set share common clinical measurements, which we refer to as "clinotypes," such as age group, gender, treatment status, or survival days. We demonstrate how SEAS yields insights into biological data sets using glioblastoma (GBM) samples. Notably, when analyzing combined data from The Cancer Genome Atlas (TCGA) and patient-derived xenografts (PDX), SEAS allows approximating the different clinical outcomes of radiotherapy-treated PDX samples, which has not been solved by other tools. The result shows that SEAS may support clinical decision-making. The SEAS tool is freely available as a software package at https://aimed-lab.shinyapps.io/SEAS/.
Project description:Contemporary genotyping and sequencing methods do not provide information on linkage phase in diploid organisms. The application of statistical methods to infer and reconstruct linkage phase in samples of diploid sequences is a potentially time- and labor-saving method. The Stephens-Smith-Donnelly (SSD) algorithm is one such method, which incorporates concepts from population genetics theory in a Markov chain-Monte Carlo technique. We applied a modified SSD method, as well as the expectation-maximization and partition-ligation algorithms, to sequence data from eight loci spanning >1 Mb on the human X chromosome. We demonstrate that the accuracy of the modified SSD method is better than that of the other algorithms and is superior in terms of the number of sites that may be processed. Also, we find phase reconstructions by the modified SSD method to be highly accurate over regions with high linkage disequilibrium (LD). If only polymorphisms with a minor allele frequency >0.2 are analyzed and scored according to the fraction of neighbor relations correctly called, reconstructions are 95.2% accurate over entire 100-kb stretches and are 98.6% accurate within blocks of high LD.
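The scoring described in the record above (restrict to polymorphisms with minor allele frequency >0.2, then score by the fraction of neighbor relations correctly called) can be sketched as follows. This is one plausible reading, not the authors' code: here a "neighbor relation" is taken to mean whether two adjacent heterozygous sites carry the same allele on a given haplotype, and the function names and 0/1 allele encoding are assumptions of this sketch.

```python
def minor_allele_frequency(alleles):
    """Minor allele frequency at one site.

    alleles: list of 0/1 allele calls across the sampled chromosomes.
    """
    freq = sum(alleles) / len(alleles)
    return min(freq, 1 - freq)


def neighbor_phase_accuracy(true_hap, inferred_hap, het_sites):
    """Fraction of adjacent heterozygous-site pairs with matching relative phase.

    true_hap, inferred_hap: one haplotype of a diploid individual as 0/1 lists
        (the other haplotype is the complement at heterozygous sites).
    het_sites: sorted indices of that individual's heterozygous sites.
    Returns None when there are fewer than two heterozygous sites.
    """
    pairs = list(zip(het_sites, het_sites[1:]))
    if not pairs:
        return None
    correct = sum(
        (true_hap[i] == true_hap[j]) == (inferred_hap[i] == inferred_hap[j])
        for i, j in pairs
    )
    return correct / len(pairs)
```

Because only relative phase between neighbors is compared, the score is the same whichever of the two inferred haplotypes is passed in.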