Dataset Information

Single-center versus multi-center data sets for molecular prognostic modeling: a simulation study.

ABSTRACT: BACKGROUND: Prognostic models based on high-dimensional omics data generated from clinical patient samples, such as tumor tissues or biopsies, are increasingly used to predict the success of radiotherapy. The model development process requires two independent data sets, one for discovery and one for validation. Each of them may contain samples collected in a single center or samples pooled from multiple centers. Multi-center data tend to be more heterogeneous than single-center data but are less affected by potential site-specific biases. Optimal use of limited data resources for discovery and validation, with respect to the expected success of a study, requires dispassionate, objective decision-making. In this work, we addressed the impact of choosing single-center or multi-center data as discovery and validation data sets, and assessed how this impact depends on three data characteristics: signal strength, number of informative features and sample size. METHODS: We set up a simulation study to quantify the predictive performance of a model trained and validated on different combinations of in silico single-center and multi-center data. The standard bioinformatics analysis workflow of batch correction, feature selection and parameter estimation was emulated. Model quality was assessed with four measures: false discovery rate, prediction error, chance of successful validation (significant correlation between predicted and true outcome in the validation data) and model calibration. RESULTS: In agreement with the literature on the generalizability of signatures, prognostic models fitted to multi-center data consistently outperformed their single-center counterparts when prediction error was the quality criterion of interest. However, for low signal strengths and small sample sizes, single-center discovery sets showed superior performance with respect to false discovery rate and chance of successful validation.
CONCLUSIONS: With regard to decision-making, this simulation study underlines the importance of defining study aims precisely a priori. Minimizing the prediction error requires multi-center discovery data, whereas single-center data are preferable with respect to false discovery rate and chance of successful validation when the expected signal or sample size is low. In contrast, the choice of validation data solely affects the quality of the estimator of the prediction error, which was more precise on multi-center validation data.
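
The workflow named in METHODS (batch correction, feature selection, parameter estimation, then validation) can be sketched roughly as follows. This is a minimal illustration, not the authors' simulation design: the function names, the additive center offset, per-center mean-centering as batch correction, correlation-based top-k selection, ordinary least squares, and all numeric settings are assumptions made for this example, and it is not expected to reproduce the reported results.

```python
import numpy as np

def simulate_center(rng, n, p, beta, shift_sd=1.0, noise_sd=1.0):
    """Draw one center; measured features carry an assumed additive site offset."""
    shift = rng.normal(0.0, shift_sd, size=p)           # hypothetical site-specific bias
    X = rng.normal(0.0, 1.0, size=(n, p))               # underlying feature values
    y = X @ beta + rng.normal(0.0, noise_sd, size=n)    # outcome driven by informative features
    return X + shift, y

def simulate_dataset(rng, n_total, n_centers, p, beta):
    """Pool n_centers centers and apply per-center mean-centering (crude batch correction)."""
    sizes = np.full(n_centers, n_total // n_centers)
    sizes[: n_total % n_centers] += 1                   # spread the remainder over the first centers
    parts = [simulate_center(rng, int(n), p, beta) for n in sizes]
    X = np.vstack([Xc - Xc.mean(axis=0) for Xc, _ in parts])
    y = np.concatenate([yc for _, yc in parts])
    return X, y

def fit_and_validate(X_disc, y_disc, X_val, y_val, k):
    """Select the k features most correlated with the outcome, fit OLS, return validation MSE."""
    Xc = X_disc - X_disc.mean(axis=0)
    yc = y_disc - y_disc.mean()
    corr = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    selected = np.argsort(-np.abs(corr))[:k]
    A = np.column_stack([np.ones(len(y_disc)), X_disc[:, selected]])
    coef, *_ = np.linalg.lstsq(A, y_disc, rcond=None)
    pred = np.column_stack([np.ones(len(y_val)), X_val[:, selected]]) @ coef
    return selected, float(np.mean((pred - y_val) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, n_informative = 200, 5
    beta = np.zeros(p)
    beta[:n_informative] = 0.5                          # assumed signal strength
    X_val, y_val = simulate_dataset(rng, 120, 4, p, beta)
    for label, n_centers in [("single-center", 1), ("multi-center", 4)]:
        X_disc, y_disc = simulate_dataset(rng, 60, n_centers, p, beta)
        selected, mse = fit_and_validate(X_disc, y_disc, X_val, y_val, k=5)
        fdr = float(np.mean(selected >= n_informative))  # share of selected features that are noise
        print(f"{label}: validation MSE={mse:.3f}, FDR={fdr:.2f}")
```

Because the assumed site effect here is purely additive, mean-centering removes it completely; a more realistic sketch would need site effects that survive batch correction before single-center and multi-center discovery sets could be expected to differ.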

SUBMITTER: Samaga D

PROVIDER: S-EPMC7227093 | biostudies-literature | 2020 May

REPOSITORIES: biostudies-literature

Publications

Single-center versus multi-center data sets for molecular prognostic modeling: a simulation study.

Samaga D, Hornung R, Braselmann H, Hess J, Zitzelsberger H, Belka C, Boulesteix AL, Unger K

Radiation oncology (London, England) 20200514 1

PMID: 32410693

Similar Datasets

Project description: Background: Colon cancer is one of the most common threats to human health owing to its high morbidity and mortality. Detecting potential prognosis risk biomarkers (PRBs) is essential for improving therapeutic strategies and drug development. Although integrated multi-omics prognostic analysis has so far been insufficient for colon cancer, it has been reported to be valuable for improving PRB detection in other cancer types. Aim: This study aims to detect potential PRBs for colon adenocarcinoma (COAD) samples from The Cancer Genome Atlas (TCGA) by integrating multi-omics data. Materials and methods: The multi-omics-based prognostic analysis (MPA) model was first constructed to systematically analyze the prognosis of colon cancer based on four types of omics data (gene expression, exon expression, DNA methylation and somatic mutations) from COAD samples. Then, the essential features related to prognosis were functionally annotated through a protein-protein interaction (PPI) network and cancer-related pathways. Moreover, the significance of those essential prognostic features was further confirmed by the target regulation simulation (TRS) model. Finally, an independent testing dataset and a single-cell expression dataset were used to validate the generality and repeatability of the PRBs detected in this study. Results: By integrating the results of MPA modeling with the PPI network, integrated pathway analysis and TRS modeling, essential features with gene symbols such as EPB41, PSMA1, FGFR3, MRAS, LEP, C7orf46, LOC285000, LBP, ZNF35, SLC30A3, LECT2, RNF7, and DYNC1I1 were identified as PRBs with high potential as drug targets for COAD treatment. Validation on the independent testing dataset demonstrated that these PRBs could be applied to distinguish the prognosis of COAD patients. Moreover, the prognosis of patients with different clinical conditions could also be distinguished by these PRBs. Conclusions: The MPA and TRS models constructed in this paper, together with the PPI network and integrated pathway analysis, could not only help detect PRBs as potential therapeutic targets for COAD patients but also serve as a paradigm for the prognostic analysis of other cancers.

Project description: BACKGROUND: Hypofractionated SRS (HF-SRS) may allow for improved local control and a reduced risk of radiation necrosis compared to single-fraction SRS (SF-SRS). However, data comparing these two treatment approaches are limited. The purpose of this study was to compare clinical outcomes between SF-SRS and HF-SRS across our multi-center academic network. METHODS: Patients treated with SF-SRS or HF-SRS for brain metastasis from 2013 to 2018 across 5 radiation oncology centers were retrospectively reviewed. SF-SRS dosing was standardized, whereas HF-SRS dosing regimens were variable. The co-primary endpoints of local control and radiation necrosis were estimated using the Kaplan-Meier method. Multivariate analysis using Cox proportional hazards modeling was performed to evaluate the impact of select independent variables on the outcomes of interest. Propensity score adjustments were used to reduce the effects of confounding variables. To assess dose response for HF-SRS, Biologic Effective Dose (BED) assuming an α/β of 10 (BED10) was used as a surrogate for total dose. RESULTS: One hundred and fifty-six patients with 335 brain metastases treated with SF-SRS (n = 222 lesions) or HF-SRS (n = 113 lesions) were included. Prior whole brain radiation was given in 33% (n = 74) and 34% (n = 38) of lesions treated with SF-SRS and HF-SRS, respectively (p = 0.30). After a median follow-up time of 12 months in each cohort, the adjusted 1-year local control rate was 91% (95% CI 86-96%) for SF-SRS and 85% (95% CI 75-95%) for HF-SRS (p = 0.26), and the adjusted 1-year incidence of radiation necrosis was 10% (95% CI 5-15%) and 7% (95% CI 0.1-14%), respectively (p = 0.73). For lesions > 2 cm, the adjusted 1-year local control was 97% (95% CI 84-100%) for SF-SRS and 64% (95% CI 43-85%) for HF-SRS (p = 0.06). On multivariate analysis, SRS fractionation was not associated with local control, and only size ≤ 2 cm was associated with a decreased risk of developing radiation necrosis (HR 0.21; 95% CI 0.07-0.58, p < 0.01). For HF-SRS, 1-year local control was 100% for lesions treated with a BED10 ≥ 50 compared to 77% (95% CI 65-88%) for lesions that received a BED10 < 50 (p = 0.09). CONCLUSIONS: In this comparison study of dose fractionation for the treatment of brain metastases, there was no difference in local control or radiation necrosis between HF-SRS and SF-SRS. For HF-SRS, a BED10 ≥ 50 may improve local control.
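
For readers unfamiliar with BED10, the quantity is commonly computed with the standard linear-quadratic formula shown here. The abstract does not state the formula it used, and the fractionation schedules in the worked example are illustrative assumptions, not regimens reported by the study.

\[
\mathrm{BED} = n\,d\left(1 + \frac{d}{\alpha/\beta}\right)
\]

Here n is the number of fractions and d the dose per fraction in Gy. With α/β = 10 Gy, a hypothetical 3 × 9 Gy schedule gives BED10 = 27 × (1 + 9/10) = 51.3 Gy, just above the threshold of 50, whereas a hypothetical 5 × 6 Gy schedule gives BED10 = 30 × (1 + 6/10) = 48 Gy, just below it.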
