Project description:
Background: Anonymization opens up innovative ways of using secondary data without the requirements of the GDPR, as anonymized data no longer affect the privacy of data subjects. Anonymization requires data alteration, and this project aims to compare the ability of such privacy protection methods to maintain the reliability and utility of scientific data for secondary research purposes.
Methods: The French data protection authority (CNIL) defines anonymization as a processing activity that uses methods to make any identification of people, by any means, irreversibly impossible. To address the project's objective, a series of analyses was performed on a cohort and reproduced on four sets of anonymized data for comparison. Four assessment levels were used to evaluate the impact of anonymization: level 1 referred to the replication of statistical outputs, level 2 referred to the accuracy of statistical results, level 3 assessed data alteration (using Hellinger distances), and level 4 assessed privacy risks (using WP29 criteria).
Results: 87 items were produced on the raw cohort data and then reproduced on each of the four anonymized datasets. The overall level 1 replication score ranged from 67% to 100% depending on the anonymization solution. The most difficult analyses to replicate were regression models (sub-score ranging from 78% to 100%) and survival analysis (sub-score ranging from 0% to 100%). The overall level 2 accuracy score ranged from 22% to 79% depending on the anonymization solution. For level 3, three methods had some variables with different probability distributions (Hellinger distance = 1). For level 4, all methods reduced the privacy risk of singling out, with relative risk reductions ranging from 41% to 65%.
Conclusion: None of the anonymization methods reproduced all outputs and results. A trade-off has to be found between the contextual risk and the usefulness of the data for answering the research question.
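A minimal sketch of the level 3 data-alteration check described above, assuming the raw and anonymized versions of a variable are compared as binned probability distributions; the variable, bins, and data below are illustrative, not taken from the project.

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions.

    Returns 0 when the distributions are identical and 1 when their
    supports do not overlap at all (the situation reported above for
    some anonymized variables).
    """
    p = np.asarray(p, dtype=float) / np.sum(p)
    q = np.asarray(q, dtype=float) / np.sum(q)
    return np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)) / np.sqrt(2)

# Illustrative example: compare the distribution of an age variable
# before and after anonymization, using shared histogram bins.
rng = np.random.default_rng(0)
raw_age = rng.normal(55, 12, size=1000)
anon_age = rng.normal(57, 15, size=1000)   # hypothetical altered data
bins = np.linspace(18, 95, 20)
p, _ = np.histogram(raw_age, bins=bins)
q, _ = np.histogram(anon_age, bins=bins)
print(f"Hellinger distance: {hellinger(p, q):.3f}")
```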
Project description:
Background: Surgical risk prediction tools can facilitate shared decision-making and efficient allocation of perioperative resources. Such tools should be externally validated in target populations before implementation.
Methods: Predicted risk of 30-day mortality was retrospectively derived for surgical patients at Royal Perth Hospital from 2014 to 2021 using the Surgical Outcome Risk Tool (SORT) and the related NZRISK (n=44 031, 53 395 operations). In a sub-population (n=31 153), the Physiological and Operative Severity Score for the enumeration of Mortality (POSSUM) and its Portsmouth variant (P-POSSUM) were matched from the Copeland Risk Adjusted Barometer (C2-Ai, Cambridge, UK). The primary outcome was risk score discrimination of 30-day mortality, evaluated by the area under the receiver operating characteristic curve (AUROC). Calibration plots and outcomes according to risk decile and time were also explored.
Results: All four risk scores showed high discrimination (AUROC) for 30-day mortality (SORT=0.922, NZRISK=0.909, P-POSSUM=0.893, POSSUM=0.881) but consistently over-predicted risk. SORT exhibited the best discrimination and calibration. Thresholds denoting the highest and second-highest deciles of SORT risk (>3.92% and 1.52-3.92%) captured the majority of deaths (76% and 13%, respectively) and hospital-acquired complications. Year-on-year SORT calibration performance drifted towards over-prediction, reflecting a decrease in 30-day mortality over time despite an increase in the surgical population risk.
Conclusions: SORT was the best performing risk score in predicting 30-day mortality after surgery. Categorising patients based on SORT into low, medium (80-90th percentile), and high risk (90-100th percentile) might guide future allocation of perioperative resources. No tools were sufficiently calibrated to support shared decision-making based on absolute predictions of risk.
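A minimal sketch of how discrimination (AUROC) and decile-based calibration of a predicted 30-day mortality risk could be checked; the risk scores and outcomes below are simulated stand-ins, not SORT predictions or study data.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)

# Simulated stand-ins for a predicted risk score and the observed outcome.
predicted_risk = rng.beta(1, 40, size=10_000)            # skewed, mostly low risk
died_30d = rng.random(10_000) < predicted_risk * 0.6     # over-prediction built in

# Discrimination: area under the receiver operating characteristic curve.
auroc = roc_auc_score(died_30d, predicted_risk)

# Calibration: observed vs. mean predicted mortality per risk decile.
df = pd.DataFrame({"risk": predicted_risk, "died": died_30d})
df["decile"] = pd.qcut(df["risk"], 10, labels=False)
calibration = df.groupby("decile").agg(mean_predicted=("risk", "mean"),
                                       observed_rate=("died", "mean"))
print(f"AUROC: {auroc:.3f}")
print(calibration)
```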
Project description:
Motivation: Validation and reproducibility of results is a central and pressing issue in genomics. Several recent embarrassing incidents involving the irreproducibility of high-profile studies have illustrated the importance of this issue and the need for rigorous methods for assessing reproducibility.
Results: Here, we describe an existing statistical model that is very well suited to this problem. We explain its utility for assessing the reproducibility of validation experiments and apply it to a genome-scale study of adenosine deaminase acting on RNA (ADAR)-mediated RNA editing in Drosophila. We also introduce a statistical method for planning validation experiments that obtains the tightest reproducibility confidence limits and, for a fixed total number of experiments, returns the optimal number of replicates for the study.
Availability: Downloadable software and a web service, for both the analysis of data from a reproducibility study and the optimal design of such studies, are provided at http://ccmbweb.ccv.brown.edu/reproducibility.html.
Project description: Arctic ecosystems have experienced, and are projected to experience, continued large increases in temperature and declines in sea ice cover. It has been hypothesized that small changes in ecosystem drivers can fundamentally alter ecosystem functioning, and that this might be particularly pronounced for Arctic ecosystems. We present a suite of simple statistical analyses to identify changes in the statistical properties of data, emphasizing that changes in the standard error should be considered in addition to changes in mean properties. The methods are exemplified using sea ice extent and suggest that the loss rate of sea ice accelerated by a factor of ~5 in 1996, as reported in other studies, but that increases in random fluctuations, as an early warning signal, were observed already in 1990. We recommend employing the proposed methods more systematically for analyzing tipping points to document effects of climate change in the Arctic.
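The emphasis on changes in the standard error, not only in the mean, can be illustrated with a rolling-window summary of a time series; the sea-ice numbers below are simulated for illustration, not the observational record, and the change years are only placeholders echoing the ones quoted above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
years = np.arange(1979, 2021)

# Simulated September sea-ice extent: slow decline, then a steeper loss rate,
# with larger year-to-year fluctuations appearing in the later period.
trend = np.where(years < 1996, 7.5 - 0.02 * (years - 1979),
                 7.2 - 0.10 * (years - 1996))
noise_sd = np.where(years < 1990, 0.15, 0.35)
extent = trend + rng.normal(0, noise_sd)

series = pd.Series(extent, index=years)
window = 10
rolling_mean = series.rolling(window).mean()
rolling_sd = series.rolling(window).std()   # a rising SD is the early-warning signal

summary = pd.DataFrame({"extent": series, "rolling_mean": rolling_mean,
                        "rolling_sd": rolling_sd})
print(summary.tail(10))
```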
Project description: Differential expression analysis is one of the most common types of analyses performed on various biological data (e.g. RNA-seq or mass spectrometry proteomics). It is the process that detects features, such as genes or proteins, showing statistically significant differences between the sample groups under comparison. A major challenge in the analysis is the choice of an appropriate test statistic, as different statistics have been shown to perform well in different datasets. To this end, the reproducibility-optimized test statistic (ROTS) adjusts a modified t-statistic according to the inherent properties of the data and provides a ranking of the features based on their statistical evidence for differential expression between two groups. ROTS has already been successfully applied in a range of different studies from transcriptomics to proteomics, showing competitive performance against other state-of-the-art methods. To promote its widespread use, we introduce here a Bioconductor R package for performing ROTS analysis conveniently on different types of omics data. To illustrate the benefits of ROTS in various applications, we present three case studies involving proteomics and RNA-seq data from public repositories, including both bulk and single-cell data. The package is freely available from Bioconductor (https://www.bioconductor.org/packages/ROTS).
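ROTS itself is distributed as an R/Bioconductor package; the snippet below is only a schematic Python illustration of the general idea of ranking features by a regularized (modified) t-statistic on toy data, and does not reproduce the ROTS algorithm or its reproducibility optimization.

```python
import numpy as np

def regularized_t(group1, group2, s0=0.1):
    """Rank features by |mean difference| / (s0 + pooled standard error).

    `s0` is a stabilizing constant added to the denominator so that
    low-variance features do not dominate the ranking; ROTS instead
    learns the denominator weights from the data, which is not shown here.
    """
    m1, m2 = group1.mean(axis=1), group2.mean(axis=1)
    v1, v2 = group1.var(axis=1, ddof=1), group2.var(axis=1, ddof=1)
    se = np.sqrt(v1 / group1.shape[1] + v2 / group2.shape[1])
    return np.abs(m1 - m2) / (s0 + se)

# Toy expression matrix: 1000 features x (5 + 5) samples.
rng = np.random.default_rng(3)
g1 = rng.normal(0, 1, size=(1000, 5))
g2 = rng.normal(0, 1, size=(1000, 5))
g2[:50] += 1.5                          # 50 truly differential features
scores = regularized_t(g1, g2)
top = np.argsort(scores)[::-1][:10]     # highest-ranked features
print(top)
```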
Project description: This article presents data on the external validity of an alcohol administration study of sexual decision-making in men who have sex with men (MSM) ages 21-50. Men (N = 135) randomized to alcohol (blood alcohol concentration [BAC] = .075%) or water control conditions reported intentions to engage in condomless anal intercourse (CAI) in response to video vignettes. Following the experiment, participants provided 6 weeks of experience sampling method (ESM) data assessing intoxication, sexual arousal, partner relationship, and sexual behavior. Laboratory CAI intentions were hypothesized to predict future CAI behavior, and associations were hypothesized to be conditional upon sexual arousal and intoxication contextual factors as well as laboratory beverage condition. The hypotheses were partially supported. CAI intentions were correlated with participants' proportions of days engaging in CAI (r = .29). A multilevel analysis indicated that, on average, CAI intention predicted an increased probability of CAI versus anal intercourse with a condom (relative risk ratio [RRR] = 1.43). There was mixed evidence of CAI intention effects being conditional upon laboratory condition as well as arousal and intoxication contextual factors. Graphs of conditional marginal effects identified regions of significance. Effects of CAI intention for men in the alcohol condition on the CAI versus No Sex contrast were significant when sexual arousal was elevated. CAI intentions for men in the water control condition predicted a higher probability of CAI versus anal intercourse with a condom when intoxication was moderately elevated and/or arousal was moderately low. The results support the external validity of alcohol administration experiments of sexual decision-making among MSM and, reciprocally, provide support for the validity of ESM assessment of sexual behavior.
Project description:
Aims: Physiologically based pharmacokinetic (PBPK) models have previously been developed for betamethasone and buprenorphine in pregnant women. The goal of this work was to replicate and reassess these models using data from recently completed studies.
Methods: Betamethasone and buprenorphine PBPK models were developed in Simcyp V19 based on prior publications using V17 and V15. The ability to replicate the models was verified by comparing predictions in V19 to those previously published. Once replication was verified, the models were reassessed by comparing predictions to observed data from additional studies in pregnant women. Model performance was assessed by visual inspection of concentration vs. time profiles and comparison of pharmacokinetic parameters. Models were deemed reproducible if parameter estimates were within 10% of previously reported values. External validations were considered acceptable if the predicted area under the concentration-time curve (AUC) and peak plasma concentration fell within 2-fold of the observed values.
Results: The betamethasone model was successfully replicated using Simcyp V19, with ratios of reported (V17) to reproduced (V19) peak plasma concentration of 0.98-1.04 and AUC of 0.95-1.07. The model-predicted AUC ratios ranged from 0.98 to 1.79 compared to external data. The previously published buprenorphine PBPK model was not reproducible, as we predicted an intravenous clearance of 70% of that reported previously (both in Simcyp V15).
Conclusion: While high interstudy variability was observed in the newly available clinical data, the PBPK model sufficiently predicted changes in betamethasone exposure across gestation. Model reproducibility and reassessment with external data are important for the advancement of the discipline. PBPK modelling publications should contain sufficient detail and clarity to enable reproducibility.
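The acceptance criteria quoted above (parameter estimates within 10% of published values; predicted AUC and peak plasma concentration within 2-fold of observed) reduce to simple ratio checks, sketched below; the numerical values are placeholders, not the study's data.

```python
import numpy as np

def within_fold(predicted, observed, fold=2.0):
    """True if predicted/observed falls within [1/fold, fold]."""
    ratio = predicted / observed
    return (ratio >= 1.0 / fold) & (ratio <= fold)

def within_percent(reproduced, reported, tol=0.10):
    """True if the reproduced parameter is within `tol` of the reported one."""
    return np.abs(reproduced - reported) / np.abs(reported) <= tol

# Placeholder values for predicted vs. observed AUC in an external cohort.
predicted_auc = np.array([120.0, 95.0, 310.0])
observed_auc = np.array([100.0, 200.0, 250.0])
print(within_fold(predicted_auc, observed_auc))   # [ True False  True]
print(within_percent(0.98, 1.04))                 # True: within 10% of reported
```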
Project description: Identifying reproducible and generalizable brain-phenotype associations is a central goal of neuroimaging. Consistent with this goal, prediction frameworks evaluate brain-phenotype models in unseen data. Most prediction studies train and evaluate a model in the same dataset. However, external validation, or the evaluation of a model in an external dataset, provides a better assessment of robustness and generalizability. Despite the promise of external validation and calls for its usage, the statistical power of such studies has yet to be investigated. In this work, we ran over 60 million simulations across several datasets, phenotypes, and sample sizes to better understand how the sizes of the training and external datasets affect statistical power. We found that prior external validation studies used sample sizes prone to low power, which may lead to false negatives and effect size inflation. Furthermore, increases in the external sample size led to increased simulated power directly following theoretical power curves, whereas changes in the training dataset size offset the simulated power curves. Finally, we compared the performance of a model within a dataset to the external performance. The within-dataset performance was typically within r=0.2 of the cross-dataset performance, which could help decide how to power future external validation studies. Overall, our results illustrate the importance of considering the sample sizes of both the training and external datasets when performing external validation.
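A minimal sketch of the kind of power simulation described above: repeatedly draw an external-validation sample with a given true brain-phenotype correlation and count how often the observed association reaches significance. The effect size, sample sizes, and one-sided test are illustrative assumptions, not the study's simulation design.

```python
import numpy as np
from scipy import stats

def external_validation_power(true_r, n_external, n_sim=2000, alpha=0.05, seed=0):
    """Simulated power to detect a correlation of `true_r` in an external
    sample of size `n_external` (one-sided test of r > 0)."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sim):
        # Draw predicted and observed phenotype with the target correlation.
        cov = [[1.0, true_r], [true_r, 1.0]]
        x, y = rng.multivariate_normal([0, 0], cov, size=n_external).T
        r, p = stats.pearsonr(x, y)
        hits += (r > 0) and (p / 2 < alpha)   # convert two-sided p to one-sided
    return hits / n_sim

for n in (50, 100, 200, 400):
    print(n, external_validation_power(true_r=0.2, n_external=n))
```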
Project description:
Aims: The VALIDATE-SWEDEHEART trial was a registry-based randomized trial comparing bivalirudin and heparin in patients with acute myocardial infarction undergoing percutaneous coronary intervention. It showed no differences in mortality at 30 or 180 days. This study examines how well the trial population results may generalize to the population of all screened patients who fulfilled the inclusion criteria, with regard to mortality at 30 and 180 days.
Methods: The standardized difference in the mean propensity score for trial inclusion between the trial population and the screened, not-enrolled patients who fulfilled the inclusion criteria was calculated as a metric of similarity. Propensity scores were then used in an inverse-probability weighted Cox regression analysis using the trial population only, to estimate the difference in mortality as it would have been had the trial included all screened patients who fulfilled the inclusion criteria. Patients who were very likely to be included were weighted down and those who had a very low probability of being in the trial were weighted up.
Results: The propensity score difference was 0.61. In the inverse-probability weighted analysis, there were no significant differences in mortality between bivalirudin and heparin at 30 days (hazard ratio 1.11, 95% confidence interval 0.73-1.68) or at 180 days (hazard ratio 0.98, 95% confidence interval 0.70-1.36).
Conclusion: The propensity score difference demonstrated that the screened, not-enrolled patients who fulfilled the inclusion criteria and the trial population were not similar. The inverse-probability weighted analysis showed no significant differences in mortality. From this, we conclude that the VALIDATE results may be generalized to the screened, not-enrolled patients who fulfilled the inclusion criteria.
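A minimal sketch of the weighting idea on simulated data: fit a propensity model for trial inclusion among all screened, eligible patients, summarize similarity via the standardized difference in mean propensity score, and reweight enrolled patients by the inverse probability of inclusion. The covariates, model, and numbers are illustrative assumptions; the weighted Cox regression itself is not shown.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n = 5000

# Simulated screened population with fulfilled inclusion criteria.
df = pd.DataFrame({
    "age": rng.normal(68, 10, n),
    "diabetes": rng.random(n) < 0.25,
})
logit = -6 + 0.08 * df["age"] - 0.7 * df["diabetes"]
df["enrolled"] = rng.random(n) < 1 / (1 + np.exp(-logit))

# Propensity score for trial inclusion.
X = df[["age", "diabetes"]].astype(float)
ps_model = LogisticRegression().fit(X, df["enrolled"])
df["ps"] = ps_model.predict_proba(X)[:, 1]

# Standardized difference in mean propensity score (similarity metric).
enr, non = df[df["enrolled"]], df[~df["enrolled"]]
pooled_sd = np.sqrt((enr["ps"].var(ddof=1) + non["ps"].var(ddof=1)) / 2)
std_diff = (enr["ps"].mean() - non["ps"].mean()) / pooled_sd
print(f"standardized PS difference: {std_diff:.2f}")

# Inverse-probability weights for the enrolled patients: patients unlikely to
# be enrolled are weighted up, likely ones are weighted down. These weights
# would then enter a weighted Cox regression of mortality.
weights = 1.0 / enr["ps"]
print(weights.describe())
```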