Project description: A recent study of the replicability of key psychological findings is a major contribution toward understanding the human side of the scientific process. Despite the careful and nuanced analysis reported, the simple narrative disseminated by the mass, social, and scientific media was that the original results were replicated in only 36% of the studies. In the current study, however, we showed that 77% of the reported replication effect sizes were within a 95% prediction interval calculated from the original effect size. Our analysis suggests two critical issues in understanding replication of psychological studies. First, researchers' intuitive expectations for what a replication should show do not always match statistical estimates of replication. Second, when the results of original studies are very imprecise, they create wide prediction intervals, and hence a broad range of replication effects that are consistent with the original estimates. This may lead to effects that replicate successfully, in that replication results are consistent with statistical expectations, but do not provide much information about the size (or existence) of the true effect. In this light, the results of the Reproducibility Project: Psychology can be viewed as statistically consistent with what one might expect when performing a large-scale replication experiment.
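A minimal sketch of the prediction-interval check described above, assuming correlation effect sizes and the standard Fisher z-transformation; the input values are hypothetical, not data from the project:

```python
# Sketch: 95% prediction interval for a replication correlation, given the
# original effect. Assumes Pearson r effect sizes and Fisher z; r_orig, n_orig,
# and n_rep below are hypothetical illustration values.
import math

def replication_prediction_interval(r_orig, n_orig, n_rep, z_crit=1.96):
    """Return the 95% prediction interval (on the r scale) for a replication."""
    z_orig = math.atanh(r_orig)                               # Fisher z of original
    se = math.sqrt(1.0 / (n_orig - 3) + 1.0 / (n_rep - 3))    # combined sampling error
    lo, hi = z_orig - z_crit * se, z_orig + z_crit * se
    return math.tanh(lo), math.tanh(hi)                       # back to r scale

# A small, imprecise original study produces a wide interval, as the text notes
lo, hi = replication_prediction_interval(r_orig=0.30, n_orig=40, n_rep=80)
print(f"replication r consistent with the original if within [{lo:.2f}, {hi:.2f}]")
```

Note how imprecise originals (small n_orig) widen the interval, which is exactly why a "successful" replication can still be uninformative about the true effect.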
Project description: We measure how accurately replication of experimental results can be predicted by black-box statistical models. With data from four large-scale replication projects in experimental psychology and economics, and techniques from machine learning, we train predictive models and study which variables drive predictable replication. The models predict binary replication with a cross-validated accuracy of 70% (AUC of 0.77) and relative effect sizes with a Spearman ρ of 0.38. The accuracy level is similar to market-aggregated beliefs of peer scientists [1, 2]. The predictive power is validated in a pre-registered out-of-sample test of the outcome of [3], where 71% (AUC of 0.73) of replications are predicted correctly and effect size correlations amount to ρ = 0.25. Basic features such as the sample and effect sizes in original papers, and whether reported effects are single-variable main effects or two-variable interactions, are predictive of successful replication. The models presented in this paper are simple tools for producing cheap, prognostic replicability metrics. They could be useful in institutionalizing the evaluation of new findings and in guiding resources to those direct replications that are likely to be most informative.
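A hedged sketch of the kind of black-box model described above: a cross-validated classifier predicting binary replication from basic paper features. The feature set and the synthetic data are illustrative assumptions, not the authors' dataset or exact pipeline:

```python
# Sketch: predict replication success from sample size, effect size, and an
# interaction-effect indicator; all data are synthetic stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 400
log_n_orig = rng.normal(4.5, 0.8, n)     # log sample size of the original study
effect_size = rng.normal(0.4, 0.2, n)    # reported original effect size
is_interaction = rng.integers(0, 2, n)   # two-variable interaction effect?

# Assumed generating rule: larger samples/effects and main effects replicate more
logit = -2.0 + 0.5 * log_n_orig + 2.0 * effect_size - 1.0 * is_interaction
replicated = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([log_n_orig, effect_size, is_interaction])
model = make_pipeline(StandardScaler(), LogisticRegression())
auc = cross_val_score(model, X, replicated, cv=5, scoring="roc_auc")
print(f"cross-validated AUC: {auc.mean():.2f}")
```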
Project description: There is broad agreement that psychology is facing a replication crisis. Even some seemingly well-established findings have failed to replicate. Numerous causes of the crisis have been identified, such as underpowered studies, publication bias, imprecise theories, and inadequate statistical procedures. The replication crisis is real, but it is less clear how it should be resolved. Here we examine potential solutions by modeling a scientific community under various replication regimes. In one regime, all findings are replicated before publication to guard against subsequent replication failures. In an alternative regime, individual studies are published and are replicated after publication, but only if they attract the community's interest. We find that the publication of potentially non-replicable studies minimizes cost and maximizes the efficiency of knowledge gain for the scientific community under a variety of assumptions. Our findings suggest that, provided it is properly managed, low replicability can support robust and efficient science.
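A deliberately toy sketch of the regime comparison described above. All parameters (base rate of true hypotheses, power, false-positive rate, replication policy) are illustrative assumptions, not the published model:

```python
# Sketch: compare "replicate everything before publication" against
# "publish first, replicate only interesting findings" on true published
# findings per study run. Parameters are hypothetical.
import random

random.seed(1)
P_TRUE, POWER, ALPHA = 0.3, 0.8, 0.05    # assumed base rate and error rates
N, INTEREST = 10_000, 0.2                # share of findings replicated post-pub

def positive(is_true):
    """Does a single study report a positive result?"""
    return random.random() < (POWER if is_true else ALPHA)

def regime(replicate_before_publication):
    cost = published_true = 0
    for _ in range(N):
        is_true = random.random() < P_TRUE
        cost += 1
        if not positive(is_true):
            continue
        if replicate_before_publication:
            cost += 1
            published_true += positive(is_true) and is_true  # publish if confirmed
        else:
            published_true += is_true                        # publish immediately
            if random.random() < INTEREST:                   # replicate if interesting
                cost += 1
    return published_true / cost

print(f"replicate first: {regime(True):.3f} true published findings per study")
print(f"publish first  : {regime(False):.3f} true published findings per study")
```

Under these assumptions the publish-first regime yields more true published findings per study run, which is the qualitative pattern the abstract reports.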
Project description: Background: The multivariable fractional polynomial (MFP) approach combines variable selection using backward elimination with a function selection procedure (FSP) for fractional polynomial (FP) functions. It is a relatively simple approach which can be easily understood without advanced training in statistical modeling. For continuous variables, a closed test procedure is used to decide between no effect, linear, FP1, or FP2 functions. Influential points (IPs) and small sample sizes can both have a strong impact on a selected function and MFP model. Methods: We used simulated data with six continuous and four categorical predictors to illustrate approaches which can help to identify IPs with an influence on function selection and the MFP model. The approaches use leave-one-out and leave-two-out analyses and two related techniques for a multivariable assessment. In eight subsamples, we also investigated the effect of sample size and assessed model replicability, the latter by using three non-overlapping subsamples with the same sample size. For better illustration, a structured profile was used to provide an overview of all analyses conducted. Results: The results showed that one or more IPs can drive the functions and models selected. In addition, with a small sample size, MFP was not able to detect some non-linear functions, and the selected model differed substantially from the true underlying model. However, when the sample size was relatively large and regression diagnostics were carefully conducted, MFP selected functions or models that were similar to the underlying true model. Conclusions: For smaller sample sizes, IPs and low power are important reasons why the MFP approach may fail to identify underlying functional relationships for continuous variables, and selected models might differ substantially from the true model. However, for larger sample sizes, a carefully conducted MFP analysis is often a suitable way to select a multivariable regression model which includes continuous variables. In such a case, MFP can be the preferred approach for deriving a multivariable descriptive model.
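A simplified sketch of FP1 function selection for a single continuous predictor, assuming the usual FP candidate power set; the closed-test decision sequence and the multivariable backward elimination of full MFP are omitted, and the data are synthetic:

```python
# Sketch: pick the best-fitting FP1 power for one predictor by residual sum of
# squares. Illustrates only the function-selection idea, not the full MFP
# closed test procedure.
import numpy as np
import statsmodels.api as sm

POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)   # standard FP1 candidate powers

def fp_transform(x, p):
    """FP convention: power 0 denotes the natural logarithm."""
    return np.log(x) if p == 0 else x ** p

rng = np.random.default_rng(42)
x = rng.uniform(0.5, 5.0, 300)
y = 2.0 * np.log(x) + rng.normal(0, 0.5, 300)   # true function is log(x)

def rss(p):
    return sm.OLS(y, sm.add_constant(fp_transform(x, p))).fit().ssr

best_p = min(POWERS, key=rss)
print(f"best FP1 power: {best_p} (0 denotes log)")   # expect 0 here
```

In full MFP, this best FP candidate would then be tested against the linear and null models, and a single influential point can flip which power wins, which is what the leave-one-out diagnostics above are designed to reveal.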
Project description: Background: High-dimensional datasets with low sample sizes (HDLSS) are pivotal in the fields of biology and bioinformatics. A core objective in HDLSS analysis is to select the most informative features and discard redundant or irrelevant ones. This is particularly crucial in bioinformatics, where accurate feature (gene) selection can lead to breakthroughs in drug development and provide insights into disease diagnostics. Despite its importance, identifying optimal features remains a significant challenge in HDLSS. Results: To address this challenge, we propose an effective feature selection method that combines gradual permutation filtering with a heuristic tribrid search strategy, specifically tailored for HDLSS contexts. The proposed method considers inter-feature interactions and leverages feature rankings during the search process. In addition, we suggest a new HDLSS performance metric that evaluates both the number and the quality of selected features. In comparisons with existing methods on a benchmark dataset, the proposed method reduced the average number of selected features from 37.8 to 5.5 and improved the performance of the prediction model built on the selected features from 0.855 to 0.927. Conclusions: The proposed method effectively selects a small number of important features and achieves high prediction performance.
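A hedged sketch of permutation-based feature filtering in an HDLSS setting. The paper's gradual permutation filtering and tribrid search are not reproduced here; this only illustrates the underlying filtering idea on synthetic data:

```python
# Sketch: keep features whose permutation importance clearly exceeds the
# permutation noise, in a 60-sample, 500-feature synthetic HDLSS problem.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# High-dimensional, low-sample-size: 60 samples, 500 features, 5 informative
X, y = make_classification(n_samples=60, n_features=500, n_informative=5,
                           n_redundant=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=5000).fit(X_tr, y_tr)
imp = permutation_importance(clf, X_te, y_te, n_repeats=20, random_state=0)

# Keep only features whose mean importance stands out from its own variability
keep = np.where(imp.importances_mean > 2 * imp.importances_std)[0]
print(f"selected {len(keep)} of {X.shape[1]} features: {keep[:10]}")
```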
Project description: The quality of psychological studies is currently a major concern. The Many Labs Project (MLP) and the Open Science Collaboration (OSC) have collected key data on replicability and statistical effect sizes. We build on this work by investigating the role played by three measurement types: ratings, proportions, and unbounded measures (those without conceptual upper limits, e.g. time). Both replicability and effect sizes depend on the amount of variability due to extraneous factors. We predicted that the role of such extraneous factors might depend on measurement type, and would be greatest for ratings, intermediate for proportions, and least for unbounded measures. Our results support this conjecture. OSC replication rates for unbounded (43%) and proportion (40%) measures, combined, are reliably higher than the rate for ratings (20%; effect size w = .20). MLP replication rates for the original studies are: proportions = .74, ratings = .40 (effect size w = .33). Original effect sizes (Cohen's d) are highest for unbounded measures (OSC cognitive = 1.45, OSC social = .90); next for proportions (OSC cognitive = 1.01, OSC social = .84, MLP = .82); and lowest for ratings (OSC social = .64, MLP = .31). These findings are of key importance to scientific methodology and design, even if the reasons for their occurrence remain at the level of conjecture.
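A minimal sketch of the effect-size comparison above: Cohen's w for a difference in replication rates between measurement types, computed from a 2x2 contingency table. The counts below are hypothetical stand-ins chosen to roughly match the reported rates; the paper itself reports only the rates and w values:

```python
# Sketch: Cohen's w from a chi-square test on replication counts; rows are
# measurement type, columns are replicated vs. not. Counts are hypothetical.
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[25, 35],    # unbounded + proportion: ~42% replicated
                  [ 8, 32]])   # ratings: 20% replicated
chi2, p, dof, _ = chi2_contingency(table)
w = np.sqrt(chi2 / table.sum())   # Cohen's w = sqrt(chi2 / N)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}, Cohen's w = {w:.2f}")
```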
Project description: Background and Objectives: Researchers typically use Cohen's guidelines of Pearson's r = .10, .30, and .50, and Cohen's d = 0.20, 0.50, and 0.80 to interpret observed effect sizes as small, medium, or large, respectively. However, these guidelines were not based on quantitative estimates and are only recommended if field-specific estimates are unknown. This study investigated the distribution of effect sizes in both individual differences research and group differences research in gerontology to provide estimates of effect sizes in the field. Research Design and Methods: Effect sizes (Pearson's r, Cohen's d, and Hedges' g) were extracted from meta-analyses published in 10 top-ranked gerontology journals. The 25th, 50th, and 75th percentile ranks were calculated for Pearson's r (individual differences) and Cohen's d or Hedges' g (group differences) values as indicators of small, medium, and large effects. A priori power analyses were conducted for sample size calculations given the observed effect size estimates. Results: Effect sizes of Pearson's r = .12, .20, and .32 for individual differences research and Hedges' g = 0.16, 0.38, and 0.76 for group differences research were interpreted as small, medium, and large effects in gerontology. Discussion and Implications: Cohen's guidelines appear to overestimate effect sizes in gerontology. Researchers are encouraged to use Pearson's r = .10, .20, and .30, and Cohen's d or Hedges' g = 0.15, 0.40, and 0.75 to interpret small, medium, and large effects in gerontology, and recruit larger samples.
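A minimal sketch of the percentile approach above: treating the 25th, 50th, and 75th percentiles of a pool of observed effect sizes as field-specific small, medium, and large benchmarks. The effect sizes below are hypothetical stand-ins for values extracted from meta-analyses:

```python
# Sketch: derive field-specific effect-size benchmarks from percentile ranks
# of a (hypothetical) pool of observed Pearson r values.
import numpy as np

rng = np.random.default_rng(7)
observed_r = np.abs(rng.normal(0.22, 0.12, 500)).clip(0, 1)  # hypothetical pool

small, medium, large = np.percentile(observed_r, [25, 50, 75])
print(f"small r = {small:.2f}, medium r = {medium:.2f}, large r = {large:.2f}")
```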
Project description: Background: Sample size planning (SSP) is vital for efficient studies that yield reliable outcomes; hence, guidelines emphasize the importance of SSP. The present study investigates the practice of SSP in current trials for depression. Methods: Seventy-eight randomized controlled trials published between 2013 and 2017 were examined. The impact of study design (e.g. number of randomized conditions) and study context (e.g. funding) on sample size was analyzed using multiple regression. Results: Overall, sample sizes at pre-registration, during SSP, and in the published articles were highly correlated (r's ≥ 0.887). At the same time, study design accounted for only 7-18% of the explained variance (p = 0.055-0.155); this proportion increased to 30-42% when study context was added (p = 0.002-0.005). The median sample size was N = 106, with higher numbers for internet interventions (N = 181; p = 0.021) than for face-to-face therapy. In total, 59% of studies included SSP, with 28% providing the basic determinants and 8-10% providing enough information for a comprehensible SSP. Expected effect sizes exhibited a sharp peak at d = 0.5. Depending on the definition, 10.2-20.4% implemented intensive assessment to improve statistical power. Conclusions: The findings suggest that investigators achieve their determined sample sizes and that pre-registration rates are increasing. During study planning, however, study context appears more important than study design. Study context therefore needs to be emphasized in the present discussion, as it can help explain the relatively stable trial sample sizes of the past decades. Acknowledging this situation, there are indications that digital psychiatry (e.g. internet interventions or intensive assessment) can help mitigate the challenge of underpowered studies. The article includes a short guide for efficient study planning.
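A short sketch of the a priori sample size calculation at the heart of SSP, assuming a two-arm trial, the modal expected effect of d = 0.5 noted above, alpha = .05, and 80% power:

```python
# Sketch: standard a priori power analysis for a two-group comparison,
# using statsmodels; d = 0.5 matches the modal expected effect in the study.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05,
                                          power=0.80, alternative="two-sided")
print(f"required n per group: {n_per_group:.0f} (total ~ {2 * n_per_group:.0f})")
```

The result (about 64 per group, roughly 128 in total) sits near the observed median trial size of N = 106, which illustrates how tightly the d = 0.5 convention constrains trial sizes in practice.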
Project description: The ability to replicate scientific experiments is a cornerstone of the scientific method. Sharing ideas, workflows, data, and protocols facilitates testing the generalizability of results, increases the speed at which science progresses, and enhances quality control of published work. Fields of science such as medicine, the social sciences, and the physical sciences have embraced practices designed to increase replicability. Granting agencies, for example, may require data management plans, and journals may require data and code availability statements along with the deposition of data and code in publicly available repositories. While many tools commonly used in replicable workflows, such as distributed version control systems (e.g., 'git') or scripting languages for data cleaning and analysis, may have a steep learning curve, their adoption can increase individual efficiency and facilitate collaborations both within entomology and across disciplines. The open science movement is developing within the discipline of entomology, but practitioners of these concepts, or those desiring to work more collaboratively across disciplines, may be unsure where or how to embrace these initiatives. This article introduces some of the tools entomologists can incorporate into their workflows to increase the replicability and openness of their work. We describe these tools and others, recommend additional resources for learning more about them, and discuss the benefits to both individuals and the scientific community, as well as the potential drawbacks, associated with implementing a replicable workflow.
Project description: The effect of sample size on the deformation mode of glasses is one of the most misunderstood properties of this class of materials. The effect is intriguing, since materials deemed macroscopically brittle become plastic at small sizes. We propose an explanation of this phenomenon for metallic glasses. A thermodynamic description of the local rearrangement zones activated under an applied stress is proposed. Using the Poisson distribution to describe the statistics of these zones, and statistical physics to assign them an entropy, we define a critical sample size for the change in deformation mode. Predictions are in agreement with experimental observations and reveal hidden structural parameters describing the glassy state.