Dataset Information

Dataset decay and the problem of sequential analyses on open datasets.


ABSTRACT: Open data allows researchers to explore pre-existing datasets in new ways. However, if many researchers reuse the same dataset, multiple statistical testing may increase false positives. Here we demonstrate that sequential hypothesis testing on the same dataset by multiple researchers can inflate error rates. We go on to discuss a number of correction procedures that can reduce the number of false positives, and the challenges associated with these correction procedures.
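
The inflation described in the abstract can be illustrated with a minimal sketch. It assumes the simplest possible setting, which is not necessarily the one the authors analyze: every test is run under a true null hypothesis, the tests are statistically independent (tests on a shared dataset are usually correlated), and each uses the same significance level `alpha`. The function names are hypothetical, and Bonferroni is used here only as a familiar stand-in for a correction procedure; the abstract does not say which procedures the paper discusses.

```python
import random


def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent null tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate_familywise_error_rate(num_tests, alpha=0.05, num_datasets=20000, seed=0):
    """Monte Carlo estimate of the same quantity.

    Assumption: under a true null hypothesis each p-value is uniform on [0, 1],
    so a p-value is modelled as a single uniform random draw.
    """
    rng = random.Random(seed)
    datasets_with_false_positive = 0
    for _ in range(num_datasets):
        # One "dataset" reused for num_tests analyses, all of them null.
        if any(rng.random() < alpha for _ in range(num_tests)):
            datasets_with_false_positive += 1
    return datasets_with_false_positive / num_datasets


def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / num_tests


if __name__ == "__main__":
    for k in (1, 5, 10, 50):
        print(
            k,
            round(familywise_error_rate(k), 3),
            round(simulate_familywise_error_rate(k), 3),
            bonferroni_threshold(k),
        )
```

Under these assumptions the chance of at least one false positive grows quickly with the number of analyses run on the same data, while a per-test threshold of `alpha / num_tests` holds it near `alpha`. The cost is that each later analysis faces a stricter threshold, which is one way to read the "challenges" the abstract mentions.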

SUBMITTER: Thompson WH 

PROVIDER: S-EPMC7237204 | biostudies-literature | 2020 May

REPOSITORIES: biostudies-literature

Publications

Dataset decay and the problem of sequential analyses on open datasets.

Thompson WH, Wright J, Bissett PG, Poldrack RA

eLife, 2020-05-19


Similar Datasets

| S-EPMC7979787 | biostudies-literature
| S-EPMC6969289 | biostudies-literature
| S-EPMC5704676 | biostudies-literature
| S-EPMC6312793 | biostudies-literature
| S-EPMC4349932 | biostudies-literature
2009-12-18 | PRD000081 | Pride
| S-EPMC5130981 | biostudies-literature
| S-EPMC7378878 | biostudies-literature
| S-EPMC7473573 | biostudies-literature
| S-EPMC8049987 | biostudies-literature