Dataset Information

SNP characteristics predict replication success in association studies.


ABSTRACT: Successful independent replication is the most direct approach for distinguishing real genotype-disease associations from false discoveries in genome-wide association studies (GWAS). SNPs have been selected for replication primarily on the basis of P values from the discovery stage, although additional SNP characteristics may be used to improve replication success. We used disease-associated SNPs from more than 2,000 published GWAS to identify predictors of SNP reproducibility. SNP reproducibility was defined as the proportion of successful replications among all replication attempts. The study reporting an association for the first time was considered the discovery study, and all subsequent studies targeting the same phenotype were considered replications. We found that -Log(P), where P is the P value from the discovery study, is the strongest predictor of SNP reproducibility. Other significant predictors include the type of SNP (e.g., missense vs. intronic) and minor allele frequency. Features of the genes linked to the disease-associated SNP also predict SNP reproducibility. Based on empirically defined rules, we developed a reproducibility score (RS) to predict SNP reproducibility independently of -Log(P). We used data from two lung cancer GWAS as well as recently reported disease-associated SNPs to validate RS. -Log(P) outperforms RS when only the very top SNPs are selected, while RS works better with relaxed selection criteria. In conclusion, we propose an empirical model to predict SNP reproducibility, which can be used to select SNPs for validation and prioritization.
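
The two quantities the abstract defines, SNP reproducibility and -Log(P), can be illustrated with a minimal Python sketch. This is not the authors' code: the `SnpRecord` type, its field names, and the example values are hypothetical, and base-10 logarithms are an assumption, since the abstract does not state the base. The reproducibility score (RS) is not sketched because its rules are not given here.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class SnpRecord:
    """Hypothetical record for one disease-associated SNP (illustration only)."""
    snp_id: str
    discovery_p: float            # P value reported by the discovery study
    replication_attempts: int     # later studies targeting the same phenotype
    successful_replications: int  # attempts that reproduced the association


def reproducibility(rec: SnpRecord) -> Optional[float]:
    """Proportion of successful replications among all replication attempts."""
    if rec.replication_attempts == 0:
        return None  # undefined when no replication has been attempted
    return rec.successful_replications / rec.replication_attempts


def minus_log_p(rec: SnpRecord) -> float:
    """-log10 of the discovery P value (base 10 is an assumption)."""
    return -math.log10(rec.discovery_p)


# Made-up values, not taken from the dataset
snps = [
    SnpRecord("snp_a", 1e-8, 4, 3),
    SnpRecord("snp_b", 1e-5, 5, 1),
    SnpRecord("snp_c", 1e-6, 0, 0),
]

# Rank candidates by discovery-stage evidence, strongest first
ranked = sorted(snps, key=minus_log_p, reverse=True)
for rec in ranked:
    print(rec.snp_id, round(minus_log_p(rec), 2), reproducibility(rec))
```

Returning `None` for a SNP with no replication attempts keeps an undefined proportion distinct from a reproducibility of zero.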

SUBMITTER: Gorlov IP 

PROVIDER: S-EPMC4384517 | biostudies-literature | 2014 Dec

REPOSITORIES: biostudies-literature

Publications

SNP characteristics predict replication success in association studies.

Gorlov IP, Moore JH, Peng B, Jin JL, Gorlova OY, Amos CI

Human Genetics, 2014-10-02, issue 12


Similar Datasets

| S-EPMC3643925 | biostudies-literature
| S-EPMC1435944 | biostudies-literature
| S-EPMC4423838 | biostudies-literature
| S-EPMC3004292 | biostudies-literature
| S-EPMC2238878 | biostudies-literature
| S-EPMC3102637 | biostudies-literature
| S-EPMC3319325 | biostudies-literature
| S-EPMC2547855 | biostudies-other
| S-EPMC4539975 | biostudies-literature
| S-EPMC3032061 | biostudies-literature