Dataset Information

Investigation of rare and low-frequency variants using high-throughput sequencing with pooled DNA samples.


ABSTRACT: High-throughput sequencing using pooled DNA samples can facilitate genome-wide studies on rare and low-frequency variants in a large population. Some major questions concerning the pooling sequencing strategy are whether rare and low-frequency variants can be detected reliably, and whether estimated minor allele frequencies (MAFs) can represent the actual values obtained from individually genotyped samples. In this study, we evaluated MAF estimates using three variant detection tools with two sets of pooled whole exome sequencing (WES) and one set of pooled whole genome sequencing (WGS) data. Both GATK and Freebayes displayed high sensitivity, specificity and accuracy when detecting rare or low-frequency variants. For the WGS study, 56% of the low-frequency variants in Illumina array have identical MAFs and 26% have one allele difference between sequencing and individual genotyping data. The MAF estimates from WGS correlated well (r = 0.94) with those from Illumina arrays. The MAFs from the pooled WES data also showed high concordance (r = 0.88) with those from the individual genotyping data. In conclusion, the MAFs estimated from pooled DNA sequencing data reflect the MAFs in individually genotyped samples well. The pooling strategy can thus be a rapid and cost-effective approach for the initial screening in large-scale association studies.
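The comparison the abstract describes (a MAF estimated from pooled reads, set against the MAF from individual genotyping, then summarized as an allele-count difference and a correlation) can be illustrated with a minimal sketch. This is not the authors' pipeline, which relied on variant callers such as GATK and Freebayes. The function names, the pool size of 50 individuals, and all read counts and genotyped MAFs below are hypothetical values chosen only for illustration, and the read-fraction estimator is a simplifying assumption.

```python
from math import sqrt


def pooled_maf(alt_reads, total_reads):
    """Estimate a MAF from pooled reads as the alternate-read fraction,
    folded so the result is always the minor allele (<= 0.5)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    freq = alt_reads / total_reads
    return min(freq, 1.0 - freq)


def allele_count(maf, n_individuals):
    """Convert a MAF to the nearest whole number of minor alleles
    among the 2N chromosomes of a pool of N diploid individuals."""
    return round(maf * 2 * n_individuals)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    POOL_SIZE = 50  # hypothetical number of individuals in the pool

    # Hypothetical (alt_reads, total_reads) per variant from pooled sequencing.
    pooled_counts = [(3, 100), (10, 200), (1, 120)]
    # Hypothetical MAFs for the same variants from individual genotyping.
    genotyped_mafs = [0.03, 0.05, 0.01]

    pooled_mafs = [pooled_maf(a, t) for a, t in pooled_counts]
    for est, truth in zip(pooled_mafs, genotyped_mafs):
        diff = abs(allele_count(est, POOL_SIZE) - allele_count(truth, POOL_SIZE))
        print(f"pooled={est:.4f} genotyped={truth:.4f} allele difference={diff}")
    print("Pearson r:", pearson_r(pooled_mafs, genotyped_mafs))
```

Expressing the discrepancy in whole alleles rather than raw frequency mirrors the abstract's "identical MAFs" and "one allele difference" categories, since a pool of N diploid individuals can only take frequencies in steps of 1/(2N).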

SUBMITTER: Wang J 

PROVIDER: S-EPMC5025741 | biostudies-literature | 2016 Sep

REPOSITORIES: biostudies-literature

Publications

Investigation of rare and low-frequency variants using high-throughput sequencing with pooled DNA samples.

Jingwen Wang, Tiina Skoog, Elisabet Einarsdottir, Tea Kaartokallio, Hannele Laivuori, Anna Grauers, Paul Gerdhem, Marjo Hytönen, Hannes Lohi, Juha Kere, Hong Jiao

Scientific Reports, 2016 Sep 16


Similar Datasets

S-EPMC3315736 | biostudies-literature
S-EPMC310828 | biostudies-literature
S-EPMC3268604 | biostudies-literature
S-EPMC3308056 | biostudies-literature
S-EPMC4935848 | biostudies-literature
S-EPMC4309441 | biostudies-literature
S-EPMC3928660 | biostudies-literature
S-EPMC3530907 | biostudies-literature
S-EPMC5037392 | biostudies-literature
S-EPMC3471313 | biostudies-literature