Dataset Information

SNP detection for massively parallel whole-genome resequencing.


ABSTRACT: Next-generation massively parallel sequencing technologies provide ultrahigh throughput at two orders of magnitude lower unit cost than capillary Sanger sequencing technology. One of the key applications of next-generation sequencing is studying genetic variation between individuals using whole-genome or target region resequencing. Here, we have developed a consensus-calling and SNP-detection method for sequencing-by-synthesis Illumina Genome Analyzer technology. We designed this method by carefully considering the data quality, alignment, and experimental errors common to this technology. All of this information was integrated into a single quality score for each base under Bayesian theory to measure the accuracy of consensus calling. We tested this methodology using a large-scale human resequencing data set of 36x coverage and assembled a high-quality nonrepetitive consensus sequence for 92.25% of the diploid autosomes and 88.07% of the haploid X chromosome. Comparison of the consensus sequence with Illumina human 1M BeadChip genotyped alleles from the same DNA sample showed that 98.6% of the 37,933 genotyped alleles on the X chromosome and 98% of 999,981 genotyped alleles on autosomes were covered at 99.97% and 99.84% consistency, respectively. At a low sequencing depth, we used prior probability of dbSNP alleles and were able to improve coverage of the dbSNP sites significantly as compared to that obtained using a nonimputation model. Our analyses demonstrate that our method has a very low false call rate at any sequencing depth and excellent genome coverage at a high sequencing depth.
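The abstract describes folding per-base quality information into a single Bayesian quality score for each consensus call. The sketch below illustrates that general idea only; it is not the authors' published model. It assumes independent reads, a uniform substitution error derived from the Phred score, and hypothetical prior rates (`het_rate`, `hom_alt_rate`, `other_rate`) chosen purely for illustration. All function and parameter names are assumptions introduced here.

```python
import math
from itertools import combinations_with_replacement

BASES = "ACGT"

def phred_to_error(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)

def genotype_posteriors(observations, ref_base,
                        het_rate=1e-3, hom_alt_rate=5e-4, other_rate=1e-6):
    """Posterior over the 10 diploid genotypes at one site.

    observations: list of (base, phred_quality) tuples from aligned reads.
    The prior rates are illustrative assumptions, not values from the paper.
    """
    log_post = {}
    for a, b in combinations_with_replacement(BASES, 2):
        # Prior: favour homozygous reference, then het-with-ref, then hom-alt.
        if a == b == ref_base:
            prior = 1.0 - het_rate - hom_alt_rate - other_rate
        elif a == b:
            prior = hom_alt_rate / 3
        elif ref_base in (a, b):
            prior = het_rate / 3
        else:
            prior = other_rate / 3
        log_p = math.log(prior)
        for base, q in observations:
            e = phred_to_error(q)
            # Probability of observing `base` if the read came from allele x.
            p_a = 1 - e if base == a else e / 3
            p_b = 1 - e if base == b else e / 3
            # Each read samples one of the two alleles with equal probability.
            log_p += math.log(0.5 * p_a + 0.5 * p_b)
        log_post[a + b] = log_p
    # Normalise with log-sum-exp for numerical stability.
    m = max(log_post.values())
    total = sum(math.exp(v - m) for v in log_post.values())
    return {g: math.exp(v - m) / total for g, v in log_post.items()}

def call_consensus(observations, ref_base, **prior_kwargs):
    """Return (best_genotype, phred_scaled_quality) for one site."""
    post = genotype_posteriors(observations, ref_base, **prior_kwargs)
    best = max(post, key=post.get)
    # Floor the error term so that log10 never receives zero.
    p_wrong = max(1.0 - post[best], 1e-10)
    return best, -10 * math.log10(p_wrong)

if __name__ == "__main__":
    hom = [("A", 30)] * 10
    het = [("A", 30)] * 5 + [("G", 30)] * 5
    print(call_consensus(hom, "A"))
    print(call_consensus(het, "A"))
```

Under these assumptions, a known polymorphic site (for example one listed in dbSNP) could be modelled crudely by passing a larger `het_rate` and `hom_alt_rate` for that position, which lets fewer reads support a non-reference call. This is only a stand-in for the prior-probability approach the abstract mentions for low sequencing depth.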

SUBMITTER: Li R 

PROVIDER: S-EPMC2694485 | biostudies-literature | 2009 Jun

REPOSITORIES: biostudies-literature

Publications

SNP detection for massively parallel whole-genome resequencing.

Li R, Li Y, Fang X, Yang H, Wang J, Kristiansen K, Wang J

Genome Research, 2009 May 6, issue 6


Similar Datasets

S-EPMC3276265 | biostudies-literature
S-EPMC2739861 | biostudies-other
S-EPMC1560136 | biostudies-literature
S-EPMC4166930 | biostudies-literature
S-EPMC4992833 | biostudies-literature
S-EPMC5896239 | biostudies-literature
S-EPMC2703445 | biostudies-literature
S-EPMC3883664 | biostudies-literature
S-EPMC5507899 | biostudies-literature
S-EPMC2986170 | biostudies-literature