Dataset Information

Ultra-deep mutant spectrum profiling: improving sequencing accuracy using overlapping read pairs.


ABSTRACT:

BACKGROUND: High-throughput sequencing is beginning to make a transformative impact in the area of viral evolution. Deep sequencing has the potential to reveal the mutant spectrum within a viral sample at high resolution, thus enabling the close examination of viral mutational dynamics both within and between hosts. The challenge, however, is to accurately model the errors in the sequencing data and to differentiate real viral mutations, particularly those that exist at low frequencies, from sequencing errors.

RESULTS: We demonstrate that overlapping read pairs (ORP), generated by combining short fragment sequencing libraries with longer sequencing reads, significantly reduce sequencing error rates and improve rare variant detection accuracy. Using this sequencing protocol and an error model optimized for variant detection, we are able to capture a large number of genetic mutations present within a viral population at ultra-low frequency levels (<0.05%).

CONCLUSIONS: Our rare variant detection strategies have important implications beyond viral evolution and can be applied to any basic and clinical research area that requires the identification of rare mutations.
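The intuition behind overlapping read pairs can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' error model: the function names `merge_overlap` and `concordant_error_rate` are hypothetical, and the arithmetic assumes that errors in the two reads are independent and that a wrong base is equally likely to be any of the three alternatives. Under those assumptions, a base is called only where both reads agree, and a false call requires both reads to make the same mistake.

```python
def merge_overlap(fwd: str, rev_rc: str, mask: str = "N") -> str:
    """Consensus over the overlapping segment of a read pair.

    `fwd` is the forward read's overlap segment and `rev_rc` is the
    reverse read's segment, already reverse-complemented and aligned
    to the same coordinates (an assumption of this sketch). A base is
    kept only where both reads agree; disagreements are masked.
    """
    if len(fwd) != len(rev_rc):
        raise ValueError("overlap segments must have equal length")
    return "".join(a if a == b else mask for a, b in zip(fwd, rev_rc))


def concordant_error_rate(e1: float, e2: float) -> float:
    """Probability that both reads report the same wrong base.

    Assumes independent errors with per-base rates e1 and e2, and a
    uniform choice among the three incorrect bases.
    """
    return e1 * e2 / 3.0


if __name__ == "__main__":
    # The third position disagrees (G vs C), so it is masked.
    print(merge_overlap("ACGT", "ACCT"))
    # Two reads with a 0.1% per-base error rate each.
    print(concordant_error_rate(1e-3, 1e-3))
```

With these assumptions, two reads that each have a per-base error rate of 0.1% would agree on the same wrong base only about once in three million positions, which illustrates why variants at frequencies below 0.05% could become distinguishable from sequencing noise.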

SUBMITTER: Chen-Harris H 

PROVIDER: S-EPMC3599684 | biostudies-literature | 2013 Feb

REPOSITORIES: biostudies-literature

Publications

Ultra-deep mutant spectrum profiling: improving sequencing accuracy using overlapping read pairs.

Haiyin Chen-Harris, Monica K Borucki, Clinton Torres, Tom R Slezak, Jonathan E Allen

BMC Genomics, 2013-02-12



Similar Datasets

| S-EPMC3464235 | biostudies-literature
| S-EPMC5103424 | biostudies-literature
| S-EPMC3982159 | biostudies-literature
| S-EPMC6520541 | biostudies-literature
| S-EPMC6045860 | biostudies-literature
| S-EPMC3616926 | biostudies-literature
| S-EPMC6370902 | biostudies-other
| S-EPMC7338406 | biostudies-literature
| S-EPMC5920284 | biostudies-literature
| S-EPMC7320720 | biostudies-literature