Detecting virus integration sites based on multiple related sequencing data by VirTect.
ABSTRACT: BACKGROUND: Because tumors often show a high level of intra-tumor heterogeneity, multiple tumor samples from the same patient, taken at different locations or different time points, are often sequenced to study intra-tumor heterogeneity or tumor evolution. In virus-related tumors, such as human papillomavirus- and hepatitis B virus-related tumors, integration of the virus genome can be a critical driving event. It is thus important to investigate the integration sites of the virus genomes. A few algorithms for detecting virus integration sites from high-throughput sequencing data have been developed, but their limited sensitivity and specificity and their high computational cost hinder their application to sequencing data from multiple related tumor samples. RESULTS: We develop VirTect for detecting virus integration sites simultaneously from multiple related-sample data. The algorithm is mainly based on the joint analysis of short reads spanning breakpoints of integration sites from multiple samples. To achieve high specificity and breakpoint accuracy, a local precise sandwich alignment algorithm is used. Simulation and real data analyses show that, compared with other algorithms, VirTect is significantly more sensitive and has a similar or lower false discovery rate. CONCLUSIONS: VirTect can provide more accurate breakpoint positions and is computationally much more efficient in terms of both memory requirement and computational time.
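The idea of jointly analyzing breakpoint-spanning reads from several related samples can be illustrated with a small sketch. This is not VirTect's implementation: the `Candidate` record, the `pool_candidates` function, the 10 bp clustering window and the 3-read threshold are all hypothetical choices made for illustration. It only shows why pooling helps: a site with weak support in each sample can still pass a threshold once the samples are combined.

```python
from collections import defaultdict
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical record: one candidate breakpoint seen in one sample."""
    sample: str
    chrom: str
    pos: int    # candidate breakpoint position on the human genome
    reads: int  # number of reads spanning the breakpoint in this sample


def _summarize(chrom, cluster):
    """Collapse a cluster of nearby candidates into one pooled call."""
    per_sample = defaultdict(int)
    for c in cluster:
        per_sample[c.sample] += c.reads
    # Use the best-supported candidate as the representative position.
    best = max(cluster, key=lambda c: c.reads)
    return {
        "chrom": chrom,
        "pos": best.pos,
        "total_reads": sum(per_sample.values()),
        "samples": dict(per_sample),
    }


def pool_candidates(candidates, window=10, min_reads=3):
    """Cluster candidates from all samples by position and sum their support.

    `window` and `min_reads` are illustrative values, not taken from the paper.
    """
    by_chrom = defaultdict(list)
    for c in candidates:
        by_chrom[c.chrom].append(c)

    calls = []
    for chrom, items in by_chrom.items():
        items.sort(key=lambda c: c.pos)
        cluster = [items[0]]
        for c in items[1:]:
            if c.pos - cluster[-1].pos <= window:
                cluster.append(c)  # close enough: same putative site
            else:
                calls.append(_summarize(chrom, cluster))
                cluster = [c]
        calls.append(_summarize(chrom, cluster))

    # Keep only sites whose pooled support reaches the threshold.
    return [call for call in calls if call["total_reads"] >= min_reads]


# Two samples each contribute 2 reads near chr8:1000; neither reaches the
# threshold alone, but the pooled site does. The chr3 candidate is dropped.
calls = pool_candidates([
    Candidate("T1", "chr8", 1000, 2),
    Candidate("T2", "chr8", 1003, 2),
    Candidate("T1", "chr3", 500, 1),
])
print(calls)
```

A per-sample filter with the same threshold would discard every candidate in this toy input, which is the loss of sensitivity that joint analysis is meant to avoid.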
SUBMITTER: Xia Y
PROVIDER: S-EPMC6357354 | biostudies-literature | 2019 Jan
REPOSITORIES: biostudies-literature