
Dataset Information


PostSV: A Post-Processing Approach for Filtering Structural Variations.


ABSTRACT: Genomic structural variations are significant causes of genome diversity and complex diseases. With advances in sequencing technologies, many algorithms have been designed to identify structural differences using next-generation sequencing (NGS) data. Due to repetitions in the human genome and the short reads produced by NGS, the discovery of structural variants (SVs) by state-of-the-art SV callers is not always accurate. To improve performance, multiple SV callers are often used to detect variants. However, most SV callers suffer from high false-positive rates, which diminishes the overall performance, especially in low-coverage genomes. In this article, we propose a post-processing classification-based algorithm that can be used to filter structural variation predictions produced by SV callers. Novel features are defined from putative SV predictions using reads at the local regions around the breakpoints. Several classifiers are employed to classify the candidate predictions and remove false positives. We test our classifier models on simulated and real genomes and show that the proposed approach improves the performance of state-of-the-art algorithms.

SUBMITTER: Alzaid E 

PROVIDER: S-EPMC6974750 | biostudies-literature | 2020

REPOSITORIES: biostudies-literature


Publications

PostSV: A Post-Processing Approach for Filtering Structural Variations.

Eman Alzaid, Achraf El Allali

Bioinformatics and Biology Insights, 2020-01-20



Similar Datasets

| S-EPMC5535871 | biostudies-literature
| S-EPMC3637191 | biostudies-literature
| S-EPMC4157590 | biostudies-literature
| S-EPMC6319923 | biostudies-literature
| S-EPMC3458090 | biostudies-literature
| S-EPMC8570516 | biostudies-literature
| S-EPMC6248965 | biostudies-literature
| S-EPMC5567409 | biostudies-literature
2015-03-10 | E-GEOD-66573 | biostudies-arrayexpress
| S-EPMC7078380 | biostudies-literature