Dataset Information

Long-read sequencing settings for efficient structural variation detection based on comprehensive evaluation.

ABSTRACT:

Background

With the rapid development of long-read sequencing technologies, it is now possible to reveal the full spectrum of genetic structural variation (SV). However, the high cost, finite read length, and high sequencing error rate of long-read data greatly limit the widespread adoption of SV calling. Guidance on sequencing coverage, read length, and error rate is therefore urgently needed to maintain high SV yields at the lowest possible cost.

Results

In this study, we generated a full range of simulated error-prone long-read datasets covering various sequencing settings and comprehensively evaluated SV calling performance with state-of-the-art long-read SV detection methods. The benchmark results demonstrate that almost all SV callers perform better when the long-read data reach 20× coverage, a 20 kbp average read length, and an error rate of approximately 7.5-10% or below 1%. Furthermore, sequencing coverage is the most influential factor in improving SV calling, although it also directly determines sequencing cost.

Conclusions

Based on the comprehensive evaluation results, we provide important guidelines for selecting long-read sequencing settings for efficient SV calling. We believe these recommended long-read sequencing settings will provide valuable guidance for cutting-edge genomic studies and clinical practice.
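
As a rough illustration of what the recommended 20× coverage and 20 kbp average read length mean for sequencing yield, the sketch below applies the standard relation coverage = (number of reads × mean read length) / genome size. The function name and the genome size of about 3.1 Gbp (an approximate human genome) are assumptions made for this example and are not taken from the study.

```python
import math


def reads_needed(genome_size_bp: int, coverage: int, mean_read_length_bp: int) -> int:
    """Return the smallest number of reads whose total length reaches the target coverage.

    Uses coverage = reads * mean_read_length / genome_size, solved for reads.
    """
    total_bases = coverage * genome_size_bp
    return math.ceil(total_bases / mean_read_length_bp)


# Assumption: approximate human genome size, chosen only for illustration.
GENOME_SIZE_BP = 3_100_000_000

# Settings reported in the abstract: 20x coverage, 20 kbp average read length.
n_reads = reads_needed(GENOME_SIZE_BP, coverage=20, mean_read_length_bp=20_000)
total_gbp = 20 * GENOME_SIZE_BP / 1e9

print(f"reads required: {n_reads:,}")
print(f"total yield required: {total_gbp:.0f} Gbp")
```

Because the required yield scales linearly with coverage, halving or doubling the coverage target halves or doubles the number of bases to sequence, which is consistent with the abstract's point that coverage directly determines cost.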

SUBMITTER: Jiang T

PROVIDER: S-EPMC8588741 | biostudies-literature | 2021 Nov

REPOSITORIES: biostudies-literature

Publications

Long-read sequencing settings for efficient structural variation detection based on comprehensive evaluation.

Jiang T, Liu S, Cao S, Liu Y, Cui Z, Wang Y, Guo H

BMC Bioinformatics, 2021-11-12, issue 1

PMID: 34772337

Similar Datasets

Reply: Correspondence on NanoVar's performance outlined by Jiang T. et al. in 'Long-read sequencing settings for efficient structural variation detection based on comprehensive evaluation'.
| S-EPMC10510213 | biostudies-literature

Comprehensive evaluation of structural variant genotyping methods based on long-read sequencing data.
| S-EPMC9034514 | biostudies-literature

Comprehensive evaluation of structural variation detection algorithms for whole genome sequencing.
| S-EPMC6547561 | biostudies-literature

Expectations and blind spots for structural variation detection from long-read assemblies and short-read genome sequencing technologies.
| S-EPMC8206509 | biostudies-literature

Evaluating Structural Variation Detection Tools for Long-Read Sequencing Datasets in Saccharomyces cerevisiae.
| S-EPMC7075250 | biostudies-literature

Comprehensive assessment of long-read sequencing platforms and calling algorithms for detection of copy number variation.
| S-EPMC11387058 | biostudies-literature

Long-read-based human genomic structural variation detection with cuteSV.
| S-EPMC7477834 | biostudies-literature

Benchmarking long-read aligners and SV callers for structural variation detection in Oxford nanopore sequencing data.
| S-EPMC10940726 | biostudies-literature

Comprehensive and deep evaluation of structural variation detection pipelines with third-generation sequencing data.
| S-EPMC11247875 | biostudies-literature

Systematic benchmarking of tools for structural variation detection using short- and long-read sequencing data in pigs.
| S-EPMC11889634 | biostudies-literature