Dataset Information

Shedding light on dark genes: enhanced targeted resequencing by optimizing the combination of enrichment technology and DNA fragment length.


ABSTRACT: The exome contains many obscure regions difficult to explore with current short-read sequencing methods. Repetitious genomic regions prevent the unique alignment of reads, which is essential for the identification of clinically-relevant genetic variants. Long-read technologies attempt to resolve multiple-mapping regions, but they still produce many sequencing errors. Thus, a new approach is required to enlighten the obscure regions of the genome and rescue variants that would be otherwise neglected. This work aims to improve the alignment of multiple-mapping reads through the extension of the standard DNA fragment size. As Illumina can sequence fragments up to 550 bp, we tested different DNA fragment lengths using four major commercial WES platforms and found that longer DNA fragments achieved a higher genotypability. This metric, which indicates base calling calculated by combining depth of coverage with the confidence of read alignment, increased from hundreds to thousands of genes, including several associated with clinical phenotypes. While depth of coverage has been considered crucial for the assessment of WES performance, we demonstrated that genotypability has a greater impact in revealing obscure regions, with ~1% increase in variant calling in respect to shorter DNA fragments. Results confirmed that this approach enlightened many regions previously not explored.
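
ILLUSTRATION: The abstract describes genotypability as a metric that combines depth of coverage with the confidence of read alignment. The sketch below is not the authors' pipeline; it only illustrates the idea on toy data. The function name `covered_fraction`, the depth threshold of 3 reads, and the mapping-quality (MAPQ) threshold of 10 are all assumptions chosen for the example, and the record does not state which cutoffs were used.

```python
from typing import Sequence


def covered_fraction(
    per_base_mapqs: Sequence[Sequence[int]],
    min_depth: int,
    min_mapq: int = 0,
) -> float:
    """Fraction of target bases covered by at least `min_depth` reads
    whose mapping quality is at least `min_mapq`.

    With min_mapq=0 this is a plain depth-of-coverage metric; with a
    positive min_mapq it becomes a genotypability-like metric
    (hypothetical thresholds, for illustration only).
    """
    if not per_base_mapqs:
        return 0.0
    passing = sum(
        1
        for mapqs in per_base_mapqs
        if sum(1 for q in mapqs if q >= min_mapq) >= min_depth
    )
    return passing / len(per_base_mapqs)


# Toy target of four bases. Each inner list holds the MAPQ values of the
# reads overlapping that base (made-up numbers, not data from the study).
toy_target = [
    [60, 60, 60],      # well covered, uniquely aligned reads
    [0, 0, 0],         # well covered, but every read is multi-mapped
    [60, 0, 60, 60],   # one ambiguous read, still three confident ones
    [60],              # confidently aligned but too shallow
]

depth_only = covered_fraction(toy_target, min_depth=3)
genotypable = covered_fraction(toy_target, min_depth=3, min_mapq=10)
```

In this toy case the second base counts as covered by depth alone but not once alignment confidence is required, which is the kind of region the abstract calls obscure.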

SUBMITTER: Iadarola B 

PROVIDER: S-EPMC7287100 | biostudies-literature | 2020 Jun

REPOSITORIES: biostudies-literature

Publications

Shedding light on dark genes: enhanced targeted resequencing by optimizing the combination of enrichment technology and DNA fragment length.

Barbara Iadarola, Luciano Xumerle, Denise Lavezzari, Marta Paterno, Luca Marcolungo, Cristina Beltrami, Elisabetta Fortunati, Davide Mei, Annalisa Vetro, Renzo Guerrini, Elena Parrini, Marzia Rossato, Massimo Delledonne

Scientific Reports, 2020-06-10, issue 1


Similar Datasets

| S-EPMC7249604 | biostudies-literature
| S-EPMC4957270 | biostudies-literature
2020-03-01 | E-MTAB-8647 | biostudies-arrayexpress
| PRJEB12651 | ENA
| S-EPMC7671308 | biostudies-literature
| S-EPMC5864207 | biostudies-literature
| S-EPMC4783766 | biostudies-literature
| S-EPMC6590786 | biostudies-literature
| S-EPMC3418918 | biostudies-other
| S-EPMC6792376 | biostudies-literature