Dataset Information

Improving read alignment through the generation of alternative reference via iterative strategy.


ABSTRACT: There is generally one standard reference sequence for each species. When other breeds of the species carry extensive variation relative to that reference, the result can be ambiguous alignment and inaccurate variant calling, which in turn compromise the accuracy of downstream analysis. Here, with the help of an FPGA hardware platform, we present a method that generates an alternative reference via an iterative strategy to improve read alignment for breeds that are genetically distant from the reference breed. Compared to the published reference genomes, the alternative reference sequences we built improved the mapping rates of Chinese indigenous pigs and chickens by 0.61-1.68% and 0.09-0.45%, respectively. These sequences also enable researchers to recover highly variable regions that could be missed using public reference sequences. We also determined that the optimal numbers of iterations needed to generate alternative reference sequences were seven for pigs and five for chickens. Our results show that, for genetically distant breeds, generating an alternative reference sequence can facilitate read alignment and variant calling and improve the accuracy of downstream analyses.
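The iterative strategy described in the abstract can be illustrated with a toy sketch. This is not the authors' FPGA-based pipeline: the brute-force aligner, the majority-vote update, the mismatch threshold, and all function and variable names below are assumptions made for illustration only. The sketch aligns reads to the current reference, replaces each covered base with the majority base of the mapped reads, and repeats until the reference stops changing.

```python
from collections import Counter

def align(read, ref, max_mismatches):
    """Toy aligner: offset with the fewest mismatches, or None if above the threshold."""
    best_pos, best_mm = None, max_mismatches + 1
    for pos in range(len(ref) - len(read) + 1):
        mm = sum(1 for a, b in zip(read, ref[pos:pos + len(read)]) if a != b)
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    return best_pos

def mapping_rate(reads, ref, max_mismatches):
    """Fraction of reads that align within the mismatch threshold."""
    mapped = sum(align(r, ref, max_mismatches) is not None for r in reads)
    return mapped / len(reads)

def update_reference(reads, ref, max_mismatches):
    """Replace each covered base with the strict-majority base of the mapped reads."""
    piles = [Counter() for _ in ref]
    for read in reads:
        pos = align(read, ref, max_mismatches)
        if pos is None:
            continue  # unmapped reads contribute no evidence in this round
        for i, base in enumerate(read):
            piles[pos + i][base] += 1
    new_ref = []
    for base, pile in zip(ref, piles):
        if pile:
            top, count = pile.most_common(1)[0]
            new_ref.append(top if count * 2 > sum(pile.values()) else base)
        else:
            new_ref.append(base)  # no coverage: keep the current base
    return "".join(new_ref)

def build_alternative_reference(reads, ref, max_mismatches=1, max_iterations=10):
    """Iterate align-and-update until the reference no longer changes."""
    rates = [mapping_rate(reads, ref, max_mismatches)]
    for _ in range(max_iterations):
        new_ref = update_reference(reads, ref, max_mismatches)
        if new_ref == ref:
            break
        ref = new_ref
        rates.append(mapping_rate(reads, ref, max_mismatches))
    return ref, rates

# Hypothetical toy data: the "breed" differs from the reference at three positions.
ref = "ACGTGCATTAGCCGATCTGGAACT"
breed = "ACGTGCATCAGCAGATGTGGAACT"
reads = [breed[i:i + 8] for i in range(len(breed) - 8 + 1)]

alt_ref, rates = build_alternative_reference(reads, ref)
print(alt_ref)
print(rates)
```

On this toy input the mapping rate rises from 9/17 to 1.0, and the middle variant is recovered only in the second round, once the flanking variants have been absorbed into the reference. That is the intuition behind needing more than one iteration; the optimal iteration counts reported in the abstract come from the authors' real data, not from a sketch like this.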

SUBMITTER: Bu L 

PROVIDER: S-EPMC7599232 | biostudies-literature | 2020 Oct

REPOSITORIES: biostudies-literature

Publications

Improving read alignment through the generation of alternative reference via iterative strategy.

Bu Lina, Wang Qi, Gu Wenjin, Yang Ruifei, Zhu Di, Song Zhuo, Liu Xiaojun, Zhao Yiqiang

Scientific Reports, 2020-10-30, issue 1


Similar Datasets

S-EPMC3464235 | biostudies-literature
S-EPMC6416333 | biostudies-literature
S-EPMC5103424 | biostudies-literature
S-EPMC3468387 | biostudies-literature
S-EPMC3218665 | biostudies-literature
E-MTAB-1728 | biostudies-arrayexpress (2013-07-15)
S-EPMC6195179 | biostudies-literature
S-EPMC2894513 | biostudies-other
S-EPMC3945748 | biostudies-literature
S-EPMC4609002 | biostudies-literature