Dataset Information

An accurate method for identifying recent recombinants from unaligned sequences.


ABSTRACT:

Motivation

Recombination is a fundamental process in molecular evolution, and the identification of recombinant sequences is thus of major interest. However, current methods for detecting recombinants are primarily designed for aligned sequences. Thus, they struggle with analyses of highly diverse genes, such as the var genes of the malaria parasite Plasmodium falciparum, which are known to diversify primarily through recombination.

Results

We introduce an algorithm to detect recent recombinant sequences from a dataset without a full multiple alignment. Our algorithm can handle thousands of gene-length sequences without the need for a reference panel. We demonstrate the accuracy of our algorithm through extensive numerical simulations; in particular, it maintains its effectiveness in the presence of insertions and deletions. We apply our algorithm to a dataset of 17,335 DBLα types in var genes from Ghana, observing that sequences belonging to the same ups group or domain subclass recombine amongst themselves more frequently, and that non-recombinant DBLα types are more conserved than recombinant ones.

Availability

Source code is freely available at https://github.com/qianfeng2/detREC_program.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Feng Q 

PROVIDER: S-EPMC8963311 | biostudies-literature | 2022 Jan

REPOSITORIES: biostudies-literature

Publications

An accurate method for identifying recent recombinants from unaligned sequences.

Feng Q, Tiedje KE, Ruybal-Pesántez S, Tonkin-Hill G, Duffy MF, Day KP, Shim H, Chan YB

Bioinformatics (Oxford, England), 2022-03-01, issue 7



Similar Datasets

| S-EPMC7195022 | biostudies-literature
| S-EPMC55461 | biostudies-literature
| S-EPMC10231473 | biostudies-literature
| S-EPMC3516148 | biostudies-literature
| S-EPMC2638147 | biostudies-literature
| S-EPMC2957682 | biostudies-literature
| S-EPMC434454 | biostudies-literature
| S-EPMC3167047 | biostudies-literature
| S-EPMC6311941 | biostudies-literature
| S-EPMC8000046 | biostudies-literature