Dataset Information

Efficient algorithms for Longest Common Subsequence of two bucket orders to speed up pairwise genetic map comparison.

ABSTRACT: Genetic maps order genetic markers along chromosomes. They are, for instance, extensively used in marker-assisted selection to accelerate breeding programs. Even for the same species, people often have to deal with several alternative maps obtained using different ordering methods or different datasets, e.g. resulting from different segregating populations. Having efficient tools to identify the consistencies and discrepancies of alternative maps is thus essential to facilitate genetic map comparisons. We propose to encode genetic maps as bucket orders, a kind of order that takes into account the blurred parts of the marker order while being an efficient data structure for low-complexity algorithms. The main result of this paper is an O(n log(n)) procedure to identify the largest agreement between two bucket orders of n elements, their Longest Common Subsequence (LCS), providing an efficient solution to highlight discrepancies between two genetic maps. The LCS of two maps, being the largest set of their collinear markers, is used as a building block to compute pairwise map congruence, to visually emphasize marker collinearity, and in some scaffolding methods relying on genetic maps to improve genome assembly. As the LCS computation is a key subroutine of all these genetic-map-related tools, replacing the current LCS subroutine of those methods with ours (which does exactly the same work, but faster) could significantly speed up those methods without changing their accuracy. To ease such a transition, we provide all required algorithmic details in this self-contained paper as well as an R package implementing them, named LCSLCIS, which is freely available at: https://github.com/holtzy/LCSLCIS.
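
To illustrate the kind of computation the abstract refers to, here is a minimal Python sketch of the classical O(n log(n)) LCS for the simplest special case only: two maps that are total orders (every bucket holds exactly one marker). It is not the paper's bucket-order procedure and not the LCSLCIS package API; the function name `lcs_of_total_orders` and the marker names are hypothetical. The sketch maps each shared marker to its position in the second map and then extracts a longest increasing subsequence of those positions by binary search.

```python
from bisect import bisect_left


def lcs_of_total_orders(map_a, map_b):
    """Return one longest list of markers that are collinear in both maps.

    Assumption: each map is a list of distinct marker names, i.e. a total
    order (a bucket order whose buckets all have size 1). Markers missing
    from either map are ignored.
    """
    pos_in_b = {marker: i for i, marker in enumerate(map_b)}
    shared = [m for m in map_a if m in pos_in_b]
    seq = [pos_in_b[m] for m in shared]  # positions in B, read in A's order

    tails = []      # tails[k]: smallest last value of an increasing run of length k+1
    tails_idx = []  # index in seq of that last value
    prev = [-1] * len(seq)  # predecessor links used to rebuild the answer

    for i, x in enumerate(seq):
        k = bisect_left(tails, x)  # O(log n) binary search
        if k == len(tails):
            tails.append(x)
            tails_idx.append(i)
        else:
            tails[k] = x
            tails_idx[k] = i
        prev[i] = tails_idx[k - 1] if k > 0 else -1

    # Walk the predecessor links back from the end of the longest run.
    result = []
    i = tails_idx[-1] if tails_idx else -1
    while i != -1:
        result.append(shared[i])
        i = prev[i]
    result.reverse()
    return result


if __name__ == "__main__":
    map_a = ["m1", "m2", "m3", "m4", "m5"]
    map_b = ["m1", "m3", "m2", "m4", "m5"]
    print(lcs_of_total_orders(map_a, map_b))
```

In this toy input, markers m2 and m3 are swapped between the two maps, so any longest common subsequence keeps only one of them. Handling buckets with several tied markers requires the additional machinery described in the paper.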

SUBMITTER: De Matteo L

PROVIDER: S-EPMC6320017 | biostudies-literature | 2018

REPOSITORIES: biostudies-literature

Publications

Efficient algorithms for Longest Common Subsequence of two bucket orders to speed up pairwise genetic map comparison.

De Mattéo Lisa, Holtz Yan, Ranwez Vincent, Bérard Sèverine

PLoS One, 2018-12-27, issue 12

PMID: 30589848

Similar Datasets

Automatic ICD-10 coding algorithm using an improved longest common subsequence based on semantic similarity. | S-EPMC5356997 | biostudies-literature

Speeding up tandem mass spectrometry-based database searching by longest common prefix. | S-EPMC3000425 | biostudies-literature

Longest follow-up of in situ working Bjork Shiley valve: 42-year follow-up. | S-EPMC6716407 | biostudies-literature

Speeding up all-against-all protein comparisons while maintaining sensitivity by considering subsequence-level homology. | S-EPMC4193403 | biostudies-literature

Seeding with minimized subsequence. | S-EPMC10311335 | biostudies-literature

PIR pairwise alignment - a slip up for signal peptides. | S-EPMC1891675 | biostudies-literature

Varying environments can speed up evolution. | S-EPMC1948871 | biostudies-literature

Computational speed-up with a single qudit. | S-EPMC4597186 | biostudies-literature

Phenotypic Switching Can Speed up Microbial Evolution. | S-EPMC5997679 | biostudies-literature

Global alignment of pairwise protein interaction networks for maximal common conserved patterns. | S-EPMC3654364 | biostudies-literature