Dataset Information

A fast word search algorithm for the representation of sequence similarity in genomic DNA.


ABSTRACT: Representation of sequence similarity by dot matrix plots is a method widely used for comparing biological sequences. The user is presented with an overall view of the similarity between two sequences. The computation of this plot is reconsidered here. An improvement is proposed through the preprocessing of the data into an automaton that recognizes the word structure of a sequence. The main advantage of this approach is that it systematically eliminates repetitions during word comparison. Simple heuristics are also considered to greatly speed up pattern matching. As a result, large sequences are handled very efficiently. This is illustrated by a comparison of large genomic DNA sequences. The algorithm has been implemented in an interactive application on a microcomputer.
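
To make the idea of a word-based dot matrix plot concrete, here is a minimal Python sketch. It is not the automaton described in the abstract: it substitutes a plain hash index of fixed-length words, and the function names `word_index` and `dot_plot_matches` and the word length `k` are hypothetical choices made for illustration. The index stores each distinct word of the first sequence once, together with all its start positions, so a repeated word is looked up only once per position in the second sequence.

```python
from collections import defaultdict


def word_index(seq, k):
    """Map each length-k word of seq to the list of its start positions.

    Illustrative stand-in for the paper's preprocessing step; the paper
    builds an automaton, whereas this uses an ordinary hash table.
    """
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def dot_plot_matches(seq_a, seq_b, k):
    """Return sorted (i, j) pairs such that seq_a[i:i+k] == seq_b[j:j+k].

    Each pair corresponds to one dot in the dot matrix plot, with i on
    the seq_a axis and j on the seq_b axis.
    """
    index = word_index(seq_a, k)
    matches = []
    for j in range(len(seq_b) - k + 1):
        # One lookup per word of seq_b, however often it occurs in seq_a.
        for i in index.get(seq_b[j:j + k], ()):
            matches.append((i, j))
    return sorted(matches)


if __name__ == "__main__":
    for i, j in dot_plot_matches("ACGTACGT", "TACG", 3):
        print(i, j)
```

With the toy inputs above, the words `ACG` (twice in the first sequence) and `TAC` are shared, so the sketch reports the dots (0, 1), (3, 0) and (4, 1). The heuristics mentioned in the abstract are not modeled here.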

SUBMITTER: Lefevre C 

PROVIDER: S-EPMC523596 | biostudies-other | 1994 Feb

REPOSITORIES: biostudies-other

Publications

A fast word search algorithm for the representation of sequence similarity in genomic DNA.

Lefèvre C, Ikeda JE

Nucleic Acids Research, 1994-02-01, issue 3

Similar Datasets

| S-EPMC3591303 | biostudies-literature
| S-EPMC5274646 | biostudies-literature
| S-EPMC8570820 | biostudies-literature
| S-EPMC8016470 | biostudies-literature
| S-EPMC3113943 | biostudies-literature
| S-EPMC5860095 | biostudies-literature
| S-EPMC7461893 | biostudies-literature
| S-EPMC4080745 | biostudies-literature
| S-EPMC4699916 | biostudies-literature
| S-EPMC162336 | biostudies-literature