
Dataset Information


Gene classification based on amino acid motifs and residues: the DLX (distal-less) test case.


ABSTRACT: BACKGROUND: Comparative studies using hundreds of sequences can give a detailed picture of the evolution of a given gene family. Nevertheless, retrieving only the sequences of interest from public databases can be difficult, particularly when working with highly divergent sequences. The difficulty increases substantially when the study is meant to include sequences from many (or less well studied) species whose genomes are non-annotated or incompletely annotated. METHODOLOGY/PRINCIPAL FINDINGS: In this work we evaluate the usefulness of different approaches to gene retrieval and classification, using the distal-less (DLX) gene family as a test case. Furthermore, we evaluate whether the use of a large number of gene sequences from a wide range of animal species, the use of multiple alternative alignments, and the use of only those amino acids aligned with high confidence are enough to recover the accepted DLX evolutionary history. CONCLUSIONS/SIGNIFICANCE: The canonical DLX homeobox gene sequence derived here, together with the characteristic amino acid variants identified here in the DLX homeodomain region, can be used to retrieve and classify DLX genes in a simple and efficient way. A program is made available that allows the easy retrieval of synteny information that can be used to classify gene sequences. Maximum likelihood trees using hundreds of sequences can be used for gene identification. Nevertheless, for the DLX case, the proposed DLX evolutionary history is not recovered even when multiple alignment algorithms are used.

SUBMITTER: Fonseca NA 

PROVIDER: S-EPMC2685005 | biostudies-literature | 2009 Jun

REPOSITORIES: biostudies-literature


Publications

Gene classification based on amino acid motifs and residues: the DLX (distal-less) test case.

Fonseca NA, Vieira CP, Vieira J

PLoS ONE, 2009-06-01, issue 6



Similar Datasets

| S-EPMC3199776 | biostudies-literature
2017-02-13 | GSE61009 | GEO
| S-EPMC5818739 | biostudies-literature
2018-10-25 | GSE89266 | GEO
2015-05-08 | E-GEOD-68668 | biostudies-arrayexpress
2015-05-08 | GSE68668 | GEO
| S-EPMC7576571 | biostudies-literature
| S-EPMC1134043 | biostudies-literature
| S-EPMC4273668 | biostudies-literature
| S-EPMC5995559 | biostudies-literature