Dataset Information

Use of artificial genomes in assessing methods for atypical gene detection.


ABSTRACT: Parametric methods for identifying laterally transferred genes exploit the directional mutational biases unique to each genome. Yet the development of new, more robust methods--as well as the evaluation and proper implementation of existing methods--relies on an arbitrary assessment of performance using real genomes, where the evolutionary histories of genes are not known. We have used the framework of a generalized hidden Markov model to create artificial genomes modeled after genuine genomes. To model a genome, "core" genes--those displaying patterns of mutational biases shared among large numbers of genes--are identified by a novel gene clustering approach based on the Akaike information criterion. Gene models derived from multiple "core" gene clusters are used to generate an artificial genome that models the properties of a genuine genome. Chimeric artificial genomes--representing those having experienced lateral gene transfer--were created by combining genes from multiple artificial genomes, and the performance of the parametric methods for identifying "atypical" genes was assessed directly. We found that a hidden Markov model that included multiple gene models, each trained on sets of genes representing the range of genotypic variability within a genome, could produce artificial genomes that mimicked the properties of genuine genomes. Moreover, different methods for detecting foreign genes performed differently--i.e., they had different sets of strengths and weaknesses--when identifying atypical genes within chimeric artificial genomes.
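The workflow the abstract describes (train gene models on "core" genes, generate artificial genomes from them, then mix genes from several artificial genomes into a chimera with known foreign genes) can be illustrated with a much simpler stand-in. The sketch below uses a fixed-order Markov chain over nucleotides rather than the generalized hidden Markov model the authors report, and it omits the AIC-based gene clustering entirely. All function names, parameters, and the toy training sequences are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict

ALPHABET = "ACGT"

def train_markov(seqs, order=2, pseudocount=1.0):
    """Estimate P(next base | previous `order` bases) from training genes.

    Stand-in for a gene model: a fixed-order Markov chain with additive
    smoothing, NOT the generalized HMM described in the abstract.
    """
    counts = defaultdict(lambda: {b: pseudocount for b in ALPHABET})
    for s in seqs:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1
    model = {}
    for ctx, row in counts.items():
        total = sum(row.values())
        model[ctx] = {b: c / total for b, c in row.items()}
    return model

def generate_gene(model, order, length, rng):
    """Sample one artificial gene of `length` bases from a trained model."""
    uniform = {b: 0.25 for b in ALPHABET}  # fallback for unseen contexts
    seq = [rng.choice(ALPHABET) for _ in range(order)]  # random seed context
    while len(seq) < length:
        ctx = "".join(seq[len(seq) - order:])
        probs = model.get(ctx, uniform)
        weights = [probs[b] for b in ALPHABET]
        seq.append(rng.choices(ALPHABET, weights=weights)[0])
    return "".join(seq)[:length]

def make_chimera(recipient_genes, donor_genes, n_foreign, rng):
    """Mix donor genes into a recipient gene set, keeping ground-truth labels."""
    genes = [(g, "native") for g in recipient_genes]
    genes += [(g, "foreign") for g in rng.sample(donor_genes, n_foreign)]
    rng.shuffle(genes)
    return genes

# Toy usage: two "genomes" with different compositional biases (made-up data).
rng = random.Random(0)
at_rich = train_markov(["ATATTAATTTAAATAT", "TTAATATAAATTATTA"], order=2)
gc_rich = train_markov(["GCGCCGGCGGCCGCGC", "CCGGCGCGGGCCGCCG"], order=2)
recipient = [generate_gene(at_rich, 2, 60, rng) for _ in range(8)]
donor = [generate_gene(gc_rich, 2, 60, rng) for _ in range(8)]
chimera = make_chimera(recipient, donor, n_foreign=2, rng=rng)
```

Because every gene in `chimera` carries a "native" or "foreign" label, a detection method's calls could be scored directly against known origins, which is the kind of assessment the abstract says is not possible with real genomes.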

SUBMITTER: Azad RK

PROVIDER: S-EPMC1282332 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

Date | Accession | Repository
(none) | S-EPMC4743424 | biostudies-literature
(none) | S-EPMC2615622 | biostudies-literature
2008-09-16 | GSE12770 | GEO
(none) | S-EPMC1849888 | biostudies-literature
2008-10-25 | E-GEOD-12770 | biostudies-arrayexpress
2022-05-11 | GSE196416 | GEO
(none) | S-EPMC7228096 | biostudies-literature
(none) | S-EPMC3089488 | biostudies-literature
2016-10-31 | GSE88850 | GEO
(none) | S-EPMC6116571 | biostudies-literature