
Dataset Information

Homology-independent metrics for comparative genomics.


ABSTRACT: A mainstream procedure to analyze the wealth of genomic data now available is the detection of homologous regions shared across genomes, followed by the extraction of biological information from the patterns of conservation and variation observed in such regions. Although of pivotal importance, comparative genomic procedures that rely on homology inference are not applicable if no homologous regions are detectable. This excludes a considerable portion of "genomic dark matter" with no significant similarity - and, consequently, no inferred homology to any other known sequence - from several downstream comparative genomic methods. In this review we compile several sequence metrics that do not rely on homology inference and can be used to compare nucleotide sequences and extract biologically meaningful information from them. These metrics comprise several compositional parameters calculated from sequence data alone, such as GC content, dinucleotide odds ratio, and several codon bias metrics. They also share other interesting properties, such as pervasiveness (patterns persist on smaller scales) and phylogenetic signal. We also cite examples where these homology-independent metrics have been successfully applied to support several bioinformatics challenges, such as taxonomic classification of biological sequences without homology inference. They were also used to detect higher-order patterns of interactions in biological systems, ranging from detecting coevolutionary trends between the genomes of viruses and their hosts to characterizing the gene pools of entire microbial communities. We argue that, if correctly understood and applied, homology-independent metrics can add important layers of biological information in comparative genomic studies without prior homology inference.
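
As an illustration of two of the compositional metrics named in the abstract, the sketch below computes GC content and a dinucleotide odds ratio from a single nucleotide string. It is a minimal example under stated assumptions, not the procedure used in the review: the function names are hypothetical, only the single-strand form rho_xy = f_xy / (f_x * f_y) is shown (some published variants also pool the reverse-complement strand), and any character other than A, C, G, or T is ignored.

```python
from collections import Counter

BASES = "ACGT"


def gc_content(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    counts = Counter(base for base in seq if base in BASES)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return (counts["G"] + counts["C"]) / total


def dinucleotide_odds_ratio(seq: str) -> dict:
    """Single-strand odds ratio rho_xy = f_xy / (f_x * f_y).

    Assumption: frequencies come from the given strand only, and
    overlapping dinucleotides containing an ambiguous base are skipped.
    Pairs whose component base is absent are left out of the result.
    """
    seq = seq.upper()
    base_counts = Counter(base for base in seq if base in BASES)
    pair_counts = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_bases = sum(base_counts.values())
    n_pairs = sum(pair_counts.values())
    if n_bases == 0 or n_pairs == 0:
        raise ValueError("sequence is too short or has no valid bases")

    ratios = {}
    for x in BASES:
        for y in BASES:
            fx = base_counts[x] / n_bases
            fy = base_counts[y] / n_bases
            if fx == 0 or fy == 0:
                continue  # ratio is undefined when a base never occurs
            ratios[x + y] = (pair_counts[x + y] / n_pairs) / (fx * fy)
    return ratios


if __name__ == "__main__":
    example = "ATGCGCGATTACGCGTTAGC"  # made-up sequence for illustration
    print(f"GC content: {gc_content(example):.3f}")
    for pair, rho in sorted(dinucleotide_odds_ratio(example).items()):
        print(pair, f"{rho:.2f}")
```

A ratio near 1 means the dinucleotide occurs about as often as its base frequencies alone would predict; values well below or above 1 indicate under- or over-representation. Neither function needs an alignment or a reference sequence, which is the sense in which such metrics are homology-independent.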

SUBMITTER: Coutinho TJ 

PROVIDER: S-EPMC4446528 | biostudies-literature | 2015

REPOSITORIES: biostudies-literature

Publications

Homology-independent metrics for comparative genomics.

Tarcisio José Domingos Coutinho, Glória Regina Franco, Francisco Pereira Lobo

Computational and Structural Biotechnology Journal, 2015-05-04


Similar Datasets

| S-EPMC6048835 | biostudies-literature
| S-EPMC3280499 | biostudies-literature
| S-EPMC6220562 | biostudies-literature
| S-EPMC3486910 | biostudies-literature
| S-EPMC5557505 | biostudies-literature
| S-EPMC2268676 | biostudies-literature
| S-EPMC2034462 | biostudies-literature
| S-EPMC3899966 | biostudies-literature
| S-EPMC4551310 | biostudies-literature
| S-EPMC9795662 | biostudies-literature