ABSTRACT: Motivation
Segmental duplications > 1 kb in length with ≥ 90% sequence identity between copies comprise nearly 5% of the human genome. They are frequently found in large, contiguous regions known as duplication blocks that can contain mosaic patterns of thousands of segmental duplications. Reconstructing the evolutionary history of these complex genomic regions is a non-trivial, but important task.
Results
We introduce parsimony and likelihood techniques to analyze the evolutionary relationships between duplication blocks. Both techniques rely on a generic model of duplication in which long, contiguous substrings are copied and reinserted over large physical distances, allowing for a duplication block to be constructed by aggregating substrings of other blocks. For the likelihood method, we give an efficient dynamic programming algorithm to compute the weighted ensemble of all duplication scenarios that account for the construction of a duplication block. Using this ensemble, we derive the probabilities of various duplication scenarios. We formalize the task of reconstructing the evolutionary history of segmental duplications as an optimization problem on the space of directed acyclic graphs. We use a simulated annealing heuristic to solve the problem for a set of segmental duplications in the human genome in both parsimony and likelihood settings.
Availability
Supplementary information is available at http://www.cs.brown.edu/people/braphael/supplements/.
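To make the idea of a weighted ensemble of duplication scenarios concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a toy model in which a block is a plain string of duplicon labels, a scenario is any way of cutting the target block into consecutive pieces that each occur as a substring of some source block, and every copy event carries the same weight. The function names and the `event_weight` parameter are hypothetical and chosen only for illustration; orientation, ordering of events, and the directed acyclic graph search are all ignored.

```python
def count_occurrences(segment, sources):
    """Count the (source block, start position) pairs where `segment` occurs."""
    n = len(segment)
    return sum(
        1
        for block in sources
        for start in range(len(block) - n + 1)
        if block[start:start + n] == segment
    )


def ensemble_weight(target, sources, event_weight=0.5):
    """Sum of weights over all toy scenarios that build `target`.

    z[i] holds the total weight of all ways to build the prefix target[:i].
    Each copied piece multiplies the scenario weight by `event_weight`
    (an assumed, uniform per-event weight).
    """
    z = [0.0] * (len(target) + 1)
    z[0] = 1.0  # the empty prefix is built by the empty scenario
    for i in range(1, len(target) + 1):
        for j in range(i):
            copies = count_occurrences(target[j:i], sources)
            z[i] += z[j] * event_weight * copies
    return z[len(target)]


def min_events(target, sources):
    """Parsimony analogue: fewest copied pieces needed to build `target`."""
    inf = float("inf")
    best = [inf] * (len(target) + 1)
    best[0] = 0
    for i in range(1, len(target) + 1):
        for j in range(i):
            if best[j] < inf and count_occurrences(target[j:i], sources) > 0:
                best[i] = min(best[i], best[j] + 1)
    return best[len(target)]


if __name__ == "__main__":
    sources = ["ab", "cd"]
    print(ensemble_weight("abcd", sources))  # total weight of all scenarios
    print(min_events("abcd", sources))       # smallest number of copy events
```

In this toy setting, dividing the weight of a single scenario by the ensemble total gives that scenario's probability, which mirrors how the abstract describes deriving probabilities from the ensemble. The same recurrence with `min` in place of the sum gives the parsimony count.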
SUBMITTER: Kahn CL
PROVIDER: S-EPMC2935423 | biostudies-literature | 2010 Sep
REPOSITORIES: biostudies-literature
Kahn Crystal L (CL), Hristov Borislav H (BH), Raphael Benjamin J (BJ)
Bioinformatics (Oxford, England) 20100901 18