Dataset Information

Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data.


ABSTRACT: The major DNA constituent of primate centromeres is alpha-satellite DNA. As much as 2%-5% of the sequence generated by primate genome sequencing projects consists of this material, which is fragmented in, or absent from, published genome assemblies because of its highly repetitive nature. Here, we develop computational methods to rapidly recover and categorize alpha-satellite sequences from previously uncharacterized whole-genome shotgun sequence data. We present an algorithm that predicts potential higher-order array structure from paired-end sequence data, and we then validate the predicted organization and distribution experimentally. Using whole-genome shotgun data from the human, chimpanzee, and macaque genomes, we examine the phylogenetic relationship of these sequences and provide further support for a model of their evolution and mutation over the last 25 million years. Our results confirm fundamental differences in the dispersal and evolution of centromeric satellites between the Old World monkey and ape lineages.

SUBMITTER: Alkan C 

PROVIDER: S-EPMC1994983 | biostudies-literature | 2007 Sep

REPOSITORIES: biostudies-literature

Publications

Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data.

Alkan C, Ventura M, Archidiacono N, Rocchi M, Sahinalp SC, Eichler EE

PLoS Computational Biology, 2007-09-01, issue 9


Similar Datasets

| S-EPMC1781350 | biostudies-literature
| S-EPMC4365909 | biostudies-literature
| S-EPMC3457223 | biostudies-literature
| S-EPMC8553948 | biostudies-literature
| S-EPMC6256703 | biostudies-literature
| S-EPMC4585704 | biostudies-literature
| S-EPMC3194837 | biostudies-literature
| S-EPMC8913259 | biostudies-literature
| S-EPMC5031281 | biostudies-literature
| S-EPMC4598191 | biostudies-literature