Project description: Centromeres play an essential role in cell division by specifying the site of kinetochore formation on each chromosome so that chromosomes can attach to the mitotic spindle for segregation. Centromeres are defined epigenetically by the histone H3 variant CEntromere Protein A (CENP-A). Dividing cells maintain the centromere by depositing new CENP-A each cell cycle to replenish the CENP-A diluted by replication. The CENP-A nucleosome serves as the primary signal to the machinery responsible for its replenishment. Vertebrate centromeres are frequently built on repetitive sequences organized in tandem arrays. Repetitive centromeric DNA has been suggested to play a role in centromere maintenance and in de novo centromere formation, but this role has been hard to dissect because centromeres are difficult to manipulate in cells. Extracts from Xenopus laevis eggs can assemble centromeres and kinetochores in vitro and thus provide a useful system for studying the role of centromeric DNA in centromere formation. However, centromeric sequences in X. laevis have not been extensively characterized. In this study, we characterize repeat sequences found at X. laevis centromeres. We use a k-mer-based approach to uncover the previously unknown diversity of X. laevis centromeric sequences. We validate the centromere localization of repeat sequences by in situ hybridization and identify the location of the centromeric repetitive array on each chromosome by mapping the distribution of centromere-enriched k-mers on the Xenopus genome. Our identification of X. laevis centromere sequences enables previously unapproachable genomic studies of centromeres. The k-mer-based approach that we used to investigate centromeric repetitive DNA is also suitable for analyzing other repetitive sequences found across the genome and for studying repeats in other organisms.
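
The general idea of a k-mer-based enrichment analysis can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the description above does not specify how enrichment was computed, so the sketch assumes a simple frequency ratio between a centromere-associated read set and a background read set. The function names (`count_kmers`, `enriched_kmers`, `window_hits`) and the parameters `min_ratio`, `pseudocount`, and `window` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every overlapping k-mer across a list of DNA strings."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:  # skip k-mers containing ambiguous bases
                counts[kmer] += 1
    return counts


def enriched_kmers(target_reads, background_reads, k,
                   min_ratio=2.0, pseudocount=1.0):
    """Return {kmer: ratio} for k-mers over-represented in the target reads.

    Assumption: enrichment is the ratio of normalized k-mer frequencies
    (target / background), with a pseudocount so that k-mers absent from
    the background do not cause division by zero.
    """
    target = count_kmers(target_reads, k)
    background = count_kmers(background_reads, k)
    target_total = sum(target.values())
    background_total = sum(background.values())
    if target_total == 0 or background_total == 0:
        return {}
    result = {}
    for kmer, n in target.items():
        target_freq = (n + pseudocount) / target_total
        background_freq = (background[kmer] + pseudocount) / background_total
        ratio = target_freq / background_freq
        if ratio >= min_ratio:
            result[kmer] = ratio
    return result


def window_hits(chromosome, kmers, k, window):
    """Count, per fixed-size window, the positions whose k-mer is in `kmers`."""
    seq = chromosome.upper()
    n_windows = (len(seq) + window - 1) // window
    hits = [0] * n_windows
    for i in range(len(seq) - k + 1):
        if seq[i:i + k] in kmers:
            hits[i // window] += 1
    return hits


# Toy data: the target reads carry an ACGT repeat, the background mostly poly-T.
target_reads = ["ACGTACGTACGT", "ACGTACGT"]
background_reads = ["TTTTTTTTTTTT", "ACGTTTTT"]
enriched = enriched_kmers(target_reads, background_reads, k=4)

# Windows with many hits mark candidate positions of the repeat array.
toy_chromosome = "TTTTTTTT" + "ACGTACGT" + "TTTTTTTT"
density = window_hits(toy_chromosome, set(enriched), k=4, window=8)
```

In this toy example the windows overlapping the ACGT repeat collect the most hits, which mirrors the idea of locating a repetitive array by the density of enriched k-mers along a chromosome. The sketch counts only the forward strand; an analysis of real sequencing data would likely also need to account for reverse complements and would use much longer k-mers.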