Dataset Information

Large scale in silico characterization of repeat expansion variation in human genomes.


ABSTRACT: Significant progress has been made in elucidating single nucleotide polymorphism diversity in the human population. However, the majority of the variation space in the genome is structural and remains partially elusive. One form of structural variation is tandem repeats (TRs). Expansions of TRs are responsible for over 40 diseases, but we hypothesize these represent only a fraction of the pathogenic repeat expansions that exist. Here we characterize long or expanded TR variation in 1,115 human genomes as well as a replication cohort of 2,504 genomes, identified using ExpansionHunter Denovo. We found that individual genomes typically harbor several rare, large TRs, generally in non-coding regions of the genome. We noticed that these large TRs are enriched in their proximity to Alu elements. The vast majority of these large TRs seem to be expansions of smaller TRs that are already present in the reference genome. We are providing this TR profile as a resource for comparison to undiagnosed rare disease genomes in order to detect novel disease-causing repeat expansions.

SUBMITTER: Fazal S 

PROVIDER: S-EPMC7479135 | biostudies-literature | 2020 Sep

REPOSITORIES: biostudies-literature

Publications

Large scale in silico characterization of repeat expansion variation in human genomes.

Fazal S, Danzi MC, Cintra VP, Bis-Brewer DM, Dolzhenko E, Eberle MA, Zuchner S

Scientific Data, 2020 Sep 8, issue 1


Similar Datasets

| S-EPMC4705683 | biostudies-literature
| PRJEB34435 | ENA
| S-EPMC5984062 | biostudies-literature
| S-EPMC3106317 | biostudies-literature
| S-EPMC419331 | biostudies-literature
| S-EPMC1431705 | biostudies-literature
| S-EPMC4173682 | biostudies-literature
| S-EPMC4027155 | biostudies-literature
| S-EPMC2533575 | biostudies-literature
| S-EPMC3141000 | biostudies-literature