Dataset Information

Human-specific tandem repeat expansion and differential gene expression during primate evolution.


ABSTRACT: Short tandem repeats (STRs) and variable number tandem repeats (VNTRs) are important sources of natural and disease-causing variation, yet they have been problematic to resolve in reference genomes and genotype with short-read technology. We created a framework to model the evolution and instability of STRs and VNTRs in apes. We phased and assembled 3 ape genomes (chimpanzee, gorilla, and orangutan) using long-read and 10x Genomics linked-read sequence data for 21,442 human tandem repeats discovered in 6 haplotype-resolved assemblies of Yoruban, Chinese, and Puerto Rican origin. We define a set of 1,584 STRs/VNTRs expanded specifically in humans, including large tandem repeats affecting coding and noncoding portions of genes (e.g., MUC3A, CACNA1C). We show that short interspersed nuclear element-VNTR-Alu (SVA) retrotransposition is the main mechanism for distributing GC-rich human-specific tandem repeat expansions throughout the genome but with a bias against genes. In contrast, we observe that VNTRs not originating from retrotransposons have a propensity to cluster near genes, especially in the subtelomere. Using tissue-specific expression from human and chimpanzee brains, we identify genes where transcript isoform usage differs significantly, likely caused by cryptic splicing variation within VNTRs. Using single-cell expression from cerebral organoids, we observe a strong effect for genes associated with transcription profiles analogous to intermediate progenitor cells. Finally, we compare the sequence composition of some of the largest human-specific repeat expansions and identify 52 STRs/VNTRs with at least 40 uninterrupted pure tracts as candidates for genetically unstable regions associated with disease.

SUBMITTER: Sulovari A 

PROVIDER: S-EPMC6859368 | biostudies-literature | 2019 Nov

REPOSITORIES: biostudies-literature

Publications

Human-specific tandem repeat expansion and differential gene expression during primate evolution.

Arvis Sulovari, Ruiyang Li, Peter A. Audano, David Porubsky, Mitchell R. Vollger, Glennis A. Logsdon, Wesley C. Warren, Alex A. Pollen, Mark J. P. Chaisson, Evan E. Eichler

Proceedings of the National Academy of Sciences of the United States of America, 2019-10-28, issue 46


Similar Datasets

| S-EPMC1785376 | biostudies-literature
| S-EPMC7477013 | biostudies-literature
| S-EPMC4655330 | biostudies-literature
| S-EPMC3177219 | biostudies-literature
| S-EPMC6917484 | biostudies-literature
| S-EPMC10858039 | biostudies-literature
| S-EPMC7049690 | biostudies-literature
2014-03-11 | E-GEOD-55255 | biostudies-arrayexpress
| S-EPMC5725417 | biostudies-literature
| S-EPMC3665846 | biostudies-literature