
Dataset Information


A familial, telomere-to-telomere reference for human de novo mutation and recombination from a four-generation pedigree.


ABSTRACT: Using five complementary short- and long-read sequencing technologies, we phased and assembled >95% of each diploid human genome in a four-generation, 28-member family (CEPH 1463) allowing us to systematically assess de novo mutations (DNMs) and recombination. From this family, we estimate an average of 192 DNMs per generation, including 75.5 de novo single-nucleotide variants (SNVs), 7.4 non-tandem repeat indels, 79.6 de novo indels or structural variants (SVs) originating from tandem repeats, 7.7 centromeric de novo SVs and SNVs, and 12.4 de novo Y chromosome events per generation. STRs and VNTRs are the most mutable with 32 loci exhibiting recurrent mutation through the generations. We accurately assemble 288 centromeres and six Y chromosomes across the generations, documenting de novo SVs, and demonstrate that the DNM rate varies by an order of magnitude depending on repeat content, length, and sequence identity. We show a strong paternal bias (75-81%) for all forms of germline DNM, yet we estimate that 17% of de novo SNVs are postzygotic in origin with no paternal bias. We place all this variation in the context of a high-resolution recombination map (~3.5 kbp breakpoint resolution). We observe a strong maternal recombination bias (1.36 maternal:paternal ratio) with a consistent reduction in the number of crossovers with increasing paternal (r=0.85) and maternal (r=0.65) age. However, we observe no correlation between meiotic crossover locations and de novo SVs, arguing against non-allelic homologous recombination as a predominant mechanism. The use of multiple orthogonal technologies, near-telomere-to-telomere phased genome assemblies, and a multi-generation family to assess transmission has created the most comprehensive, publicly available "truth set" of all classes of genomic variants. The resource can be used to test and benchmark new algorithms and technologies to understand the most fundamental processes underlying human genetic variation.
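As a quick sanity check on the figures quoted in the abstract, the per-class averages can be tallied against the stated overall average of 192 DNMs per generation. The sketch below is illustrative only: the dictionary keys and the `summarize` helper are hypothetical names, and treating the class averages as simply additive is an assumption, since the abstract says "including" and does not claim the list is exhaustive.

```python
# Per-generation averages as quoted in the abstract (DNM class -> mean count).
# Key names are illustrative labels, not identifiers from the dataset.
dnm_classes = {
    "de novo SNVs": 75.5,
    "non-tandem-repeat indels": 7.4,
    "tandem-repeat indels/SVs": 79.6,
    "centromeric SVs and SNVs": 7.7,
    "Y chromosome events": 12.4,
}
reported_total = 192  # overall average stated in the abstract


def summarize(classes, total):
    """Return (sum of listed classes, remainder vs. total, percent share per class).

    Assumes the class averages are additive, which the abstract does not state.
    """
    listed = round(sum(classes.values()), 1)
    remainder = round(total - listed, 1)
    shares = {name: round(100 * count / total, 1) for name, count in classes.items()}
    return listed, remainder, shares


listed_total, remainder, shares = summarize(dnm_classes, reported_total)
print(f"Listed classes sum to {listed_total} of {reported_total} DNMs")
print(f"Not covered by the listed classes: {remainder}")
for name, pct in shares.items():
    print(f"  {name}: {pct}% of the reported total")
```

Under that additive assumption, the five listed classes sum to 182.6, so roughly 9.4 events per generation are not attributed to a listed class in the abstract.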

SUBMITTER: Porubsky D 

PROVIDER: S-EPMC11326147 | biostudies-literature | 2024 Aug

REPOSITORIES: biostudies-literature


Publications

A familial, telomere-to-telomere reference for human de novo mutation and recombination from a four-generation pedigree.

Porubsky David, Dashnow Harriet, Sasani Thomas A, Logsdon Glennis A, Hallast Pille, Noyes Michelle D, Kronenberg Zev N, Mokveld Tom, Koundinya Nidhi, Nolan Cillian, Steely Cody J, Guarracino Andrea, Dolzhenko Egor, Harvey William T, Rowell William J, Grigorev Kirill, Nicholas Thomas J, Oshima Keisuke K, Lin Jiadong, Ebert Peter, Watkins W Scott, Leung Tiffany Y, Hanlon Vincent C T, McGee Sean, Pedersen Brent S, Goldberg Michael E, Happ Hannah C, Jeong Hyeonsoo, Munson Katherine M, Hoekzema Kendra, Chan Daniel D, Wang Yanni, Knuth Jordan, Garcia Gage H, Fanslow Cairbre, Lambert Christine, Lee Charles, Smith Joshua D, Levy Shawn, Mason Christopher E, Garrison Erik, Lansdorp Peter M, Neklason Deborah W, Jorde Lynn B, Quinlan Aaron R, Eberle Michael A, Eichler Evan E

bioRxiv: the preprint server for biology, 2024-08-05



Similar Datasets

| S-EPMC11384008 | biostudies-literature
| S-EPMC11460824 | biostudies-literature
| S-EPMC11848710 | biostudies-literature
| S-EPMC11625190 | biostudies-literature
| S-EPMC11783595 | biostudies-literature
| S-EPMC11275785 | biostudies-literature
| S-EPMC11507141 | biostudies-literature
| S-EPMC11651111 | biostudies-literature
| S-EPMC11785061 | biostudies-literature
| S-EPMC11291067 | biostudies-literature