Dataset Information

Multi-platform discovery of haplotype-resolved structural variation in human genomes.


ABSTRACT: The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.

SUBMITTER: Chaisson MJP 

PROVIDER: S-EPMC6467913 | biostudies-literature | 2019 Apr

REPOSITORIES: biostudies-literature

Publications

Multi-platform discovery of haplotype-resolved structural variation in human genomes.

Chaisson Mark J P MJP   Sanders Ashley D AD   Zhao Xuefang X   Malhotra Ankit A   Porubsky David D   Rausch Tobias T   Gardner Eugene J EJ   Rodriguez Oscar L OL   Guo Li L   Collins Ryan L RL   Fan Xian X   Wen Jia J   Handsaker Robert E RE   Fairley Susan S   Kronenberg Zev N ZN   Kong Xiangmeng X   Hormozdiari Fereydoun F   Lee Dillon D   Wenger Aaron M AM   Hastie Alex R AR   Antaki Danny D   Anantharaman Thomas T   Audano Peter A PA   Brand Harrison H   Cantsilieris Stuart S   Cao Han H   Cerveira Eliza E   Chen Chong C   Chen Xintong X   Chin Chen-Shan CS   Chong Zechen Z   Chuang Nelson T NT   Lambert Christine C CC   Church Deanna M DM   Clarke Laura L   Farrell Andrew A   Flores Joey J   Galeev Timur T   Gorkin David U DU   Gujral Madhusudan M   Guryev Victor V   Heaton William Haynes WH   Korlach Jonas J   Kumar Sushant S   Kwon Jee Young JY   Lam Ernest T ET   Lee Jong Eun JE   Lee Joyce J   Lee Wan-Ping WP   Lee Sau Peng SP   Li Shantao S   Marks Patrick P   Viaud-Martinez Karine K   Meiers Sascha S   Munson Katherine M KM   Navarro Fabio C P FCP   Nelson Bradley J BJ   Nodzak Conor C   Noor Amina A   Kyriazopoulou-Panagiotopoulou Sofia S   Pang Andy W C AWC   Qiu Yunjiang Y   Rosanio Gabriel G   Ryan Mallory M   Stütz Adrian A   Spierings Diana C J DCJ   Ward Alistair A   Welch AnneMarie E AE   Xiao Ming M   Xu Wei W   Zhang Chengsheng C   Zhu Qihui Q   Zheng-Bradley Xiangqun X   Lowy Ernesto E   Yakneen Sergei S   McCarroll Steven S   Jun Goo G   Ding Li L   Koh Chong Lek CL   Ren Bing B   Flicek Paul P   Chen Ken K   Gerstein Mark B MB   Kwok Pui-Yan PY   Lansdorp Peter M PM   Marth Gabor T GT   Sebat Jonathan J   Shi Xinghua X   Bashir Ali A   Ye Kai K   Devine Scott E SE   Talkowski Michael E ME   Mills Ryan E RE   Marschall Tobias T   Korbel Jan O JO   Eichler Evan E EE   Lee Charles C  

Nature Communications, 2019-04-16, issue 1


Similar Datasets

| S-EPMC8026704 | biostudies-literature
| S-EPMC7190621 | biostudies-literature
| S-EPMC7954703 | biostudies-literature
| S-EPMC9882142 | biostudies-literature
| S-EPMC11222905 | biostudies-literature
| S-EPMC4244235 | biostudies-literature
| S-EPMC9464699 | biostudies-literature
| S-EPMC4393510 | biostudies-literature
| S-EPMC9903816 | biostudies-literature
| S-EPMC6476705 | biostudies-literature