Dataset Information

PhenoSV: interpretable phenotype-aware model for the prioritization of genes affected by structural variants.


ABSTRACT: Structural variants (SVs) represent a major source of genetic variation associated with phenotypic diversity and disease susceptibility. While long-read sequencing can discover over 20,000 SVs per human genome, interpreting their functional consequences remains challenging. Existing methods for identifying disease-related SVs focus on deletion/duplication only and cannot prioritize individual genes affected by SVs, especially for noncoding SVs. Here, we introduce PhenoSV, a phenotype-aware machine-learning model that interprets all major types of SVs and genes affected. PhenoSV segments and annotates SVs with diverse genomic features and employs a transformer-based architecture to predict their impacts under a multiple-instance learning framework. With phenotype information, PhenoSV further utilizes gene-phenotype associations to prioritize phenotype-related SVs. Evaluation on extensive human SV datasets covering all SV types demonstrates PhenoSV's superior performance over competing methods. Applications in diseases suggest that PhenoSV can determine disease-related genes from SVs. A web server and a command-line tool for PhenoSV are available at https://phenosv.wglab.org .

SUBMITTER: Xu Z 

PROVIDER: S-EPMC10684511 | biostudies-literature | 2023 Nov

REPOSITORIES: biostudies-literature

Publications

PhenoSV: interpretable phenotype-aware model for the prioritization of genes affected by structural variants.

Zhuoran Xu, Quan Li, Luigi Marchionni, Kai Wang

Nature Communications, 2023-11-28, issue 1


Similar Datasets

| S-EPMC8456162 | biostudies-literature
| S-EPMC9950798 | biostudies-literature
| S-EPMC6894143 | biostudies-literature
| S-EPMC6364462 | biostudies-literature
| S-EPMC3264765 | biostudies-literature
| S-EPMC10399873 | biostudies-literature
| S-EPMC4718403 | biostudies-literature
| S-EPMC3912424 | biostudies-literature
| S-EPMC8896633 | biostudies-literature
| S-EPMC11541207 | biostudies-literature