Dataset Information

Family-specific analysis of variant pathogenicity prediction tools.

ABSTRACT: Using the presently available datasets of annotated missense variants, we ran a protein family-specific benchmarking of tools for predicting the pathogenicity of single amino acid variants. We find that despite the high overall accuracy of all tested methods, each tool has its Achilles heel, i.e. protein families in which its predictions prove unreliable (expected accuracy does not exceed 51% in any method). As a proof of principle, we show that choosing the optimal tool and pathogenicity threshold at a protein family-individual level allows obtaining reliable predictions in all Pfam domains (accuracy no less than 68%). A functional analysis of the sets of protein domains annotated exclusively by neutral or pathogenic mutations indicates that specific protein functions can be associated with a high or low sensitivity to mutations, respectively. The highly sensitive sets of protein domains are involved in the regulation of transcription and DNA sequence-specific transcription factor binding, while the domains that do not result in disease when mutated are responsible for mediating immune and stress responses. These results suggest that future predictors of pathogenicity and especially variant prioritization tools may benefit from considering functional annotation.
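
The family-level selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the record layout `(family, tool, score, is_pathogenic)`, the function names, the candidate thresholds, and the toy data are all assumptions made for illustration, and it assumes that a higher score means "more likely pathogenic" for every tool.

```python
from collections import defaultdict

def accuracy(scores, labels, threshold):
    """Fraction of variants classified correctly when score >= threshold is called pathogenic."""
    correct = sum((s >= threshold) == y for s, y in zip(scores, labels))
    return correct / len(labels)

def best_tool_per_family(records, thresholds):
    """Pick, for each family, the (tool, threshold) pair with the highest accuracy.

    records: iterable of (family, tool, score, is_pathogenic) tuples (hypothetical layout).
    Returns {family: (tool, threshold, accuracy)}; ties keep the first pair seen.
    """
    grouped = defaultdict(lambda: ([], []))
    for family, tool, score, label in records:
        scores, labels = grouped[(family, tool)]
        scores.append(score)
        labels.append(label)

    best = {}
    for (family, tool), (scores, labels) in grouped.items():
        for t in thresholds:
            acc = accuracy(scores, labels, t)
            if family not in best or acc > best[family][2]:
                best[family] = (tool, t, acc)
    return best

# Toy data: family IDs, tool names, and scores are made up.
records = [
    ("PF00001", "toolA", 0.9, True),
    ("PF00001", "toolA", 0.2, False),
    ("PF00001", "toolB", 0.4, True),
    ("PF00001", "toolB", 0.6, False),
    ("PF00002", "toolA", 0.3, False),
    ("PF00002", "toolA", 0.1, True),
    ("PF00002", "toolB", 0.8, True),
    ("PF00002", "toolB", 0.7, False),
]
best = best_tool_per_family(records, thresholds=[0.25, 0.5, 0.75])
```

Note that this sketch measures accuracy on the same variants it uses to choose the tool and threshold, so the reported value is optimistic; a realistic evaluation would hold out variants for testing.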

SUBMITTER: Zaucha J

PROVIDER: S-EPMC7671395 | biostudies-literature | 2020 Jun

REPOSITORIES: biostudies-literature


Publications

Family-specific analysis of variant pathogenicity prediction tools.

Jan Zaucha, Michael Heinzinger, Svetlana Tarnovskaya, Burkhard Rost, Dmitrij Frishman

NAR genomics and bioinformatics 20200228 2

PMID: 33575576

Similar Datasets

Project description: Malignant hyperthermia (MH) is a pharmacogenetic disorder of skeletal muscle metabolism characterized by generalized muscle rigidity, increased body temperature, rhabdomyolysis, hyperkalemia and severe metabolic acidosis. The underlying mechanism of MH involves excessive Ca2+ release from myotubes via the ryanodine receptor type 1 (RYR1) and the voltage-dependent L-type calcium channel (CACNA1S). As more than 300 variants of unknown significance have been detected to date, we examined whether freely available pathogenicity prediction tools are able to detect relevant MH-causing variants. In this diagnostic accuracy study, blood samples from 235 individuals with a clinical history of malignant hyperthermia or their close relatives were genetically screened for RYR1 variants in all 106 RYR1 exons and additionally for known variants of CACNA1S. In vitro contracture tests were conducted on muscle biopsies obtained from all individuals, independently of whether a pathogenic variant, a variant of unknown significance or no variant was detected. Comparisons were made to three established bioinformatic pathogenicity detection tools to identify the clinical impact of the variants of unknown significance. All detected genetic variants were tested for pathogenicity by three in silico approaches and compared to the in vitro contracture test. Sensitivity and specificity of exon screening of all individuals listed in our MH database were analyzed. Exon screening identified 97 (41%) of the 235 individuals as carriers of pathogenic variants. Variants of unknown significance were detected in 21 individuals. Variants of unknown significance were subdivided into 19 malignant-hyperthermia-susceptible individuals and 2 non-malignant-hyperthermia-susceptible individuals. All pathogenic variants as well as the malignant-hyperthermia-susceptible variants were correctly identified by the bioinformatic prediction tools.
Sensitivity of in silico approaches ranged between 0.71 and 0.98 (PolyPhen 0.94 [95% CI 0.75; 0.99]; SIFT 0.98 [95% CI 0.81; 0.99]; MutationTaster 0.92 [95% CI 0.75; 0.99]). Specificity differed depending on the tool used (PolyPhen 0.98 [95% CI 0.32; 0.99]; SIFT 0.98 [95% CI 0.32; 0.99]; MutationTaster 0.00 [95% CI 0.00; 0.60]). All pathogenic variants and variants of unknown significance were scored as probably damaging in individuals, demonstrating a high sensitivity. Specificity was very low in one of the three tested programs. However, due to potential genotype-phenotype discordance, bioinformatic prediction tools are currently of limited value in diagnosing pathogenicity of MH-susceptible variants.
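
For readers who want to see how figures of this kind are derived, the sketch below computes sensitivity and specificity from confusion counts together with a 95% interval. The description does not say which interval method the study used; the Wilson score interval here is an assumption, the function names are hypothetical, and the counts in the usage lines are illustrative rather than taken from the study.

```python
import math

def sensitivity(tp, fn):
    """True positive rate: affected cases that the tool flags as damaging."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: unaffected cases that the tool scores as benign."""
    return tn / (tn + fp)

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage).

    Assumed method: the source reports 95% CIs without naming how they were computed.
    """
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Illustrative counts only (not from the study).
sens = sensitivity(tp=9, fn=1)
spec = specificity(tn=3, fp=1)
low, high = wilson_interval(successes=0, n=2)
```

With very few negatives, as in the last line, the interval is wide even when the point estimate is 0 or 1, which is why specificity estimates from a handful of individuals deserve caution.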
