Dataset Information

Identifying Disease-Causing Mutations with Privacy Protection.


ABSTRACT:

Motivation

The use of genome data for diagnosis and treatment is becoming increasingly common. Researchers need access to as many genomes as possible to interpret a patient's genome, to identify statistical patterns, and to reveal disease-gene relationships. Because genome data contain sensitive information and carry a high risk of re-identification, sharing such data raises privacy and security concerns. In this paper, we present an approach to identify disease-associated variants and genes while ensuring patient privacy. The proposed method uses secure multi-party computation to find disease-causing mutations under specific inheritance models without sacrificing the privacy of individuals. It discloses only the variants or genes obtained as a result of the analysis, so the vast majority of patient data remain private.

Results

Our prototype implementation analyzes genomic data from thousands of patients in milliseconds, and its runtime scales logarithmically with the number of patients. We present the first privacy-preserving analyses of genomic data based on inheritance models (recessive, dominant, and compound heterozygous) for finding disease-causing mutations. Furthermore, we reimplement the privacy-preserving methods (MAX, SETDIFF, and INTERSECTION) proposed in a previous study. Our MAX, SETDIFF, and INTERSECTION implementations are 2.5, 1122, and 341 times faster, respectively, than the corresponding operations of the state-of-the-art protocol.

Availability

https://gitlab.com/DIFUTURE/privacy-preserving-genomic-diagnosis.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Akgun M 

PROVIDER: S-EPMC7850099 | biostudies-literature | 2020 Jul

REPOSITORIES: biostudies-literature

Publications

Identifying disease-causing mutations with privacy protection.

Mete Akgün, Ali Burak Ünal, Bekir Ergüner, Nico Pfeifer, Oliver Kohlbacher

Bioinformatics (Oxford, England), 2021-01-01, issue 21


Similar Datasets

| EGAS00001000023 | EGA
| S-EPMC2823424 | biostudies-literature
| S-EPMC5557957 | biostudies-literature
| S-EPMC1779944 | biostudies-literature
| S-EPMC1579204 | biostudies-literature
| S-EPMC5360345 | biostudies-literature
| S-EPMC4183365 | biostudies-literature
| S-EPMC7893070 | biostudies-literature
| S-EPMC5029899 | biostudies-literature
| S-EPMC1995201 | biostudies-literature